The easiest way to draw a rectangle is to use ffplay's drawbox filter:
# Show the filter's help
ffplay -h filter=drawbox
# A single rectangle
ffplay -i fpx.gif -vf drawbox=x=10:y=10:w=50:h=50:c=red
# Multiple rectangles, chained with commas
ffplay -i fpx.gif -vf drawbox=x=10:y=10:w=50:h=50:c=red,\
drawbox=x=100:y=10:w=50:h=50:c=blue
Now suppose the YOLO algorithm has detected the position of an object in each video frame as a rectangle (sx, sy, s->w, s->h). To display it, the rectangle-drawing logic has to be embedded in the ffplay source code. As a starting point, see ffmpeg-4.3.1/libavfilter/vf_drawbox.c.
Core code (valid only for planar YUV formats):
for (y = FFMAX(yb, 0); y < FFMIN(yb + s->h, frame->height); y++) {
    row[0] = frame->data[0] + y * frame->linesize[0];
    for (plane = 1; plane < 3; plane++)
        row[plane] = frame->data[plane] +
                     frame->linesize[plane] * (y >> s->vsub);
    for (x = FFMAX(xb, 0); x < FFMIN(xb + s->w, frame->width); x++) {
        if (pixel_belongs_to_box(s, x, y)) {
            row[0][x           ] = s->yuv_color[Y];
            row[1][x >> s->hsub] = s->yuv_color[U];
            row[2][x >> s->hsub] = s->yuv_color[V];
        }
    }
}
The principle is very simple. Two nested for loops visit every pixel of the rectangle, clipped to the frame, and pixel_belongs_to_box() decides whether a pixel lies on the border, that is, within s->thickness pixels of one of the four edges. If it does, the pixel's color is overwritten. The pixel_belongs_to_box() function is shown below; s->thickness is the thickness of the border.
static int pixel_belongs_to_box(DrawBoxContext *s, int x, int y)
{
    return (y - s->y < s->thickness) || (s->y + s->h - 1 - y < s->thickness) ||
           (x - s->x < s->thickness) || (s->x + s->w - 1 - x < s->thickness);
}
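To make the border test concrete, here is a minimal sketch that applies the same four comparisons to a small stand-alone struct and counts how many pixels of a box are classified as border. The Box struct and the on_border() and count_border_pixels() names are hypothetical, introduced only for this illustration; they are not part of FFmpeg.

```c
#include <assert.h>

/* Hypothetical stand-in for the DrawBoxContext fields used by the test. */
typedef struct Box {
    int x, y;       /* top-left corner */
    int w, h;       /* width and height in pixels */
    int thickness;  /* border thickness in pixels */
} Box;

/* Same comparisons as pixel_belongs_to_box(). The result is only meaningful
 * for pixels that already lie inside the box, which the caller's loop
 * bounds are expected to guarantee. */
static int on_border(const Box *b, int x, int y)
{
    return (y - b->y < b->thickness) || (b->y + b->h - 1 - y < b->thickness) ||
           (x - b->x < b->thickness) || (b->x + b->w - 1 - x < b->thickness);
}

/* Count how many pixels of the box the predicate marks as border. */
static int count_border_pixels(const Box *b)
{
    int n = 0;
    assert(b->w > 0 && b->h > 0 && b->thickness > 0);
    for (int y = b->y; y < b->y + b->h; y++)
        for (int x = b->x; x < b->x + b->w; x++)
            n += on_border(b, x, y);
    return n;
}
```

For a 50x50 box with thickness 3, the predicate accepts 50*50 - 44*44 = 564 pixels, while the loops still visit all 2500.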
The core code above also uses s->hsub and s->vsub. These are the chroma subsampling shifts (log2 of the horizontal and vertical subsampling factors), which depend on the pixel format. They can be obtained as follows, where pix_fmt is the pixel format of the frame:
const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pix_fmt);
s->hsub = desc->log2_chroma_w;
s->vsub = desc->log2_chroma_h;
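As an illustration of how these shifts are used, the sketch below maps a luma coordinate to the matching chroma sample without depending on libavutil. The helper names chroma_dim() and chroma_offset() are hypothetical. For reference, yuv420p has log2_chroma_w = 1 and log2_chroma_h = 1, yuv422p has 1 and 0, and yuv444p has 0 and 0.

```c
#include <assert.h>
#include <stddef.h>

/* Size of a chroma plane along one axis for a given luma size,
 * rounding up in the same way as FFmpeg's AV_CEIL_RSHIFT. */
static int chroma_dim(int luma_dim, int log2_sub)
{
    assert(luma_dim >= 0 && log2_sub >= 0);
    return (luma_dim + (1 << log2_sub) - 1) >> log2_sub;
}

/* Byte offset, inside a U or V plane, of the chroma sample that covers
 * luma pixel (x, y). This mirrors row[plane][x >> hsub] with the row
 * pointer taken at line (y >> vsub). */
static size_t chroma_offset(int x, int y, int hsub, int vsub, int linesize)
{
    assert(x >= 0 && y >= 0 && linesize > 0);
    return (size_t)(y >> vsub) * (size_t)linesize + (size_t)(x >> hsub);
}
```

With 4:2:0 subsampling, luma pixels (10, 10) and (11, 11) map to the same chroma sample, which is why a border drawn on odd coordinates can tint a neighboring pixel's color.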
====================================================================
The drawback of this approach is that it visits every pixel inside the rectangle, which is slow. After all, we only need to draw the border, not fill the whole rectangle, so it can be improved as follows:
for (y = FFMAX(yb, 0); y < FFMIN(yb + s->h, frame->height); y++) {
    row[0] = frame->data[0] + y * frame->linesize[0];
    for (plane = 1; plane < 3; plane++)
        row[plane] = frame->data[plane] +
                     frame->linesize[plane] * (y >> s->vsub);
    if ((y - yb < s->thickness) || (yb + s->h - 1 - y < s->thickness)) {
        /* top or bottom edge: paint the full width of the box */
        for (x = FFMAX(xb, 0); x < FFMIN(xb + s->w, frame->width); x++) {
            row[0][x           ] = s->dst_color[Y];
            row[1][x >> s->hsub] = s->dst_color[U];
            row[2][x >> s->hsub] = s->dst_color[V];
        }
    } else {
        /* left edge, clamped to the box and to the frame */
        for (x = FFMAX(xb, 0); x < FFMIN3(xb + s->thickness, xb + s->w, frame->width); x++) {
            row[0][x           ] = s->dst_color[Y];
            row[1][x >> s->hsub] = s->dst_color[U];
            row[2][x >> s->hsub] = s->dst_color[V];
        }
        /* right edge, clamped to the box and to the frame */
        for (x = FFMAX3(xb + s->w - s->thickness, xb, 0); x < FFMIN(xb + s->w, frame->width); x++) {
            row[0][x           ] = s->dst_color[Y];
            row[1][x >> s->hsub] = s->dst_color[U];
            row[2][x >> s->hsub] = s->dst_color[V];
        }
    }
}
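The principle is still to scan from top to bottom, but the y coordinate now decides how each row is handled: rows that fall inside the top or bottom edge are painted across the full width of the box, while every other row only has its left and right edges painted. The hollow interior of the box is skipped, so the work is proportional to the size of the border rather than to the area of the box, which makes drawing much faster.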
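One way to check that the faster loop paints exactly the same pixels is to run both variants on identical frames and compare the buffers. The following is a minimal sketch under stated assumptions: Frame and Box are hypothetical stand-ins for AVFrame and DrawBoxContext, the frame is assumed to be planar YUV 4:2:0, and the box color is an arbitrary YUV triple. None of these names come from FFmpeg.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MINI(a, b) ((a) < (b) ? (a) : (b))
#define MAXI(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical minimal planar YUV 4:2:0 frame; stands in for AVFrame. */
typedef struct Frame {
    uint8_t *data[3];
    int linesize[3];
    int width, height;
    int hsub, vsub;                 /* log2 chroma subsampling, both 1 here */
} Frame;

typedef struct Box { int x, y, w, h, thickness; } Box;

/* Paint luma pixels [x0, x1) of row y together with their chroma samples. */
static void fill_span(Frame *f, int y, int x0, int x1, const uint8_t yuv[3])
{
    for (int x = x0; x < x1; x++) {
        f->data[0][y * f->linesize[0] + x] = yuv[0];
        f->data[1][(y >> f->vsub) * f->linesize[1] + (x >> f->hsub)] = yuv[1];
        f->data[2][(y >> f->vsub) * f->linesize[2] + (x >> f->hsub)] = yuv[2];
    }
}

/* Reference version: visit every pixel of the box and test it. */
static void draw_full_scan(Frame *f, const Box *b, const uint8_t yuv[3])
{
    for (int y = MAXI(b->y, 0); y < MINI(b->y + b->h, f->height); y++)
        for (int x = MAXI(b->x, 0); x < MINI(b->x + b->w, f->width); x++)
            if (y - b->y < b->thickness || b->y + b->h - 1 - y < b->thickness ||
                x - b->x < b->thickness || b->x + b->w - 1 - x < b->thickness)
                fill_span(f, y, x, x + 1, yuv);
}

/* Border-only version: full spans on edge rows, two short spans elsewhere. */
static void draw_border_only(Frame *f, const Box *b, const uint8_t yuv[3])
{
    int x0 = MAXI(b->x, 0), x1 = MINI(b->x + b->w, f->width);
    for (int y = MAXI(b->y, 0); y < MINI(b->y + b->h, f->height); y++) {
        if (y - b->y < b->thickness || b->y + b->h - 1 - y < b->thickness) {
            fill_span(f, y, x0, x1, yuv);
        } else {
            fill_span(f, y, x0, MINI(b->x + b->thickness, x1), yuv);
            fill_span(f, y, MAXI(b->x + b->w - b->thickness, x0), x1, yuv);
        }
    }
}

/* Allocate a frame filled with a uniform background (Y=16, U=V=128). */
static void frame_init(Frame *f, int w, int h)
{
    int cw = (w + 1) >> 1, ch = (h + 1) >> 1;
    assert(w > 0 && h > 0);
    f->width = w; f->height = h; f->hsub = 1; f->vsub = 1;
    f->linesize[0] = w; f->linesize[1] = f->linesize[2] = cw;
    f->data[0] = malloc((size_t)w * h);
    f->data[1] = malloc((size_t)cw * ch);
    f->data[2] = malloc((size_t)cw * ch);
    assert(f->data[0] && f->data[1] && f->data[2]);
    memset(f->data[0], 16, (size_t)w * h);
    memset(f->data[1], 128, (size_t)cw * ch);
    memset(f->data[2], 128, (size_t)cw * ch);
}

/* Draw the same box with both routines. Returns the number of painted luma
 * pixels if the two frames match byte for byte, or -1 if they differ. */
static int compare_methods(int fw, int fh, Box b)
{
    static const uint8_t color[3] = { 81, 90, 240 };  /* arbitrary YUV triple */
    size_t ysz = (size_t)fw * fh;
    size_t csz = (size_t)((fw + 1) >> 1) * ((fh + 1) >> 1);
    Frame a, c;
    int painted = 0, same;

    frame_init(&a, fw, fh);
    frame_init(&c, fw, fh);
    draw_full_scan(&a, &b, color);
    draw_border_only(&c, &b, color);
    same = !memcmp(a.data[0], c.data[0], ysz) &&
           !memcmp(a.data[1], c.data[1], csz) &&
           !memcmp(a.data[2], c.data[2], csz);
    for (size_t i = 0; i < ysz; i++)
        painted += c.data[0][i] == color[0];
    for (int p = 0; p < 3; p++) { free(a.data[p]); free(c.data[p]); }
    return same ? painted : -1;
}
```

Calling compare_methods() with boxes that are fully inside the frame, clipped by its edges, or thicker than half their width exercises the clamping in the left and right spans. The chroma writes are kept per pixel here to mirror the loops above; a real implementation could write each chroma sample once per pair of luma pixels.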