FFmpeg source code analysis: Introduction to video filters (Part 2)

FFmpeg provides audio and video filters in the libavfilter module. All filters are registered in libavfilter/allfilters.c, and the ffmpeg -filters command lists all currently supported filters; in its output, a leading "V" flag marks a video filter. This article mainly introduces video filters, including: drawing text, edge detection, fade in and fade out, Gaussian blur, left-right mirroring, layer overlay, and video rotation.

For a detailed introduction to video filters, see the official document: Video Filters. For the first half of the introduction to video filters, you can view the previous article: Introduction to Video Filters (Part 1).

1、drawtext

Draw text on the video frame. The freetype third-party library must be enabled (--enable-freetype); for details, see the freetype official website. To configure the font family, size, and color, you also need the fontconfig third-party library (--enable-fontconfig); see the fontconfig official website. For complex text shaping (for example, bidirectional text), enable the libfribidi third-party library (--enable-libfribidi); see fribidi's GitHub page.

The parameter options are as follows:

  • box: whether to draw a background box around the text, 1 (on) or 0 (off), default off
  • boxborderw: width of the box border, default 0
  • boxcolor: color of the box, default white
  • line_spacing: line spacing of the text, default 0
  • basetime: start time, in microseconds
  • fontcolor: font color, default black
  • font: font family, default Sans
  • fontfile: font file; an absolute path is required
  • alpha: alpha value for blending the text, range [0.0, 1.0], default 1.0
  • fontsize: font size, default 16
  • shadowcolor: shadow color, default black
  • shadowx, shadowy: x/y offset of the text shadow relative to the text
  • timecode: timecode, in "hh:mm:ss[:;.]ff" format
  • text: the text string, must be UTF-8 encoded
  • textfile: a text file, must be UTF-8 encoded
  • main_h, h, H: input height
  • main_w, w, W: input width
  • n: the frame number, starting from 0
  • t: timestamp expression in seconds, NAN if unknown
  • text_h, th: rendered text height
  • text_w, tw: rendered text width
  • x, y: the coordinates in the frame where text rendering starts

A reference command that draws text with a given position and font color:

ffmpeg -i in.mp4 -vf drawtext="text='Hello world':x=10:y=20:fontcolor=red" watermark.mp4

The effect of adding a text watermark is shown in the figure below:

2、edgedetect

Edge detection, used to detect and draw edges with the Canny edge detection algorithm. For the principle of the algorithm, see the Wikipedia article on Canny edge detection. The steps of edge detection are as follows:

(1) Apply a Gaussian filter to smooth the image and remove noise
(2) Compute the gradient magnitude and orientation of the image
(3) Apply non-maximum suppression (gradient magnitude thresholding) to eliminate spurious responses
(4) Apply double thresholding to identify potential edges
(5) Track edges by hysteresis: suppress weak edges that are not connected to strong edges

The parameter options are as follows:

  • low, high: Canny detection thresholds, range [0, 1]; defaults are low = 20/255 and high = 50/255
  • mode: drawing mode, default 'wires'. The modes are: 'wires' draws white and gray lines on a black background; 'colormix' mixes colors, similar to a cartoon painting effect; 'canny' performs Canny detection on each plane
  • planes: which planes to filter, all enabled by default

2.1 Edge detection algorithm

The edge detection code is located in libavfilter/vf_edgedetect.c. The processing steps are Gaussian filtering, the Sobel operator, non-maximum suppression, double thresholding to find potential edges, and hysteresis edge tracking. The core code is as follows:

static int filter_frame(AVFilterLink *inlink, AVFrame *in)
{
    ......
    // get video frame data from the buffer
    out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
    
    for (p = 0; p < edgedetect->nb_planes; p++) {
        ......
        // Gaussian filter, image noise reduction processing
        gaussian_blur(ctx, width, height,
                      tmpbuf,      width,
                      in->data[p], in->linesize[p]);
 
        // sobel operator: calculate image gradient and direction
        sobel(width, height,
              gradients, width,
              directions,width,
              tmpbuf,    width);
 
        memset(tmpbuf, 0, width * height);
        // non-maximum suppression: remove spurious responses
        non_maximum_suppression(width, height,
                                tmpbuf,    width,
                                directions,width,
                                gradients, width);
 
        // apply high and low double thresholding to identify potential edges
        double_threshold(edgedetect->low_u8, edgedetect->high_u8,
                         width, height,
                         out->data[p], out->linesize[p],
                         tmpbuf,       width);
        // in colormix mode, blend the detected edges with the input plane
        if (edgedetect->mode == MODE_COLORMIX) {
            color_mix(width, height,
                      out->data[p], out->linesize[p],
                      in->data[p], in->linesize[p]);
        }
    }
 
    if (!direct)
        av_frame_free(&in);
    return ff_filter_frame(outlink, out);
}
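As a rough illustration of the double-thresholding step (a simplified sketch, not FFmpeg's double_threshold(), which resolves weak candidates differently), values above the high threshold become strong edges, values between the thresholds stay as weak candidates, and the rest are suppressed:

```python
def double_threshold(low, high, pixels):
    """Classify gradient magnitudes: strong edge (255), weak candidate
    (kept as-is, to be resolved by hysteresis), or suppressed (0).
    A conceptual sketch of the Canny step."""
    out = []
    for v in pixels:
        if v > high:
            out.append(255)   # strong edge
        elif v > low:
            out.append(v)     # weak candidate
        else:
            out.append(0)     # suppressed
    return out

print(double_threshold(20, 50, [10, 30, 200]))  # [0, 30, 255]
```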

2.2 Gaussian filtering

Since all edge detection results are susceptible to image noise, it is necessary to remove noise to avoid false detections. To smooth the image, a Gaussian filter kernel is convolved with the image. The 5x5 Gaussian filter code is as follows:

static void gaussian_blur(AVFilterContext *ctx, int w, int h,
                                uint8_t *dst, int dst_linesize,
                          const uint8_t *src, int src_linesize)
{
    int i, j;
    memcpy(dst, src, w); dst += dst_linesize; src += src_linesize;
    if (h > 1) {
        memcpy(dst, src, w); dst += dst_linesize; src += src_linesize;
    }
    for (j = 2; j < h - 2; j++) {
        dst[0] = src[0];
        if (w > 1)
            dst[1] = src[1];
        for (i = 2; i < w - 2; i++) {
            /* Gaussian mask of size 5x5 with sigma = 1.4 */
            dst[i] = ((src[-2*src_linesize + i-2] + src[2*src_linesize + i-2]) * 2
                    + (src[-2*src_linesize + i-1] + src[2*src_linesize + i-1]) * 4
                    + (src[-2*src_linesize + i ] + src[2*src_linesize + i ]) * 5
                    + (src[-2*src_linesize + i+1] + src[2*src_linesize + i+1]) * 4
                    + (src[-2*src_linesize + i+2] + src[2*src_linesize + i+2]) * 2
 
                    + (src[ -src_linesize + i-2] + src[ src_linesize + i-2]) * 4
                    + (src[ -src_linesize + i-1] + src[ src_linesize + i-1]) * 9
                    + (src[ -src_linesize + i ] + src[ src_linesize + i ]) * 12
                    + (src[ -src_linesize + i+1] + src[ src_linesize + i+1]) * 9
                    + (src[ -src_linesize + i+2] + src[ src_linesize + i+2]) * 4
 
                    + src[i-2] *  5
                    + src[i-1] * 12
                    + src[i  ] * 15
                    + src[i+1] * 12
                    + src[i+2] *  5) / 159;
        }
        if (w > 2)
            dst[i] = src[i];
        if (w > 3)
            dst[i + 1] = src[i + 1];
 
        dst += dst_linesize;
        src += src_linesize;
    }
    if (h > 2) {
        memcpy(dst, src, w); dst += dst_linesize; src += src_linesize;
    }
    if (h > 3)
        memcpy(dst, src, w);
}
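The magic constants above are the 5x5 Gaussian kernel (sigma ≈ 1.4), and the divisor 159 is simply the sum of all kernel weights, so a flat image passes through unchanged. A quick Python check of that normalization:

```python
# The 5x5 Gaussian kernel (sigma ~= 1.4) hard-coded in gaussian_blur()
KERNEL = [
    [2,  4,  5,  4, 2],
    [4,  9, 12,  9, 4],
    [5, 12, 15, 12, 5],
    [4,  9, 12,  9, 4],
    [2,  4,  5,  4, 2],
]

# The divisor 159 in the C code is the sum of all weights
total = sum(sum(row) for row in KERNEL)
print(total)  # 159
```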

2.3 Sobel operator

Edges in an image can point in many directions, so the Canny algorithm uses filters to detect horizontal, vertical, and diagonal edges in the blurred image. Here the Sobel operator is used to compute the gradient magnitude and direction of the image; the code is as follows:

static void sobel(int w, int h,
                       uint16_t *dst, int dst_linesize,
                         int8_t *dir, int dir_linesize,
                  const uint8_t *src, int src_linesize)
{
    int i, j;
 
    for (j = 1; j < h - 1; j++) {
        dst += dst_linesize;
        dir += dir_linesize;
        src += src_linesize;
        for (i = 1; i < w - 1; i++) {
            const int gx =
                -1*src[-src_linesize + i-1] + 1*src[-src_linesize + i+1]
                -2*src[                i-1] + 2*src[                i+1]
                -1*src[ src_linesize + i-1] + 1*src[ src_linesize + i+1];
            const int gy =
                -1*src[-src_linesize + i-1] + 1*src[ src_linesize + i-1]
                -2*src[-src_linesize + i ] + 2*src[ src_linesize + i ]
                -1*src[-src_linesize + i+1] + 1*src[ src_linesize + i+1];
 
            dst[i] = FFABS(gx) + FFABS(gy);
            dir[i] = get_rounded_direction(gx, gy);
        }
    }
}
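As a quick illustration (my own sketch, not FFmpeg's code), here is the same 3x3 Sobel computation applied at one pixel of a tiny image; like the filter, it uses |gx| + |gy| as the gradient magnitude:

```python
def sobel_at(img, x, y):
    """Compute the Sobel gradient (gx, gy) at (x, y) of a 2-D list,
    using the same kernels as vf_edgedetect.c; the magnitude is
    approximated as |gx| + |gy|."""
    gx = (-1*img[y-1][x-1] + 1*img[y-1][x+1]
          -2*img[y  ][x-1] + 2*img[y  ][x+1]
          -1*img[y+1][x-1] + 1*img[y+1][x+1])
    gy = (-1*img[y-1][x-1] + 1*img[y+1][x-1]
          -2*img[y-1][x  ] + 2*img[y+1][x  ]
          -1*img[y-1][x+1] + 1*img[y+1][x+1])
    return gx, gy, abs(gx) + abs(gy)

# A vertical step edge: dark on the left, bright on the right
img = [[0, 0, 255],
       [0, 0, 255],
       [0, 0, 255]]
gx, gy, mag = sobel_at(img, 1, 1)
print(gx, gy, mag)  # 1020 0 1020
```

The horizontal gradient gx responds strongly to the vertical edge while gy stays zero, as expected.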

The edge detection effect is shown in the figure below:

3、fade

Apply a fade-in or fade-out effect to the video. The parameter options are as follows:

  • type, t: effect type, "in" for fade-in, "out" for fade-out; the default is fade-in
  • start_frame, s: the frame at which the effect starts, default 0
  • nb_frames, n: the number of frames the effect lasts, default 25
  • alpha: if enabled, apply the effect only to the alpha channel; default off
  • start_time, st: the time at which the effect starts, default 0
  • duration, d: the duration of the effect
  • color, c: the color of the fade, default black

A frame-based reference command:

fade=t=in:s=0:n=30

A time-based reference command:

fade=t=in:st=0:d=5.0
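Conceptually, a frame-based fade scales each frame by a factor that ramps linearly from 0 to 1 (fade-in) or from 1 to 0 (fade-out). A rough sketch of that ramp, not FFmpeg's exact code:

```python
def fade_factor(n, start_frame=0, nb_frames=30, fade_in=True):
    """Linear fade factor for frame n: 0 means fully faded (e.g. black),
    1 means untouched. A conceptual sketch of fade=t=in:s=0:n=30."""
    if n < start_frame:
        ramp = 0.0
    elif n >= start_frame + nb_frames:
        ramp = 1.0
    else:
        ramp = (n - start_frame) / nb_frames
    return ramp if fade_in else 1.0 - ramp

print(fade_factor(0))   # 0.0
print(fade_factor(15))  # 0.5
print(fade_factor(40))  # 1.0
```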

4、gblur

Gaussian blur, which can also be used to create mosaic-like effects. The main idea of the algorithm is to take a weighted average over each pixel's neighborhood. For details, see the Wikipedia article on Gaussian blur. The parameter options are as follows:

  • sigma: horizontal sigma, the Gaussian standard deviation, default 0.5
  • steps: number of steps in the Gaussian approximation, default 1
  • planes: which planes to filter, default all planes
  • sigmaV: vertical sigma; if set to -1 it equals the horizontal sigma, default -1

4.1 Gaussian blur algorithm

The Gaussian blur code is located in libavfilter/vf_gblur.c; the key code is as follows:

static void gaussianiir2d(AVFilterContext *ctx, int plane)
{
    GBlurContext *s = ctx->priv;
    const int width = s->planewidth[plane];
    const int height = s->planeheight[plane];
    const int nb_threads = ff_filter_get_nb_threads(ctx);
    ThreadData td;
 
    if (s->sigma <= 0 || s->steps < 0)
        return;
 
    td.width = width;
    td.height = height;
    // filter horizontally
    ctx->internal->execute(ctx, filter_horizontally, &td, NULL, FFMIN(height, nb_threads));
    // filter vertically
    ctx->internal->execute(ctx, filter_vertically, &td, NULL, FFMIN(width, nb_threads));
    // processing after scaling
    ctx->internal->execute(ctx, filter_postscale, &td, NULL, FFMIN(width * height, nb_threads));
}
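The nu, boundaryscale, and postscale coefficients used by the passes below are derived from sigma at initialization, following the Alvarez-Mazorra recursive-filter scheme. The formulas here are my reconstruction of vf_gblur.c's set_params(), so treat the exact expressions as an assumption:

```python
import math

def gblur_params(sigma, steps):
    """Derive the IIR coefficients from sigma (Alvarez-Mazorra scheme);
    a reconstruction of set_params() in vf_gblur.c, not a verbatim copy."""
    lam = (sigma * sigma) / (2.0 * steps)
    nu = (1.0 + 2.0 * lam - math.sqrt(1.0 + 4.0 * lam)) / (2.0 * lam)
    boundaryscale = 1.0 / (1.0 - nu)       # compensates the boundary pixels
    postscale = (nu / lam) ** (2 * steps)  # applied once after all passes
    return nu, boundaryscale, postscale

nu, bscale, postscale = gblur_params(sigma=0.5, steps=1)
print(round(nu, 3))  # ~0.101
```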

4.2 Horizontal filtering

First, look at the horizontal filtering filter_horizontally() function:

static int filter_horizontally(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
{
    ......
    // Filter each slice strip
    s->horiz_slice(buffer + width * slice_start, width, slice_end - slice_start,
                   steps, nu, boundaryscale);
    emms_c();
    return 0;
}

And horiz_slice is a function pointer (assigned at initialization), pointing to the horiz_slice_c() function:

static void horiz_slice_c(float *buffer, int width, int height, int steps,
                          float nu, float bscale)
{
    int step, x, y;
    float *ptr;
    for (y = 0; y < height; y++) {
        for (step = 0; step < steps; step++) {
            ptr = buffer + width * y;
            ptr[0] *= bscale;
 
            // filter to the right
            for (x = 1; x < width; x++)
                ptr[x] += nu * ptr[x - 1];
            ptr[x = width - 1] *= bscale;
 
            // filter to the left
            for (; x > 0; x--)
                ptr[x - 1] += nu * ptr[x];
        }
    }
}
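The same causal/anti-causal recurrence can be sketched on a single row in Python. This is a direct transcription of horiz_slice_c(); the nu and bscale values are made up for illustration:

```python
def horiz_pass(row, steps, nu, bscale):
    """One horizontal IIR blur pass over a single row, mirroring
    horiz_slice_c(): a causal sweep to the right, then an
    anti-causal sweep back to the left."""
    row = list(row)
    for _ in range(steps):
        row[0] *= bscale
        for x in range(1, len(row)):          # filter to the right
            row[x] += nu * row[x - 1]
        row[-1] *= bscale
        for x in range(len(row) - 1, 0, -1):  # filter to the left
            row[x - 1] += nu * row[x]
    return row

out = horiz_pass([1.0, 1.0, 1.0, 1.0], steps=1, nu=0.5, bscale=0.6)
print([round(v, 4) for v in out])  # [1.7994, 2.3988, 2.1975, 1.095]
```

Each sweep is a first-order recursive filter, so one pass costs only a few operations per pixel regardless of sigma; that is the point of approximating the Gaussian this way.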

4.3 Vertical filtering

For vertical filtering, the filter_vertically() function is as follows:

static int filter_vertically(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
{
    ......
    // Filter one by one along the vertical direction (each step processes 8 columns)
    do_vertical_columns(buffer, width, height, slice_start, aligned_end,
                        steps, nu, boundaryscale, 8);
 
    // Filter one by one for unaligned columns
    do_vertical_columns(buffer, width, height, aligned_end, slice_end,
                        steps, nu, boundaryscale, 1);
    return 0;
}

The internally called do_vertical_columns() function is as follows:

static void do_vertical_columns(float *buffer, int width, int height,
                                int column_begin, int column_end, int steps,
                                float nu, float boundaryscale, int column_step)
{
    const int numpixels = width * height;
    int i, x, k, step;
    float *ptr;
    for (x = column_begin; x < column_end;) {
        for (step = 0; step < steps; step++) {
            ptr = buffer + x;
            for (k = 0; k < column_step; k++) {
                ptr[k] *= boundaryscale;
            }
            // filter down
            for (i = width; i < numpixels; i += width) {
                for (k = 0; k < column_step; k++) {
                    ptr[i + k] += nu * ptr[i - width + k];
                }
            }
            i = numpixels - width;
 
            for (k = 0; k < column_step; k++)
                ptr[i + k] *= boundaryscale;
 
            // filter up
            for (; i > 0; i -= width) {
                for (k = 0; k < column_step; k++)
                    ptr[i - width + k] += nu * ptr[i + k];
            }
        }
        x += column_step;
    }
}

4.4 Post-scaling processing

postscale_slice is also a function pointer, pointing to the postscale_c() function, which mainly performs two steps: scaling and clamping:

static void postscale_c(float *buffer, int length,
                        float postscale, float min, float max)
{
    for (int i = 0; i < length; i++) {
        // scale by the normalization factor
        buffer[i] *= postscale;
        // clamp to [min, max]
        buffer[i] = av_clipf(buffer[i], min, max);
    }
}

5、hflip

Horizontal flip: the video is mirrored left-to-right. Its counterpart is vflip, vertical flip.

The command to flip horizontally is as follows:

ffmpeg -i in.mp4 -vf "hflip" out.mp4
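Conceptually, hflip just reverses each row of pixels; a toy sketch (not FFmpeg's optimized code, which also handles multi-byte pixel formats and planes):

```python
def hflip(frame):
    """Mirror a frame left-right by reversing every row of pixels."""
    return [row[::-1] for row in frame]

frame = [[1, 2, 3],
         [4, 5, 6]]
print(hflip(frame))  # [[3, 2, 1], [6, 5, 4]]
```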

Horizontal flip is also known as left-right mirroring. The before and after mirroring is shown in the figure below:

6、hstack

Horizontal stacking: two videos are placed side by side in the horizontal direction. Its counterpart is vstack, which stacks them vertically.

The command for horizontal stacking is as follows:

ffmpeg -i one.mp4 -i two.mp4 -vf "hstack" out.mp4

7、rotate

Rotate the video by an arbitrary angle expressed in radians, clockwise or counterclockwise. The parameter options are as follows:

  • angle, a: use radian angle to indicate the angle to be rotated, the default is 0, if it is negative, it means counterclockwise rotation

  • out_w, ow: output video width, the default is the same as the input video, ie "iw"

  • out_h, oh: The height of the output video, the default is the same as the input video, ie "ih"

  • bilinear: Bilinear interpolation, enabled by default, 0 means off, 1 means on

  • fillcolor, c: fill color, default is black

  • n: the serial number of the input video frame

  • t: the time of the input video frame, in seconds

  • hsub, vsub: sub-sampling in the horizontal and vertical directions, for example, if the pixel format is "yuv422p", then hsub=2, vsub=1

  • in_w, iw, in_h, ih: the width and height of the input video

  • out_w, ow, out_h, oh: the width and height of the output video

  • rotw(a), roth(a): the minimum width and height of the rotated video
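The rotw(a)/roth(a) constants correspond to the smallest bounding box that fits the rotated input. A hedged sketch of that arithmetic (my own formulas, the standard axis-aligned bounding box of a rotated rectangle):

```python
import math

def rot_bbox(w, h, a):
    """Width and height of the axis-aligned bounding box of a
    w x h rectangle rotated by angle a (radians)."""
    rotw = abs(w * math.cos(a)) + abs(h * math.sin(a))
    roth = abs(w * math.sin(a)) + abs(h * math.cos(a))
    return rotw, roth

# Rotating 1920x1080 by 90 degrees swaps width and height
rotw, roth = rot_bbox(1920, 1080, math.pi / 2)
print(round(rotw), round(roth))  # 1080 1920
```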

Taking clockwise rotation 90° as an example, the effect before and after rotation is shown in the figure below:

8、xfade

Transition animation, applied when transitioning from one video to another. Note that the frame rate, pixel format, resolution, and time base of all input videos must be consistent.

Supported transitions include fade in/out, wipes and slides in all four directions, circle crop, rectangle crop, circle open/close, dissolve, blur, zoom, and so on; the default is fade. The full list is as follows:

  • ‘custom’

  • ‘fade’

  • ‘wipeleft’

  • ‘wiperight’

  • ‘wipeup’

  • ‘wipedown’

  • ‘slideleft’

  • ‘slideright’

  • ‘slideup’

  • ‘slidedown’

  • ‘circlecrop’

  • ‘rectcrop’

  • ‘distance’

  • ‘fadeblack’

  • ‘fadewhite’

  • ‘radial’

  • ‘smoothleft’

  • ‘smoothright’

  • ‘smoothup’

  • ‘smoothdown’

  • ‘circleopen’

  • ‘circleclose’

  • ‘vertopen’

  • ‘vertclose’

  • ‘horzopen’

  • ‘horzclose’

  • ‘dissolve’

  • ‘pixelize’

  • ‘diagtl’

  • ‘diagbl’

  • ‘diagbr’

  • ‘hlslice’

  • ‘hrslice’

  • ‘vuslice’

  • ‘vdslice’

  • ‘hblur’

  • ‘fadegrays’

  • ‘wipetl’

  • ‘wipetr’

  • ‘wipebl’

  • ‘wipebr’

  • ‘squeezeh’

  • ‘squeezev’

  • ‘zoomin’

The parameter options are as follows:

  • duration: the duration of the transition animation, range [0, 60] seconds, default 1
  • offset: the offset of the transition relative to the first video, in seconds, default 0
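Conceptually, the default fade transition blends the two inputs with a progress value p that runs from 0 to 1 over [offset, offset + duration]. A rough sketch of that blending, not the filter's actual code:

```python
def xfade_progress(t, offset=0.0, duration=1.0):
    """Crossfade progress at time t: 0 before the transition starts,
    1 after it ends, linear in between."""
    return min(1.0, max(0.0, (t - offset) / duration))

def fade_pixel(a, b, p):
    """Blend one pixel of the first input (a) and the second input (b)."""
    return a * (1.0 - p) + b * p

p = xfade_progress(2.5, offset=2.0, duration=1.0)
print(p, fade_pixel(0.0, 255.0, p))  # 0.5 127.5
```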

9、overlay

Video overlay: superimpose another layer on top of the video; it can be used for text watermarks, image watermarks, GIF watermarks, and so on. The parameter options are as follows:

  • x, y: set the xy coordinate point of the overlay layer

  • format: The pixel format of the output video, the default is yuv420, the complete list is as follows:

  • 'yuv420'

  • 'yuv420p10'

  • 'yuv422'

  • 'yuv422p10'

  • 'yuv444'

  • ‘rgb’

  • 'gbrp' (flat RGB)

  • 'auto' (automatic selection)

  • alpha: set the transparency format, straight or premultiplied, the default is straight

  • main_w, W, main_h, H: the width and height of the input video

  • overlay_w, w / overlay_h, h: the width and height of the overlay layer

  • n: the offset number of video frames, the default is 0

  • pos: the position of the input frame in the file

  • t: time stamp

The command to add image watermark is as follows:

ffmpeg -i in.mp4 -i logo.png -filter_complex overlay=10:20 out.mp4

To place the watermark in the upper-left, upper-right, lower-left, or lower-right corner, expressions such as the following can be used:

    private static String obtainOverlay(int offsetX, int offsetY, int location) {
        switch (location) {
            case 2: // upper right corner
                return "overlay='(main_w-overlay_w)-" + offsetX + ":" + offsetY + "'";
            case 3: // lower left corner
                return "overlay='" + offsetX + ":(main_h-overlay_h)-" + offsetY + "'";
            case 4: // bottom right corner
                return "overlay='(main_w-overlay_w)-" + offsetX + ":(main_h-overlay_h)-" + offsetY + "'";
            case 1: // upper left corner
            default:
                return "overlay=" + offsetX + ":" + offsetY;
        }
    }
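Since the helper above is Java, here is the same corner arithmetic worked through once in Python, evaluating the main_w/overlay_w expressions numerically for a concrete frame (the frame and overlay sizes are made up for illustration):

```python
def overlay_xy(main_w, main_h, ov_w, ov_h, off_x, off_y, corner):
    """Resolve the overlay's top-left coordinates for the four corners,
    mirroring the expressions built by obtainOverlay()."""
    if corner == "top_right":
        return main_w - ov_w - off_x, off_y
    if corner == "bottom_left":
        return off_x, main_h - ov_h - off_y
    if corner == "bottom_right":
        return main_w - ov_w - off_x, main_h - ov_h - off_y
    return off_x, off_y  # top_left (default)

print(overlay_xy(1920, 1080, 200, 100, 10, 20, "bottom_right"))  # (1710, 960)
```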

The effect of adding an image watermark is shown below (a logo used in homage to and remembrance of "Thor"):

The reference command for adding GIF animation watermark is as follows (-ignore_loop 0 means looping GIF):

ffmpeg -i in.mp4 -ignore_loop 0 -i in.gif -filter_complex overlay=10:20 out.mp4

Original link: FFmpeg source code analysis: video filter introduction (below)_ffmpeg shadowy_Xu Fuji 456's Blog-CSDN Blog
