Audio and video development - ffmpeg introduction - series one

Table of contents

1. Introduction
2. The basic composition of the FFmpeg framework
3. How the FFmpeg framework processes audio and video (basic concepts)
4. The difference between ffmpeg, ffplay and ffprobe
     4.1 ffmpeg is an application for transcoding
     4.2 ffplay is an application for playing
     4.3 ffprobe is an application for viewing file formats
5. Common file formats and encodings
     5.1 Common video formats and file formats
     5.2 Common audio encoding and transcoding formats
6. Compiling the FFmpeg build script
     6.1 ffmpeg core tools
     6.2 Convert Video
     6.3 Trim video
     6.4 Mute video
     6.5 Add a watermark to video
     6.6 Change video speed
     6.7 Add a mosaic to video
     6.8 Take a video screenshot
     6.9 Add a watermark to a picture
     6.10 Synthesize a video from images


  1. Introduction

FFmpeg can be downloaded from its official website.

The name FFmpeg stands for Fast Forward Moving Picture Experts Group (MPEG: Moving Picture Experts Group). Born in 2000, it is a free and open-source audio/video codec tool and development kit. It is powerful and versatile, and is used heavily by video sites and commercial software (such as YouTube and iTunes).

FFmpeg itself is a huge project containing many components and library files; the most commonly used part is its command-line tools. FFmpeg is not only an audio/video codec tool but also a set of development libraries that give developers a rich API for audio and video processing. It can mux and demux a wide range of media formats, and supports many audio/video codecs, streaming protocols, color-format conversions, sample-rate conversions, bitrate conversions, and so on; the framework is organized as pluggable modules, including muxing/demuxing modules and encoding/decoding modules.

In short, FFmpeg is a very comprehensive multimedia processing suite.

  2. The basic composition of the FFmpeg framework

    The role of each library is described below; a command for checking which of them your own build actually links follows the list.

    libavcodec: the codec library. It has built-in support for media formats such as MPEG-4, AAC and MJPEG, and can also use third-party codecs: H.264 (AVC) encoding requires the x264 encoder, H.265 (HEVC) encoding requires the x265 encoder, and MP3 encoding requires the libmp3lame encoder. If you want to add your own format, or hardware encoding and decoding, you need to add the corresponding codec module in AVCodec.

    libavformat: muxing and demuxing (encapsulation and parsing) of audio/video container formats and the protocols used to transport them. File container formats: MP4, FLV, MKV, TS, etc.; streaming protocols: RTMP, RTSP, MMS, HLS, etc.

    libavutil: a library of common utility functions shared by the other components.

    libavfilter: Audio and video filter library, such as video watermarking, audio voice change, etc.

    libavdevice: supports the input and output of many device data, such as reading camera data and screen recording.

    libswresample, libavresample: Provides audio resampling tool libraries.

    libswscale: Provides color conversion, scaling, and pixel format conversion for video images, such as YUV conversion of images.

    libpostproc: Multimedia post processor.
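As a quick check of which of these libraries (and which versions) a given ffmpeg binary actually links, the version banner lists every lib* component along with the configure options used to build it:

./ffmpeg -version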

  3. How the FFmpeg framework processes audio and video

    Basic concepts:

    Container: a container is a file format, such as FLV or MKV. It holds the file header information plus the streams described below.

    Stream: a continuous sequence of media data inside the container. There are five kinds of streams: audio, video, subtitles, attachments, and data.

    Frame: a frame represents a single still image of the video; frames are divided into I-frames, P-frames, and B-frames.

    Codec: a codec compresses or decompresses audio/video data; CODEC = COde (encode) + DECode (decode).

    Muxing/demuxing (mux/demux): putting different streams into a container according to that container's rules is called muxing; parsing the individual streams back out of a container is called demuxing. Whether FFmpeg supports a given container format depends on whether that format's muxing library was included at compile time. You can extend the supported formats as needed by adding your own customized muxing/demuxing module in AVFormat.
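You can check from the command line which muxers and demuxers were actually compiled into your build; if a container format is missing from these lists, it was not enabled at configure time:

./ffmpeg -muxers      # container formats this build can write (mux)
./ffmpeg -demuxers    # container formats this build can read (demux)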

  4. The difference between ffmpeg, ffplay and ffprobe

     4.1 ffmpeg is an application for transcoding
     4.2 ffplay is an application for playing
     4.3 ffprobe is an application for viewing file formats
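As a quick, minimal illustration of the division of labour between the three tools (test.mp4 is just a placeholder file name):

./ffmpeg -i test.mp4 output.flv     # transcode/remux the file into another format
./ffplay test.mp4                   # play the file in a window
./ffprobe test.mp4                  # print container and stream information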

  5. Common file formats and encodings

  5.1 Common video formats and file formats

  5.2 Common audio encoding and transcoding formats

  • MP4 package: H264 video encoding + AAC audio encoding (relatively mature)

  • WebM package: VP8 video encoding + Vorbis audio encoding (Google solution)

  • OGG package: Theora video encoding + Vorbis audio encoding (open source)
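For example, converting an existing MP4 into the WebM combination above can be done in one command, assuming the build was configured with the libvpx and libvorbis encoders (input.mp4 and output.webm are placeholder names):

./ffmpeg -i input.mp4 -c:v libvpx -c:a libvorbis output.webm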

6. Compiling the FFmpeg build script

#!/bin/bash
# Change the path below to your own NDK directory
TOOLCHAIN=/Users/lh/Library/Android/sdk/ndk/21.4.7075529/toolchains/llvm/prebuilt/darwin-x86_64
# Minimum supported Android SDK (API) level
API=21

function build_android
{
echo "Compiling FFmpeg for $CPU"
./configure \
 --prefix=$PREFIX \
 --disable-shared \
 --enable-static \
 --disable-avdevice \
 --enable-small \
 --disable-muxers \
 --disable-filters \
 --enable-gpl \
 --cross-prefix=$CROSS_PREFIX \
 --target-os=android \
 --arch=$ARCH \
 --cpu=$CPU \
 --cc=$CC \
 --cxx=$CXX \
 --enable-cross-compile \
 --sysroot=$SYSROOT \
 --extra-cflags="-mno-stackrealign -Os $OPTIMIZE_CFLAGS -fPIC" \
 --extra-ldflags="$ADDI_LDFLAGS" \
 $ADDITIONAL_CONFIGURE_FLAG
make clean
make -j16
make install
echo "The Compilation of FFmpeg for $CPU is completed"
}

# armv8-a
ARCH=arm64
CPU=armv8-a
# In the r21 NDK, all compilers (clang) live under the /toolchains/llvm/prebuilt/darwin-x86_64/ directory
CC=$TOOLCHAIN/bin/aarch64-linux-android$API-clang
CXX=$TOOLCHAIN/bin/aarch64-linux-android$API-clang++
# NDK sysroot (header files)
SYSROOT=$TOOLCHAIN/sysroot
CROSS_PREFIX=$TOOLCHAIN/bin/aarch64-linux-android-
# Output path for the .so files
PREFIX=$(pwd)/android/$CPU
OPTIMIZE_CFLAGS="-march=$CPU"
build_android

# Cross-compile toolchain prefixes; the mapping is as follows
# armv8a -> arm64 -> aarch64-linux-android-
# armv7a -> arm -> arm-linux-androideabi-
# x86 -> x86 -> i686-linux-android-
# x86_64 -> x86_64 -> x86_64-linux-android-

# CPU architecture
# armv7-a
ARCH=arm
CPU=armv7-a
CC=$TOOLCHAIN/bin/armv7a-linux-androideabi$API-clang
CXX=$TOOLCHAIN/bin/armv7a-linux-androideabi$API-clang++
SYSROOT=$TOOLCHAIN/sysroot
CROSS_PREFIX=$TOOLCHAIN/bin/arm-linux-androideabi-
PREFIX=$(pwd)/android/$CPU
OPTIMIZE_CFLAGS="-mfloat-abi=softfp -mfpu=vfp -marm -march=$CPU "
build_android
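The script above only builds armv8-a and armv7-a. If you also need the x86/x86_64 libraries from the mapping in the comments, the same build_android function can be reused; the block below is a sketch that assumes the same r21 NDK layout (depending on the FFmpeg version, the x86 targets may additionally need nasm installed or --disable-asm):

# x86_64
ARCH=x86_64
CPU=x86-64
CC=$TOOLCHAIN/bin/x86_64-linux-android$API-clang
CXX=$TOOLCHAIN/bin/x86_64-linux-android$API-clang++
SYSROOT=$TOOLCHAIN/sysroot
CROSS_PREFIX=$TOOLCHAIN/bin/x86_64-linux-android-
# note: the output directory becomes android/x86-64 because PREFIX is derived from $CPU
PREFIX=$(pwd)/android/$CPU
OPTIMIZE_CFLAGS="-march=$CPU"
build_android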

 

Run the script: ./buildsh.sh

The build artifacts after a successful compilation:

6.1 ffmpeg core tools

ffmpeg provides the following three command-line tools:

|____ffmpeg    # audio/video encoding, decoding, transcoding, etc.
|____ffplay    # plays audio/video files, streaming media, etc.
|____ffprobe   # inspects container format, audio/video codec details and other information

# ffmpeg [global options] [[input file options] -i input_file]... {[output file options] output_file}...
$ ffmpeg [global_options] {[input_file_options] -i input_url} ... {[output_file_options] output_url} ...

Get video information:

./ffmpeg -i /Users/lh/Downloads/test.mp4 

This output tells us the following about the file.

Metadata information:

The major_brand field shows that the container format is mp42 (a sub-brand of MP4), the file was created at 2023-07-21T03:32:06.000000Z, the duration is 00:00:07.86 (7.86 seconds), playback starts at 0.000300 seconds, and the overall bitrate of the file is 1457 kb/s.

Information about the first (video) stream:

Before reading this part of the output, a few terms related to the time base need to be defined:

tbr is an estimate of the frame rate; it is usually the same as fps.
tbn is the time base of the stream in the container; for example, a TS stream has a time base of 90000 and an FLV video stream has a time base of 1000.
tbc is the time base of the video codec; for an H.264 stream it is obtained indirectly by parsing the SPS (the frame rate comes from the SPS).

This part of the output shows that the first stream of the file is a video stream, encoded as H.264 and packaged with the avc1 tag, the pixel format is yuv420p, the resolution is 480x640, and the bitrate is 1450 kb/s.
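If you only want a few of these fields without the full banner, ffprobe can print the frame rate and time base of the video stream directly (same test file as above):

./ffprobe -v error -select_streams v:0 -show_entries stream=codec_name,width,height,r_frame_rate,time_base -of default=noprint_wrappers=1 /Users/lh/Downloads/test.mp4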

6.2 Convert Video

Convert video in mp4 format to flv format

./ffmpeg -i /Users/lh/Downloads/test.mp4  /Users/lh/Downloads/aaa.flv

The console log of the conversion is shown below.
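One additional note: the command above re-encodes the streams by default. When the target container supports the source codecs (for example MP4 to MKV), you can remux without re-encoding, which is much faster and lossless:

./ffmpeg -i /Users/lh/Downloads/test.mp4 -c copy /Users/lh/Downloads/test.mkv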

6.3 Trim video

./ffmpeg -ss 00:00:03 -i /Users/lh/Downloads/test.mp4 -vcodec copy -acodec copy -t 00:00:06 /Users/lh/Downloads/output.mp4

This cuts test.mp4 starting from the 3rd second (-ss 00:00:03) for a duration of 6 seconds (-t 00:00:06); -vcodec copy -acodec copy copies the streams without re-encoding. The cutting process is shown below.
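Because -vcodec copy can only cut on keyframes, the actual cut point may be slightly earlier than requested. For a frame-accurate cut you can re-encode instead; a sketch assuming the libx264 and aac encoders are available (output_accurate.mp4 is just an example name):

./ffmpeg -ss 00:00:03 -i /Users/lh/Downloads/test.mp4 -t 00:00:06 -c:v libx264 -c:a aac /Users/lh/Downloads/output_accurate.mp4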

6.4 Mute video

./ffmpeg -i /Users/lh/Downloads/210710171112971120.mp4 -af "volume=enable='between(t,5,10)':volume=0" /Users/lh/Downloads/output.mp4 

Description: this command mutes 210710171112971120.mp4 over the specified time range and generates a new output.mp4. volume=enable='between(t,5,10)':volume=0 mutes the audio from the 5th to the 10th second; several of these can be written, separated by commas, to mute multiple ranges.
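If you want to drop the audio track entirely rather than mute a time range, the -an option does that while copying the video stream untouched (output_noaudio.mp4 is just an example name):

./ffmpeg -i /Users/lh/Downloads/210710171112971120.mp4 -an -vcodec copy /Users/lh/Downloads/output_noaudio.mp4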

6.5 Add a watermark to video

./ffmpeg -i /Users/lh/Downloads/210710171112971120.mp4 -vf "movie=/Users/lh/Downloads/shuiyin.jpeg,colorchannelmixer=aa=0.4,scale=300:300 [watermark]; [in][watermark] overlay" /Users/lh/Downloads/output.mp4

Explanation:

This command overlays a watermark on the input video according to the given filter graph and generates a new output.mp4.

movie=... loads the watermark image (shuiyin.jpeg in the command above);

colorchannelmixer=aa=0.4 sets the watermark transparency (remove this part if you do not need to change the transparency);

scale=300:300 sets the size of the watermark (remove this part to keep the original watermark size);

overlay sets the position of the watermark; the default is the upper-left corner.

        overlay=W-w       upper-right corner

        overlay=0:H-h     lower-left corner

        overlay=W-w:H-h   lower-right corner

PS: if you do not want the watermark flush against the edge, just adjust the W-w / H-h expressions slightly (for example, subtract a small margin).
 

For example, to place the watermark in the lower-right corner:

./ffmpeg -i /Users/lh/Downloads/210710171112971120.mp4 -vf "movie=/Users/lh/Downloads/shuiyin.jpeg,colorchannelmixer=aa=0.4,scale=300:300 [watermark]; [in][watermark] overlay=W-w:H-h" /Users/lh/Downloads/output.mp4

Effect picture below
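To center the watermark instead, the overlay position can be computed from the same W/H/w/h variables; for example, with the same inputs:

./ffmpeg -i /Users/lh/Downloads/210710171112971120.mp4 -vf "movie=/Users/lh/Downloads/shuiyin.jpeg,colorchannelmixer=aa=0.4,scale=300:300 [watermark]; [in][watermark] overlay=(W-w)/2:(H-h)/2" /Users/lh/Downloads/output.mp4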

6.6 Change video speed

./ffmpeg -i /Users/lh/Downloads/210710171112971120.mp4 -filter_complex "[0:v]setpts=0.5*PTS[v];[0:a]atempo=2.0[a]" -map "[v]" -map "[a]" /Users/lh/Downloads/output.mp4

Explanation:

This command speeds up 210710171112971120.mp4 by the specified factor and generates a new output.mp4.

setpts=0.5*PTS speeds up the video (the default factor is 1; 0.5 halves every presentation timestamp, i.e. 2x speed).

atempo=2.0 speeds up the audio (the default is 1.0; 2.0 means 2x speed).

PS: the video and audio speed factors must correspond, otherwise the sound and picture will drift out of sync.

This is effectively the same as the fast-forward playback speeds, such as x1.2, x1.5 and x2, that many video sites offer.
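The same filters also work in the other direction. To slow the clip down to half speed, double the video timestamps and halve the audio tempo (atempo traditionally accepts factors between 0.5 and 2.0 per instance; chain several atempo filters for larger changes). output_slow.mp4 is just an example name:

./ffmpeg -i /Users/lh/Downloads/210710171112971120.mp4 -filter_complex "[0:v]setpts=2.0*PTS[v];[0:a]atempo=0.5[a]" -map "[v]" -map "[a]" /Users/lh/Downloads/output_slow.mp4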

6.7 Add a mosaic to video

If you need to add a mosaic to a video or picture, you can use the boxblur filter, which blurs a specified area to achieve a mosaic-like effect. Here is a simple example:

./ffmpeg -i /Users/lh/Downloads/210710171112971120.mp4 -filter_complex "[0:v]boxblur=10[blur];[blur]crop=200:200:300:300,boxblur=10[cropped];[0:v][cropped]overlay=300:300" /Users/lh/Downloads/output.mp4 

Explanation:

-i 210710171112971120.mp4 specifies the input file. [0:v]boxblur=10[blur] blurs the whole video frame with a blur radius of 10 pixels and labels the result blur. [blur]crop=200:200:300:300,boxblur=10[cropped] crops a 200x200 region starting at coordinates (300, 300) out of the blurred frame, blurs it again, and labels it cropped. Finally, the overlay filter places the cropped mosaic patch back onto the original video at (300, 300), generating the new video file output.mp4.

If you need to adjust the size, position or shape of the mosaic, change the corresponding filter parameters.

If a watermark or mosaic in a video cannot be removed with software tools, you can try using FFmpeg or similar tools to overlay another layer on top of the video to cover those areas.
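Related to this, FFmpeg also ships a delogo filter that tries to interpolate a rectangular region away instead of covering it; a minimal sketch, where the coordinates are placeholders for wherever the logo actually sits:

./ffmpeg -i /Users/lh/Downloads/210710171112971120.mp4 -vf "delogo=x=20:y=20:w=120:h=60" /Users/lh/Downloads/output_delogo.mp4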

The encoded result is shown below.
 

6.8 Take a video screenshot

./ffmpeg -i /Users/lh/Downloads/210710171112971120.mp4 -y -f mjpeg -ss 30 -t 1  /Users/lh/Downloads/test1.jpg

Explanation:

-f mjpeg forces the output format to mjpeg;

-ss 30 starts capturing from the 30th second;

-t 1 limits the capture to one second of frames (to grab exactly one frame you can use -frames:v 1 instead).
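If you want a series of thumbnails rather than a single shot, the fps filter can sample the video periodically; for example, one frame every 5 seconds (test_%03d.jpg is just an example output pattern):

./ffmpeg -i /Users/lh/Downloads/210710171112971120.mp4 -vf fps=1/5 /Users/lh/Downloads/test_%03d.jpg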

The effect is as shown below

6.9 Add a watermark to a picture

./ffmpeg -i /Users/lh/Downloads/test1.jpg -i /Users/lh/Downloads/shuiyin.jpeg -filter_complex "overlay=W-w-10:H-h-10:alpha=0.5" /Users/lh/Downloads/output.jpg

Explanation:

W and H are the width and height of the base picture, while w and h are the width and height of the watermark image; the -10 offsets keep a 10-pixel margin from the right and bottom edges. alpha=0.5 sets the transparency of the watermark to 0.5.

The effect is as shown below 

6.10 Synthesize a video from images

./ffmpeg -i /Users/lh/Downloads/imgs/img_%1d.jpeg /Users/lh/Downloads/out.mp4

This merges the 6 pictures under the /Users/lh/Downloads/imgs/ directory (matching the img_%1d.jpeg pattern) into one video.
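By default the image2 demuxer reads still images at 25 fps, so 6 pictures make a very short clip. To control how long each picture is shown, set -framerate on the input and re-encode at a normal output rate; a sketch assuming the same images and the libx264 encoder (out_slideshow.mp4 is just an example name):

./ffmpeg -framerate 1 -i /Users/lh/Downloads/imgs/img_%1d.jpeg -c:v libx264 -pix_fmt yuv420p -r 25 /Users/lh/Downloads/out_slideshow.mp4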

 Output result:

6.11 Add subtitles to video

First create the subtitle file

cat winter.srt

1
00:00:01,000 --> 00:00:02,000
Hello everyone, I am the developer testing ffmpeg; this is the first subtitle

2
00:00:02,000 --> 00:00:05,000
This time I would like to share how to make subtitles with ffmpeg

3
00:00:05,000 --> 00:00:10,000
This time I would like to share how to make subtitles with ffmpeg

4
00:00:10,000 --> 00:00:20,000
This time I would like to share how to make subtitles with ffmpeg

./ffmpeg -i /Users/lh/Downloads/210710171112971120.mp4 -lavfi "subtitles=/Users/lh/Downloads/zimu.srt:force_style='Alignment=2,MarginV=5'" -y /Users/lh/Downloads/output.mp4
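The command above burns the subtitles into the picture (hard subtitles). If you would rather keep them as a separate, switchable track (soft subtitles), you can mux the SRT file into the MP4 as a mov_text stream instead (output_softsub.mp4 is just an example name):

./ffmpeg -i /Users/lh/Downloads/210710171112971120.mp4 -i /Users/lh/Downloads/zimu.srt -c copy -c:s mov_text /Users/lh/Downloads/output_softsub.mp4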

The effect is as follows

6.12 ffplay plays an online video and sets the window title to "http stream"

./ffplay -window_title "http stream" http://vfx.mtime.cn/Video/2021/07/10/mp4/210710171112971120.mp4

The effect is as follows

6.13 ffplay plays network video and forces the decoder

./ffplay -vcodec h264 -window_title  "http stream" http://vfx.mtime.cn/Video/2021/07/10/mp4/210710171112971120.mp4

Force the decoder to be h264

The effect is as follows

6.14 ffplay plays network video and rotates the video

./ffplay   -window_title  "http stream" http://vfx.mtime.cn/Video/2021/07/10/mp4/210710171112971120.mp4 -vf transpose=1 

6.15 ffplay plays network video and changes only the audio speed

 ./ffplay   -window_title  "http stream" http://vfx.mtime.cn/Video/2021/07/10/mp4/210710171112971120.mp4  -af atempo=2

6.16 ffplay plays network video and changes only the video speed

./ffplay   -window_title  "http stream" http://vfx.mtime.cn/Video/2021/07/10/mp4/210710171112971120.mp4   -vf setpts=PTS/2

6.17 ffplay plays online video and changes the audio and video speed at the same time

./ffplay   -window_title  "http stream" http://vfx.mtime.cn/Video/2021/07/10/mp4/210710171112971120.mp4   -vf setpts=PTS/2 -af atempo=2

The operations above are what is usually called variable-speed playback.
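If what you actually want is to seek, i.e. start playback from a given position rather than change its speed, ffplay's -ss option does that; for example, to start 30 seconds in:

./ffplay -ss 30 -window_title "http stream" http://vfx.mtime.cn/Video/2021/07/10/mp4/210710171112971120.mp4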

6.18 ffprobe displays the information of each stream in JSON format

./ffprobe -print_format json -show_streams ~/Downloads/out.mp4

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/lh/Downloads/out.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf60.10.100
  Duration: 00:00:00.12, start: 0.000000, bitrate: 12170 kb/s
  Stream #0:0[0x1](und): Video: h264 (High) (avc1 / 0x31637661), yuvj420p(pc, bt470bg/unknown/unknown, progressive), 1080x1080 [SAR 1:1 DAR 1:1], 12110 kb/s, 25 fps, 25 tbr, 12800 tbn (default)
    Metadata:
      handler_name    : VideoHandler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc60.22.100 libx264
    "streams": [
        {
            "index": 0,//多媒体的stream索引;
            "codec_name": "h264",
            "codec_long_name": "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10",
            "profile": "High",
            "codec_type": "video",  //多媒体类型,例如视频包,音频包等
            "codec_tag_string": "avc1",
            "codec_tag": "0x31637661",
            "width": 1080,
            "height": 1080,
            "coded_width": 1080,
            "coded_height": 1080,
            "closed_captions": 0,
            "film_grain": 0,
            "has_b_frames": 2,
            "sample_aspect_ratio": "1:1",
            "display_aspect_ratio": "1:1",
            "pix_fmt": "yuvj420p",
            "level": 32,
            "color_range": "pc",
            "color_space": "bt470bg",
            "chroma_location": "center",
            "field_order": "progressive",
            "refs": 1,
            "is_avc": "true",
            "nal_length_size": "4",
            "id": "0x1",
            "r_frame_rate": "25/1",
            "avg_frame_rate": "25/1",
            "time_base": "1/12800",
            "start_pts": 0,
            "start_time": "0.000000",
            "duration_ts": 1536,
            "duration": "0.120000",
            "bit_rate": "12110800",
            "bits_per_raw_sample": "8",
            "nb_frames": "3",
            "extradata_size": 53,
            "disposition": {
                "default": 1,
                "dub": 0,
                "original": 0,
                "comment": 0,
                "lyrics": 0,
                "karaoke": 0,
                "forced": 0,
                "hearing_impaired": 0,
                "visual_impaired": 0,
                "clean_effects": 0,
                "attached_pic": 0,
                "timed_thumbnails": 0,
                "captions": 0,
                "descriptions": 0,
                "metadata": 0,
                "dependent": 0,
                "still_image": 0
            },
            "tags": {
                "language": "und",
                "handler_name": "VideoHandler",
                "vendor_id": "[0][0][0][0]",
                "encoder": "Lavc60.22.100 libx264"
            }
        }
    ]
}

6.19 ffprobe displays frame information in JSON format

./ffprobe -print_format json -show_frames ~/Downloads/out.mp4

 "frames": [
        {
            "media_type": "video",
            "stream_index": 0,
            "key_frame": 1,
            "pts": 0,
            "pts_time": "0.000000",
            "pkt_dts": 0,
            "pkt_dts_time": "0.000000",
            "best_effort_timestamp": 0,
            "best_effort_timestamp_time": "0.000000",
            "pkt_duration": 512,
            "pkt_duration_time": "0.040000",
            "duration": 512,
            "duration_time": "0.040000",
            "pkt_pos": "48",
            "pkt_size": "112313",
            "width": 1080,
            "height": 1080,
            "crop_top": 0,
            "crop_bottom": 0,
            "crop_left": 0,
            "crop_right": 0,
            "pix_fmt": "yuvj420p",
            "sample_aspect_ratio": "1:1",
            "pict_type": "I",
            "coded_picture_number": 0,
            "display_picture_number": 0,
            "interlaced_frame": 0,
            "top_field_first": 0,
            "repeat_pict": 0,
            "color_range": "pc",
            "color_space": "bt470bg",
            "chroma_location": "center",
            "side_data_list": [
                {
                    "side_data_type": "H.26[45] User Data Unregistered SEI message"
                }
            ]
        },
        {
            "media_type": "video",
            "stream_index": 0,
            "key_frame": 0,
            "pts": 512,
            "pts_time": "0.040000",
            "best_effort_timestamp": 512,
            "best_effort_timestamp_time": "0.040000",
            "pkt_duration": 512,
            "pkt_duration_time": "0.040000",
            "duration": 512,
            "duration_time": "0.040000",
            "pkt_pos": "112361",
            "pkt_size": "35468",
            "width": 1080,
            "height": 1080,
            "crop_top": 0,
            "crop_bottom": 0,
            "crop_left": 0,
            "crop_right": 0,
            "pix_fmt": "yuvj420p",
            "sample_aspect_ratio": "1:1",
            "pict_type": "P",
            "coded_picture_number": 1,
            "display_picture_number": 0,
            "interlaced_frame": 0,
            "top_field_first": 0,
            "repeat_pict": 0,
            "color_range": "pc",
            "color_space": "bt470bg",
            "chroma_location": "center"
        },
        {
            "media_type": "video",
            "stream_index": 0,
            "key_frame": 0,
            "pts": 1024,
            "pts_time": "0.080000",
            "best_effort_timestamp": 1024,
            "best_effort_timestamp_time": "0.080000",
            "pkt_duration": 512,
            "pkt_duration_time": "0.040000",
            "duration": 512,
            "duration_time": "0.040000",
            "pkt_pos": "147829",
            "pkt_size": "33881",
            "width": 1080,
            "height": 1080,
            "crop_top": 0,
            "crop_bottom": 0,
            "crop_left": 0,
            "crop_right": 0,
            "pix_fmt": "yuvj420p",
            "sample_aspect_ratio": "1:1",
            "pict_type": "P",
            "coded_picture_number": 2,
            "display_picture_number": 0,
            "interlaced_frame": 0,
            "top_field_first": 0,
            "repeat_pict": 0,
            "color_range": "pc",
            "color_space": "bt470bg",
            "chroma_location": "center"
        }
    ]
}
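When only a couple of fields are needed rather than the full dump, -show_entries together with -select_streams keeps the output small; for example, counting the decoded frames of the video stream (-count_frames makes ffprobe actually decode them):

./ffprobe -v error -select_streams v:0 -count_frames -show_entries stream=nb_read_frames -of default=nokey=1:noprint_wrappers=1 ~/Downloads/out.mp4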


Source: blog.csdn.net/qq_18757557/article/details/131826726