FFmpeg old and new API encoding

Background

The live broadcast SDK has used FFmpeg 2.8 since the beginning, while the latest FFmpeg release is now 4.4 and the player and editor already use FFmpeg 4.0. The internal structure of the newer FFmpeg versions has been reworked, and both efficiency and stability have improved considerably over the old version, so the FFmpeg used by the live broadcast SDK must be upgraded as well.

Introduction

FFmpeg is used in three main parts of the live broadcast SDK:

  1. Use libavcodec to encode Audio;
  2. Use libavcodec to encode Video;
  3. Use libavformat to mux and push the stream.

I will first explain the process of encoding Audio and Video with the old libavcodec API, and then the process with the new API. By comparing the two, you can easily see where the upgrade points of the live broadcast SDK's FFmpeg are. Here we only compare the update points of our live broadcast SDK by analyzing the use of the old and new APIs; at the same time, you can also learn how to encode Audio and Video and mux through both the old and new FFmpeg APIs. Due to limited space, the internal source code logic of each API will not be explained in detail here; you can explore it yourself if you are interested.

libavcodec old API encoding Audio

A plain text description can feel rather dry, so the figure below shows how to encode Audio through the old API of the FFmpeg libavcodec module. The role and function of each API will be introduced in detail afterwards.

[Flow chart: encoding Audio with the old libavcodec API]

Encoder registration

av_register_all() can also be replaced by avcodec_register_all(); looking at the source code, you can see that av_register_all() internally calls avcodec_register_all(). Its role is to register all codecs.
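
As a minimal sketch, this is what the registration call looks like with the old API (note that in FFmpeg 4.x these registration functions are deprecated and registration is no longer required):

// Register all muxers, demuxers and codecs (old API; not needed in FFmpeg 4.x and later).
av_register_all();
// Registering only the codecs is also possible:
// avcodec_register_all();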

find encoder

After the encoders are registered, we can obtain the encoder we want through avcodec_find_encoder_by_name() or avcodec_find_encoder(), for example avcodec_find_encoder(AV_CODEC_ID_AAC). If we want to use the libfdk-aac encoder to encode audio, it must be linked in when FFmpeg is cross-compiled; otherwise the default AAC encoder built into FFmpeg is used when obtaining the encoder.
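
For illustration, here is a minimal sketch that first tries libfdk-aac and then falls back to the built-in AAC encoder; the fallback logic is an assumption for this example, not necessarily what the SDK does:

// Prefer libfdk_aac (only available if FFmpeg was built with --enable-libfdk-aac)
AVCodec *codec = avcodec_find_encoder_by_name("libfdk_aac");
if (codec == NULL) {
    // Fall back to FFmpeg's built-in AAC encoder
    codec = avcodec_find_encoder(AV_CODEC_ID_AAC);
}
if (codec == NULL) {
    return -1;
}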

Create AVCodecContext

After the encoder is obtained, an AVCodecContext needs to be created from it and the encoding parameters initialized: sample rate, number of channels, sample format, and so on.

AVCodecContext *avCodecContext = avcodec_alloc_context3(codec);
avCodecContext->codec_type = AVMEDIA_TYPE_AUDIO;
avCodecContext->sample_rate = 44100;
avCodecContext->bit_rate = 64000;
avCodecContext->sample_fmt = AV_SAMPLE_FMT_S16;
avCodecContext->channel_layout = AV_CH_LAYOUT_STEREO;
avCodecContext->channels = av_get_channel_layout_nb_channels(avCodecContext->channel_layout);
avCodecContext->profile = FF_PROFILE_AAC_LOW;
avCodecContext->flags |= CODEC_FLAG_GLOBAL_HEADER;
avCodecContext->codec_id = codec->id;

open encoder

After the encoder parameters are set, the encoder can be opened.

if (avcodec_open2(avCodecContext, codec, NULL) < 0) {
   return -1;
}

Create an AVFrame and apply for a piece of PCM memory

In FFmpeg, raw data (before encoding / after decoding) is represented by AVFrame, while compressed data (after encoding / before decoding) is represented by AVPacket.

Here we need to create an AVFrame for the data we are about to encode, and allocate a block of memory for it to hold the data.

// Create the AVFrame
AVFrame *encode_frame = av_frame_alloc();
encode_frame->nb_samples = avCodecContext->frame_size;
encode_frame->format = avCodecContext->sample_fmt;
encode_frame->channel_layout = avCodecContext->channel_layout;
encode_frame->sample_rate = avCodecContext->sample_rate;

// Allocate a block of PCM memory
// (pcm_buffer, src_samples_linesize and audio_nb_samples are members of the SDK; declared here so the snippet is self-contained)
uint8_t **pcm_buffer = NULL;
int src_samples_linesize = 0;
int audio_nb_samples = avCodecContext->frame_size;
int ret = av_samples_alloc_array_and_samples(&pcm_buffer, &src_samples_linesize, avCodecContext->channels, audio_nb_samples, avCodecContext->sample_fmt, 0);
if (ret < 0) {
    return -1;
}

Encoding

The encoding process is a continuous loop:

  1. Get a frame of audio data from the PCM queue into pcm_buffer;
  2. Fill pcm_buffer into the AVFrame;
  3. Encode the audio and obtain the encoded AVPacket data.

// Get a frame of audio data from the PCM queue into pcm_buffer
pcm_frame_callback(pcm_buffer);

int ret;
int got_packet;
AVPacket *pkt = av_packet_alloc();
pkt->duration = (int) AV_NOPTS_VALUE;
pkt->pts = pkt->dts = 0;

// Fill pcm_buffer into the AVFrame
avcodec_fill_audio_frame(encode_frame, avCodecContext->channels, avCodecContext->sample_fmt, pcm_buffer[0], audioSamplesSize, 0);

// Audio encoding: obtain the encoded AVPacket data
ret = avcodec_encode_audio2(avCodecContext, pkt, encode_frame, &got_packet);
if (ret < 0 || !got_packet) {
    av_packet_free(&pkt);
    return ret;
}

// write / enqueue the encoded packet

destroy

if (NULL != pcm_buffer) {
    av_free(pcm_buffer);
}
if (NULL != encode_frame) {
    av_frame_free(&encode_frame);
}
if (NULL != avCodecContext) {
   avcodec_close(avCodecContext);
   av_free(avCodecContext);
}

libavcodec New API to encode Audio

The figure below shows how to encode Audio through the new API of the FFmpeg libavcodec module. The role and function of the core APIs will be introduced in detail afterwards.

[Flow chart: encoding Audio with the new libavcodec API]

As can be seen from the figure above, the overall encoding flow with FFmpeg's new API remains unchanged; only the APIs being called have changed. Here we focus on the use of the new API and will not repeat the parts that are identical to the old API.

Create an AVFrame and apply for a piece of PCM memory

With the old API, you first allocate a separate block of memory, fill it with PCM data, and then attach that memory to the AVFrame.

With the new API, the buffer can be allocated for the AVFrame directly through av_frame_get_buffer(). However, the sample rate, channel layout, sample format, and number of samples must be set on the AVFrame before calling av_frame_get_buffer().

// Create the AVFrame
AVFrame *encode_frame = av_frame_alloc();
encode_frame->nb_samples = avCodecContext->frame_size;
encode_frame->format = avCodecContext->sample_fmt;
encode_frame->channel_layout = avCodecContext->channel_layout;
encode_frame->sample_rate = avCodecContext->sample_rate;

// Allocate the PCM buffer directly on the frame
int ret = av_frame_get_buffer(encode_frame, 0);
if (ret < 0) {
    return -1;
}

Encoding

With FFmpeg's new API, encoding is implemented through avcodec_send_frame() and avcodec_receive_packet(). Their internal working principles are introduced at the bottom of this article.

AVPacket pkt = { 0 };
av_init_packet(&pkt);
pkt.duration = (int) AV_NOPTS_VALUE;
pkt.pts = pkt.dts = 0;

while (true) {
    do {
        ret = avcodec_receive_packet(avCodecContext, &pkt);
        // ret >= 0: an encoded packet is available
        if (ret >= 0) {
            // write / enqueue the encoded packet here
            av_packet_unref(&pkt);
            return ret;
        }
        // The encoder needs more input: break out of this inner loop and send another frame
        if (ret == AVERROR(EAGAIN)) {
            break;
        }
        // Encoding error
        if (ret < 0) {
            av_packet_unref(&pkt);
            return ret;
        }
    } while (true);

    // Get a frame of PCM data
    pcm_frame_callback(encode_frame->data);
    ret = avcodec_send_frame(avCodecContext, encode_frame);
    if (ret >= 0) {
//        LOGI("avcodec_send_frame success");
    } else {
        LOGI("avcodec_send_frame error: %s\n", av_err2str(ret));
    }
    av_packet_unref(&pkt);
}

destroy

if (NULL != encode_frame) {
    av_frame_free(&encode_frame);
}
if (NULL != avCodecContext) {
   avcodec_free_context(&avCodecContext);
}

libavcodec old API encoding Video

FFmpeg encodes Video through the old libavcodec API in a process very similar to encoding Audio with the old API. Only the API call flow chart is given here, and each step will not be analyzed in detail; as long as you understand the analysis above, this part is straightforward. A small encode-call sketch follows the flow chart below.

[Flow chart: encoding Video with the old libavcodec API]
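
For reference, here is a minimal sketch of the core encode call with the old API, analogous to the Audio case above; avCodecContext is assumed to be a video encoder context that has already been configured (width, height, pix_fmt, time_base) and opened, and yuv_frame / yuv_frame_callback are illustrative names:

AVPacket *pkt = av_packet_alloc();
int got_packet = 0;

// Fill yuv_frame->data with one frame of YUV data (illustrative callback)
yuv_frame_callback(yuv_frame);

// Old API video encoding: got_packet tells whether an encoded packet was produced
int ret = avcodec_encode_video2(avCodecContext, pkt, yuv_frame, &got_packet);
if (ret >= 0 && got_packet) {
    // write / enqueue pkt
    av_packet_unref(pkt);
}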

libavcodec new API to encode Video

FFmpeg encodes Video through libavcodec's new API in a process very similar to encoding Audio with the new API. Only the API call flow chart is given here, and each step will not be analyzed in detail; as long as you understand the analysis above, this part is straightforward. A sketch of the video-specific encoder setup follows the flow chart below.

[Flow chart: encoding Video with the new libavcodec API]
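
The send/receive loop itself is the same as for Audio; what differs is the AVCodecContext setup. A minimal sketch of the video parameters, assuming an H.264 encoder (e.g. libx264) was compiled in and using illustrative values for resolution, frame rate and bitrate:

AVCodec *codec = avcodec_find_encoder(AV_CODEC_ID_H264);
AVCodecContext *avCodecContext = avcodec_alloc_context3(codec);
avCodecContext->codec_type = AVMEDIA_TYPE_VIDEO;
avCodecContext->width = 720;
avCodecContext->height = 1280;
avCodecContext->pix_fmt = AV_PIX_FMT_YUV420P;
avCodecContext->time_base = (AVRational){1, 25};
avCodecContext->framerate = (AVRational){25, 1};
avCodecContext->bit_rate = 1200 * 1000;
avCodecContext->gop_size = 50;
avCodecContext->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;

if (avcodec_open2(avCodecContext, codec, NULL) < 0) {
    return -1;
}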

Basic principles of audio and video codecs

In FFmpeg, avcodec_send_frame() and avcodec_receive_packet() are usually used together: first call avcodec_send_frame() to send the audio or video frame to be encoded, then call avcodec_receive_packet() to obtain the encoded data packet. Note, however, that the encoder buffers data internally, so there is no guarantee that every frame sent in produces a corresponding encoded packet. The two functions each process data on their own; their timing is not synchronized, which requires special attention.

Usually when encoding starts, the first few dozen audio/video frames are sent through avcodec_send_frame() while the corresponding avcodec_receive_packet() calls output no packets. After enough frames have been sent in, avcodec_receive_packet() begins to output packets for the frames that were sent at the beginning. Towards the end, when no more frames remain to be sent, avcodec_send_frame() should still be called with an empty (NULL) frame to drive the encoder to keep encoding the data in its buffer; avcodec_receive_packet() will continue to output packets until AVERROR_EOF is returned, at which point all audio/video frames have been encoded.
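
A minimal sketch of that flush step, reusing the avCodecContext and pkt from the encoding loop above: once all frames have been sent, pass NULL to avcodec_send_frame() and keep reading until AVERROR_EOF is returned.

// Flush: tell the encoder there is no more input
avcodec_send_frame(avCodecContext, NULL);

while (true) {
    ret = avcodec_receive_packet(avCodecContext, &pkt);
    if (ret == AVERROR_EOF) {
        // All buffered frames have been drained
        break;
    }
    if (ret < 0) {
        // Error while draining
        break;
    }
    // write / enqueue pkt
    av_packet_unref(&pkt);
}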

[Diagram: internal buffering of avcodec_send_frame() / avcodec_receive_packet()]

For example:

Suppose there are 100 video frames in total to be sent to the encoder for encoding; the final output is also 100 data packets.

The first 20 calls to avcodec_send_frame() send the 1st to the 20th video frame one after another, but the first 20 calls to avcodec_receive_packet() all return AVERROR(EAGAIN), and no packet is output.

From the 21st call to avcodec_send_frame(), which sends the 21st video frame, the call to avcodec_receive_packet() can return 0 and a packet is output, but the pts of that packet is 0 (i.e., it is the packet corresponding to the first video frame). As avcodec_send_frame() continues to send the 22nd, 23rd, ... video frames, avcodec_receive_packet() continues to output the 1st, 2nd, ... packets.

Finally, the 100th video frame is sent through avcodec_send_frame(), but at this point avcodec_receive_packet() has only produced the packet for the 82nd frame. It is then necessary to keep calling avcodec_send_frame() with empty frames while calling avcodec_receive_packet() to fetch the output packets, until AVERROR_EOF is returned and the last, 100th packet has been obtained.

[Diagram: timeline of the 100-frame encoding example]

Audio and video codec basic principle reference documents:

FFMPEG quick start: audio and video encoding processing (Zhihu)

Original: FFmpeg new and old API encoding (ffmpeg version difference), BetterDaZhang's Blog, CSDN Blog

Origin: blog.csdn.net/yinshipin007/article/details/131969373