01

 Background


In live video streaming, the transmission of video data occupies most of the bandwidth. How to improve encoding efficiency, use less bandwidth, and deliver better picture quality is therefore a central concern for audio and video developers.

The introduction of the HEVC (High Efficiency Video Coding, also known as H.265) encoding format brought a breakthrough in this direction. Its high algorithmic complexity kept it from being widely adopted at first, but as the computing power of mobile devices grew and more and more devices began to support HEVC hardware encoding/decoding, live broadcast platforms gradually started to introduce the HEVC video format.


HEVC is a standard at the video coding layer. To use it in video streaming, the corresponding encapsulation format and streaming protocol must also support it. Since most push and pull streaming in live broadcast is based on RTMP, this article mainly introduces how to add support for the HEVC video encoding format to the RTMP protocol; support in other protocols, including private ones, can be added by analogy. Note also that besides the changes on the streaming end and the playback end, the server must make corresponding changes in sync to ensure that HEVC works normally in live broadcast.



A typical live broadcast framework usually consists of three parts, as shown in the following figure


1. Streaming end: responsible for collecting, processing, encoding and encapsulating the audio and video data, and pushing the data to the source station;

2. Server: covers the source station and the CDN; it receives audio and video data from the streaming end and distributes the data to each playback end;

3. Playback end: pulls the live stream from the CDN, then demultiplexes, decodes and renders the audio and video data.

Introducing HEVC encoding involves changes to the modules marked in red in the figure above:

1. Encoding module: it needs to support encoding and decoding of the HEVC format. This part is beyond the scope of this article; we have introduced how to perform HEVC hardware encoding and decoding on iOS 11 in other articles, which interested readers can consult on their own;

2. Encapsulation/transmission module: the RTMP and HTTP-FLV streaming protocols need to add support for the HEVC video encoding format. This part is the focus of this article; the key convention is sketched right below.
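
Since FLV is the encapsulation format carried over RTMP and HTTP-FLV, the core of the change is assigning HEVC a CodecID in the FLV video tag header. The Adobe FLV specification only defines CodecIDs up to 7 (AVC); the value 12 shown below is the de facto extension adopted by Chinese CDN vendors for HEVC, not part of the official specification:

/* FLV VideoTagHeader CodecID values (lower 4 bits of the tag's first byte).
 * IDs up to 7 come from the Adobe FLV spec; 12 is the de facto CDN
 * extension for HEVC and is NOT part of the official spec. */
enum FlvVideoCodecId {
    FLV_CODECID_H263 = 2,   /* Sorenson H.263 */
    FLV_CODECID_VP6  = 4,   /* On2 VP6 */
    FLV_CODECID_H264 = 7,   /* AVC */
    FLV_CODECID_HEVC = 12   /* HEVC, unofficial extension */
};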

02

 Brief analysis of FFmpeg


FFmpeg started as a small project and has grown ever more powerful, with an ever larger codebase. Many beginners have been scared off from further study by its numerous source files, huge structures and complex algorithms. This chapter gives a brief analysis of FFmpeg and shows how to approach reading its source code.



2.1 General description

FFmpeg includes the following class libraries:

•   libavformat - used for generation and parsing of various audio and video encapsulation formats, including functions such as obtaining the information required for decoding and reading audio and video data. Various streaming media protocol codes (such as rtmpproto.c, etc.) and (de)multiplexing codes of audio and video formats (such as flvdec.c, flvenc.c, etc.) are located in this directory.

• libavcodec - codec for various audio and video formats. Codec codes of various formats (such as aacenc.c, aacdec.c, etc.) are located in this directory.

• libavutil - a library of common utility functions, including arithmetic operations, string operations, etc.

• libswscale - provides scaling, color map conversion, and image color space or pixel format conversion for raw video.

• libswresample - Provides audio resampling, sample format conversion and mixing.

• libavfilter - various audio and video filters.

• libpostproc - used for post-processing effects such as image deblocking.

• libavdevice - for hardware audio and video capture, acceleration and display.


If you have no prior experience reading FFmpeg code, it is recommended to start with the code under libavformat, libavcodec and libavutil; they provide the most basic functions of audio and video development and are the most widely used.


2.2 Common structures

The most commonly used data structures in FFmpeg can be roughly divided into the following categories by function (code references below are based on branch origin/release/3.4):


1. Encapsulation format

• AVFormatContext - describes the composition and basic information of a media file. It is the top-level structure that ties everything together; many functions throughout the program take it as a parameter;

• AVInputFormat - demultiplexer object; each input encapsulation format (such as FLV, MP4, TS, etc.) corresponds to one such structure, for example ff_flv_demuxer in libavformat/flvdec.c;

• AVOutputFormat - multiplexer object; each output encapsulation format (such as FLV, MP4, TS, etc.) corresponds to one such structure, for example ff_flv_muxer in libavformat/flvenc.c;

• AVStream - describes the data and information of a single video/audio stream.


2. Codec

•   AVCodecContext - a data structure describing the codec context, including parameter information required by many codecs;

• AVCodec - Codec object, each codec format (such as H.264, AAC, etc.) corresponds to this structure, such as ff_aac_decoder in libavcodec/aacdec.c. Each AVCodecContext contains an AVCodec;

• AVCodecParameters - Codec parameters, each AVStream contains an AVCodecParameters, which is used to store the codec parameters of the current stream.


3. Network Protocol

• AVIOContext - a structure for managing input and output data;

• URLProtocol - describes the protocol used for audio and video data transmission; each transport protocol (such as HTTP, RTMP, etc.) corresponds to a URLProtocol structure, for example ff_http_protocol in libavformat/http.c;

• URLContext - encapsulates protocol objects and protocol operation objects.


4. Data storage

•   AVPacket - stores compressed data after encoding and before decoding, i.e. elementary stream (ES) data;

•   AVFrame - stores raw data before encoding or after decoding, such as video data in YUV format or audio data in PCM format.
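
To make these relationships concrete, here is a minimal sketch using the FFmpeg 3.x API (a fragment, assuming fmt_ctx is an AVFormatContext that has already been opened):

AVStream *st = fmt_ctx->streams[0];                    /* an AVFormatContext owns its AVStreams */
AVCodecParameters *par = st->codecpar;                 /* each AVStream carries an AVCodecParameters */
AVCodec *dec = avcodec_find_decoder(par->codec_id);    /* look up the matching AVCodec */
AVCodecContext *dec_ctx = avcodec_alloc_context3(dec); /* an AVCodecContext wraps an AVCodec */
avcodec_parameters_to_context(dec_ctx, par);           /* copy the stream parameters into the context */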

The relationship diagram of the above structure is shown below (arrows indicate derivation):


Figure 2. FFmpeg structure diagram



2.3 Code structure

The code below is a minimal sketch of the basic task of reading audio and video data from a media file. This section uses it as an example to analyze the calling logic of FFmpeg's internal code.
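
(A sketch against the FFmpeg 3.x API, with error handling kept to a bare minimum; a real program should check every return value.)

#include <libavformat/avformat.h>

int main(int argc, char *argv[])
{
    AVFormatContext *fmt_ctx = NULL;
    AVPacket pkt;

    if (argc < 2)
        return -1;

    av_register_all();                               /* 2.3.1: register (de)muxers, codecs, protocols */

    if (avformat_open_input(&fmt_ctx, argv[1], NULL, NULL) < 0)   /* 2.3.2: open file, probe format */
        return -1;
    avformat_find_stream_info(fmt_ctx, NULL);        /* fill in the stream parameters */

    while (av_read_frame(fmt_ctx, &pkt) >= 0) {      /* 2.3.3: read one compressed packet at a time */
        /* pkt.stream_index identifies the stream; consume pkt.data / pkt.size here */
        av_packet_unref(&pkt);
    }

    avformat_close_input(&fmt_ctx);
    return 0;
}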


2.3.1 Registration

The av_register_all function registers a series of (de)multiplexers, encoders/decoders and so on. It is almost always the first function called in an FFmpeg-based application; only after it has been called can the multiplexers, codecs, etc. be used.

REGISTER_MUXDEMUX actually calls av_register_input_format and av_register_output_format; these two functions append the demultiplexer and the multiplexer to the tail of the global first_iformat and first_oformat linked lists respectively.
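
Abridged from libavformat/allformats.c on the release/3.4 branch, the registration macros look roughly like this:

#define REGISTER_MUXER(X, x)                                \
    {                                                       \
        extern AVOutputFormat ff_##x##_muxer;               \
        if (CONFIG_##X##_MUXER)                             \
            av_register_output_format(&ff_##x##_muxer);     \
    }

#define REGISTER_DEMUXER(X, x)                              \
    {                                                       \
        extern AVInputFormat ff_##x##_demuxer;              \
        if (CONFIG_##X##_DEMUXER)                           \
            av_register_input_format(&ff_##x##_demuxer);    \
    }

#define REGISTER_MUXDEMUX(X, x) REGISTER_MUXER(X, x); REGISTER_DEMUXER(X, x)

/* e.g. REGISTER_MUXDEMUX(FLV, flv) registers ff_flv_muxer and ff_flv_demuxer */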

The registration process for encoders/decoders is the same and is not repeated here.


2.3.2 File open

The process of FFmpeg reading media data begins with avformat_open_input, which opens the media file and detects its format. But how does FFmpeg find the right streaming protocol and demultiplexer? Inside avformat_open_input, the init_input function is called; this is where the lookup of the streaming protocol and the demultiplexer takes place.

1. s->io_open actually calls io_open_default, which finally calls the url_find_protocol method.

ffurl_get_protocols returns all the streaming protocols supported by the current FFmpeg build; the correct protocol is found by comparing the URL's scheme against each protocol->name. In this example, URLProtocol ends up pointing to ff_http_protocol in libavformat/http.c.
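
A simplified sketch of that lookup follows (illustration only, with includes and cleanup omitted; the real url_find_protocol in libavformat/avio.c also handles nested protocols, whitelists, and frees the protocol list):

/* Simplified illustration of the scheme match in url_find_protocol(). */
static const URLProtocol *find_protocol(const char *url)
{
    char scheme[16] = "file";                     /* URLs without an "xxx:" prefix default to file */
    const char *colon = strchr(url, ':');
    const URLProtocol **protocols = ffurl_get_protocols(NULL, NULL);

    if (colon && (size_t)(colon - url) < sizeof(scheme)) {
        memcpy(scheme, url, colon - url);         /* extract the URL scheme */
        scheme[colon - url] = '\0';
    }

    for (int i = 0; protocols && protocols[i]; i++)
        if (!strcmp(protocols[i]->name, scheme))  /* e.g. "http" matches ff_http_protocol */
            return protocols[i];
    return NULL;
}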


2. av_probe_input_buffer2 eventually calls av_probe_input_format3, which traverses all demultiplexers, i.e. all nodes of the first_iformat linked list, and calls each one's read_probe() function to compute a matching score; the function finally returns the best-matching demultiplexer it found. In this example, AVInputFormat ends up pointing to ff_flv_demuxer in libavformat/flvdec.c.
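
A simplified sketch of that probing loop (scoring subtleties such as extension-based matching are omitted):

/* Simplified illustration of the loop in av_probe_input_format3(). */
static AVInputFormat *probe_format(AVProbeData *pd)
{
    AVInputFormat *fmt = NULL, *best = NULL;
    int best_score = 0;

    while ((fmt = av_iformat_next(fmt))) {   /* walk the first_iformat linked list */
        if (!fmt->read_probe)
            continue;
        int score = fmt->read_probe(pd);     /* e.g. flv_probe() checks the "FLV" signature */
        if (score > best_score) {
            best_score = score;
            best = fmt;                      /* the highest score wins */
        }
    }
    return best;
}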


2.3.3 Data read

av_read_frame reads the audio and video frames from the media data one by one. The key step in this method is the call to AVInputFormat's read_packet() method. read_packet() is a function pointer that points to the data-reading function of the current AVInputFormat. In this example the AVInputFormat is ff_flv_demuxer, which means that read_packet ultimately points to flv_read_packet.
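
That binding can be seen in the definition of ff_flv_demuxer (abridged from libavformat/flvdec.c):

AVInputFormat ff_flv_demuxer = {
    .name        = "flv",
    .long_name   = NULL_IF_CONFIG_SMALL("FLV (Flash Video)"),
    .read_probe  = flv_probe,        /* used during format probing (section 2.3.2) */
    .read_header = flv_read_header,
    .read_packet = flv_read_packet,  /* invoked via av_read_frame() */
    /* ... */
};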



