This is the first article in a series analyzing the WebRTC video encoding module. The series focuses on code structure and design ideas rather than implementation details, with the goal of providing a reference for anyone implementing video encoding themselves. This first article introduces the outline of the video encoding module.
In WebRTC, video encoding is implemented by a set of classes working together, because the encoding function involves more than just compressing raw video data. It must at least:

- Initialize encoding parameters according to the configuration or the result of video negotiation
- Dynamically change encoding parameters, such as bit rate, resolution, and frame rate, according to bandwidth or configuration
- Generate multiple streams simultaneously (simulcast)
The video encoding function is built from the following groups of classes:

- Video encoding interface class and related factory classes
  - VideoEncoder: the abstract encoding interface. It has multiple concrete implementations, such as the H264 encoder and the VP8/VP9 encoders.
  - VideoEncoderFactory: the factory class that creates concrete VideoEncoder instances.
- Concrete encoding classes
  - H264Encoder: the H264 encoding interface class, which has a Create method for creating an H264 encoder instance.
  - H264EncoderImpl: the concrete implementation of H264 encoding, i.e. the instance behind H264Encoder.
- Encoder parameter configuration classes
  - VideoEncoderConfig, VideoStream, VideoCodec, and VideoCodecInitializer; the relationships between them are explained in this article.
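To make the division of roles concrete, here is a minimal, self-contained sketch of the interface/factory pattern described above. These stub classes are illustrative stand-ins, not the real WebRTC classes (the real interfaces live under api/video_codecs/):

```cpp
#include <cassert>
#include <memory>
#include <string>

// Illustrative stand-in for webrtc::SdpVideoFormat: carries the
// negotiated codec name.
struct SdpVideoFormat {
  std::string name;  // e.g. "VP8", "H264"
};

// Stand-in for the abstract VideoEncoder interface.
class VideoEncoder {
 public:
  virtual ~VideoEncoder() = default;
  virtual std::string ImplementationName() const = 0;
};

class Vp8EncoderStub : public VideoEncoder {
 public:
  std::string ImplementationName() const override { return "VP8"; }
};

class H264EncoderStub : public VideoEncoder {
 public:
  std::string ImplementationName() const override { return "H264"; }
};

// Stand-in for the factory: picks the concrete encoder by format name,
// returning nullptr for unsupported formats, as WebRTC's factory does.
std::unique_ptr<VideoEncoder> CreateVideoEncoder(const SdpVideoFormat& format) {
  if (format.name == "VP8") return std::make_unique<Vp8EncoderStub>();
  if (format.name == "H264") return std::make_unique<H264EncoderStub>();
  return nullptr;
}
```

The business layer only ever deals with the abstract interface and the factory; swapping in a new codec means adding one branch to the factory, without touching callers.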
Creating the encoder
VideoEncoderFactory is the factory class for creating an encoder. It is an abstract class that defines the method for creating an encoder:

```cpp
std::unique_ptr<VideoEncoder> CreateVideoEncoder(
    const SdpVideoFormat& format);
```

BuiltinVideoEncoderFactory and InternalEncoderFactory are subclasses that implement VideoEncoderFactory.

BuiltinVideoEncoderFactory holds an InternalEncoderFactory member variable; the actual work of creating the encoder is done in InternalEncoderFactory, as the following code shows:
```cpp
std::unique_ptr<VideoEncoder> InternalEncoderFactory::CreateVideoEncoder(
    const SdpVideoFormat& format) {
  if (absl::EqualsIgnoreCase(format.name, cricket::kVp8CodecName))
    return VP8Encoder::Create();
  if (absl::EqualsIgnoreCase(format.name, cricket::kVp9CodecName))
    return VP9Encoder::Create(cricket::VideoCodec(format));
  if (absl::EqualsIgnoreCase(format.name, cricket::kH264CodecName))
    return H264Encoder::Create(cricket::VideoCodec(format));
  if (kIsLibaomAv1EncoderSupported &&
      absl::EqualsIgnoreCase(format.name, cricket::kAv1CodecName))
    return CreateLibaomAv1Encoder();
  RTC_LOG(LS_ERROR) << "Trying to created encoder of unsupported format "
                    << format.name;
  return nullptr;
}
```
The VideoEncoderFactory interface is business-facing; it creates the concrete encoder by calling into InternalEncoderFactory's CreateVideoEncoder method.
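This wrapper-plus-delegate arrangement can be sketched as follows. The classes below are simplified stand-ins; in real WebRTC, BuiltinVideoEncoderFactory sits in an anonymous namespace behind a free factory function, as the call stack below shows:

```cpp
#include <cassert>
#include <memory>
#include <string>

struct SdpVideoFormat {
  std::string name;
};

class VideoEncoder {
 public:
  virtual ~VideoEncoder() = default;
};

// Stand-in for InternalEncoderFactory: the class that actually knows
// how to construct encoders (dispatch on format.name omitted here).
class InternalEncoderFactoryStub {
 public:
  std::unique_ptr<VideoEncoder> CreateVideoEncoder(const SdpVideoFormat&) {
    return std::make_unique<VideoEncoder>();
  }
};

// Stand-in for BuiltinVideoEncoderFactory: a business-facing wrapper
// that forwards the creation request to its internal factory member.
class BuiltinVideoEncoderFactoryStub {
 public:
  std::unique_ptr<VideoEncoder> CreateVideoEncoder(const SdpVideoFormat& f) {
    return internal_factory_.CreateVideoEncoder(f);  // pure delegation
  }

 private:
  InternalEncoderFactoryStub internal_factory_;
};
```

The split lets the public factory stay a stable facade while the internal factory tracks the actual set of built-in codecs.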
Call stack for creating the encoder

The encoder is created with the following call stack:
```
peerconnection_client.exe!webrtc::`anonymous namespace'::BuiltinVideoEncoderFactory::CreateVideoEncoder() Line 59 C++
peerconnection_client.exe!webrtc::VideoStreamEncoder::ReconfigureEncoder() Line 633 C++
peerconnection_client.exe!webrtc::VideoStreamEncoder::MaybeEncodeVideoFrame() Line 1264 C++
peerconnection_client.exe!webrtc::VideoStreamEncoder::OnFrame(const webrtc::VideoFrame &)::(anonymous class)::operator()() Line 1046 C++
peerconnection_client.exe!webrtc::webrtc_new_closure_impl::ClosureTask<`lambda at ../../video/video_stream_encoder.cc:1032:7'>::Run() Line 33 C++
peerconnection_client.exe!webrtc::`anonymous namespace'::TaskQueueWin::RunPendingTasks() Line 272 C++
peerconnection_client.exe!webrtc::`anonymous namespace'::TaskQueueWin::RunThreadMain() Line 285 C++
peerconnection_client.exe!webrtc::`anonymous namespace'::TaskQueueWin::ThreadMain() Line 280 C++
peerconnection_client.exe!rtc::PlatformThread::Run() Line 130 C++
peerconnection_client.exe!rtc::PlatformThread::StartThread() Line 62 C++
```
The encoder is created in the ReconfigureEncoder method of VideoStreamEncoder (declared in video/video_stream_encoder.h).
VideoEncoder

WebRTC includes a variety of encoders, such as H264, VP8, and so on. Since the functions of each encoder are similar, a base class VideoEncoder is abstracted, and encoders of different formats are its concrete subclasses, such as H264Encoder. The class diagram is shown below (only some subclasses are listed).
Encoder configuration system
As mentioned earlier, the encoder has encoding parameters. This parameter information may come from business configuration or from media capability negotiation; either way, it ultimately needs to be set on the encoder, so there is usually a set of configuration classes to manage these parameters.
VideoEncoderConfig

VideoEncoderConfig is the configuration class that deals directly with the business layer.
VideoStream

The |VideoStream| struct describes a simulcast layer, or "stream"

According to its comment, it represents one stream's configuration. It contains the basic parameters of that stream and is a further refinement of the configuration information in VideoEncoderConfig.
```cpp
struct VideoStream {
  VideoStream();
  ~VideoStream();
  VideoStream(const VideoStream& other);
  std::string ToString() const;

  // Width in pixels.
  size_t width;
  // Height in pixels.
  size_t height;
  // Frame rate in fps.
  int max_framerate;
  // Bitrate, in bps, for the stream.
  int min_bitrate_bps;
  int target_bitrate_bps;
  int max_bitrate_bps;
  double scale_resolution_down_by;
  // Maximum Quantization Parameter to use when encoding the stream.
  int max_qp;
  absl::optional<size_t> num_temporal_layers;
  absl::optional<double> bitrate_priority;
  // If this stream is enabled by the user, or not.
  bool active;
};
```
The fields include the resolution, maximum/minimum bit rate, QP value, and so on. num_temporal_layers represents the number of temporal layers, i.e. frame-rate layers: in encoders that support temporal layers (such as VP8/VP9), one stream contains sub-streams at different frame rates.
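As an illustration of temporal layering, assume the common scheme (used by VP8, though not mandated) in which each lower temporal layer halves the frame rate of the layer above it:

```cpp
#include <cassert>

// Frame rate of temporal layer `layer` (0-based, 0 = base layer) when
// the top layer of `num_layers` layers runs at `top_fps`, assuming each
// step down halves the rate. Illustrative; real encoders may differ.
double TemporalLayerFps(double top_fps, int num_layers, int layer) {
  double fps = top_fps;
  for (int i = layer; i < num_layers - 1; ++i) fps /= 2.0;
  return fps;
}
```

With num_temporal_layers = 3 and a 30 fps stream, the layers run at 7.5, 15, and 30 fps; a receiver can drop upper layers to trade frame rate for bandwidth.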
Simulcast refers to simultaneously generating multiple streams with different resolutions, frame rates, and bit rates; one VideoStream represents the configuration of one of those streams.
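A sketch of what a three-layer simulcast configuration might look like, using a trimmed-down copy of the VideoStream fields above (the specific resolutions and bit rates are illustrative, not WebRTC defaults):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Trimmed-down copy of the VideoStream fields used in this example.
struct VideoStream {
  size_t width;
  size_t height;
  int max_framerate;
  int max_bitrate_bps;
  bool active;
};

// A typical 3-layer simulcast ladder: each higher layer doubles the
// resolution and raises the bitrate cap. Values are illustrative.
std::vector<VideoStream> BuildSimulcastLadder() {
  return {
      {320, 180, 30, 150000, true},    // low layer, for constrained receivers
      {640, 360, 30, 500000, true},    // middle layer
      {1280, 720, 30, 2000000, true},  // high layer
  };
}
```

The receiver-side SFU can then forward whichever layer matches each subscriber's bandwidth.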
VideoCodec
Common video codec properties
VideoCodec is the configuration information of the encoder itself; it is the type of the formal parameter required by the InitEncode method. The configuration information in VideoStream and VideoEncoderConfig is converted into a VideoCodec, which is then set on the encoder.

VideoStream is just a subset of VideoCodec. In simulcast mode, one VideoCodec contains the configuration information of multiple VideoStreams:
```cpp
// This corresponds to the per-stream VideoStream configurations.
SpatialLayer simulcastStream[kMaxSimulcastStreams];
```
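The fan-in from per-stream configurations into the codec settings can be sketched like this; StreamConfig and CodecConfig are simplified stand-ins for VideoStream/SpatialLayer and VideoCodec, not the real WebRTC structs:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kMaxSimulcastStreams = 3;  // WebRTC caps simulcast at a few streams

struct StreamConfig {  // stand-in for VideoStream / SpatialLayer
  size_t width = 0;
  size_t height = 0;
  int max_bitrate_bps = 0;
};

struct CodecConfig {  // stand-in for VideoCodec
  int numberOfSimulcastStreams = 0;
  StreamConfig simulcastStream[kMaxSimulcastStreams];
};

// Copy each per-stream config into the codec's fixed-size array,
// mirroring the role VideoCodecInitializer plays during codec setup.
CodecConfig ToCodecConfig(const std::vector<StreamConfig>& streams) {
  CodecConfig codec;
  int n = static_cast<int>(streams.size());
  if (n > kMaxSimulcastStreams) n = kMaxSimulcastStreams;
  codec.numberOfSimulcastStreams = n;
  for (int i = 0; i < n; ++i) codec.simulcastStream[i] = streams[i];
  return codec;
}
```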
VideoCodecInitializer

VideoCodecInitializer converts VideoEncoderConfig and VideoStream into VideoCodec.

Relationships
The classes related to the configuration of these encoder parameters are as follows:
The business layer generates the encoder parameters VideoEncoderConfig according to the configuration or media negotiation results, then extracts the information in VideoEncoderConfig to generate VideoStreams; in simulcast, each stream corresponds to one VideoStream. VideoCodecInitializer converts the VideoEncoderConfig and VideoStream parameter configuration into the intermediate configuration VideoCodec, which is used to initialize the VideoEncoder. The specific parameter conversion strategy can be seen in the CreateEncoderStreams method of the VideoStreamFactoryInterface class:
```cpp
std::vector<VideoStream> CreateEncoderStreams(
    int width,
    int height,
    const VideoEncoderConfig& encoder_config);
```
and in the SetupCodec method of the VideoCodecInitializer class:
```cpp
bool SetupCodec(const VideoEncoderConfig& config,
                const std::vector<VideoStream>& streams,
                VideoCodec* codec);
```
The ReconfigureEncoder method of the VideoStreamEncoder class contains the entire process, from creating the encoder to converting the encoder configuration, and shows how these methods are called.
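That overall flow can be sketched end to end: expand the business config into per-stream configs, then fold them into one codec settings object. All types and the halve-per-layer scale factor here are simplified illustrations, not the real WebRTC signatures:

```cpp
#include <cassert>
#include <vector>

// Simplified stand-ins for the classes in the flow described above.
struct EncoderConfig {
  int width = 0;
  int height = 0;
  int number_of_streams = 1;
};
struct Stream {
  int width = 0;
  int height = 0;
};
struct Codec {
  std::vector<Stream> streams;
};

// Role of VideoStreamFactoryInterface::CreateEncoderStreams: expand the
// business-level config into one Stream per simulcast layer. Each lower
// layer here is the layer above scaled down by 2 (an illustrative choice).
std::vector<Stream> CreateEncoderStreams(const EncoderConfig& cfg) {
  std::vector<Stream> out;
  int w = cfg.width, h = cfg.height;
  for (int i = 0; i < cfg.number_of_streams; ++i) {
    out.insert(out.begin(), Stream{w, h});  // lowest layer ends up at index 0
    w /= 2;
    h /= 2;
  }
  return out;
}

// Role of VideoCodecInitializer::SetupCodec: fold the streams into the
// single codec settings object handed to VideoEncoder::InitEncode.
Codec SetupCodec(const std::vector<Stream>& streams) {
  return Codec{streams};
}
```

For a 1280x720 config with three streams, this yields layers at 320x180, 640x360, and 1280x720, the shape of configuration that InitEncode ultimately receives.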
Summary
- The encoder functions of different encoding formats are basically the same, so a base class such as VideoEncoder is usually abstracted.
- The features supported by each encoder differ, so the encoder parameter configuration is often hierarchical, from abstract to concrete and from coarse to fine. VideoEncoderConfig, VideoStream, and VideoCodec are an example of this progressive hierarchy.
- The encoder configuration system in WebRTC is relatively complicated and cumbersome, which is determined by its feature set: it needs to support different encoders, simulcast, and temporal layers, and the frame rate, bit rate, and resolution of each stream in simulcast are adapted dynamically according to bandwidth and CPU load, so the extra parameter configuration is unavoidable. If we implement video encoding ourselves and do not need these features, this configuration system can be simplified.