This is the first article in a series analyzing the WebRTC video encoding module. The series focuses on code structure and design ideas rather than implementation details, with the goal of providing a reference for anyone implementing video encoding themselves. This first article introduces the outline of the video encoding module.
In WebRTC, video encoding is implemented by a set of classes working together, because the encoding function involves more than just compressing raw video data. It must at least:

- Initialize encoding parameters according to the configuration or the result of video negotiation
- Dynamically change encoding parameters, such as bit rate, resolution, and frame rate, according to bandwidth or configuration
- Generate multiple streams simultaneously (simulcast)
The video encoding function is built from the following groups of classes:

- Video encoding interface class and related factory classes
  - VideoEncoder: the abstract encoding interface. It has multiple concrete implementations, such as the H264 encoder and the VP8/VP9 encoders.
  - VideoEncoderFactory: the factory class that creates concrete VideoEncoder instances.
- Concrete encoding classes
  - H264Encoder: the H264 encoding interface class, which has a Create method for creating an H264 encoder instance.
  - H264EncoderImpl: the concrete implementation of H264 encoding, i.e. the instance behind H264Encoder.
- Encoder parameter configuration classes
  - VideoEncoderConfig, VideoStream, VideoCodec, and VideoCodecInitializer; the relationships between them are explained in this article.
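To make the division of roles concrete, here is a minimal, self-contained sketch of the interface/factory pattern described above. These stub classes are illustrative stand-ins, not the real WebRTC classes (the real interfaces live under api/video_codecs/):

```cpp
#include <cassert>
#include <memory>
#include <string>

// Illustrative stand-in for webrtc::SdpVideoFormat: carries the
// negotiated codec name.
struct SdpVideoFormat {
  std::string name;  // e.g. "VP8", "H264"
};

// Stand-in for the abstract VideoEncoder interface.
class VideoEncoder {
 public:
  virtual ~VideoEncoder() = default;
  virtual std::string ImplementationName() const = 0;
};

class Vp8EncoderStub : public VideoEncoder {
 public:
  std::string ImplementationName() const override { return "VP8"; }
};

class H264EncoderStub : public VideoEncoder {
 public:
  std::string ImplementationName() const override { return "H264"; }
};

// Stand-in for the factory: picks the concrete encoder by format name,
// returning nullptr for unsupported formats, as WebRTC's factory does.
std::unique_ptr<VideoEncoder> CreateVideoEncoder(const SdpVideoFormat& format) {
  if (format.name == "VP8") return std::make_unique<Vp8EncoderStub>();
  if (format.name == "H264") return std::make_unique<H264EncoderStub>();
  return nullptr;
}
```

The business layer only ever deals with the abstract interface and the factory; swapping in a new codec means adding one branch to the factory, without touching callers.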
Creating the encoder
VideoEncoderFactory is the factory class for creating an encoder. It is an abstract class that defines the method for creating an encoder:

```cpp
std::unique_ptr<VideoEncoder> CreateVideoEncoder(
    const SdpVideoFormat& format);
```

BuiltinVideoEncoderFactory and InternalEncoderFactory are subclasses that implement VideoEncoderFactory.

BuiltinVideoEncoderFactory holds an InternalEncoderFactory member variable; the actual work of creating the encoder is done in InternalEncoderFactory, as the following code shows:
```cpp
std::unique_ptr<VideoEncoder> InternalEncoderFactory::CreateVideoEncoder(
    const SdpVideoFormat& format) {
  if (absl::EqualsIgnoreCase(format.name, cricket::kVp8CodecName))
    return VP8Encoder::Create();
  if (absl::EqualsIgnoreCase(format.name, cricket::kVp9CodecName))
    return VP9Encoder::Create(cricket::VideoCodec(format));
  if (absl::EqualsIgnoreCase(format.name, cricket::kH264CodecName))
    return H264Encoder::Create(cricket::VideoCodec(format));
  if (kIsLibaomAv1EncoderSupported &&
      absl::EqualsIgnoreCase(format.name, cricket::kAv1CodecName))
    return CreateLibaomAv1Encoder();
  RTC_LOG(LS_ERROR) << "Trying to created encoder of unsupported format "
                    << format.name;
  return nullptr;
}
```
The VideoEncoderFactory interface is business-facing; it creates the concrete encoder by calling into InternalEncoderFactory's CreateVideoEncoder method.
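This wrapper-plus-delegate arrangement can be sketched as follows. The classes below are simplified stand-ins; in real WebRTC, BuiltinVideoEncoderFactory sits in an anonymous namespace behind a free factory function, as the call stack below shows:

```cpp
#include <cassert>
#include <memory>
#include <string>

struct SdpVideoFormat {
  std::string name;
};

class VideoEncoder {
 public:
  virtual ~VideoEncoder() = default;
};

// Stand-in for InternalEncoderFactory: the class that actually knows
// how to construct encoders (dispatch on format.name omitted here).
class InternalEncoderFactoryStub {
 public:
  std::unique_ptr<VideoEncoder> CreateVideoEncoder(const SdpVideoFormat&) {
    return std::make_unique<VideoEncoder>();
  }
};

// Stand-in for BuiltinVideoEncoderFactory: a business-facing wrapper
// that forwards the creation request to its internal factory member.
class BuiltinVideoEncoderFactoryStub {
 public:
  std::unique_ptr<VideoEncoder> CreateVideoEncoder(const SdpVideoFormat& f) {
    return internal_factory_.CreateVideoEncoder(f);  // pure delegation
  }

 private:
  InternalEncoderFactoryStub internal_factory_;
};
```

The split lets the public factory stay a stable facade while the internal factory tracks the actual set of built-in codecs.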
Call stack for creating the encoder

The encoder is created with the following call stack:
```
peerconnection_client.exe!webrtc::`anonymous namespace'::BuiltinVideoEncoderFactory::CreateVideoEncoder() Line 59 C++
peerconnection_client.exe!webrtc::VideoStreamEncoder::ReconfigureEncoder() Line 633 C++
peerconnection_client.exe!webrtc::VideoStreamEncoder::MaybeEncodeVideoFrame() Line 1264 C++
peerconnection_client.exe!webrtc::VideoStreamEncoder::OnFrame(const webrtc::VideoFrame &)::(anonymous class)::operator()() Line 1046 C++
peerconnection_client.exe!webrtc::webrtc_new_closure_impl::ClosureTask<`lambda at ../../video/video_stream_encoder.cc:1032:7'>::Run() Line 33 C++
peerconnection_client.exe!webrtc::`anonymous namespace'::TaskQueueWin::RunPendingTasks() Line 272 C++
peerconnection_client.exe!webrtc::`anonymous namespace'::TaskQueueWin::RunThreadMain() Line 285 C++
peerconnection_client.exe!webrtc::`anonymous namespace'::TaskQueueWin::ThreadMain() Line 280 C++
peerconnection_client.exe!rtc::PlatformThread::Run() Line 130 C++
peerconnection_client.exe!rtc::PlatformThread::StartThread() Line 62 C++
```
The encoder is created in the ReconfigureEncoder method of VideoStreamEncoder (declared in video/video_stream_encoder.h).
VideoEncoder

WebRTC includes a variety of encoders, such as H264, VP8, and so on. Since the functions of each encoder are similar, a base class VideoEncoder is abstracted, and encoders of different formats are its concrete subclasses, such as H264Encoder. The class diagram is shown below (only some subclasses are listed).
Encoder configuration system
As mentioned earlier, the encoder has encoding parameters. This parameter information may come from business configuration or from media capability negotiation; either way, it ultimately needs to be set on the encoder, so there is usually a set of configuration classes to manage these parameters.
VideoEncoderConfig

VideoEncoderConfig is the configuration class that deals directly with the business layer.
VideoStream

The |VideoStream| struct describes a simulcast layer, or "stream"

According to its comment, it represents one stream's configuration. It contains the basic parameters of that stream and is a further refinement of the configuration information in VideoEncoderConfig.
```cpp
struct VideoStream {
  VideoStream();
  ~VideoStream();
  VideoStream(const VideoStream& other);
  std::string ToString() const;

  // Width in pixels.
  size_t width;
  // Height in pixels.
  size_t height;
  // Frame rate in fps.
  int max_framerate;
  // Bitrate, in bps, for the stream.
  int min_bitrate_bps;
  int target_bitrate_bps;
  int max_bitrate_bps;
  double scale_resolution_down_by;
  // Maximum Quantization Parameter to use when encoding the stream.
  int max_qp;
  absl::optional<size_t> num_temporal_layers;
  absl::optional<double> bitrate_priority;
  // If this stream is enabled by the user, or not.
  bool active;
};
```
The fields include the resolution, maximum/minimum bit rate, QP value, and so on. num_temporal_layers represents the number of temporal layers, i.e. frame-rate layers: in encoders that support temporal layers (such as VP8/VP9), one stream contains sub-streams at different frame rates.
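As an illustration of temporal layering, assume the common scheme (used by VP8, though not mandated) in which each lower temporal layer halves the frame rate of the layer above it:

```cpp
#include <cassert>

// Frame rate of temporal layer `layer` (0-based, 0 = base layer) when
// the top layer of `num_layers` layers runs at `top_fps`, assuming each
// step down halves the rate. Illustrative; real encoders may differ.
double TemporalLayerFps(double top_fps, int num_layers, int layer) {
  double fps = top_fps;
  for (int i = layer; i < num_layers - 1; ++i) fps /= 2.0;
  return fps;
}
```

With num_temporal_layers = 3 and a 30 fps stream, the layers run at 7.5, 15, and 30 fps; a receiver can drop upper layers to trade frame rate for bandwidth.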
Simulcast refers to simultaneously generating multiple streams with different resolutions, frame rates, and bit rates; one VideoStream represents the configuration of one of those streams.
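A sketch of what a three-layer simulcast configuration might look like, using a trimmed-down copy of the VideoStream fields above (the specific resolutions and bit rates are illustrative, not WebRTC defaults):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Trimmed-down copy of the VideoStream fields used in this example.
struct VideoStream {
  size_t width;
  size_t height;
  int max_framerate;
  int max_bitrate_bps;
  bool active;
};

// A typical 3-layer simulcast ladder: each higher layer doubles the
// resolution and raises the bitrate cap. Values are illustrative.
std::vector<VideoStream> BuildSimulcastLadder() {
  return {
      {320, 180, 30, 150000, true},    // low layer, for constrained receivers
      {640, 360, 30, 500000, true},    // middle layer
      {1280, 720, 30, 2000000, true},  // high layer
  };
}
```

The receiver-side SFU can then forward whichever layer matches each subscriber's bandwidth.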
VideoCodec
Common video codec properties
VideoCodec is the configuration information of the encoder itself; it is the type of the formal parameter required by the InitEncode method. The configuration information in VideoStream and VideoEncoderConfig is converted into a VideoCodec, which is then set on the encoder.

VideoStream is just a subset of VideoCodec. In simulcast mode, one VideoCodec contains the configuration information of multiple VideoStreams:
```cpp
// This corresponds to the per-stream VideoStream configurations.
SpatialLayer simulcastStream[kMaxSimulcastStreams];
```
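The fan-in from per-stream configurations into the codec settings can be sketched like this; StreamConfig and CodecConfig are simplified stand-ins for VideoStream/SpatialLayer and VideoCodec, not the real WebRTC structs:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kMaxSimulcastStreams = 3;  // WebRTC caps simulcast at a few streams

struct StreamConfig {  // stand-in for VideoStream / SpatialLayer
  size_t width = 0;
  size_t height = 0;
  int max_bitrate_bps = 0;
};

struct CodecConfig {  // stand-in for VideoCodec
  int numberOfSimulcastStreams = 0;
  StreamConfig simulcastStream[kMaxSimulcastStreams];
};

// Copy each per-stream config into the codec's fixed-size array,
// mirroring the role VideoCodecInitializer plays during codec setup.
CodecConfig ToCodecConfig(const std::vector<StreamConfig>& streams) {
  CodecConfig codec;
  int n = static_cast<int>(streams.size());
  if (n > kMaxSimulcastStreams) n = kMaxSimulcastStreams;
  codec.numberOfSimulcastStreams = n;
  for (int i = 0; i < n; ++i) codec.simulcastStream[i] = streams[i];
  return codec;
}
```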
VideoCodecInitializer

VideoCodecInitializer converts VideoEncoderConfig and VideoStream into VideoCodec.

Relationships
The classes related to the configuration of these encoder parameters are as follows:
The business layer generates the encoder parameters VideoEncoderConfig according to the configuration or media negotiation results, then extracts the information in VideoEncoderConfig to generate VideoStreams; in simulcast, each stream corresponds to one VideoStream. VideoCodecInitializer converts the VideoEncoderConfig and VideoStream parameter configuration into the intermediate configuration VideoCodec, which is used to initialize the VideoEncoder. The specific parameter conversion strategy can be seen in the CreateEncoderStreams method of the VideoStreamFactoryInterface class:
```cpp
std::vector<VideoStream> CreateEncoderStreams(
    int width,
    int height,
    const VideoEncoderConfig& encoder_config);
```
and in the SetupCodec method of the VideoCodecInitializer class:
```cpp
bool SetupCodec(const VideoEncoderConfig& config,
                const std::vector<VideoStream>& streams,
                VideoCodec* codec);
```
The ReconfigureEncoder method of the VideoStreamEncoder class contains the entire process, from creating the encoder to converting the encoder configuration, and shows how these methods are called.
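That overall flow can be sketched end to end: expand the business config into per-stream configs, then fold them into one codec settings object. All types and the halve-per-layer scale factor here are simplified illustrations, not the real WebRTC signatures:

```cpp
#include <cassert>
#include <vector>

// Simplified stand-ins for the classes in the flow described above.
struct EncoderConfig {
  int width = 0;
  int height = 0;
  int number_of_streams = 1;
};
struct Stream {
  int width = 0;
  int height = 0;
};
struct Codec {
  std::vector<Stream> streams;
};

// Role of VideoStreamFactoryInterface::CreateEncoderStreams: expand the
// business-level config into one Stream per simulcast layer. Each lower
// layer here is the layer above scaled down by 2 (an illustrative choice).
std::vector<Stream> CreateEncoderStreams(const EncoderConfig& cfg) {
  std::vector<Stream> out;
  int w = cfg.width, h = cfg.height;
  for (int i = 0; i < cfg.number_of_streams; ++i) {
    out.insert(out.begin(), Stream{w, h});  // lowest layer ends up at index 0
    w /= 2;
    h /= 2;
  }
  return out;
}

// Role of VideoCodecInitializer::SetupCodec: fold the streams into the
// single codec settings object handed to VideoEncoder::InitEncode.
Codec SetupCodec(const std::vector<Stream>& streams) {
  return Codec{streams};
}
```

For a 1280x720 config with three streams, this yields layers at 320x180, 640x360, and 1280x720, the shape of configuration that InitEncode ultimately receives.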
Summary
- The encoder functions of different encoding formats are basically the same, so a base class such as VideoEncoder is usually abstracted.
- The features supported by each encoder differ, so the encoder parameter configuration is often hierarchical, from abstract to concrete and from coarse to fine. VideoEncoderConfig, VideoStream, and VideoCodec are an example of this progressive hierarchy.
- The encoder configuration system in WebRTC is relatively complicated and cumbersome, which is determined by its feature set: it needs to support different encoders, simulcast, and temporal layers, and the frame rate, bit rate, and resolution of each stream in simulcast are adapted dynamically according to bandwidth and CPU load, so the extra parameter configuration is unavoidable. If we implement video encoding ourselves and do not need these features, this configuration system can be simplified.