The overall framework of WebRTC

1 Introduction

A brief history: Google acquired Global IP Solutions in 2010 and open-sourced its technology as WebRTC in 2011. WebRTC is a real-time multimedia communication technology aimed primarily at web browsers, enabling peer-to-peer communication without intermediary servers on the media path. The purpose of this article is to understand the overall framework of WebRTC and lay a foundation for subsequent in-depth study.

2 Introduction to the overall framework

The commonly cited architecture of WebRTC is shown in the figure below. It is divided into three layers from top to bottom. The top layer is the WebAPI layer, the JavaScript API exposed to developers for building WebRTC applications. The middle layer is the core of WebRTC technology and consists of three modules: the audio engine, the video engine, and network transport. The bottom layer is implemented independently by each vendor and handles audio/video capture and network I/O.
[Figure: the three-layer WebRTC architecture]
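To make the WebAPI layer concrete, here is a minimal sketch of how an application drives it through the standard browser APIs (`RTCPeerConnection`, `getUserMedia`). The `startCall` function name and the `signaling` object are illustrative assumptions; WebRTC deliberately leaves the signaling channel to the application.

```javascript
// Sketch of the WebAPI layer in use. RTCPeerConnection and
// navigator.mediaDevices.getUserMedia are standard browser APIs;
// `signaling` stands in for an app-defined signaling channel.
async function startCall(signaling) {
  const pc = new RTCPeerConnection();

  // Capture local audio/video (bottom layer: device capture).
  const stream = await navigator.mediaDevices.getUserMedia({
    audio: true,
    video: true,
  });
  for (const track of stream.getTracks()) {
    pc.addTrack(track, stream); // feed tracks into the core layer
  }

  // Create and publish the SDP offer; the answer and ICE candidates
  // would flow back over the same signaling channel.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  signaling.send(offer);
  return pc;
}
```

Everything below `RTCPeerConnection` (codecs, echo cancellation, ICE) is handled by the core layer; the application only sees this JavaScript surface.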

2.1 Audio Engine

The audio engine (VoiceEngine) is responsible for WebRTC's audio communication. Through a complete audio-processing pipeline, it handles everything from reading data off an external device such as a microphone to transmitting it over the network. It consists of two main modules: audio codecs and speech signal processing. The core of the signal processing is acoustic echo cancellation (AcousticEchoCanceler, AEC) and noise reduction (NoiseReduction, NR). Echo cancellation improves sound quality by removing echoes after they occur or preventing them from occurring; noise reduction removes unwanted noise from the signal. For codecs, the audio engine ships two main options: iSAC, a wideband codec, and iLBC, a narrowband codec well suited to voice communication over IP.
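From the WebAPI layer, AEC and NR are exposed as media-track constraints. The constraints object below is plain data passed to `getUserMedia`; the flag names are the standard ones, and browsers treat them as hints they may not fully honor.

```javascript
// Constraints asking the audio engine to enable echo cancellation (AEC)
// and noise reduction/suppression (NR). These are standard
// MediaTrackConstraints fields.
const audioConstraints = {
  audio: {
    echoCancellation: true, // AEC
    noiseSuppression: true, // NR
  },
};

// In a browser:
//   const stream = await navigator.mediaDevices.getUserMedia(audioConstraints);
```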

2.2 Video Engine

The video engine (VideoEngine) is responsible for WebRTC's video communication. Through a complete video-processing pipeline, it handles capturing video from an external device such as a camera, transmitting it over the network, and finally displaying it. It consists of two main modules: video codecs and video image processing. For coding, the default codec is VP8, which is well suited to real-time communication scenarios. For image processing, two measures keep the transmitted image high-quality: on one hand, a video jitter buffer reduces the impact of network jitter and packet loss; on the other hand, the image is processed with color enhancement, noise reduction, and similar techniques to improve clarity.
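An application can nudge negotiation toward VP8 with `RTCRtpTransceiver.setCodecPreferences`. The `preferVp8` helper below is an illustrative name, not a WebRTC API; it is a pure function that moves VP8 entries to the front of a codec-capability list.

```javascript
// Reorder a codec capability list so VP8 entries come first.
// Pure helper (assumed name), usable with setCodecPreferences.
function preferVp8(codecs) {
  return [...codecs].sort(
    (a, b) => (b.mimeType === 'video/VP8') - (a.mimeType === 'video/VP8')
  );
}

// In a browser:
//   const caps = RTCRtpSender.getCapabilities('video').codecs;
//   transceiver.setCodecPreferences(preferVp8(caps));
```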

2.3 Network transport

Network transport is responsible for transmitting the audio and video data. Through a complete transport framework, it solves two problems: encrypted transmission of the media and firewall/NAT traversal. On one hand, the SRTP protocol ensures that audio and video data are encrypted in transit; on the other hand, ICE, which combines STUN and TURN, lets the media break through the restrictions of firewalls and NAT networks.
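At the API level, STUN and TURN servers are supplied through the `iceServers` field of the `RTCPeerConnection` configuration. The server URLs and credentials below are placeholders; the object shape is the standard `RTCConfiguration`.

```javascript
// RTCPeerConnection configuration listing ICE servers.
// stun.example.org / turn.example.org and the credentials are
// placeholders; substitute real servers in practice.
const rtcConfig = {
  iceServers: [
    { urls: 'stun:stun.example.org:3478' }, // address discovery (STUN)
    {
      urls: 'turn:turn.example.org:3478',   // relay fallback (TURN)
      username: 'user',
      credential: 'pass',
    },
  ],
};

// In a browser:
//   const pc = new RTCPeerConnection(rtcConfig);
```

SRTP keying is negotiated automatically via DTLS-SRTP; there is no switch for it in this configuration object.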

3 Directory structure

The WebRTC directory structure is roughly as follows. The core is the modules layer; its components are relatively independent and can be used separately, making it something like an audio/video toolbox covering a fairly comprehensive set of components. The audio/video QoS strategies are concentrated mainly in this part. The P2P networking module is another core piece and is very helpful for studying peer-to-peer connectivity. Follow-up articles will dig further into each module and its data flow.
[Figure: WebRTC source directory structure]
The framework structure of WebRTC can be divided into 5 layers. The rough function of each layer is as follows:

Interface layer: played mainly by the mediaengine module. It connects the Java layer and the C/C++ layer, providing the JNI interface for the upper (Java) layer to call, and defining callbacks for the lower layers to invoke.

Logic layer: composed mainly of the audio_engine and video_engine modules, which maintain the logical handling of the audio/video channels. The mediaengine module enters the logic layer through VoiceEngineData/VideoEngineData; the logic layer manages its channels and switching logic through VoEBase/ViEBase, and calls the component layer's interfaces through other classes.

Component layer: composed of the modules directory. It provides all audio/video operation components upward, so the logic layer can call codec functions or the underlying hardware devices and files through these abstractions. This layer can also be seen as an abstraction over platform-specific operations.

General operation layer: composed of the common_audio and common_video modules. It provides common utility functions shared between the different components; it can be regarded as a toolbox layer.

Platform encapsulation layer: composed of the system_wrapper module. It serves as a cross-platform interface, wrapping and abstracting common system interfaces to ease porting between platforms.

Source: blog.csdn.net/qq_38731735/article/details/123338738