One-to-one voice live broadcast system source code - how to solve the technical difficulties of audio and video live broadcast

As an audio and video application that demands strong real-time performance and interactivity, live broadcast poses many technical difficulties, even in one-to-one mode. Low latency, fluency, echo cancellation, domestic and international interoperability, and large-scale concurrency all have to be dealt with during development. Starting from high-quality one-to-one voice live broadcast system source code can resolve these difficulties to a certain extent.

1. Low latency

To ensure low latency, every stage of the chain, from front end to back end, must be handled carefully. On the front end, for example, the encoding algorithm and the frame-dropping strategy need to be done well. The choice of encoder also differs between business scenarios and brings different amounts of encoding delay, so different scenarios achieve different levels of latency. For the push and pull network, most solutions route users who need real-time interaction through a core voice and video network built on high-quality nodes such as BGP nodes. Streams can also be transcoded, converted to another protocol, or mixed, and then distributed through a content delivery network. Accessing the core voice and video network in turn requires an intelligent scheduling strategy so that each user connects to a nearby node.
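The nearby-access step can be illustrated with a small selection routine. This is only a sketch under stated assumptions: the `EdgeNode` fields, the scoring formula (round-trip time plus a packet-loss penalty), and the 0.9 load cutoff are hypothetical choices for the example, not the scheduling strategy of any particular system.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class EdgeNode:
    """Hypothetical access node with measurements taken from the client side."""
    name: str
    rtt_ms: float   # measured round-trip time to the node
    loss: float     # measured packet loss ratio, 0.0 to 1.0
    load: float     # utilisation reported by the node, 0.0 to 1.0


def score(node: EdgeNode, loss_penalty_ms: float = 1000.0) -> float:
    """Lower is better: RTT plus a penalty proportional to packet loss (assumed formula)."""
    return node.rtt_ms + loss_penalty_ms * node.loss


def pick_access_node(nodes: Iterable[EdgeNode], max_load: float = 0.9) -> Optional[EdgeNode]:
    """Return the best-scoring node that is not overloaded, or None if there is none."""
    candidates = [n for n in nodes if n.load < max_load]
    return min(candidates, key=score, default=None)


nodes = [
    EdgeNode("node-a", rtt_ms=20.0, loss=0.05, load=0.3),   # close but lossy
    EdgeNode("node-b", rtt_ms=35.0, loss=0.0, load=0.5),    # slightly farther, clean
    EdgeNode("node-c", rtt_ms=10.0, loss=0.0, load=0.95),   # closest but overloaded
]
best = pick_access_node(nodes)
```

In this example the closest node is skipped because it is overloaded, and the lossy node loses to the clean one, so "nearby" here means nearest in measured network quality rather than in geography.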

2. Fluency

Fluency is where many technical difficulties show up during a live broadcast, and it deserves attention. Common measures include the following.

(1) Use a dynamically sized JitterBuffer. When network conditions are poor or jitter is severe, the JitterBuffer can be enlarged appropriately, trading a little extra delay for resistance to network jitter.

(2) Fast and slow playback. When the network environment is poor, playback can be slowed slightly, without the user noticing, to ride out stalls caused by temporary network jitter. When the network recovers, playback speeds up to catch up quickly. Note that this method does not suit every application.

(3) Bitrate adaptation, that is, dynamically selecting an appropriate bitrate for transmission. To stay smooth, the resolution and frame rate can be adjusted as well. The voice and video engine dynamically adjusts the bitrate, frame rate, and resolution based on the current network speed measurements and the bitrate the application requires, to give a smooth viewing experience.

(4) Layered coding on the push side, which lets the pull side pull different data for rendering according to the network bandwidth it detects. With layered encoding the pull side can choose among levels of video encoding data: more levels when network conditions are good, only the base level when they are poor.

(5) When the current push or pull stream is poor and quality cannot be guaranteed even after reducing the bitrate, resolution, and frame rate, the link can be abandoned.
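Items (1), (2), and (4) above can be sketched in a few lines on the receiving side. The following is a minimal illustration, not the method of any particular engine: the class and function names, the 1/16 smoothing gain (borrowed from the interarrival jitter estimate in RFC 3550), the 40 to 400 ms bounds, the 10% playback-speed limit, and the 0.8 bandwidth headroom are all assumptions chosen for the example.

```python
from typing import List, Optional


class AdaptiveJitterBuffer:
    """Tracks smoothed interarrival jitter and derives a target buffer depth."""

    def __init__(self, min_ms: float = 40.0, max_ms: float = 400.0, safety: float = 3.0):
        self.min_ms = min_ms
        self.max_ms = max_ms
        self.safety = safety                  # multiples of jitter to buffer (assumed)
        self.jitter_ms = 0.0
        self._last_transit: Optional[float] = None

    def on_packet(self, send_ms: float, recv_ms: float) -> None:
        # Smoothed jitter with a 1/16 gain, as in the RFC 3550 estimate.
        transit = recv_ms - send_ms
        if self._last_transit is not None:
            delta = abs(transit - self._last_transit)
            self.jitter_ms += (delta - self.jitter_ms) / 16.0
        self._last_transit = transit

    def target_ms(self) -> float:
        # Grow the buffer when jitter is high, shrink it again when the network calms down.
        return max(self.min_ms, min(self.max_ms, self.safety * self.jitter_ms))


def playout_rate(buffered_ms: float, target_ms: float, max_stretch: float = 0.1) -> float:
    """Item (2): play slightly slower when the buffer runs low, faster to catch up."""
    if target_ms <= 0:
        return 1.0
    error = (buffered_ms - target_ms) / target_ms      # negative means the buffer is starving
    return 1.0 + max(-max_stretch, min(max_stretch, 0.5 * error))


def select_layers(layer_kbps: List[int], bandwidth_kbps: float, headroom: float = 0.8) -> int:
    """Item (4): how many layers to pull, base layer first. 0 means not even the base fits."""
    budget = bandwidth_kbps * headroom
    total, count = 0, 0
    for kbps in layer_kbps:
        if total + kbps > budget:
            break
        total += kbps
        count += 1
    return count
```

A return value of 0 from `select_layers` corresponds to item (5): even the base layer does not fit the measured bandwidth, so the link is a candidate for being abandoned.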

3. Echo cancellation

First, a brief introduction to the principle of echo cancellation. The far-end signal is fed into the echo cancellation module as a reference signal and is then played through the loudspeaker. Reflections from the surrounding environment turn it into an echo, which the microphone picks up together with the real near-end audio. The echo cancellation module builds a filter from the reference signal, removes the echo from the captured signal, and sends the result out. Google's open source WebRTC provides an echo cancellation module, but it was designed for audio and video interaction on PCs and adapts poorly to mobile devices, especially Android.
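The filter-from-reference idea can be illustrated with a normalized least mean squares (NLMS) adaptive filter, a textbook technique for this job. This sketch is not WebRTC's implementation: the function names, the 16-tap filter length, the step size, and the simulated echo path are assumptions for the example, and real cancellers also need delay estimation, double-talk detection, and residual echo suppression.

```python
import random
from typing import List, Sequence


def nlms_echo_cancel(far_end: Sequence[float], mic: Sequence[float],
                     taps: int = 16, mu: float = 0.5, eps: float = 1e-6) -> List[float]:
    """Subtract the estimated echo of far_end from mic and return the residual signal."""
    weights = [0.0] * taps     # adaptive estimate of the echo path
    history = [0.0] * taps     # most recent reference samples, newest first
    out = []
    for x, d in zip(far_end, mic):
        history.insert(0, x)
        history.pop()
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate                      # near-end audio plus uncancelled echo
        norm = sum(h * h for h in history) + eps   # normalise the step by reference power
        step = mu * e / norm
        for i in range(taps):
            weights[i] += step * history[i]
        out.append(e)
    return out


def simulate_echo(far_end: Sequence[float], path: Sequence[float]) -> List[float]:
    """Convolve the reference with a short, made-up room response."""
    return [sum(path[k] * far_end[n - k] for k in range(len(path)) if n - k >= 0)
            for n in range(len(far_end))]


def energy(samples: Sequence[float]) -> float:
    return sum(v * v for v in samples)


def demo(seed: int = 0, n: int = 4000):
    """Return (echo energy, residual energy) over the last 500 samples."""
    rng = random.Random(seed)
    far = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    echo = simulate_echo(far, [0.0, 0.0, 0.6, 0.0, -0.3, 0.1])
    residual = nlms_echo_cancel(far, echo)         # mic contains only echo here
    return energy(echo[-500:]), energy(residual[-500:])
```

In the demo the microphone signal contains only echo, so once the filter has adapted, the residual should be close to silence. With near-end speech present, the same residual would carry that speech with most of the echo removed.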

 

4. Domestic and international interoperability

This applies to products that operate overseas. Streaming data and control signaling then have to cross borders, so relay nodes should be deployed sensibly around the world. The choice of data path depends on the business: a service routing table has to be built on top of the physical routes of the links and tuned to the user scenario, such as user distribution, access frequency, and peak hours. The route may be different every time.
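One simple way to picture such a routing table is a graph of relay nodes with measured link costs, from which the cheapest path is recomputed whenever the measurements change. The sketch below uses Dijkstra's shortest-path algorithm; the relay names and latency figures are invented for illustration, and a real service would also weigh the business factors mentioned above.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, float]]   # relay -> {neighbour relay: measured cost in ms}


def best_route(links: Graph, src: str, dst: str) -> Tuple[float, List[str]]:
    """Dijkstra's algorithm: return (total cost, path), or (inf, []) if dst is unreachable."""
    dist = {src: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue                              # stale heap entry
        for nxt, cost in links.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if dst not in dist:
        return float("inf"), []
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[dst], path


# Hypothetical relays and one-way latencies in milliseconds.
links: Graph = {
    "beijing": {"hongkong": 40.0, "frankfurt": 180.0},
    "hongkong": {"singapore": 35.0, "frankfurt": 190.0},
    "singapore": {"frankfurt": 160.0},
    "frankfurt": {"london": 15.0},
}
normal = best_route(links, "beijing", "london")

links["beijing"]["frankfurt"] = 260.0             # the direct link becomes congested
congested = best_route(links, "beijing", "london")
```

With the first set of measurements the direct Beijing to Frankfurt link wins. After that link degrades, the route shifts through Hong Kong, which is why the chosen path can differ from one session to the next.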

5. Massive concurrency

This is a problem every Internet product encounters. The main considerations are load balancing, smooth capacity expansion, proxy scheduling for regions that cannot be covered directly, disaster recovery, and access layer design, which will not be discussed here.

In short, high-quality one-to-one voice live broadcast system source code is only an aid during development. The factors and potential problems described above still have to be considered. Only then can a truly high-quality live streaming app be developed. Otherwise, it will "disappear" from the live broadcast field.

Origin blog.csdn.net/Fxhddg/article/details/126425013