Let's talk about live streaming technology and the stream-mixing scheme behind Lianmai interaction: where should we mix?

 

Let's open with the famous line spoken by Zeng Zhiwei's (Eric Tsang's) triad boss in "Infernal Affairs": if you're out there in the game, sooner or later you'll have to pay it back.

He may be right, but where you hang out, and which boss you follow, can make all the difference to your fate. The same is true in Lianmai interactive live streaming: where you mix (the streams) matters enormously. Make the right choice, and the user experience will be in a different league.

Without further ado, let me begin.

 

Where can you mix?

Before deciding where to mix, let's first figure out where mixing is even possible.

In the previous article we discussed whether to mix streams; in this article we discuss where to mix them.

 

 

                                    Figure 1: System topology of the Lianmai interactive live streaming solution

Figure 1 shows the system topology of the mainstream Lianmai interactive live streaming solution in the industry today.

 

The topology contains the following entities:

1) Push end (anchor side): the anchor's working environment, including phone hardware and network conditions. The phone's computing power and the uplink network often become the bottleneck of Lianmai interactive live streaming.

2) Server side (server cluster): a large, complex server cluster that provides the scheduling and computing capabilities of the audio and video cloud. Concretely, it includes signaling servers, media server clusters, a stream-mixing scheduling center, and stream-mixing server clusters.

3) CDN network: a third-party, independent public service network that provides buffering, storage, and forwarding.

4) Pull end (viewer side): the environment in which viewers watch the broadcast, including phone hardware and network conditions. The pull end generally pulls the stream from the CDN and does not participate in the Lianmai interaction, so the phone's computing power and the downlink network will not become a bottleneck.

 

The activities between entities in the topology include:

1) Pushing streams: a push end pushes its original audio and video stream to the media server cluster.

2) Pulling streams: there are two cases: a push end pulls the audio and video streams of the other anchors from the server cluster; a pull end pulls audio and video streams from the edge nodes of the CDN network for playback, which may be a single stream or multiple streams.

3) Forwarding to the CDN: there are two cases: if streams are mixed on the server side, the server cluster pushes the mixed audio and video stream to the CDN network; if streams are mixed on the push end, the push end pushes the mixed stream to the CDN network.
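As a minimal mental model of the entities and activities above, the sketch below wires the three activities together. The class and method names are made up for illustration; real media servers handle packetized frames, not byte strings.

```python
# Toy model of the Figure-1 topology: a media server cluster that
# accepts pushed streams and serves pulls, plus a CDN that accepts
# forwarded (mixed) streams for viewer playback.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MediaServerCluster:
    streams: Dict[str, bytes] = field(default_factory=dict)

    def accept_push(self, anchor_id: str, stream: bytes) -> None:
        # Activity 1: a push end pushes its original stream here.
        self.streams[anchor_id] = stream

    def pull_for_anchor(self, requester: str) -> List[bytes]:
        # Activity 2a: a push end pulls the other anchors' streams.
        return [s for a, s in self.streams.items() if a != requester]


@dataclass
class CDN:
    cache: List[bytes] = field(default_factory=list)

    def accept_forward(self, stream: bytes) -> None:
        # Activity 3: a mixed stream is forwarded to the CDN.
        self.cache.append(stream)

    def pull_for_viewer(self) -> bytes:
        # Activity 2b: a pull end plays back from a CDN edge node.
        return self.cache[-1]
```

In the server-side scheme the cluster itself would call `accept_forward`; in the push-end scheme the first anchor does.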

Now that we've surveyed the strongholds in this world, we can wisely choose where to mix.

First, the CDN network is ruled out, because it is a third-party service outside the control of the Lianmai interactive live streaming cloud platform.

Next, mixing on the pull end is in fact the no-mixing scheme described in the previous article: the pull end pulls multiple streams and flexibly controls them during playback according to the needs of the business side. Its advantage is high flexibility and easy control; its disadvantage is a relatively high network bandwidth cost. Since the previous article discussed it in depth, we won't repeat it here.

That leaves two options: the push end and the server side. The push end runs the SDK of the audio and video cloud service, and the server side is the server cluster that provides the cloud service; both are capable of mixing streams.

 

Where should we mix?

We now have to decide where to mix, with two choices in front of us: the push end versus the server side.

We need to understand the advantages and disadvantages of mixing streams on the push side and on the server side before we can make a wise choice.

 

Mixing streams on the push end

To understand the advantages and disadvantages of mixing streams on the push end, we must first understand its technical logic.

 

                                                Figure 2: Technical logic of push-end stream mixing

 

Figure 2 builds on Figure 1, adding the technical logic of push-end mixing (the blue part):

1) Each push end pushes its original audio and video stream to the server cluster.

2) The push end pulls the audio and video streams pushed by the other push ends from the server. Mixing is done on the push end of the first anchor, who cannot start mixing until the audio and video streams of all the other anchors have arrived.

3) The mixing work on the push end includes decoding, picture alignment (audio and video synchronization), jitter buffering, and re-encoding.

4) The push end (the first anchor) pushes the mixed single stream to the CDN network, so that the pull end can pull it for playback.

Next, let's see how much extra work mixing on the push end involves.

 

Mixing on the push end involves the following steps:

1) the push end pulls streams and waits for the other anchors' audio and video streams to arrive; 2) decoding; 3) mixing; 4) encoding; 5) forwarding the mixed stream.

Of these, steps 1 and 2 are things the push end must do even without mixing, while steps 3 to 5 are extra work incurred by mixing. Mixing two streams into one costs less than half the work of decoding one stream, and decoding one stream costs about half the work of encoding one. Bear in mind, though, that as the number of mixed streams grows, the mixing workload grows with it.
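Those workload ratios can be turned into a back-of-the-envelope model. The constants below are illustrative stand-ins for the quoted ratios, not benchmarks:

```python
# Relative workload units: encoding one stream is the baseline.
ENCODE = 1.0          # encode one stream
DECODE = ENCODE / 2   # decoding ~ half the work of encoding
MIX = DECODE / 2      # mixing in one stream: < half the work of decoding


def baseline_workload(n_streams: int) -> float:
    """Steps 1-2: decoding every incoming stream happens with or
    without mixing."""
    return DECODE * n_streams


def extra_mixing_workload(n_streams: int) -> float:
    """Steps 3-5: the additional work the first anchor takes on when
    mixing n_streams into one stream and re-encoding the result."""
    return MIX * n_streams + ENCODE
```

With these numbers, mixing four streams costs 2.0 units of extra work, as much as encoding two streams outright, which is why the workload keeps climbing with the number of anchors.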

Next, let's look at what mixing requires and what the push end has to offer. Mixing is a resource-hungry affair, and the push end is a resource-poor place; the two are a poor match by nature. If we suddenly insist on putting them together, we had better first study each side's requirements and characteristics.

 

What stream mixing requires of the push end

1) Better uplink network bandwidth, because the push end (the first anchor) has to push two streams: the original stream and the mixed stream. The network also needs to stay relatively stable, because network instability lengthens the jitter buffer during mixing, which in turn increases latency.

2) Better phone hardware, because transcoding and mixing multiple audio and video streams consumes considerable computing resources. If the combined bitrate of the streams to be transcoded is high, Android devices that rely on software encoding may overheat, causing frame drops during camera capture.

 

 Features of the push end

1) The anchor's network may be home broadband or 4G. Downlink bandwidth is around 100 Mbps, while uplink bandwidth is only around 1 Mbps. The stability and speed of home broadband also fluctuate with peak hours, as anyone on a domestic connection knows.

2) The anchor's broadcast terminal is their personal smartphone. Today's mainstream quad-core phones can handle Lianmai interactive live streaming, but phone hardware can hardly compare with a PC, let alone a server.

3) Uncontrollable. Neither the live streaming business platform nor the live streaming cloud service platform can control the push end's phone configuration or network environment. Traditional show-style platforms, or platforms tied to traditional media, provide anchors with studios that have better hardware and network conditions; anchors on other entertainment platforms generally broadcast from personal phones over home broadband.

Put them side by side and the mismatch is obvious: one is a demanding Virgo, and the other is simply too weak to deliver.

Finally, let's summarize the advantages and disadvantages of push-side mixing.

 

Advantages of mixing streams on the push side

1) Low cost

Overall, push-end mixing is a low-cost solution. It cuts costs in two areas: computing resources and network bandwidth. In essence, mixing on the push end means the server transfers the cost of mixing to the push end. Computing resources and bandwidth on the server side are relatively expensive, while those on the push end are already paid for (sunk costs). Mixing on the push end therefore lowers the server's cost while making full use of idle resources on the push end.

2) Low pressure on the server

Mixing on the server is a relatively centralized mode, which increases the pressure on the server; mixing on the push end is a fully distributed mode, which relieves that pressure.

3) Locally output the mixed data

After mixing on the push end, the mixed audio and video stream is available locally, which makes local recording convenient and allows the stream to be pushed directly to the CDN network for distribution.

 

Disadvantages of mixing streams on the push side

1) Added latency

First, mixing on the push end adds extra latency, mainly because mixing cannot start until the audio and video streams from all the other push ends have arrived. As Figure 2 shows, with server-side mixing, mixing can start as soon as all the anchors' streams reach the server; with push-end mixing, on top of that, the other anchors' streams must first be pulled back down to the push end before mixing can start. That round trip is pure extra time. Second, after mixing, pushing the stream to the CDN network takes longer, because the push end's hardware and uplink bandwidth are no match for the server's. Finally, given all the instability on the push end, this extra latency will only grow, never shrink.

2) Mobile phone hardware configuration bottleneck

Mixing on the push end requires better phone hardware. Generally speaking, today's mainstream quad-core phones can meet the requirements of Lianmai interactive live streaming. Add the mixing workload on top, however, and the phone's hardware becomes a bottleneck. For example, if an Android phone uses software encoding and the number of streams to mix is large, the heavy computation heats the phone, which may cause the camera (sitting close to the CPU) to drop frames during video capture.

3) Uplink network bandwidth bottleneck

Mixing on the push end requires relatively good uplink bandwidth. Where downlink bandwidth is 100 Mbps, the corresponding uplink is typically 1 Mbps, or 4 Mbps on better plans. In Jigou Technology's experience, the average bitrate of an audio and video stream is about 800 kbps. The push end pushes two streams, the original and the mixed, so the total push bitrate is about 1.6 Mbps. Considering that bandwidth degrades at peak hours and networks are unstable, the push end's uplink often cannot meet the demands of push-end mixing.
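That arithmetic is easy to check in code. The 800 kbps figure comes from the text; the peak-hour discount factor is an assumption for illustration:

```python
# Uplink feasibility check for push-end mixing.
AVG_STREAM_KBPS = 800  # average audio+video bitrate quoted above


def required_uplink_kbps(mixing_on_push_end: bool) -> int:
    """Push-end mixing pushes two streams (original + mixed);
    otherwise only the original stream goes up."""
    return AVG_STREAM_KBPS * (2 if mixing_on_push_end else 1)


def uplink_is_sufficient(uplink_kbps: float, peak_factor: float = 0.7) -> bool:
    """Discount nominal bandwidth for peak-hour congestion, then
    compare against what push-end mixing needs."""
    return uplink_kbps * peak_factor >= required_uplink_kbps(True)
```

A nominal 1 Mbps uplink yields only about 700 kbps usable, far short of the roughly 1.6 Mbps needed, while a 4 Mbps uplink clears the bar.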

4) The push-end environment is uncontrollable

Combining points 2 and 3 above, the push-end environment is uncontrollable. Neither the live streaming business platform nor the live streaming cloud service platform can control the push end's hardware configuration, usage habits, network signal, or network bandwidth. The quality of push-end mixing is therefore equally uncontrollable.

5) Difficult to scale

When designing an audio and video cloud service solution, we want it to scale easily. As a live streaming platform grows, so do the demands on the push end's computing power and network bandwidth. But the push-end environment is uncontrollable and hard to upgrade. By contrast, adding CPUs or bandwidth for server-side mixing is entirely within the cloud service platform's control.

To sum up, the push end is not an ideal place to mix, but it offers a low-cost mixing solution. Push-end mixing can meet the business needs of a fair number of live streaming platforms at a certain stage of development, and this market demand deserves to be fully explored and served.

 

Mixing streams on the server side

To understand the advantages and disadvantages of mixing streams on the server side, we must first understand its technical logic.

 

                                            Figure 3: Technical logic of server-side stream mixing

 

Figure 3 builds on Figure 1, adding the technical logic of server-side mixing (the blue part):

1) Each push end pushes its original stream to the server cluster.

2) Once all the push ends' audio and video streams have arrived, the server starts mixing. The mixing work likewise includes decoding, picture alignment (audio and video synchronization), jitter buffering, and re-encoding.

3) The server pushes the mixed single stream to the CDN network, so that the pull end can pull it for playback.

Next, let's see how much extra work mixing on the server brings.

 

Mixing on the server involves the following steps:

1) the push ends push their streams and the server waits for all the anchors' audio and video streams to arrive; 2) decoding; 3) mixing; 4) encoding; 5) forwarding the mixed stream.

The steps are almost identical to push-end mixing, but the working environment differs, and on the server every step is extra workload. One difference in work content: on the push end, decoding is done whether or not streams are mixed, whereas on the server, decoding is extra work required only for mixing.
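The gating step, waiting for every anchor's stream before mixing starts, can be sketched as follows (hypothetical names; a real mixer operates frame by frame on decoded audio and video, not on whole payloads):

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class Stream:
    anchor_id: str
    payload: bytes


def mix_on_server(incoming: Dict[str, Stream],
                  expected_anchors: Set[str]) -> Optional[Stream]:
    # Step 1: mixing cannot start until every anchor's stream arrives.
    if set(incoming) != expected_anchors:
        return None
    # Steps 2-4: decode, align, mix, re-encode (placeholder join here).
    mixed_payload = b"|".join(s.payload for s in incoming.values())
    # Step 5: the single mixed stream is then forwarded to the CDN.
    return Stream(anchor_id="mixed", payload=mixed_payload)
```

The same gating logic applies on the push end, except that the streams must first travel back down to the first anchor's phone.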

Next, let's look at what mixing requires and what the server has to offer. Mixing is a resource-hungry affair, and the server is a place of abundant resources; the two look like a fine match. Still, before putting them together, let's check their horoscopes first.

 The requirements for stream mixing on the push side have been analyzed above, and only the requirements for stream mixing on the server side are discussed here.

 

What stream mixing requires of the server

1) Better uplink network bandwidth. All the audio and video streams pushed by the push ends converge on the server and, after mixing, are forwarded to the CDN network. Each Lianmai live room corresponds to one mixed stream, so this centralized mixing puts real pressure on the server's uplink.

 2) Better server hardware configuration. This centralized mixing method will cause certain pressure on the computing resources of the server.

 

Features of the server

1) The network bandwidth resources are relatively sufficient and support expansion.

2) The computing resources are relatively sufficient, and expansion is supported.

3) Fully controllable. The audio and video cloud service platform can adjust the configuration of the server according to the pressure of the network and computing.

4) Expandable. The server side generally adopts a server-cluster design, which is flexible and scalable. Growing demand for network bandwidth and computing resources can be met through flexible upgrades, even dynamic allocation.

 

Put them side by side and the match is obvious: one loves to spend, and the other is rich.

 

Finally, let's summarize the advantages and disadvantages of server-side mixing.

 Advantages of mixing streams on the server side

 1) Low latency

Server-side mixing is naturally low-latency. It only has to wait for all the anchors' streams to reach the server before mixing can start; push-end mixing, on top of that, must pull all the other anchors' streams back down from the server to the push end before mixing can start. By design, server-side mixing saves that leg of network transmission. In addition, the server's computing power and bandwidth are orders of magnitude greater than the push end's, so both the mixing itself and pushing the result to the CDN take less time on the server. Taken together, server-side mixing achieves lower latency than push-end mixing.
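The structural difference can be captured in a toy latency model. Any millisecond figures plugged in would be assumptions; only the extra pull-down leg comes from the text:

```python
# Per-leg latency model contrasting the two mixing schemes.
def server_side_latency(uplink_ms: float, mix_ms: float,
                        cdn_push_ms: float) -> float:
    # streams reach the server -> mix -> forward to the CDN
    return uplink_ms + mix_ms + cdn_push_ms


def push_end_latency(uplink_ms: float, pull_down_ms: float,
                     mix_ms: float, cdn_push_ms: float) -> float:
    # streams go up to the server, come back down to the first
    # anchor, are mixed there, then go up again to the CDN
    return uplink_ms + pull_down_ms + mix_ms + cdn_push_ms
```

With identical mix and CDN-push times, the push-end scheme is slower by exactly the pull-down leg; in practice `mix_ms` and `cdn_push_ms` are also larger on a phone, so the gap widens further.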

2) Sufficient computing resources

The computing resources on the server side are relatively sufficient, and can be expanded and scheduled without becoming a bottleneck.

3) Sufficient network bandwidth resources

The network bandwidth of the server is relatively sufficient, and it can be expanded and scheduled without becoming a bottleneck.

 4) Controllable and scalable

In fact, this is the server's biggest advantage. The server side has the abundant resources of the cloud service platform, which can be adjusted and expanded flexibly, backed by professional services and the strong support of a professional team. This is the advantage of a cloud service platform; this is how an army fights; this is winning tough battles through organization. To put it concretely, in Jigou Technology's experience, one CPU core can support 5 streams, so 8 cores can support 40. As streams keep growing, CPUs can keep being added to grow computing power without users noticing. A phone, by contrast, has no CPU to add: the anchor either changes phones or waits for the one in hand to cook.
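That rule of thumb (1 core per 5 concurrent streams, per the figures quoted above) makes capacity planning a one-liner; actual capacity of course depends on codec, resolution, and bitrate:

```python
import math

STREAMS_PER_CORE = 5  # rule of thumb quoted above


def capacity(n_cores: int) -> int:
    """How many concurrent mixed streams a server can carry."""
    return n_cores * STREAMS_PER_CORE


def cores_needed(n_streams: int) -> int:
    """Smallest core count that covers the load."""
    return math.ceil(n_streams / STREAMS_PER_CORE)
```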

 

Disadvantages of mixing streams on the server side

1) High cost

Mixing streams on the server side will cause the server side to bear additional computing costs and network bandwidth costs, thereby driving up operating costs.

2) High pressure on the server

Mixing on the server side is also called centralized mixing. The bandwidth pressure of the audio and video streams, and the computational pressure of transcoding and mixing, all converge on the server, which naturally raises the pressure there. This also challenges the server's architecture design: it must be scalable and able to absorb pressure in a distributed, clustered manner.

To sum up, the server is an ideal place to mix streams: it offers low latency and high service quality, though at a relatively high cost. It can meet the business needs of a good number of mature or quality-focused live streaming platforms, and this market demand is the mainstream and the trend of the future.

With the discussion above behind us, let's step back and compare push-end and server-side mixing. Each solution turns out to have its own strengths, and each represents considerable market demand. At every stage of the industry's development, both needs deserve to be respected and served, to promote the industry's healthy growth toward maturity.

From a medium- to long-term perspective, however, the advantages of cloud service platforms are being recognized and fully developed. The philosophy of a cloud service platform is to provide the industry with high-quality professional services through the platform's resources and capabilities and a professional team. In serving a large customer base, Jigou Technology has observed a trend: more and more live streaming platforms, especially first-tier ones, are adept at leveraging the strengths of cloud service platforms to guarantee a high-quality user experience and then rapidly expand market share.

Well, now that we understand the strengths and weaknesses of the various strongholds, we can wisely choose where to mix. In short, there are three places to mix streams: the push end, the server side, and the pull end. Jigou Technology focuses on user experience and service quality, currently offers server-side mixing and pull-end mixing, and will provide push-end mixing when market demand calls for it.

 
