Applying real-time audio and video technology to celebrity companion-viewing live broadcasts


Editor's note: In recent years iQiyi has launched a celebrity companion-viewing live broadcast service, which lets celebrities interact with audiences in real time around films, TV dramas, and variety shows, and it has steadily attracted users' attention. On the technical side, iQiyi worked closely with third-party audio and video service providers to keep costs as low as possible while getting the best results. LiveVideoStackCon 2023 Shanghai invited Shi Xingdong from iQiyi to share the overall technical architecture of iQiyi's celebrity companion-viewing live broadcast business, along with the optimizations made for drama copyright management, reuse of existing infrastructure, and high availability.

Author/Shi Xingdong

Editor/LiveVideoStack

Hello everyone. My topic today is the practical application of real-time audio and video technology in celebrity companion-viewing live broadcasts.

f89687493d377d4c2853969a72c8e038.png

First, a brief self-introduction. I joined iQiyi in 2013 and have long worked on the architecture design and R&D of instant messaging, WebRTC, live video, and related technologies.

This talk has the following parts: first, the business rationale for companion-viewing live broadcasts; then the general technical architecture; then the adjustments we made to that architecture for practical reasons of copyright, cost, and high availability; and finally some of our considerations for the front-end technical architecture.

-01-

The business rationale for the celebrity companion-viewing model

9f9e0cd66886440a029871a22ea3a112.png

As shown in the picture above, iQiyi's companion-viewing live broadcasts let celebrities watch movies, TV dramas, and variety shows together with the audience while interacting live. For streaming platforms, this model is a way to promote variety shows, movies, and TV series. For celebrities, it increases exposure and attracts more fans while promoting the shows they appear in. For the audience, it brings them closer to the celebrities themselves.

The major streaming platforms have developed different forms of celebrity companion viewing. Some, for example, use a talk-show format in which celebrities watch film and TV clips with an audience in a studio. Our companion-viewing model focuses on being lightweight and automated, and aims to free up operations staff. Celebrities only need the app provided by iQiyi to start a live broadcast, with no restrictions on venue. At present we run 20 to 30 live broadcasts every month, which is close to one per day.

-02-

The general technical architecture of the live broadcast

17654ba955b15e9271c81a1bd0fba06a.png

Next, we introduce the general technical architecture of the live broadcast. The picture above shows the architecture for the simplest scenario: celebrities only chat, without watching on-demand dramas together.

Because the celebrity side needs real-time communication, WebRTC is the only suitable transport protocol there. The audience side has a very large number of downstream users (tens to hundreds of thousands). WebRTC would give low delay, but at that scale the cost pressure is high. The lower-cost HLS has too much delay, which makes it hard for celebrities to get audience feedback in real time. After weighing these factors, we chose HTTP-FLV as the audience-side transport protocol.
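The trade-off above can be sketched as "pick the cheapest protocol that still meets the delay budget". The delay and cost figures below are rough illustrative assumptions, not measurements from the talk, and `pick_viewer_protocol` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Protocol:
    name: str
    typical_delay_s: float  # illustrative ballpark, not a measured value
    relative_cost: int      # illustrative rank; higher means more expensive per viewer


# Assumed ordering: WebRTC is fastest and most expensive, HLS slowest and cheapest.
CANDIDATES: List[Protocol] = [
    Protocol("WebRTC", 0.5, 3),
    Protocol("HTTP-FLV", 3.0, 2),
    Protocol("HLS", 15.0, 1),
]


def pick_viewer_protocol(max_delay_s: float,
                         candidates: List[Protocol] = CANDIDATES) -> Protocol:
    """Return the cheapest protocol whose typical delay fits the budget."""
    fits = [p for p in candidates if p.typical_delay_s <= max_delay_s]
    if not fits:
        raise ValueError(f"no protocol meets a {max_delay_s}s delay budget")
    return min(fits, key=lambda p: p.relative_cost)


# A budget of a few seconds excludes HLS; cost then favors HTTP-FLV over WebRTC.
print(pick_viewer_protocol(5.0).name)
```

With a sub-second budget only WebRTC qualifies, which matches the celebrity-side choice described above.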

2a279bf9ab3dda45031ececb2188beb9.png

As mentioned above, cost rules out using RTC protocols across the entire link, which leaves a large delay gap between the celebrity side and the audience side.

Next comes the choice of media server mode. Delivering several streams means keeping them synchronized, and multi-stream mode, which tolerates very little difference in delay between streams, is no longer applicable given that delay gap. We can only use single-stream mode, in which the media server mixes the multiple celebrity-side streams into one. This mode is less affected by network conditions, but it requires the media server to support video mixing and makes its structure more complex.

b5fa70e3a37506078c0c21c798d40b4a.png

For the "celebrity chat + on-demand viewing" scenario, we added a VOD video file processing module and a control module to the architecture above. The processing module converts local VOD video files into a real-time video stream, which the mixing module merges with the celebrity chat stream before pushing the result to the audience side.

The control module adjusts the screen layout and the volume ratio between the drama and the celebrity streams as the situation requires, so that the layout is sensible and the chat audio is clear, which improves the viewing experience.
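As a rough sketch of what such a control module computes, the code below places celebrity windows over the drama picture and mixes the two audio sources with separate gains. The layout rule, default gains, and function names are assumptions for illustration; the talk does not describe the actual parameters.

```python
from typing import List, NamedTuple, Tuple


class Rect(NamedTuple):
    x: int
    y: int
    w: int
    h: int


def companion_layout(canvas_w: int, canvas_h: int, n_celebrities: int,
                     inset_frac: float = 0.2,
                     margin: int = 16) -> Tuple[Rect, List[Rect]]:
    """Drama fills the canvas; 16:9 celebrity windows stack down the right edge."""
    drama = Rect(0, 0, canvas_w, canvas_h)
    w = int(canvas_w * inset_frac)
    h = w * 9 // 16
    x = canvas_w - w - margin
    insets = [Rect(x, margin + i * (h + margin), w, h)
              for i in range(n_celebrities)]
    return drama, insets


def mix_audio(drama: List[float], chat: List[float],
              drama_gain: float = 0.4, chat_gain: float = 1.0) -> List[float]:
    """Mix two sample sequences in [-1, 1], lowering the drama so chat stays clear."""
    mixed = (d * drama_gain + c * chat_gain for d, c in zip(drama, chat))
    return [max(-1.0, min(1.0, s)) for s in mixed]  # clip to the valid range


drama_rect, celeb_rects = companion_layout(1280, 720, n_celebrities=2)
print(drama_rect, celeb_rects)
print(mix_audio([0.5, -0.5], [0.5, -0.9]))
```

Changing `inset_frac` or the two gains at run time corresponds to the adjustments the control module makes during a broadcast.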

d90282ea9bd48744221774e53f5e20a8.png

iQiyi has its own real-time communication capabilities. However, live broadcasts currently run only during part of the evening, and the network environment where celebrities are located is generally poor, so to minimize construction and maintenance costs we ultimately chose a public cloud platform to provide the media communication service.

However, video files are a core asset of iQiyi. They are stored on the intranet and cannot be accessed directly from the public cloud, so we needed a way to transmit them.

-03-

Adjustments to the architecture from a copyright perspective

d3d753428fdc1b6704697666667fd632.png

Existing transmission methods fall into offline transfer (such as FTP) and real-time transmission (such as RTMP). Offline transfer needs little up-front development, but it takes a lot of manual work for each live broadcast and also increases the risk of media asset files leaking. Real-time transmission is automated and needs little manual work; its drawback is the large up-front development effort. To minimize the risk of leaks, we chose the real-time transmission solution.

9a007bd4e9bc544a7942e7e754017f20.png

Since RTMP is a common real-time media transport protocol, we decided to push the processed real-time video stream to the public cloud through an RTMP server, which avoids handing over media asset files in advance and the leak risk that comes with it. The green parts of the picture are the work done by iQiyi itself.

Adding the RTMP hop increases video playback delay, but a delay of a few seconds has little impact on the experience. The celebrities' chat is a reaction to the video content they see, so the chat and the video generally do not fall out of sync after mixing.
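One common way to turn a local file into a real-time RTMP push is the ffmpeg command-line tool. The talk does not say which tool iQiyi uses, so the following is only an illustrative sketch; the file path and RTMP URL are hypothetical placeholders.

```python
from typing import List


def build_rtmp_push_command(video_path: str, rtmp_url: str,
                            start_s: float = 0.0) -> List[str]:
    """Build an ffmpeg command that streams a local file to an RTMP server."""
    return [
        "ffmpeg",
        "-re",                    # read the input at its native frame rate
        "-ss", f"{start_s:.3f}",  # start position in seconds (input seek)
        "-i", video_path,
        "-c", "copy",             # no re-encode; assumes FLV-compatible codecs (e.g. H.264/AAC)
        "-f", "flv",              # RTMP carries FLV-packaged media
        rtmp_url,
    ]


# Hypothetical path and URL, for illustration only.
cmd = build_rtmp_push_command("/data/vod/episode01.mp4",
                              "rtmp://rtmp.example.com/live/room42")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run` to start the push. Because only a paced stream leaves the intranet, the file itself is never copied to the public cloud.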

-04-

Adjustments to the architecture from a cost perspective

957528d5da96fa005f09d324618b4693.png

Public clouds that provide real-time video communication services currently tend to charge by actual traffic. From a cost perspective, since the live broadcasts do not take up much bandwidth at present, iQiyi's existing CDN can handle media distribution; this only requires setting up an RTMP server between the public cloud and the iQiyi CDN.

In the end, only the real-time streams on the celebrity side actually generate traffic in the public cloud. These streams are short and low in resolution, which greatly reduces network costs.
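A back-of-the-envelope calculation shows why this matters. Every number below (session length, audience size, bitrates) is a made-up illustration, not a figure from the talk, and the model ignores details such as the streams celebrities receive from each other.

```python
def traffic_gb(bitrate_kbps: float, duration_s: float, receivers: int = 1) -> float:
    """Traffic in gigabytes (10^9 bytes) for one stream sent to `receivers` endpoints."""
    bits = bitrate_kbps * 1000 * duration_s * receivers
    return bits / 8 / 1e9


# Hypothetical values, for illustration only.
DURATION_S = 2 * 3600        # a two-hour session
VIEWERS = 100_000            # audience size
VIEWER_BITRATE_KBPS = 2000   # mixed stream delivered to each viewer
CELEB_BITRATE_KBPS = 800     # each celebrity's camera stream
CELEBS = 3

# Option A: the public cloud also distributes the mixed stream to every viewer.
cloud_distribution_gb = traffic_gb(VIEWER_BITRATE_KBPS, DURATION_S, VIEWERS)

# Option B: the public cloud carries only the celebrity streams plus one mixed
# stream pushed once to an RTMP server in front of the existing CDN.
cloud_rtc_only_gb = (traffic_gb(CELEB_BITRATE_KBPS, DURATION_S, CELEBS)
                     + traffic_gb(VIEWER_BITRATE_KBPS, DURATION_S, 1))

print(f"cloud distributes to viewers: {cloud_distribution_gb:,.0f} GB")
print(f"cloud handles RTC only:       {cloud_rtc_only_gb:,.2f} GB")
```

Under these assumptions the public cloud traffic differs by several orders of magnitude, because audience fan-out dominates everything else.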

-05-

Adjustments to the architecture from a high-availability perspective

631b3b07087d2b435f9791ea26e4428f.png

To further improve the stability of the architecture, we first analyzed the existing risk points. The public cloud servers and iQiyi's own CDN have both been running for many years and are relatively mature. The processing module and the RTMP servers that process and distribute the real-time video stream, however, are new links, so they are where we need to watch most closely for instability.

a63d84cea0c8bf3fe7c39ce50357c6e1.png

To avoid single points of failure in the new links, we adopted a multi-instance solution. RTMP forwarding happens in real time and keeps no progress state, so another instance can take over seamlessly without any synchronization. The video file processing module, by contrast, does have a notion of progress, so when it is interrupted the playback position must be synchronized to the instance that takes over.
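The difference between the stateless RTMP relay and the stateful file processor can be sketched as follows. This is a minimal simulation under assumed names (`ProgressStore`, `VodProcessor`); a real deployment would keep the checkpoint in some shared store, which the talk does not describe.

```python
from typing import Dict


class ProgressStore:
    """In-memory stand-in for a store shared by all processor instances."""

    def __init__(self) -> None:
        self._positions: Dict[str, float] = {}

    def save(self, stream_id: str, position_s: float) -> None:
        self._positions[stream_id] = position_s

    def load(self, stream_id: str) -> float:
        return self._positions.get(stream_id, 0.0)


class VodProcessor:
    """One instance of the (hypothetical) video file processing module."""

    def __init__(self, name: str, store: ProgressStore) -> None:
        self.name = name
        self.store = store

    def play(self, stream_id: str, seconds: float,
             checkpoint_every_s: float = 1.0) -> float:
        """Simulate streaming for `seconds`, resuming from the last checkpoint."""
        position = self.store.load(stream_id)
        end = position + seconds
        while position < end:
            position = min(position + checkpoint_every_s, end)
            self.store.save(stream_id, position)  # checkpoint the progress
        return position


store = ProgressStore()
primary = VodProcessor("primary", store)
standby = VodProcessor("standby", store)

primary.play("episode01", 42.5)           # primary fails after 42.5 s
resumed = standby.play("episode01", 10)   # standby continues from 42.5 s
print(resumed)
```

An RTMP relay needs none of this: a replacement instance simply starts forwarding whatever arrives next.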

254c23a0d87a7f1449ac2df1c9bcfa2e.png

At the same time, we are also considering making fuller use of the public cloud infrastructure by adding an RTMP server in the public cloud as further insurance. This opens up several new transmission paths and reduces the risk of failure. Under normal conditions the RTMP server in the public cloud is not activated, so it incurs no additional cost.

705da9f792505d2c7f3cfc2c274dadc3.png

To make it easy to switch between transmission links, we finally added a link scheduling module. It can switch among the transmission instances on each segment: from the celebrity side to the public cloud, from iQiyi to the public cloud, and within iQiyi.
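A link scheduler of this kind can be sketched as a priority list of instances per segment, with failover to the next healthy one. The segment and instance names below are hypothetical, and the real module's health checks and switching policy are not described in the talk.

```python
from typing import Dict, List, Set


class LinkScheduler:
    """Pick the first healthy instance for each transmission segment."""

    def __init__(self, routes: Dict[str, List[str]]) -> None:
        self._routes = routes          # segment -> instances in priority order
        self._down: Set[str] = set()   # instances currently marked unhealthy

    def mark_down(self, instance: str) -> None:
        self._down.add(instance)

    def mark_up(self, instance: str) -> None:
        self._down.discard(instance)

    def active(self, segment: str) -> str:
        for instance in self._routes[segment]:
            if instance not in self._down:
                return instance
        raise RuntimeError(f"no healthy instance for segment {segment!r}")


# Hypothetical topology: two iQiyi RTMP instances, with a standby RTMP server
# in the public cloud as the last resort.
scheduler = LinkScheduler({
    "iqiyi_to_cloud": ["iqiyi-rtmp-1", "iqiyi-rtmp-2", "cloud-rtmp-standby"],
    "within_iqiyi": ["vod-processor-1", "vod-processor-2"],
})

print(scheduler.active("iqiyi_to_cloud"))
scheduler.mark_down("iqiyi-rtmp-1")
print(scheduler.active("iqiyi_to_cloud"))
```

Keeping the standby last in the list reflects the idea that the public cloud RTMP server is used only when the primary paths fail.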

-06-

Considerations for front-end technical architecture

35922b3f75b57455b30aee50a323cb82.png

For the front end, we currently provide celebrities with the iQiyi Broadcaster app, which integrates the SDK of the RTC cloud service. The SDK captures audio and video through the phone's camera and microphone and handles real-time communication, while the app is responsible for rendering the picture.

432db065646b52a985a5cbaf5f4e9042.png

To meet user needs such as beauty filters, we worked closely with the RTC SDK provider so that the app can take part in the audio and video communication process. Images captured by the RTC SDK are first beautified by the app and only then transmitted. What the celebrity finally sees on the phone screen is the beautified image of themselves and of the other party.
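The capture, process, and send flow can be sketched with a frame-processing hook. `FakeRtcSdk` and `set_frame_processor` are invented stand-ins, not the API of any real RTC SDK, and the "beauty" step is reduced to a simple brightness adjustment on a grayscale frame.

```python
from typing import Callable, List, Optional

Frame = List[List[int]]  # grayscale pixels, 0..255


class FakeRtcSdk:
    """Invented stand-in for an RTC SDK that lets the app process captured frames."""

    def __init__(self) -> None:
        self._processor: Optional[Callable[[Frame], Frame]] = None
        self.sent: List[Frame] = []      # frames handed to the network layer
        self.preview: List[Frame] = []   # frames shown on the local screen

    def set_frame_processor(self, processor: Callable[[Frame], Frame]) -> None:
        self._processor = processor

    def on_captured(self, frame: Frame) -> Frame:
        # The app's processing runs before both the local preview and the send.
        out = self._processor(frame) if self._processor else frame
        self.preview.append(out)
        self.sent.append(out)
        return out


def brighten(frame: Frame, amount: int = 20) -> Frame:
    """Toy 'beauty' filter: raise brightness, clamped to 255."""
    return [[min(255, pixel + amount) for pixel in row] for row in frame]


sdk = FakeRtcSdk()
sdk.set_frame_processor(brighten)
print(sdk.on_captured([[100, 250]]))
```

Because the processed frame is what gets sent, the remote party sees the same beautified picture as the local preview.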

b905e4d21949349fa46e0f0f697b5d95.png

Overall, our live broadcast business makes full use of the technical strengths of both iQiyi and the public cloud provider, while reducing overall cost by reusing existing infrastructure wherever possible. The final step was to identify the risks and refine the corresponding contingency plans.

That's all for my talk today. Thank you all!

