Editor's note: iQiyi's celebrity watch-together live streaming service, launched in recent years, creates a new way for celebrities to interact with audiences in real time around films, TV dramas, and variety shows, and it has gradually attracted user attention. On the technical side, iQiyi cooperates deeply with third-party audio and video service providers to minimize costs while maximizing the effect. At LiveVideoStackCon 2023 Shanghai, Shi Xingdong from iQiyi shared the overall technical architecture of iQiyi's watch-together live streaming business, along with the optimizations made around drama copyright management, reuse of existing infrastructure, and high-availability guarantees.
Text: Shi Xingdong
Compiled by: LiveVideoStack
Hello everyone. The topic of my talk is the practical application of real-time audio and video technology in celebrity watch-together live streaming.
First, a brief self-introduction: I joined iQiyi in 2013 and have long been engaged in architecture design and R&D for instant messaging, WebRTC, live video, and related technologies.
This talk is divided into the following parts: first, the business starting point of watch-together live streaming; then its general technical architecture; then the adjustments we made to that architecture from practical perspectives such as copyright, cost, and high availability; and finally some of our thoughts on the front-end technical architecture.
-01-
The business starting point of the celebrity watch-together model
As shown in the picture above, iQiyi's watch-together live streaming lets celebrities watch movies, TV dramas, and variety shows together with the audience through interactive live broadcasts. For the streaming platform, this model is a means of promoting variety shows, movies, and TV series. For celebrities, it increases exposure and attracts more fans while promoting the shows they appear in. For the audience, it brings them closer to the real celebrities.
Major streaming platforms have developed different forms of celebrity watch-together. For example, some run studio talk shows in which celebrities watch film and TV clips together with a studio audience. Our watch-together model focuses on being lightweight and automated, aiming to free up operational manpower: a celebrity only needs the app provided by iQiyi to start a live broadcast, with no venue restrictions. At present we organize 20 to 30 live streaming activities every month, approaching one per day.
-02-
The general technical architecture of the live broadcast
Next, we introduce the general technical architecture of watch-together live streaming. The picture above shows the architecture for a simple scenario: the celebrity chats with the audience, without watch-together on-demand playback.
Since the celebrity side requires real-time communication, WebRTC is effectively the only viable transmission protocol there. The downstream audience, however, is very large (tens to hundreds of thousands of viewers): WebRTC would give low delay but at high cost, while the cheaper HLS introduces too much delay for celebrities to receive audience feedback in real time. After weighing these factors, we chose HTTP-FLV as the audience-side transmission protocol.
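HTTP-FLV delivers an ordinary FLV byte stream over an HTTP response. As a minimal illustration of what an audience-side player sees first, the sketch below parses the fixed 9-byte FLV file header (layout per the FLV specification); the sample bytes are synthetic, not taken from a real iQiyi stream.

```python
import struct

def parse_flv_header(data: bytes) -> dict:
    """Parse the 9-byte FLV file header that begins an HTTP-FLV stream."""
    if len(data) < 9 or data[0:3] != b"FLV":
        raise ValueError("not an FLV stream")
    version = data[3]
    flags = data[4]
    header_size = struct.unpack(">I", data[5:9])[0]
    return {
        "version": version,
        "has_audio": bool(flags & 0x04),  # TypeFlagsAudio bit
        "has_video": bool(flags & 0x01),  # TypeFlagsVideo bit
        "header_size": header_size,       # always 9 for FLV version 1
    }

# A minimal synthetic header: "FLV", version 1, audio+video flags, size 9.
sample = b"FLV" + bytes([1, 0x05]) + struct.pack(">I", 9)
print(parse_flv_header(sample))
```

After this header, the stream continues as a sequence of tagged audio, video, and script packets, which is what makes FLV easy to relay over plain HTTP.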
As mentioned above, considering cost factors, we cannot use the RTC protocol on the entire link, which results in a large delay gap between the celebrity side and the audience side.
As for the choice of media server: distributing multiple separate streams requires tight synchronization between them, and that sensitivity to delay jitter rules out the multi-stream mode. We therefore use the single-stream mode, mixing the multiple streams on the celebrity side into one. This mode is less affected by network conditions, but it requires the media server to have stream-mixing capability, which makes the architecture more complex.
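To make the single-stream idea concrete, here is a minimal sketch of the audio side of mixing: several source tracks are summed with per-track gains and clipped to the 16-bit sample range. This is only an illustration of the principle; the actual mixing is done by the media server, and the sample values here are made up.

```python
def mix_pcm(tracks, gains=None):
    """Mix several mono 16-bit PCM tracks (lists of samples) into one."""
    if gains is None:
        gains = [1.0] * len(tracks)
    length = max(len(t) for t in tracks)
    mixed = []
    for i in range(length):
        # Tracks shorter than the mix length contribute silence.
        s = sum(g * (t[i] if i < len(t) else 0) for t, g in zip(tracks, gains))
        mixed.append(max(-32768, min(32767, int(s))))  # clip to 16-bit range
    return mixed

celebrity = [1000, -2000, 3000]
drama = [500, 500, 500, 500]
# Attenuate the drama track so the celebrity chat stays clear.
print(mix_pcm([celebrity, drama], gains=[1.0, 0.5]))
```

Video mixing follows the same shape, compositing frames onto one canvas instead of summing samples; either way, the audience receives a single already-synchronized stream.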
For the "celebrity chat + on-demand watching" scenario, we extended the above architecture with a VOD video file processing module and a control module. The former converts local VOD video files into a real-time video stream, which the mixing module combines with the celebrity chat stream before pushing the result to the audience side.
With the help of the control module, the screen layout and the volume ratio between the drama and the celebrity broadcast can be tuned to the situation at hand, making the layout more reasonable and the chat audio clearer, which improves the viewing experience.
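The control module essentially emits a mixing specification: where each stream sits on the canvas and how loud it is. The sketch below shows one hypothetical shape for such a spec; the field names, canvas size, and default values are illustrative, not iQiyi's actual parameters.

```python
from dataclasses import dataclass

@dataclass
class MixLayout:
    """Hypothetical mixing spec a control module might emit."""
    canvas_w: int = 1280
    canvas_h: int = 720
    drama_rect: tuple = (0, 0, 1280, 720)         # drama fills the canvas
    celebrity_rect: tuple = (960, 480, 320, 240)  # celebrity as picture-in-picture
    drama_volume: float = 0.4                     # duck the drama audio
    celebrity_volume: float = 1.0                 # keep chat voices clear

    def to_params(self) -> dict:
        """Flatten the layout into parameters a mixing server could consume."""
        return {
            "layout": [
                {"stream": "drama", "rect": self.drama_rect},
                {"stream": "celebrity", "rect": self.celebrity_rect},
            ],
            "volumes": {"drama": self.drama_volume,
                        "celebrity": self.celebrity_volume},
        }

print(MixLayout().to_params())
```

Because the spec is plain data, operators (or automation) can push a new layout mid-broadcast, for example enlarging the celebrity window during a chat-heavy segment.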
Although iQiyi has real-time communication capabilities of its own, live streaming activities currently take place only during part of the evening, and the network environments celebrities broadcast from are often poor. To minimize construction and maintenance costs, we ultimately chose a public cloud platform to provide the media communication services.
However, video files are iQiyi's core assets: they are stored on the intranet and cannot be accessed directly from the public cloud, so a transmission solution between the two is needed.
-03-
Adjustments to the architecture from a copyright perspective
Existing transmission methods fall into offline transfer (such as FTP) and real-time transmission (such as RTMP). Offline transfer requires little up-front development, but it consumes significant manpower around each live broadcast and increases the risk of media asset files leaking. Real-time transmission can be automated and needs little manpower; its drawback is a larger up-front development workload. To minimize the risk of leakage, we chose the real-time transmission solution.
Since RTMP is a common real-time media transmission protocol, we decided to push the processed real-time video stream to the public cloud through an RTMP server, avoiding handing over media asset files ahead of the broadcast. The green parts in the picture represent work completed by iQiyi itself.
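iQiyi's processing module is internal, but the same transfer pattern can be expressed with a standard ffmpeg push: read a local file at real-time pace and forward it as an RTMP stream. The sketch below only builds the command line (the file path and relay URL are hypothetical); a real deployment would run it under supervision and handle reconnects.

```python
def build_push_cmd(source_path: str, rtmp_url: str) -> list:
    """Build an ffmpeg command that reads a local VOD file at real-time
    pace and pushes it as an RTMP stream (a stand-in for the internal
    processing module)."""
    return [
        "ffmpeg",
        "-re",               # read input at its native frame rate (real time)
        "-i", source_path,
        "-c", "copy",        # forward the existing encoding without transcoding
        "-f", "flv",         # RTMP carries an FLV-muxed payload
        rtmp_url,
    ]

cmd = build_push_cmd("/assets/episode01.mp4",
                     "rtmp://relay.example.com/live/ep01")
print(" ".join(cmd))
```

The key point is that only the transient stream leaves the intranet; the source file itself never does.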
Although the extra RTMP hop increases playback delay, a delay of a few seconds has little impact on the experience. The celebrity's chat reactions are produced in response to the video content, so the merged stream generally stays in sync.
-04-
Adjustment of architecture from a cost perspective
Public clouds that provide real-time video communication services generally charge by actual traffic. From a cost perspective, since the current live broadcasts do not occupy much bandwidth, iQiyi's existing CDN can handle the media distribution; all that is needed is an RTMP server between the public cloud and the iQiyi CDN.
In the end, only the real-time stream on the celebrity side actually generates traffic on the public cloud. That stream does not run for long and is of modest resolution, which greatly reduces network costs.
-05-
Adjusting the architecture from the perspective of high availability
To further improve the stability of the architecture, we first analyzed the existing risk points. The public cloud services and iQiyi's own CDN have been operating for many years and are relatively mature. The newly added links, namely the processing module and the RTMP server responsible for processing and distributing the real-time video stream, are where instability needs the most attention.
To avoid single points of failure on the new links, we adopted a multi-instance solution. RTMP video forwarding happens in real time and keeps no progress state, so forwarding can fail over between instances seamlessly, without synchronization. The video file processing module, however, has playback progress state, so when an interruption occurs its progress must be synchronized to the standby instance.
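One common way to implement that progress synchronization is periodic checkpointing: the active instance reports its playback position to a shared store, and a standby instance resumes from slightly before the last checkpoint. The sketch below assumes a shared store such as Redis (a plain dict stands in here); the interval and rewind margin are illustrative.

```python
class ProgressCheckpoint:
    """Periodically record playback progress so a standby instance can
    resume after a failover. A shared store (e.g. Redis) is assumed; a
    dict stands in for it here."""

    def __init__(self, store, stream_id, interval=5.0):
        self.store = store
        self.stream_id = stream_id
        self.interval = interval      # seconds between persisted checkpoints
        self._last = None             # wall-clock time of the last checkpoint

    def report(self, position_sec: float, now: float):
        """Persist the position, but at most once per interval."""
        if self._last is None or now - self._last >= self.interval:
            self.store[self.stream_id] = position_sec
            self._last = now

    def resume_position(self) -> float:
        """Restart slightly before the checkpoint to avoid a content gap."""
        return max(0.0, self.store.get(self.stream_id, 0.0) - 2.0)

store = {}
cp = ProgressCheckpoint(store, "ep01")
cp.report(3.0, now=0.0)    # first report is persisted immediately
cp.report(6.0, now=3.0)    # within the interval: skipped
cp.report(12.0, now=6.0)   # interval elapsed: persisted
print(store["ep01"], cp.resume_position())
```

The rewind on resume trades a couple of seconds of repeated content for the guarantee that no content is skipped, which is usually the right trade for watch-together playback.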
At the same time, to make full use of the public cloud infrastructure, we added an RTMP server in the public cloud as further insurance. This opens up several new transmission paths and minimizes the risk of failure. That public cloud RTMP server stays on standby, so it incurs no extra cost until activated.
To make switching between transmission links easy, we finally added a link scheduling module, which can switch among the transmission instances on the celebrity-to-public-cloud, iQiyi-to-public-cloud, and intra-iQiyi links.
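At its core, such a scheduler walks an ordered list of candidate links and picks the first one that passes a health check, falling back to the standby paths (including the public cloud RTMP server) when the preferred path fails. The sketch below is a minimal version; the link names and the health-check callback are illustrative.

```python
class LinkScheduler:
    """Pick the first healthy transmission link from a preference-ordered
    candidate list. Link names and the health check are illustrative."""

    def __init__(self, links, health_check):
        self.links = links              # ordered by preference
        self.health_check = health_check

    def select(self) -> str:
        for link in self.links:
            if self.health_check(link):
                return link
        raise RuntimeError("no healthy transmission link available")

links = ["iqiyi-internal-rtmp", "iqiyi-to-cloud-rtmp", "cloud-standby-rtmp"]
down = {"iqiyi-internal-rtmp"}  # simulate a failure on the preferred link
sched = LinkScheduler(links, health_check=lambda l: l not in down)
print(sched.select())  # falls over to the next link in preference order
```

A production scheduler would also debounce flapping links and notify the mixing side when a switch happens, but the selection logic stays this simple.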
-06-
Considerations for front-end technical architecture
On the front end, we currently provide celebrities with the iQiyi broadcaster app, which integrates the SDK of the RTC cloud service. It captures audio and video through the phone's camera and microphone, handles the real-time communication, and renders the picture.
To meet user needs (such as beauty filters), we cooperated in depth with the RTC SDK provider so that the app can participate in the audio and video processing pipeline: frames captured by the RTC SDK are first beautified by the app and then transmitted. What the celebrity finally sees on the phone screen is the beautified image of both themselves and the other party.
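The integration pattern is a pre-processing hook: the app registers a callback that the SDK invokes on each captured frame before encoding and sending. The sketch below shows the shape of that hook; the class, callback names, and frame representation are illustrative, not any specific vendor's SDK API.

```python
class CapturePipeline:
    """Sketch of an RTC capture path with an app-side pre-processing hook
    inserted between capture and encode/send (hook API is illustrative)."""

    def __init__(self, pre_process=None):
        self.pre_process = pre_process
        self.sent = []                 # stand-in for the encode + send stage

    def on_captured_frame(self, frame):
        if self.pre_process:
            frame = self.pre_process(frame)  # beautify before transmission
        self.sent.append(frame)

def beautify(frame):
    # Placeholder: a real filter would smooth skin, adjust tone, etc.
    return {**frame, "beautified": True}

pipe = CapturePipeline(pre_process=beautify)
pipe.on_captured_frame({"id": 1, "pixels": "..."})
print(pipe.sent[0]["beautified"])
```

Because the filter runs before encoding, both the remote party and the local preview render the already-beautified frame, which is why the celebrity sees the processed image on their own screen too.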
Overall, our watch-together live streaming business fully leverages the technical strengths of both iQiyi and the public cloud provider, while reducing overall costs by reusing existing infrastructure wherever possible; on top of that, we identify risks and prepare the corresponding contingency plans.
That’s all my sharing today, thank you all!