How do you build a video website like Youku or iQiyi, and what are the technical difficulties?

Too many people have asked me this question:

How do you build a video website like Youku? Is it very different from an ordinary text-and-image website, and what pitfalls should you watch out for?

Below I will give a general answer to this question as a Youku platform development engineer.

First, some background for developers outside the video field: video websites are fundamentally different from text-and-image websites. Don't confuse the two.

When Youku was founded (2006), its boss Koo Yongqiang set the tone for Youku's technology platform: "the fastest is king". Later events proved that this principle helped Youku stand out among many platforms and win over end users.

As the former head of R&D for the Youku platform, I followed this idea strictly. If you really want to build a video platform of this size, the technical difficulty is considerable (though to those who have already hit the pitfalls, it seems like nothing).

First, to truly achieve the goal of "the fastest is king", you need sufficient capital reserves to support your development.

After Youku's first anniversary (2007), Koo Yongqiang gave Tencent an exclusive interview in which he explained Youku's "the fastest is king" strategy to the public.

You can look up the interview yourself, so I won't repeat it here.

https://www.cyzone.cn/article/920.html  

Second, you need to have a strong technical development team.

Koo Yongqiang came from Sohu. Because he secured an investment of 13 million at the initial stage, he was able to attract many big names in the video industry to work for him in 2006. These people were all elites who had worked in the streaming media industry for more than 6 years. As someone who joined the core team 3 years later, I also gained a great deal of hands-on experience at Youku.

To put "the fastest is king" into practice, you need to achieve fast release, fast search, and fast playback.

At the core technical level, the following specialized technologies are required:

1. Fast release technology

This requires fast upload and fast transcoding of video programs.

To upload videos quickly, you need to support highly concurrent uploads and resumable uploads of video content. For this we implemented our own upload server in C. Currently a single server supports 1,000 users uploading concurrently, with resumable upload based on H5 (HTML5).
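The resumable upload idea can be sketched as follows. This is a minimal in-memory illustration in Python, not Youku's C upload server: the class `ResumableUploadStore`, the function `upload`, and the `fail_after` parameter are hypothetical names introduced here. The only assumption is that the server remembers how many bytes it holds per upload and the client asks for that offset before sending more.

```python
class ResumableUploadStore:
    """In-memory stand-in for an upload server (hypothetical, for illustration)."""

    def __init__(self):
        self._buffers = {}  # upload_id -> bytearray of bytes received so far

    def offset(self, upload_id):
        """Return how many bytes the server already holds for this upload."""
        return len(self._buffers.get(upload_id, b""))

    def append(self, upload_id, start, chunk):
        """Accept a chunk only if it starts exactly where the stored data ends."""
        buf = self._buffers.setdefault(upload_id, bytearray())
        if start != len(buf):
            raise ValueError(f"expected offset {len(buf)}, got {start}")
        buf.extend(chunk)
        return len(buf)

    def content(self, upload_id):
        """Return everything received so far for this upload."""
        return bytes(self._buffers[upload_id])


def upload(store, upload_id, data, chunk_size=4, fail_after=None):
    """Send data in chunks, resuming from the offset the server reports.

    fail_after simulates a dropped connection after that many chunks.
    """
    sent = 0
    pos = store.offset(upload_id)  # resume point: skip what the server has
    while pos < len(data):
        if fail_after is not None and sent >= fail_after:
            raise ConnectionError("simulated network drop")
        chunk = data[pos:pos + chunk_size]
        pos = store.append(upload_id, pos, chunk)
        sent += 1
    return pos


if __name__ == "__main__":
    store = ResumableUploadStore()
    data = b"0123456789abcdef"
    try:
        upload(store, "video-1", data, fail_after=2)  # connection drops midway
    except ConnectionError:
        pass
    upload(store, "video-1", data)  # second attempt continues from the offset
    print(store.content("video-1") == data)
```

The second call to `upload` does not resend the bytes that arrived before the drop, which is the property that matters for large video files on unreliable connections.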

Fast transcoding is more technically demanding. Try transcoding a 1080P Blu-ray HD video on a current Intel CPU (2013 or later) and see what multiple of real-time speed it reaches.

A few years ago, the Youku platform had more than 3,000 servers, and nearly half of them were used for video transcoding. After technical improvement and optimization, this area has made a great leap: transcoding speed has gone from 2x real time to the current 30x (for specifics, contact the author at 1918098288). This greatly shortens the time to release a program and effectively reduces operating costs.
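To see why the transcoding speed multiple matters for cost, here is a small back-of-the-envelope calculation in Python. The function names and the figure of 72,000 hours of uploaded video per day are hypothetical, chosen only for illustration; the 2x and 30x multiples are the ones mentioned above.

```python
import math


def speed_factor(media_seconds, wall_seconds):
    """Transcoding speed as a multiple of real time (video length / time taken)."""
    return media_seconds / wall_seconds


def servers_needed(upload_hours_per_day, factor, jobs_per_server=1):
    """Servers required to keep up with a daily volume of uploaded video."""
    # Hours of video one server can transcode in a 24-hour day.
    capacity = 24 * factor * jobs_per_server
    return math.ceil(upload_hours_per_day / capacity)


if __name__ == "__main__":
    daily_hours = 72_000  # hypothetical ingest volume, not a Youku figure
    print(speed_factor(3600, 120))          # a 1-hour video done in 2 minutes
    print(servers_needed(daily_hours, 2))   # at 2x real time
    print(servers_needed(daily_hours, 30))  # at 30x real time
```

For a fixed daily volume, the number of servers scales inversely with the speed multiple, so a 15-fold speedup cuts the transcoding fleet by roughly the same factor.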

2. Fast retrieval technology

On December 21, 2007, Youku's daily video views (VV) exceeded 100 million.

What does this mean? Your video content is looked up at least 100 million times every day.

Faced with rapidly growing user traffic, Youku's content retrieval technology has been upgraded several times: from the initial separation of database reads and writes, to vertical database partitioning, to horizontal table sharding, and then to in-memory databases, full-text search, distributed databases, Hadoop, caching, and other technologies. Used together, they bring the response time for retrieving massive amounts of content down to milliseconds.
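Two of the techniques listed above, horizontal sharding and caching, can be sketched together in a few lines of Python. This is an illustrative toy, not Youku's implementation: plain dictionaries stand in for database shards and for the cache, and the class `ShardedVideoIndex` is a hypothetical name.

```python
import zlib


class ShardedVideoIndex:
    """Toy horizontal sharding with a cache-aside read path (illustrative only)."""

    def __init__(self, shard_count):
        self.shards = [dict() for _ in range(shard_count)]  # stand-ins for DB shards
        self.cache = {}        # stand-in for an in-memory cache
        self.shard_reads = 0   # counts how often a read had to hit a shard

    def _shard_for(self, video_id):
        # crc32 is stable across processes, unlike Python's built-in hash() for str.
        return self.shards[zlib.crc32(video_id.encode("utf-8")) % len(self.shards)]

    def put(self, video_id, metadata):
        """Write to the owning shard and invalidate any cached copy."""
        self._shard_for(video_id)[video_id] = metadata
        self.cache.pop(video_id, None)

    def get(self, video_id):
        """Serve from cache when possible; otherwise read the shard and cache it."""
        if video_id in self.cache:
            return self.cache[video_id]
        self.shard_reads += 1
        metadata = self._shard_for(video_id).get(video_id)
        if metadata is not None:
            self.cache[video_id] = metadata
        return metadata


if __name__ == "__main__":
    index = ShardedVideoIndex(shard_count=4)
    index.put("XMTIzNDU2", {"title": "demo", "duration": 95})
    index.get("XMTIzNDU2")
    index.get("XMTIzNDU2")
    print(index.shard_reads)  # only the first read reached a shard
```

Sharding spreads the data so that no single database holds everything, and the cache keeps hot items out of the database path entirely.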

3. Fast playback technology

With hundreds of millions of netizens across the country and a huge content library, achieving fast playback is a severe test of the platform architecture and core technical strength.

As we all know, video is the largest type of data transmitted on the Internet. Text and pictures amount to a few hundred KB at most, whereas video traffic runs to several MB per second, so each second of video carries more than ten times the data of a picture.

Moreover, such a large volume of data must be transferred continuously and stably, without interruption, which places extremely high demands on server performance and on the streaming media software.

In terms of functionality, the streaming media server needs to support multi-terminal, multi-protocol publishing (such as HLS, HTTP-TS, HTTP-FLV, DASH, RTMP), multi-bitrate adaptation (to give each user the best viewing experience), multi-server load balancing, and high concurrency.

Accordingly, our streaming media technology has gone through many rounds of iteration, from an initial 200 concurrent connections per server to 5,000 today.
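As an illustration of multi-bitrate adaptation with HLS, the Python sketch below builds an HLS master playlist that lists several renditions of the same video, and picks a rendition for a measured bandwidth. The rendition values, URIs, function names, and the 0.8 safety margin are assumptions made for this example; the `#EXTM3U` and `#EXT-X-STREAM-INF` tags are standard HLS.

```python
def master_playlist(renditions):
    """Build an HLS master playlist listing one variant stream per rendition."""
    lines = ["#EXTM3U"]
    for r in sorted(renditions, key=lambda r: r["bandwidth"]):
        lines.append(
            f'#EXT-X-STREAM-INF:BANDWIDTH={r["bandwidth"]},'
            f'RESOLUTION={r["width"]}x{r["height"]}'
        )
        lines.append(r["uri"])
    return "\n".join(lines) + "\n"


def pick_rendition(renditions, measured_bps, safety=0.8):
    """Pick the highest bitrate that fits within a safety margin of the bandwidth."""
    budget = measured_bps * safety
    affordable = [r for r in renditions if r["bandwidth"] <= budget]
    if affordable:
        return max(affordable, key=lambda r: r["bandwidth"])
    # Nothing fits: fall back to the lowest bitrate rather than refusing to play.
    return min(renditions, key=lambda r: r["bandwidth"])


# Hypothetical renditions for illustration.
RENDITIONS = [
    {"bandwidth": 800_000, "width": 640, "height": 360, "uri": "360p/index.m3u8"},
    {"bandwidth": 2_500_000, "width": 1280, "height": 720, "uri": "720p/index.m3u8"},
    {"bandwidth": 5_000_000, "width": 1920, "height": 1080, "uri": "1080p/index.m3u8"},
]

if __name__ == "__main__":
    print(master_playlist(RENDITIONS))
    print(pick_rendition(RENDITIONS, measured_bps=4_000_000)["uri"])
```

In real HLS playback the player itself chooses among the variants in the master playlist; `pick_rendition` only mimics that decision to show the principle.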

In addition, a faster streaming media server only raises the processing power of a single machine. To serve hundreds of millions of users nationwide, thousands of servers must be deployed across the major nodes of the domestic Internet, including IDC data centers in large central cities, provincial capitals, and second-tier eastern cities (for details, contact the author at 1918098288). Once deployed, these streaming media servers need to be joined into an organic whole, using content distribution technology to form a very large CDN (content distribution network) that distributes video quickly and lets users play on demand from a nearby node. On the one hand, this greatly improves the viewing experience (playback responds quickly); on the other hand, it greatly reduces the platform's capital outlay (bandwidth in second-tier cities costs much less than in first-tier cities).

For server cluster load balancing and CDN construction, the Youku platform kept investment costs under control by using self-developed software rather than a hardware solution such as F5. Everyone has seen the overall result: it works very well and has surpassed some professional CDN service providers (such as Lanxun and Wangsu) in video distribution.
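The idea of serving each user from a nearby node can be sketched as below. This is a simplification: it uses great-circle distance between coordinates, whereas a production CDN typically routes by network topology and operator, and the node list, field names, and function names are hypothetical. The per-node capacity of 5,000 reuses the single-server concurrency figure mentioned earlier.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points on Earth, in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def pick_edge_node(nodes, user_lat, user_lon):
    """Return the closest node that still has spare capacity, or None."""
    available = [n for n in nodes if n["active"] < n["capacity"]]
    if not available:
        return None
    return min(
        available,
        key=lambda n: haversine_km(user_lat, user_lon, n["lat"], n["lon"]),
    )


# Hypothetical edge nodes for illustration.
NODES = [
    {"name": "beijing", "lat": 39.90, "lon": 116.40, "active": 1200, "capacity": 5000},
    {"name": "shanghai", "lat": 31.23, "lon": 121.47, "active": 4800, "capacity": 5000},
    {"name": "chengdu", "lat": 30.57, "lon": 104.07, "active": 300, "capacity": 5000},
]

if __name__ == "__main__":
    # A user in Hangzhou (approx. 30.27 N, 120.15 E).
    print(pick_edge_node(NODES, 30.27, 120.15)["name"])
```

When the nearest node is full, the same function falls through to the next closest one, which is the basic behaviour a scheduling layer needs before any finer optimization.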
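The article does not say which algorithm Youku's self-developed load balancer uses. As one common software technique for spreading requests over a server pool, here is a minimal consistent hashing sketch in Python; the class name, server names, and the choice of 100 virtual nodes per server are assumptions for illustration.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Map keys to servers so that removing a server only moves that server's keys."""

    def __init__(self, servers, replicas=100):
        ring = []
        for server in servers:
            # Each server gets several virtual points to even out the distribution.
            for i in range(replicas):
                ring.append((self._hash(f"{server}#{i}"), server))
        ring.sort()
        self._points = [h for h, _ in ring]
        self._owners = [s for _, s in ring]

    @staticmethod
    def _hash(key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def server_for(self, key):
        """Return the server owning the first ring point after the key's hash."""
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._owners[idx]


if __name__ == "__main__":
    ring = ConsistentHashRing(["stream-a", "stream-b", "stream-c"])
    print(ring.server_for("video-42"))
```

The appeal for video distribution is cache locality: the same video keeps landing on the same server, and taking one server out of the pool does not reshuffle everything else.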

In addition to the above parts, there are many other technical details that need to be considered, such as:

A highly concurrent real-time messaging system. This system provides real-time subtitles or real-time text chat while users watch videos. Concurrency here is often in the millions.

Load balancing for the web servers. The web servers mainly serve program metadata, mostly pictures and text. How to balance load across multiple servers under highly concurrent access, and how to deploy them in a distributed way, directly affects the final user experience.

Security protection of the platform, which mainly covers the security of the website's CMS and the ability of public-facing servers to withstand attacks (such as DDoS attacks).

The read and write performance of the program storage devices, which is also a key factor in a platform's overall service performance. Because video content is so large, both the storage media (SATA hard disks, SAS hard disks, solid-state drives) and the storage architecture strongly affect the response speed and throughput of content reads, so a carrier-grade platform needs careful design in both respects. Youku has been continually updating and iterating its technology here.

Automatic extraction of program metadata, which is also crucial for a platform operated at scale, because it determines your later operation and maintenance costs. A well-designed operating platform follows one principle: work that machines can do automatically should never be done by people, because manual work costs far more than machine work and is far less efficient. On the Youku platform, the basic metadata of every program (program name, duration, posters, playback-track previews) is extracted automatically by software, and the back end also offers one-click manual poster capture, which greatly reduces labor costs.

The above are only some key points at the macro level. There are many pitfalls in every technical link, so a strong technical development team is extremely important for running the underlying operations support platform efficiently.

That's all for today. If you still have questions, I can add more later.
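Automatic metadata extraction can be sketched with `ffprobe` from the FFmpeg project. The article does not say which tool Youku uses, so treat this as an assumption; the function names and the returned field names are also hypothetical. `run_ffprobe` asks `ffprobe` for JSON output, and `parse_probe` pulls the program name, duration, and resolution out of that JSON.

```python
import json
import os
import subprocess


def run_ffprobe(path):
    """Run ffprobe on a media file and return its JSON report as text.

    Assumes the ffprobe binary from FFmpeg is installed and on PATH.
    """
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_probe(json_text):
    """Extract basic program metadata from ffprobe's JSON report."""
    info = json.loads(json_text)
    fmt = info.get("format", {})
    # Use the first video stream, if any, for the picture size.
    video = next(
        (s for s in info.get("streams", []) if s.get("codec_type") == "video"),
        {},
    )
    filename = fmt.get("filename", "")
    # Prefer an embedded title tag; otherwise fall back to the file name.
    title = fmt.get("tags", {}).get("title") or os.path.splitext(
        os.path.basename(filename)
    )[0]
    return {
        "title": title,
        "duration_seconds": float(fmt.get("duration", 0)),
        "width": video.get("width"),
        "height": video.get("height"),
    }


if __name__ == "__main__":
    sample = json.dumps({
        "streams": [{"codec_type": "video", "width": 1920, "height": 1080}],
        "format": {"filename": "/data/in/episode01.mp4", "duration": "5400.04"},
    })
    print(parse_probe(sample))
```

In a pipeline, `parse_probe(run_ffprobe(path))` would run once per uploaded file, so no one has to type in durations or resolutions by hand.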


Origin blog.csdn.net/zhiboshequ/article/details/104485987