Virtual HLS fragmentation for large MP4 files: avoiding masses of fragment files on the server


This article introduces virtual fragmentation, a technique that maps an MP4 file onto the small TS fragment files used by the HLS protocol, so that the MP4 file can be played over HLS without actually being split. It avoids the long buffering that occurs while the MP4 header data loads during on-demand playback, which is especially noticeable for large MP4 files. It also removes the drawback of physically splitting MP4 files, which leaves a large number of fragment files on the server. Moreover, the technique requires almost no changes to the streaming media server (an HTTP service).

Background

  • HLS (HTTP Live Streaming) is a widely used live and on-demand streaming technology, first implemented by Apple. Like MPEG-DASH, it is transported over HTTP and can be played through the HTML5 video tag. It is widely supported by browsers on mobile platforms, and on PC it can be implemented with JavaScript and MSE (Media Source Extensions). HLS fragments are small and load quickly, they use the TS container, whose format is simple, and they are delivered over HTTP, so firewalls are rarely a concern. This is why HLS spread so quickly.
  • MP4 is familiar to everyone: a media container made up of boxes, so I won't go into detail. One point worth making is that MP4 comes in two basic forms, ordinary MP4 and fragmented MP4 (FMP4). The latter contains many moof boxes, which divide the file into units that can be decoded individually, so it is better suited to streaming. The first application I saw was Microsoft's "Silverlight + Smooth Streaming" technology from years ago (has Microsoft abandoned it?). FMP4 is also gradually gaining ground, but compared with TS its format is somewhat more complicated. I will write a separate article on FMP4-related technology later.

The ordinary MP4 that everyone sees and uses is a very good storage container for movie and TV files. For on-demand streaming, however, its biggest drawback is that the media information and the key frame index are concentrated in the moov box, and the larger the file, the larger the moov box. Until the player has obtained the moov box, it cannot decode anything. As a result, on-demand playback of an MP4 file has to buffer for a long time while the header data loads. A common solution is to split the file: cut the large MP4 file into smaller MP4 files, so that each piece loads much faster. Many video websites do this, and with this kind of splitting the number of pieces is at least not very large. In the HLS era, however, supporting the HLS protocol means converting large MP4 files into much smaller HLS TS fragment files. This causes a problem: the server ends up with too many TS fragment files, which are hard to manage and also hurt performance. How can this be solved? With virtual HLS fragmentation.

Technical analysis

1. Virtual fragmentation logic

The structure of a typical MP4 file is shown below. The most important part is the moov box, which records critical data such as the decoding information and the timestamps and positions of all the audio and video frames that follow. It is labeled "index data" in the figure. Among the video frames, key frames are the most important nodes: the player refreshes the entire image at a key frame position, so a key frame can be regarded as a starting point for image decoding.

[Figure: structure of a typical MP4 file]
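To make the box structure concrete, here is a minimal Python sketch that walks the top-level boxes of an MP4 file and locates the moov box. It relies only on the standard box header layout (a 4-byte big-endian size followed by a 4-byte type, with size 1 meaning a 64-bit size follows and size 0 meaning the box runs to the end of the file). The function names `iter_top_level_boxes` and `find_moov` are my own for illustration, not part of any library.

```python
import io
import struct


def iter_top_level_boxes(stream):
    """Yield (box_type, offset, size) for each top-level box in an MP4 stream."""
    offset = 0
    while True:
        stream.seek(offset)
        header = stream.read(8)
        if len(header) < 8:
            break  # end of file, or a truncated header
        size, box_type = struct.unpack(">I4s", header)
        if size == 1:
            # A 64-bit "largesize" field follows the 8-byte header.
            large = stream.read(8)
            if len(large) < 8:
                break
            size = struct.unpack(">Q", large)[0]
        elif size == 0:
            # The box extends to the end of the file.
            size = stream.seek(0, io.SEEK_END) - offset
        if size < 8:
            break  # malformed box, stop rather than loop forever
        yield box_type.decode("latin-1"), offset, size
        offset += size


def find_moov(stream):
    """Return (offset, size) of the moov box, or None if there is none."""
    for box_type, offset, size in iter_top_level_boxes(stream):
        if box_type == "moov":
            return offset, size
    return None
```

For a real file this would be called as `find_moov(open("xxx.mp4", "rb"))`. Parsing the sample tables inside the moov box is a much larger job and is not attempted here.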

Virtual HLS slicing, as the name suggests, does not actually slice anything. It only records the correspondence between the data in the real MP4 file and the TS slices the file would be split into. When the player actually requests playback, this correspondence is used to locate the relevant audio and video data, which is then assembled into a TS file in memory. For example, if the player requests 0.2 seconds of data from the MP4 file above, the adapter looks up that 0.2 seconds of data in the recorded correspondence, repackages it in MPEG-TS format, and produces an HLS fragment file. One thing needs attention during segmentation: every fragment must start at a video key frame, otherwise the generated file cannot be decoded properly.

[Figure: virtual HLS slicing of the MP4 file]
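The key-frame rule above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `Sample` record, the `split_at_keyframes` function and the `target_duration` parameter are hypothetical names, the samples are assumed to be sorted by time, and the first sample is assumed to be a video key frame.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    pts: float         # presentation time in seconds
    offset: int        # byte offset of the sample in the MP4 file
    size: int          # sample size in bytes
    is_video: bool
    is_keyframe: bool


def split_at_keyframes(samples, target_duration):
    """Group samples into virtual fragments that each start at a video key frame.

    A new fragment is opened at the first video key frame reached after the
    current fragment has lasted at least target_duration seconds.
    """
    fragments = []
    current = []
    for sample in samples:
        starts_new = (
            bool(current)
            and sample.is_video
            and sample.is_keyframe
            and sample.pts - current[0].pts >= target_duration
        )
        if starts_new:
            fragments.append(current)
            current = []
        current.append(sample)
    if current:
        fragments.append(current)
    return fragments
```

Because a cut can only happen at a key frame, the real fragment durations depend on the key frame interval of the source file and will usually differ somewhat from `target_duration`.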

2. Design plan

Following the fragmentation logic described above, the whole MP4 file can be divided into virtual fragments, using key frames as boundaries, based on the audio and video frame index listed in the moov box. Each fragment corresponds to one TS file, and this correspondence is written to an index file (which I call the xxx.index file here). The schematic diagram of the whole scheme is shown below.

[Figure: schematic diagram of the overall scheme]

In the figure above, Sample1, Sample2, ... denote audio and video frames. No distinction is made between the two here, and this does not affect understanding.

A brief explanation:

xxx.mp4 is the original file to be played, and xxx.m3u8 is the playlist file used by the HLS player, which lists the addresses of all TS fragments. (For a more detailed introduction to m3u8 and HLS, see my other article, "HTTP Live Streaming (iOS live broadcast) technology analysis and implementation".) xxx.index is a description file, or index file, generated from the virtual fragmentation. It records where the data of each TS fragment (called a segment in the index file) is located in the real MP4 file. Together, xxx.mp4, xxx.m3u8 and xxx.index make up all the files involved in this scheme. In practice, the client or the server can use the contents of the m3u8 file and the index file to work out the actual data locations for the TS fragment requested by the HLS player, assemble the data, and deliver HLS on-demand streaming.
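As an illustration of the playlist side, the sketch below builds a VOD m3u8 from a list of fragment durations. The tags are standard HLS playlist tags. The function name `build_m3u8` and the fragment naming pattern `xxx_N.ts` are assumptions of mine, since the article does not specify how the TS addresses are named.

```python
import math


def build_m3u8(durations, name="xxx"):
    """Return the text of a VOD playlist listing one virtual TS fragment per duration."""
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        # The target duration must be at least as long as the longest fragment.
        f"#EXT-X-TARGETDURATION:{math.ceil(max(durations))}",
        "#EXT-X-MEDIA-SEQUENCE:0",
        "#EXT-X-PLAYLIST-TYPE:VOD",
    ]
    for number, duration in enumerate(durations):
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(f"{name}_{number}.ts")  # hypothetical naming scheme
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"
```

None of the `.ts` files named in the playlist exist on disk. Each name is only a key that the adapter later resolves against the index file.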

3. Process

The figure below shows the logical flow from the moment the HLS player requests the m3u8 address to the moment it receives the first TS fragment file. Besides the server side and the player side, there is also what I call an "adapter". Its main job is to calculate the real data locations from the index file and the m3u8 file, send a Range request to the server, package the data the server returns into a TS fragment file, and hand that file back to the HLS player. The adapter is the key to the whole process.

The adapter can sit on the server or on the client. If it sits on the client, the server needs almost no changes to support virtual HLS fragmentation. If it is integrated into the server, the client needs essentially no changes.

[Figure: flow from the m3u8 request to delivery of the first TS fragment]
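The adapter's first step, turning a requested TS name into byte ranges, might look like the following sketch. It assumes the hypothetical `xxx_N.ts` naming pattern and assumes the index gives each segment as a list of (offset, size) chunks. Neither detail comes from the original design. The `bytes=first-last` syntax is the standard HTTP Range header form, with both ends inclusive.

```python
import re


def segment_number(ts_path):
    """Extract N from a requested path of the assumed form '.../xxx_N.ts'."""
    match = re.search(r"_(\d+)\.ts$", ts_path)
    if match is None:
        raise ValueError(f"not a virtual TS fragment path: {ts_path!r}")
    return int(match.group(1))


def coalesce_ranges(chunks):
    """Merge (offset, size) chunks into sorted, inclusive (first, last) byte ranges."""
    ranges = []
    for offset, size in sorted(chunks):
        first, last = offset, offset + size - 1
        if ranges and first <= ranges[-1][1] + 1:
            # Adjacent or overlapping: extend the previous range.
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], last))
        else:
            ranges.append((first, last))
    return ranges


def range_headers(chunks):
    """Return one HTTP Range header value per merged byte range."""
    return [f"bytes={first}-{last}" for first, last in coalesce_ranges(chunks)]
```

Merging neighbouring chunks keeps the number of Range requests small. In a typical MP4 file the samples of one fragment lie close together, so a single request per fragment is often enough.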

Implementation

  1. First, process the MP4 file and generate the corresponding index file and m3u8 file.

[Figure: processing the MP4 file to generate the index file and the m3u8 file]

The process of slicing and of computing the correspondence between the segments in the index file and the TS fragments is as follows:

[Figure: slicing and computing the correspondence between index segments and TS fragments]
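The article does not give the on-disk layout of xxx.index, so the sketch below simply assumes a JSON layout of my own invention with one entry per segment: its number, start time, duration, and the byte span it covers in the MP4 file. The names `build_index` and `dump_index` are hypothetical as well.

```python
import json


def build_index(fragments, total_duration):
    """Build index entries from fragments given as lists of (pts, offset, size) tuples.

    The layout is an assumption for illustration, not the author's actual format.
    """
    entries = []
    for number, samples in enumerate(fragments):
        start = samples[0][0]
        # A fragment ends where the next one starts; the last ends with the file.
        if number + 1 < len(fragments):
            end_time = fragments[number + 1][0][0]
        else:
            end_time = total_duration
        first_byte = min(offset for _, offset, _ in samples)
        end_byte = max(offset + size for _, offset, size in samples)
        entries.append({
            "segment": number,
            "start_time": start,
            "duration": end_time - start,
            "offset": first_byte,
            "length": end_byte - first_byte,
        })
    return entries


def dump_index(entries):
    """Serialize the index entries as JSON text for the xxx.index file."""
    return json.dumps({"segments": entries}, indent=2)
```

A production index would more likely store per-sample offsets and timestamps, because audio and video samples have to be repackaged individually when the TS fragment is built. A single byte span per segment is enough to show the idea.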

  2. Then write the adapter, which assembles the data for each requested TS fragment.

[Figure: the adapter assembling data for a requested TS fragment]
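The last step of the adapter, wrapping data into 188-byte TS packets, can be sketched as follows. This is a partial illustration only: it assumes the input is an already-built PES packet and leaves out the PAT/PMT tables, PCR and the PES header, all of which a playable fragment needs. The function name `packetize` is my own. The header bit layout follows the MPEG-TS packet format.

```python
TS_PACKET_SIZE = 188
TS_PAYLOAD_SIZE = 184  # 188 bytes minus the 4-byte packet header
SYNC_BYTE = 0x47


def packetize(pes, pid, first_counter=0):
    """Split one PES packet into 188-byte TS packets for the given PID."""
    packets = []
    counter = first_counter
    pos = 0
    while pos < len(pes):
        chunk = pes[pos:pos + TS_PAYLOAD_SIZE]
        # payload_unit_start_indicator is set only on the first packet.
        pusi = 1 if pos == 0 else 0
        header = bytearray(4)
        header[0] = SYNC_BYTE
        header[1] = (pusi << 6) | ((pid >> 8) & 0x1F)
        header[2] = pid & 0xFF
        if len(chunk) == TS_PAYLOAD_SIZE:
            header[3] = 0x10 | (counter & 0x0F)  # payload only
            adaptation = b""
        else:
            header[3] = 0x30 | (counter & 0x0F)  # adaptation field + payload
            fill = TS_PAYLOAD_SIZE - len(chunk)  # bytes the adaptation field occupies
            if fill == 1:
                adaptation = b"\x00"  # just the length byte, with length 0
            else:
                # Length byte, an empty flags byte, then 0xFF stuffing.
                adaptation = bytes([fill - 1, 0x00]) + b"\xff" * (fill - 2)
        packets.append(bytes(header) + adaptation + chunk)
        counter = (counter + 1) & 0x0F  # 4-bit continuity counter wraps
        pos += len(chunk)
    return packets
```

Only the final packet of a PES packet can be short, so stuffing through the adaptation field appears at most once per call.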


For cooperation inquiries, please contact the author via QQ. (When reprinting, please credit the author and the source.)



Origin blog.csdn.net/haibindev/article/details/84101081