Audio and video HLS protocol and m3u8 format analysis

What is HLS

HLS specification (RFC 8216) and its draft history: https://datatracker.ietf.org/doc/rfc8216/
The RFC can be read directly at: https://www.rfc-editor.org/rfc/rfc8216.html
Second edition (draft): https://datatracker.ietf.org/doc/html/draft-pantos-hls-rfc8216bis-00

HTTP Live Streaming (HLS) is an HTTP-based streaming media transport protocol proposed by Apple. It is part of Apple's QuickTime X and iPhone software systems.
It works by splitting the whole stream into small files that are downloaded over HTTP, only a few at a time. When a streaming session starts, the client downloads an extended M3U (m3u8) playlist file that contains the metadata needed to locate the available media streams.

Principle of HLS

In a typical HLS implementation, audio and video inputs are captured, encoded as H.264 and AAC, and then cut into a segmented stream by a server-side segmenter.
Taking ts as an example: the MPEG-2 transport stream is cut into a series of small, consecutive slice files (.ts) that are stored on the Web server. The server also generates an index file (.m3u8) that lists these slice files, and publishes it. The client requests and reads the index file, then requests the slice files listed in it to obtain the media data to decode and display.

A stream served over HLS is exposed as an m3u8 address. The difference between the two is that HLS is a protocol while m3u8 is a file format, somewhat like the relationship between rtmp and flv.

m3u8 format analysis

An M3U8 file is actually a playlist, which may be a Media Playlist or a Master Playlist.

  • Media Playlist
    When an M3U8 file is used as a Media Playlist, it records a series of media slices; playing these slices in order presents the complete media resource.
    In the example below, #EXTM3U on the first line identifies the file format. #EXT-X-TARGETDURATION:10 on the second line indicates that the duration of every slice that follows, rounded to the nearest integer, is at most 10 seconds. The playlist then lists 3 slices with durations of 9.009 seconds, 9.009 seconds, and 3.003 seconds.
    For on-demand playback, the client first downloads the M3U8 file and then downloads and plays each slice in the order listed. For live streaming, the client must periodically re-request the M3U8 file to check whether new slices are available to download and play. All of this data is transferred over HTTP.
    #EXTM3U
    #EXT-X-TARGETDURATION:10
    #EXT-X-VERSION:3
    #EXTINF:9.009,
    http://media.example.com/first.ts
    #EXTINF:9.009,
    http://media.example.com/second.ts
    #EXTINF:3.003,
    http://media.example.com/third.ts
    #EXT-X-ENDLIST
    
  • Master Playlist
    When an M3U8 file is used as a Master Playlist, it references Media Playlists at various bit rates: it lists multiple streams of the same media resource. The streams may differ in bit rate, format, and resolution, and may also provide audio in different languages, video from different camera angles, and so on. Each URI points to a Media Playlist.
    The client should select a stream that suits the current network conditions, and should select the language and camera angle according to the user's preference.
    #EXTM3U
    #EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1280000
    http://example.com/low.m3u8
    #EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=2560000
    http://example.com/mid.m3u8
    #EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=7680000
    http://example.com/hi.m3u8
    #EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=65000,CODECS="mp4a.40.5"
    http://example.com/audio-only.m3u8
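
The selection rule described for the Master Playlist can be sketched in Python. This is only an illustration under stated assumptions: the helper names parse_master and pick_variant are hypothetical, the playlist text is copied from the example above, and the policy (take the highest BANDWIDTH that does not exceed the measured throughput, otherwise fall back to the lowest) is one simple choice, not what any particular player does.

```python
import re

MASTER = """#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1280000
http://example.com/low.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=2560000
http://example.com/mid.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=7680000
http://example.com/hi.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=65000,CODECS="mp4a.40.5"
http://example.com/audio-only.m3u8
"""

# An attribute value is either a quoted string (which may contain commas)
# or a bare token that ends at the next comma.
ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_master(text):
    """Return a list of (attributes, uri) pairs, one per EXT-X-STREAM-INF tag."""
    variants = []
    pending = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            raw = line.split(":", 1)[1]
            pending = {k: v.strip('"') for k, v in ATTR_RE.findall(raw)}
        elif line and not line.startswith("#") and pending is not None:
            # The URI line that follows the tag belongs to that variant.
            variants.append((pending, line))
            pending = None
    return variants


def pick_variant(variants, measured_bps):
    """Hypothetical policy: highest BANDWIDTH that fits, else the lowest one."""
    ranked = sorted(variants, key=lambda v: int(v[0]["BANDWIDTH"]))
    fitting = [v for v in ranked if int(v[0]["BANDWIDTH"]) <= measured_bps]
    return (fitting[-1] if fitting else ranked[0])[1]


variants = parse_master(MASTER)
print(pick_variant(variants, 3_000_000))  # prints http://example.com/mid.m3u8
```

A real player would also weigh CODECS, RESOLUTION, and user preferences such as language, and would re-evaluate the choice as the measured throughput changes.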
    

Supported Media Segment Formats

  • MPEG-2 Transport Streams, the most common format (.ts files)
    Each TS segment must contain a single MPEG-2 Program.
    Each TS segment must either contain a PAT and a PMT, preferably at the very beginning of the segment, or have an EXT-X-MAP tag applied to it.
  • Fragmented MPEG-4
  • Packed Audio
  • WebVTT
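
As a rough illustration of the PAT requirement for TS segments, the sketch below checks whether the first packet of a segment carries the PAT. It relies on general MPEG-2 TS facts (188-byte packets, sync byte 0x47, a 13-bit PID, PAT on PID 0); the function names are hypothetical, the packets are synthetic, and the PMT is not checked because its PID is only known after parsing the PAT.

```python
TS_PACKET_SIZE = 188   # every MPEG-2 TS packet is 188 bytes long
SYNC_BYTE = 0x47       # first byte of every TS packet
PAT_PID = 0x0000       # the Program Association Table always uses PID 0


def packet_pid(packet):
    """Extract the 13-bit PID from a single TS packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid MPEG-2 TS packet")
    return ((packet[1] & 0x1F) << 8) | packet[2]


def starts_with_pat(segment):
    """Return True if the first packet of the segment carries the PAT."""
    if len(segment) < TS_PACKET_SIZE:
        return False
    return packet_pid(segment[:TS_PACKET_SIZE]) == PAT_PID


# Synthetic packets for illustration only: a 4-byte header plus zero padding.
fake_pat = bytes([0x47, 0x40, 0x00, 0x10]) + bytes(184)   # PID 0x0000
other = bytes([0x47, 0x41, 0x00, 0x10]) + bytes(184)      # PID 0x0100

print(starts_with_pat(fake_pat + other))   # prints True
print(starts_with_pat(other + fake_pat))   # prints False
```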

Common tags

  • Basic Tags:

    • EXTM3U: Indicates that the file is an extended M3U playlist. This tag must be on the first line of the file.
    • EXT-X-VERSION: Indicates the compatibility version of the playlist file, its associated media, and its server.
  • Media Segment Tags: Media Segment tags can only appear in the Media Playlist.

    • EXTINF: Indicates the duration (in seconds) of the media slice that follows. This tag must be specified before each media slice.
    • EXT-X-BYTERANGE: Indicates that a Media Segment is a sub-range of the resource identified by its URI.
    • EXT-X-DISCONTINUITY: Indicates a discontinuity between the Media Segment that follows it and the one that preceded it.
    • EXT-X-KEY: Indicates that Media Segments are encrypted and specifies how to decrypt them.
    • EXT-X-MAP: Used to specify the Media Initialization Section.
    • EXT-X-PROGRAM-DATE-TIME: Associates the first sample of a Media Segment with an absolute date and time.
    • EXT-X-DATERANGE: Associates a date range with a set of attribute key-value pairs.
  • Media Playlist Tags: Media Playlist tags can only appear in the Media Playlist.

    • EXT-X-TARGETDURATION: Used to specify the maximum Media Segment duration.
    • EXT-X-MEDIA-SEQUENCE: Used to specify the Media Sequence Number of the first Media Segment in the playlist.
    • EXT-X-DISCONTINUITY-SEQUENCE: Used for synchronization between different Variant Streams.
    • EXT-X-ENDLIST: Indicates that no more Media Segments will be added to the playlist.
    • EXT-X-PLAYLIST-TYPE: Optional; specifies the type of the entire playlist (VOD or EVENT).
    • EXT-X-I-FRAMES-ONLY: Indicates that each Media Segment describes a single I-frame.
  • Master Playlist Tags: Master Playlist tags can only appear in the Master Playlist.

    • EXT-X-MEDIA: Used to relate Media Playlists that contain alternative renditions of the same content.
    • EXT-X-STREAM-INF: Used to specify a Variant Stream:
      • BANDWIDTH: The peak bit rate, in bits per second, needed to play the M3U8 referenced by this EXT-X-STREAM-INF (required parameter).
      • AVERAGE-BANDWIDTH: The average bit rate, in bits per second, needed to play the M3U8 referenced by this EXT-X-STREAM-INF (optional parameter).
      • CODECS: Declares the audio and video codecs used by the media in the M3U8 referenced by this EXT-X-STREAM-INF (optional parameter).
      • RESOLUTION: The width and height of the video in the referenced M3U8 (optional parameter).
      • FRAME-RATE: The video frame rate in the referenced M3U8 (optional parameter).
    • EXT-X-I-FRAME-STREAM-INF: Used to specify a Media Playlist containing the I-frames of the media.
    • EXT-X-SESSION-DATA: Stores arbitrary session data.
    • EXT-X-SESSION-KEY: Declares, in the Master Playlist, the encryption keys used by the Media Playlists, so that the client can load them in advance.
  • Media or Master Playlist Tags: These tags can appear in either a Media Playlist or a Master Playlist. If a tag appears in both a Master Playlist and a Media Playlist that it references, the values must be the same.

    • EXT-X-INDEPENDENT-SEGMENTS: Indicates that each Media Segment can be decoded independently, without information from other segments.
    • EXT-X-START: Indicates a preferred point at which to start playing the playlist.
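
To make a few of the tags above concrete, here is a minimal Python sketch that reads a Media Playlist. It handles only EXTM3U, EXT-X-TARGETDURATION, EXT-X-MEDIA-SEQUENCE, EXTINF, and EXT-X-ENDLIST and ignores everything else; the function names are hypothetical, and the playlist text is the Media Playlist example from earlier in this article.

```python
import math

PLAYLIST = """#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-VERSION:3
#EXTINF:9.009,
http://media.example.com/first.ts
#EXTINF:9.009,
http://media.example.com/second.ts
#EXTINF:3.003,
http://media.example.com/third.ts
#EXT-X-ENDLIST
"""


def parse_media_playlist(text):
    """Parse a small subset of Media Playlist tags into a dict."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or lines[0] != "#EXTM3U":
        raise ValueError("the first line must be #EXTM3U")
    info = {"target_duration": None, "media_sequence": 0,
            "ended": False, "segments": []}
    duration = None
    for line in lines[1:]:
        if line.startswith("#EXT-X-TARGETDURATION:"):
            info["target_duration"] = int(line.split(":", 1)[1])
        elif line.startswith("#EXT-X-MEDIA-SEQUENCE:"):
            info["media_sequence"] = int(line.split(":", 1)[1])
        elif line.startswith("#EXTINF:"):
            # Format: #EXTINF:<duration>,[<title>]
            duration = float(line.split(":", 1)[1].split(",", 1)[0])
        elif line == "#EXT-X-ENDLIST":
            info["ended"] = True
        elif not line.startswith("#") and duration is not None:
            # A URI line closes the preceding EXTINF tag.
            info["segments"].append((duration, line))
            duration = None
    return info


def total_duration(info):
    return sum(duration for duration, _ in info["segments"])


def segments_within_target(info):
    """Each EXTINF, rounded to the nearest integer, must not exceed the target."""
    return all(math.floor(d + 0.5) <= info["target_duration"]
               for d, _ in info["segments"])


info = parse_media_playlist(PLAYLIST)
print(len(info["segments"]), info["ended"], segments_within_target(info))
# prints: 3 True True
```

The media sequence defaults to 0 here because a playlist without an EXT-X-MEDIA-SEQUENCE tag is treated as starting at sequence number 0.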

HLS example in live broadcast scenario

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:8
#EXT-X-MEDIA-SEQUENCE:2680

#EXTINF:7.975,
https://priv.example.com/fileSequence2680.ts
#EXTINF:7.941,
https://priv.example.com/fileSequence2681.ts
#EXTINF:7.975,
https://priv.example.com/fileSequence2682.ts
  • Key features of live playlists:
    They do not contain the EXT-X-ENDLIST tag.
    They do not contain the EXT-X-PLAYLIST-TYPE tag.
  • A live playlist is an M3U8 file that is updated dynamically. The server transcodes the live stream in real time, generates slices from it, and updates the M3U8 file regularly. The file generally lists 3-5 slices.
  • Whenever a slice URI is removed from the live playlist, the value of the EXT-X-MEDIA-SEQUENCE tag must be incremented by 1. Slice URIs must be removed in the order in which they appear in the playlist, so that the client gets continuous slice data from the updated M3U8 file.
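
The sliding-window behaviour described above can be sketched as follows. The LivePlaylist class is a hypothetical server-side helper written for illustration only; the window size, durations, and URIs are assumptions modelled on the example playlist.

```python
from collections import deque


class LivePlaylist:
    """Hypothetical helper that keeps a sliding window of live segments."""

    def __init__(self, window=3, first_sequence=0, target_duration=8):
        self.window = window
        self.media_sequence = first_sequence
        self.target_duration = target_duration
        self.segments = deque()  # (duration, uri) pairs, oldest first

    def add_segment(self, duration, uri):
        self.segments.append((duration, uri))
        while len(self.segments) > self.window:
            self.segments.popleft()     # always remove the oldest entry first
            self.media_sequence += 1    # one increment per removed segment

    def render(self):
        lines = [
            "#EXTM3U",
            "#EXT-X-VERSION:3",
            f"#EXT-X-TARGETDURATION:{self.target_duration}",
            f"#EXT-X-MEDIA-SEQUENCE:{self.media_sequence}",
        ]
        for duration, uri in self.segments:
            lines.append(f"#EXTINF:{duration:.3f},")
            lines.append(uri)
        # No EXT-X-ENDLIST is written while the stream is still live.
        return "\n".join(lines) + "\n"


playlist = LivePlaylist(window=3, first_sequence=2680)
for n in range(2680, 2685):
    playlist.add_segment(7.975, f"https://priv.example.com/fileSequence{n}.ts")
print(playlist.render())
```

After five segments have been added to a three-segment window, the two oldest entries are gone and the rendered playlist starts at media sequence 2682.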

HLS Latency Analysis

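Theoretical HLS latency = time to transcode one slice + 0-1 EXT-X-TARGETDURATION + 0-n startup slices (Apple's official recommendation is to start playback only after 3 slices have been requested) + network latency of the first slice the player requests (connection time)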

  1. Transcoding time: the time the server-side encoder and stream segmenter need to produce a slice file.
  2. 0-1 EXT-X-TARGETDURATION: this can be understood as the interval at which the player fetches slices; before the client can start downloading, it must wait for the server-side encoder and stream segmenter to generate at least one TS file.
  3. Startup slices: the client player needs to load several slices before it starts playing.
  4. Finally, the time the client player spends on the connection and request.
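
The latency formula and its four terms can be turned into a small calculation. This is a back-of-the-envelope sketch: the function name is hypothetical, every startup slice is assumed to last a full EXT-X-TARGETDURATION, and the input numbers are made up for illustration.

```python
def hls_latency_range(transcode_s, target_duration_s, startup_segments, network_s):
    """Return (best_case, worst_case) latency in seconds.

    Assumes every startup slice lasts a full target duration.
    """
    buffered = startup_segments * target_duration_s
    # The polling term varies between 0 and 1 target durations.
    best = transcode_s + 0 * target_duration_s + buffered + network_s
    worst = transcode_s + 1 * target_duration_s + buffered + network_s
    return best, worst


# 8-second slices, 3 startup slices, 1 s transcoding, 0.5 s network time
print(hls_latency_range(1.0, 8, 3, 0.5))   # prints (25.5, 33.5)

# The same setup with 2-second slices
print(hls_latency_range(1.0, 2, 3, 0.5))   # prints (7.5, 9.5)
```

Under these assumptions the startup buffer dominates the total, which is why shortening the slices has such a large effect on latency.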

To reduce latency, slices can be made shorter and the slicing interval smaller, but this increases the storage pressure on the server and the load on the slicing service.

Advantages and disadvantages of HLS protocol

  • Advantages of HLS

    1. Client support is simple: the client only needs to issue HTTP requests and download the media slices in order. Because HTTP is stateless, HLS is also well supported by CDNs.
    2. Multi-bit-rate adaptation is built in; Apple took bit rate adaptation into account when it proposed HLS.
    3. It is supported across Apple's product line: since HLS was proposed by Apple, its products, including iPhone, iPad, and Safari, support HLS natively without any plug-ins.
  • Disadvantages of HLS

    1. Latency is high, which makes HLS hard to use for interactive live streaming.
    2. Startup is slower than with http-flv, because the m3u8 index must be downloaded first.

HLS Applicable Scenarios

In practice, because the HLS/M3U8/TS solution gives poor control over live streaming latency, the M3U8 media format is generally not used for real-time live streaming.
For replays of live broadcasts, however, the M3U8/TS solution can keep generating and storing slices while the broadcast is in progress, so replay services almost always choose the M3U8 media format.


Origin blog.csdn.net/u014099894/article/details/126697403