Audio and video - video stream H264 encoding format

1 Introduction to H264

We already know what a macroblock is. As the basic unit of compressed video, macroblocks still need to be organized before they can be transmitted across a network.

H264 goes deeper: looking only at macroblocks is too shallow.

If you simply sent raw macroblocks, the data would be a mess, much like shipping before containers existed, when goods were stacked on the ship at random.

Loading (encoding) and unloading (decoding) were painful. Once the container appeared, everything changed and transport efficiency rose sharply.

The container can be understood as the H264 coding standard. It defines a common format for exchange and arranges the data into a code stream in an organized, structured, and orderly way. This code stream can be transmitted over a network stream (for example, read through an InputStream) or packaged into a file for storage.

**H264:** H264/AVC is a widely used encoding method. The main function is to transmit

1.1 H264 stream composition

The H264 code stream is built from the following parts, listed from largest to smallest:
H264 video sequence, image, slice group, slice, NALU, macroblock, and pixel.
This is similar to how the earth divides into countries, cities, towns, and villages.


1.1.1 H264 coding layer

  • NAL layer (Network Abstraction Layer): handles the transport of H264 over a network. An Ethernet frame carries at most 1500 bytes of payload (the MTU), while an H264 frame is often larger than 1500 bytes, so a frame has to be split into multiple packets for transmission. All of this splitting and regrouping is handled by the NAL layer.
  • VCL layer (Video Coding Layer): compresses the original video data.
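The splitting described above can be sketched in a few lines. The source does not say which packetization scheme is used, so this example borrows the FU-A fragmentation layout from RFC 6184 (the RTP payload format for H.264) as one common choice. The function names and the `max_payload` value of 1400 bytes are assumptions made for illustration, chosen to leave room for IP/UDP/RTP headers under a 1500-byte MTU.

```python
FU_A_TYPE = 28  # FU-A fragmentation unit type defined in RFC 6184


def fragment_nalu(nalu: bytes, max_payload: int = 1400) -> list[bytes]:
    """Split one NAL unit (header byte + payload, no start code) into packets."""
    if len(nalu) <= max_payload:
        return [nalu]  # small enough to travel as a single packet

    header = nalu[0]
    fu_indicator = (header & 0xE0) | FU_A_TYPE  # keep F and NRI bits, type = 28
    nal_type = header & 0x1F                    # original type moves to FU header
    body = nalu[1:]                             # header byte is not repeated
    chunk = max_payload - 2                     # 2 bytes: FU indicator + FU header

    packets = []
    for offset in range(0, len(body), chunk):
        piece = body[offset:offset + chunk]
        start_bit = 0x80 if offset == 0 else 0
        end_bit = 0x40 if offset + chunk >= len(body) else 0
        fu_header = start_bit | end_bit | nal_type
        packets.append(bytes([fu_indicator, fu_header]) + piece)
    return packets


def reassemble(packets: list[bytes]) -> bytes:
    """Rebuild the NAL unit from packets that arrived complete and in order."""
    if len(packets) == 1 and (packets[0][0] & 0x1F) != FU_A_TYPE:
        return packets[0]
    first = packets[0]
    header = (first[0] & 0xE0) | (first[1] & 0x1F)  # restore the original header
    return bytes([header]) + b"".join(p[2:] for p in packets)
```

The sketch ignores packet loss and reordering, which a real receiver has to handle using the sequence numbers of the transport protocol.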

1.1.2 Transmission of H264

An H264 code stream is like a river with no head or tail. How do you pick the data you want out of the stream?

The H264 standard defines an encapsulation format for this purpose, the "Annex B" byte stream format. It is the main byte stream format for H264 encoding.

Almost all encoders on the market output this format. The start code 0x00 00 00 01 or 0x00 00 01 is used as the separator.

The bytes between two start codes make up one NAL unit.
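As a minimal sketch of how a reader could cut such a stream into NAL units, the function below searches for the 3-byte pattern 00 00 01, which also matches the tail of the 4-byte start code. The name `split_annexb` is made up for this example, and the input is assumed to be a complete Annex B buffer held in memory.

```python
def split_annexb(stream: bytes) -> list[bytes]:
    """Return the NAL units of an Annex B byte stream, without start codes."""
    start_code = b"\x00\x00\x01"
    nal_units = []
    pos = stream.find(start_code)
    while pos != -1:
        begin = pos + len(start_code)
        nxt = stream.find(start_code, begin)
        end = len(stream) if nxt == -1 else nxt
        # Zero bytes at the end belong to the next 4-byte start code or to
        # padding: a NAL unit itself never ends with a 0x00 byte.
        nal = stream[begin:end].rstrip(b"\x00")
        if nal:
            nal_units.append(nal)
        pos = nxt
    return nal_units


# Usage: three NAL units, separated by 4-byte and 3-byte start codes.
sample = (b"\x00\x00\x00\x01\x67\x42\x00\x1f"
          b"\x00\x00\x00\x01\x68\xce"
          b"\x00\x00\x01\x65\x88\x84")
units = split_annexb(sample)
```

A streaming reader would do the same search incrementally and keep the unfinished tail of the buffer until the next start code arrives.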

1.1.3 Coding structure


Slice header: contains information about the slice, such as the slice type, the position of its first macroblock, and its order in the sequence.

1.1.4 H264 code stream hierarchical structure diagram

An H.264 coded video sequence is a series of NAL units, and each NAL unit contains an RBSP (Raw Byte Sequence Payload). A raw H.264 stream is composed of N NAL units (NALUs). In the byte stream, each NALU is laid out as [Start Code] [NALU Header] [NALU Payload], where the start code marks the beginning of a NALU and must be "00 00 00 01" or "00 00 01".

1.1.5 H.264 network transmission

As noted above, an encoded H.264 video sequence is a series of NAL units, each carrying an RBSP.

See Table 1. Coded slices (including data-partition slices and IDR slices) are defined as VCL NAL units; all the rest, such as parameter sets and the end-of-sequence RBSP, are non-VCL NAL units.

A typical RBSP unit sequence is shown in Figure 2.

Each unit is transmitted as an independent NAL unit. The header of the unit (one byte) defines the type of RBSP unit, and the rest of the NAL unit is RBSP data.
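The one-byte header can be decoded with a few shifts and masks. The sketch below assumes the standard bit layout of 1 bit for forbidden_zero_bit, 2 bits for nal_ref_idc, and 5 bits for nal_unit_type. The names `NalHeader`, `parse_nal_header`, `is_vcl`, and `NAL_TYPE_NAMES` are invented for this example, and the name table lists only a few common types.

```python
from typing import NamedTuple


class NalHeader(NamedTuple):
    forbidden_zero_bit: int  # 1 bit, should be 0 in a valid stream
    nal_ref_idc: int         # 2 bits, importance of the unit (NRI)
    nal_unit_type: int       # 5 bits, type of the RBSP that follows


# A few common nal_unit_type values; not the full table.
NAL_TYPE_NAMES = {
    1: "non-IDR slice",
    5: "IDR slice",
    6: "SEI",
    7: "SPS",
    8: "PPS",
    9: "access unit delimiter",
}


def parse_nal_header(first_byte: int) -> NalHeader:
    """Decode the first byte of a NAL unit (the byte after the start code)."""
    return NalHeader(
        forbidden_zero_bit=(first_byte >> 7) & 0x01,
        nal_ref_idc=(first_byte >> 5) & 0x03,
        nal_unit_type=first_byte & 0x1F,
    )


def is_vcl(nal_unit_type: int) -> bool:
    """Types 1 to 5 carry coded slice data; everything else is non-VCL."""
    return 1 <= nal_unit_type <= 5


# Usage: 0x67 is the header byte commonly seen on an SPS.
header = parse_nal_header(0x67)
name = NAL_TYPE_NAMES.get(header.nal_unit_type, "other")
```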

Start code: if the slice carried by the NALU is the start of a frame, the start code is 4 bytes, 0x00000001; otherwise it is 3 bytes, 0x000001.

NAL header: forbidden_zero_bit (1 bit), nal_ref_idc (2 bits, priority), and nal_unit_type (5 bits, type).

Emulation prevention: so that the NALU body never contains a start code, whenever the encoder meets two consecutive zero bytes followed by a byte of value 0x03 or less, it inserts a byte 0x03 after the two zeros. When decoding, the inserted 0x03 is deleted again.
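The insertion and removal of the 0x03 byte can be sketched as a pair of functions. This is a simplified illustration rather than a full implementation: the function names are made up, and the special case where the payload ends in a zero byte is not handled.

```python
def add_emulation_prevention(rbsp: bytes) -> bytes:
    """Encoder side: insert 0x03 so the payload never looks like a start code."""
    out = bytearray()
    zeros = 0  # number of consecutive zero bytes just written
    for b in rbsp:
        if zeros >= 2 and b <= 0x03:
            out.append(0x03)  # break up 00 00 00, 00 00 01, 00 00 02, 00 00 03
            zeros = 0
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


def remove_emulation_prevention(payload: bytes) -> bytes:
    """Decoder side: drop every 0x03 that directly follows two zero bytes."""
    out = bytearray()
    zeros = 0
    for b in payload:
        if zeros >= 2 and b == 0x03:
            zeros = 0
            continue  # skip the inserted byte
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


# Usage: a payload that would otherwise contain the start code 00 00 01.
encoded = add_emulation_prevention(b"\x00\x00\x01")
decoded = remove_emulation_prevention(encoded)
```

Note that a literal 00 00 03 in the payload is also escaped, otherwise the decoder could not tell it apart from an inserted byte.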
The nal_ref_idc (NRI) field of the NAL header marks the importance of a NAL unit in the reconstruction process, and the forbidden bit flags errors:

  1. A value of 0 indicates that this NAL unit is not used for prediction, so the decoder can discard it without error propagation.
  2. A value higher than 0 indicates that the NAL unit is needed for drift-free reconstruction; the higher the value, the greater the impact of losing this NAL unit.
  3. The forbidden bit (forbidden_zero_bit) of the NAL header is 0 by default in an H.264 encoder. When the network detects a bit error in the unit, it can set the bit to 1. The forbidden bit is mainly used to adapt to different types of network environments (such as a combination of wired and wireless links).
The process of decoding a NAL unit is as follows: first extract the RBSP syntax structure from the NAL unit, then process it according to the flow shown in Figure 4. The input is the NAL unit, and the output is the decoded samples of the current image. Some NAL units carry a sequence parameter set (SPS) or an image parameter set (PPS), which the other NAL units use as a reference. The header of each data NAL unit names the image parameter set it uses through the syntax element pic_parameter_set_id, and each image parameter set in turn names the sequence parameter set it uses through the syntax element seq_parameter_set_id.
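The two-step lookup from a slice to its parameter sets can be sketched with two dictionaries. Everything here is illustrative: the class names, the `resolve` method, and the `width`/`height` fields are hypothetical, and the ids are assumed to be already decoded from the bitstream, which a real decoder would have to parse first.

```python
from dataclasses import dataclass, field


@dataclass
class Sps:
    seq_parameter_set_id: int
    width: int   # hypothetical stand-in for the real SPS contents
    height: int


@dataclass
class Pps:
    pic_parameter_set_id: int
    seq_parameter_set_id: int  # which SPS this image parameter set refers to


@dataclass
class ParameterSets:
    sps_by_id: dict = field(default_factory=dict)
    pps_by_id: dict = field(default_factory=dict)

    def store_sps(self, sps: Sps) -> None:
        self.sps_by_id[sps.seq_parameter_set_id] = sps

    def store_pps(self, pps: Pps) -> None:
        self.pps_by_id[pps.pic_parameter_set_id] = pps

    def resolve(self, pic_parameter_set_id: int) -> tuple:
        """Follow slice -> PPS -> SPS, as a slice header would require."""
        pps = self.pps_by_id[pic_parameter_set_id]
        sps = self.sps_by_id[pps.seq_parameter_set_id]
        return pps, sps


# Usage: parameter sets arrive first, then a slice that names PPS 0.
sets = ParameterSets()
sets.store_sps(Sps(seq_parameter_set_id=0, width=1280, height=720))
sets.store_pps(Pps(pic_parameter_set_id=0, seq_parameter_set_id=0))
pps, sps = sets.resolve(0)
```

A lookup that raises KeyError here corresponds to a slice arriving before the parameter sets it depends on, in which case the slice cannot be decoded.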


Origin blog.csdn.net/qq_39431405/article/details/131938485