[Video] Introduction to H.264 NALU

1. Introduction

In the H.264/AVC video coding standard, the system framework is divided into two layers: the video coding layer (VCL) and the network abstraction layer (NAL). The former is responsible for efficiently representing the content of the video data, while the latter formats that data and adds header information so that it is suitable for transmission over various channels and storage media. In practice, every frame of data we handle is carried in NAL units. SPS and PPS are also NAL units, but they carry parameters rather than picture data. In an actual H.264 byte stream, each NAL unit is preceded by a 00 00 00 01 or 00 00 01 separator (the start code). Generally speaking, the first data output by the encoder is the SPS and PPS, followed by an I frame...

As shown below:

 

 

H.264 is transmitted over the network as NALUs. The structure of a NALU is: NAL header + RBSP. The data flow in actual transmission is shown in the figure:

The NALU header identifies what type of data the following RBSP holds, whether it is referenced by other frames, and whether an error occurred in network transmission.

There are several types of RBSP:

 

 

2. NAL Header

The NALU type is the main tool for determining the frame type, as shown in the following figure from the official document:

 

 

An H.264 NAL unit is composed of a NALU header and a NALU body.

The NALU header consists of one byte, and its syntax is as follows:

      +---------------+
      |0|1|2|3|4|5|6|7|
      +-+-+-+-+-+-+-+-+
      |F|NRI|  Type   |
      +---------------+

F: 1 bit
 forbidden_zero_bit. It is specified in the H.264 specification that this bit must be 0.

NRI: 2 bits
 nal_ref_idc. Takes the binary values 00 to 11 (0 to 3) and indicates the importance of this NALU: the larger the value, the more important the NALU and the higher its priority for protection. A NALU with the value 00 is not used for reference, so the decoder can discard it without affecting the decoding of other pictures. If the current NAL is a slice belonging to a reference frame, a sequence parameter set, or a picture parameter set, this syntax element must be greater than 0.

Type: 5 bits
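 nal_unit_type. The type of this NALU. Values 1 to 12 are defined by H.264 (13 to 23 are reserved or assigned by later editions of the standard), while 0 and 24 to 31 are left unspecified for use by applications outside H.264, such as RTP packetization. The brief description is as follows:

      0       Unspecified
      1       Coded slice of a non-IDR picture
      2       Coded slice data partition A
      3       Coded slice data partition B
      4       Coded slice data partition C
      5       Coded slice of an IDR picture
      6       Supplemental enhancement information (SEI)
      7       Sequence parameter set (SPS)
      8       Picture parameter set (PPS)
      9       Access unit delimiter
      10      End of sequence
      11      End of stream
      12      Filler data
      13-23   Reserved
      24-31   Unspecified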

 

The byte immediately after the 00 00 00 01 start code is the NALU header, which carries the NALU type. After converting it to binary, read the bits from left to right (most significant bit first).

For example, the bytes 67, 68 and 65 follow 00 00 00 01 in the stream above.

The binary code of 0x67 is:
0110 0111
Bits 3-7 (the low five bits) are 00111, which is decimal 7. In the NAL unit type table, 7 corresponds to the sequence parameter set (SPS).

The binary code of 0x68 is:
0110 1000
Bits 3-7 are 01000, which is decimal 8. In the NAL unit type table, 8 corresponds to the picture parameter set (PPS).

The binary code of 0x65 is:
0110 0101
Bits 3-7 are 00101, which is decimal 5. In the NAL unit type table, 5 corresponds to a slice of an IDR picture (I frame).
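
The three worked examples above can also be checked in code. The following is a minimal sketch, not taken from any particular library: the function name parse_nal_header is made up for illustration, and it assumes the one-byte header layout shown in the diagram (F, then NRI, then Type, most significant bit first).

```python
from typing import Tuple


def parse_nal_header(header_byte: int) -> Tuple[int, int, int]:
    """Split a one-byte NALU header into (F, NRI, Type).

    Hypothetical helper for illustration; the bit layout follows the
    header diagram: 1 bit F, 2 bits NRI, 5 bits Type.
    """
    if not 0 <= header_byte <= 0xFF:
        raise ValueError("the NALU header must fit in one byte")
    forbidden_zero_bit = (header_byte >> 7) & 0x01  # bit 0 (leftmost)
    nal_ref_idc = (header_byte >> 5) & 0x03         # bits 1-2
    nal_unit_type = header_byte & 0x1F              # bits 3-7
    return forbidden_zero_bit, nal_ref_idc, nal_unit_type


# The three header bytes from the examples above.
for value in (0x67, 0x68, 0x65):
    f, nri, nal_type = parse_nal_header(value)
    print(f"0x{value:02x} = {value:08b}: F={f} NRI={nri} Type={nal_type}")
```

All three bytes start with 011, so F is 0 and NRI is 3 (binary 11) in each case, and only the Type field differs.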

 

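Therefore, the method to determine whether a NALU is an I frame is to mask the NALU header byte: (header & 0001 1111) == 5, that is, (header & 0x1f) == 5.

For example, 0x65 & 0x1f = 5.

A P frame is judged the same way: 0x61 & 0x1f = 1. Strictly speaking, type 1 only means a slice of a non-IDR picture. Telling P slices from B slices requires reading slice_type in the slice header.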
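
The mask test can be wrapped in a small lookup. This is a sketch under stated assumptions: the names NAL_TYPE_NAMES, nal_type, is_idr and describe are invented for this example, and the table deliberately covers only the four types discussed in this article.

```python
NAL_TYPE_MASK = 0x1F  # low five bits of the header byte

# Only the types discussed in this article; anything else is reported as "other".
NAL_TYPE_NAMES = {
    1: "non-IDR slice",
    5: "IDR slice",
    7: "SPS",
    8: "PPS",
}


def nal_type(header_byte: int) -> int:
    """Return nal_unit_type from the NALU header byte."""
    return header_byte & NAL_TYPE_MASK


def is_idr(header_byte: int) -> bool:
    """True when the NALU is a slice of an IDR picture (type 5)."""
    return nal_type(header_byte) == 5


def describe(header_byte: int) -> str:
    """Map the header byte to a short human-readable name."""
    return NAL_TYPE_NAMES.get(nal_type(header_byte), "other")


for value in (0x67, 0x68, 0x65, 0x61):
    print(f"0x{value:02x} -> type {nal_type(value)} ({describe(value)})")
# 0x67 -> type 7 (SPS)
# 0x68 -> type 8 (PPS)
# 0x65 -> type 5 (IDR slice)
# 0x61 -> type 1 (non-IDR slice)
```

Masking rather than comparing the whole byte matters because NRI can vary: 0x65 and 0x25 are both IDR slices even though the header bytes differ.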

A few examples:

SPS and PPS

 

SPS and PPS contain the information parameters needed to initialize the H.264 decoder.

The SPS contains parameters for a continuous coded video sequence, such as the identifier seq_parameter_set_id, constraints on the frame number and the picture order count (POC), the number of reference frames, the decoded picture size, and the frame/field coding mode selection flag.

A PPS applies to one picture or a few pictures in a sequence. Its parameters include the identifier pic_parameter_set_id, the seq_parameter_set_id of the SPS it refers to, the entropy coding mode selection flag, the number of slice groups, the initial quantization parameters, and the deblocking filter coefficient adjustment flag.

 

The beginning and end of NAL

        The encoder places each NAL, whole and independent, into its own packet. Because the packet has a header, the decoder can easily detect the boundaries of the NAL and take out each NAL for decoding in turn.

In a byte stream, each NAL is preceded by a start code, 0x000001 (or 0x00000001). The decoder treats each start code as the start of a NAL. When the next start code is detected, the current NAL ends.
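
Splitting a byte stream on start codes can be sketched as follows. This is an illustrative scanner rather than a production parser: the function name split_nal_units is hypothetical, and the sample bytes are invented for the example, not taken from a real stream. It searches for the 3-byte pattern 00 00 01, which also matches the tail of the 4-byte form, and strips the zero bytes left over at the end of the previous unit (H.264 does not allow a NAL unit to end in 0x00, so those zeros cannot be payload).

```python
from typing import List


def split_nal_units(stream: bytes) -> List[bytes]:
    """Return the NAL units in a start-code-delimited stream, start codes removed."""
    units: List[bytes] = []
    start = None  # index of the first byte of the NAL unit being collected
    i = 0
    n = len(stream)
    while i + 3 <= n:
        if stream[i] == 0 and stream[i + 1] == 0 and stream[i + 2] == 1:
            if start is not None:
                # Trailing zeros belong to a 4-byte start code or padding.
                unit = stream[start:i].rstrip(b"\x00")
                if unit:
                    units.append(unit)
            start = i + 3
            i += 3
        else:
            i += 1
    if start is not None:
        unit = stream[start:].rstrip(b"\x00")
        if unit:
            units.append(unit)
    return units


# Made-up sample: SPS, PPS and IDR slice headers followed by dummy payload bytes.
sample = bytes.fromhex("00000001 67 42 00 1e 00000001 68 ce 000001 65 88")
for unit in split_nal_units(sample):
    print(f"type {unit[0] & 0x1F}, {len(unit)} bytes")
```

With this sample the scanner yields three units whose types are 7, 8 and 5, matching the SPS, PPS, I frame order described in the introduction.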

        At the same time, H.264 stipulates that 0x000000 can also signify the end of the current NAL. So what happens when 0x000001 or 0x000000 appears inside the NAL data itself? H.264 introduces an emulation prevention mechanism (sometimes translated as "competition prevention"). Whenever the encoder finds two consecutive 0x00 bytes followed by a byte from 0x00 to 0x03 in the NAL data, it inserts a new byte 0x03 before that last byte, like this:

0x000000 -> 0x00000300
0x000001 -> 0x00000301
0x000002 -> 0x00000302
0x000003 -> 0x00000303

When the decoder detects 0x000003, it discards the 03 and restores the original data (the unpacking operation). When decoding, the decoder first reads the NAL data byte by byte to find where the NAL ends, which gives its length, and then starts decoding.
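
The insertion and removal of the 0x03 byte can be sketched as a pair of functions. This is a minimal illustration of the rule described above, with hypothetical function names. It does not handle the special case in which the data ends in cabac_zero_words.

```python
def add_emulation_prevention(rbsp: bytes) -> bytes:
    """Encoder side: insert 0x03 after two zero bytes when the next byte is 0x00..0x03."""
    out = bytearray()
    zeros = 0  # number of consecutive 0x00 bytes just written
    for b in rbsp:
        if zeros >= 2 and b <= 0x03:
            out.append(0x03)  # emulation prevention byte
            zeros = 0
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


def remove_emulation_prevention(payload: bytes) -> bytes:
    """Decoder side: drop every 0x03 that directly follows two zero bytes."""
    out = bytearray()
    zeros = 0
    for b in payload:
        if zeros >= 2 and b == 0x03:
            zeros = 0
            continue  # discard the emulation prevention byte
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


# The four cases listed above, followed by a round trip.
for last in range(4):
    raw = bytes([0x00, 0x00, last])
    escaped = add_emulation_prevention(raw)
    print(raw.hex(), "->", escaped.hex())
    assert remove_emulation_prevention(escaped) == raw
```

After escaping, the payload never contains 00 00 00 or 00 00 01, so a scanner looking for start codes cannot be fooled by the picture data.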

Origin blog.csdn.net/wdglhack/article/details/109812806