H.264 encoding overview, part seven (SPS analysis)

1. Concept

SPS stands for Sequence Parameter Set. It stores a set of global parameters that apply to a whole coded video sequence.

2. Definition

The SPS syntax is specified in clause 7.3.2.1.1 of the H.264 standard, as shown in the following figure:

1、profile_idc

According to the definition in Annex A.2 of "T-REC-H.264-201402-I!!PDF-E", profiles have the following types:

The value of profile_idc determines which profile the bitstream conforms to. Based on the definitions in Annex A, the profiles can be summarized in the following table:

Profile                      profile_idc
Baseline profile             66
Main profile                 77
Extended profile             88
High profile                 100
High 10 profile              110
High 4:2:2 profile           122
High 10 Intra profile        profile_idc = 110 && constraint_set3_flag = 1
High 4:2:2 Intra profile     profile_idc = 122 && constraint_set3_flag = 1
High 4:4:4 Intra profile     profile_idc = 244 && constraint_set3_flag = 1
CAVLC 4:4:4 Intra profile    44
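The table lookup can be sketched in a few lines of Python. This is only an illustration: the function name profile_name is made up here, and the entry for profile_idc = 244 without constraint_set3_flag (High 4:4:4 Predictive) comes from Annex A rather than from the table above.

```python
def profile_name(profile_idc: int, constraint_set3_flag: int = 0) -> str:
    """Map profile_idc (plus constraint_set3_flag) to a profile name."""
    # Intra-only variants are signalled by constraint_set3_flag = 1.
    intra = {
        110: "High 10 Intra",
        122: "High 4:2:2 Intra",
        244: "High 4:4:4 Intra",
    }
    if constraint_set3_flag == 1 and profile_idc in intra:
        return intra[profile_idc]

    names = {
        66: "Baseline",
        77: "Main",
        88: "Extended",
        100: "High",
        110: "High 10",
        122: "High 4:2:2",
        244: "High 4:4:4 Predictive",  # not listed in the table above
        44: "CAVLC 4:4:4 Intra",
    }
    return names.get(profile_idc, f"unknown ({profile_idc})")


print(profile_name(100))     # High
print(profile_name(110, 1))  # High 10 Intra
```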

2、constraint_set0_flag - constraint_set3_flag

These flags supplement profile_idc in identifying the profile of the coded stream.

constraint_set0_flag equal to 1 means that the bitstream complies with all the constraints in clause A.2.1. constraint_set0_flag equal to 0 means that the bitstream may or may not comply with all the constraints in clause A.2.1.

3、level_idc

Identifies the level of the current bitstream. A level defines limits such as the maximum video resolution and the maximum frame rate under given conditions; the level that the bitstream conforms to is specified by level_idc.

For example, if level_idc = 0x29 = 41 in the bitstream, the level of the bitstream is 4.1.
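As a rough illustration, profile_idc, the constraint flags and level_idc occupy the first three bytes of the SPS payload, so they can be read without any variable-length decoding. The sketch below assumes the input starts right after the one-byte NAL unit header; the function name parse_sps_prefix is hypothetical, and the special level 1b is not handled.

```python
def parse_sps_prefix(rbsp: bytes) -> dict:
    """Read the three fixed-length bytes at the start of an SPS payload."""
    if len(rbsp) < 3:
        raise ValueError("need at least 3 bytes")
    profile_idc, flags_byte, level_idc = rbsp[0], rbsp[1], rbsp[2]
    # constraint_set0_flag is the most significant bit of the second byte.
    flags = [(flags_byte >> (7 - i)) & 1 for i in range(4)]
    return {
        "profile_idc": profile_idc,
        "constraint_set_flags": flags,  # [set0, set1, set2, set3]
        "level_idc": level_idc,
        # level_idc is ten times the level number, e.g. 41 -> "4.1"
        "level": f"{level_idc // 10}.{level_idc % 10}",
    }


info = parse_sps_prefix(bytes([0x64, 0x00, 0x29]))
print(info["profile_idc"], info["constraint_set_flags"], info["level"])
# prints: 100 [0, 0, 0, 0] 4.1
```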

4、seq_parameter_set_id

Identifies the sequence parameter set that a picture parameter set refers to. The value of seq_parameter_set_id is in the range 0-31, inclusive.

This syntax element is coded as an unsigned integer Exp-Golomb code (for details, see 9.1, Parsing process for Exp-Golomb codes). A slice header carries a pic_parameter_set_id that selects a PPS, and that PPS in turn carries the seq_parameter_set_id of the SPS it depends on.
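Unsigned Exp-Golomb decoding can be sketched as follows: count the leading zero bits, skip the terminating 1 bit, then read that many further bits as an offset. The BitReader class is a hypothetical helper written for this example; it does not remove emulation prevention bytes, which a real parser has to do first.

```python
class BitReader:
    """Minimal MSB-first bit reader over a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position in bits

    def read_bit(self) -> int:
        byte = self.data[self.pos >> 3]
        bit = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit

    def read_bits(self, n: int) -> int:
        value = 0
        for _ in range(n):
            value = (value << 1) | self.read_bit()
        return value

    def read_ue(self) -> int:
        """Decode one unsigned Exp-Golomb value, ue(v)."""
        leading_zeros = 0
        while self.read_bit() == 0:
            leading_zeros += 1
        # codeNum = 2^leadingZeroBits - 1 + the next leadingZeroBits bits
        return (1 << leading_zeros) - 1 + self.read_bits(leading_zeros)


# 0xA6 = 1 010 011 0 in binary: the codes for 0, 1 and 2, plus one spare bit.
reader = BitReader(bytes([0xA6]))
print([reader.read_ue() for _ in range(3)])  # [0, 1, 2]
```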

When analyzing a stream with ffmpeg, we often encounter a "non-existing SPS/PPS" error, which is printed as follows:

 

What has actually happened is that the SPS and PPS of the GOP were lost, so when the following video frames are parsed, ffmpeg reports that they cannot be decoded. The relevant ffmpeg code works as follows:

ff_h264_decode_seq_parameter_set and ff_h264_decode_picture_parameter_set read sps_id and pps_id and allocate the corresponding buffers.

Then, when h264_slice_header_parse parses a video frame, it checks whether those buffers were allocated:

If they were not, the SPS and PPS have been lost, the frame cannot be decoded, and it is dropped.
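The store-then-look-up pattern described above can be sketched as follows. This is not ffmpeg's code: the ParameterSetStore class, its method names and its error messages are all hypothetical, and only illustrate why a slice cannot be decoded when the parameter sets it refers to were never received.

```python
class ParameterSetStore:
    """Hypothetical store for parsed parameter sets, keyed by their ids."""

    def __init__(self):
        self.sps = {}  # seq_parameter_set_id -> parsed SPS fields
        self.pps = {}  # pic_parameter_set_id -> (seq_parameter_set_id, PPS fields)

    def add_sps(self, sps_id: int, fields: dict) -> None:
        self.sps[sps_id] = fields

    def add_pps(self, pps_id: int, sps_id: int, fields: dict) -> None:
        self.pps[pps_id] = (sps_id, fields)

    def resolve(self, pps_id: int):
        """Return (sps, pps) for a slice header, or raise if either is missing."""
        if pps_id not in self.pps:
            raise LookupError(f"PPS {pps_id} has not been received")
        sps_id, pps = self.pps[pps_id]
        if sps_id not in self.sps:
            raise LookupError(f"SPS {sps_id} has not been received")
        return self.sps[sps_id], pps


store = ParameterSetStore()
store.add_sps(0, {"level_idc": 41})
store.add_pps(0, sps_id=0, fields={})
sps, pps = store.resolve(0)  # both parameter sets are present

try:
    store.resolve(1)  # PPS 1 was never stored, so the slice must be dropped
except LookupError as err:
    print(err)
```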

5、log2_max_frame_num_minus4

The value of log2_max_frame_num_minus4 is in the range 0-12, inclusive.

The variable MaxFrameNum, which bounds the values of frame_num, is derived from it as follows:

MaxFrameNum = 2^(log2_max_frame_num_minus4 + 4)

6、pic_order_cnt_type

Specifies the method used to decode the picture order count (POC), as described in Section 8.2.1. POC is another way of numbering pictures and is calculated differently from frame_num. The value of this syntax element is 0, 1 or 2.

7、log2_max_pic_order_cnt_lsb_minus4

Specifies the value of the variable MaxPicOrderCntLsb, which is used in the decoding process for picture order count described in Section 8.2.1. The formula is as follows:

MaxPicOrderCntLsb = 2^(log2_max_pic_order_cnt_lsb_minus4 + 4)
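Both derived variables are powers of two computed from the "minus4" syntax elements. A minimal sketch follows; the function names are chosen for this example, and the 0-12 range check is the one stated above for log2_max_frame_num_minus4.

```python
def max_frame_num(log2_max_frame_num_minus4: int) -> int:
    """MaxFrameNum = 2^(log2_max_frame_num_minus4 + 4)."""
    if not 0 <= log2_max_frame_num_minus4 <= 12:
        raise ValueError("log2_max_frame_num_minus4 must be in 0-12")
    return 1 << (log2_max_frame_num_minus4 + 4)


def max_pic_order_cnt_lsb(log2_max_pic_order_cnt_lsb_minus4: int) -> int:
    """MaxPicOrderCntLsb = 2^(log2_max_pic_order_cnt_lsb_minus4 + 4)."""
    return 1 << (log2_max_pic_order_cnt_lsb_minus4 + 4)


print(max_frame_num(0))          # 16
print(max_frame_num(12))         # 65536
print(max_pic_order_cnt_lsb(2))  # 64
```

With the smallest setting, frame_num therefore takes values 0 to 15 before it wraps around.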

8、num_ref_frames

num_ref_frames specifies the maximum number of short-term and long-term reference frames, complementary reference field pairs, and unpaired reference fields that may be used for inter prediction of any picture in the video sequence. It also determines the size of the sliding window operation specified in Section 8.2.5.3. The value of num_ref_frames shall be in the range 0 to MaxDpbSize (as defined in A.3.1 or A.3.2), inclusive.

9、gaps_in_frame_num_value_allowed_flag

gaps_in_frame_num_value_allowed_flag specifies the allowed values of frame_num as given in 7.4.3, and the decoding process to apply when a gap between frame_num values is inferred, as given in 8.2.5.2.

10、pic_width_in_mbs_minus1

11、pic_height_in_map_units_minus1

12、frame_mbs_only_flag

frame_mbs_only_flag=0: Indicates that the coded pictures of the coded video sequence may be coded fields or coded frames. frame_mbs_only_flag=1: Indicates that each coded picture of the coded video sequence is a coded frame containing only frame macroblocks. 

13、direct_8x8_inference_flag

direct_8x8_inference_flag indicates the method used in the calculation process of B_Skip, B_Direct_16x16 and B_Direct_8x8 luma motion vector specified in Section 8.4.1.2. direct_8x8_inference_flag shall be equal to 1 when frame_mbs_only_flag is equal to 0.
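Several of the value ranges and dependencies mentioned so far can be checked mechanically once the SPS fields have been parsed. The sketch below assumes the fields are held in a plain dictionary keyed by syntax element name; the function name check_sps_constraints and the message texts are made up for this example, and only constraints stated in this article are covered.

```python
def check_sps_constraints(sps: dict) -> list:
    """Return a list of human-readable constraint violations (empty if none)."""
    problems = []
    if not 0 <= sps["seq_parameter_set_id"] <= 31:
        problems.append("seq_parameter_set_id out of range 0-31")
    if not 0 <= sps["log2_max_frame_num_minus4"] <= 12:
        problems.append("log2_max_frame_num_minus4 out of range 0-12")
    if sps["pic_order_cnt_type"] not in (0, 1, 2):
        problems.append("pic_order_cnt_type must be 0, 1 or 2")
    # Field or mixed frame/field coding requires 8x8 direct inference.
    if sps["frame_mbs_only_flag"] == 0 and sps["direct_8x8_inference_flag"] != 1:
        problems.append("direct_8x8_inference_flag must be 1 "
                        "when frame_mbs_only_flag is 0")
    return problems


example = {
    "seq_parameter_set_id": 0,
    "log2_max_frame_num_minus4": 0,
    "pic_order_cnt_type": 0,
    "frame_mbs_only_flag": 0,
    "direct_8x8_inference_flag": 0,
}
print(check_sps_constraints(example))
```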

14、frame_cropping_flag

frame_cropping_flag=1: Indicates that the frame cropping offset parameters follow next in the sequence parameter set. frame_cropping_flag=0: Indicates that the frame cropping offset parameters are not present.
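Taken together, pic_width_in_mbs_minus1, pic_height_in_map_units_minus1, frame_mbs_only_flag and the cropping offsets give the displayed resolution. The sketch below assumes 4:2:0 chroma sampling (so the crop units are 2 horizontally and 2 * (2 - frame_mbs_only_flag) vertically); the function name sps_resolution is hypothetical, and the frame_crop_*_offset parameters default to 0, which corresponds to frame_cropping_flag = 0.

```python
def sps_resolution(pic_width_in_mbs_minus1: int,
                   pic_height_in_map_units_minus1: int,
                   frame_mbs_only_flag: int,
                   frame_crop_left_offset: int = 0,
                   frame_crop_right_offset: int = 0,
                   frame_crop_top_offset: int = 0,
                   frame_crop_bottom_offset: int = 0) -> tuple:
    """Return (width, height) in luma samples, assuming 4:2:0 chroma."""
    # Macroblocks are 16x16 luma samples.
    width = (pic_width_in_mbs_minus1 + 1) * 16
    # When fields are allowed, a map unit covers two macroblock rows of the frame.
    height = (2 - frame_mbs_only_flag) * (pic_height_in_map_units_minus1 + 1) * 16

    crop_unit_x = 2                              # SubWidthC for 4:2:0
    crop_unit_y = 2 * (2 - frame_mbs_only_flag)  # SubHeightC * (2 - flag)
    width -= crop_unit_x * (frame_crop_left_offset + frame_crop_right_offset)
    height -= crop_unit_y * (frame_crop_top_offset + frame_crop_bottom_offset)
    return width, height


# A typical progressive 1080p stream: 120 x 68 macroblocks, 8 rows cropped.
print(sps_resolution(119, 67, 1, frame_crop_bottom_offset=4))  # (1920, 1080)
```

The coded height is 1088 because it must be a multiple of 16; the bottom cropping offset removes the 8 padding rows.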

15、vui_parameters_present_flag

vui_parameters_present_flag=1: Indicates that the vui_parameters() syntax structure described in Annex E is present.
vui_parameters_present_flag=0: Indicates that the vui_parameters() syntax structure described in Annex E is not present.

Origin blog.csdn.net/CrystalShaw/article/details/129375250