【Video codec · study notes】11. Extract SPS information program

1. Preparation:

Go back to the previous SimpleH264Analyzer program, find the SPS information, and parse it.

Adjust the project directory structure:

Modify the code in Global.h and add a new data type, UINT16. In the previously written project, the 8-bit and 32-bit type names were both lowercase. To follow common naming conventions, change them to all uppercase (you can use Ctrl+H to replace within the entire solution).

typedef unsigned char  UINT8;
typedef unsigned short UINT16;
typedef unsigned int   UINT32;

The program will produce more and more output as it grows, and sending all of it to the console would get messy. So output now goes two ways: one to the console and the other to a log file. Proceed as follows:

1 Create a new file Configuration.h, put it in the 1.Application directory, and add the code:

#ifndef _CONFIGURATION_H_
#define _CONFIGURATION_H_

#include <fstream>

#define TRACE_CONFIG_CONSOLE 1
#define TRACE_CONFIG_LOGOUT  1

extern std::ofstream g_traceFile;

#endif

2 Create a new file Configuration.cpp, put it in the 1.Application directory, and add the code:

#include "stdafx.h"
#include "Configuration.h"

#if TRACE_CONFIG_LOGOUT

std::ofstream g_traceFile;

#endif

3 Add the required includes to stdafx.h:

#include <string>
#include "Configuration.h"

4 Whether to write to the log file is decided in the constructor in Stream.cpp. Add the following to CStreamFile::CStreamFile(TCHAR * fileName):

#if TRACE_CONFIG_LOGOUT
    g_traceFile.open(L"trace.txt");
    if (!g_traceFile.is_open())
    {
        file_error(1);
    }
    g_traceFile << "Trace file:" << endl;
#endif

In the destructor CStreamFile::~CStreamFile(), add:

#if TRACE_CONFIG_LOGOUT
    if (g_traceFile.is_open())
    {
        g_traceFile.close();
    }
#endif
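
These blocks are guarded by the TRACE_CONFIG_LOGOUT switch, which is defined as 0 or 1. For a switch like that, `#if` tests the value, whereas `#ifdef` only tests whether the macro is defined at all, so an `#ifdef` guard stays active even when the switch is set to 0. A minimal sketch of the difference, using a made-up macro DEMO_SWITCH (not part of the project) so it can be compiled on its own:

```cpp
#include <cassert>

// DEMO_SWITCH is a hypothetical macro for illustration: defined, but set to 0.
#define DEMO_SWITCH 0

#ifdef DEMO_SWITCH
const bool kIfdefBranchTaken = true;    // taken: the macro is defined, its value is ignored
#else
const bool kIfdefBranchTaken = false;
#endif

#if DEMO_SWITCH
const bool kIfBranchTaken = true;
#else
const bool kIfBranchTaken = false;      // taken: the macro's value is 0
#endif

// Run-time check of which branches the preprocessor selected.
void Check_switch_semantics()
{
    assert(kIfdefBranchTaken);   // #ifdef saw a defined macro
    assert(!kIfBranchTaken);     // #if saw the value 0
}
```

So to be able to turn logging off by setting the switch to 0, every guard on it needs to use `#if`.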

When the log file fails to open, file_error(1) is called, so modify the void CStreamFile::file_error(int idx) function and add a branch for error code 1:

case 1:
        wcout << L"Error: opening trace file failed." << endl;
        break;

After completing the above configuration, compile and run the program. A trace.txt file is generated in the \bin\Debug directory, and the string "Trace file:" is written to it.

To replace the previous direct console output, add a new function to the CStreamFile class. First declare the function (as private) in Stream.h:

void    dump_NAL_type(UINT8 nalType);

Then add the implementation of this function to Stream.cpp:

void CStreamFile::dump_NAL_type(UINT8 nalType)
{
#if TRACE_CONFIG_CONSOLE
    wcout << L"NAL Unit Type: " << nalType << endl;
#endif

#if TRACE_CONFIG_LOGOUT
    g_traceFile << "NAL Unit Type: " << to_string(nalType) << endl;
#endif
}

Change the wcout output in the Parse_h264_bitstream() function to call the new function:

dump_NAL_type(nalType);

Recompile and run. Since the console and log file switches are both turned on at this point, the NAL Unit Type output appears in both the console and trace.txt.
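
The same two-way output idea can be sketched without the global file object by passing the sinks in explicitly. This is only an illustration: Trace_NAL_type and its pointer parameters are hypothetical names, not part of the project. It also shows why the log branch uses to_string: streaming a UINT8 straight into a narrow stream would print it as a character rather than a number.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>   // std::ostringstream, handy as an in-memory sink when trying this out
#include <string>

typedef unsigned char UINT8;

// Hypothetical helper: writes one line to each sink that is enabled (non-null).
void Trace_NAL_type(UINT8 nalType, std::ostream *console, std::ostream *logFile)
{
    assert(nalType < 32);   // nal_unit_type is a 5-bit field

    // to_string forces numeric formatting of the 8-bit value.
    const std::string line = "NAL Unit Type: " + std::to_string(nalType);

    if (console) { *console << line << '\n'; }
    if (logFile) { *logFile << line << '\n'; }
}
```

Passing a null pointer for a sink plays the role of setting the corresponding TRACE_CONFIG_* switch to 0, except that the decision is made at run time.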

2. Define the SPS class:

Create a new class CSeqParamSet, and put the generated SeqParamSet.h and SeqParamSet.cpp in the "3.NAL Unit" directory.
Following the SPS syntax structure from the official document covered in the previous note, define all the syntax elements one by one as member variables, and add setter functions for them.
Modify SeqParamSet.h:

#ifndef _SEQ_PARAM_SET_H_
#define _SEQ_PARAM_SET_H_

class CSeqParamSet
{
public:
    CSeqParamSet();
    ~CSeqParamSet();

    void  Set_profile_level_idc(UINT8 profile, UINT8 level);
    void  Set_sps_id(UINT8 spsID);
    void  Set_chroma_format_idc(UINT8 chromaFormatIdc);
    void  Set_bit_depth(UINT8 bit_depth_luma, UINT8 bit_depth_chroma);

    void  Set_max_frame_num(UINT32 maxFrameNum);
    void  Set_poc_type(UINT8 pocType);
    void  Set_max_poc_cnt(UINT32 maxPocCnt);
    void  Set_max_num_ref_frames(UINT32 maxRefFrames);
    void  Set_sps_multiple_flags(UINT32 flags);
    void  Set_pic_reslution_in_mbs(UINT16 widthInMBs, UINT16 heightInMapUnits);
    void  Set_frame_crop_offset(UINT32 offsets[4]);

private:
    UINT8  m_profile_idc;
    UINT8  m_level_idc;
    UINT8  m_sps_id;

    // for uncommon profile...
    UINT8  m_chroma_format_idc;
    bool   m_separate_colour_plane_flag;
    UINT8  m_bit_depth_luma;
    UINT8  m_bit_depth_chroma;
    bool   m_qpprime_y_zero_transform_bypass_flag;
    bool   m_seq_scaling_matrix_present_flag;
    // ...for uncommon profile

    UINT32 m_max_frame_num;
    UINT8  m_poc_type;
    UINT32 m_max_poc_cnt;
    UINT32 m_max_num_ref_frames;
    bool   m_gaps_in_frame_num_value_allowed_flag;
    UINT16 m_pic_width_in_mbs;
    UINT16 m_pic_height_in_map_units;
    UINT16 m_pic_height_in_mbs; // actual picture height; not defined in spec, derived...
    bool   m_frame_mbs_only_flag;
    bool   m_mb_adaptive_frame_field_flag;
    bool   m_direct_8x8_inference_flag;
    bool   m_frame_cropping_flag;
    UINT32 m_frame_crop_offset[4];
    bool   m_vui_parameters_present_flag;

    // UINT32 m_reserved;
};

#endif

Implement all the setter functions in SeqParamSet.cpp; each one is a simple assignment:

#include "stdafx.h"
#include "SeqParamSet.h"

CSeqParamSet::CSeqParamSet()
{
}

CSeqParamSet::~CSeqParamSet()
{
}

void CSeqParamSet::Set_profile_level_idc(UINT8 profile, UINT8 level)
{
    m_profile_idc = profile;
    m_level_idc = level;
}

void CSeqParamSet::Set_sps_id(UINT8 sps_id)
{
    m_sps_id = sps_id;
}

void CSeqParamSet::Set_chroma_format_idc(UINT8 chromaFormatIdc)
{
    m_chroma_format_idc = chromaFormatIdc;
}

void CSeqParamSet::Set_bit_depth(UINT8 bit_depth_luma, UINT8 bit_depth_chroma)
{
    m_bit_depth_luma = bit_depth_luma;
    m_bit_depth_chroma = bit_depth_chroma;
}

void CSeqParamSet::Set_max_frame_num(UINT32 maxFrameNum)
{
    m_max_frame_num = maxFrameNum;
}

void CSeqParamSet::Set_poc_type(UINT8 pocType)
{
    m_poc_type = pocType;
}

void CSeqParamSet::Set_max_poc_cnt(UINT32 maxPocCnt)
{
    m_max_poc_cnt = maxPocCnt;
}

void CSeqParamSet::Set_max_num_ref_frames(UINT32 maxRefFrames)
{
    m_max_num_ref_frames = maxRefFrames;
}

void CSeqParamSet::Set_sps_multiple_flags(UINT32 flags)
{
    m_separate_colour_plane_flag = flags & (1 << 21);
    m_qpprime_y_zero_transform_bypass_flag = flags & (1 << 20);
    m_seq_scaling_matrix_present_flag = flags & (1 << 19);

    m_gaps_in_frame_num_value_allowed_flag = flags & (1 << 5);
    m_frame_mbs_only_flag = flags & (1 << 4);
    m_mb_adaptive_frame_field_flag = flags & (1 << 3);
    m_direct_8x8_inference_flag = flags & (1 << 2);
    m_frame_cropping_flag = flags & (1 << 1);
    m_vui_parameters_present_flag = flags & 1;
}

void CSeqParamSet::Set_pic_reslution_in_mbs(UINT16 widthInMBs, UINT16 heightInMapUnits)
{
    m_pic_width_in_mbs = widthInMBs;
    m_pic_height_in_map_units = heightInMapUnits;
    m_pic_height_in_mbs = m_frame_mbs_only_flag ? m_pic_height_in_map_units : 2 * m_pic_height_in_map_units;
}

void CSeqParamSet::Set_frame_crop_offset(UINT32 offsets[4])
{
    for (int idx = 0; idx < 4; idx++)
    {
        m_frame_crop_offset[idx] = offsets[idx];
    }
}
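
Set_sps_multiple_flags relies on a fixed bit layout inside one UINT32: bits 21 to 19 for the high-profile flags and bits 5 to 0 for the rest. A small sketch of that packing scheme with named bit positions may make the layout easier to follow. The constant and function names here are invented for illustration; only the bit numbers come from the code above.

```cpp
#include <cassert>

typedef unsigned int UINT32;

// Bit positions as used by Set_sps_multiple_flags; the names are hypothetical.
namespace SpsFlagBit
{
    const int SeparateColourPlane         = 21;
    const int QpprimeYZeroTransformBypass = 20;
    const int SeqScalingMatrixPresent     = 19;
    const int GapsInFrameNumAllowed       = 5;
    const int FrameMbsOnly                = 4;
    const int MbAdaptiveFrameField        = 3;
    const int Direct8x8Inference          = 2;
    const int FrameCropping               = 1;
    const int VuiParametersPresent        = 0;
}

// Store a one-bit flag at the given bit position and return the updated word.
UINT32 Pack_flag(UINT32 flags, int bit, bool value)
{
    assert(bit >= 0 && bit < 32);
    const UINT32 mask = 1u << bit;
    return value ? (flags | mask) : (flags & ~mask);
}

// Read a one-bit flag back from the given bit position.
bool Unpack_flag(UINT32 flags, int bit)
{
    assert(bit >= 0 && bit < 32);
    return ((flags >> bit) & 1u) != 0;
}
```

For example, packing frame_mbs_only_flag = 1 and frame_cropping_flag = 1 into an empty word gives 0x12 (bit 4 plus bit 1). Naming the positions in one place makes it harder for the packing side and the unpacking side to drift apart.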

3. Unsigned Exp-Golomb data decoding:

This is exactly the same as the unsigned Exp-Golomb decoding implemented in study note 9, so only the code is given below (see note 9 for details). In the 0.Global directory, create a new Utils.h and declare the two functions needed for Exp-Golomb decoding:

#ifndef _UTILS_H_
#define _UTILS_H_
#include "Global.h"

int Get_bit_at_position(UINT8 *buf, UINT8 &bytePosition, UINT8 &bitPosition);
int Get_uev_code_num(UINT8 *buf, UINT8 &bytePosition, UINT8 &bitPosition);

#endif

In the 0.Global directory, create a new Utils.cpp and implement the above two functions:

#include "stdafx.h"
#include "Utils.h"
#include <cassert>   // for assert()

// Return the bit value (0 or 1) at the position given by bytePosition and bitPosition, then advance by one bit
int Get_bit_at_position(UINT8 * buf, UINT8 & bytePosition, UINT8 & bitPosition)
{
    UINT8 mask = 0, val = 0;

    mask = 1 << (7 - bitPosition);
    val = ((buf[bytePosition] & mask) != 0);
    if (++bitPosition > 7)
    {
        bytePosition++;
        bitPosition = 0;
    }

    return val;
}

// Convert the next Exp-Golomb code word into its decimal value
int Get_uev_code_num(UINT8 * buf, UINT8 & bytePosition, UINT8 & bitPosition)
{
    assert(bitPosition < 8);
    UINT8 val = 0, prefixZeroCount = 0;
    int prefix = 0, suffix = 0;

    // Count the leading zero bits; the first 1 bit ends the prefix
    while (true)
    {
        val = Get_bit_at_position(buf, bytePosition, bitPosition);
        if (val == 0)
        {
            prefixZeroCount++;
        }
        else
        {
            break;
        }
    }
    prefix = (1 << prefixZeroCount) - 1;
    // Read the suffix: as many bits as there were leading zeros, most significant bit first
    for (size_t i = 0; i < prefixZeroCount; i++)
    {
        val = Get_bit_at_position(buf, bytePosition, bitPosition);
        suffix += val * (1 << (prefixZeroCount - i - 1));
    }

    prefix += suffix;

    return prefix;
}

You can copy the code from the main function in study note 9 to test it; the decoded results should be output correctly.
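
For a quick check that does not depend on note 9, the following self-contained sketch re-implements the same decoding logic in compact form and supplies a hand-built test vector. BitCursor, Read_bit, Read_ue and kUeTestBytes are hypothetical names chosen for this example, not the project's Utils functions. The code words for 0 to 4 are 1, 010, 011, 00100 and 00101; concatenated and zero-padded they give the bytes 0xA6 0x42 0x80.

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned char UINT8;

// Hypothetical cursor: byte index plus bit index (0 = most significant bit).
struct BitCursor
{
    std::size_t bytePosition;
    UINT8       bitPosition;
};

// Read one bit and advance the cursor.
int Read_bit(const UINT8 *buf, BitCursor &cur)
{
    assert(cur.bitPosition < 8);
    const int val = (buf[cur.bytePosition] >> (7 - cur.bitPosition)) & 1;
    if (++cur.bitPosition > 7)
    {
        cur.bytePosition++;
        cur.bitPosition = 0;
    }
    return val;
}

// Decode one unsigned Exp-Golomb code word: N leading zeros, a 1, then N suffix bits.
unsigned int Read_ue(const UINT8 *buf, BitCursor &cur)
{
    int leadingZeros = 0;
    while (Read_bit(buf, cur) == 0)
    {
        leadingZeros++;
    }

    unsigned int suffix = 0;
    for (int i = 0; i < leadingZeros; i++)
    {
        suffix = (suffix << 1) | static_cast<unsigned int>(Read_bit(buf, cur));
    }
    return (1u << leadingZeros) - 1 + suffix;
}

// Bits: 1 | 010 | 011 | 00100 | 00101 | padding  ->  values 0, 1, 2, 3, 4
const UINT8 kUeTestBytes[3] = { 0xA6, 0x42, 0x80 };
```

Decoding five code words from kUeTestBytes starting at byte 0, bit 0 should yield 0, 1, 2, 3, 4 and leave the cursor at byte 2, bit 1 (17 bits consumed).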

4. Parse the SPS data in NALUnit:

Following the standard, parse the syntax elements in the NAL unit into the values of the SPS member variables. Add the function Parse_as_seq_param_set() to NALUnit.h and NALUnit.cpp to parse the syntax elements; the code is as follows. (Everything can be parsed in the order given by the official document covered in study note 10.)

int CNalUnit::Parse_as_seq_param_set(CSeqParamSet * sps)
{
    UINT8  profile_idc = 0;
    UINT8  level_idc = 0;
    UINT8  sps_id = 0;

    UINT8  chroma_format_idc = 0;
    bool   separate_colour_plane_flag = 0;
    UINT8  bit_depth_luma = 0;
    UINT8  bit_depth_chroma = 0;
    bool   qpprime_y_zero_transform_bypass_flag = 0;
    bool   seq_scaling_matrix_present_flag = 0;

    UINT32 max_frame_num = 0;
    UINT8  poc_type = 0;
    UINT32 max_poc_cnt = 0;
    UINT32 max_num_ref_frames = 0;
    bool   gaps_in_frame_num_value_allowed_flag = 0;
    UINT16 pic_width_in_mbs = 0;
    UINT16 pic_height_in_map_units = 0;
    UINT16 pic_height_in_mbs = 0;   // actual picture height; not defined in spec, derived...
    bool   frame_mbs_only_flag = 0;
    bool   mb_adaptive_frame_field_flag = 0;
    bool   direct_8x8_inference_flag = 0;
    bool   frame_cropping_flag = 0;
    UINT32 frame_crop_offset[4] = { 0 };
    bool   vui_parameters_present_flag = 0;

    UINT8 bytePosition = 3, bitPosition = 0;
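    UINT32 flags = 0;   // Each one-bit flag element found during parsing is stored at its own bit position in flags

    profile_idc = m_pSODB[0];
    // The second byte holds the constraint_set flags, which are not needed for now, so m_pSODB[1] is skipped
    level_idc = m_pSODB[2];
    sps_id = Get_uev_code_num(m_pSODB, bytePosition, bitPosition);  // An unsigned Exp-Golomb code, extracted with the function written earlier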

    if (profile_idc == 100 || profile_idc == 110 || profile_idc == 122 || profile_idc == 244 || profile_idc == 44 ||
        profile_idc == 83 || profile_idc == 86 || profile_idc == 118 || profile_idc == 128)
    {
        chroma_format_idc = Get_uev_code_num(m_pSODB, bytePosition, bitPosition);
        if (chroma_format_idc == 3)
        {
            separate_colour_plane_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
            // Store the extracted flag in the flag set (at the highest available bit)
            flags |= (separate_colour_plane_flag << 21);
        }
        bit_depth_luma = Get_uev_code_num(m_pSODB, bytePosition, bitPosition) + 8;
        bit_depth_chroma = Get_uev_code_num(m_pSODB, bytePosition, bitPosition) + 8;

        qpprime_y_zero_transform_bypass_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
        flags |= (qpprime_y_zero_transform_bypass_flag << 20);

        seq_scaling_matrix_present_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
        flags |= (seq_scaling_matrix_present_flag << 19);
        if (seq_scaling_matrix_present_flag)
        {
            // This part is not needed for now; return an error code instead
            return -1;
        }
    }

    // Rather than keeping log2_max_frame_num, compute the actual value directly
    max_frame_num = 1 << (Get_uev_code_num(m_pSODB, bytePosition, bitPosition) + 4);
    poc_type = Get_uev_code_num(m_pSODB, bytePosition, bitPosition);
    if (0 == poc_type)
    {
        max_poc_cnt = 1 << (Get_uev_code_num(m_pSODB, bytePosition, bitPosition) + 4);
    }
    else
    {
        // This case is not handled for now
        return -1;
    }

    max_num_ref_frames = Get_uev_code_num(m_pSODB, bytePosition, bitPosition);
    gaps_in_frame_num_value_allowed_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
    flags |= (gaps_in_frame_num_value_allowed_flag << 5);   // Many bits are skipped here, leaving room for flags that exist in the syntax but are not implemented

    pic_width_in_mbs = Get_uev_code_num(m_pSODB, bytePosition, bitPosition) + 1;
    pic_height_in_map_units = Get_uev_code_num(m_pSODB, bytePosition, bitPosition) + 1;
    frame_mbs_only_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
    flags |= (frame_mbs_only_flag << 4);
    if (!frame_mbs_only_flag)
    {
        mb_adaptive_frame_field_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
        flags |= (mb_adaptive_frame_field_flag << 3);
    }

    direct_8x8_inference_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
    flags |= (direct_8x8_inference_flag << 2);
    frame_cropping_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
    flags |= (frame_cropping_flag << 1);
    if (frame_cropping_flag)
    {
        for (int idx = 0; idx < 4; idx++)
        {
            frame_crop_offset[idx] = Get_uev_code_num(m_pSODB, bytePosition, bitPosition);
        }
    }
    vui_parameters_present_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
    flags |= vui_parameters_present_flag;
    // Bitstream parsing complete

    sps->Set_profile_level_idc(profile_idc, level_idc);
    sps->Set_sps_id(sps_id);
    sps->Set_chroma_format_idc(chroma_format_idc);
    sps->Set_bit_depth(bit_depth_luma, bit_depth_chroma);
    sps->Set_max_frame_num(max_frame_num);
    sps->Set_poc_type(poc_type);
    sps->Set_max_poc_cnt(max_poc_cnt);
    sps->Set_max_num_ref_frames(max_num_ref_frames);
    sps->Set_sps_multiple_flags(flags);
    sps->Set_pic_reslution_in_mbs(pic_width_in_mbs, pic_height_in_map_units);
    if (frame_cropping_flag)
    {
        sps->Set_frame_crop_offset(frame_crop_offset);
    }
    return 0;
}
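
The parser stores max_frame_num and max_poc_cnt as actual counts rather than as the coded log2 values. The bitstream carries log2_max_frame_num_minus4 and log2_max_pic_order_cnt_lsb_minus4, and, as far as I recall from the H.264 standard, both are limited to the range 0 to 12. A minimal sketch of that derivation with the range check made explicit follows; Derive_max_count is a hypothetical helper name, not part of the project.

```cpp
#include <cassert>

typedef unsigned int UINT32;

// Turn a coded "log2 minus 4" value into the actual count: 2^(value + 4).
// The 0..12 limit is the range stated in the standard for both syntax elements.
UINT32 Derive_max_count(UINT32 log2MinusFour)
{
    assert(log2MinusFour <= 12);
    return 1u << (log2MinusFour + 4);
}
```

A coded value of 0 gives 16 and the maximum coded value of 12 gives 65536, so the result always fits comfortably in a UINT32. Checking the range before shifting also guards against a corrupted stream producing an oversized shift.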

5. Add the calling part:

Go back to Stream.cpp and find the Parse_h264_bitstream() function. In study note 6, the extraction of nalType was completed and the SODB data obtained; now add the code that parses the sequence parameter set (SPS) after it:

CNalUnit nalUnit(&m_nalVec[1], m_nalVec.size() - 1);
switch (nalType)
{
    case 7:
        // Parse the SPS NAL data; release any previously parsed SPS first
        if (m_sps)
        {
            delete m_sps;
        }
        m_sps = new CSeqParamSet;
        nalUnit.Parse_as_seq_param_set(m_sps);
        break;
    default:
        break;
}

You can single-step through it in the debugger, focusing on the two values pic_width_in_mbs and pic_height_in_map_units, which are the picture width and height in macroblock units. The video used for this debugging is still the one from study note 3, which was encoded with these parameters:

SourceWidth           = 176    # Image width in Pels, must be multiple of 16
SourceHeight          = 144    # Image height in Pels, must be multiple of 16

The resolution in macroblocks is the pixel resolution divided by 16, so the width should be 11 (176 / 16) and the height 9 (144 / 16). If the two parsed values match these, the program is basically working correctly.
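
The check can also be run in the other direction, turning the parsed macroblock counts back into a pixel resolution. The sketch below is an illustration under stated assumptions: it assumes 4:2:0 chroma (chroma_format_idc equal to 1), for which the standard counts crop offsets in units of 2 luma samples horizontally and 2 * (2 - frame_mbs_only_flag) vertically, and it assumes the offsets are stored in the order left, right, top, bottom, as they are parsed. PictureSize and Luma_size_in_pixels are hypothetical names.

```cpp
#include <cassert>

typedef unsigned int UINT32;

// Hypothetical result type for this sketch.
struct PictureSize
{
    UINT32 width;
    UINT32 height;
};

// Derive the cropped luma resolution in pixels from SPS values.
// Assumes 4:2:0 chroma; cropOffsets holds left, right, top, bottom.
PictureSize Luma_size_in_pixels(UINT32 widthInMbs, UINT32 heightInMapUnits,
                                bool frameMbsOnly, const UINT32 cropOffsets[4])
{
    const UINT32 fieldFactor = frameMbsOnly ? 1 : 2;   // 2 - frame_mbs_only_flag
    const UINT32 heightInMbs = fieldFactor * heightInMapUnits;
    const UINT32 cropUnitX = 2;
    const UINT32 cropUnitY = 2 * fieldFactor;

    const UINT32 cropX = cropUnitX * (cropOffsets[0] + cropOffsets[1]);
    const UINT32 cropY = cropUnitY * (cropOffsets[2] + cropOffsets[3]);
    assert(cropX < widthInMbs * 16 && cropY < heightInMbs * 16);

    PictureSize size;
    size.width  = widthInMbs * 16 - cropX;
    size.height = heightInMbs * 16 - cropY;
    return size;
}
```

With 11 x 9 macroblocks, frame_mbs_only_flag set and no cropping, this gives 176 x 144, matching the encoder settings above. Cropping matters for sizes that are not multiples of 16: 120 x 68 macroblocks with a bottom crop offset of 4 gives 1920 x 1080.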