1. Preparation:
Go back to the earlier SimpleH264Analyzer program, locate the SPS information, and parse it.
Adjust the project directory structure:
Modify the code in Global.h and add a new data type, UINT16. In the project as written so far, the UINT8 and UINT32 type names were spelled in lowercase. To follow common naming conventions, change them to all uppercase (Ctrl+H can replace across the entire solution).
typedef unsigned char UINT8;
typedef unsigned short UINT16;
typedef unsigned int UINT32;
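These typedefs rely on the compiler using 8, 16, and 32 bits for the three base types, which the C++ standard does not guarantee. A minimal sketch, assuming a C++11 compiler, that makes the build fail early if that assumption does not hold (the checks are an addition and not part of the original Global.h):

```cpp
#include <climits>

typedef unsigned char UINT8;
typedef unsigned short UINT16;
typedef unsigned int UINT32;

// Stop compilation if a type does not have the width its name promises.
static_assert(CHAR_BIT == 8, "UINT8 is expected to be 8 bits wide");
static_assert(sizeof(UINT16) * CHAR_BIT == 16, "UINT16 is expected to be 16 bits wide");
static_assert(sizeof(UINT32) * CHAR_BIT == 32, "UINT32 is expected to be 32 bits wide");
```

An alternative would be to build the typedefs on the fixed-width types from <cstdint>.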
The program will produce more and more output as it grows, and sending all of it to the console would become messy. The output therefore goes two ways: to the console and to a log file. Proceed as follows:
1 Create a new file Configuration.h, put it in the 1.Application directory, and add the code:
#ifndef _CONFIGURATION_H_
#define _CONFIGURATION_H_
#include <fstream>
#define TRACE_CONFIG_CONSOLE 1
#define TRACE_CONFIG_LOGOUT 1
extern std::ofstream g_traceFile;
#endif
2 Create a new file Configuration.cpp, put it in the 1.Application directory, and add the code:
#include "stdafx.h"
#include "Configuration.h"
#if TRACE_CONFIG_LOGOUT
std::ofstream g_traceFile;
#endif
3 In stdafx.h, add the includes:
#include <string>
#include "Configuration.h"
4 Whether to write to the log file is decided in the constructor in Stream.cpp. In CStreamFile::CStreamFile(TCHAR * fileName), add:
#if TRACE_CONFIG_LOGOUT
g_traceFile.open(L"trace.txt");
if (!g_traceFile.is_open())
{
file_error(1);
}
g_traceFile << "Trace file:" << endl;
#endif
In the destructor CStreamFile::~CStreamFile(), add:
#if TRACE_CONFIG_LOGOUT
if (g_traceFile.is_open())
{
g_traceFile.close();
}
#endif
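Because the switches in Configuration.h are defined with a value of 0 or 1, #if and #ifdef treat them differently: #if tests the value, while #ifdef only tests whether the name is defined at all. A minimal sketch of the difference, using a hypothetical macro SKETCH_TRACE_SWITCH that exists only for this illustration:

```cpp
// Hypothetical switch for this sketch only: defined, but turned off.
#define SKETCH_TRACE_SWITCH 0

// True when a block guarded by "#if SKETCH_TRACE_SWITCH" would be compiled in.
constexpr bool Enabled_with_if()
{
#if SKETCH_TRACE_SWITCH
    return true;
#else
    return false;
#endif
}

// True when a block guarded by "#ifdef SKETCH_TRACE_SWITCH" would be compiled in.
constexpr bool Enabled_with_ifdef()
{
#ifdef SKETCH_TRACE_SWITCH
    return true;
#else
    return false;
#endif
}
```

With the switch set to 0, the #if version is disabled but the #ifdef version is still compiled in, so a 0/1 switch should be tested with #if throughout.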
When the log file fails to open, file_error(1) is called, so modify the void CStreamFile::file_error(int idx) function and add a case for error code 1:
case 1:
wcout << L"Error: opening trace file failed." << endl;
break;
After completing the above configuration, compile and run the program; a trace.txt file is generated in the \bin\Debug directory with the string "Trace file:" written to it.
To replace the earlier direct console output, add a new function to the CStreamFile class. First declare it (as private) in Stream.h:
void dump_NAL_type(UINT8 nalType);
Add the implementation of this function in Stream.cpp:
void CStreamFile::dump_NAL_type(UINT8 nalType)
{
#if TRACE_CONFIG_CONSOLE
wcout << L"NAL Unit Type: " << nalType << endl;
#endif
#if TRACE_CONFIG_LOGOUT
g_traceFile << "NAL Unit Type: " << to_string(nalType) << endl;
#endif
}
Change the wcout output in the Parse_h264_bitstream() function to call the new function:
dump_NAL_type(nalType);
Recompile and run. Since both the console and log file switches are turned on at this point, the NAL Unit Type output appears both in the console and in trace.txt.
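The two-way output can also be tried in isolation. The sketch below is a hypothetical variant (the name Dump_nal_type_sketch and the stream parameters are assumptions, not part of the project) that takes the two destinations as arguments so that it can be exercised without a real console or file. It also shows why the value is converted before going to a narrow stream: an unsigned char written to a narrow stream is emitted as a character, not as a number, which is presumably why the log line above uses to_string.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes the same line to a "console" stream and a "log" stream.
void Dump_nal_type_sketch(std::uint8_t nalType, std::ostream &console, std::ostream &log)
{
    assert(nalType <= 31); // nal_unit_type is a 5-bit field
    // Convert to int first; a narrow stream would otherwise print a raw character.
    console << "NAL Unit Type: " << static_cast<int>(nalType) << '\n';
    // std::to_string promotes the value to int and formats it as decimal text.
    log << "NAL Unit Type: " << std::to_string(nalType) << '\n';
}
```

Passing two std::ostringstream objects makes it easy to check that both destinations receive the same text.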
2. Define the SPS class:
Create a new class CSeqParamSet, and put the generated SeqParamSet.h and SeqParamSet.cpp in the "3.NAL Unit" directory.
Following the syntax structure from the official document covered in the previous note, define all the syntax elements one by one and add setter functions for them:
Modify SeqParamSet.h:
#ifndef _SEQ_PARAM_SET_H_
#define _SEQ_PARAM_SET_H_
class CSeqParamSet
{
public:
CSeqParamSet();
~CSeqParamSet();
void Set_profile_level_idc(UINT8 profile, UINT8 level);
void Set_sps_id(UINT8 spsID);
void Set_chroma_format_idc(UINT8 chromaFormatIdc);
void Set_bit_depth(UINT8 bit_depth_luma, UINT8 bit_depth_chroma);
void Set_max_frame_num(UINT32 maxFrameNum);
void Set_poc_type(UINT8 pocType);
void Set_max_poc_cnt(UINT32 maxPocCnt);
void Set_max_num_ref_frames(UINT32 maxRefFrames);
void Set_sps_multiple_flags(UINT32 flags);
void Set_pic_reslution_in_mbs(UINT16 widthInMBs, UINT16 heightInMapUnits);
void Set_frame_crop_offset(UINT32 offsets[4]);
private:
UINT8 m_profile_idc;
UINT8 m_level_idc;
UINT8 m_sps_id;
// for uncommon profile...
UINT8 m_chroma_format_idc;
bool m_separate_colour_plane_flag;
UINT8 m_bit_depth_luma;
UINT8 m_bit_depth_chroma;
bool m_qpprime_y_zero_transform_bypass_flag;
bool m_seq_scaling_matrix_present_flag;
// ...for uncommon profile
UINT32 m_max_frame_num;
UINT8 m_poc_type;
UINT32 m_max_poc_cnt;
UINT32 m_max_num_ref_frames;
bool m_gaps_in_frame_num_value_allowed_flag;
UINT16 m_pic_width_in_mbs;
UINT16 m_pic_height_in_map_units;
UINT16 m_pic_height_in_mbs; // actual picture height in macroblocks; not defined in spec, derived...
bool m_frame_mbs_only_flag;
bool m_mb_adaptive_frame_field_flag;
bool m_direct_8x8_inference_flag;
bool m_frame_cropping_flag;
UINT32 m_frame_crop_offset[4];
bool m_vui_parameters_present_flag;
// UINT32 m_reserved;
};
#endif
Implement all the setter functions in SeqParamSet.cpp; each one is a simple assignment:
#include "stdafx.h"
#include "SeqParamSet.h"
CSeqParamSet::CSeqParamSet()
{
}
CSeqParamSet::~CSeqParamSet()
{
}
void CSeqParamSet::Set_profile_level_idc(UINT8 profile, UINT8 level)
{
m_profile_idc = profile;
m_level_idc = level;
}
void CSeqParamSet::Set_sps_id(UINT8 sps_id)
{
m_sps_id = sps_id;
}
void CSeqParamSet::Set_chroma_format_idc(UINT8 chromaFormatIdc)
{
m_chroma_format_idc = chromaFormatIdc;
}
void CSeqParamSet::Set_bit_depth(UINT8 bit_depth_luma, UINT8 bit_depth_chroma)
{
m_bit_depth_luma = bit_depth_luma;
m_bit_depth_chroma = bit_depth_chroma;
}
void CSeqParamSet::Set_max_frame_num(UINT32 maxFrameNum)
{
m_max_frame_num = maxFrameNum;
}
void CSeqParamSet::Set_poc_type(UINT8 pocType)
{
m_poc_type = pocType;
}
void CSeqParamSet::Set_max_poc_cnt(UINT32 maxPocCnt)
{
m_max_poc_cnt = maxPocCnt;
}
void CSeqParamSet::Set_max_num_ref_frames(UINT32 maxRefFrames)
{
m_max_num_ref_frames = maxRefFrames;
}
void CSeqParamSet::Set_sps_multiple_flags(UINT32 flags)
{
m_separate_colour_plane_flag = flags & (1 << 21);
m_qpprime_y_zero_transform_bypass_flag = flags & (1 << 20);
m_seq_scaling_matrix_present_flag = flags & (1 << 19);
m_gaps_in_frame_num_value_allowed_flag = flags & (1 << 5);
m_frame_mbs_only_flag = flags & (1 << 4);
m_mb_adaptive_frame_field_flag = flags & (1 << 3);
m_direct_8x8_inference_flag = flags & (1 << 2);
m_frame_cropping_flag = flags & (1 << 1);
m_vui_parameters_present_flag = flags & 1;
}
void CSeqParamSet::Set_pic_reslution_in_mbs(UINT16 widthInMBs, UINT16 heightInMapUnits)
{
m_pic_width_in_mbs = widthInMBs;
m_pic_height_in_map_units = heightInMapUnits;
m_pic_height_in_mbs = m_frame_mbs_only_flag ? m_pic_height_in_map_units : 2 * m_pic_height_in_map_units;
}
void CSeqParamSet::Set_frame_crop_offset(UINT32 offsets[4])
{
for (int idx = 0; idx < 4; idx++)
{
m_frame_crop_offset[idx] = offsets[idx];
}
}
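Set_sps_multiple_flags depends on every flag sitting at an agreed bit position inside one 32-bit word. A minimal sketch of that pack/unpack idea, with hypothetical helper names (Pack_flag, Unpack_flag) and only two of the bit positions from the code above:

```cpp
#include <cassert>
#include <cstdint>

// Bit positions taken from the setter above for two of the flags.
const int kFrameMbsOnlyBit = 4;
const int kFrameCroppingBit = 1;

// Store a single flag at the given bit position.
std::uint32_t Pack_flag(std::uint32_t flags, bool value, int bit)
{
    assert(bit >= 0 && bit < 32);
    return flags | (static_cast<std::uint32_t>(value) << bit);
}

// Read a single flag back; the comparison makes the conversion to bool explicit.
bool Unpack_flag(std::uint32_t flags, int bit)
{
    assert(bit >= 0 && bit < 32);
    return (flags & (std::uint32_t(1) << bit)) != 0;
}
```

The writer and the reader must use the same position for the same flag; sharing named constants, as sketched here, is one way to keep the two sides in step.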
3. Unsigned Exp-Golomb decoding:
This is exactly the same as the unsigned Exp-Golomb decoding implemented in study note 9, so only the code is given here (see note 9 for details). In the 0.Global directory, create a new Utils.h and declare the two functions needed for Exp-Golomb decoding:
#ifndef _UTILS_H_
#define _UTILS_H_
#include "Global.h"
int Get_bit_at_position(UINT8 *buf, UINT8 &bytePosition, UINT8 &bitPosition);
int Get_uev_code_num(UINT8 *buf, UINT8 &bytePosition, UINT8 &bitPosition);
#endif
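In the 0.Global directory, create a new Utils.cpp and implement the two functions:
#include "stdafx.h"
#include <cassert> // for assert(), in case stdafx.h does not already pull it in
#include "Utils.h"
// Return the bit (0 or 1) at bytePosition/bitPosition and advance the position by one bit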
int Get_bit_at_position(UINT8 * buf, UINT8 & bytePosition, UINT8 & bitPosition)
{
UINT8 mask = 0, val = 0;
mask = 1 << (7 - bitPosition);
val = ((buf[bytePosition] & mask) != 0);
if (++bitPosition > 7)
{
bytePosition++;
bitPosition = 0;
}
return val;
}
// Decode the next unsigned Exp-Golomb code and return its value
int Get_uev_code_num(UINT8 * buf, UINT8 & bytePosition, UINT8 & bitPosition)
{
assert(bitPosition < 8);
UINT8 val = 0, prefixZeroCount = 0;
int prefix = 0, surfix = 0;
while (true)
{
val = Get_bit_at_position(buf, bytePosition, bitPosition);
if (val == 0)
{
prefixZeroCount++;
}
else
{
break;
}
}
prefix = (1 << prefixZeroCount) - 1;
for (size_t i = 0; i < prefixZeroCount; i++)
{
val = Get_bit_at_position(buf, bytePosition, bitPosition);
surfix += val * (1 << (prefixZeroCount - i - 1));
}
prefix += surfix;
return prefix;
}
To test, copy the code from the main function in study note 9; the decoded results should be output correctly.
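For a quick check without going back to note 9, the decoding rule can be worked through on a tiny buffer. The sketch below is a compact, self-contained variant and not the project code: the BitCursor struct and the function names are assumptions, and, like the original, it does no bounds checking on the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical bit cursor; the project code uses two UINT8 reference parameters instead.
struct BitCursor
{
    std::size_t bytePos;
    unsigned bitPos; // 0 is the most significant bit of the byte
};

// Read one bit (most significant bit first) and advance the cursor.
int Read_bit(const std::uint8_t *buf, BitCursor &cur)
{
    assert(cur.bitPos < 8);
    int bit = (buf[cur.bytePos] >> (7 - cur.bitPos)) & 1;
    if (++cur.bitPos == 8)
    {
        cur.bitPos = 0;
        cur.bytePos++;
    }
    return bit;
}

// Decode one ue(v) value: codeNum = 2^leadingZeros - 1 + suffix.
unsigned Read_uev(const std::uint8_t *buf, BitCursor &cur)
{
    unsigned leadingZeros = 0;
    while (Read_bit(buf, cur) == 0)
    {
        leadingZeros++;
    }
    assert(leadingZeros < 32);
    unsigned suffix = 0;
    for (unsigned i = 0; i < leadingZeros; i++)
    {
        suffix = (suffix << 1) | static_cast<unsigned>(Read_bit(buf, cur));
    }
    return (1u << leadingZeros) - 1 + suffix;
}
```

The two bytes 0xA6 0x40 hold the bit string 1 010 011 00100 followed by padding, so four consecutive calls to Read_uev return 0, 1, 2, and 3, leaving the cursor at byte 1, bit 4.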
4. Parse the SPS data in the NAL unit:
Parse the syntax elements in the NAL unit into the member variables of the SPS according to the specification. Add the function Parse_as_seq_param_set() to NALUnit.h and NALUnit.cpp to parse the syntax elements; the code is as follows. (Everything can be parsed in the order given by the official document in study note 10.)
int CNalUnit::Parse_as_seq_param_set(CSeqParamSet * sps)
{
UINT8 profile_idc = 0;
UINT8 level_idc = 0;
UINT8 sps_id = 0;
UINT8 chroma_format_idc = 0;
bool separate_colour_plane_flag = 0;
UINT8 bit_depth_luma = 0;
UINT8 bit_depth_chroma = 0;
bool qpprime_y_zero_transform_bypass_flag = 0;
bool seq_scaling_matrix_present_flag = 0;
UINT32 max_frame_num = 0;
UINT8 poc_type = 0;
UINT32 max_poc_cnt = 0;
UINT32 max_num_ref_frames = 0;
bool gaps_in_frame_num_value_allowed_flag = 0;
UINT16 pic_width_in_mbs = 0;
UINT16 pic_height_in_map_units = 0;
UINT16 pic_height_in_mbs = 0; // actual picture height in macroblocks; not defined in spec, derived...
bool frame_mbs_only_flag = 0;
bool mb_adaptive_frame_field_flag = 0;
bool direct_8x8_inference_flag = 0;
bool frame_cropping_flag = 0;
UINT32 frame_crop_offset[4] = { 0 };
bool vui_parameters_present_flag = 0;
UINT8 bytePosition = 3, bitPosition = 0;
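UINT32 flags = 0; // the various flag elements found along the way take one bit each and are collected into flags in order
profile_idc = m_pSODB[0];
// The second byte, m_pSODB[1], holds the constraint_set flags; not needed for now, so it is skipped
level_idc = m_pSODB[2];
sps_id = Get_uev_code_num(m_pSODB, bytePosition, bitPosition); // an unsigned Exp-Golomb code, extracted with the function written earlier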
if (profile_idc == 100 || profile_idc == 110 || profile_idc == 122 || profile_idc == 244 || profile_idc == 44 ||
profile_idc == 83 || profile_idc == 86 || profile_idc == 118 || profile_idc == 128)
{
chroma_format_idc = Get_uev_code_num(m_pSODB, bytePosition, bitPosition);
if (chroma_format_idc == 3)
{
separate_colour_plane_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
// put each extracted flag into the flag set (at the highest bit still available)
flags |= (separate_colour_plane_flag << 21);
}
bit_depth_luma = Get_uev_code_num(m_pSODB, bytePosition, bitPosition) + 8;
bit_depth_chroma = Get_uev_code_num(m_pSODB, bytePosition, bitPosition) + 8;
qpprime_y_zero_transform_bypass_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
flags |= (qpprime_y_zero_transform_bypass_flag << 20);
seq_scaling_matrix_present_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
flags |= (seq_scaling_matrix_present_flag << 19);
if (seq_scaling_matrix_present_flag)
{
// this part is not needed for now; return an error code instead
return -1;
}
}
// instead of keeping log2_max_frame_num, compute the actual value directly
max_frame_num = 1 << (Get_uev_code_num(m_pSODB, bytePosition, bitPosition) + 4);
poc_type = Get_uev_code_num(m_pSODB, bytePosition, bitPosition);
if (0 == poc_type)
{
max_poc_cnt = 1 << (Get_uev_code_num(m_pSODB, bytePosition, bitPosition) + 4);
}
else
{
// this case is not handled for now
return -1;
}
max_num_ref_frames = Get_uev_code_num(m_pSODB, bytePosition, bitPosition);
gaps_in_frame_num_value_allowed_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
flags |= (gaps_in_frame_num_value_allowed_flag << 5); // many bits are skipped here, reserving room for flags that exist in the spec but are not implemented yet
pic_width_in_mbs = Get_uev_code_num(m_pSODB, bytePosition, bitPosition) + 1;
pic_height_in_map_units = Get_uev_code_num(m_pSODB, bytePosition, bitPosition) + 1;
frame_mbs_only_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
flags |= (frame_mbs_only_flag << 4);
if (!frame_mbs_only_flag)
{
mb_adaptive_frame_field_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
flags |= (mb_adaptive_frame_field_flag << 3);
}
direct_8x8_inference_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
flags |= (direct_8x8_inference_flag << 2);
frame_cropping_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
flags |= (frame_cropping_flag << 1);
if (frame_cropping_flag)
{
for (int idx = 0; idx < 4; idx++)
{
frame_crop_offset[idx] = Get_uev_code_num(m_pSODB, bytePosition, bitPosition);
}
}
vui_parameters_present_flag = Get_bit_at_position(m_pSODB, bytePosition, bitPosition);
flags |= vui_parameters_present_flag;
// bitstream parsing finished
sps->Set_profile_level_idc(profile_idc, level_idc);
sps->Set_sps_id(sps_id);
sps->Set_chroma_format_idc(chroma_format_idc);
sps->Set_bit_depth(bit_depth_luma, bit_depth_chroma);
sps->Set_max_frame_num(max_frame_num);
sps->Set_poc_type(poc_type);
sps->Set_max_poc_cnt(max_poc_cnt);
sps->Set_max_num_ref_frames(max_num_ref_frames);
sps->Set_sps_multiple_flags(flags);
sps->Set_pic_reslution_in_mbs(pic_width_in_mbs, pic_height_in_map_units);
if (frame_cropping_flag)
{
sps->Set_frame_crop_offset(frame_crop_offset);
}
return 0;
}
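Two values in the function above are derived from what is coded in the bitstream, not read from it directly: max_frame_num comes from log2_max_frame_num_minus4, and the frame height in macroblocks comes from pic_height_in_map_units together with frame_mbs_only_flag. A minimal sketch of that arithmetic on its own (the helper names are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// MaxFrameNum = 2^(log2_max_frame_num_minus4 + 4); the coded value lies in the range 0..12.
std::uint32_t Max_frame_num(unsigned log2MaxFrameNumMinus4)
{
    assert(log2MaxFrameNumMinus4 <= 12);
    return std::uint32_t(1) << (log2MaxFrameNumMinus4 + 4);
}

// Frame height in macroblocks: map units count once for frame-only streams, twice otherwise.
std::uint32_t Frame_height_in_mbs(std::uint32_t heightInMapUnits, bool frameMbsOnly)
{
    return (frameMbsOnly ? 1u : 2u) * heightInMapUnits;
}
```

For example, a coded value of 0 gives a max_frame_num of 16, and 9 map units in a frame-only stream give a height of 9 macroblocks.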
5. Add the calling part:
Go back to Stream.cpp and find the Parse_h264_bitstream() function. In study note 6, extraction of nalType was completed and the SODB data was obtained; the parsing of the sequence parameter set (SPS) is added after that.
CNalUnit nalUint(&m_nalVec[1], m_nalVec.size() - 1);
switch (nalType)
{
case 7:
// parse the SPS NAL data
if (m_sps)
{
delete m_sps;
}
m_sps = new CSeqParamSet;
nalUint.Parse_as_seq_param_set(m_sps);
break;
default:
break;
}
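The delete/new pair above can also be expressed with std::unique_ptr, which releases the previous SPS automatically when a new one is stored. The sketch below is a hypothetical, self-contained variant: SpsSketch and StreamSketch are stand-ins invented for illustration, not the project's CSeqParamSet and CStreamFile.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for CSeqParamSet, only so that the sketch compiles on its own.
struct SpsSketch
{
};

// Hypothetical stand-in for the part of CStreamFile that owns the current SPS.
struct StreamSketch
{
    std::unique_ptr<SpsSketch> sps;

    // nal_unit_type is the low 5 bits of the first NAL byte.
    static std::uint8_t Nal_type(std::uint8_t header)
    {
        return static_cast<std::uint8_t>(header & 0x1F);
    }

    // Replace the stored SPS whenever an SPS NAL unit (type 7) arrives.
    void On_nal(std::uint8_t header)
    {
        assert((header & 0x80) == 0); // forbidden_zero_bit must be 0
        if (Nal_type(header) == 7)
        {
            sps.reset(new SpsSketch()); // the previous object, if any, is deleted here
        }
    }
};
```

A header byte of 0x67 has type 7 and replaces the stored SPS, while 0x68 (type 8) leaves it untouched.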
Single-step through it in the debugger, focusing on the two parameters pic_width_in_mbs and pic_height_in_map_units, which are the picture width and height in macroblock units. The video used for this debugging session is still the one from study note 3. The parameters set previously were:
SourceWidth = 176 # Image width in Pels, must be multiple of 16
SourceHeight = 144 # Image height in Pels, must be multiple of 16
The resolution in macroblocks is the pixel resolution divided by 16, that is, a width of 11 and a height of 9. If these two parameters match, the program is basically working correctly.
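The same check can be run in the other direction, from the parsed SPS values back to pixels. The sketch below goes slightly beyond the text by also applying the frame cropping offsets, and it assumes 4:2:0 content (chroma_format_idc equal to 1); the names PictureSize and Luma_size_420 are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct PictureSize
{
    std::uint32_t width;
    std::uint32_t height;
};

// Luma size in pixels, assuming 4:2:0 content (chroma_format_idc == 1).
// cropOffsets follows the order of the SPS syntax: left, right, top, bottom.
PictureSize Luma_size_420(std::uint32_t widthInMbs, std::uint32_t heightInMapUnits,
                          bool frameMbsOnly, const std::uint32_t cropOffsets[4])
{
    const std::uint32_t fieldFactor = frameMbsOnly ? 1u : 2u; // 2 - frame_mbs_only_flag
    const std::uint32_t heightInMbs = fieldFactor * heightInMapUnits;
    const std::uint32_t cropUnitX = 2;               // SubWidthC for 4:2:0
    const std::uint32_t cropUnitY = 2 * fieldFactor; // SubHeightC * (2 - frame_mbs_only_flag)
    const std::uint32_t cropX = cropUnitX * (cropOffsets[0] + cropOffsets[1]);
    const std::uint32_t cropY = cropUnitY * (cropOffsets[2] + cropOffsets[3]);
    assert(cropX < 16 * widthInMbs && cropY < 16 * heightInMbs);
    PictureSize size = { 16 * widthInMbs - cropX, 16 * heightInMbs - cropY };
    return size;
}
```

With 11 by 9 macroblocks and no cropping this gives 176 by 144, matching the encoder settings above. A stream with 120 by 68 macroblocks and a bottom crop offset of 4 would give 1920 by 1080.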