Parsing the 'avcC' box inside the MP4 'stsd' box and sending the H.264 sequence header packet

Analysis of the stsd box


Analysis of the avc1 entry inside stsd


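Field	Bytes	Meaning
size	4	Size of the entry in bytes
Data format	4	Sample format (four-character code, here avc1)
Reserved	6	Six bytes that must be set to 0.
Data reference index	2	Index of the data reference used to retrieve the data for samples that use this sample description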
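
The four fields above can be read with a few lines of C. This is a minimal sketch, not the author's code: the names `sample_entry_header`, `parse_sample_entry_header`, `read_be16` and `read_be32` are made up for illustration, and the only assumptions are the 16-byte layout from the table and the fact that MP4 stores multi-byte integers big-endian.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* MP4 stores multi-byte integers big-endian. */
static uint16_t read_be16(const uint8_t *p) {
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t read_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Hypothetical holder for the first 16 bytes of a sample entry. */
typedef struct {
    uint32_t size;                  /* size of the whole entry in bytes */
    char     format[5];             /* four-character code plus NUL */
    uint16_t data_reference_index;
} sample_entry_header;

/* Returns 0 on success, -1 if the buffer is too short or the
 * six reserved bytes are not all zero. */
static int parse_sample_entry_header(const uint8_t *buf, size_t len,
                                     sample_entry_header *out) {
    static const uint8_t zeros[6] = {0};
    if (len < 16) return -1;
    out->size = read_be32(buf);
    memcpy(out->format, buf + 4, 4);
    out->format[4] = '\0';
    if (memcmp(buf + 8, zeros, 6) != 0) return -1;
    out->data_reference_index = read_be16(buf + 14);
    return 0;
}
```

For an avc1 entry, `format` would come back as the string "avc1" and `data_reference_index` is typically 1.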

Width and height of the encoded video in avc1, and the 32-byte compressor name


size(4)	type(4)	version(2)	Revision level(2)	Vendor(4)	Temporal quality(4)	Spatial quality(4)	Width(2)	Height(2)	Horizontal resolution(4)	Vertical resolution(4)	Data size(4)	Frame count(2)	Compressor name(32)	Depth(2)	Color table ID(2)

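Field	Bytes	Meaning
box size	4	Size of the box
box type	4	Type of the box
version	2	Version, 0 or 1, usually 0 (the byte counts below assume version = 0)
Revision level	2	Must be set to 0.
Vendor	4	Developer of the compressor that generated the data
Temporal quality	4	Degree of temporal compression
Spatial quality	4	Degree of spatial compression (image quality)
Width	2	Width of the image in pixels
Height	2	Height of the image in pixels
Horizontal resolution	4	Horizontal resolution in pixels per inch (16.16 fixed point)
Vertical resolution	4	Vertical resolution in pixels per inch (16.16 fixed point)
Data size	4	A 32-bit integer that must be set to 0
Frame count	2	A 16-bit integer that indicates how many frames of compressed data are stored in each sample. Usually set to 1.
Compressor name	32	A 32-byte Pascal string containing the name of the compressor that created the image, such as “jpeg”
Depth	2	A 16-bit integer giving the pixel depth of the compressed image. Values of 1, 2, 4, 8, 16, 24 and 32 indicate the depth of color images; 32 should be used only if the image contains an alpha channel. Values of 34, 36 and 40 indicate 2-, 4- and 8-bit grayscale images respectively.
Color table ID	2	A 16-bit integer that identifies the color table to use. If this field is set to -1, the default color table is used for the specified depth. For depths below 16 bits per pixel this means a standard Macintosh color table for that depth. Depths of 16, 24 and 32 have no color table. If the color table ID is set to 0, the color table is contained in the sample description itself.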

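A special note on the compressor name: this field occupies 32 bytes (initialised as 0x00 × 32). The first byte gives the length of the name, and the remaining 31 bytes hold the name itself, padded with zeros.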

aligned(8) abstract class SampleEntry (unsigned int(32) format) extends Box(format){
        const unsigned int(8)[6] reserved = 0;    // 6 reserved bytes, all 0
        unsigned int(16) data_reference_index;  // 2 bytes: index of the data reference
    }

    // Hint track entry: protocol-specific data follows as a byte array
    class HintSampleEntry() extends SampleEntry (protocol) { 
        unsigned int(8) data [];
    }
    // Visual Sequences    (video entry)

    class VisualSampleEntry(codingname) extends SampleEntry (codingname){ 
        unsigned int(16) pre_defined = 0;       // 2 bytes, pre_defined
        const unsigned int(16) reserved = 0;    // 2 bytes, reserved
        unsigned int(32)[3] pre_defined = 0;    // 3*4 bytes, pre_defined
        unsigned int(16) width;                 // 2 bytes, width
        unsigned int(16) height;                // 2 bytes, height
        template unsigned int(32) horizresolution = 0x00480000; // 72 dpi, horizontal, 4 bytes
        template unsigned int(32) vertresolution = 0x00480000;  // 72 dpi, vertical, 4 bytes
        const unsigned int(32) reserved = 0;                    // 4 bytes, reserved
        template unsigned int(16) frame_count = 1;              // 2 bytes, frame_count
        string[32] compressorname;                              // 32 bytes, compressorname
        // Note: compressorname occupies 32 bytes (0x00 * 32).
        // The first byte gives the length of the name,
        // the remaining 31 bytes hold the name itself.
        // Example:
        //  0x04,                       // length byte: the name is 4 characters long
        //      0x67, 0x31, 0x31, 0x31, // the name, here "g111"
        //      0x00, 0x00, 0x00, 0x00, // zero padding up to 32 bytes in total
        //      0x00, 0x00, 0x00, 0x00,
        //      0x00, 0x00, 0x00, 0x00,
        //      0x00, 0x00, 0x00, 0x00,
        //      0x00, 0x00, 0x00, 0x00,
        //      0x00, 0x00, 0x00, 0x00,
        //      0x00, 0x00, 0x00,

        template unsigned int(16) depth = 0x0018;               // 2 bytes, color depth
        int(16) pre_defined = -1;                               // 2 bytes, pre_defined
    }
       // Audio Sequences   (audio entry)

    class AudioSampleEntry(codingname) extends SampleEntry (codingname){ 
        const unsigned int(32)[2] reserved = 0;                             // 2*4 bytes, reserved
        template unsigned int(16) channelcount = 2;                         // 2 bytes, channelcount
        template unsigned int(16) samplesize = 16;                          // 2 bytes, samplesize
        unsigned int(16) pre_defined = 0;                                   // 2 bytes, pre_defined
        const unsigned int(16) reserved = 0 ;                               // 2 bytes, reserved
        template unsigned int(32) samplerate = {timescale of media}<<16;    // 4 bytes, sample rate in Hz (16.16 fixed point)
    }
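
To make the offsets concrete, here is a minimal C sketch that pulls the width, height, frame count, compressor name and depth out of a raw avc1 entry. It assumes the buffer starts at the entry's 4-byte size field, so the fields sit at the offsets implied by the definitions above (8-byte box header, 8 bytes of SampleEntry, then 16 bytes of pre_defined/reserved before width). The names `visual_entry_info` and `parse_visual_sample_entry` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed part of a VisualSampleEntry, counted from the size field:
 * 8 (box header) + 8 (SampleEntry) + 70 (visual fields) = 86 bytes. */
#define VISUAL_SAMPLE_ENTRY_FIXED_SIZE 86

/* Hypothetical result structure, for illustration only. */
typedef struct {
    uint16_t width;
    uint16_t height;
    uint16_t frame_count;
    uint16_t depth;
    char     compressor_name[32];   /* at most 31 characters plus NUL */
} visual_entry_info;

static uint16_t be16(const uint8_t *p) {
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 0 on success, -1 if the buffer is shorter than the fixed part. */
static int parse_visual_sample_entry(const uint8_t *entry, size_t len,
                                     visual_entry_info *out) {
    size_t name_len;
    if (len < VISUAL_SAMPLE_ENTRY_FIXED_SIZE) return -1;
    out->width = be16(entry + 32);
    out->height = be16(entry + 34);
    out->frame_count = be16(entry + 48);
    /* compressorname: the first byte is the length, up to 31 bytes follow */
    name_len = entry[50];
    if (name_len > 31) name_len = 31;
    memcpy(out->compressor_name, entry + 51, name_len);
    out->compressor_name[name_len] = '\0';
    out->depth = be16(entry + 82);
    return 0;
}
```

Child boxes such as avcC start right after these 86 bytes.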

Analysis of the avcC box inside avc1 (H.264)

size(4)	Type(4)	AVC Decoder Configuration Record


The avcC data carries the H.264 header to be parsed:

00 00 00 31, 61 76 63 43, 01 42 C0 15, FF E1 00 19

67 42 C0 15, D9 01 B1 FE, 4F 01 10 00, 00 03 00 10

00 00 03 03, 20 F1 62 E4, 80 01 00 05, 68 CB 82 CB 20

Parsing H.264 in MP4

The AVC sequence header is the AVCDecoderConfigurationRecord structure, which is described in detail in the standard "ISO/IEC 14496-15 AVC file format".

Length	Field	Description
8 bit	configurationVersion	Version number, 1
8 bit	AVCProfileIndication	sps[1]
8 bit	profile_compatibility	sps[2]
8 bit	AVCLevelIndication	sps[3]
6 bit	reserved	111111
2 bit	lengthSizeMinusOne	Size of the NALUnitLength field minus 1
3 bit	reserved	111
5 bit	numOfSequenceParameterSets	Number of SPS, usually 1
	sequenceParameterSetNALUnit	Array of (sps_size + sps)
8 bit	numOfPictureParameterSets	Number of PPS, usually 1
	pictureParameterSetNALUnit	Array of (pps_size + pps)

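According to the definition of AVCDecoderConfigurationRecord:

00 00 00 31: box size: the 'avcC' box is 49 bytes long
61 76 63 43: box type: 'avcC'
01: configurationVersion
42 C0 15: AVCProfileIndication, profile_compatibility and AVCLevelIndication (sps[1], sps[2], sps[3])
FF: very important: this gives the size of the NALU length field in the H.264 stream, computed as 1 + (lengthSizeMinusOne & 3); here the result is 4
E1: number of SPS, computed as numOfSequenceParameterSets & 0x1F; here the result is 1
00 19: SPS length -> 25 bytes
67 42 C0 15, D9 01 B1 FE, 4F 01 10 00, 00 03 00 10, 00 00 03 03, 20 F1 62 E4, 80 -> SPS
01: number of PPS
00 05: PPS length -> 5 bytes
68 CB 82 CB 20: PPS data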
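
The walk-through above can be turned into a small parser. The following C sketch is an illustration rather than the author's code: it takes the avcC payload (everything after the 8-byte size/type header), handles only the first SPS and the first PPS, and uses the hypothetical names `avc_config` and `parse_avcc`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result structure; sps/pps point into the input buffer. */
typedef struct {
    uint8_t        profile;          /* AVCProfileIndication  */
    uint8_t        compatibility;    /* profile_compatibility */
    uint8_t        level;            /* AVCLevelIndication    */
    int            nal_length_size;  /* 1 + (lengthSizeMinusOne & 3) */
    const uint8_t *sps;
    uint16_t       sps_len;
    const uint8_t *pps;
    uint16_t       pps_len;
} avc_config;

/* Parses an AVCDecoderConfigurationRecord. Returns 0 on success,
 * -1 on truncated or unexpected input. */
static int parse_avcc(const uint8_t *rec, size_t len, avc_config *out) {
    size_t pos;
    if (len < 8 || rec[0] != 1) return -1;        /* configurationVersion */
    out->profile = rec[1];
    out->compatibility = rec[2];
    out->level = rec[3];
    out->nal_length_size = 1 + (rec[4] & 0x03);
    if ((rec[5] & 0x1F) < 1) return -1;           /* need at least one SPS */
    pos = 6;
    out->sps_len = (uint16_t)((rec[pos] << 8) | rec[pos + 1]);
    pos += 2;
    if (pos + out->sps_len + 1 > len) return -1;  /* SPS plus the PPS count */
    out->sps = rec + pos;
    pos += out->sps_len;
    if (rec[pos++] < 1) return -1;                /* need at least one PPS */
    if (pos + 2 > len) return -1;
    out->pps_len = (uint16_t)((rec[pos] << 8) | rec[pos + 1]);
    pos += 2;
    if (pos + out->pps_len > len) return -1;
    out->pps = rec + pos;
    return 0;
}
```

Fed with the 41 payload bytes of the dump above (starting at 01 42 C0 15), this yields a NALU length size of 4, a 25-byte SPS starting with 0x67 and a 5-byte PPS starting with 0x68.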

Sending NALU packets


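    #include <stdlib.h>
    #include <string.h>
    #include <librtmp/rtmp.h>   /* adjust the path to your librtmp installation */

    /* add_rtmp_packet(), start_time and NAL_SLICE_IDR are defined elsewhere in the project. */

    void add_264_sequence_header(unsigned char *pps, unsigned char *sps,
                                 int pps_len, int sps_len) {
        /* 5 bytes FLV video header + 5 bytes config + 1 SPS count + 2 SPS length = 13,
           then the SPS, then 1 PPS count + 2 PPS length = 3, then the PPS */
        int body_size = 13 + sps_len + 3 + pps_len;
        RTMPPacket *packet = (RTMPPacket *) malloc(sizeof(RTMPPacket));
        if (packet == NULL) {
            return;
        }
        RTMPPacket_Alloc(packet, body_size);
        RTMPPacket_Reset(packet);
        char *body = packet->m_body;
        int i = 0;
        body[i++] = 0x17;   /* frame type 1 (key frame), codec ID 7 (AVC) */
        body[i++] = 0x00;   /* AVCPacketType 0: AVC sequence header */
        //composition time 0x000000
        body[i++] = 0x00;
        body[i++] = 0x00;
        body[i++] = 0x00;

        /*AVCDecoderConfigurationRecord*/
        body[i++] = 0x01;           /* configurationVersion */
        body[i++] = sps[1];         /* AVCProfileIndication */
        body[i++] = sps[2];         /* profile_compatibility */
        body[i++] = sps[3];         /* AVCLevelIndication */
        body[i++] = (char) 0xFF;    /* reserved bits + lengthSizeMinusOne = 3 */

        /*sps*/
        body[i++] = (char) 0xE1;    /* reserved bits + one SPS */
        body[i++] = (sps_len >> 8) & 0xff;
        body[i++] = sps_len & 0xff;
        memcpy(&body[i], sps, sps_len);
        i += sps_len;

        /*pps*/
        body[i++] = 0x01;           /* one PPS */
        body[i++] = (pps_len >> 8) & 0xff;
        body[i++] = (pps_len) & 0xff;
        memcpy(&body[i], pps, pps_len);
        i += pps_len;

        packet->m_packetType = RTMP_PACKET_TYPE_VIDEO;
        packet->m_nBodySize = body_size;
        packet->m_nChannel = 0x04;
        packet->m_nTimeStamp = 0;
        packet->m_hasAbsTimestamp = 0;
        packet->m_headerType = RTMP_PACKET_SIZE_MEDIUM;
        add_rtmp_packet(packet);
    }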

    void add_264_body(unsigned char *buf, int len) {
        /* strip the Annex B start code: 00 00 00 01 or 00 00 01 */
        if (len > 4 && buf[0] == 0x00 && buf[1] == 0x00
                && buf[2] == 0x00 && buf[3] == 0x01) {
            buf += 4;
            len -= 4;
        } else if (len > 3 && buf[0] == 0x00 && buf[1] == 0x00
                && buf[2] == 0x01) {
            buf += 3;
            len -= 3;
        }
        /* 5 bytes FLV video header + 4 bytes NALU length + NALU data */
        int body_size = len + 9;
        RTMPPacket *packet = (RTMPPacket *) malloc(sizeof(RTMPPacket));
        if (packet == NULL) {
            return;
        }
        RTMPPacket_Alloc(packet, body_size);
        RTMPPacket_Reset(packet);   /* malloc does not zero the header fields */
        char *body = packet->m_body;
        int type = buf[0] & 0x1f;   /* nal_unit_type */
        /* 0x27: inter frame + AVC; 0x17: key frame + AVC */
        body[0] = 0x27;
        if (type == NAL_SLICE_IDR) {
            body[0] = 0x17;
        }
        body[1] = 0x01; /* AVCPacketType 1: NAL unit */
        /* composition time 0x000000 */
        body[2] = 0x00;
        body[3] = 0x00;
        body[4] = 0x00;

        /* 4-byte big-endian NALU length, matching lengthSizeMinusOne = 3 */
        body[5] = (len >> 24) & 0xff;
        body[6] = (len >> 16) & 0xff;
        body[7] = (len >> 8) & 0xff;
        body[8] = (len) & 0xff;

        /*copy data*/
        memcpy(&body[9], buf, len);

        packet->m_hasAbsTimestamp = 0;
        packet->m_nBodySize = body_size;
        packet->m_packetType = RTMP_PACKET_TYPE_VIDEO;
        packet->m_nChannel = 0x04;
        packet->m_headerType = RTMP_PACKET_SIZE_LARGE;
        packet->m_nTimeStamp = RTMP_GetTime() - start_time;
        add_rtmp_packet(packet);
    }

    void add_aac_body(unsigned char *buf, int len) {
        // outputformat = 1: ADTS, each frame carries a 7-byte header (for writing to a file)
        // outputformat = 0: raw data, so there is no 7-byte header to strip
    //      buf += 7;
    //      len -= 7;
        int body_size = len + 2;
        RTMPPacket *packet = (RTMPPacket *) malloc(sizeof(RTMPPacket));
        if (packet == NULL) {
            return;
        }
        RTMPPacket_Alloc(packet, body_size);
        RTMPPacket_Reset(packet);   /* malloc does not zero the header fields */
        char *body = packet->m_body;
        /*AF 01 + AAC RAW data*/
        body[0] = (char) 0xAF;
        body[1] = 0x01;
        memcpy(&body[2], buf, len);
        packet->m_packetType = RTMP_PACKET_TYPE_AUDIO;
        packet->m_nBodySize = body_size;
        packet->m_nChannel = 0x04;
        packet->m_hasAbsTimestamp = 0;
        packet->m_headerType = RTMP_PACKET_SIZE_MEDIUM;
        packet->m_nTimeStamp = RTMP_GetTime() - start_time;
        add_rtmp_packet(packet);
    }
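
Two details of the video path are easy to get wrong: recognising the Annex B start code and choosing the first byte of the FLV video payload. The helpers below are a minimal sketch with hypothetical names (`annexb_start_code_len`, `flv_avc_frame_byte`); they are not part of librtmp and do not touch the RTMP packet itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Length of the Annex B start code at the beginning of buf:
 * 4 for 00 00 00 01, 3 for 00 00 01, 0 if there is none. */
static size_t annexb_start_code_len(const uint8_t *buf, size_t len) {
    if (len >= 4 && buf[0] == 0 && buf[1] == 0 && buf[2] == 0 && buf[3] == 1)
        return 4;
    if (len >= 3 && buf[0] == 0 && buf[1] == 0 && buf[2] == 1)
        return 3;
    return 0;
}

/* First byte of an FLV AVC video payload: the high nibble is the frame
 * type (1 = key frame, 2 = inter frame), the low nibble is the codec ID
 * (7 = AVC). nal_header is the first byte of the NALU after the start code. */
static uint8_t flv_avc_frame_byte(uint8_t nal_header) {
    int nal_type = nal_header & 0x1F;
    int frame_type = (nal_type == 5) ? 1 : 2;   /* 5 = IDR slice */
    return (uint8_t)((frame_type << 4) | 7);
}
```

An IDR slice (NALU header 0x65) gives 0x17 and a non-IDR slice (for example 0x41) gives 0x27, the two values used in the code above.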

Audio data analysis

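The AAC sequence header holds the AudioSpecificConfig structure, which is described in "ISO/IEC 14496-3 Audio". The full AudioSpecificConfig is quite complex, so here it is simplified by fixing the audio format in advance: the codec is AAC-LC and the sample rate is 44100 Hz. AudioSpecificConfig then reduces to the following table: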

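Length	Field	Description
5 bit	audioObjectType	Codec object type, 2 for AAC-LC
4 bit	samplingFrequencyIndex	Sample rate index, 4 for 44100 Hz
4 bit	channelConfiguration	Number of output channels, 2
	GASpecificConfig	This structure contains the following three fields
1 bit	frameLengthFlag	Flag indicating the IMDCT window length, 0
1 bit	dependsOnCoreCoder	Flag indicating whether a core coder is required, 0
1 bit	extensionFlag	Must be 0 because AAC-LC was chosen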
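
Packed together, the fields in the table occupy exactly 16 bits. The following C sketch shows the packing under the simplification above (all three GASpecificConfig flags are 0); the function name `build_audio_specific_config` is made up for illustration.

```c
#include <stdint.h>

/* Packs the simplified AudioSpecificConfig into two bytes:
 *   5 bits audioObjectType | 4 bits samplingFrequencyIndex |
 *   4 bits channelConfiguration | 3 flag bits, all 0 here. */
static void build_audio_specific_config(unsigned object_type,
                                        unsigned freq_index,
                                        unsigned channels,
                                        uint8_t out[2]) {
    uint16_t bits = (uint16_t)(((object_type & 0x1F) << 11) |
                               ((freq_index & 0x0F) << 7) |
                               ((channels & 0x0F) << 3));
    out[0] = (uint8_t)(bits >> 8);
    out[1] = (uint8_t)(bits & 0xFF);
}
```

With AAC-LC (2), index 4 (44100 Hz) and 2 channels, the two bytes are 0x12 0x10. In the AAC sequence header packet they follow the bytes AF 00, just as raw AAC frames follow AF 01.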
