AAC ADTS format analysis and hands-on extraction of AAC audio files

1. Introduction to AAC audio format

AAC (Advanced Audio Coding) is a lossy audio compression format standardized in MPEG-2 and extended in MPEG-4. Fraunhofer, Dolby, Sony and AT&T were the main contributors to its development.

AAC audio is stored or transmitted in one of two formats: ADIF and ADTS.

2. Introduction to ADIF and ADTS

  • ADIF: Audio Data Interchange Format. The characteristic of this format is that the start of the audio data is well defined: decoding cannot begin in the middle of the stream and must start at the clearly defined beginning. This format is therefore often used for files on disk.
  • ADTS: Audio Data Transport Stream, the transport stream format of AAC audio. AAC was defined in MPEG-2 (ISO/IEC 13818-7:2003) and later adopted into the MPEG-4 standard. The characteristic of this format is that it is a bit stream with a sync word, so decoding can start at any frame in the stream. In this respect it resembles the MP3 stream format.

Simply put, ADTS can be decoded starting at any frame because every frame carries its own header. ADIF has a single header for the whole file, so decoding has to start from the beginning of the data.

The formats of the two headers also differ. In practice, encoded and extracted audio streams are generally ADTS streams.
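Because every ADTS frame starts with a sync word, a reader that lands in the middle of a stream can search forward for the next frame. A minimal sketch of that search follows; the function name `find_adts_sync` is hypothetical, and the check is only a heuristic, since payload bytes can happen to look like a sync word.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the offset of the first candidate ADTS header in buf, or -1.
 * A candidate is the 12-bit syncword 0xFFF followed by layer == 00.
 * The ID and protection_absent bits are ignored by the 0xF6 mask. */
long find_adts_sync(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i + 1 < len; i++) {
        if (buf[i] == 0xFF && (buf[i + 1] & 0xF6) == 0xF0)
            return (long)i;
    }
    return -1;
}
```

A more careful reader would confirm a candidate by checking that another sync word appears exactly one frame length later.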

ADIF format of AAC:

[Figure: ADIF format of AAC]

The general format of ADTS of AAC (blank space indicates the preceding and following frames):

[Figure: general ADTS format of AAC]

Sometimes, after encoding a raw AAC stream, the resulting AAC file cannot be played on PCs or mobile phones. The most likely reason is that the frames in the file were written without ADTS headers.

The fix is to add an ADTS header to every frame. The length of a raw AAC data block is variable; ADTS encapsulation prepends an ADTS header to each raw frame to form an ADTS frame.

3. ADTS analysis

Each frame of an AAC audio file consists of ADTS Header and AAC Audio Data, the structure is as follows:

[Figure: ADTS frame structure]

Note: The length of the ADTS Header may be 7 bytes or 9 bytes. When protection_absent=0, it is 9 bytes. When protection_absent=1, it is 7 bytes.

The ADTS header of each frame contains information such as the audio sampling rate, channel configuration and frame length, so that the decoder can parse and read the frame. The ADTS header is generally 7 bytes long and is divided into two parts:

  • adts_fixed_header

  • adts_variable_header

The fixed header comes first, followed by the variable header. The data in the fixed header is the same in every frame, while the variable header changes from frame to frame.
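One way to see this property is to compare the fixed part of two headers taken from the same stream. The sketch below assumes the standard field widths, which put the fixed header in the first 28 bits (bytes 0 to 2 plus the top 4 bits of byte 3); the function name `adts_same_fixed_header` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compare the 28-bit fixed header of two ADTS headers.
 * Bytes 0..2 are fully fixed; in byte 3 only the top 4 bits
 * (channel_configuration low bits, original_copy, home) are fixed. */
bool adts_same_fixed_header(const uint8_t *a, const uint8_t *b)
{
    return a[0] == b[0] && a[1] == b[1] && a[2] == b[2] &&
           (a[3] & 0xF0) == (b[3] & 0xF0);
}
```

A mismatch between consecutive frames would suggest either a corrupted header or a false sync word.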

    /* adts_fixed_header */
    put_bits(&pb, 12, 0xfff);   /* syncword */
    put_bits(&pb, 1, 0);        /* ID */
    put_bits(&pb, 2, 0);        /* layer */
    put_bits(&pb, 1, 1);        /* protection_absent */
    put_bits(&pb, 2, ctx->objecttype); /* profile_objecttype */
    put_bits(&pb, 4, ctx->sample_rate_index);
    put_bits(&pb, 1, 0);        /* private_bit */
    put_bits(&pb, 3, ctx->channel_conf); /* channel_configuration */
    put_bits(&pb, 1, 0);        /* original_copy */
    put_bits(&pb, 1, 0);        /* home */

syncword: always 0xFFF (all 12 bits set to 1); it marks the beginning of an ADTS frame.

ID: MPEG identifier, 0 for MPEG-4, 1 for MPEG-2.

layer: always '00'.

protection_absent: indicates whether a CRC is present. Note that it is set to 1 if there is no CRC and to 0 if there is a CRC.

profile: indicates which AAC profile is used, e.g. 01 for Low Complexity (AAC LC). Some chips only support AAC LC.
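Reading these fields back out of a header is the reverse of the packing code. The sketch below extracts them from the first four bytes of a header; the struct and function names are hypothetical, and the bit positions follow the field widths listed in the code above.

```c
#include <stdint.h>

typedef struct {
    unsigned id;                        /* 1 bit: 0 = MPEG-4, 1 = MPEG-2 */
    unsigned layer;                     /* 2 bits: always 0 */
    unsigned protection_absent;         /* 1 bit: 1 = no CRC */
    unsigned profile;                   /* 2 bits: 1 = AAC LC */
    unsigned sampling_frequency_index;  /* 4 bits */
    unsigned channel_configuration;     /* 3 bits */
} AdtsFixedHeader;

/* Parse the fixed header from h[0..3]. Returns 0 on success,
 * -1 if the 12-bit syncword is not 0xFFF. */
int adts_parse_fixed_header(const uint8_t h[4], AdtsFixedHeader *out)
{
    unsigned syncword = ((unsigned)h[0] << 4) | (h[1] >> 4);
    if (syncword != 0xFFF)
        return -1;
    out->id = (h[1] >> 3) & 0x1;
    out->layer = (h[1] >> 1) & 0x3;
    out->protection_absent = h[1] & 0x1;
    out->profile = (h[2] >> 6) & 0x3;
    out->sampling_frequency_index = (h[2] >> 2) & 0xF;
    /* channel_configuration straddles bytes 2 and 3 */
    out->channel_configuration = ((h[2] & 0x1u) << 2) | ((h[3] >> 6) & 0x3);
    return 0;
}
```

For the bytes 0xFF 0xF1 0x4C 0x80 this yields MPEG-4, no CRC, profile 1 (AAC LC), sampling frequency index 3 and channel configuration 2.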

    /* adts_variable_header */
    put_bits(&pb, 1, 0);        /* copyright_identification_bit */
    put_bits(&pb, 1, 0);        /* copyright_identification_start */
    put_bits(&pb, 13, full_frame_size); /* aac_frame_length */
    put_bits(&pb, 11, 0x7ff);   /* adts_buffer_fullness */
    put_bits(&pb, 2, 0);        /* number_of_raw_data_blocks_in_frame */

aac_frame_length: the length of the whole ADTS frame, i.e. the ADTS header plus the raw AAC data. The value must include the 7 or 9 bytes of the header:

aac_frame_length = (protection_absent == 1 ? 7 : 9) + size(AACFrame)

When protection_absent=0, header length = 9 bytes

When protection_absent=1, header length = 7 bytes
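The formula can be written as a small helper. This is a sketch with a hypothetical function name; the range check reflects the fact that aac_frame_length is a 13-bit field, so the whole frame cannot exceed 8191 bytes.

```c
#include <stddef.h>

#define ADTS_MAX_FRAME_LENGTH 0x1FFF  /* largest value of a 13-bit field */

/* Return aac_frame_length for a raw AAC block of payload_size bytes,
 * or -1 if the result would not fit in the 13-bit field. */
int adts_frame_length(int protection_absent, size_t payload_size)
{
    size_t header_size = protection_absent ? 7 : 9;
    if (payload_size > ADTS_MAX_FRAME_LENGTH - header_size)
        return -1;
    return (int)(header_size + payload_size);
}
```

For example, a 341-byte raw block without CRC gives 348, and the same block with CRC gives 350.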

adts_buffer_fullness: 0x7FF indicates a variable bit rate stream.

number_of_raw_data_blocks_in_frame: the ADTS frame contains number_of_raw_data_blocks_in_frame + 1 raw AAC data blocks.

So number_of_raw_data_blocks_in_frame == 0 means that the ADTS frame holds exactly one AAC data block.
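The put_bits calls above append fields most significant bit first. As a rough illustration of that idea, and not of how put_bits itself is implemented, the sketch below collects the same 56 bits in a 64-bit accumulator and then emits 7 bytes. The names `acc_put` and `adts_pack_header` are hypothetical, and the header is assumed to have no CRC.

```c
#include <stdint.h>

/* Append the low n bits of value to the accumulator, MSB first. */
static void acc_put(uint64_t *acc, int n, unsigned value)
{
    *acc = (*acc << n) | (value & ((1u << n) - 1u));
}

/* Pack a 7-byte ADTS header (protection_absent = 1). */
void adts_pack_header(uint8_t out[7], unsigned profile, unsigned sr_index,
                      unsigned channel_conf, unsigned frame_length)
{
    uint64_t acc = 0;

    /* adts_fixed_header: 28 bits */
    acc_put(&acc, 12, 0xFFF);        /* syncword */
    acc_put(&acc, 1, 0);             /* ID */
    acc_put(&acc, 2, 0);             /* layer */
    acc_put(&acc, 1, 1);             /* protection_absent */
    acc_put(&acc, 2, profile);       /* profile */
    acc_put(&acc, 4, sr_index);      /* sampling_frequency_index */
    acc_put(&acc, 1, 0);             /* private_bit */
    acc_put(&acc, 3, channel_conf);  /* channel_configuration */
    acc_put(&acc, 1, 0);             /* original_copy */
    acc_put(&acc, 1, 0);             /* home */

    /* adts_variable_header: 28 bits */
    acc_put(&acc, 1, 0);             /* copyright_identification_bit */
    acc_put(&acc, 1, 0);             /* copyright_identification_start */
    acc_put(&acc, 13, frame_length); /* aac_frame_length */
    acc_put(&acc, 11, 0x7FF);        /* adts_buffer_fullness */
    acc_put(&acc, 2, 0);             /* number_of_raw_data_blocks_in_frame */

    /* 56 bits collected: write them out, highest byte first */
    for (int i = 0; i < 7; i++)
        out[i] = (uint8_t)(acc >> (8 * (6 - i)));
}
```

With profile 1, sampling frequency index 3, channel configuration 2 and a frame length of 348, this produces the bytes 0xFF 0xF1 0x4C 0x80 0x2B 0x9F 0xFC.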

ADTS MediaInfo analysis:

[Figure: MediaInfo output for the ADTS stream (ADTS.png)]

Binary file:

[Figure: binary view of the file]

The first 7 bytes of ADTS: 0xFF, 0xF1, 0x4C, 0x80, 0x2B, 0x9F, 0xFC

0xFF ----> 11111111

0xF1 ----> 11110001

0x4C ----> 01001100

0x80 ----> 10000000

0x2B ----> 00101011

0x9F ----> 10011111

0xFC ----> 11111100
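The last three fields of the header can be read from these bytes with a few shifts and masks. The sketch below uses hypothetical names and assumes a 7-byte header; for the bytes listed above it gives a frame length of 348 bytes, a buffer fullness of 0x7FF and a raw data block field of 0.

```c
#include <stdint.h>

typedef struct {
    unsigned frame_length;     /* 13 bits: header plus payload, in bytes */
    unsigned buffer_fullness;  /* 11 bits: 0x7FF for variable bit rate */
    unsigned raw_data_blocks;  /* 2 bits: number of raw blocks minus 1 */
} AdtsVariableHeader;

/* Extract the variable header fields from a 7-byte ADTS header. */
AdtsVariableHeader adts_parse_variable_header(const uint8_t h[7])
{
    AdtsVariableHeader v;
    /* 2 bits from byte 3, 8 bits from byte 4, 3 bits from byte 5 */
    v.frame_length = ((unsigned)(h[3] & 0x03) << 11) |
                     ((unsigned)h[4] << 3) |
                     (h[5] >> 5);
    /* 5 bits from byte 5, 6 bits from byte 6 */
    v.buffer_fullness = ((unsigned)(h[5] & 0x1F) << 6) | (h[6] >> 2);
    v.raw_data_blocks = h[6] & 0x03;
    return v;
}
```

Since protection_absent is 1 in this header, the raw AAC payload of the frame is 348 - 7 = 341 bytes.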

4. Extracting aac audio files from video files

4.1 Define the sampling rate array
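#include <stdint.h>  // uint8_t, used by adts_header below
#include <stdio.h>   // printf, FILE

// Sample rates in Hz, indexed by the 4-bit sampling_frequency_index
const int sampling_frequencies[]={
    96000,  // 0x0
    88200,  // 0x1
    64000,  // 0x2
    48000,  // 0x3
    44100,  // 0x4
    32000,  // 0x5
    24000,  // 0x6
    22050,  // 0x7
    16000,  // 0x8
    12000,  // 0x9
    11025,  // 0xa
    8000    // 0xb
    // 0xc to 0xf are treated as reserved here (MPEG-4 additionally assigns 7350 Hz to 0xc)
};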
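When reading a header rather than writing one, the lookup goes the other way, from index to sample rate. A minimal sketch, using a hypothetical function name and its own copy of the table so that it stands alone:

```c
#include <stddef.h>

static const int adts_sample_rates[] = {
    96000, 88200, 64000, 48000, 44100, 32000,
    24000, 22050, 16000, 12000, 11025, 8000
};

/* Return the sample rate in Hz for a sampling_frequency_index,
 * or -1 if the index is outside the table above. */
int adts_sample_rate_from_index(unsigned index)
{
    size_t n = sizeof adts_sample_rates / sizeof adts_sample_rates[0];
    return index < n ? adts_sample_rates[index] : -1;
}
```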

4.2 Add ADTS file header information
// Fill p_adts_header with a 7-byte ADTS header (no CRC) for one raw AAC block.
// Returns 0 on success, -1 if the sample rate or the frame size is not supported.
int adts_header(char* const p_adts_header,const int data_length,const int profile,
                const int samplerate,const int channels){
    uint8_t *h=(uint8_t *)p_adts_header;  //write through an unsigned view of the buffer
    int sampling_frequency_index=3;       //defaults to 48000 Hz
    int adtsLen=data_length+7;            //7-byte header plus payload
    int frequencies_size=sizeof (sampling_frequencies)/sizeof (sampling_frequencies[0]);
    int i=0;
    for(i=0;i<frequencies_size;i++){
        if(sampling_frequencies[i]==samplerate){
            sampling_frequency_index=i;
            break;
        }
    }

    //unsupported sample rate
    if(i>=frequencies_size){
        printf("unsupported samplerate:%d\n", samplerate);
        return -1;
    }

    //aac_frame_length is a 13-bit field
    if(adtsLen>0x1fff){
        printf("frame too large for ADTS:%d\n", adtsLen);
        return -1;
    }

    h[0] = 0xff;         //syncword:0xfff                          high 8 bits
    h[1] = 0xf0;         //syncword:0xfff                          low 4 bits
    h[1] |= (0 << 3);    //MPEG Version:0 for MPEG-4,1 for MPEG-2  1 bit
    h[1] |= (0 << 1);    //Layer:0                                 2 bits
    h[1] |= 1;           //protection absent:1                     1 bit

    h[2] = (profile & 0x03)<<6;                    //profile (1 = AAC LC)       2 bits
    h[2] |= (sampling_frequency_index & 0x0f)<<2;  //sampling frequency index   4 bits
    h[2] |= (0 << 1);                              //private bit:0              1 bit
    h[2] |= (channels & 0x04)>>2;                  //channel configuration      high 1 bit

    h[3] = (channels & 0x03)<<6;                   //channel configuration      low 2 bits
    h[3] |= (0 << 5);                              //original:0                 1 bit
    h[3] |= (0 << 4);                              //home:0                     1 bit
    h[3] |= (0 << 3);                              //copyright id bit:0         1 bit
    h[3] |= (0 << 2);                              //copyright id start:0       1 bit
    h[3] |= ((adtsLen & 0x1800) >> 11);            //frame length               high 2 bits

    h[4] = (uint8_t)((adtsLen & 0x7f8) >> 3);      //frame length               middle 8 bits
    h[5] = (uint8_t)((adtsLen & 0x7) << 5);        //frame length               low 3 bits
    h[5] |= 0x1f;                                  //buffer fullness:0x7ff      high 5 bits
    h[6] = 0xfc;         //11111100: buffer fullness:0x7ff low 6 bits,
                         //then number_of_raw_data_blocks_in_frame:0 (2 bits).
                         //The ADTS frame holds number_of_raw_data_blocks_in_frame + 1 raw AAC blocks.

    return 0;
}

4.3 Extract aac audio files
#include <stdio.h>
#include <libavformat/avformat.h>
#include <libavutil/log.h>

int main(int argc, char *argv[])
{
    int ret=-1;
    char errors[1024];

    char* filename=NULL;      //input media file name
    char* aac_filename=NULL;  //output AAC file name

    int audio_index=-1;
    int len=0;

    FILE *aac_fd=NULL;

    AVFormatContext* av_format_context=NULL;
    AVCodecParameters* codecpar=NULL;
    AVPacket pkt;

    av_log_set_level(AV_LOG_DEBUG);

    if(argc<3){
        av_log(NULL, AV_LOG_ERROR, "usage: %s <input file> <output aac file>\n", argv[0]);
        return -1;
    }

    filename=argv[1];
    aac_filename=argv[2];

    if(filename == NULL||aac_filename==NULL){
        av_log(NULL,AV_LOG_ERROR,"input or output file name is null, please check\n");
        return -1;
    }

    aac_fd=fopen(aac_filename,"wb");
    if(!aac_fd){
        av_log(NULL,AV_LOG_ERROR,"failed to open output file %s\n",aac_filename);
        return -1;
    }

    //the assignment is parenthesized so that ret holds the error code, not the result of the comparison
    if((ret = avformat_open_input(&av_format_context,filename,NULL,NULL))<0){
        av_strerror(ret,errors,sizeof(errors));
        av_log(NULL,AV_LOG_ERROR,"avformat_open_input failed :%s,%d(%s)\n",
               filename,
               ret,
               errors);
        goto failed;
    }
    //read stream information
    if((ret=avformat_find_stream_info(av_format_context,NULL))<0){
        av_strerror(ret,errors,sizeof(errors));
        av_log(NULL,AV_LOG_ERROR,"avformat_find_stream_info failed :%s,%d(%s)\n",
               filename,
               ret,
               errors);
        goto failed;
    }

    av_dump_format(av_format_context,0,filename,0);

    av_init_packet(&pkt);

    audio_index= av_find_best_stream(av_format_context,AVMEDIA_TYPE_AUDIO,-1,-1,NULL,0);
    if(audio_index<0){
        av_log(NULL,AV_LOG_ERROR,"could not find a %s stream in input file %s\n",
               av_get_media_type_string(AVMEDIA_TYPE_AUDIO),
               filename);
        ret=AVERROR(EINVAL);
        goto failed;
    }
    codecpar=av_format_context->streams[audio_index]->codecpar;

    //print the AAC profile
    printf("audio profile :%d , FF_PROFILE_AAC_LOW:%d\n",
           codecpar->profile,
           FF_PROFILE_AAC_LOW);

    if(codecpar->codec_id != AV_CODEC_ID_AAC)
    {
        printf("the media file does not contain an AAC stream, its codec_id is %d\n",
               codecpar->codec_id);
        ret=-1;
        goto failed;
    }

    //read the media file and write each AAC frame to the output file
    while(av_read_frame(av_format_context,&pkt)>=0){
        if(pkt.stream_index==audio_index){
            char adts_header_buf[7]={0};
            //Write the ADTS header. Not needed for TS input: packets demuxed from a TS stream already carry one.
            //Newer FFmpeg releases expose the channel count as codecpar->ch_layout.nb_channels instead.
            if(adts_header(adts_header_buf,pkt.size,
                           codecpar->profile,
                           codecpar->sample_rate,
                           codecpar->channels)==0){
                //second argument of fwrite: size of each element in bytes, 1 for char
                fwrite(adts_header_buf,1,7,aac_fd);
                len=fwrite(pkt.data,1,pkt.size,aac_fd); //write the raw AAC data
                if(len!=pkt.size){
                    av_log(NULL, AV_LOG_DEBUG, "warning, length of written data isn't equal to pkt.size(%d, %d)\n",
                           len,
                           pkt.size);
                }
            }
        }
        av_packet_unref(&pkt);
    }
    ret=0;

failed:
    if(av_format_context){
        avformat_close_input(&av_format_context);
    }

    if(aac_fd){
        fclose(aac_fd);
    }

    return ret;
}
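One way to sanity-check the extracted file is to walk it frame by frame using the frame length stored in each header. The sketch below works on a buffer already loaded into memory; the function name `adts_count_frames` is hypothetical, and loading the file is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the ADTS frames in buf by hopping from header to header.
 * Returns -1 if a header is malformed or a frame is truncated. */
long adts_count_frames(const uint8_t *buf, size_t len)
{
    size_t pos = 0;
    long frames = 0;

    while (pos < len) {
        const uint8_t *h = buf + pos;
        if (len - pos < 7)
            return -1;                        /* not enough bytes for a header */
        if (h[0] != 0xFF || (h[1] & 0xF6) != 0xF0)
            return -1;                        /* syncword or layer mismatch */

        size_t header_size = (h[1] & 0x01) ? 7 : 9;
        size_t frame_length = ((size_t)(h[3] & 0x03) << 11) |
                              ((size_t)h[4] << 3) |
                              (h[5] >> 5);
        if (frame_length < header_size || frame_length > len - pos)
            return -1;                        /* impossible or truncated frame */

        pos += frame_length;
        frames++;
    }
    return frames;
}
```

If the count matches the number of audio packets written by the extraction loop, every header was written with a consistent frame length.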

Origin blog.csdn.net/u014078003/article/details/128556390