Using the FFmpeg library in depth: the audio stream decoding process


1. Introduction

In the field of multimedia processing, FFmpeg is a very powerful library that provides a variety of tools and interfaces for processing audio and video data. This article will delve into how to use the FFmpeg library to decode and resample audio streams.

“Simplicity is the ultimate sophistication.” — Leonardo da Vinci

This statement also applies to programming and data processing. Simple code and simple algorithms tend to be easier to maintain and extend.

2. Demuxing (decapsulation) process

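2.1 Register all muxers and demuxers

Use the av_register_all() function to register them. This call is only needed with older FFmpeg releases: it was deprecated in FFmpeg 4.0 and removed in 5.0, where no registration step is required.

av_register_all();  // older FFmpeg only; omit on 4.0 and later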

2.2 Open file

Use the avformat_open_input() function to open a file or URL. It returns 0 on success and a negative error code on failure.

AVFormatContext* pFormatCtx = nullptr;
avformat_open_input(&pFormatCtx, "input.mp3", nullptr, nullptr);

2.3 Find stream information

Use the avformat_find_stream_info() function to read the stream information.

avformat_find_stream_info(pFormatCtx, nullptr);

2.4 Get audio stream index and decoder ID

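int audioStream = -1;
for (unsigned int i = 0; i < pFormatCtx->nb_streams; i++) {
    // codecpar replaces the deprecated AVStream::codec field
    if (pFormatCtx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
        audioStream = static_cast<int>(i);
        break;
    }
}
// audioStream is still -1 if the file has no audio stream; check it before indexing.
AVCodecID codecID = pFormatCtx->streams[audioStream]->codecpar->codec_id;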

3. Decoding process

3.1 Get the decoder

Use the avcodec_find_decoder() function to get the decoder. It returns nullptr if no decoder is available for the codec ID.

const AVCodec* pCodec = avcodec_find_decoder(codecID);

3.2 Open the decoder

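Use the avcodec_open2() function to open the decoder. The codec context is allocated with avcodec_alloc_context3() and filled from the stream's codecpar, because the AVStream::codec field is deprecated.

AVCodecContext* pCodecCtx = avcodec_alloc_context3(pCodec);
avcodec_parameters_to_context(pCodecCtx, pFormatCtx->streams[audioStream]->codecpar);
avcodec_open2(pCodecCtx, pCodec, nullptr);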

3.3 Decode data

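AVPacket* packet = av_packet_alloc();
AVFrame* pFrame = av_frame_alloc();

while (av_read_frame(pFormatCtx, packet) >= 0) {
    if (packet->stream_index == audioStream &&
        avcodec_send_packet(pCodecCtx, packet) == 0) {
        // One packet may yield several frames, so drain the decoder.
        while (avcodec_receive_frame(pCodecCtx, pFrame) == 0) {
            // pFrame now holds one frame of decoded samples
        }
    }
    av_packet_unref(packet);
}
av_packet_free(&packet);
av_frame_free(&pFrame);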
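
One packet can yield zero, one, or several frames, so the decoder should be drained after every packet that is sent. The sketch below shows only that control flow. MockDecoder and decode_all are hypothetical names invented for illustration and are not part of FFmpeg; send_packet and receive_frame merely imitate the roles of avcodec_send_packet() and avcodec_receive_frame().

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for a decoder; NOT an FFmpeg type.
// It models only the "send one packet, receive zero or more frames" contract.
struct MockDecoder {
    std::deque<int> pending;  // decoded frames waiting to be received

    // Accepts one packet carrying `frames_in_packet` frames.
    // Returns 0 on success, as avcodec_send_packet() does.
    int send_packet(int packet_id, int frames_in_packet) {
        for (int i = 0; i < frames_in_packet; ++i)
            pending.push_back(packet_id * 100 + i);
        return 0;
    }

    // Returns 0 and fills `frame` when a frame is ready, or a negative value
    // when more input is needed (the role AVERROR(EAGAIN) plays in FFmpeg).
    int receive_frame(int& frame) {
        if (pending.empty())
            return -1;
        frame = pending.front();
        pending.pop_front();
        return 0;
    }
};

// Sends each packet, then drains every frame it produced before moving on.
std::vector<int> decode_all(const std::vector<int>& frames_per_packet) {
    MockDecoder dec;
    std::vector<int> out;
    for (std::size_t p = 0; p < frames_per_packet.size(); ++p) {
        if (dec.send_packet(static_cast<int>(p), frames_per_packet[p]) != 0)
            continue;  // skip packets the decoder rejected
        int frame = 0;
        while (dec.receive_frame(frame) == 0)
            out.push_back(frame);
    }
    return out;
}
```

Calling decode_all({2, 0, 1}) models three packets that hold two, zero and one frame. A loop that called receive_frame only once per packet would lose the second frame of the first packet.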

4. Resampling

4.1 Create SwrContext

SwrContext* swrCtx = swr_alloc();

4.2 Set parameters and initialize

// swr_alloc_set_opts() returns the configured context, or nullptr on failure
swrCtx = swr_alloc_set_opts(swrCtx, ...);
swr_init(swrCtx);

4.3 Data conversion and memory release

swr_convert(swrCtx, ...);
swr_free(&swrCtx);
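
When the input and output sample rates differ, the output buffer has to be sized for the resampled frame, not the input frame. The following is a rough sketch of that arithmetic under simplifying assumptions: the helper names max_output_samples and packed_buffer_bytes are made up for this example, and the estimate ignores any samples the resampler still holds internally. In real code, libswresample's own swr_get_out_samples() or swr_get_delay() account for that buffering.

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on output samples per channel when in_samples at in_rate
// are converted to out_rate (integer division rounded up).
// Assumption: no samples are buffered inside the resampler.
int64_t max_output_samples(int64_t in_samples, int in_rate, int out_rate) {
    return (in_samples * out_rate + in_rate - 1) / in_rate;
}

// Bytes needed for packed (interleaved) PCM:
// samples per channel * channel count * bytes per sample.
int64_t packed_buffer_bytes(int64_t samples, int channels, int bytes_per_sample) {
    return samples * channels * bytes_per_sample;
}
```

For example, an MP3 frame of 1152 samples at 48000 Hz converted to 44100 Hz needs at most 1059 samples per channel, which is 4236 bytes as 16-bit stereo. With equal rates the same frame takes 1152 * 2 * 2 = 4608 bytes.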

5. Code examples

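#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <cstring>
// The original relied on a project header <vdef.h> that is not shown. The includes
// and global declarations below are an assumed stand-in for what it provided.
extern "C" {
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
#include <libavutil/mem.h>
#include <libswresample/swresample.h>
}
#include <SDL2/SDL.h>  // may be <SDL.h>, depending on how SDL2 is installed
using namespace std;

#define MAX_AUDIO_FRAME_SIZE 192000

static AVFormatContext* l_pstFormatCtx = nullptr;
static AVStream*        l_pstAStream   = nullptr;
static const AVCodec*   l_pstACodec    = nullptr;
static AVCodecContext*  l_pstACodecCtx = nullptr;
static SwrContext*      au_convert_ctx = nullptr;
static AVPacket*        packet         = nullptr;
static AVFrame*         pFrame         = nullptr;
static uint8_t*         out_buffer     = nullptr;
static SDL_AudioSpec    wanted_spec;

// Buffer layout:
// |-----------|-------------|
// chunk-------pos---len-----|

static Uint8* audio_chunk;
static int audio_len;     // bytes of audio left to play
static Uint8* audio_pos;  // current playback position

// SDL 2.0 audio callback.
// udata is the pointer we handed to SDL, stream is the buffer the audio data must be
// written into, and len is the size of that buffer in bytes.
void Fill_audio(void* udata, Uint8* stream, int len)
{
    cout << "Fill_audio len:" << len << endl;
    SDL_memset(stream, 0, len);
    if (audio_len == 0)
        return;
    len = (len > audio_len ? audio_len : len);  // never read past the remaining data

    // The volume argument is required by the function; it does not change the hardware volume.
    SDL_MixAudio(stream, audio_pos, len, SDL_MIX_MAXVOLUME);

    audio_pos += len;  // advance the playback position
    audio_len -= len;  // remaining audio length
}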

// SDL redefines main through a macro (SDL_main), so the full signature with
// parameters is required; a parameterless int main() will not do.
int main(int argc, char* argv[])
{
  int l_s32AStreamSubscript = -1;  // index of the audio stream

  avformat_network_init();
  char filename[] = "E:\\tt.mp3";  // file to play

  // 1. Open the input file (demuxing)
  l_pstFormatCtx = avformat_alloc_context();
  if (avformat_open_input(&l_pstFormatCtx, filename, NULL, NULL) != 0)
  {
    cout << "[music_error]Could not open source file, exit work_" << filename << endl;
    return -1;
  }
  if (avformat_find_stream_info(l_pstFormatCtx, NULL) < 0)
  {
    cout << "[music_error]Couldn't find stream information" << endl;
    return -1;
  }

  // 2. Get the index of the audio stream
  if (l_pstFormatCtx != nullptr)
  {
    for (unsigned int i = 0; i < l_pstFormatCtx->nb_streams; ++i)
    {
      if (l_pstFormatCtx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO)
        l_s32AStreamSubscript = i;  // audio stream index (the last one found wins)
    }
  }
  if (l_s32AStreamSubscript == -1)
  {
    cout << "[music_error]Can't find audio stream" << endl;
    return -1;
  }

  // 3. Find and open the audio decoder
  l_pstAStream = l_pstFormatCtx->streams[l_s32AStreamSubscript];
  l_pstACodec = avcodec_find_decoder(l_pstAStream->codecpar->codec_id);
  if (l_pstACodec == nullptr)
  {
    cout << "[music_error]Cannot find the corresponding decoder" << endl;
    return -1;
  }

  l_pstACodecCtx = avcodec_alloc_context3(l_pstACodec);  // allocate the AVCodecContext
  if (l_pstACodecCtx == nullptr || avcodec_parameters_to_context(l_pstACodecCtx, l_pstAStream->codecpar) < 0)
  {
    cout << "[music_error]Could not set up the codec context" << endl;
    return -1;
  }
  if (avcodec_open2(l_pstACodecCtx, l_pstACodec, nullptr) < 0)
  {
    cout << "[music_error]Cannot open the decoder, or the file is encrypted" << endl;
    return -1;
  }

  if (SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO | SDL_INIT_TIMER))
  {
    cout << "[music_error]Could not initialize SDL: " << SDL_GetError() << endl;
    return -1;
  }


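  // Output format. The channel-layout calls used below (av_get_channel_layout_nb_channels,
  // av_get_default_channel_layout, swr_alloc_set_opts) belong to the older API that was
  // deprecated in FFmpeg 5.1; this example assumes an FFmpeg 4.x build.
  uint64_t out_channel_layout = AV_CH_LAYOUT_STEREO;  // channel layout
  AVSampleFormat out_sample_fmt = AV_SAMPLE_FMT_S16;  // sample format
  // Samples per frame: AAC 1024, MP3 1152. Some files have non-standard headers, so
  // frame_size may be wrong until one frame has actually been decoded.
  int out_nb_samples = l_pstACodecCtx->frame_size;
  int out_sample_rate = 44100;  // output sample rate (the input's is l_pstACodecCtx->sample_rate)
  int out_channels = av_get_channel_layout_nb_channels(out_channel_layout);  // channel count for the layout
  int out_buffer_size = av_samples_get_buffer_size(NULL, out_channels, out_nb_samples, out_sample_fmt, 1);  // expected output size of one frame, in bytes

  out_buffer = (uint8_t*)av_malloc(MAX_AUDIO_FRAME_SIZE);
  memset(out_buffer, 0, MAX_AUDIO_FRAME_SIZE);

  wanted_spec.freq = out_sample_rate;      // sample rate
  wanted_spec.format = AUDIO_S16SYS;       // the format we will hand to SDL
  wanted_spec.channels = out_channels;     // number of channels
  wanted_spec.silence = 0;                 // value that represents silence
  wanted_spec.samples = out_nb_samples;    // audio buffer size in samples
  wanted_spec.callback = Fill_audio;       // callback function
  wanted_spec.userdata = l_pstACodecCtx;   // pointer SDL passes to the callback
  // Open the audio device
  if (SDL_OpenAudio(&wanted_spec, NULL) < 0)
  {
    printf("can't open audio: %s\n", SDL_GetError());
    return -1;
  }
  // Default input channel layout for the decoder's channel count
  int64_t in_channel_layout = av_get_default_channel_layout(l_pstACodecCtx->channels);
  // Set up the converter: output is PCM (S16 stereo), input is the decoder's format.
  // Passing nullptr makes swr_alloc_set_opts() allocate the context itself.
  au_convert_ctx = swr_alloc_set_opts(nullptr, out_channel_layout, out_sample_fmt, out_sample_rate,
        in_channel_layout, l_pstACodecCtx->sample_fmt, l_pstACodecCtx->sample_rate, 0, NULL);
  if (au_convert_ctx == nullptr || swr_init(au_convert_ctx) < 0)  // initialize
  {
    printf("can't initialize the resampler.\n");
    return -1;
  }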

  int index = 0;
  packet = av_packet_alloc();
  pFrame = av_frame_alloc();
  // swr_convert() counts samples per channel, not bytes
  const int out_frame_bytes = out_channels * av_get_bytes_per_sample(out_sample_fmt);
  const int max_out_samples = MAX_AUDIO_FRAME_SIZE / out_frame_bytes;
  // Read packets
  while (av_read_frame(l_pstFormatCtx, packet) >= 0)
  {
      if (packet->stream_index == l_s32AStreamSubscript)  // audio packet
      {
          // Send one packet of compressed audio to the decoder
          if (avcodec_send_packet(l_pstACodecCtx, packet) != 0)
          {
              cout << "[audio_decode_frame] avcodec_send_packet failed" << endl;
          }
          // One AVPacket may contain several frames, so keep receiving in a loop
          while (avcodec_receive_frame(l_pstACodecCtx, pFrame) >= 0)
          {
              // Wait until the callback has played the previous chunk
              // before overwriting out_buffer
              while (audio_len > 0)
                  SDL_Delay(1);  // 1 ms

              // Convert the sample format; the return value is the number of
              // samples written per channel
              int converted = swr_convert(au_convert_ctx, &out_buffer, max_out_samples,
                                          (const uint8_t**)pFrame->data, pFrame->nb_samples);
              if (converted < 0)
                  break;

              // Print the size of the packet this frame came from
              printf("index:%5d\t pts:%lld\t packet size:%d\n", index, (long long)packet->pts, packet->size);
              index++;

              audio_chunk = (Uint8*)out_buffer;         // PCM data
              audio_pos = audio_chunk;                  // current playback position
              audio_len = converted * out_frame_bytes;  // audio length in bytes
              SDL_PauseAudio(0);                        // start playback
          }
      }
      // Drop the references to the previous packet and frame
      av_packet_unref(packet);
      av_frame_unref(pFrame);
  }
  // Let the last chunk finish playing
  while (audio_len > 0)
      SDL_Delay(1);

  av_packet_free(&packet);
  av_frame_free(&pFrame);
  swr_free(&au_convert_ctx);  // release the converter

  SDL_CloseAudio();  // close the audio device
  SDL_Quit();

  av_free(out_buffer);
  avcodec_free_context(&l_pstACodecCtx);

  // Close the input file
  avformat_close_input(&l_pstFormatCtx);
  system("pause");  // Windows only: keeps the console window open

  return 0;
}
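
The part of the audio callback that is easiest to get wrong is the length clamp: the callback must copy the smaller of the requested length and the remaining data, otherwise it reads past the end of the buffer. Here is a minimal sketch of that logic without SDL. PcmQueue and fill_audio are hypothetical names for this illustration, and a plain memcpy stands in for SDL_MixAudio.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical playback state, mirroring audio_pos / audio_len in the example.
struct PcmQueue {
    const uint8_t* pos = nullptr;  // next byte to play
    int remaining = 0;             // bytes left to play
};

// Fills `stream` with `len` bytes: silence first, then as much queued PCM
// as is available. Returns the number of PCM bytes consumed.
int fill_audio(PcmQueue& q, uint8_t* stream, int len) {
    std::memset(stream, 0, static_cast<std::size_t>(len));
    if (q.remaining == 0)
        return 0;
    const int n = std::min(len, q.remaining);  // never read past the data
    std::memcpy(stream, q.pos, static_cast<std::size_t>(n));
    q.pos += n;
    q.remaining -= n;
    return n;
}
```

With 10 queued bytes and requests of 4 bytes each, successive calls consume 4, 4, 2 and then 0 bytes, and the unused tail of the third buffer stays silent.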

6. Summary

This article walked through how to use the FFmpeg library to demux, decode and resample an audio stream. Although these steps may seem simple, each function and interface reflects deliberate design choices.

“The most important property of a program is whether it accomplishes the intention of its user.” — C.A.R. Hoare

In our programming learning journey, understanding is an important step for us to move to a higher level. However, mastering new skills and ideas always requires time and persistence. From a psychological point of view, learning is often accompanied by constant trial and error and adjustment, which is like our brain gradually optimizing its "algorithm" for solving problems.

This is why when we encounter mistakes, we should view them as opportunities to learn and improve, not just as annoyances. By understanding and solving these problems, we can not only fix the current code, but also improve our programming skills and prevent making the same mistakes in future projects.

I encourage everyone to actively participate and continuously improve their programming skills. Whether you are a beginner or an experienced developer, I hope my blog will be helpful on your learning journey. If you find this article useful, you may wish to click to bookmark it, or leave your comments to share your insights and experiences. You are also welcome to make suggestions and questions about the content of my blog. Every like, comment, share and attention is the greatest support for me and the motivation for me to continue sharing and creating.


Read my CSDN homepage and unlock more exciting content: Bubble’s CSDN homepage

Origin blog.csdn.net/qq_21438461/article/details/132958278