7. Encoding PCM to AAC and decoding AAC to PCM with the libfdk_aac codec

Preface


Test environment:

  • FFmpeg 4.3.2, self-compiled
  • Windows
  • Qt 5.12

AAC (Advanced Audio Coding) is the successor to the MP3 format. It usually achieves higher sound quality than MP3 at the same bit rate, and it is the standard audio format for iPhone, iPod, iPad, and iTunes.

The improvements of AAC compared to MP3 include:

  • More sampling rate options: 8 kHz to 96 kHz, versus 16 kHz to 48 kHz for MP3
  • Higher channel limit: up to 48 channels, whereas MP3 supports up to two channels in MPEG-1 mode and up to 5.1 channels in MPEG-2 mode
  • Improved compression: higher quality at smaller file sizes
  • Improved decoding efficiency: less processing power is required to decode

AAC defines many specifications (profiles) to meet the needs of different scenarios:

  • MPEG-2 AAC LC: Low Complexity
  • MPEG-2 AAC Main: Main specification
  • MPEG-2 AAC SSR: Scalable Sample Rate
  • MPEG-4 AAC LC: Low Complexity
    • This is the specification commonly used for the audio track of MP4 files on today's mobile phones.
  • MPEG-4 AAC Main: Main specification
  • MPEG-4 AAC SSR: Scalable Sample Rate
  • MPEG-4 AAC LTP: Long Term Prediction
  • MPEG-4 AAC LD: Low Delay
  • MPEG-4 AAC HE: High Efficiency

Of these specifications, only LC and HE matter for this article.
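
The encoder listing later in this article selects a specification through FFmpeg's FF_PROFILE_AAC_* constants, so it may help to see how the names above map to numbers. The sketch below lists the MPEG-4 audio object types for some of the profiles mentioned. It assumes, from my reading of libavcodec/avcodec.h, that FFmpeg's profile constants equal the object type minus one. The names AacObjectType, toFfmpegProfile and isLcOrHe are made up for this illustration.

```cpp
#include <cassert>

// MPEG-4 Audio Object Type numbers for some AAC profiles.
enum class AacObjectType : int {
    Main = 1,   // AAC Main
    LC   = 2,   // AAC Low Complexity
    SSR  = 3,   // Scalable Sample Rate
    LTP  = 4,   // Long Term Prediction
    SBR  = 5,   // HE-AAC (LC plus spectral band replication)
    LD   = 23,  // Low Delay
    PS   = 29   // HE-AAC v2 (HE-AAC plus parametric stereo)
};

// Assumption: FF_PROFILE_AAC_* == object type - 1
// (e.g. FF_PROFILE_AAC_LOW == 1, FF_PROFILE_AAC_HE_V2 == 28).
constexpr int toFfmpegProfile(AacObjectType t) {
    return static_cast<int>(t) - 1;
}

// True for the two families this article focuses on: LC and HE.
constexpr bool isLcOrHe(AacObjectType t) {
    return t == AacObjectType::LC
        || t == AacObjectType::SBR
        || t == AacObjectType::PS;
}
```

Under that assumption, the FF_PROFILE_AAC_HE_V2 value used in the encoder later corresponds to object type 29.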


Converting between PCM and AAC requires an AAC codec. Several commonly used AAC codecs are listed below:

  • Nero AAC
    • Supports the LC/HE specifications
    • Development and maintenance have stopped
  • FFmpeg AAC
    • Supports the LC specification
    • FFmpeg's official built-in AAC codec, located in the libavcodec library
      • The codec name is aac
      • During development, the codec is looked up by this name
  • FAAC (Freeware Advanced Audio Coder)
    • Supports the LC specification
    • Can be integrated into FFmpeg's libavcodec
      • The codec name is libfaac
      • During development, the codec is looked up by this name, and the calls end up in the FAAC library
    • FFmpeg removed support for FAAC in 2016
  • Fraunhofer FDK AAC
    • Supports the LC/HE specifications
    • Generally regarded as the highest-quality AAC codec currently available
    • Can be integrated into FFmpeg's libavcodec
      • The codec name is libfdk_aac
      • During development, the codec is looked up by this name, and the calls end up in the FDK AAC library

Encoding quality ranking: Fraunhofer FDK AAC > FFmpeg AAC > FAAC.

libfdk_aac gives the best quality, but the prebuilt FFmpeg binaries available for download generally do not include it, so we have to compile FFmpeg ourselves.

You can view the AAC codec currently integrated by FFmpeg with the following command:

ffmpeg -codecs | findstr aac



The best approach is to compile the FFmpeg source code yourself with libfdk_aac integrated, although this is more troublesome on Windows.

Compiling the source requires a Unix-like environment (Linux, macOS, etc.), so it cannot be done directly on Windows by default. You therefore first use MSYS2 to provide a Linux-like environment on Windows, and then use MinGW to compile FFmpeg.

Reference: Compiling 64-bit FFmpeg from source with MSYS2 on Windows


After compiling the source code, update the Qt .pro file so that it points to the headers and libraries of the newly compiled FFmpeg build.

fdk-aac has certain format requirements for the PCM audio to be encoded:

  • The sample format must be 16-bit integer PCM
  • Supported sampling rates (Hz): 8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000, 64000, 88200, 96000
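
These two requirements can be checked before any data reaches the encoder. A minimal sketch using only the values listed above; the helper names are hypothetical and are not part of fdk-aac or FFmpeg.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Sampling rates listed above as supported by fdk-aac, in Hz.
constexpr std::array<int, 12> kFdkAacSampleRates = {
    8000, 11025, 12000, 16000, 22050, 24000,
    32000, 44100, 48000, 64000, 88200, 96000
};

// True when the rate appears in the supported list.
inline bool isSupportedSampleRate(int hz) {
    return std::find(kFdkAacSampleRates.begin(),
                     kFdkAacSampleRates.end(), hz)
           != kFdkAacSampleRates.end();
}

// True when the PCM description meets both requirements:
// 16-bit integer samples and a supported sampling rate.
inline bool isAcceptablePcm(int sampleRateHz, int bitsPerSample, bool isInteger) {
    return isInteger && bitsPerSample == 16
        && isSupportedSampleRate(sampleRateHz);
}
```

For example, 44100 Hz 16-bit integer PCM passes, while 32-bit float PCM at the same rate would first have to be converted.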

Command lines for encoding PCM and WAV files to AAC:

# pcm -> aac
ffmpeg -ar 44100 -ac 2 -f s16le -i in.pcm -c:a libfdk_aac out.aac
# -ar 44100 -ac 2 -f s16le : parameters describing the PCM input data
# -c:a : selects the audio encoder; c stands for codec and a for audio. Equivalent forms: -codec:a or -acodec
    
# wav -> aac
ffmpeg -i in.wav -c:a libfdk_aac out.aac   
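
To try the commands above without a recording at hand, a test file can be generated. The sketch below builds a sine tone in the layout those options describe: interleaved, signed 16-bit little-endian (s16le) samples. The function names, the 440 Hz tone and the half-scale amplitude are arbitrary choices for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Build `seconds` of a sine tone as interleaved s16le bytes.
// Every channel carries the same signal.
std::vector<unsigned char> makeSineS16le(int sampleRate, int channels,
                                         double seconds, double freqHz) {
    assert(sampleRate > 0 && channels > 0 && seconds >= 0.0);
    const double kPi = 3.14159265358979323846;
    const long frames = static_cast<long>(sampleRate * seconds);
    std::vector<unsigned char> bytes;
    bytes.reserve(static_cast<std::size_t>(frames) * channels * 2);
    for (long i = 0; i < frames; ++i) {
        const double v = std::sin(2.0 * kPi * freqHz * i / sampleRate);
        // Half of full scale, rounded to the nearest 16-bit value
        const auto s = static_cast<std::int16_t>(std::lround(v * 0.5 * 32767.0));
        const auto u = static_cast<std::uint16_t>(s);
        for (int c = 0; c < channels; ++c) {
            bytes.push_back(static_cast<unsigned char>(u & 0xFF));        // low byte first
            bytes.push_back(static_cast<unsigned char>((u >> 8) & 0xFF)); // then high byte
        }
    }
    return bytes;
}

// Write the raw bytes to a file; returns false on failure.
bool writePcmFile(const std::string &path,
                  const std::vector<unsigned char> &bytes) {
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    out.write(reinterpret_cast<const char *>(bytes.data()),
              static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out);
}
```

Calling writePcmFile("in.pcm", makeSineS16le(44100, 2, 1.0, 440.0)) would produce a one-second file of 176400 bytes that matches the -ar 44100 -ac 2 -f s16le options.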

By default the generated AAC file uses the LC specification. The AAC file is much smaller than the original PCM file.
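
The size difference follows from the bit rates. Uncompressed PCM has a fixed bit rate of sample rate × bits per sample × channels, while the AAC bit rate is whatever the encoder is asked to produce. A small worked example follows; the 32000 bit/s figure is the bit_rate set in the encoder listing later in this article, not necessarily what the commands above use by default, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Bit rate of uncompressed PCM, in bits per second.
constexpr std::int64_t pcmBitRate(int sampleRate, int bitsPerSample, int channels) {
    return static_cast<std::int64_t>(sampleRate) * bitsPerSample * channels;
}

// Payload size in bytes for a stream of the given bit rate and duration.
constexpr std::int64_t bytesForDuration(std::int64_t bitRate, int seconds) {
    return bitRate * seconds / 8;
}
```

For 44100 Hz, 16-bit stereo, pcmBitRate gives 1411200 bit/s, so one minute of PCM is 10584000 bytes, against about 240000 bytes of AAC payload at 32000 bit/s (ignoring header overhead).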


AAC files can also use the extensions .m4a and .mp4, although people now tend to think of .mp4 only as a video file.


First, encoding PCM to AAC.
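
The encoder code in this section is built on FFmpeg's send/receive pattern: a frame is sent to the codec, then packets are received until the codec asks for more input, and at the very end a null frame is sent to flush whatever the codec still holds. The sketch below imitates that control flow with a toy codec that has no FFmpeg dependency. ToyEncoder, Status, drain and encodeAll are invented for this illustration, and the "encoding" is just a multiplication.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

enum class Status { Ok, Again, Eof };

// Toy stand-in for a codec that holds one frame back,
// similar to a real encoder with internal delay.
class ToyEncoder {
public:
    // Pass nullptr to signal end of input (flush).
    void send(const int *frame) {
        if (frame) pending_.push_back(*frame);
        else flushing_ = true;
    }
    // Ok: a packet was produced. Again: more input is needed. Eof: fully drained.
    Status receive(int &packet) {
        const std::size_t keep = flushing_ ? 0 : 1;
        if (pending_.size() > keep) {
            packet = pending_.front() * 10;  // pretend to encode
            pending_.pop_front();
            return Status::Ok;
        }
        return flushing_ ? Status::Eof : Status::Again;
    }
private:
    std::deque<int> pending_;
    bool flushing_ = false;
};

// Take every packet that is currently available.
void drain(ToyEncoder &enc, std::vector<int> &out) {
    int packet = 0;
    while (enc.receive(packet) == Status::Ok) out.push_back(packet);
}

// Encode all frames; `flush` controls whether the final null frame is sent.
std::vector<int> encodeAll(const std::vector<int> &frames, bool flush) {
    ToyEncoder enc;
    std::vector<int> out;
    for (int f : frames) {
        enc.send(&f);
        drain(enc, out);
    }
    if (flush) {
        enc.send(nullptr);
        drain(enc, out);
    }
    return out;
}
```

With three input frames, the flushed run yields three packets and the unflushed run only two, which is why the final flush call should not be omitted.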

Complete code

AacEncodeThread.h

#ifndef AACENCODETHREAD_H
#define AACENCODETHREAD_H

#include <QFile>
#include <QObject>
#include <QThread>

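extern "C" {
#include <libavformat/avformat.h>
}

// Describes the PCM input handed to the encoder
typedef struct {
    const char *filename;      // path of the PCM file
    int sampleRate;            // e.g. 44100
    AVSampleFormat sampleFmt;  // e.g. AV_SAMPLE_FMT_S16
    int chLayout;              // e.g. AV_CH_LAYOUT_STEREO
} AudioEncodeSpec;

class AacEncodeThread : public QThread
{
    Q_OBJECT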
public:
    explicit AacEncodeThread(QObject *parent = nullptr);
    ~AacEncodeThread();

    static int check_sample_fmt(const AVCodec *codec,enum AVSampleFormat sample_fmt);
    static int encode(AVCodecContext *ctx,AVFrame *frame,AVPacket *pkt,QFile &outFile);
    static void aacEncode(AudioEncodeSpec &in,const char *outFilename);

signals:


    // QThread interface
protected:
    virtual void run() override;
};

#endif // AACENCODETHREAD_H

AacEncodeThread.cpp

#include "aacencodethread.h"

#include <QDebug>
#include <QFile>

extern "C" {
#include <libavcodec/avcodec.h>
#include <libavutil/avutil.h>
}

#define ERROR_BUF(ret) \
    char errbuf[1024]; \
    av_strerror(ret, errbuf, sizeof (errbuf));

AacEncodeThread::AacEncodeThread(QObject *parent) : QThread(parent)
{
    // When the thread finishes (finished signal), call deleteLater to free the object
    connect(this, &AacEncodeThread::finished,
            this, &AacEncodeThread::deleteLater);
}

AacEncodeThread::~AacEncodeThread()
{
    // Disconnect all signal connections
    disconnect();
    // Ask the thread to stop before the object is destroyed
    requestInterruption();
    // Exit safely and wait for the thread to finish
    quit();
    wait();
    qDebug() << this << "destroyed (memory released)";
}

// Check whether the encoder supports the given sample format
// Returns 1 if supported, 0 otherwise
int AacEncodeThread::check_sample_fmt(const AVCodec *codec,enum AVSampleFormat sample_fmt) {
    const enum AVSampleFormat *p = codec->sample_fmts;

    while (*p != AV_SAMPLE_FMT_NONE) {
//        qDebug() << av_get_sample_fmt_name(*p);
        if (*p == sample_fmt) return 1;
        p++;
    }
    return 0;
}

// Audio encoding
// Returns a negative value if an error occurred along the way
// Returns 0 if encoding completed normally
int AacEncodeThread::encode(AVCodecContext *ctx,
                  AVFrame *frame,
                  AVPacket *pkt,
                  QFile &outFile) {
    // Send the data to the encoder
    int ret = avcodec_send_frame(ctx, frame);
    if (ret < 0) {
        ERROR_BUF(ret);
        qDebug() << "avcodec_send_frame error" << errbuf;
        return ret;
    }

    // Keep pulling encoded data out of the encoder
    // while (ret >= 0)
    while (true) {
        ret = avcodec_receive_packet(ctx, pkt);
        if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) {
            // The encoder needs more input (or is fully flushed):
            // go back, read more data into the frame and send it again
            return 0;
        } else if (ret < 0) {
            // Any other error
            return ret;
        }

        // A packet of encoded data was received successfully
        // Write the encoded data to the file
        outFile.write((char *) pkt->data, pkt->size);

        // Release the resources held inside pkt
        av_packet_unref(pkt);
    }
}

void AacEncodeThread::aacEncode(AudioEncodeSpec &in, const char *outFilename)
{
    // Files
    QFile inFile(in.filename);
    QFile outFile(outFilename);

    // Return value of the FFmpeg calls
    int ret = 0;

    // Encoder
    AVCodec *codec = nullptr;

    // Encoding context
    AVCodecContext *ctx = nullptr;

    // Holds the data before encoding (PCM)
    AVFrame *frame = nullptr;

    // Holds the data after encoding (AAC)
    AVPacket *pkt = nullptr;

    // Look up the encoder
//    codec = avcodec_find_encoder(AV_CODEC_ID_AAC);
    codec = avcodec_find_encoder_by_name("libfdk_aac");
    if (!codec) {
        qDebug() << "encoder not found";
        return;
    }

    // libfdk_aac requires the input sample format to be 16-bit integer
    // Check the sample format of the input data
    if (!check_sample_fmt(codec, in.sampleFmt)) {
        qDebug() << "unsupported sample format"
                 << av_get_sample_fmt_name(in.sampleFmt);
        return;
    }

    // Create the encoding context
    ctx = avcodec_alloc_context3(codec);
    if (!ctx) {
        qDebug() << "avcodec_alloc_context3 error";
        return;
    }

    // Set the PCM parameters
    ctx->sample_rate = in.sampleRate;
    ctx->sample_fmt = in.sampleFmt;
    ctx->channel_layout = in.chLayout;
    // Bit rate
    ctx->bit_rate = 32000;
    // Specification (profile)
    ctx->profile = FF_PROFILE_AAC_HE_V2;

    // Open the encoder
//    AVDictionary *options = nullptr;
//    av_dict_set(&options, "vbr", "5", 0);
//    ret = avcodec_open2(ctx, codec, &options);
    ret = avcodec_open2(ctx, codec, nullptr);
    if (ret < 0) {
        ERROR_BUF(ret);
        qDebug() << "avcodec_open2 error" << errbuf;
        goto end;
    }

    // Create the AVFrame
    frame = av_frame_alloc();
    if (!frame) {
        qDebug() << "av_frame_alloc error";
        goto end;
    }

    // Number of samples per channel in the frame buffer (determined by ctx->frame_size)
    frame->nb_samples = ctx->frame_size;
    frame->format = ctx->sample_fmt;
    frame->channel_layout = ctx->channel_layout;

    // Allocate the buffer based on nb_samples, format and channel_layout
    ret = av_frame_get_buffer(frame, 0);
    if (ret < 0) {
        ERROR_BUF(ret);
        qDebug() << "av_frame_get_buffer error" << errbuf;
        goto end;
    }

    // Create the AVPacket
    pkt = av_packet_alloc();
    if (!pkt) {
        qDebug() << "av_packet_alloc error";
        goto end;
    }

    // Open the files
    if (!inFile.open(QFile::ReadOnly)) {
        qDebug() << "file open error" << in.filename;
        goto end;
    }
    if (!outFile.open(QFile::WriteOnly)) {
        qDebug() << "file open error" << outFilename;
        goto end;
    }

    // Read PCM data into the frame
    while ((ret = inFile.read((char *) frame->data[0],
                              frame->linesize[0])) > 0) {
        // The data read from the file did not fill the frame buffer
        if (ret < frame->linesize[0]) {
            int bytes = av_get_bytes_per_sample((AVSampleFormat) frame->format);
            int ch = av_get_channel_layout_nb_channels(frame->channel_layout);
            // Set the number of samples that are actually valid,
            // so that the encoder does not encode leftover data
            frame->nb_samples = ret / (bytes * ch);
        }

        // Encode
        if (encode(ctx, frame, pkt, outFile) < 0) {
            goto end;
        }
    }

    // Flush the encoder
    encode(ctx, nullptr, pkt, outFile);

end:
    // Close the files
    inFile.close();
    outFile.close();

    // Release resources
    av_frame_free(&frame);
    av_packet_free(&pkt);
    avcodec_free_context(&ctx);

    qDebug() << "encoding finished";
}

void AacEncodeThread::run()
{
    AudioEncodeSpec in;
    in.filename = "E:/media/test.pcm";
    in.sampleRate = 44100;
    in.sampleFmt = AV_SAMPLE_FMT_S16;
    in.chLayout = AV_CH_LAYOUT_STEREO;

    aacEncode(in, "E:/media/test.aac");
}
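
The read loop in aacEncode shrinks frame->nb_samples when the last read returns fewer bytes than the frame buffer holds. That arithmetic can be sketched on its own. The frame size of 1024 samples is an assumption for illustration only (the real value comes from ctx->frame_size after avcodec_open2), and the function names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Bytes needed for one full frame of interleaved PCM.
constexpr int frameBytes(int samplesPerFrame, int bytesPerSample, int channels) {
    return samplesPerFrame * bytesPerSample * channels;
}

// Number of complete samples (per channel) contained in `bytesRead`.
constexpr int validSamples(int bytesRead, int bytesPerSample, int channels) {
    return bytesRead / (bytesPerSample * channels);
}

// For a file of `fileBytes`, list the nb_samples value of each read.
// Trailing bytes that do not form a complete sample are dropped.
std::vector<int> samplesPerRead(long fileBytes, int samplesPerFrame,
                                int bytesPerSample, int channels) {
    assert(fileBytes >= 0 && samplesPerFrame > 0
           && bytesPerSample > 0 && channels > 0);
    const int full = frameBytes(samplesPerFrame, bytesPerSample, channels);
    std::vector<int> plan;
    long remaining = fileBytes;
    while (remaining > 0) {
        const int got = remaining < full ? static_cast<int>(remaining) : full;
        const int samples = validSamples(got, bytesPerSample, channels);
        if (samples > 0) plan.push_back(samples);
        remaining -= got;
    }
    return plan;
}
```

For a 10000-byte file of 16-bit stereo PCM and 1024-sample frames, each full read is 4096 bytes and the final read of 1808 bytes carries 452 valid samples.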

Thread call:

void MainWindow::on_pushButton_aac_encode_clicked()
{
    m_pAacEncodeThread = new AacEncodeThread(this);
    m_pAacEncodeThread->start();
}

Note: the following variable is declared in advance in the .h file:

AacEncodeThread *m_pAacEncodeThread=nullptr;


Next, decoding AAC to PCM.
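
The decoder code in this section does not hand the file to the decoder directly. It first passes the bytes through av_parser_parse2, which cuts the stream into individual frames. Assuming the .aac file is an ADTS stream (which, as far as I know, is what FFmpeg's libfdk_aac wrapper writes when no global header is requested), each frame begins with a 7-byte header. The sketch below decodes the main fields of that header. AdtsHeader and parseAdtsHeader are hypothetical names, and this is not the parser FFmpeg uses internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct AdtsHeader {
    int objectType;     // MPEG-4 audio object type (2 = AAC LC)
    int sampleRate;     // in Hz, 0 if the index is reserved
    int channelConfig;  // 1 = mono, 2 = stereo, ...
    int frameLength;    // header plus payload, in bytes
    bool hasCrc;        // true when a CRC follows the header
};

// Parse the fixed 7-byte ADTS header at `p`. Returns false if it is not valid.
bool parseAdtsHeader(const std::uint8_t *p, std::size_t len, AdtsHeader &h) {
    static const int kRates[16] = {96000, 88200, 64000, 48000, 44100, 32000,
                                   24000, 22050, 16000, 12000, 11025, 8000,
                                   7350, 0, 0, 0};
    if (!p || len < 7) return false;
    // 12-bit syncword: all ones
    if (p[0] != 0xFF || (p[1] & 0xF0) != 0xF0) return false;
    // The 2-bit layer field must be zero
    if ((p[1] & 0x06) != 0) return false;
    h.hasCrc = (p[1] & 0x01) == 0;  // protection_absent == 0 means CRC present
    h.objectType = ((p[2] >> 6) & 0x03) + 1;
    h.sampleRate = kRates[(p[2] >> 2) & 0x0F];
    h.channelConfig = ((p[2] & 0x01) << 2) | ((p[3] >> 6) & 0x03);
    // 13-bit frame length spread over three bytes
    h.frameLength = ((p[3] & 0x03) << 11) | (p[4] << 3) | ((p[5] >> 5) & 0x07);
    return h.frameLength >= 7;
}
```

The header bytes FF F1 50 80 2E 7F FC, for example, describe an AAC LC frame at 44100 Hz, stereo, 371 bytes long. For HE-AAC streams the header can describe only the LC core, so its values may differ from the sample rate and channel count the decoder reports.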

Complete code

AacDecodeThread.h

#ifndef AACDECODETHREAD_H
#define AACDECODETHREAD_H

#include <QFile>
#include <QObject>
#include <QThread>

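extern "C" {
#include <libavformat/avformat.h>
}

// Describes the PCM output produced by the decoder
typedef struct {
    const char *filename;      // path of the PCM file to write
    int sampleRate;            // filled in after decoding
    AVSampleFormat sampleFmt;  // filled in after decoding
    int chLayout;              // filled in after decoding
} AudioDecodeSpec;

class AacDecodeThread : public QThread
{
    Q_OBJECT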
public:
    explicit AacDecodeThread(QObject *parent = nullptr);
    ~AacDecodeThread();

    static int decode(AVCodecContext *ctx,
                      AVPacket *pkt,
                      AVFrame *frame,
                      QFile &outFile);
    static void aacDecode(const char *inFilename,AudioDecodeSpec &out);

signals:


    // QThread interface
protected:
    virtual void run() override;
};

#endif // AACDECODETHREAD_H

AacDecodeThread.cpp

#include "aacdecodethread.h"

#include <QDebug>

extern "C" {
#include <libavcodec/avcodec.h>
#include <libavutil/avutil.h>
}

#define ERROR_BUF(ret) \
    char errbuf[1024]; \
    av_strerror(ret, errbuf, sizeof (errbuf));

// Size of the input buffer
#define IN_DATA_SIZE 20480
// Threshold below which the input file should be read again (not used in this version)
#define REFILL_THRESH 4096


AacDecodeThread::AacDecodeThread(QObject *parent) : QThread(parent)
{
    // When the thread finishes (finished signal), call deleteLater to free the object
    connect(this, &AacDecodeThread::finished,
            this, &AacDecodeThread::deleteLater);
}

AacDecodeThread::~AacDecodeThread()
{
    // Disconnect all signal connections
    disconnect();
    // Ask the thread to stop before the object is destroyed
    requestInterruption();
    // Exit safely and wait for the thread to finish
    quit();
    wait();
    qDebug() << this << "destroyed (memory released)";
}

// Audio decoding
// Returns a negative value on error, 0 when decoding completed normally
int AacDecodeThread::decode(AVCodecContext *ctx,
                  AVPacket *pkt,
                  AVFrame *frame,
                  QFile &outFile) {
    // Send the compressed data to the decoder
    int ret = avcodec_send_packet(ctx, pkt);
    if (ret < 0) {
        ERROR_BUF(ret);
        qDebug() << "avcodec_send_packet error" << errbuf;
        return ret;
    }

    while (true) {
        // Fetch the decoded data
        ret = avcodec_receive_frame(ctx, frame);
        if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) {
            return 0;
        } else if (ret < 0) {
            ERROR_BUF(ret);
            qDebug() << "avcodec_receive_frame error" << errbuf;
            return ret;
        }

//        for (int i = 0; i < frame->channels; i++) {
//            frame->data[i];
//        }

        // Write the decoded data to the file
        // (only data[0] is written, which assumes packed, i.e. interleaved, samples)
        outFile.write((char *) frame->data[0], frame->linesize[0]);
    }
}

void AacDecodeThread::aacDecode(const char *inFilename, AudioDecodeSpec &out)
{
    // Return value of the FFmpeg calls
    int ret = 0;

    // Holds the data read from the input file (AAC)
    // The extra AV_INPUT_BUFFER_PADDING_SIZE bytes guard against optimized readers
    // that read too much at once and run past the end of the buffer
    char inDataArray[IN_DATA_SIZE + AV_INPUT_BUFFER_PADDING_SIZE];
    char *inData = inDataArray;

    // Number of bytes read from the input file each time (AAC)
    int inLen;
    // Whether the end of the input file has been reached
    int inEnd = 0;

    // Files
    QFile inFile(inFilename);
    QFile outFile(out.filename);

    // Decoder
    AVCodec *codec = nullptr;
    // Decoding context
    AVCodecContext *ctx = nullptr;
    // Parser context
    AVCodecParserContext *parserCtx = nullptr;

    // Holds the data before decoding (AAC)
    AVPacket *pkt = nullptr;
    // Holds the data after decoding (PCM)
    AVFrame *frame = nullptr;

    // Look up the decoder
    codec = avcodec_find_decoder_by_name("libfdk_aac");
    if (!codec) {
        qDebug() << "decoder not found";
        return;
    }

    // Initialize the parser context
    parserCtx = av_parser_init(codec->id);
    if (!parserCtx) {
        qDebug() << "av_parser_init error";
        return;
    }

    // Create the decoding context
    ctx = avcodec_alloc_context3(codec);
    if (!ctx) {
        qDebug() << "avcodec_alloc_context3 error";
        goto end;
    }

    // Create the AVPacket
    pkt = av_packet_alloc();
    if (!pkt) {
        qDebug() << "av_packet_alloc error";
        goto end;
    }

    // Create the AVFrame
    frame = av_frame_alloc();
    if (!frame) {
        qDebug() << "av_frame_alloc error";
        goto end;
    }

    // Open the decoder
    ret = avcodec_open2(ctx, codec, nullptr);
    if (ret < 0) {
        ERROR_BUF(ret);
        qDebug() << "avcodec_open2 error" << errbuf;
        goto end;
    }

    // Open the files
    if (!inFile.open(QFile::ReadOnly)) {
        qDebug() << "file open error:" << inFilename;
        goto end;
    }
    if (!outFile.open(QFile::WriteOnly)) {
        qDebug() << "file open error:" << out.filename;
        goto end;
    }

    while ((inLen = inFile.read(inDataArray, IN_DATA_SIZE)) > 0) {
        inData = inDataArray;

        while (inLen > 0) {
            // Run the data through the parser
            // The core logic called internally is ff_aac_ac3_parse
            ret = av_parser_parse2(parserCtx, ctx,
                                   &pkt->data, &pkt->size,
                                   (uint8_t *) inData, inLen,
                                   AV_NOPTS_VALUE, AV_NOPTS_VALUE, 0);

            if (ret < 0) {
                ERROR_BUF(ret);
                qDebug() << "av_parser_parse2 error" << errbuf;
                goto end;
            }

            // Skip the data that has already been parsed
            inData += ret;
            // Subtract the size of the data that has already been parsed
            inLen -= ret;

            // Decode
            if (pkt->size > 0 && decode(ctx, pkt, frame, outFile) < 0) {
                goto end;
            }
        }
    }
    // Flush the decoder
    decode(ctx, nullptr, frame, outFile);

    // Fill in the output parameters
    out.sampleRate = ctx->sample_rate;
    out.sampleFmt = ctx->sample_fmt;
    out.chLayout = ctx->channel_layout;

end:
    inFile.close();
    outFile.close();
    av_packet_free(&pkt);
    av_frame_free(&frame);
    av_parser_close(parserCtx);
    avcodec_free_context(&ctx);
}

void AacDecodeThread::run()
{
    // Zero-initialize so the fields are defined even if decoding fails early
    AudioDecodeSpec out = {};
    out.filename = "E:/media/test.pcm";

    aacDecode("E:/media/test.aac", out);

    qDebug() << "sample rate:" << out.sampleRate;
    qDebug() << "sample format:" << av_get_sample_fmt_name(out.sampleFmt);
    qDebug() << "channels:" << av_get_channel_layout_nb_channels(out.chLayout);
}
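
The decode function above writes only frame->data[0]. That is sufficient when the decoder produces packed (interleaved) samples, which, as far as I know, is the case for libfdk_aac with its 16-bit output. A decoder that produces planar output, such as FFmpeg's native aac decoder (float planar, to my knowledge), stores each channel in its own data[i] plane, which is what the commented-out loop hints at. In that case the planes have to be interleaved before being written as ordinary PCM. A minimal sketch of that step on plain vectors; the interleave helper is hypothetical and independent of FFmpeg.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Interleave planar channels:
// {L0, L1, ...}, {R0, R1, ...} -> {L0, R0, L1, R1, ...}
template <typename Sample>
std::vector<Sample> interleave(const std::vector<std::vector<Sample>> &planes) {
    std::vector<Sample> out;
    if (planes.empty()) return out;
    const std::size_t n = planes[0].size();
    // Every plane must hold the same number of samples
    for (const auto &p : planes) {
        assert(p.size() == n);
        (void) p;
    }
    out.reserve(n * planes.size());
    for (std::size_t i = 0; i < n; ++i) {
        for (const auto &p : planes) {
            out.push_back(p[i]);
        }
    }
    return out;
}
```

With real frames, each plane would be filled from frame->data[ch] using frame->nb_samples samples per channel.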

Note: This article is a personal record. Novices may encounter various problems if they copy it, so please use it with caution.


Origin blog.csdn.net/weixin_44092851/article/details/134552532