FFmpeg audio resampling: the libavcodec and libswresample methods

Disclaimer: This is an original blog post; do not reproduce without the author's permission. https://blog.csdn.net/myvest/article/details/89442000


Many players fix their output to a single format (e.g., 44100 Hz, stereo, signed 16-bit), because most devices support it. Audio coming from the many different input sources must therefore be resampled to that format.

1、libavcodec

libavcodec provides an older set of resampling interfaces, typically used together with the FFmpeg 2-era decoding interface avcodec_decode_audio3 to convert decoded data to a specified format. Newer versions deprecate these interfaces.

Interface Description:

The relevant functions are defined in the header, and the interface is straightforward:
1) av_audio_resample_init() initializes the resampling parameters. The first six parameters are self-explanatory:

  • @param output_channels  number of channels after resampling
  • @param input_channels   number of channels in the source
  • @param output_rate      sample rate after resampling
  • @param input_rate       original sample rate
  • @param sample_fmt_out   sample format after resampling
  • @param sample_fmt_in    original sample format

The last four parameters can basically be left at their usual defaults (I have not looked into what these values do), namely: 16, 10, 0, 1.

2) audio_resample() performs the resampling. The first three parameters are the context / output data / input data. Note that the last parameter is the number of source samples, not a byte count. The return value is the number of samples after resampling.

3) audio_resample_close() frees the resources allocated for resampling.

typedef struct ReSampleContext ReSampleContext;
/**
 *  Initialize audio resampling context.
 *
 * @param output_channels  number of output channels
 * @param input_channels   number of input channels
 * @param output_rate      output sample rate
 * @param input_rate       input sample rate
 * @param sample_fmt_out   requested output sample format
 * @param sample_fmt_in    input sample format
 * @param filter_length    length of each FIR filter in the filterbank relative to the cutoff frequency
 * @param log2_phase_count log2 of the number of entries in the polyphase filterbank
 * @param linear           if 1 then the used FIR filter will be linearly interpolated
                           between the 2 closest, if 0 the closest will be used
 * @param cutoff           cutoff frequency, 1.0 corresponds to half the output sampling rate
 * @return allocated ReSampleContext, NULL if error occurred
 */
attribute_deprecated
ReSampleContext *av_audio_resample_init(int output_channels, int input_channels,
                                        int output_rate, int input_rate,
                                        enum AVSampleFormat sample_fmt_out,
                                        enum AVSampleFormat sample_fmt_in,
                                        int filter_length, int log2_phase_count,
                                        int linear, double cutoff);

attribute_deprecated
int audio_resample(ReSampleContext *s, short *output, short *input, int nb_samples);

/**
 * Free resample context.
 *
 * @param s a non-NULL pointer to a resample context previously
 *          created with av_audio_resample_init()
 */
attribute_deprecated
void audio_resample_close(ReSampleContext *s);

Sample code:

On FFmpeg 3.2 this interface is actually awkward to use; after all, it was designed to pair with the old audio decoding interface avcodec_decode_audio3. With the newer decoding interfaces (avcodec_decode_audio4, or the still newer send/receive API), decoded audio is stored in the data array of an AVFrame, and for planar formats each channel sits in its own plane: data[0], data[1], data[2], ... That is a problem for resampling: the per-channel data would first have to be interleaved back together, which is tedious. So in the example below, when the data is planar, we resample only one of its channels.
Note that the sample format of that single planar channel must be converted to the corresponding non-planar (packed) format, otherwise the interface fails to convert.

(The code is only a simple example; much of the error handling is omitted.)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifdef __cplusplus
extern "C"
{
#endif
#define __STDC_CONSTANT_MACROS
#ifdef _STDINT_H
#undef _STDINT_H
#endif
#include <stdint.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#ifdef __cplusplus
};
#endif
#define MAX_AUDIO_FRAME_SIZE 192000 //48khz 16bit audio 2 channels

int main(int argc, char **argv){
    if(argc < 2){
        return -1;
    }
    const char* in_file = argv[1];

    AVFormatContext *fctx = NULL;
    AVCodecContext *cctx = NULL;
    AVCodec *acodec = NULL;
	
    FILE *audio_dst_file1 = fopen("./before_resample.pcm", "wb");
    FILE *audio_dst_file2 = fopen("./after_resample.pcm", "wb");

    av_register_all();
    avformat_open_input(&fctx, in_file, NULL, NULL);
    avformat_find_stream_info(fctx, NULL);
    //get audio index
    int aidx = av_find_best_stream(fctx, AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0);
    printf("get aidx[%d]!!!\n",aidx);
    //open audio codec
    AVCodecParameters *codecpar = fctx->streams[aidx]->codecpar;
    acodec = avcodec_find_decoder(codecpar->codec_id);
    cctx = avcodec_alloc_context3(acodec);
    avcodec_parameters_to_context(cctx, codecpar);
    avcodec_open2(cctx, acodec, NULL);

    //init resample
    ReSampleContext * resample_ctx = NULL;
    int output_channels = 2;
    int output_rate = 48000;
    int input_channels = cctx->channels;
    int input_rate = cctx->sample_rate;
    int input_sample_fmt = cctx->sample_fmt;
    if(av_sample_fmt_is_planar((AVSampleFormat)input_sample_fmt)){//if planar, we just use one channel
        input_sample_fmt = (int)av_get_packed_sample_fmt((AVSampleFormat)input_sample_fmt);//map planar fmt to its packed equivalent
        input_channels = 1;
    }
    AVSampleFormat output_sample_fmt = AV_SAMPLE_FMT_S16;
    printf("input_channels[%d=>%d],input_rate[%d=>%d],input_sample_fmt[%d=>%d]\n",
                    cctx->channels,input_channels,cctx->sample_rate,input_rate,cctx->sample_fmt,input_sample_fmt);
    resample_ctx = av_audio_resample_init(output_channels, input_channels, output_rate, input_rate, 
  											output_sample_fmt,(AVSampleFormat)input_sample_fmt,16, 10, 0, 1);
    if(!resample_ctx){
    	printf("av_audio_resample_init fail!!!\n");
    	return -1;
    }
	
    AVPacket *pkt =av_packet_alloc();
    AVFrame *frame = av_frame_alloc();
	
    short *out_buffer=(short *)av_malloc(MAX_AUDIO_FRAME_SIZE);
    int size = 0;
    
    while(av_read_frame(fctx,pkt) == 0){//DEMUX
        if(pkt->stream_index == aidx){
            avcodec_send_packet(cctx, pkt);
            while(1){
            	int ret = avcodec_receive_frame(cctx, frame);
            	if(ret != 0){
                    break;
            	}else{
                    //before resample
                    size = frame->nb_samples * av_get_bytes_per_sample((AVSampleFormat)frame->format);
                    if(frame->data[0] != NULL){
                        fwrite(frame->data[0], 1, size, audio_dst_file1);
                        memset(out_buffer, 0x00, MAX_AUDIO_FRAME_SIZE);//sizeof(out_buffer) would only be the pointer size
                        size = audio_resample(resample_ctx, out_buffer, (short *)frame->data[0], frame->nb_samples);
                        size = size* av_get_bytes_per_sample(output_sample_fmt) * output_channels ;//samples * byte * channels
                        if(size > 0){
                            fwrite(out_buffer, 1, size, audio_dst_file2);
                        }
                    }
            	}
            	av_frame_unref(frame);
            }
        }
        else{
            //printf("not audio frame!!!\n");
            av_packet_unref(pkt);
            continue;
        }
        av_packet_unref(pkt);
    }

    //close
    audio_resample_close(resample_ctx);
    av_packet_free(&pkt);
    av_frame_free(&frame);
    avcodec_close(cctx);
    avformat_close_input(&fctx);
    av_free(out_buffer);
    fclose(audio_dst_file1);
    fclose(audio_dst_file2);

    return 0;
}

Resampling results: the example above saves the PCM both before and after resampling.
We use a source with a sample rate of 48000 Hz and sample format AV_SAMPLE_FMT_FLTP, and take one of its channels.
PCM before resampling
Following the sample code, we resample to a rate of 48000 Hz (48 kHz is the usual DVD sample rate), stereo, with sample type AV_SAMPLE_FMT_S16, i.e., into packed stereo.
In this example the channel count and sample format change while the sample rate stays the same (you can of course also change the rate, e.g., to 44100, to verify that the resampled rate is correct).
Data after resampling

2、libswresample

libswresample interface provides a more convenient method of resampling.

Interface Description:

I will only cover a few important functions; for the rest, refer to the header libswresample/swresample.h.
1) swr_alloc_set_opts() allocates the resampling context and sets the related parameters.

  • @param s resampling context; if NULL, the function allocates one itself
  • @param out_ch_layout output channel layout
  • @param out_sample_fmt output sample format
  • @param out_sample_rate output sample rate
  • @param in_ch_layout source channel layout
  • @param in_sample_fmt source sample format
  • @param in_sample_rate source sample rate
/**
 * Allocate SwrContext if needed and set/reset common parameters.
 *
 * @param s               existing Swr context if available, or NULL if not
 * @param out_ch_layout   output channel layout (AV_CH_LAYOUT_*)
 * @param out_sample_fmt  output sample format (AV_SAMPLE_FMT_*).
 * @param out_sample_rate output sample rate (frequency in Hz)
 * @param in_ch_layout    input channel layout (AV_CH_LAYOUT_*)
 * @param in_sample_fmt   input sample format (AV_SAMPLE_FMT_*).
 * @param in_sample_rate  input sample rate (frequency in Hz)
 * @param log_offset      logging level offset
 * @param log_ctx         parent logging context, can be NULL
 *
 * @see swr_init(), swr_free()
 * @return NULL on error, allocated context otherwise
 */
struct SwrContext *swr_alloc_set_opts(struct SwrContext *s,
                                      int64_t out_ch_layout, enum AVSampleFormat out_sample_fmt, int out_sample_rate,
                                      int64_t  in_ch_layout, enum AVSampleFormat  in_sample_fmt, int  in_sample_rate,
                                      int log_offset, void *log_ctx);

2) int swr_init(struct SwrContext *s); // initializes the context.

3) void swr_free(struct SwrContext **s); // frees the context.

4) int swr_convert() processes one frame of audio at a time, resampling the audio frame by frame.

  • @param s the resampling context
  • @param out the resampled output data
  • @param out_count the output capacity, in samples per channel (note: not bytes)
  • @param in the source data before resampling
  • @param in_count the number of input samples per channel
/** Convert audio.
 *
 * in and in_count can be set to 0 to flush the last few samples out at the
 * end.
 *
 * If more input is provided than output space, then the input will be buffered.
 * You can avoid this buffering by using swr_get_out_samples() to retrieve an
 * upper bound on the required number of output samples for the given number of
 * input samples. Conversion will run directly without copying whenever possible.
 *
 * @param s         allocated Swr context, with parameters set
 * @param out       output buffers, only the first one need be set in case of packed audio
 * @param out_count amount of space available for output in samples per channel
 * @param in        input buffers, only the first one need to be set in case of packed audio
 * @param in_count  number of input samples available in one channel
 *
 * @return number of samples output per channel, negative value on error
 */
int swr_convert(struct SwrContext *s, uint8_t **out, int out_count,
                                const uint8_t **in , int in_count);

Sample code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifdef __cplusplus
extern "C"
{
#endif
#define __STDC_CONSTANT_MACROS
#ifdef _STDINT_H
#undef _STDINT_H
#endif
#include <stdint.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswresample/swresample.h>
#ifdef __cplusplus
};
#endif
#define MAX_AUDIO_FRAME_SIZE 192000 //48khz 16bit audio 2 channels

int main(int argc, char **argv){
    if(argc < 2){
        return -1;
    }
    const char* in_file = argv[1];

    AVFormatContext *fctx = NULL;
    AVCodecContext *cctx = NULL;
    AVCodec *acodec = NULL;
	
    FILE *audio_dst_file1 = fopen("./before_resample.pcm", "wb");
    FILE *audio_dst_file2 = fopen("./after_resample.pcm", "wb");

    av_register_all();
    avformat_open_input(&fctx, in_file, NULL, NULL);
    avformat_find_stream_info(fctx, NULL);
    //get audio index
    int aidx = av_find_best_stream(fctx, AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0);
    printf("get aidx[%d]!!!\n",aidx);
    //open audio codec
    AVCodecParameters *codecpar = fctx->streams[aidx]->codecpar;
    acodec = avcodec_find_decoder(codecpar->codec_id);
    cctx = avcodec_alloc_context3(acodec);
    avcodec_parameters_to_context(cctx, codecpar);
    avcodec_open2(cctx, acodec, NULL);

    //init resample
    int output_channels = 2;
    int output_rate = 48000;
    int input_channels = cctx->channels;
    int input_rate = cctx->sample_rate;
    AVSampleFormat input_sample_fmt = cctx->sample_fmt;
    AVSampleFormat output_sample_fmt = AV_SAMPLE_FMT_S16;
    printf("channels[%d=>%d],rate[%d=>%d],sample_fmt[%d=>%d]\n",
        input_channels,output_channels,input_rate,output_rate,input_sample_fmt,output_sample_fmt);
    
    SwrContext* resample_ctx = NULL;
    resample_ctx = swr_alloc_set_opts(resample_ctx, av_get_default_channel_layout(output_channels),output_sample_fmt,output_rate,
                            av_get_default_channel_layout(input_channels),input_sample_fmt, input_rate,0,NULL);
    if(!resample_ctx){
        printf("av_audio_resample_init fail!!!\n");
        return -1;
    }
    swr_init(resample_ctx);
    
    AVPacket *pkt =av_packet_alloc();
    AVFrame *frame = av_frame_alloc();
    int size = 0;
    uint8_t* out_buffer = (uint8_t*)av_malloc(MAX_AUDIO_FRAME_SIZE);
    
    while(av_read_frame(fctx,pkt) == 0){//DEMUX
        if(pkt->stream_index == aidx){
            avcodec_send_packet(cctx, pkt);
            while(1){
            	int ret = avcodec_receive_frame(cctx, frame);
            	if(ret != 0){
                    break;
            	}else{
                    //before resample
                    size = frame->nb_samples * av_get_bytes_per_sample((AVSampleFormat)frame->format);
                    if(frame->data[0] != NULL){
                        fwrite(frame->data[0], 1, size, audio_dst_file1);
                    }
                    //resample: pass the full per-channel capacity of out_buffer, not frame->nb_samples,
                    //otherwise output is truncated when upsampling
                    memset(out_buffer, 0x00, MAX_AUDIO_FRAME_SIZE);//sizeof(out_buffer) would only be the pointer size
                    int out_cap = MAX_AUDIO_FRAME_SIZE / (output_channels * av_get_bytes_per_sample(output_sample_fmt));
                    int out_samples = swr_convert(resample_ctx, &out_buffer, out_cap, (const uint8_t **)frame->data, frame->nb_samples);
                    if(out_samples > 0){
                        size = av_samples_get_buffer_size(NULL, output_channels, out_samples, output_sample_fmt, 1);//out_samples*output_channels*av_get_bytes_per_sample(output_sample_fmt)
                        fwrite(out_buffer, 1, size, audio_dst_file2);
                    }
            	}
            	av_frame_unref(frame);
            }
        }
        else{
            //printf("not audio frame!!!\n");
            av_packet_unref(pkt);
            continue;
        }
        av_packet_unref(pkt);
    }

    //close
    swr_free(&resample_ctx);
    av_packet_free(&pkt);
    av_frame_free(&frame);
    avcodec_close(cctx);
    avformat_close_input(&fctx);
    av_free(out_buffer);
    fclose(audio_dst_file1);
    fclose(audio_dst_file2);

    return 0;
}

Compared with the libavcodec method, libswresample conveniently resamples planar data directly and outputs it with the channel count and sample type we want.
In particular, for planar sources whose left and right channels differ, resampling with libavcodec would additionally require merging the per-channel data; libswresample spares us that trouble.
Below, a source with different left and right channels was resampled with libswresample to a 48000 Hz, stereo, AV_SAMPLE_FMT_S16 output. The figure shows that the left and right channels are clearly different.
If we had instead followed the first example and resampled only one channel, outputting it as both left and right, many sources would lose some of their stereo detail.
Data after resampling
