Many players fix their output to a single format (e.g., 44100 Hz, two channels, 16-bit signed), because most devices support it. Input sources, however, come in many different formats, so in this case the audio has to be resampled.
1、libavcodec
libavcodec provides an older resampling interface. It was typically used together with the FFmpeg 2 decoding interface avcodec_decode_audio3 to convert the decoded data to a specified format. Newer versions do not recommend using this interface.
Interface Description:
The header file declares the relevant functions, and the interface is straightforward:
1) av_audio_resample_init() initializes the resampling parameters. The first six parameters are easy to understand:
- @param output_channels number of channels after resampling
- @param input_channels number of channels in the source
- @param output_rate sampling rate after resampling
- @param input_rate sampling rate of the source
- @param sample_fmt_out sample format after resampling
- @param sample_fmt_in sample format of the source
The last four parameters can generally be left at their default values (I have not explored how these values are used): 16, 10, 0, 1.
2) audio_resample() performs the resampling. Its first three parameters are the context, the output data, and the input data. Note that the last parameter is the number of input samples, not the number of bytes. The return value is the number of samples after resampling.
3) audio_resample_close() frees the resources allocated for resampling.
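Because audio_resample() counts samples rather than bytes, it can help to keep the two conversions explicit. The helpers below are a minimal sketch and not part of FFmpeg: the names samples_to_bytes and bytes_to_samples are hypothetical, and they assume packed (non-planar) data where one "sample" means one sample for every channel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes occupied by nb_samples samples (per channel)
 * of packed audio. bytes_per_sample is 2 for 16-bit signed PCM. */
static size_t samples_to_bytes(size_t nb_samples, int channels, int bytes_per_sample)
{
    assert(channels > 0 && bytes_per_sample > 0);
    return nb_samples * (size_t)channels * (size_t)bytes_per_sample;
}

/* Hypothetical helper: the inverse conversion. Any trailing bytes that do
 * not form a complete sample for every channel are ignored. */
static size_t bytes_to_samples(size_t nb_bytes, int channels, int bytes_per_sample)
{
    assert(channels > 0 && bytes_per_sample > 0);
    return nb_bytes / ((size_t)channels * (size_t)bytes_per_sample);
}
```

For example, 1024 samples of two-channel 16-bit audio occupy 4096 bytes, and a 192000-byte buffer holds 48000 such samples, which is one second at 48 kHz.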
typedef struct ReSampleContext ReSampleContext;
/**
* Initialize audio resampling context.
*
* @param output_channels number of output channels
* @param input_channels number of input channels
* @param output_rate output sample rate
* @param input_rate input sample rate
* @param sample_fmt_out requested output sample format
* @param sample_fmt_in input sample format
* @param filter_length length of each FIR filter in the filterbank relative to the cutoff frequency
* @param log2_phase_count log2 of the number of entries in the polyphase filterbank
* @param linear if 1 then the used FIR filter will be linearly interpolated
between the 2 closest, if 0 the closest will be used
* @param cutoff cutoff frequency, 1.0 corresponds to half the output sampling rate
* @return allocated ReSampleContext, NULL if error occurred
*/
attribute_deprecated
ReSampleContext *av_audio_resample_init(int output_channels, int input_channels,
int output_rate, int input_rate,
enum AVSampleFormat sample_fmt_out,
enum AVSampleFormat sample_fmt_in,
int filter_length, int log2_phase_count,
int linear, double cutoff);
attribute_deprecated
int audio_resample(ReSampleContext *s, short *output, short *input, int nb_samples);
/**
* Free resample context.
*
* @param s a non-NULL pointer to a resample context previously
* created with av_audio_resample_init()
*/
attribute_deprecated
void audio_resample_close(ReSampleContext *s);
Sample code:
On FFmpeg 3.2 this interface is actually awkward to use, because it was designed to work with the old audio decoding interface avcodec_decode_audio3. The newer decoding interfaces (avcodec_decode_audio4, or the more recent send/receive interface) store the decoded audio in the data field of the AVFrame structure. For planar formats, each channel is stored in its own array: data[0], data[1], data[2], and so on. This is a problem when resampling, because the per-channel data has to be merged back together first, which is tedious. So in the following example, if the data is planar, we take only one channel and resample that.
Note that when a single channel of planar data is passed in, the sample format must be converted to the corresponding non-planar (packed) format; otherwise the interface fails to convert it.
(The code only demonstrates the interface; most error cases are not handled.)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#ifdef __cplusplus
extern "C"
{
#endif
#define __STDC_CONSTANT_MACROS
#ifdef _STDINT_H
#undef _STDINT_H
#endif
#include <stdint.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#ifdef __cplusplus
}
#endif
#define MAX_AUDIO_FRAME_SIZE 192000 //1 second of 48khz 16bit audio 2 channels
int main(int argc, char **argv){
    if(argc < 2){
        return -1;
    }
    const char* in_file = argv[1];
    AVFormatContext *fctx = NULL;
    AVCodecContext *cctx = NULL;
    AVCodec *acodec = NULL;
    FILE *audio_dst_file1 = fopen("./before_resample.pcm", "wb");
    FILE *audio_dst_file2 = fopen("./after_resample.pcm", "wb");
    av_register_all();
    avformat_open_input(&fctx, in_file, NULL, NULL);
    avformat_find_stream_info(fctx, NULL);
    //get audio index
    int aidx = av_find_best_stream(fctx, AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0);
    printf("get aidx[%d]!!!\n",aidx);
    if(aidx < 0){//no audio stream found
        return -1;
    }
    //open audio codec
    AVCodecParameters *codecpar = fctx->streams[aidx]->codecpar;
    acodec = avcodec_find_decoder(codecpar->codec_id);
    cctx = avcodec_alloc_context3(acodec);
    avcodec_parameters_to_context(cctx, codecpar);
    avcodec_open2(cctx, acodec, NULL);
    //init resample
    ReSampleContext * resample_ctx = NULL;
    int output_channels = 2;
    int output_rate = 48000;
    int input_channels = cctx->channels;
    int input_rate = cctx->sample_rate;
    int input_sample_fmt = cctx->sample_fmt;
    if(av_sample_fmt_is_planar((AVSampleFormat)input_sample_fmt)){//if planar, we just use one channel
        //map the planar format to its packed equivalent, e.g. FLTP => FLT
        input_sample_fmt = av_get_packed_sample_fmt((AVSampleFormat)input_sample_fmt);
        input_channels = 1;
    }
    AVSampleFormat output_sample_fmt = AV_SAMPLE_FMT_S16;
    printf("input_channels[%d=>%d],input_rate[%d=>%d],input_sample_fmt[%d=>%d]\n",
           cctx->channels,input_channels,cctx->sample_rate,input_rate,cctx->sample_fmt,input_sample_fmt);
    resample_ctx = av_audio_resample_init(output_channels, input_channels, output_rate, input_rate,
                                          output_sample_fmt,(AVSampleFormat)input_sample_fmt,16, 10, 0, 1);
    if(!resample_ctx){
        printf("av_audio_resample_init fail!!!\n");
        return -1;
    }
    AVPacket *pkt =av_packet_alloc();
    AVFrame *frame = av_frame_alloc();
    short *out_buffer=(short *)av_malloc(MAX_AUDIO_FRAME_SIZE);
    int size = 0;
    while(av_read_frame(fctx,pkt) == 0){//DEMUX
        if(pkt->stream_index == aidx){
            avcodec_send_packet(cctx, pkt);
            while(1){
                int ret = avcodec_receive_frame(cctx, frame);
                if(ret != 0){
                    break;
                }
                //before resample: input_channels is 1 for planar data, so only data[0] is dumped
                size = frame->nb_samples * input_channels * av_get_bytes_per_sample((AVSampleFormat)frame->format);
                if(frame->data[0] != NULL){
                    fwrite(frame->data[0], 1, size, audio_dst_file1);
                    //clear the whole buffer (sizeof(out_buffer) would only be the pointer size)
                    memset(out_buffer,0x00,MAX_AUDIO_FRAME_SIZE);
                    size = audio_resample(resample_ctx, out_buffer, (short *)frame->data[0], frame->nb_samples);
                    size = size* av_get_bytes_per_sample(output_sample_fmt) * output_channels ;//samples * byte * channels
                    if(size > 0){
                        fwrite(out_buffer, 1, size, audio_dst_file2);
                    }
                }
                av_frame_unref(frame);
            }
        }
        //not an audio packet, or audio packet fully consumed
        av_packet_unref(pkt);
    }
    //close
    audio_resample_close(resample_ctx);
    av_packet_free(&pkt);
    av_frame_free(&frame);
    avcodec_free_context(&cctx);//closes the codec and frees the context
    avformat_close_input(&fctx);
    av_free(out_buffer);
    fclose(audio_dst_file1);
    fclose(audio_dst_file2);
    return 0;
}
Resampling result: the example above saves the PCM data both before and after resampling.
We take one channel of a source whose sampling rate is 48000 Hz and whose sample format is AV_SAMPLE_FMT_FLTP.
Following the sample code, the output is resampled to 48000 Hz (48 kHz is the sampling rate generally used for DVD audio), two channels, with sample format AV_SAMPLE_FMT_S16, i.e. two-channel non-planar data.
In this example the channel count and the sample format change, while the sampling rate stays the same (of course, you can also choose another rate, such as 44100, to verify that sampling rate conversion works).
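The sample above avoids merging the planar channels by resampling only one of them. For reference, merging planar channels into packed data could be sketched as follows. This is an illustration under stated assumptions, not code from FFmpeg: interleave_fltp_to_s16 and float_to_s16 are hypothetical names, the input is assumed to be float samples in the range [-1.0, 1.0] (as in AV_SAMPLE_FMT_FLTP), and with an AVFrame the per-channel pointers would come from frame->data cast to float pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Convert one float sample in [-1.0, 1.0] to 16-bit signed, clipping
 * values that fall outside that range. */
static int16_t float_to_s16(float v)
{
    if (v > 1.0f)  v = 1.0f;
    if (v < -1.0f) v = -1.0f;
    return (int16_t)(v * 32767.0f);
}

/* Hypothetical helper: interleave `channels` planar float buffers of
 * nb_samples samples each into one packed S16 buffer laid out as
 * ch0[0], ch1[0], ..., ch0[1], ch1[1], ...
 * `out` must have room for nb_samples * channels values. */
static void interleave_fltp_to_s16(const float *const *planes, int channels,
                                   int nb_samples, int16_t *out)
{
    assert(planes && out && channels > 0 && nb_samples >= 0);
    for (int i = 0; i < nb_samples; i++) {
        for (int c = 0; c < channels; c++) {
            out[i * channels + c] = float_to_s16(planes[c][i]);
        }
    }
}
```

The packed result could then be passed to audio_resample() with the real channel count instead of 1, so that both channels survive the conversion.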
2、libswresample
libswresample provides a more convenient resampling interface.
Interface Description:
I describe only a few important functions here; for the others, refer to the header libswresample/swresample.h.
1) swr_alloc_set_opts() allocates the resampling context and sets the related parameters.
- @param s resampling context; if NULL, the function allocates one itself
- @param out_ch_layout channel layout after resampling
- @param out_sample_fmt sample format after resampling
- @param out_sample_rate sampling rate after resampling
- @param in_ch_layout channel layout of the source
- @param in_sample_fmt sample format of the source
- @param in_sample_rate sampling rate of the source
/**
* Allocate SwrContext if needed and set/reset common parameters.
*
* @param s existing Swr context if available, or NULL if not
* @param out_ch_layout output channel layout (AV_CH_LAYOUT_*)
* @param out_sample_fmt output sample format (AV_SAMPLE_FMT_*).
* @param out_sample_rate output sample rate (frequency in Hz)
* @param in_ch_layout input channel layout (AV_CH_LAYOUT_*)
* @param in_sample_fmt input sample format (AV_SAMPLE_FMT_*).
* @param in_sample_rate input sample rate (frequency in Hz)
* @param log_offset logging level offset
* @param log_ctx parent logging context, can be NULL
*
* @see swr_init(), swr_free()
* @return NULL on error, allocated context otherwise
*/
struct SwrContext *swr_alloc_set_opts(struct SwrContext *s,
int64_t out_ch_layout, enum AVSampleFormat out_sample_fmt, int out_sample_rate,
int64_t in_ch_layout, enum AVSampleFormat in_sample_fmt, int in_sample_rate,
int log_offset, void *log_ctx);
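2) int swr_init(struct SwrContext *s); initializes the context.
3) void swr_free(struct SwrContext **s); frees the context.
4) int swr_convert() processes the audio frame by frame, resampling each frame accordingly.
- @param s resampling context
- @param out output buffers for the resampled data
- @param out_count space available for output, in samples per channel (note: not a byte count)
- @param in input buffers holding the source data
- @param in_count number of input samples per channel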
/** Convert audio.
*
* in and in_count can be set to 0 to flush the last few samples out at the
* end.
*
* If more input is provided than output space, then the input will be buffered.
* You can avoid this buffering by using swr_get_out_samples() to retrieve an
* upper bound on the required number of output samples for the given number of
* input samples. Conversion will run directly without copying whenever possible.
*
* @param s allocated Swr context, with parameters set
* @param out output buffers, only the first one need be set in case of packed audio
* @param out_count amount of space available for output in samples per channel
* @param in input buffers, only the first one need to be set in case of packed audio
* @param in_count number of input samples available in one channel
*
* @return number of samples output per channel, negative value on error
*/
int swr_convert(struct SwrContext *s, uint8_t **out, int out_count,
const uint8_t **in , int in_count);
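The header comment above mentions that swr_get_out_samples() gives an upper bound on the output sample count. The arithmetic behind such a bound can be sketched in plain C. This is only an illustration: max_out_samples is a hypothetical helper, not an FFmpeg function. In real code the delay would typically come from swr_get_delay() and the rounded-up rescaling from av_rescale_rnd() with AV_ROUND_UP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: upper bound on the number of output samples per
 * channel produced from in_samples new input samples plus delay_samples
 * samples still buffered inside the resampler (both counted at in_rate).
 * Computes ceil((in_samples + delay_samples) * out_rate / in_rate). */
static int64_t max_out_samples(int64_t in_samples, int64_t delay_samples,
                               int in_rate, int out_rate)
{
    assert(in_samples >= 0 && delay_samples >= 0);
    assert(in_rate > 0 && out_rate > 0);
    int64_t total = in_samples + delay_samples;
    return (total * out_rate + in_rate - 1) / in_rate;
}
```

For instance, 1024 input samples at 44100 Hz need room for at most 1115 output samples at 48000 Hz, while at equal rates the bound is simply the input count plus any buffered delay.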
Sample code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#ifdef __cplusplus
extern "C"
{
#endif
#define __STDC_CONSTANT_MACROS
#ifdef _STDINT_H
#undef _STDINT_H
#endif
#include <stdint.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswresample/swresample.h>
#ifdef __cplusplus
}
#endif
#define MAX_AUDIO_FRAME_SIZE 192000 //1 second of 48khz 16bit audio 2 channels
int main(int argc, char **argv){
    if(argc < 2){
        return -1;
    }
    const char* in_file = argv[1];
    AVFormatContext *fctx = NULL;
    AVCodecContext *cctx = NULL;
    AVCodec *acodec = NULL;
    FILE *audio_dst_file1 = fopen("./before_resample.pcm", "wb");
    FILE *audio_dst_file2 = fopen("./after_resample.pcm", "wb");
    av_register_all();
    avformat_open_input(&fctx, in_file, NULL, NULL);
    avformat_find_stream_info(fctx, NULL);
    //get audio index
    int aidx = av_find_best_stream(fctx, AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0);
    printf("get aidx[%d]!!!\n",aidx);
    if(aidx < 0){//no audio stream found
        return -1;
    }
    //open audio codec
    AVCodecParameters *codecpar = fctx->streams[aidx]->codecpar;
    acodec = avcodec_find_decoder(codecpar->codec_id);
    cctx = avcodec_alloc_context3(acodec);
    avcodec_parameters_to_context(cctx, codecpar);
    avcodec_open2(cctx, acodec, NULL);
    //init resample
    int output_channels = 2;
    int output_rate = 48000;
    int input_channels = cctx->channels;
    int input_rate = cctx->sample_rate;
    AVSampleFormat input_sample_fmt = cctx->sample_fmt;
    AVSampleFormat output_sample_fmt = AV_SAMPLE_FMT_S16;
    printf("channels[%d=>%d],rate[%d=>%d],sample_fmt[%d=>%d]\n",
           input_channels,output_channels,input_rate,output_rate,input_sample_fmt,output_sample_fmt);
    SwrContext* resample_ctx = NULL;
    resample_ctx = swr_alloc_set_opts(resample_ctx, av_get_default_channel_layout(output_channels),output_sample_fmt,output_rate,
                                      av_get_default_channel_layout(input_channels),input_sample_fmt, input_rate,0,NULL);
    if(!resample_ctx || swr_init(resample_ctx) < 0){
        printf("swr_alloc_set_opts/swr_init fail!!!\n");
        return -1;
    }
    AVPacket *pkt =av_packet_alloc();
    AVFrame *frame = av_frame_alloc();
    int size = 0;
    uint8_t* out_buffer = (uint8_t*)av_malloc(MAX_AUDIO_FRAME_SIZE);
    //capacity of out_buffer in samples per channel, passed to swr_convert as out_count
    int max_out_samples = MAX_AUDIO_FRAME_SIZE / (output_channels * av_get_bytes_per_sample(output_sample_fmt));
    while(av_read_frame(fctx,pkt) == 0){//DEMUX
        if(pkt->stream_index == aidx){
            avcodec_send_packet(cctx, pkt);
            while(1){
                int ret = avcodec_receive_frame(cctx, frame);
                if(ret != 0){
                    break;
                }
                //before resample: data[0] holds one channel if planar, all channels if packed
                size = frame->nb_samples * av_get_bytes_per_sample((AVSampleFormat)frame->format);
                if(!av_sample_fmt_is_planar((AVSampleFormat)frame->format)){
                    size *= input_channels;
                }
                if(frame->data[0] != NULL){
                    fwrite(frame->data[0], 1, size, audio_dst_file1);
                }
                //resample
                //clear the whole buffer (sizeof(out_buffer) would only be the pointer size)
                memset(out_buffer,0x00,MAX_AUDIO_FRAME_SIZE);
                int out_samples = swr_convert(resample_ctx,&out_buffer,max_out_samples,(const uint8_t **)frame->data,frame->nb_samples);
                if(out_samples > 0){
                    //bytes to write: out_samples*output_channels*av_get_bytes_per_sample(output_sample_fmt)
                    size = av_samples_get_buffer_size(NULL,output_channels ,out_samples, output_sample_fmt, 1);
                    fwrite(out_buffer, 1, size, audio_dst_file2);
                }
                av_frame_unref(frame);
            }
        }
        //not an audio packet, or audio packet fully consumed
        av_packet_unref(pkt);
    }
    //close
    swr_free(&resample_ctx);
    av_packet_free(&pkt);
    av_frame_free(&frame);
    avcodec_free_context(&cctx);//closes the codec and frees the context
    avformat_close_input(&fctx);
    av_free(out_buffer);
    fclose(audio_dst_file1);
    fclose(audio_dst_file2);
    return 0;
}
Compared with the approach provided by libavcodec, libswresample can easily feed planar data into the resampling process and output the channel count and sample format we want.
In particular, for planar sources whose left and right channels differ, resampling with libavcodec requires merging the per-channel data first. libswresample spares us this trouble.
The figure below shows a source with different left and right channels after resampling with libswresample; the output is 48000 Hz, two channels, sample format AV_SAMPLE_FMT_S16. From the figure, we can see that the left and right channels are significantly different.
If this source were processed as in the first example, only one channel would be resampled and the result would be output as both the left and the right channel; for many sources, this may lose some of the subtle channel effects.
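To make the last point concrete: when a single channel is expanded to two output channels, the simplest up-mix copies every sample to both outputs, so left and right become identical and any stereo difference in the source is gone. The sketch below only illustrates that effect; mono_to_stereo_s16 is a hypothetical name and this is not the code FFmpeg itself runs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: duplicate each mono S16 sample into the left and
 * right slots of a packed stereo buffer. `stereo` must have room for
 * 2 * nb_samples values. */
static void mono_to_stereo_s16(const int16_t *mono, int nb_samples, int16_t *stereo)
{
    assert(mono && stereo && nb_samples >= 0);
    for (int i = 0; i < nb_samples; i++) {
        stereo[2 * i]     = mono[i]; /* left  */
        stereo[2 * i + 1] = mono[i]; /* right */
    }
}
```

Whatever the original right channel contained never reaches the output, which is why passing all planes to swr_convert() is preferable for stereo sources.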