FFmpeg Getting Started Tutorial: Common APIs and C Development


I first came into contact with FFmpeg through a work project, calling the ffmpeg executable from C# to transcode video via command-line instructions. The commands are easy enough to use, but when it comes to more complex audio/video development, without some knowledge of the underlying audio/video concepts it is hard to understand the meaning and logic of the code. Out of interest, I have recently started learning to use the FFmpeg API directly.

Understanding concepts

1. Basic concepts of multimedia files

  • A multimedia file is a container
  • A container holds multiple streams (Stream/Track)
  • Each stream is encoded by a different encoder
  • The data read from a stream is called a packet
  • A packet contains one or more frames

2. Audio sampling, quantization, and encoding

  • Analog-to-digital conversion turns a continuous signal into a discrete one, so that it can be processed by a computer
  • Analog -> Sampling -> Quantization -> Encoding -> Digital signal
  • Sample size: the number of bits used to store one sample; 16 bit is common
  • Sampling rate: the sampling frequency (samples collected per second). Common rates are 8 kHz, 16 kHz, 32 kHz, 44.1 kHz, and 48 kHz. The higher the sampling rate, the more faithful and natural the reproduced sound, but also the larger the amount of data
  • Number of channels: to reproduce the sound field faithfully during playback, sound is captured from several directions at once while recording; each direction is one channel. The number of channels corresponds to the number of microphones when recording or loudspeakers when playing back: mono, stereo, multichannel
  • Bit rate: the number of bits transmitted per second, in bps (bits per second). The higher the bit rate, the more data transferred per second and the better the sound quality.
Bit rate formula:
bit rate = sampling rate * sample size * number of channels
For example, a WAV file with a 44.1 kHz sampling rate, 16-bit samples, and stereo PCM encoding:
bit rate = 44.1 kHz * 16 bit * 2 = 1411.2 kbit/s.
One minute of such audio takes (1411.2 * 1000 * 60) / 8 / 1024 / 1024 ≈ 10.09 MB.

3. Time base (time_base)

  • time_base is the unit used to measure time. For example, time_base = {1, 40} means one second is divided into 40 segments, each lasting 1/40 s. The FFmpeg function av_q2d(time_base) converts the time base to a double; here the result is 1/40 s. If a certain video frame has pts = 800, that means 800 segments have elapsed, so its position in seconds is pts * av_q2d(time_base) = 800 * (1/40) = 20 s: the frame is presented 20 seconds into playback. Different formats use different time bases.
  • PTS (Presentation Time Stamp) is the timestamp at which a frame is rendered. DTS (Decoding Time Stamp) is the timestamp at which it is decoded.
    Audio PTS: take AAC as an example. One AAC frame contains 1024 samples, which corresponds to a fixed span of time. At a sampling rate of 44.1 kHz (44100 samples collected per second), one second of audio holds 44100/1024 ≈ 43 AAC frames, and each frame lasts 1024/44100 s; from this the pts of every frame can be calculated.
  • Conversion formulas
timestamp() = pts * av_q2d(st->time_base)     // position of the frame within the stream, in seconds
time() = st->duration * av_q2d(st->time_base) // total length of the stream, in seconds
st is an AVStream pointer
Time-base conversion:
timestamp (FFmpeg internal) = AV_TIME_BASE * time()
time() = AV_TIME_BASE_Q * timestamp (FFmpeg internal) // the timestamp here is the PTS/DTS

Environment Configuration

Download

Go to the official website and download both the Dev and Shared archives. Be sure to download the builds for your platform.
Extract the include and lib directories from the Dev archive into your project directory, and copy the DLL files from the Shared archive into the project's Debug directory; otherwise runtime errors will occur.

Environment Configuration

Create a C/C++ project in VS, then right-click the project and open its Properties.
Add the following library files to the linker's additional dependencies:

avcodec.lib; avformat.lib; avutil.lib; avdevice.lib; avfilter.lib; postproc.lib; swresample.lib; swscale.lib
libavcodec    provides implementations of a wide range of encoders and decoders
libavformat   implements streaming protocols, container formats, and their I/O access
libavutil     includes hashers, decompressors, and miscellaneous utility functions
libavfilter   provides a variety of audio and video filters
libavdevice   provides interfaces for accessing capture and playback devices
libswresample implements audio mixing and resampling
libswscale    implements color conversion and scaling


Test

I am using VS2017 as the IDE for development.

#include <stdio.h>
#include <stdlib.h>
#include <iostream>
extern "C" {
#include "libavcodec/avcodec.h"
#include "libavformat/avformat.h"
}
int main(int argc, char* argv[]) {
	// print the configuration options FFmpeg was built with
	printf("%s\n", avcodec_configuration());
	system("pause");
	return 0;
}


Development Case

Mix the audio track of one video with the video track of another, similar to the lip-sync feature of apps like Xiaokaxiu.

APIs used and processing logic

  • Register the API
  • Create the input and output contexts
  • Get the input audio stream and the input video stream
  • Create the output audio stream and the output video stream
  • Copy the input stream parameters to the output streams
  • Compare the stream durations to determine the output file's length
  • Write the file header
  • Initialize a packet, then read the audio and video data and write it to the output file

The relevant code


#include<stdio.h>
#include <iostream>
extern "C" {
#include "libavcodec/avcodec.h"
#include "libavformat/avformat.h"
#include "libavformat/avio.h"
#include <libavutil/log.h>
#include <libavutil/timestamp.h>
}
#define ERROR_STR_SIZE 1024
int main(int argc, char const *argv[])
{
	int ret = -1;
	int err_code;
	char errors[ERROR_STR_SIZE];
	AVFormatContext *ifmt_ctx1 = NULL;
	AVFormatContext *ifmt_ctx2 = NULL;
	AVFormatContext *ofmt_ctx = NULL;
	AVOutputFormat *ofmt = NULL;
	AVStream *in_stream1 = NULL;
	AVStream *in_stream2 = NULL;
	AVStream *out_stream1 = NULL;
	AVStream *out_stream2 = NULL;
	int audio_stream_index = 0;
	int vedio_stream_indes = 0;
	// maximum duration of the output file, to keep the audio and video lengths consistent
	double max_duration = 0;
	AVPacket pkt;
	int stream1 = 0, stream2 = 0;
	av_log_set_level(AV_LOG_DEBUG);
	// open the two input files
	if ((err_code = avformat_open_input(&ifmt_ctx1, "C:\\Users\\haizhengzheng\\Desktop\\meta.mp4", 0, 0)) < 0) {
		av_strerror(err_code, errors, ERROR_STR_SIZE);
		av_log(NULL, AV_LOG_ERROR, "Could not open src file, %s, %d(%s)\n",
			"C:\\Users\\haizhengzheng\\Desktop\\meta.mp4", err_code, errors);
		goto END;
	}
	if ((err_code = avformat_open_input(&ifmt_ctx2, "C:\\Users\\haizhengzheng\\Desktop\\mercury.mp4", 0, 0)) < 0) {
		av_strerror(err_code, errors, ERROR_STR_SIZE);
		av_log(NULL, AV_LOG_ERROR,
			"Could not open the second src file, %s, %d(%s)\n",
			"C:\\Users\\haizhengzheng\\Desktop\\mercury.mp4", err_code, errors);
		goto END;
	}
	// create the output context
	if ((err_code = avformat_alloc_output_context2(&ofmt_ctx, NULL, NULL, "C:\\Users\\haizhengzheng\\Desktop\\amv.mp4")) < 0) {
		av_strerror(err_code, errors, ERROR_STR_SIZE);
		av_log(NULL, AV_LOG_ERROR, "Failed to create an context of outfile , %d(%s) \n",
			err_code, errors);
	}
	ofmt = ofmt_ctx->oformat; // get the output file's format information
	// find the best audio stream in the first file and the best video stream in the second
	audio_stream_index = av_find_best_stream(ifmt_ctx1, AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0); // audio stream index
	vedio_stream_indes = av_find_best_stream(ifmt_ctx2, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0); // video stream index
	// get the audio stream from the first file
	in_stream1 = ifmt_ctx1->streams[audio_stream_index];
	stream1 = 0;
	// create the audio output stream
	out_stream1 = avformat_new_stream(ofmt_ctx, NULL);
	if (!out_stream1) {
		av_log(NULL, AV_LOG_ERROR, "Failed to alloc out stream!\n");
		goto END;
	}
	// copy the stream parameters
	if ((err_code = avcodec_parameters_copy(out_stream1->codecpar, in_stream1->codecpar)) < 0) {
		av_strerror(err_code, errors, ERROR_STR_SIZE);
		av_log(NULL, AV_LOG_ERROR,
			"Failed to copy codec parameter, %d(%s)\n",
			err_code, errors);
	}
	out_stream1->codecpar->codec_tag = 0;
	// get the video stream from the second file
	in_stream2 = ifmt_ctx2->streams[vedio_stream_indes];
	stream2 = 1;

	// create the video output stream
	out_stream2 = avformat_new_stream(ofmt_ctx, NULL);
	if (!out_stream2) {
		av_log(NULL, AV_LOG_ERROR, "Failed to alloc out stream!\n");
		goto END;
	}
	// copy the stream parameters
	if ((err_code = avcodec_parameters_copy(out_stream2->codecpar, in_stream2->codecpar)) < 0) {
		av_strerror(err_code, errors, ERROR_STR_SIZE);
		av_log(NULL, AV_LOG_ERROR,
			"Failed to copy codec parameter, %d(%s)\n",
			err_code, errors);
		goto END;
	}
	out_stream2->codecpar->codec_tag = 0;
	// dump the output stream information
	av_dump_format(ofmt_ctx, 0, "C:\\Users\\haizhengzheng\\Desktop\\amv.mp4", 1);

	// compare the two stream durations to determine the final file length: time (s) = st->duration * av_q2d(st->time_base); duration is counted in pts/dts units, and av_q2d() converts the time base to a double
	if (in_stream1->duration * av_q2d(in_stream1->time_base) > in_stream2->duration * av_q2d(in_stream2->time_base)) {
		max_duration = in_stream2->duration * av_q2d(in_stream2->time_base);
	}
	else {
		max_duration = in_stream1->duration * av_q2d(in_stream1->time_base);
	}
	// open the output file
	if (!(ofmt->flags & AVFMT_NOFILE)) {
		if ((err_code = avio_open(&ofmt_ctx->pb, "C:\\Users\\haizhengzheng\\Desktop\\amv.mp4", AVIO_FLAG_WRITE)) < 0) {
			av_strerror(err_code, errors, ERROR_STR_SIZE);
			av_log(NULL, AV_LOG_ERROR,
				"Could not open output file, %s, %d(%s)\n",
				"C:\\Users\\haizhengzheng\\Desktop\\amv.mp4", err_code, errors);
			goto END;
		}
	}
	// write the file header
	avformat_write_header(ofmt_ctx, NULL);
	av_init_packet(&pkt);
	// read audio packets and write them to the output file
	while (av_read_frame(ifmt_ctx1, &pkt) >= 0) {
		// skip this frame if its timestamp is beyond the maximum duration
		if (pkt.pts * av_q2d(in_stream1->time_base) > max_duration) {
			av_packet_unref(&pkt);
			continue;
		}
		// for packets from the audio stream we want, rescale the time base and write to the file; av_rescale_q_rnd() converts timestamps between time bases
		if (pkt.stream_index == audio_stream_index) {
			pkt.pts = av_rescale_q_rnd(pkt.pts, in_stream1->time_base, out_stream1->time_base, // rescale the packet's PTS, DTS, and duration
				(AVRounding)(AV_ROUND_NEAR_INF|AV_ROUND_PASS_MINMAX));
			pkt.dts = av_rescale_q_rnd(pkt.dts, in_stream1->time_base, out_stream1->time_base,
				(AVRounding)(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));
			pkt.duration = av_rescale_q(pkt.duration, in_stream1->time_base, out_stream1->time_base);
			pkt.pos = -1;
			pkt.stream_index = stream1;
			av_interleaved_write_frame(ofmt_ctx, &pkt);
			av_packet_unref(&pkt);
		}
	}


	// read video packets and write them to the output file
	while (av_read_frame(ifmt_ctx2, &pkt) >= 0) {

		// skip this frame if its timestamp is beyond the maximum duration
		if (pkt.pts * av_q2d(in_stream2->time_base) > max_duration) {
			av_packet_unref(&pkt);
			continue;
		}
		// for packets from the video stream we want, rescale the time base and write to the file
		if (pkt.stream_index == vedio_stream_indes) {
			pkt.pts = av_rescale_q_rnd(pkt.pts, in_stream2->time_base, out_stream2->time_base,
				(AVRounding)(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));
			pkt.dts = av_rescale_q_rnd(pkt.dts, in_stream2->time_base, out_stream2->time_base,
				(AVRounding)(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));
			pkt.duration = av_rescale_q(pkt.duration, in_stream2->time_base, out_stream2->time_base);
			pkt.pos = -1;
			pkt.stream_index = stream2;
			av_interleaved_write_frame(ofmt_ctx, &pkt);
			av_packet_unref(&pkt);
		}
	}
	// write the trailer
	av_write_trailer(ofmt_ctx);
	ret = 0;
END:
	// free resources
	if (ifmt_ctx1) {
		avformat_close_input(&ifmt_ctx1);
	}

	if (ifmt_ctx2) {
		avformat_close_input(&ifmt_ctx2);
	}

	if (ofmt_ctx) {
		if (!(ofmt->flags & AVFMT_NOFILE)) {
			avio_closep(&ofmt_ctx->pb);
		}
		avformat_free_context(ofmt_ctx);
	}
}

Reference articles

Audio basics
Code reference


Origin blog.csdn.net/m0_38051293/article/details/104703396