How to Build an Audio/Video Player (ffmpeg 3.2 + SDL 2.0, Video + Audio)

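Preface
Steps for building an audio/video player

1. Playing audio
2. Playing video
3. Audio/video synchronization

Source code walkthrough

1. Preparation before we start
2. Configuring the basic audio and video parameters

Getting basic file information
Initializing the audio parameters
Initializing the video parameters

3. Reading audio and video packets (AVPacket) from the file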

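Preface

The previous article showed how to build an audio player, which turned out to be fairly simple: read audio frames from the file, then hand the decoded audio data to the system callback, which reads and plays it.
This time we go one step further and build a "video + audio" media player. This article follows directly from the previous one, so please read that one first. As before, the code is built on ffmpeg 3.2 and SDL 2.0; keep an eye on the versions you are using as you read. The source code for this article is on GitHub: https://github.com/XP-online/media-player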

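Steps for building an audio/video player

Building an audio/video player is a bit like the old joke about putting an elephant into a refrigerator: it takes three steps. Here they are playing audio, playing video, and synchronizing audio and video.

1. Playing audio

This part was covered in the previous article.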

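2. Playing video

Playing video breaks down into two parts:

Reading the video frames from the file
Displaying the video from the decoded image data

The first part is almost identical to reading audio frames. The second part, however, is completely different from playing audio. This is easy to understand: by this point all of the image data is already available, and there are many ways to display an image.
In fact, once we have the image data we do not need SDL at all; we could just as well render the video ourselves with any other GUI library. But the focus of this article is the core steps of playing a media file, so to keep additional libraries to a minimum we still use SDL 2.0 to render the video.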

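3. Audio/video synchronization

Video and audio are decoded and played independently of each other. Because their decoding and rendering rates differ, video and audio can easily drift out of sync. To fix this, the faster side usually has to wait for the slower side. This is the audio/video synchronization problem. Fortunately, every packet (AVPacket) obtained through ffmpeg carries timing information: pts and dts. Their exact meaning is explained later; for now it is enough to know that these two members give us the timestamp of the current packet.
There are three common ways to synchronize audio and video:

Use the audio timestamp as the reference. Based on the audio timestamp, the video side decides whether to play, drop, or delay the current packet.
Use the video timestamp as the reference. Based on the video timestamp, the audio side decides whether to play, drop, or delay the current packet.
Use the system clock as the reference. Both audio and video decide whether to play, drop, or delay the current packet based on the system time.

This tutorial uses the first approach: the audio timestamp is the reference, and video is synchronized to audio.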
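To make the audio-master approach more concrete, here is a minimal sketch of the decision the video side has to make for each frame. It is not taken from this tutorial's source: `Rational`, `pts_to_seconds`, `SyncAction`, `decide`, and the 10 ms / 100 ms thresholds are all assumptions chosen for illustration. With ffmpeg, the conversion to seconds would normally use the stream's `time_base`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ffmpeg's AVRational (a num/den time base).
struct Rational { int num; int den; };

// Convert a timestamp expressed in time_base units to seconds.
inline double pts_to_seconds(int64_t pts, Rational time_base) {
    assert(time_base.den != 0);
    return static_cast<double>(pts) * time_base.num / time_base.den;
}

enum class SyncAction { Show, Wait, Drop };

// Decide what to do with a video frame, using the audio clock as the master.
// The default thresholds are illustrative assumptions, not tuned values.
inline SyncAction decide(double video_sec, double audio_sec,
                         double wait_threshold = 0.010,
                         double drop_threshold = 0.100) {
    const double diff = video_sec - audio_sec;  // > 0: video is ahead of audio
    if (diff > wait_threshold)  return SyncAction::Wait;  // too early: delay it
    if (diff < -drop_threshold) return SyncAction::Drop;  // too late: discard it
    return SyncAction::Show;                              // close enough: display
}
```

For example, with a time base of 1/90000 a pts of 180000 corresponds to 2.0 seconds; if the audio clock is at 1.5 seconds, `decide(2.0, 1.5)` asks the video side to wait.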

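Source code walkthrough

As described above, audio and video playback should be independent of each other, so we need two threads, one for audio playback and one for video playback. Also, unlike the audio-only case, where video packets were simply discarded, we obviously cannot throw them away now. So we also need an audio queue and a video queue to buffer the packets that have been read, and the packets are handed to the audio and video playback threads through these queues. Since several threads are involved, both queues must be thread-safe.

1. Preparation before we start

Before looking at the player code itself, we first need a thread-safe queue.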


#include <deque>
#include <mutex>

// Thread-safe queue
template<typename T>
class Queue
{
public:
	Queue() = default;
	// push: append a value to the back of the queue
	void push(T val) {
		std::lock_guard<std::mutex> lock(m);
		q.push_back(val);
	}
	// pull: move the front value into val; returns false if the queue is empty
	bool pull(T &val) {
		std::lock_guard<std::mutex> lock(m);
		if (q.empty()) {
			return false;
		}
		val = q.front();
		q.pop_front(); // O(1) with std::deque; erasing the front of a std::vector is O(n)
		return true;
	}
	// empty: returns whether the queue is empty
	bool empty() {
		std::lock_guard<std::mutex> lock(m);
		return q.empty();
	}
	// size: returns the number of queued elements
	int size() {
		std::lock_guard<std::mutex> lock(m);
		return static_cast<int>(q.size());
	}
protected:
	std::deque<T> q; // container
	std::mutex    m; // lock (released automatically by std::lock_guard)
};
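A consumer of the queue above has to poll it. As a possible alternative, the sketch below shows a variant that lets the consumer sleep until data arrives, using `std::condition_variable`. `BlockingQueue`, `pop_for`, and `run_producer_consumer` are hypothetical names made up for this example and are not part of the tutorial's source.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical blocking variant of the thread-safe queue.
template <typename T>
class BlockingQueue {
public:
    // Enqueue a value and wake one waiting consumer.
    void push(T val) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push_back(std::move(val));
        }
        cv_.notify_one();
    }

    // Wait up to `timeout` for a value; returns false if none arrived in time.
    bool pop_for(T& val, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_);
        if (!cv_.wait_for(lock, timeout, [this] { return !q_.empty(); })) {
            return false;
        }
        val = std::move(q_.front());
        q_.pop_front();
        return true;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(m_);
        return q_.size();
    }

private:
    mutable std::mutex      m_;
    std::condition_variable cv_;
    std::deque<T>           q_;
};

// Demo: one producer thread pushes 0..n-1, the caller consumes them in order
// and returns their sum.
inline long run_producer_consumer(int n) {
    BlockingQueue<int> queue;
    std::thread producer([&queue, n] {
        for (int i = 0; i < n; ++i) queue.push(i);
    });
    long sum = 0;
    for (int i = 0; i < n; ++i) {
        int value = -1;
        const bool ok = queue.pop_for(value, std::chrono::seconds(5));
        assert(ok && value == i);  // a single producer preserves FIFO order
        sum += value;
    }
    producer.join();
    return sum;
}
```

The timeout in `pop_for` matters for a player: a playback thread that blocks forever could never notice a quit flag, so a bounded wait lets it re-check that flag periodically.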

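Playback happens in two threads, and each thread needs to know the context of the audio or video it is playing. So we wrap the basic context information in one class that can be passed to each thread when it is created.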

#include <atomic>

#define MAX_AUDIO_FRAME_SIZE 192000 // buffer size in bytes: 1 second of 48 kHz 32-bit audio

class PlayerContext {
public:
	PlayerContext();
	AVFormatContext* pFormateCtx; // format context of the media file
	std::atomic_bool quit;		  // quit flag

	// ------------------------- audio parameters -------------------------- //
	AVCodecParameters* audioCodecParameter; // audio decoder parameters
	AVCodecContext*	   audioCodecCtx;	    // audio decoder context
	AVCodec*		   audioCodec;			// audio decoder
	AVStream*		   audio_stream;		// audio stream
	int				   au_stream_index;	    // index of the audio stream
	double			   audio_clk;			// current audio time
	int64_t			   audio_pts;			// how much audio has been played so far
	int64_t			   audio_pts_duration;	// how much audio has been played so far
	Uint8*			   audio_pos;			// used to control each ...
	Uint32			   audio_len;			// used to control ...

	Queue<AVPacket*>    audio_queue;		// audio packet queue
	SwrContext*		   au_convert_ctx;		// audio converter (resampler)
	AVSampleFormat	   out_sample_fmt;		// resampled sample format, AV_SAMPLE_FMT_S16 by default
	int				   out_buffer_size;		// size of the buffer after resampling
	uint8_t*		   out_buffer;			// buffer holding the resampled audio

	SDL_AudioSpec	   wanted_spec;			// parameters SDL uses to play audio
	// ------------------------------ end --------------------------------- //

	// ------------------------ video parameters ----------------------- //
	AVCodecParameters* videoCodecParameter; // video decoder parameters
	AVCodecContext*	   videoCodecCtx;		// video decoder context
	AVCodec*		   pVideoCodec;			// video decoder
	AVStream*		   video_stream;		// video stream
	Queue<AVPacket*>    video_queue;		// video packet queue

	SwsContext*		   vi_convert_ctx;		// video converter (scaler)
	AVFrame*		   pFrameYUV;			// holds the converted video frame
	int				   video_stream_index;	// index of the video stream
	int64_t			   video_pts;			// how much video has been played so far
	double			   video_clk;			// timestamp of the current video frame
	// ---------------------------- end ------------------------------ //

	// ---------------------------- sdl ----------------------------- //
	SDL_Window*		   screen;	 // video window
	SDL_Renderer*	   renderer; // renderer
	SDL_Texture*	   texture;  // texture
	SDL_Rect		   sdlRect;  // destination rectangle for rendering
	// --------------------------- end ------------------------------ //
};

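All of these members are commented here, but there is no need to understand them fully yet; they will become clear as they are used below. The constructor of this class also initializes these values, mostly by zeroing them and setting pointers to null. It is omitted here for brevity; see the source on GitHub if you need it.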
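Since the constructor is not shown, here is a rough sketch of the initialization pattern it describes, using a trimmed-down struct. `MiniPlayerContext` and `set_video_stream` are hypothetical and only mirror a few of the real members; the actual constructor in the GitHub source may look different. Two details are worth noting: `avformat_open_input()` accepts a pointer to a null `AVFormatContext*` and allocates the context itself, and the stream search later in this article tests `video_stream_index < 0`, so the indices are assumed here to start at -1.

```cpp
#include <atomic>
#include <cassert>

// Forward declarations are enough for pointer members. With the real
// ffmpeg/SDL headers included, these names refer to the library types.
struct AVFormatContext;
struct AVCodecContext;
struct SDL_Window;

// Hypothetical, trimmed-down context showing only the initialization pattern.
struct MiniPlayerContext {
    AVFormatContext* pFormateCtx = nullptr;  // filled in by avformat_open_input
    std::atomic_bool quit{false};            // not quitting at startup
    AVCodecContext*  audioCodecCtx = nullptr;
    AVCodecContext*  videoCodecCtx = nullptr;
    int              au_stream_index = -1;    // -1 means "not found yet"
    int              video_stream_index = -1; // -1 means "not found yet"
    double           audio_clk = 0.0;
    double           video_clk = 0.0;
    SDL_Window*      screen = nullptr;
};

// Record a video stream index, keeping only the first match, in the same
// spirit as the `video_stream_index < 0` check in the stream search loop.
inline void set_video_stream(MiniPlayerContext& ctx, int index) {
    assert(index >= 0);
    if (ctx.video_stream_index < 0) {
        ctx.video_stream_index = index;
    }
}
```

Default member initializers keep the defaults next to the declarations, so adding a member later cannot silently leave it uninitialized.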

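2. Configuring the basic audio and video parameters

As in the previous article, we still need to read the audio and video information stored in the file and configure the audio and video decoders.

Getting basic file information

File information is obtained exactly as in the previous article. At the end, the two functions init_audio_parameters and init_video_paramerters are called to initialize the audio and video parameters. All of this information is stored in playerCtx.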

	// Register all formats and codecs
	av_register_all();

	// Audio/video context
	PlayerContext playerCtx;

	// Read the file header and store the format information in pFormateCtx
	if (avformat_open_input(&playerCtx.pFormateCtx, filePath, nullptr, nullptr) != 0) {
		printf_s("avformat_open_input failed.\n");
		return -1;
	}
	// Read the stream information from the file and store it in pFormateCtx
	if (avformat_find_stream_info(playerCtx.pFormateCtx, nullptr) < 0) {
		printf_s("avformat_find_stream_info failed.\n");
		return -1;
	}
	// Print the file information to standard error
	av_dump_format(playerCtx.pFormateCtx, 0, filePath, 0);

	// Find the first video stream and the first audio stream
	for (unsigned i = 0; i < playerCtx.pFormateCtx->nb_streams; ++i)
	{
		if (AVMEDIA_TYPE_VIDEO == playerCtx.pFormateCtx->streams[i]->codecpar->codec_type
			&& playerCtx.video_stream_index < 0) { // record the video stream index
			playerCtx.video_stream_index = i;
			playerCtx.video_stream = playerCtx.pFormateCtx->streams[i];
		}
		if (AVMEDIA_TYPE_AUDIO == playerCtx.pFormateCtx->streams[i]->codecpar->codec_type
			&& playerCtx.au_stream_index < 0) { // record the audio stream index
			playerCtx.au_stream_index = i;
			playerCtx.audio_stream = playerCtx.pFormateCtx->streams[i];
		}
	}
	// Error handling: both a video stream and an audio stream are required
	if (playerCtx.video_stream_index == -1)
		return -1;
	if (playerCtx.au_stream_index == -1)
		return -1;

	// Initialize SDL
	if (SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO))
	{
		printf("Could not initialize SDL - %s\n", SDL_GetError());
		return -1;
	}
	// Initialize the audio parameters
	if (init_audio_parameters(playerCtx) < 0) {
		return -1;
	}
	// Initialize the video parameters
	if (init_video_paramerters(playerCtx) < 0) {
		return -1;
	}

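Initializing the audio parameters

The audio part is essentially the same as in the previous article, so it is not repeated here; see the source for the implementation of init_audio_parameters(). The rest of this section focuses on video initialization.

Initializing the video parameters

Video initialization has two parts. The first is initializing the ffmpeg video decoder, which is similar to the audio case. The second is unique to video: configuring the parameters for rendering the video on screen. Audio has no counterpart to this part; its whole purpose is to render and display the video. It could be implemented with many GUI libraries, but to avoid pulling in other libraries this example uses SDL.
Displaying video through SDL relies on three structures:

SDL_Window: the video window, created with SDL_CreateWindow().
SDL_Texture: the video texture, created with SDL_CreateTexture(). An SDL texture stores the image data to be displayed. There are many pixel formats; here we use YUV.
SDL_Renderer: the video renderer, created with SDL_CreateRenderer(). The SDL renderer draws the video onto the window.

To display video, first create these three structures. Then, for each decoded frame, update the texture (SDL_UpdateYUVTexture()) and clear the renderer (SDL_RenderClear()), copy the new texture to the renderer (SDL_RenderCopy()), and finally present the result in the window (SDL_RenderPresent()).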

// init_video_paramerters initializes the video parameters and everything the sws converter needs
int init_video_paramerters(PlayerContext& playerCtx) {
	// Get the video decoder parameters
	playerCtx.videoCodecParameter = playerCtx.pFormateCtx->streams[playerCtx.video_stream_index]->codecpar;
	// Find the video decoder
	playerCtx.pVideoCodec = avcodec_find_decoder(playerCtx.videoCodecParameter->codec_id);
	if (nullptr == playerCtx.pVideoCodec) {
		printf_s("video avcodec_find_decoder failed.\n");
		return -1;
	}
	// Allocate the decoder context
	playerCtx.videoCodecCtx = avcodec_alloc_context3(playerCtx.pVideoCodec);
	if (nullptr == playerCtx.videoCodecCtx) {
		printf_s("video avcodec_alloc_context3 failed.\n");
		return -1;
	}
	// Copy the stream's codec parameters into the decoder context
	if (avcodec_parameters_to_context(playerCtx.videoCodecCtx, playerCtx.videoCodecParameter) < 0) {
		printf_s("video avcodec_parameters_to_context failed\n");
		return -1;
	}
	// Open the video decoder with this context
	if (avcodec_open2(playerCtx.videoCodecCtx, playerCtx.pVideoCodec, nullptr) < 0) {
		printf_s("video avcodec_open2 failed.\n");
		return -1;
	}

	// Create an SDL window (SDL 2.0 and later)
	playerCtx.screen = SDL_CreateWindow("MediaPlayer",
		SDL_WINDOWPOS_UNDEFINED,
		SDL_WINDOWPOS_UNDEFINED,
		//playerCtx.videoCodecCtx->width, playerCtx.videoCodecCtx->height,
		1280, 720,
		SDL_WINDOW_SHOWN | SDL_WINDOW_OPENGL);
	if (!playerCtx.screen) {
		fprintf(stderr, "SDL: could not create window - %s\n", SDL_GetError());
		return -1;
	}
	// Create an SDL renderer
	playerCtx.renderer = SDL_CreateRenderer(playerCtx.screen, -1, 0);
	// Create an SDL texture with the size of the video
	playerCtx.texture = SDL_CreateTexture(playerCtx.renderer,
		SDL_PIXELFORMAT_YV12,
		SDL_TEXTUREACCESS_STREAMING,
		playerCtx.videoCodecCtx->width, playerCtx.videoCodecCtx->height);
	// Set the area of the window that SDL renders into
	playerCtx.sdlRect.x = 0;
	playerCtx.sdlRect.y = 0;
	playerCtx.sdlRect.w = 1280;// playerCtx.videoCodecCtx->width;
	playerCtx.sdlRect.h = 720;// playerCtx.videoCodecCtx->height;
	// Set up the converter from the decoder's pixel format to YUV420P (same size)
	playerCtx.vi_convert_ctx = sws_getContext(playerCtx.videoCodecCtx->width, playerCtx.videoCodecCtx->height, playerCtx.videoCodecCtx->pix_fmt
		, playerCtx.videoCodecCtx->width, playerCtx.videoCodecCtx->height, AV_PIX_FMT_YUV420P, SWS_BICUBIC, nullptr, nullptr, nullptr);
	// Allocate the pixel buffer for one converted YUV420P frame
	unsigned char* out_buffer = (unsigned char*)av_malloc(av_image_get_buffer_size(AV_PIX_FMT_YUV420P
		, playerCtx.videoCodecCtx->width, playerCtx.videoCodecCtx->height, 1));
	playerCtx.pFrameYUV = av_frame_alloc(); // allocate the frame structure
	// Point the frame's data and linesize fields at the pixel buffer
	av_image_fill_arrays(playerCtx.pFrameYUV->data, playerCtx.pFrameYUV->linesize
		, out_buffer, AV_PIX_FMT_YUV420P, playerCtx.videoCodecCtx->width, playerCtx.videoCodecCtx->height, 1);
	return 0;
}

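A final note on av_image_fill_arrays: it does not allocate memory itself. It sets up the data pointers and linesize values of the AVFrame so that they point into the buffer allocated with av_malloc just before, laid out for the given pixel format and image size.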
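To give a feel for what that layout looks like, the sketch below computes plane sizes and offsets for a tightly packed YUV420P image by hand. It is an illustration only: `Yuv420pLayout` and `yuv420p_layout` are hypothetical names, and the result is expected to correspond to what the ffmpeg helpers produce for AV_PIX_FMT_YUV420P with an alignment of 1, which is an assumption worth verifying against your ffmpeg version.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of how a packed YUV420P image sits in one buffer.
struct Yuv420pLayout {
    int         linesize[3];  // bytes per row for the Y, U and V planes
    std::size_t offset[3];    // byte offset of each plane inside the buffer
    std::size_t total;        // total buffer size in bytes
};

// Compute the layout for 1-byte alignment (no padding at the end of rows).
inline Yuv420pLayout yuv420p_layout(int width, int height) {
    assert(width > 0 && height > 0);
    const int cw = (width + 1) / 2;   // chroma planes are subsampled by 2
    const int ch = (height + 1) / 2;  // in both directions, rounded up
    const std::size_t y_size = static_cast<std::size_t>(width) * height;
    const std::size_t c_size = static_cast<std::size_t>(cw) * ch;

    Yuv420pLayout layout{};
    layout.linesize[0] = width;
    layout.linesize[1] = cw;
    layout.linesize[2] = cw;
    layout.offset[0] = 0;                // Y plane starts the buffer
    layout.offset[1] = y_size;           // U plane follows Y
    layout.offset[2] = y_size + c_size;  // V plane follows U
    layout.total = y_size + 2 * c_size;
    return layout;
}
```

For a 1280x720 frame this gives a 921600-byte Y plane followed by two 230400-byte chroma planes, 1382400 bytes in total, which is 1.5 bytes per pixel.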

3. Reading audio and video packets (AVPacket) from the file

Reading video packets is almost identical to reading audio packets.

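	// Read AVPackets and put each one into the audio or video queue according to its stream
	AVPacket* packet = nullptr;
	while (!playerCtx.quit) {
		// If the buffers are full, wait for them to be consumed before reading more
		if (playerCtx.audio_queue.size() > 50 ||
			playerCtx.video_queue.size() > 100) {
			SDL_Delay(10);
			continue;
		}
		packet = av_packet_alloc(); // allocates the packet and sets its fields to default values
		if (nullptr == packet) {
			break;
		}
		if (av_read_frame(playerCtx.pFormateCtx, packet) < 0) { // read the next packet from the file
			av_packet_free(&packet); // end of file or error: release the unused packet
			break;
		}
		if (packet->stream_index == playerCtx.au_stream_index) { // audio packet: queue it for the audio decoding thread
			playerCtx.audio_queue.push(packet);
		}
		else if (packet->stream_index == playerCtx.video_stream_index) { // video packet: queue it for the video decoding thread
			playerCtx.video_queue.push(packet);
		}
		else { // packet from a stream we do not play: release it
			av_packet_unref(packet);
			av_packet_free(&packet);
		}
	}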

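Reading audio packets was covered in detail earlier, so it is not repeated here. One thing to note: we no longer decode right away; instead each packet is pushed onto one of the two queues that hold audio and video packets.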
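The routing and back-pressure logic of the read loop can be looked at in isolation. The following sketch simulates it without ffmpeg or threads: `FakePacket`, `Route`, `route_packet`, `queues_full`, `DemuxStats`, and `demux_all` are hypothetical names for illustration, and only the limits of 50 and 100 are taken from the loop above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an AVPacket: only the stream index matters here.
struct FakePacket { int stream_index; };

enum class Route { Audio, Video, Discard };

// Decide which queue a packet belongs to.
inline Route route_packet(const FakePacket& pkt, int audio_index, int video_index) {
    if (pkt.stream_index == audio_index) return Route::Audio;
    if (pkt.stream_index == video_index) return Route::Video;
    return Route::Discard;  // e.g. subtitles or extra audio tracks
}

// Back-pressure check: reading pauses while either queue is above its limit.
inline bool queues_full(std::size_t audio_size, std::size_t video_size,
                        std::size_t audio_limit = 50,
                        std::size_t video_limit = 100) {
    return audio_size > audio_limit || video_size > video_limit;
}

struct DemuxStats {
    std::size_t audio = 0;
    std::size_t video = 0;
    std::size_t discarded = 0;
};

// Route every packet of `input` and count where each one went
// (single-threaded simulation, no real I/O).
inline DemuxStats demux_all(const std::vector<FakePacket>& input,
                            int audio_index, int video_index) {
    assert(audio_index != video_index);
    DemuxStats stats;
    for (const FakePacket& pkt : input) {
        switch (route_packet(pkt, audio_index, video_index)) {
            case Route::Audio:   ++stats.audio;     break;
            case Route::Video:   ++stats.video;     break;
            case Route::Discard: ++stats.discarded; break;
        }
    }
    return stats;
}
```

Bounding the queues keeps memory use flat: without the check, the reader would load the whole file into memory far faster than playback consumes it.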
Video decoding, playback, and audio/video synchronization are covered in the next article.
