Overview
Video and audio are decoded and played on separate threads. Because of CPU scheduling, differences in decoding cost and similar factors, they will not stay in sync on their own, so synchronization has to be done explicitly. There are only three strategies: use the video clock as the master, use the audio clock as the master, or sync both to an external clock. Human hearing is more sensitive to glitches than vision is, which makes it awkward to stretch or drop audio, so the audio clock is normally used as the master.
DTS: Decoding Time Stamp; tells the decoder in which order to decode the packets.
PTS: Presentation Time Stamp; indicates in which order the data decoded from a packet should be displayed.
For audio the data is stored in playback order, so the two are identical. For video, some frames (B-frames) reference both earlier and later frames, so a referenced later frame has to be stored before the frame that depends on it, which is why DTS and PTS can differ.
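A quick way to see the difference is to dump the DTS/PTS of the first few video packets of a file that contains B-frames: DTS stays monotonic while PTS arrives out of order. A minimal sketch, assuming a local file "input.mp4" (needs libavformat/avformat.h and inttypes.h):
AVFormatContext *fmt = NULL;
if (avformat_open_input(&fmt, "input.mp4", NULL, NULL) == 0
        && avformat_find_stream_info(fmt, NULL) >= 0) {
    int vIndex = av_find_best_stream(fmt, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0);
    AVPacket *pkt = av_packet_alloc();
    int printed = 0;
    while (printed < 10 && av_read_frame(fmt, pkt) >= 0) {
        if (pkt->stream_index == vIndex) {
            printf("dts=%" PRId64 " pts=%" PRId64 "\n", pkt->dts, pkt->pts);
            printed++;
        }
        av_packet_unref(pkt); // av_read_frame gives us a reference we must drop
    }
    av_packet_free(&pkt);
    avformat_close_input(&fmt);
}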
time_base: the time base, i.e. the unit for converting between timestamps and seconds. It is stored as an AVRational, a struct holding a numerator and a denominator. As for why it is not simply a double, which would make the av_q2d conversion unnecessary, it is presumably a matter of precision.
/**
* Rational number (pair of numerator and denominator).
*/
typedef struct AVRational{
    int num; ///< Numerator
    int den; ///< Denominator
} AVRational;
/**
* This is the fundamental unit of time (in seconds) in terms
* of which frame timestamps are represented.
*
* decoding: set by libavformat
* encoding: May be set by the caller before avformat_write_header() to
* provide a hint to the muxer about the desired timebase. In
* avformat_write_header(), the muxer will overwrite this field
* with the timebase that will actually be used for the timestamps
* written into the file (which may or may not be related to the
* user-provided one, depending on the format).
*/
AVRational time_base;
For example, the presentation time of a frame can be computed with the formula below (the resulting videoClock is in seconds):
videoClock = pts * av_q2d(time_base);
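Going the other way (seconds back to a timestamp), or moving a timestamp between two different time bases, is usually done with av_rescale_q so the rounding stays well defined. A small sketch; targetSeconds and streamIndex are just placeholders for illustration:
AVRational tb = avFormatContext->streams[streamIndex]->time_base;
// seconds -> timestamp in this stream's time base
int64_t ts = (int64_t) (targetSeconds / av_q2d(tb));
// the same value computed from a microsecond timestamp, without the double round-trip
int64_t ts2 = av_rescale_q((int64_t) (targetSeconds * AV_TIME_BASE), AV_TIME_BASE_Q, tb);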
Audio clock
First define the variables:
AVRational audioTimeBase;
double audioClock; // audio clock
Then, when the audio stream is found, grab its time base:
if (avFormatContext->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
    audio_index = i;
    audioTimeBase = avFormatContext->streams[i]->time_base;
}
Finally, update the audio clock inside the getPcm function:
if (audioFrame->pts != AV_NOPTS_VALUE) {
    // start time of this frame, in seconds
    audioClock = audioFrame->pts * av_q2d(audioTimeBase);
    // duration of the decoded data: size / (sample_rate * channels * bytes_per_sample)
    double time = size / ((double) 44100 * 2 * 2);
    // the audio clock then points at the end of the data just handed to the audio device
    audioClock = time + audioClock;
}
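The 44100 * 2 * 2 divisor hard-codes the output sample rate, channel count and bytes per sample. If the resampler's output format can vary, the same duration can be derived from the actual parameters; a sketch assuming 16-bit interleaved PCM, where outSampleRate and outChannels stand for whatever the resampler was configured to output:
// bytes of PCM played per second = sample_rate * channels * bytes_per_sample
int bytesPerSecond = outSampleRate * outChannels * av_get_bytes_per_sample(AV_SAMPLE_FMT_S16);
double frameDuration = size / (double) bytesPerSecond;
audioClock = audioClock + frameDuration;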
Syncing video to the audio clock
Again, define the variables first:
AVRational videoTimeBase;
double videoClock; // video clock
As above, grab the video time base when the video stream is found:
if (avFormatContext->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
    video_index = i;
    videoTimeBase = avFormatContext->streams[i]->time_base;
}
In the function that loops over video frames, define these variables outside the loop:
double last_play          // presentation time of the previous frame
     , play               // presentation time of the current frame
     , last_delay         // delay used when the previous frame was shown
     , delay              // how long this thread will sleep
     , diff               // drift between the video clock and the audio clock
     , sync_threshold     // acceptable drift before correction kicks in
     , pts
     , decodeStartTime    // wall-clock time at which decoding of the current frame starts
     , frame_time_stamp = av_q2d(videoTimeBase); // seconds per timestamp unit
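The loop below also refers to a few threshold constants that are never defined in this post; the values used by FFmpeg's ffplay.c are a reasonable choice:
// no correction is attempted while the drift is below this (values from ffplay.c)
#define AV_SYNC_THRESHOLD_MIN 0.04
// correction always kicks in once the drift exceeds this
#define AV_SYNC_THRESHOLD_MAX 0.1
// frames already longer than this are not duplicated/doubled to compensate
#define AV_SYNC_FRAMEDUP_THRESHOLD 0.1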
Inside the loop that consumes packets, compute how long to sleep and do the synchronization there; the idea is simply to lengthen or shorten how long each frame stays on screen:
decodeStartTime = av_gettime() / 1000000.0;
AVPacket *packet = videoPacketQueue.front();
videoPacketQueue.pop();
avcodec_send_packet(avCodecContext, packet);
AVFrame *frame = av_frame_alloc();
if (!avcodec_receive_frame(avCodecContext, frame)) {
    // fall back to the video clock when the frame carries no usable timestamp
    if ((pts = frame->best_effort_timestamp) == AV_NOPTS_VALUE) {
        pts = videoClock;
    }
    // presentation time of this frame, in seconds
    play = pts * frame_time_stamp;
    // video clock = end of this frame: its pts plus its duration (including repeated fields)
    videoClock =
            play + (frame->repeat_pict * 0.5 * frame_time_stamp + frame_time_stamp);
    // nominal interval between the previous frame and this one
    delay = play - last_play;
    // how far the video clock is ahead (+) or behind (-) the audio clock
    diff = videoClock - audioClock;
    // acceptable drift, clamped to a sane range derived from the frame interval
    sync_threshold = FFMAX(AV_SYNC_THRESHOLD_MIN, FFMIN(AV_SYNC_THRESHOLD_MAX, delay));
    // only correct while the drift is under 10 s; beyond that sync is considered lost
    if (fabs(diff) < 10) {
        if (diff <= -sync_threshold) {
            // video is behind the audio: shorten the delay (never below 10 ms)
            delay = FFMAX(0.01, delay + diff);
        } else if (diff >= sync_threshold && delay > AV_SYNC_FRAMEDUP_THRESHOLD) {
            // video is ahead and the frame is long anyway: extend by the full drift
            delay = delay + diff;
        } else if (diff >= sync_threshold && delay) {
            // video is ahead: keep the frame on screen twice as long
            delay = 2 * delay;
        }
    }
    // throw away implausible values and reuse the previous delay
    if (delay <= 0 || delay > 1) {
        delay = last_delay;
    }
    last_delay = delay;
    // subtract the time already spent decoding this frame
    delay = delay + (decodeStartTime - av_gettime() / 1000000.0);
    if (delay < 0) {
        delay = 0;
    }
    last_play = play;
Finally, sleep after the video frame has been pushed to the surface:
ANativeWindow_unlockAndPost(nativeWindow);
if (delay > 0.001) {
    av_usleep(delay * 1000000);
}
A few details could still be improved, but that is the general idea.