NuPlayer source code analysis (3): video frame processing and the synchronization mechanism

1. Introduction:
The previous blog analyzed buffer handling on the audio side. NuPlayer shares a lot of code between the audio and video buffer paths, so this post focuses on the differences. Overall, NuPlayer's synchronization mechanism is similar to ExoPlayer's: the display time is estimated from the pts in the stream and the system time, and then aligned to a vertical-sync (vsync) point to determine the final display time. The difference is that NuPlayer's calibration of the display time is quite involved and parts of it are hard to follow; if you skip the calibration details, the rest is still easy to understand.

2. Determine the display time of the video frame:
Many of the audio and video buffer processing functions are shared, so we go straight to the postDrainVideoQueue_l function in NuPlayerRenderer.cpp:

void NuPlayer::Renderer::postDrainVideoQueue_l() {
    if (mDrainVideoQueuePending
            || mSyncQueues
            || (mPaused && mVideoSampleReceived)) {
        return;
    }

    if (mVideoQueue.empty()) {
        return;
    }

    QueueEntry &entry = *mVideoQueue.begin();

    /* 1. Create the kWhatDrainVideoQueue message that will be posted later */
    sp<AMessage> msg = new AMessage(kWhatDrainVideoQueue, id());
    msg->setInt32("generation", mVideoQueueGeneration);

    if (entry.mBuffer == NULL) {
        // EOS doesn't carry a timestamp.
        msg->post();
        mDrainVideoQueuePending = true;
        return;
    }

    int64_t delayUs;
    int64_t nowUs = ALooper::GetNowUs();
    int64_t realTimeUs;
    if (mFlags & FLAG_REAL_TIME) {
        int64_t mediaTimeUs;
        CHECK(entry.mBuffer->meta()->findInt64("timeUs", &mediaTimeUs));
        realTimeUs = mediaTimeUs;
    } else {
        int64_t mediaTimeUs;
        CHECK(entry.mBuffer->meta()->findInt64("timeUs", &mediaTimeUs));

        if (mAnchorTimeMediaUs < 0) {
            setAnchorTime(mediaTimeUs, nowUs);
            mPausePositionMediaTimeUs = mediaTimeUs;
            mAnchorMaxMediaUs = mediaTimeUs;
            realTimeUs = nowUs;
        } else {
            /* 2. Get the display time of the current frame */
            realTimeUs = getRealTimeUs(mediaTimeUs, nowUs);
        }
        if (!mHasAudio) {
            mAnchorMaxMediaUs = mediaTimeUs + 100000; // smooth out videos >= 10fps
        }

        // Heuristics to handle situation when media time changed without a
        // discontinuity. If we have not drained an audio buffer that was
        // received after this buffer, repost in 10 msec. Otherwise repost
        // in 500 msec.
        /* Rendering delay of the current frame = display time - system time */
        delayUs = realTimeUs - nowUs;
        if (delayUs > 500000) {
            int64_t postDelayUs = 500000;
            if (mHasAudio && (mLastAudioBufferDrained - entry.mBufferOrdinal) <= 0) {
                postDelayUs = 10000;
            }
            msg->setWhat(kWhatPostDrainVideoQueue);
            msg->post(postDelayUs);
            mVideoScheduler->restart();
            ALOGI("possible video time jump of %dms, retrying in %dms",
                    (int)(delayUs / 1000), (int)(postDelayUs / 1000));
            mDrainVideoQueuePending = true;
            return;
        }
    }

    /* 3. Calibrate the display time */
    realTimeUs = mVideoScheduler->schedule(realTimeUs * 1000) / 1000;

    /* 4. Compute the duration of two vsync periods */
    int64_t twoVsyncsUs = 2 * (mVideoScheduler->getVsyncPeriod() / 1000);

    /* Recompute the rendering delay = calibrated display time - system time */
    delayUs = realTimeUs - nowUs;

    /* 5. Post for display two vsync periods early */
    ALOGW_IF(delayUs > 500000, "unusually high delayUs: %" PRId64, delayUs);
    // post 2 display refreshes before rendering is due
    msg->post(delayUs > twoVsyncsUs ? delayUs - twoVsyncsUs : 0);

    mDrainVideoQueuePending = true;
}

Note 1:
Once the display time has been determined, the message is posted, and the actual rendering happens when the message is handled.
Note 2:
This is where a preliminary display time is obtained. mediaTimeUs and nowUs are the pts from the stream and the current system time respectively; follow getRealTimeUs to see how the initial display time is calculated:

int64_t NuPlayer::Renderer::getRealTimeUs(int64_t mediaTimeUs, int64_t nowUs) {
    int64_t currentPositionUs;
    /* Get the current playback position */
    if (mPaused || getCurrentPositionOnLooper(
            &currentPositionUs, nowUs, true /* allowPastQueuedVideo */) != OK) {
        // If failed to get current position, e.g. due to audio clock is not ready, then just
        // play out video immediately without delay.
        return nowUs;
    }
    /* current frame pts - current playback position + system time */
    return (mediaTimeUs - currentPositionUs) + nowUs;
}

Looking at how the return value is computed, we first need the current playback position, which is obtained by getCurrentPositionOnLooper:

status_t NuPlayer::Renderer::getCurrentPositionOnLooper(
        int64_t *mediaUs, int64_t nowUs, bool allowPastQueuedVideo) {
    int64_t currentPositionUs;
    /* The if condition returns false; this branch is only taken in the paused state */
    if (getCurrentPositionIfPaused_l(&currentPositionUs)) {
        *mediaUs = currentPositionUs;
        return OK;
    }

    return getCurrentPositionFromAnchor(mediaUs, nowUs, allowPastQueuedVideo);
}

Follow up to getCurrentPositionFromAnchor:

// Called on any threads.
status_t NuPlayer::Renderer::getCurrentPositionFromAnchor(
        int64_t *mediaUs, int64_t nowUs, bool allowPastQueuedVideo) {
    Mutex::Autolock autoLock(mTimeLock);
    if (!mHasAudio && !mHasVideo) {
        return NO_INIT;
    }

    if (mAnchorTimeMediaUs < 0) {
        return NO_INIT;
    }

    /* Current playback position = (system time - anchor real time) + anchor media time */
    int64_t positionUs = (nowUs - mAnchorTimeRealUs) + mAnchorTimeMediaUs;

    if (mPauseStartedTimeRealUs != -1) {
        positionUs -= (nowUs - mPauseStartedTimeRealUs);
    }

    // limit position to the last queued media time (for video only stream
    // position will be discrete as we don't know how long each frame lasts)
    if (mAnchorMaxMediaUs >= 0 && !allowPastQueuedVideo) {
        if (positionUs > mAnchorMaxMediaUs) {
            positionUs = mAnchorMaxMediaUs;
        }
    }

    if (positionUs < mAudioFirstAnchorTimeMediaUs) {
        positionUs = mAudioFirstAnchorTimeMediaUs;
    }

    *mediaUs = (positionUs <= 0) ? 0 : positionUs;
    return OK;
}
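
As a quick sanity check of the anchor math above, here is a minimal sketch with invented numbers: the current media position is just the anchor media time plus the wall-clock time that has elapsed since the anchor was taken.

#include <cstdint>
#include <cassert>

// Sketch of the extrapolation in getCurrentPositionFromAnchor (invented values).
int64_t positionFromAnchorUs(int64_t nowUs, int64_t anchorRealUs, int64_t anchorMediaUs) {
    // Wall-clock time elapsed since the anchor, added to the media time that
    // was current when the anchor was set.
    return (nowUs - anchorRealUs) + anchorMediaUs;
}

int main() {
    // Anchor: at system time 10.000 s the audio clock reported media time 2.000 s.
    // 50 ms later the playback position should read 2.050 s.
    assert(positionFromAnchorUs(10050000, 10000000, 2000000) == 2050000);
    return 0;
}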

Note that NuPlayer's synchronization still follows the strategy of adjusting video frames against the audio timestamps, so here we need the position of the audio that has already been played; the core is the comment in the code above. Next, let's see where mAnchorTimeRealUs is updated. mAnchorTimeRealUs records the real-time reference for the audio playback position and is updated through setAnchorTime:

void NuPlayer::Renderer::setAnchorTime(
        int64_t mediaUs, int64_t realUs, int64_t numFramesWritten, bool resume) {
    Mutex::Autolock autoLock(mTimeLock);
    /* Update the audio timestamp obtained from the stream */
    mAnchorTimeMediaUs = mediaUs;
    /* Update the real time at which AudioTrack is actually playing that timestamp */
    mAnchorTimeRealUs = realUs;
    /* Update the number of frames actually written to AudioTrack */
    mAnchorNumFramesWritten = numFramesWritten;
    if (resume) {
        mPauseStartedTimeRealUs = -1;
    }
}

setAnchorTime is invoked from the Renderer's fillAudioBuffer, the callback that the AudioSink calls into to fetch audio data:

size_t NuPlayer::Renderer::fillAudioBuffer(void *buffer, size_t size) {
    ...
    if (mAudioFirstAnchorTimeMediaUs >= 0) {
        int64_t nowUs = ALooper::GetNowUs();
        setAnchorTime(mAudioFirstAnchorTimeMediaUs, nowUs - getPlayedOutAudioDurationUs(nowUs));
    }
    ...
}
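
In other words, the anchor's real-time component is pushed back by however much audio has already been played out, so that the pair (media time, real time) describes the same instant. A tiny worked example with invented numbers:

#include <cstdint>
#include <cassert>

int main() {
    // Invented values: the first audio pts is 0, 300 ms of audio has already
    // been played out, and the current system time is 20.000 s.
    int64_t firstAudioMediaUs = 0;
    int64_t playedOutUs = 300000;
    int64_t nowUs = 20000000;

    // setAnchorTime(firstAudioMediaUs, nowUs - playedOutUs): media time 0
    // corresponds to system time 19.700 s.
    int64_t anchorRealUs = nowUs - playedOutUs;

    // Extrapolating back to "now" must give the played-out duration again.
    assert((nowUs - anchorRealUs) + firstAudioMediaUs == playedOutUs);
    return 0;
}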

The key point is how getPlayedOutAudioDurationUs obtains the actual played-out duration:

int64_t NuPlayer::Renderer::getPlayedOutAudioDurationUs(int64_t nowUs) {
    uint32_t numFramesPlayed;
    int64_t numFramesPlayedAt;
    AudioTimestamp ts;
    static const int64_t kStaleTimestamp100ms = 100000;

    /* Call getTimestamp to obtain a precise playback position */
    status_t res = mAudioSink->getTimestamp(ts);
    if (res == OK) {                 // case 1: mixing audio tracks and offloaded tracks.
        /* Number of audio frames already played */
        numFramesPlayed = ts.mPosition;
        /* System time at which the lower layer updated that value */
        numFramesPlayedAt =
            ts.mTime.tv_sec * 1000000LL + ts.mTime.tv_nsec / 1000;
        /* Age of the timestamp relative to the current system time */
        const int64_t timestampAge = nowUs - numFramesPlayedAt;
        /* If it is older than 100ms, clamp it to (current time - 100ms) */
        if (timestampAge > kStaleTimestamp100ms) {
            // This is an audio FIXME.
            // getTimestamp returns a timestamp which may come from audio mixing threads.
            // After pausing, the MixerThread may go idle, thus the mTime estimate may
            // become stale. Assuming that the MixerThread runs 20ms, with FastMixer at 5ms,
            // the max latency should be about 25ms with an average around 12ms (to be verified).
            // For safety we use 100ms.
            ALOGV("getTimestamp: returned stale timestamp nowUs(%lld) numFramesPlayedAt(%lld)",
                    (long long)nowUs, (long long)numFramesPlayedAt);
            numFramesPlayedAt = nowUs - kStaleTimestamp100ms;
        }
        //ALOGD("getTimestamp: OK %d %lld", numFramesPlayed, (long long)numFramesPlayedAt);
    } else if (res == WOULD_BLOCK) { // case 2: transitory state on start of a new track
        numFramesPlayed = 0;
        numFramesPlayedAt = nowUs;
        //ALOGD("getTimestamp: WOULD_BLOCK %d %lld",
        //        numFramesPlayed, (long long)numFramesPlayedAt);
    } else {                         // case 3: transitory at new track or audio fast tracks.
        /* Call getPlaybackHeadPosition to get the number of frames played so far */
        res = mAudioSink->getPosition(&numFramesPlayed);
        CHECK_EQ(res, (status_t)OK);
        numFramesPlayedAt = nowUs;
        numFramesPlayedAt += 1000LL * mAudioSink->latency() / 2; /* XXX */
        //ALOGD("getPosition: %d %lld", numFramesPlayed, numFramesPlayedAt);
    }

    // TODO: remove the (int32_t) casting below as it may overflow at 12.4 hours.
    //CHECK_EQ(numFramesPlayed & (1 << 31), 0);  // can't be negative until 12.4 hrs, test
    /* Played-out duration (us) = frames played * duration per frame + (now - system time at which that frame count was observed) */
    int64_t durationUs = (int64_t)((int32_t)numFramesPlayed * 1000LL * mAudioSink->msecsPerFrame())
            + nowUs - numFramesPlayedAt;
    if (durationUs < 0) {
        // Occurs when numFramesPlayed position is very small and the following:
        // (1) In case 1, the time nowUs is computed before getTimestamp() is called and
        //     numFramesPlayedAt is greater than nowUs by time more than numFramesPlayed.
        // (2) In case 3, using getPosition and adding mAudioSink->latency() to
        //     numFramesPlayedAt, by a time amount greater than numFramesPlayed.
        //
        // Both of these are transitory conditions.
        ALOGV("getPlayedOutAudioDurationUs: negative duration %lld set to zero", (long long)durationUs);
        durationUs = 0;
    }
    ALOGV("getPlayedOutAudioDurationUs(%lld) nowUs(%lld) frames(%u) framesAt(%lld)",
            (long long)durationUs, (long long)nowUs, numFramesPlayed, (long long)numFramesPlayedAt);
    return durationUs;
}

Understanding this function requires knowing AudioTrack's getTimestamp and getPlaybackHeadPosition. These are the two corresponding interfaces at the Java layer:

/* 1 */
public boolean getTimestamp(AudioTimestamp timestamp);
/* 2 */
public int getPlaybackHeadPosition();

The former has to be implemented by the device's lower layers. It can be understood as the function that obtains an accurate playback position, but it does not refresh its value frequently, so it is not suitable for frequent calls; the official documentation suggests calling it roughly once every 10 to 60 seconds. Its result class looks like this:

public long framePosition; /* frame position */
public long nanoTime;      /* system time at which that frame position was updated */

The latter can be called frequently; it returns the number of frames that the AudioTrack has written towards the HAL since playback started. With these two functions understood, we know there are two ways to obtain an audio position from AudioTrack. Note that AudioSink wraps AudioTrack, and there are differences between the Java layer and the native layer, which is why the function names and return values used in the NuPlayer code differ slightly from the Java interfaces above.
Looking back at the code, it is essentially a calibration of the current playback position: compare the system time at which the lower layer reported the frame count with the current time, compute the difference, and use that difference to correct the elapsed played-out duration.
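
A minimal sketch of that calibration, assuming a fixed sample rate (the real code uses mAudioSink->msecsPerFrame() and the 100ms stale-timestamp clamp shown above; all names below are hypothetical):

#include <cstdint>

// Hypothetical helper: estimate how much audio has actually been played out.
// framesPlayed / framesPlayedAtUs mimic AudioTimestamp: a frame position plus
// the system time at which that position was observed.
int64_t playedOutDurationUsSketch(uint32_t framesPlayed, int64_t framesPlayedAtUs,
                                  int64_t nowUs, uint32_t sampleRate) {
    // Duration covered by the reported frame count...
    int64_t reportedUs = (int64_t)framesPlayed * 1000000LL / sampleRate;
    // ...plus the wall-clock time that passed since the report was taken,
    // because playback kept running while we were reading the timestamp.
    return reportedUs + (nowUs - framesPlayedAtUs);
}

int main() {
    // 48 kHz track: 96000 frames reported 5 ms ago -> about 2.005 s played out.
    return playedOutDurationUsSketch(96000, 1000000, 1005000, 48000) == 2005000 ? 0 : 1;
}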
Back to getCurrentPositionFromAnchor above:

    /* Current playback position = (system time - anchor real time) + anchor media time */
    int64_t positionUs = (nowUs - mAnchorTimeRealUs) + mAnchorTimeMediaUs;

With the calibrated current playback position in hand, Note 2 becomes clear: the goal is to infer the display time of the video frame, in a synchronized state, from the calibrated audio position reported by the lower layers.
Note 3:
Here the display time is calibrated once more, but NuPlayer's calibration (VideoFrameScheduler) is too involved to cover here; a toy sketch of the general idea follows after these notes.
Note 4:
The official MediaCodec documentation suggests submitting video frames for display two vsync points in advance, so NuPlayer computes the duration of two vsync periods here.
Note 5:
This ensures the message is delivered two vsync periods before the frame is due, and this message is the kWhatDrainVideoQueue message from Note 1.
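
The real VideoFrameScheduler::schedule() does vsync tracking and drift compensation that this post does not attempt to explain. Purely as an illustration of the idea (not the AOSP algorithm), snapping a desired display time onto a vsync grid looks roughly like this:

#include <cstdint>

// Toy illustration only: round a desired display time up to the next vsync
// boundary, given the period and the time of some past vsync.
int64_t snapToVsyncNs(int64_t desiredTimeNs, int64_t vsyncBaseNs, int64_t vsyncPeriodNs) {
    if (desiredTimeNs <= vsyncBaseNs) {
        return vsyncBaseNs;
    }
    int64_t elapsedNs = desiredTimeNs - vsyncBaseNs;
    int64_t cycles = (elapsedNs + vsyncPeriodNs - 1) / vsyncPeriodNs;  // round up
    return vsyncBaseNs + cycles * vsyncPeriodNs;
}

int main() {
    // 60 Hz display (~16.67 ms period): a frame wanted at +40 ms gets snapped
    // to the third vsync after the base, i.e. about +50 ms.
    const int64_t periodNs = 16666667;
    return snapToVsyncNs(40000000, 0, periodNs) == 3 * periodNs ? 0 : 1;
}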

3. Video rendering:
Let's see how the kWhatDrainVideoQueue message handles rendering:

        case kWhatDrainVideoQueue:
        {
            int32_t generation;
            CHECK(msg->findInt32("generation", &generation));
            if (generation != mVideoQueueGeneration) {
                break;
            }

            mDrainVideoQueuePending = false;

            onDrainVideoQueue();

            Mutex::Autolock autoLock(mLock);
            postDrainVideoQueue_l();
            break;
        }

Follow up to the onDrainVideoQueue() function:

void NuPlayer::Renderer::onDrainVideoQueue() {
    if (mVideoQueue.empty()) {
        return;
    }

    QueueEntry *entry = &*mVideoQueue.begin();

    if (entry->mBuffer == NULL) {
        // EOS

        notifyEOS(false /* audio */, entry->mFinalResult);

        mVideoQueue.erase(mVideoQueue.begin());
        entry = NULL;

        setVideoLateByUs(0);
        return;
    }

    int64_t nowUs = -1;
    int64_t realTimeUs;
    if (mFlags & FLAG_REAL_TIME) {
        CHECK(entry->mBuffer->meta()->findInt64("timeUs", &realTimeUs));
    } else {
        int64_t mediaTimeUs;
        CHECK(entry->mBuffer->meta()->findInt64("timeUs", &mediaTimeUs));

        nowUs = ALooper::GetNowUs();
        /* Recompute the actual display time */
        realTimeUs = getRealTimeUs(mediaTimeUs, nowUs);
    }

    bool tooLate = false;

    if (!mPaused) {
        if (nowUs == -1) {
            nowUs = ALooper::GetNowUs();
        }
        /* Compute how late the current frame is */
        setVideoLateByUs(nowUs - realTimeUs);
        /* If it is more than 40ms late, the frame will be dropped */
        tooLate = (mVideoLateByUs > 40000);

        if (tooLate) {
            ALOGV("video late by %lld us (%.2f secs)",
                 mVideoLateByUs, mVideoLateByUs / 1E6);
        } else {
            ALOGV("rendering video at media time %.2f secs",
                    (mFlags & FLAG_REAL_TIME ? realTimeUs :
                    (realTimeUs + mAnchorTimeMediaUs - mAnchorTimeRealUs)) / 1E6);
        }
    } else {
        setVideoLateByUs(0);
        if (!mVideoSampleReceived && !mHasAudio) {
            // This will ensure that the first frame after a flush won't be used as anchor
            // when renderer is in paused state, because resume can happen any time after seek.
            setAnchorTime(-1, -1);
        }
    }

    /* Note: these two values decide whether the frame is displayed or dropped */
    entry->mNotifyConsumed->setInt64("timestampNs", realTimeUs * 1000ll);
    entry->mNotifyConsumed->setInt32("render", !tooLate);
    entry->mNotifyConsumed->post();
    mVideoQueue.erase(mVideoQueue.begin());
    entry = NULL;

    mVideoSampleReceived = true;

    if (!mPaused) {
        if (!mVideoRenderingStarted) {
            mVideoRenderingStarted = true;
            notifyVideoRenderingStart();
        }
        notifyIfMediaRenderingStarted();
    }
}

This function is not complicated: the display time is computed once more to absorb the latency of posting and handling the message. If the video frame arrives more than 40ms late, tooLate is set to true and the frame is dropped. The first thing to note is that although the display time is recalculated here, schedule is not called again to re-calibrate it; I am not sure whether that is a bug. The internals of schedule are not covered here, but we can be sure that it adjusts the theoretical display time onto a vsync point. The message posted by mNotifyConsumed->post() is kWhatRenderBuffer, which was already analyzed in the audio part, so let's just look at how it is handled without repeating the rest:

void NuPlayer::Decoder::onRenderBuffer(const sp<AMessage> &msg) {
    status_t err;
    int32_t render;
    size_t bufferIx;
    CHECK(msg->findSize("buffer-ix", &bufferIx));

    if (!mIsAudio) {
        int64_t timeUs;
        sp<ABuffer> buffer = mOutputBuffers[bufferIx];
        buffer->meta()->findInt64("timeUs", &timeUs);

        if (mCCDecoder != NULL && mCCDecoder->isSelected()) {
            mCCDecoder->display(timeUs);
        }
    }

    /* If render is false, the frame is dropped instead of rendered */
    if (msg->findInt32("render", &render) && render) {
        int64_t timestampNs;
        CHECK(msg->findInt64("timestampNs", &timestampNs));
        err = mCodec->renderOutputBufferAndRelease(bufferIx, timestampNs);
    } else {
        err = mCodec->releaseOutputBuffer(bufferIx);
    }
    if (err != OK) {
        ALOGE("failed to release output buffer for %s (err=%d)",
                mComponentName.c_str(), err);
        handleError(err);
    }
}

4. Summary:
The logic of video frame processing and the synchronization mechanism is summarized in the diagram below:
[Figure: video frame processing and synchronization flow]
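
For reference, the whole path can also be compressed into the small self-contained model below (simplified, with invented numbers; it mirrors only the non-real-time path analyzed above and ignores error handling):

#include <cstdint>
#include <cstdio>

// Simplified model of the flow: 1) derive the frame's display time from the
// audio-based position, 2) post the drain message two vsync periods early,
// 3) at drain time, drop the frame if it is more than 40 ms late.
struct Clock {
    int64_t anchorMediaUs;  // media time when the anchor was taken
    int64_t anchorRealUs;   // system time when the anchor was taken
    int64_t positionUs(int64_t nowUs) const {
        return (nowUs - anchorRealUs) + anchorMediaUs;
    }
};

int main() {
    const int64_t vsyncUs = 16667;       // ~60 Hz
    Clock clock{2000000, 10000000};      // invented anchor values
    int64_t nowUs = 10010000;            // system time at postDrainVideoQueue_l
    int64_t mediaTimeUs = 2060000;       // video frame pts

    // Step 1: preliminary display time (getRealTimeUs).
    int64_t realTimeUs = (mediaTimeUs - clock.positionUs(nowUs)) + nowUs;

    // Step 2: post the drain message two vsyncs before the frame is due.
    int64_t delayUs = realTimeUs - nowUs;
    int64_t postDelayUs = delayUs > 2 * vsyncUs ? delayUs - 2 * vsyncUs : 0;

    // Step 3: at drain time, decide whether to render or drop (40 ms threshold).
    int64_t drainNowUs = nowUs + postDelayUs;
    bool tooLate = (drainNowUs - realTimeUs) > 40000;

    std::printf("display at +%lld us, post drain at +%lld us, %s\n",
            (long long)delayUs, (long long)postDelayUs,
            tooLate ? "drop" : "render");
    return 0;
}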


Origin blog.csdn.net/achina2011jy/article/details/113929427