FFmpeg for beginners - calling the API to calculate key frame rendering time

This post uses a simple calculation to work out the rendering time of the I-frames in a video. For the complete code, see https://andy-zhangtao.github.io/ffmpeg-examples/

Glossary

We first need to define the following terms and concepts:

I / P / B-frames (for the specific differences, see https://www.jianshu.com/p/18af03556431 )
I-frame: intra-coded frame (key frame)
P-frame: forward-predicted frame (computed from its difference relative to a preceding I- or P-frame)
B-frame: bidirectionally predicted frame (computed from its differences relative to the preceding and following I- or P-frames)
PTS: presentation timestamp (the point in time at which the frame is displayed)
DTS: decoding timestamp (the point in time at which the frame is decoded)
Timestamp: the time of a frame within the video
Time_base: the "scale", that is, the unit in which the video's time is measured

Process flow

There is no absolute time within a video, only relative time (relative to the start of the video). For example, "00:00:05" on a player's timeline means that the current frame is decoded and rendered 5 seconds after the start point (00:00:00).

"00:00:05" is only a display form that is easy for users to read. Inside the video, time is stored as timestamps, and "00:00:05" most likely corresponds to a timestamp of 5000000 μs (ignoring rounding).
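The relationship between the stored value and the displayed time can be sketched as follows. This is only an illustration: format_hms is a hypothetical helper name, not an FFmpeg function, and it assumes the stored value is a microsecond count measured from the start of the video.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not part of FFmpeg): format a microsecond offset
 * from the start of the video as "hh:mm:ss". Fractions of a second are
 * truncated. */
static void format_hms(int64_t us, char *buf, size_t len)
{
    int64_t total = us / 1000000; /* whole seconds since the start */
    snprintf(buf, len, "%02d:%02d:%02d",
             (int)(total / 3600),      /* hours */
             (int)(total / 60 % 60),   /* minutes */
             (int)(total % 60));       /* seconds */
}
```

Calling format_hms(5000000, buf, sizeof buf) fills buf with "00:00:05", the value a player would show on its timeline.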

So how is the timestamp calculated? It is computed from the PTS together with Time_base.

First, look at Time_base. Time_base is like a ruler marked with graduations. For example, (1,60) means the time scale is 1/60, so each unit of time is 1/60 of a second. Likewise, (1,1000) means each unit of time is 1 millisecond.

As mentioned above, pts is the display time expressed on this scale, that is, how many graduations it occupies. Put plainly, pts is the number of graduations, and time_base says how long each graduation is.
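A minimal sketch of this "number of graduations times length of one graduation" idea is shown below. TimeBase is a stand-in for FFmpeg's AVRational so that the example builds without FFmpeg, and tick_seconds and ticks_to_seconds are hypothetical names chosen for illustration. The division in tick_seconds is the same one av_q2d performs.

```c
#include <stdint.h>

/* Stand-in for FFmpeg's AVRational (assumption: used here only so the
 * sketch compiles without the FFmpeg headers). */
typedef struct TimeBase {
    int num; /* numerator */
    int den; /* denominator */
} TimeBase;

/* Length of one graduation in seconds: num / den as a double. */
static double tick_seconds(TimeBase tb)
{
    return tb.num / (double)tb.den;
}

/* Convert a pts (a count of graduations) to seconds. */
static double ticks_to_seconds(int64_t pts, TimeBase tb)
{
    return pts * tick_seconds(tb);
}
```

With a time_base of (1,1000), a pts of 5000 works out to about 5 seconds. With (1,60), a pts of 300 gives the same 5 seconds.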

But what is this useful for? The most important role of time_base is to unify the "time rhythm". For example, suppose video A is encoded with a time_base of 1/1000 and a frame is saved with a pts of 465000. If video A is then decoded with a time_base of 1/9000, the two time scales are inconsistent, so when decoding, the pts must first be converted to a timestamp with pts * encode_time_base (here 465000 * 1/1000 = 465 seconds) to ensure correct decoding.
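The conversion between two time scales can be sketched with plain integer arithmetic. Rational is a stand-in for AVRational and rescale_ticks is a hypothetical name. FFmpeg ships av_rescale_q for this purpose, and this sketch does not reproduce its rounding or overflow handling, so treat it as an illustration of the idea only.

```c
#include <stdint.h>

/* Stand-in for FFmpeg's AVRational (assumption, for a self-contained example). */
typedef struct Rational {
    int num; /* numerator */
    int den; /* denominator */
} Rational;

/* Re-express pts, counted in units of `from`, in units of `to`:
 * pts * from / to, computed in integers and truncated toward zero.
 * No overflow protection: large pts values may overflow int64_t. */
static int64_t rescale_ticks(int64_t pts, Rational from, Rational to)
{
    int64_t numerator = (int64_t)from.num * to.den;
    int64_t denominator = (int64_t)from.den * to.num;
    return pts * numerator / denominator;
}
```

For the example above, a pts of 465000 at 1/1000 becomes 4185000 at 1/9000. Both values describe the same instant, 465 seconds after the start.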

Coding

The section above covered the theory. Below we look, from the code's point of view, at how the timestamp and the time are calculated.

We only need to show the pts, time_base, and time of each frame, so there is no need to initialize an output; initializing the input is enough.

Initialization input source

Following the initialization approach introduced in earlier posts, we only need these steps: open the file -> find the video stream -> initialize the decoder.

    +------------------------+              +-------------------------+
    |  avformat_open_input   | ------------>|avformat_find_stream_info|
    +------------------------+              +-------------------------+
                                                      |
                                                      |
                                                      |
                                                     \|/
   +-----------------------------+           +-------------------------+
   |avcodec_parameters_to_context| <---------|   avcodec_find_decoder  |
   +-----------------------------+           +-------------------------+

avcodec_parameters_to_context deserves particular attention. This function initializes the codec context specified by the user from the codec information of the input source. If the codec information does not match or is set incorrectly, inexplicable decoding errors occur. In general, most of these decoding errors disappear once this function is called.

Time calculation

time_base is a struct

typedef struct AVRational{
    int num; ///< Numerator
    int den; ///< Denominator
} AVRational;

num is the numerator and den is the denominator. For time_base, num is typically 1 and den is the number of parts into which each second is divided. As mentioned above, pts * time_base gives the timestamp, so we need to know how much time each graduation represents. av_q2d gives that value as a double.

While looping over the decoded frames, the pts of the current frame can be read directly from iframe->pts. Multiplying it by the length of one graduation gives the timestamp of the current frame: iframe->pts * av_q2d(_time_base).

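The pseudo-code is as follows (ifmt_ctx, dec_ctx, and pkt are placeholder names for the format context, decoder context, and packet set up during initialization):

while (av_read_frame(ifmt_ctx, pkt) >= 0) {
    avcodec_send_packet(dec_ctx, pkt);
    ...
    while (avcodec_receive_frame(dec_ctx, iframe) >= 0) {
        ...
        /* timestamp in seconds = number of graduations * seconds per graduation */
        double seconds = iframe->pts * av_q2d(_time_base);
        ...
    }
}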