FFmpeg re-learning -- FFmpeg decoding knowledge

Continuing with Lei Xiaohua's course materials on building a video player based on FFmpeg+SDL.

The previous five posts covered FFmpeg, mainly with the goal of converting pictures to video.

In general, I now have some understanding of FFmpeg, but the source code is still not clear to me at all. Next, I briefly sort out the FFmpeg source code structure. After all, my current job does not put much emphasis on this; I will wait for a chance to study it systematically later.

ffmpeg re-learning -- Linux installation instructions

ffmpeg re-learning -- Windows installation instructions

ffmpeg re-learning -- convert jpeg to mp4

ffmpeg re-learning -- hardware accelerated codec

ffmpeg re-learning -- basic knowledge of video and audio


1. Introduction to FFmpeg

See the README in the FFmpeg source code:

FFmpeg is a collection of libraries and tools to process multimedia content such as audio, video, subtitles and related metadata.

The libraries it contains are:

libavcodec provides implementations of a wider range of codecs.

libavformat implements streaming protocols, container formats and basic I/O access.

libavutil includes hashers, decompressors, and miscellaneous utility functions.

libavfilter provides means to alter decoded audio and video through a series of filters.

libavdevice provides an abstraction for accessing capture and playback devices.

libswresample implements audio mixing and resampling routines.

libswscale implements color conversion and scaling routines.

Tools

ffmpeg is a command line toolkit for manipulating, converting and streaming multimedia content.

ffplay is a minimalistic multimedia player.

ffprobe is a simple analysis tool for inspecting multimedia content.

ffserver is a multimedia streaming server for live streaming.

Other small tools such as aviocat, ismindex and qt-faststart.

2. FFmpeg decoding function

An introduction to the FFmpeg decoding functions:

av_register_all() registers all components.

avformat_open_input() opens an input video file.

avformat_find_stream_info() gets video file information.

avcodec_find_decoder() finds a decoder.

avcodec_open2() opens the decoder.

av_read_frame() reads one frame of compressed data from the input file.

avcodec_decode_video2() decodes one frame of compressed data.

avcodec_close() closes the decoder.

avformat_close_input() closes the input video file.

The FFmpeg decoding flow follows the order of the functions above (the flowchart image from the original post is not reproduced here).

Source code analysis

【Architecture Diagram】

FFmpeg source code structure diagram - decoding

FFmpeg source code structure diagram - encoding

【General】

Simple analysis of FFmpeg source code: av_register_all()

Simple analysis of FFmpeg source code: avcodec_register_all()

Simple analysis of FFmpeg source code: memory allocation and release (av_malloc(), av_free(), etc.)

Simple analysis of FFmpeg source code: initialization and destruction of common structures (AVFormatContext, AVFrame, etc.)

Simple analysis of FFmpeg source code: avio_open2()

Simple analysis of FFmpeg source code: av_find_decoder() and av_find_encoder()

Simple analysis of FFmpeg source code: avcodec_open2()

Simple analysis of FFmpeg source code: avcodec_close()

【decoding】

Graphical FFMPEG open media function avformat_open_input

Simple analysis of FFmpeg source code: avformat_open_input()

Simple analysis of FFmpeg source code: avformat_find_stream_info()

Simple analysis of FFmpeg source code: av_read_frame()

Simple analysis of FFmpeg source code: avcodec_decode_video2()

Simple analysis of FFmpeg source code: avformat_close_input()

3. The data structures of FFmpeg decoding

The structure definitions below can be inspected with Go to Definition (F12).

The data structures used in FFmpeg decoding are described below (the structure diagram from the original post is not reproduced here).

Introduction to FFmpeg data structure

AVFormatContext

The container format context structure: a global structure that stores information about the container format of a video file.

      iformat: the AVInputFormat of the input video
      nb_streams: the number of AVStreams in the input video
      streams: the AVStream[] array of the input video
      duration: the duration of the input video (in microseconds)
      bit_rate: the bit rate of the input video

AVInputFormat

Each container format (such as FLV, MKV, MP4, AVI) corresponds to one such structure.

      name: name of the container format
      long_name: long name of the container format
      extensions: file extensions of the container format
      id: ID of the container format
      plus some interface functions for handling the container format

AVStream

Each video (audio) stream in the video file corresponds to this structure.

      id: index number
      codec: the AVCodecContext for this stream
      time_base: the time base of this stream
      r_frame_rate: the frame rate of this stream

AVCodecContext

The codec context structure stores video (audio) codec-related information.

      codec: the codec's AVCodec
      width, height: image width and height (video only)
      pix_fmt: pixel format (video only)
      sample_rate: sample rate (audio only)
      channels: number of channels (audio only)
      sample_fmt: sample format (audio only)

AVCodec

Each video (audio) codec (e.g. the H.264 codec) corresponds to this structure.

      name: codec name
      long_name: codec long name
      type: codec type
      id: codec ID
      plus some codec interface functions

AVPacket

Stores a frame of compressed encoded data.

      pts: presentation timestamp
      dts: decoding timestamp
      data: the compressed encoded data
      size: size of the compressed encoded data
      stream_index: the AVStream this packet belongs to

AVFrame

Stores a frame of decoded pixel (sample) data.

      data: the decoded image pixel data (audio sample data).
      linesize: for video, the size of one row of pixels in the image; for audio, the size of the whole audio frame.
      width, height: image width and height (video only).
      key_frame: whether this frame is a key frame (video only).
      pict_type: frame type (video only), e.g. I, P, B.

4. Decoding example

(1) The source code is as follows:

/**
* The simplest FFmpeg-based decoder
* Simplest FFmpeg Decoder
*
* Lei Xiaohua
* [email protected]
* Communication University of China / Digital TV Technology
* http://blog.csdn.net/leixiaohua1020
*
* This program decodes video files (HEVC, H.264, MPEG-2, etc. are supported).
* It is the simplest tutorial on FFmpeg video decoding;
* working through it is a good way to learn the FFmpeg decoding flow.
* This software is a simplest video decoder based on FFmpeg.
* Suitable for beginner of FFmpeg.
*
*/
 
#ifdef _WIN32
#include "stdafx.h"  // MSVC precompiled header; must come first under MSVC
#endif

#include <stdio.h>
 
#define __STDC_CONSTANT_MACROS
 
#ifdef _WIN32
//Windows
extern "C"
{
#include "libavcodec/avcodec.h"
#include "libavformat/avformat.h"
#include "libswscale/swscale.h"
};
#else
//Linux...
#ifdef __cplusplus
extern "C"
{
#endif
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>
#ifdef __cplusplus
};
#endif
#endif
 
 
int main(int argc, char* argv[])
{
	AVFormatContext *pFormatCtx;
	int             i, videoindex;
	AVCodecContext  *pCodecCtx;
	AVCodec         *pCodec;
	AVFrame *pFrame, *pFrameYUV;
	uint8_t *out_buffer;
	AVPacket *packet;
	int y_size;
	int ret, got_picture;
	struct SwsContext *img_convert_ctx;
	//input file path
	char filepath[] = "Titanic.ts";
	//create the two output files for the decoded data
	FILE *fp_yuv = fopen("output.yuv", "wb+");
	FILE *fp_h264 = fopen("output.h264", "wb+");
 
	av_register_all();//register all components
	avformat_network_init();//initialize networking
	pFormatCtx = avformat_alloc_context();//allocate an AVFormatContext
 
	if (avformat_open_input(&pFormatCtx, filepath, NULL, NULL) != 0) {//open the input video file
		printf("Couldn't open input stream.\n");
		return -1;
	}
	if (avformat_find_stream_info(pFormatCtx, NULL)<0) {//get video file information
		printf("Couldn't find stream information.\n");
		return -1;
	}
	videoindex = -1;
	for (i = 0; i < pFormatCtx->nb_streams; i++)
	{
		if (pFormatCtx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO) {
			videoindex = i;
			break;
		}
	}
	if (videoindex == -1) {
		printf("Didn't find a video stream.\n");
		return -1;
	}
 
	pCodecCtx = pFormatCtx->streams[videoindex]->codec;
	pCodec = avcodec_find_decoder(pCodecCtx->codec_id);//find the decoder
	if (pCodec == NULL) {
		printf("Codec not found.\n");
		return -1;
	}
	if (avcodec_open2(pCodecCtx, pCodec, NULL)<0) {//open the decoder
		printf("Could not open codec.\n");
		return -1;
	}
 
	/*
	* Write out basic video information here,
	* taken from pFormatCtx, using fprintf()
	*/
	FILE *fp = fopen("info.txt", "wb+");
	fprintf(fp, "Duration: %lld us\n", (long long)pFormatCtx->duration);
	fprintf(fp, "Container format: %s\n", pFormatCtx->iformat->long_name);
	fprintf(fp, "Width*Height: %d*%d\n", pFormatCtx->streams[videoindex]->codec->width, pFormatCtx->streams[videoindex]->codec->height);
 
	pFrame = av_frame_alloc();
	pFrameYUV = av_frame_alloc();
	out_buffer = (uint8_t *)av_malloc(avpicture_get_size(PIX_FMT_YUV420P, pCodecCtx->width, pCodecCtx->height));
	avpicture_fill((AVPicture *)pFrameYUV, out_buffer, PIX_FMT_YUV420P, pCodecCtx->width, pCodecCtx->height);
	packet = (AVPacket *)av_malloc(sizeof(AVPacket));
	//Output Info-----------------------------
	printf("--------------- File Information ----------------\n");
	av_dump_format(pFormatCtx, 0, filepath, 0);
	printf("-------------------------------------------------\n");
	img_convert_ctx = sws_getContext(pCodecCtx->width, pCodecCtx->height, pCodecCtx->pix_fmt,
		pCodecCtx->width, pCodecCtx->height, PIX_FMT_YUV420P, SWS_BICUBIC, NULL, NULL, NULL);
 
	int frame_cnt=0;
	while (av_read_frame(pFormatCtx, packet) >= 0) {//read one frame of compressed data
		if (packet->stream_index == videoindex) {
			/*
			* Write out the H.264 bitstream here,
			* taken from packet, using fwrite()
			*/
			fwrite(packet->data, 1, packet->size, fp_h264); //write the H.264 data to the fp_h264 file
 
			ret = avcodec_decode_video2(pCodecCtx, pFrame, &got_picture, packet);//decode one frame of compressed data
			if (ret < 0) {
				printf("Decode Error.\n");
				return -1;
			}
			if (got_picture) {
				sws_scale(img_convert_ctx, (const uint8_t* const*)pFrame->data, pFrame->linesize, 0, pCodecCtx->height,
					pFrameYUV->data, pFrameYUV->linesize);
 
				y_size = pCodecCtx->width*pCodecCtx->height;
				/*
				* Write out the YUV data here,
				* taken from pFrameYUV, using fwrite()
				*/
				printf("Decoded frame index: %d\n", frame_cnt);
 
				fwrite(pFrameYUV->data[0], 1, y_size, fp_yuv);    //Y 
				fwrite(pFrameYUV->data[1], 1, y_size / 4, fp_yuv);  //U
				fwrite(pFrameYUV->data[2], 1, y_size / 4, fp_yuv);  //V
 
				frame_cnt++;
 
			}
		}
		av_free_packet(packet);
	}
 
	//flush decoder
	//FIX: flush the frames remaining in the decoder
	//(pass an empty packet so the decoder drains its buffered frames)
	int frame_cnt1 = 0;
	packet->data = NULL;
	packet->size = 0;
	while (1) {
		ret = avcodec_decode_video2(pCodecCtx, pFrame, &got_picture, packet);
		if (ret < 0)
			break;
		if (!got_picture)
			break;
		sws_scale(img_convert_ctx, (const uint8_t* const*)pFrame->data, pFrame->linesize, 0, pCodecCtx->height,
			pFrameYUV->data, pFrameYUV->linesize);
 
		int y_size = pCodecCtx->width*pCodecCtx->height;
		printf("Flush Decoder: %d\n", frame_cnt1);
		fwrite(pFrameYUV->data[0], 1, y_size, fp_yuv);    //Y 
		fwrite(pFrameYUV->data[1], 1, y_size / 4, fp_yuv);  //U
		fwrite(pFrameYUV->data[2], 1, y_size / 4, fp_yuv);  //V
		frame_cnt1++;
	}
	
	sws_freeContext(img_convert_ctx);
 
	//close the files and free memory
	fclose(fp);
	fclose(fp_yuv);
	fclose(fp_h264);
 
	av_frame_free(&pFrameYUV);
	av_frame_free(&pFrame);
	avcodec_close(pCodecCtx);
	avformat_close_input(&pFormatCtx);
 
	return 0;
}

(2) Project download

Download: FFmpeg Decoding Project

(3) Project Description

Use the MediaInfo tool to view the Titanic.ts video information.

Open the generated info.txt to check the corresponding video information.

View the resulting output with a YUV player and CyberLink PowerDVD 14.

5. Subsequent summary

(1) The flush_decoder step

When the av_read_frame() loop exits, the decoder may still hold several buffered frames, so these remaining frames must be drained by a "flush decoder" step. In short, the flush step keeps calling avcodec_decode_video2() with an empty AVPacket (data = NULL, size = 0) to obtain the remaining AVFrames, instead of feeding new compressed data to the decoder.

Specific cause:

See: avcodec_decode_video2() and the problem of frames lost after decoding.

You can also confirm this with a breakpoint: a few frames are indeed only printed during the flush.

(2) Why should the decoded data be processed by the sws_scale() function?

The decoded YUV pixel data is stored in data[0], data[1] and data[2] of the AVFrame. However, these pixels are not stored contiguously: each row of valid pixels is followed by some invalid padding bytes. Taking the luma (Y) data as an example, data[0] holds linesize[0] * height bytes in total, but for optimization reasons linesize[0] is usually larger than width rather than equal to it. sws_scale() is therefore used to convert the frame; after conversion the invalid data is removed and linesize[0] equals width.

(3) Source code example

The FFmpeg source tree also ships examples; see FFmpeg/doc/examples/.

The decode_video.c source:

/*
 * Copyright (c) 2001 Fabrice Bellard
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 */
 
/**
 * @file
 * video decoding with libavcodec API example
 *
 * @example decode_video.c
 */
 
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
 
#include <libavcodec/avcodec.h>
 
#define INBUF_SIZE 4096
 
static void pgm_save(unsigned char *buf, int wrap, int xsize, int ysize,
                     char *filename)
{
    FILE *f;
    int i;
 
    f = fopen(filename,"w");
    fprintf(f, "P5\n%d %d\n%d\n", xsize, ysize, 255);
    for (i = 0; i < ysize; i++)
        fwrite(buf + i * wrap, 1, xsize, f);
    fclose(f);
}
 
static int decode_write_frame(const char *outfilename, AVCodecContext *avctx,
                              AVFrame *frame, int *frame_count, AVPacket *pkt, int last)
{
    int len, got_frame;
    char buf[1024];
 
    len = avcodec_decode_video2(avctx, frame, &got_frame, pkt);
    if (len < 0) {
        fprintf(stderr, "Error while decoding frame %d\n", *frame_count);
        return len;
    }
    if (got_frame) {
        printf("Saving %sframe %3d\n", last ? "last " : "", *frame_count);
        fflush(stdout);
 
        /* the picture is allocated by the decoder, no need to free it */
        snprintf(buf, sizeof(buf), "%s-%d", outfilename, *frame_count);
        pgm_save(frame->data[0], frame->linesize[0],
                 frame->width, frame->height, buf);
        (*frame_count)++;
    }
    if (pkt->data) {
        pkt->size -= len;
        pkt->data += len;
    }
    return 0;
}
 
int main(int argc, char **argv)
{
    const char *filename, *outfilename;
    const AVCodec *codec;
    AVCodecContext *c= NULL;
    int frame_count;
    FILE *f;
    AVFrame *frame;
    uint8_t inbuf[INBUF_SIZE + AV_INPUT_BUFFER_PADDING_SIZE];
    AVPacket avpkt;
 
    if (argc <= 2) {
        fprintf(stderr, "Usage: %s <input file> <output file>\n", argv[0]);
        exit(0);
    }
    filename    = argv[1];
    outfilename = argv[2];
 
    avcodec_register_all();
 
    av_init_packet(&avpkt);
 
    /* set end of buffer to 0 (this ensures that no overreading happens for damaged MPEG streams) */
    memset(inbuf + INBUF_SIZE, 0, AV_INPUT_BUFFER_PADDING_SIZE);
 
    /* find the MPEG-1 video decoder */
    codec = avcodec_find_decoder(AV_CODEC_ID_MPEG1VIDEO);
    if (!codec) {
        fprintf(stderr, "Codec not found\n");
        exit(1);
    }
 
    c = avcodec_alloc_context3(codec);
    if (!c) {
        fprintf(stderr, "Could not allocate video codec context\n");
        exit(1);
    }
 
    if (codec->capabilities & AV_CODEC_CAP_TRUNCATED)
        c->flags |= AV_CODEC_FLAG_TRUNCATED; // we do not send complete frames
 
    /* For some codecs, such as msmpeg4 and mpeg4, width and height
       MUST be initialized there because this information is not
       available in the bitstream. */
 
    /* open it */
    if (avcodec_open2(c, codec, NULL) < 0) {
        fprintf(stderr, "Could not open codec\n");
        exit(1);
    }
 
    f = fopen(filename, "rb");
    if (!f) {
        fprintf(stderr, "Could not open %s\n", filename);
        exit(1);
    }
 
    frame = av_frame_alloc();
    if (!frame) {
        fprintf(stderr, "Could not allocate video frame\n");
        exit(1);
    }
 
    frame_count = 0;
    for (;;) {
        avpkt.size = fread(inbuf, 1, INBUF_SIZE, f);
        if (avpkt.size == 0)
            break;
 
        /* NOTE1: some codecs are stream based (mpegvideo, mpegaudio)
           and this is the only method to use them because you cannot
           know the compressed data size before analysing it.
           BUT some other codecs (msmpeg4, mpeg4) are inherently frame
           based, so you must call them with all the data for one
           frame exactly. You must also initialize 'width' and
           'height' before initializing them. */
 
        /* NOTE2: some codecs allow the raw parameters (frame size,
           sample rate) to be changed at any frame. We handle this, so
           you should also take care of it */
 
        /* here, we use a stream based decoder (mpeg1video), so we
           feed decoder and see if it could decode a frame */
        avpkt.data = inbuf;
        while (avpkt.size > 0)
            if (decode_write_frame(outfilename, c, frame, &frame_count, &avpkt, 0) < 0)
                exit(1);
    }
 
    /* Some codecs, such as MPEG, transmit the I- and P-frame with a
       latency of one frame. You must do the following to have a
       chance to get the last frame of the video. */
    avpkt.data = NULL;
    avpkt.size = 0;
    decode_write_frame(outfilename, c, frame, &frame_count, &avpkt, 1);
 
    fclose(f);
 
    avcodec_free_context(&c);
    av_frame_free(&frame);
 
    return 0;
}

Compiling:

Running make fails with the following error:

Package libavdevice was not found in the pkg-config search path.
Perhaps you should add the directory containing `libavdevice.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libavdevice' found

Looking at the README, there is a method:

Method 1: build the installed examples in a generic read/write user directory
 
Copy to a read/write user directory and just use "make", it will link
to the libraries on your system, assuming the PKG_CONFIG_PATH is
correctly configured.

In other words, configure PKG_CONFIG_PATH so it points at the location of the library's pkg-config files.

See: understanding pkg-config by compiling the ffmpeg/examples.

Solution:

Append to the end of /etc/profile:
export PKG_CONFIG_PATH=/usr/local/ffmpeg/lib/pkgconfig:$PKG_CONFIG_PATH

Then run: source /etc/profile
so that it takes effect immediately.
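
With PKG_CONFIG_PATH configured, an individual example can also be compiled by hand. A sketch, assuming FFmpeg was installed under /usr/local/ffmpeg as above:

```shell
# Make the FFmpeg .pc files visible to pkg-config (path from the install above)
export PKG_CONFIG_PATH=/usr/local/ffmpeg/lib/pkgconfig:$PKG_CONFIG_PATH

# Verify pkg-config can now resolve the library
pkg-config --exists libavcodec && echo "libavcodec found"

# Compile one example against the installed libraries
gcc decode_video.c -o decode_video $(pkg-config --cflags --libs libavcodec libavutil)
```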

After that, make compiles successfully and generates the example binaries.


Origin blog.csdn.net/yinshipin007/article/details/131994596