(1) A video player based on FFmpeg and SDL: preparing the basics

Encapsulation (container) formats

Most of the video files we come across, with familiar suffixes such as .mp4 or .avi, are really container files built to a particular standard. The main job of a container (encapsulation) format is to pack the video stream and the audio stream together and store them in a single file according to a defined layout.

The main container formats currently in use:

[Figure: common video container formats, e.g. MP4, MKV, RMVB, TS, FLV, AVI]
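As a small illustration of what a container holds, the sketch below uses FFmpeg's libavformat to open a file and print which container and how many streams the demuxer finds. It is a minimal sketch with assumed details: the file name "input.mp4" is a placeholder and error handling is kept to a minimum.

```c
// Minimal container probe: open a file, let the demuxer identify the
// container, and print a summary of its streams.
// Link with -lavformat -lavutil.
#include <stdio.h>
#include <libavformat/avformat.h>

int main(void) {
    AVFormatContext *fmt_ctx = NULL;

    // Open the file and read its header; the demuxer identifies the container.
    if (avformat_open_input(&fmt_ctx, "input.mp4", NULL, NULL) < 0) {
        fprintf(stderr, "could not open input\n");
        return 1;
    }
    // Read a few packets to fill in stream parameters (codec, resolution, sample rate).
    avformat_find_stream_info(fmt_ctx, NULL);

    printf("container: %s, streams: %u\n",
           fmt_ctx->iformat->name, fmt_ctx->nb_streams);
    av_dump_format(fmt_ctx, 0, "input.mp4", 0);  // human-readable summary

    avformat_close_input(&fmt_ctx);
    return 0;
}
```

A build command along the lines of `gcc probe.c -o probe $(pkg-config --cflags --libs libavformat libavutil)` should work, depending on how FFmpeg is installed.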

Video pixel data

Role of video pixel data: it stores the value of every pixel on the screen.
Common formats: RGB24, RGB32, YUV420P, YUV422P, YUV444P, and so on. Compression encoding generally operates on pixel data in a YUV format, and the most common format is YUV420P.

YUV format:

Experiments show that the human eye is sensitive to luminance but far less sensitive to chrominance. The chrominance information can therefore be separated from the luminance information and compressed more aggressively, which improves overall compression efficiency. In a YUV format, Y carries only the luminance information, while U and V carry only the chrominance information.
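To make the YUV420P idea concrete, the sketch below computes how one frame is laid out in memory: a full-resolution Y plane followed by U and V planes that are subsampled 2x2, so each holds a quarter as many bytes. The 1280x720 resolution is just an example.

```c
// Layout of a single YUV420P frame in one contiguous buffer:
// Y plane (full resolution), then U and V planes (half width, half height).
#include <stdio.h>

int main(void) {
    int width = 1280, height = 720;                      /* example resolution */

    size_t y_size  = (size_t)width * height;             /* 1 byte per pixel   */
    size_t uv_size = (size_t)(width / 2) * (height / 2); /* 2x2 subsampled     */
    size_t frame_size = y_size + 2 * uv_size;            /* = width*height*3/2 */

    printf("Y plane: %zu bytes (offset 0)\n", y_size);
    printf("U plane: %zu bytes (offset %zu)\n", uv_size, y_size);
    printf("V plane: %zu bytes (offset %zu)\n", uv_size, y_size + uv_size);
    printf("total  : %zu bytes per frame\n", frame_size);
    return 0;
}
```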

RGB format:

Red, green, and blue can be mixed to produce every other color, so each point of a color image carries an R, a G, and a B component. Taking RGB24 as an example, the pixel data of an image is stored as follows:

[Figure: RGB24 pixel data storage layout]

RGB24 stores the R, G, and B values of each pixel in sequence, pixel after pixel.
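A tiny sketch of that layout: 3 bytes per pixel, interleaved as R, G, B. The 4x2 image size and the pixel being set are arbitrary examples.

```c
// RGB24 layout: 3 bytes per pixel, stored R, G, B for the first pixel,
// then R, G, B for the next, and so on, row by row.
#include <stdio.h>
#include <string.h>

int main(void) {
    enum { W = 4, H = 2 };                 /* tiny example image */
    unsigned char image[W * H * 3];        /* width * height * 3 bytes */

    memset(image, 0, sizeof image);
    /* set pixel (x=1, y=0) to pure red */
    int x = 1, y = 0;
    unsigned char *p = image + (y * W + x) * 3;
    p[0] = 255;  /* R */
    p[1] = 0;    /* G */
    p[2] = 0;    /* B */

    printf("bytes per pixel: 3, bytes per row: %d, total: %zu\n",
           W * 3, sizeof image);
    return 0;
}
```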

Audio sample data

Audio sample data is the data obtained by decoding an audio stream. The most common format is PCM; PCM audio typically uses a sampling rate of 44,100 Hz.

Role of audio sample data: it stores the value of every sampling point of the audio.
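For example, 16-bit stereo PCM stores one left sample and one right sample for every sampling instant, interleaved back to back. The sketch below fills a few such frames with made-up values and prints the resulting data rate.

```c
// Interleaved 16-bit stereo PCM: each sampling instant contributes one
// left sample and one right sample, stored back to back (L0 R0 L1 R1 ...).
#include <stdio.h>
#include <stdint.h>

int main(void) {
    enum { SAMPLE_RATE = 44100, CHANNELS = 2, FRAMES = 4 };
    int16_t pcm[FRAMES * CHANNELS];

    for (int i = 0; i < FRAMES; i++) {
        pcm[i * CHANNELS + 0] = (int16_t)(i * 1000);   /* left channel  */
        pcm[i * CHANNELS + 1] = (int16_t)(-i * 1000);  /* right channel */
    }

    printf("first frame: L=%d R=%d\n", pcm[0], pcm[1]);
    /* one second of this format is sample_rate * channels * 2 bytes */
    printf("bytes per second: %d\n", SAMPLE_RATE * CHANNELS * 2);
    return 0;
}
```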

Video encoding data

The videos we watch are really sequences of pictures shown one after another, like a slideshow. The number of frames displayed per second is the video's frame rate; as long as the frame rate is high enough, our senses perceive the sequence as continuous motion. A video is therefore made up of a large number of consecutive frames.

A video built simply out of that many full pictures would be enormous, impractical both to transmit over a network and to store locally, so the video has to be compressed into a form suitable for transmission and storage. In most videos two consecutive frames differ very little. So after one complete image is recorded, each following image only records what differs from the previous frame; only when a frame differs too much from its predecessor is another complete image recorded (such a complete image is called a keyframe). This reduces the required space enormously.
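A quick back-of-the-envelope calculation shows why raw frames are impractical; the numbers below (1080p, YUV420P, 25 fps) are just an example.

```c
// Data rate of uncompressed 1080p YUV420P video at 25 fps.
#include <stdio.h>

int main(void) {
    double width = 1920, height = 1080, fps = 25;
    double bytes_per_frame = width * height * 3 / 2;   /* YUV420P */
    double bytes_per_sec   = bytes_per_frame * fps;

    printf("per frame : %.2f MB\n", bytes_per_frame / (1024 * 1024));
    printf("per second: %.2f MB\n", bytes_per_sec / (1024 * 1024));
    printf("per minute: %.2f GB\n", bytes_per_sec * 60 / (1024.0 * 1024 * 1024));
    return 0;
}
```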

The job of video encoding is to compress video pixel data (RGB, YUV, etc.) into a video bitstream, thereby reducing the amount of video data.

The mainstream video encoding formats at present include the following:

[Figure: mainstream video encoding formats, e.g. H.264, MPEG-2, MPEG-4, VC-1]

The most commonly used format, however, is H.264, whose main advantages are:

  1. Low bit rate: compared with compression techniques such as MPEG-2 and MPEG-4 ASP, H.264 produces only about 1/8 the data of MPEG-2 and 1/3 the data of MPEG-4 at the same image quality.
  2. High image quality: H.264 can deliver continuous, smooth, high-quality images (DVD quality).
  3. Strong error resilience: H.264 provides the tools needed to cope with errors such as packet loss that easily occur on unstable networks.
  4. Strong network adaptability: H.264 defines a Network Abstraction Layer, which makes H.264 streams easy to transmit over different networks (for example the Internet, CDMA, GPRS, WCDMA, CDMA2000, and so on).
  5. High compression ratio: H.264 reaches a compression ratio as high as 102:1.
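To show what "compressing pixel data into a bitstream" looks like in code, here is a minimal encoding sketch using FFmpeg's libavcodec, assuming your FFmpeg build includes an H.264 encoder (such as libx264). The resolution, bit rate, and the flat gray test frames are arbitrary, and error handling and the final encoder flush are omitted for brevity.

```c
// Encode one second of dummy YUV420P frames to H.264 packets.
// Link with -lavcodec -lavutil.
#include <stdio.h>
#include <string.h>
#include <libavcodec/avcodec.h>

int main(void) {
    const AVCodec *codec = avcodec_find_encoder(AV_CODEC_ID_H264);
    if (!codec)
        return 1;                           /* no H.264 encoder in this build */

    AVCodecContext *enc = avcodec_alloc_context3(codec);
    enc->width = 640;
    enc->height = 480;
    enc->pix_fmt = AV_PIX_FMT_YUV420P;
    enc->time_base = (AVRational){1, 25};   /* 25 fps */
    enc->framerate = (AVRational){25, 1};
    enc->bit_rate = 400000;                 /* target ~400 kbit/s (example) */
    if (avcodec_open2(enc, codec, NULL) < 0)
        return 1;

    AVFrame *frame = av_frame_alloc();
    frame->format = enc->pix_fmt;
    frame->width = enc->width;
    frame->height = enc->height;
    av_frame_get_buffer(frame, 0);

    AVPacket *pkt = av_packet_alloc();
    for (int i = 0; i < 25; i++) {          /* 1 second of flat gray frames */
        av_frame_make_writable(frame);
        for (int y = 0; y < enc->height; y++)
            memset(frame->data[0] + y * frame->linesize[0], 128, enc->width);
        for (int y = 0; y < enc->height / 2; y++) {
            memset(frame->data[1] + y * frame->linesize[1], 128, enc->width / 2);
            memset(frame->data[2] + y * frame->linesize[2], 128, enc->width / 2);
        }
        frame->pts = i;

        avcodec_send_frame(enc, frame);     /* raw picture in ...           */
        while (avcodec_receive_packet(enc, pkt) == 0) {  /* ... packets out */
            printf("packet: %d bytes, keyframe=%d\n",
                   pkt->size, !!(pkt->flags & AV_PKT_FLAG_KEY));
            av_packet_unref(pkt);
        }
    }

    av_packet_free(&pkt);
    av_frame_free(&frame);
    avcodec_free_context(&enc);
    return 0;
}
```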

Audio encoding data

A complete video with only pictures and no sound obviously will not do, so what about the sound formats?

An audio file is produced by sampling, quantizing, and encoding the sound, i.e. by turning an analog signal into a digital one. Like video, sound also has a rate, called the sampling rate: the number of times the analog signal is sampled per unit of time. The human ear can hear frequencies roughly in the range 20 Hz to 20,000 Hz. Ordinary audio is often sampled at 22,050 Hz, while CD-quality audio is sampled at 44,100 Hz.

Common audio encoding formats:

[Figure: common audio encoding formats, e.g. MP3 and AAC]

Standalone audio commonly uses the MP3 format, while the audio inside a video is usually AAC. AAC can compress the original audio sample data by a factor of ten or more.
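If you want to see which audio codec and sampling rate a given file actually uses, a small probe with libavformat does the job. This is a sketch with assumptions: "input.mp4" is a placeholder file name and error handling is minimal.

```c
// Find the best audio stream and print its codec (e.g. aac) and sample rate.
// Link with -lavformat -lavcodec -lavutil.
#include <stdio.h>
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>

int main(void) {
    AVFormatContext *fmt_ctx = NULL;
    if (avformat_open_input(&fmt_ctx, "input.mp4", NULL, NULL) < 0)
        return 1;
    avformat_find_stream_info(fmt_ctx, NULL);

    int idx = av_find_best_stream(fmt_ctx, AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0);
    if (idx >= 0) {
        AVCodecParameters *par = fmt_ctx->streams[idx]->codecpar;
        printf("audio codec: %s, sample rate: %d Hz\n",
               avcodec_get_name(par->codec_id), par->sample_rate);
    }
    avformat_close_input(&fmt_ctx);
    return 0;
}
```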

A few more common concepts on the audio side:

  • Sound: a sound wave produced by a vibrating object. It therefore has frequency and amplitude; frequency corresponds to the time axis and amplitude to the level axis.
  • Sampling: a wave is infinitely smooth; sampling means taking the value of the wave at certain points in time, i.e. digitizing the analog signal.
  • Sampling rate: the number of times per second a recording device samples the sound signal. The higher the sampling rate, the more faithful and natural the reproduced sound. Common sampling rates fall into three levels: 22.05 kHz, 44.1 kHz, and 48 kHz. 8 kHz is the rate used for telephony and is sufficient for speech; 22.05 kHz only reaches FM-radio quality; 44.1 kHz is the theoretical threshold for CD quality; 48 kHz is a little more precise still.
  • Sample size (bit depth): the number of bits used to record each sample value, usually 8 bits or 16 bits. The larger the sample size, the finer the variations in sound that can be captured, and the larger the resulting data.
  • Number of channels: the number of speakers that can each produce a different sound, one of the important measures of audio equipment.
    Mono has 1 channel; dual-channel has 2 channels; stereo has 2 channels by default; 4-channel stereo (surround) has 4 channels.
  • Bit rate: bit rate = sampling rate * sample size * number of channels.
    For CD quality, with a 44.1 kHz sampling rate, 16-bit samples, and stereo (two channels), the bit rate = 44.1 * 16 * 2 = 1411.2 kbps = 176.4 KB/s, so recording one minute of music takes roughly 10.34 MB (see the sketch after this list).
  • Audio frame: audio data is a stream and has no inherent notion of a frame. In practice, for the convenience of audio processing and transmission, it is customary to treat 2.5 ms to 60 ms worth of data as one audio frame. This duration is often called the frame (sampling) time; its length has no particular standard and depends on the codec and the needs of the application.
  • PCM data format: PCM (Pulse Code Modulation) stores sound data without compression; in a mono file the samples are written one after another in time order (its basic unit is a BYTE (8 bit) or a WORD (16 bit)). PCM is the rawest, completely lossless audio data, so while its quality is excellent its size is huge. To address this, a series of audio formats were created that compress the audio data in different ways, either losslessly (ALAC, APE, FLAC) or lossily (MP3, AAC, OGG, WMA).
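The bit-rate formula above is easy to verify in code; the sketch below plugs in the CD-quality numbers from the list.

```c
// bit rate = sampling rate * sample size * number of channels,
// evaluated for CD quality: 44.1 kHz, 16-bit samples, 2 channels.
#include <stdio.h>

int main(void) {
    int sample_rate = 44100, bits_per_sample = 16, channels = 2;

    double kbps = sample_rate * bits_per_sample * channels / 1000.0; /* kbit/s  */
    double kBps = kbps / 8.0;                                        /* kbyte/s */
    double mb_per_minute = kBps * 60 / 1024.0;                       /* MB/min  */

    printf("bitrate: %.1f kbps = %.1f KB/s\n", kbps, kBps);
    printf("one minute: about %.2f MB\n", mb_per_minute);
    return 0;
}
```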

Principles of a video player

The flow of playing a video file is as follows:

[Figure: video playback pipeline: demux, audio/video decoding, audio/video synchronization, output to sound card and graphics card]

  • Demuxing (de-encapsulation): separates the container-format data into compressed audio-stream data and compressed video-stream data. There are many container formats, such as MP4, MKV, RMVB, TS, FLV, and AVI; their job is to put already compressed and encoded video and audio data together in a defined layout. For example, demuxing FLV data yields an H.264-encoded video bitstream and an AAC-encoded audio bitstream.
  • Audio and video decoding: turns the compressed, encoded video/audio data back into uncompressed raw video/audio data. Audio compression standards include AAC, MP3, AC-3, and so on; video compression standards include H.264, MPEG-2, VC-1, and so on. Decoding is the most important and most complex part of the whole pipeline. Decoding compressed video data produces uncompressed color data such as YUV420P or RGB; decoding compressed audio data produces uncompressed audio sample data such as PCM (see the demux/decode sketch after this list).
  • Audio/video synchronization: using the parameter information obtained while demuxing, synchronizes the decoded audio and video data and sends the video data to the graphics card and the audio data to the sound card for playback.
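A condensed demux-plus-decode sketch following the pipeline above, assuming a local file "input.mp4" containing a video stream FFmpeg can decode; display (SDL), audio handling, and audio/video synchronization are omitted here.

```c
// Demux a file, open a decoder for its video stream, and count decoded frames.
// Link with -lavformat -lavcodec -lavutil.
#include <stdio.h>
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>

int main(void) {
    AVFormatContext *fmt_ctx = NULL;
    if (avformat_open_input(&fmt_ctx, "input.mp4", NULL, NULL) < 0)
        return 1;
    avformat_find_stream_info(fmt_ctx, NULL);

    /* 1. demux: pick the video stream */
    int vidx = av_find_best_stream(fmt_ctx, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0);
    if (vidx < 0)
        return 1;

    /* 2. decode: open a decoder matching the stream's codec parameters */
    const AVCodec *dec = avcodec_find_decoder(fmt_ctx->streams[vidx]->codecpar->codec_id);
    if (!dec)
        return 1;
    AVCodecContext *dec_ctx = avcodec_alloc_context3(dec);
    avcodec_parameters_to_context(dec_ctx, fmt_ctx->streams[vidx]->codecpar);
    if (avcodec_open2(dec_ctx, dec, NULL) < 0)
        return 1;

    AVPacket *pkt = av_packet_alloc();
    AVFrame *frame = av_frame_alloc();
    int frames = 0;

    /* read compressed packets from the container and turn them into raw frames */
    while (av_read_frame(fmt_ctx, pkt) >= 0) {
        if (pkt->stream_index == vidx) {
            avcodec_send_packet(dec_ctx, pkt);
            while (avcodec_receive_frame(dec_ctx, frame) == 0)
                frames++;            /* a player would convert and display here */
        }
        av_packet_unref(pkt);
    }

    printf("decoded %d video frames\n", frames);
    av_frame_free(&frame);
    av_packet_free(&pkt);
    avcodec_free_context(&dec_ctx);
    avformat_close_input(&fmt_ctx);
    return 0;
}
```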


Origin blog.csdn.net/qq_41345281/article/details/104229666