FFmpeg+SDL---Basic knowledge of video and audio and the use of related tools

Chapter 1: Basic knowledge of audio and video

Recommended reading before this chapter: FFmpeg+SDL-----Syllabus

Table of Contents

• Preface
• How a video player works: the processing flow, with each stage introduced in turn
• Encapsulation format (MP4, RMVB, TS, FLV, AVI)
• Video coding data (H.264, MPEG2, VC-1)
• Audio coding data (AAC, MP3, AC-3)
• Video pixel data (YUV420P, RGB): data sent to the graphics card for display
• Audio sampling data (PCM)
• Practice

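The principle of the video player, i.e. the process of playing a video file:

The file is first decapsulated into a video stream and an audio stream. Each stream is then decoded (video into pixel data, audio into sample data), and the results are sent to the display and the sound card.
The function of the encapsulation format: it packs the video and audio together into one file for transmission; decapsulation separates them again.
Decoding: turns the compressed data into data that the display (or sound card) can use directly.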

Visualization tools commonly used during development and study:

  • Common players
    • Cross-platform series (non-DirectShow framework): VLC, MPlayer, ffplay...
    • Windows series (DirectShow framework): Perfect Decoding, Ultimate Decoding, Baofeng Yingyin...
  • Information Viewing Tool
    • Comprehensive information view: MediaInfo
    • Binary information view: UltraEdit
    • Detailed analysis of individual items
      • Encapsulation format: Elecard Format Analyzer
      • Video encoding data: Elecard Stream Eye
      • Video pixel data: YUV Player
      • Audio sample data: Adobe Audition

MediaInfo

Opening an MKV file in MediaInfo displays comprehensive information about it: duration, audio and video encoding, resolution, frame rate, sampling rate, etc.

Encapsulation format

1. The function of the encapsulation format: store the video code stream and the audio code stream together in one file according to a defined format.
2. Encapsulation format analysis tool: Elecard Format Analyzer
Introduction to MPEG2-TS format
There is no file header. The data consists of TS packets of a fixed size (188 bytes), sent one after another, which suits transmission over a cable TV network. The advantage of having no file header is that even if data before or after a given point is damaged or missing, the rest of the video can still be played normally.
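
A minimal Python sketch of why a TS stream can be picked up at any point. It relies on two details of the TS packet header that the text above does not state: every packet begins with the sync byte 0x47, and the 13-bit packet ID (PID) sits in the low 5 bits of byte 1 plus all of byte 2. The function names are made up for illustration.

```python
TS_PACKET_SIZE = 188  # fixed packet size, as stated in the text
SYNC_BYTE = 0x47      # assumption from the TS header layout: first byte of every packet


def find_first_packet(data: bytes) -> int:
    """Return the offset of the first plausible packet start, or -1.

    A position is accepted only if the next two packets also begin with
    the sync byte, because 0x47 can occur inside payload data as well.
    """
    checks = 3
    last_start = len(data) - TS_PACKET_SIZE * checks
    for offset in range(max(last_start + 1, 0)):
        if all(data[offset + i * TS_PACKET_SIZE] == SYNC_BYTE for i in range(checks)):
            return offset
    return -1


def iter_pids(data: bytes):
    """Yield the 13-bit PID of each complete packet after resynchronising."""
    offset = find_first_packet(data)
    if offset < 0:
        return
    while offset + TS_PACKET_SIZE <= len(data):
        packet = data[offset:offset + TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            break  # lost sync; a real demuxer would search again
        yield ((packet[1] & 0x1F) << 8) | packet[2]
        offset += TS_PACKET_SIZE
```

Because no file header is needed, the same code works on a capture that starts in the middle of a broadcast: it skips the leading partial packet and continues from the first sync point.
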
Introduction to FLV format
The file starts with a file header. The data is composed of tags of variable size. Once the file header is damaged, the file cannot be played.
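
The header-then-tags structure can be sketched in Python as follows. The field layout used here comes from the published FLV file format, not from the text above: a 9-byte header ("FLV", version, flags, header size), a 4-byte PreviousTagSize after the header and after every tag, and an 11-byte tag header whose first byte is the tag type and whose next three bytes are the payload size. The function name is hypothetical.

```python
import struct


def parse_flv(data: bytes):
    """Parse the FLV file header, then list (tag_type, data_size) for each tag."""
    if len(data) < 9 or data[:3] != b"FLV":
        raise ValueError("missing FLV file header; the file cannot be parsed")
    version, flags, header_size = struct.unpack(">BBI", data[3:9])
    header = {
        "version": version,
        "has_audio": bool(flags & 0x04),
        "has_video": bool(flags & 0x01),
    }
    tags = []
    pos = header_size + 4  # skip the header and the first PreviousTagSize field
    while pos + 11 <= len(data):
        tag_type = data[pos]  # 8 = audio, 9 = video, 18 = script data
        data_size = int.from_bytes(data[pos + 1:pos + 4], "big")
        tags.append((tag_type, data_size))
        pos += 11 + data_size + 4  # tag header + payload + PreviousTagSize
    return header, tags
```

Unlike the TS case, parsing here begins at the file header and each tag's position depends on the sizes read before it, so a damaged header leaves the parser with nowhere to start.
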

Video encoding data

1. The role of video encoding: Compress the video pixel data (RGB, YUV, etc.) into a video stream, thereby reducing the amount of video data.
2. Video coding analysis tool: Elecard Stream Eye
In the tool's interface, the upper part shows the data of each frame. Each picture is divided into a grid of blocks, which are the basic unit of coding. A block can be divided again into smaller blocks, depending on how complex the picture content is: the more complex the region, the finer the division and the more detailed the coding.

Red frame: I frame (compressed directly, independent of other images); blue frame: P frame; green frame: B frame. The lines represent motion vectors.
3. Video encoding formats: common examples are H.264, MPEG2, and VC-1.
4. Introduction to H.264 format

  • The data is composed of NALUs of variable size
  • In the most common case, 1 NALU stores the compressed and encoded data of 1 frame of picture

5. H.264 compression method

  • The method is quite complicated. It involves intra-frame prediction, inter-frame prediction, entropy coding, loop filtering and other stages. This course does not go into the algorithms in detail.
  • Image data can be compressed by a factor of more than 100.
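
To make the "data is composed of NALUs" point concrete, here is a minimal Python sketch that splits a raw H.264 stream into NALUs. It assumes the Annex B byte-stream form (as found in a typical .h264 file), where each NALU is preceded by a start code 00 00 01 or 00 00 00 01; the text above does not say which form is used, and MP4 files store NALUs with length prefixes instead. The function names are made up for illustration.

```python
def split_nalus(stream: bytes):
    """Split an Annex B byte stream into NALUs with the start codes removed."""
    nalus = []
    start = None  # index of the first byte of the NALU being collected
    i = 0
    while i + 3 <= len(stream):
        if stream[i:i + 3] == b"\x00\x00\x01":
            if start is not None:
                # A NALU never ends in 0x00, so trailing zeros belong to
                # the next (4-byte) start code or to padding.
                nalus.append(stream[start:i].rstrip(b"\x00"))
            start = i + 3
            i += 3
        else:
            i += 1
    if start is not None and start < len(stream):
        nalus.append(stream[start:].rstrip(b"\x00"))
    return nalus


def nalu_type(nalu: bytes) -> int:
    """Return nal_unit_type, the low 5 bits of the first NALU byte.

    Common values: 1 = non-IDR slice, 5 = IDR slice, 7 = SPS, 8 = PPS.
    """
    return nalu[0] & 0x1F
```

Listing the types of the NALUs in a file is a quick way to see its structure: parameter sets (7, 8) followed by an IDR slice (5), then mostly non-IDR slices (1).
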

Audio coded data

1. The role of audio coding: compress audio sample data (PCM, etc.) into an audio code stream, thereby reducing the amount of audio data. (Audio coding is less critical than video coding, because audio data is much smaller than video data.)
2. Audio coding analysis tool: not covered here.
3. Introduction to AAC format: the data is composed of ADTS frames of variable size.
4. AAC compression method

  • The method is quite complicated. This course does not go into the algorithms in detail.
  • Audio data can be compressed by a factor of more than 10.
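
The variable frame size mentioned above is carried in each ADTS header, which a short Python sketch can read. The bit positions below follow the standard 7-byte ADTS header (12-bit syncword 0xFFF, 4-bit sampling frequency index, 3-bit channel configuration, 13-bit frame length); they are not given in the text, and the function names are hypothetical.

```python
# Sampling rates indexed by the 4-bit sampling_frequency_index field.
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
                24000, 22050, 16000, 12000, 11025, 8000, 7350]


def parse_adts_header(frame: bytes):
    """Decode the fields of one ADTS header needed to walk the stream."""
    if len(frame) < 7 or frame[0] != 0xFF or (frame[1] & 0xF0) != 0xF0:
        raise ValueError("not an ADTS frame: syncword 0xFFF not found")
    freq_index = (frame[2] >> 2) & 0x0F
    return {
        "sample_rate": SAMPLE_RATES[freq_index] if freq_index < len(SAMPLE_RATES) else None,
        "channel_config": ((frame[2] & 0x01) << 2) | (frame[3] >> 6),
        # frame_length counts the header plus the compressed payload
        "frame_length": ((frame[3] & 0x03) << 11) | (frame[4] << 3) | (frame[5] >> 5),
    }


def iter_adts_frames(data: bytes):
    """Yield the parsed header of each ADTS frame in the stream."""
    pos = 0
    while pos + 7 <= len(data):
        header = parse_adts_header(data[pos:pos + 7])
        if header["frame_length"] < 7:
            break  # corrupt length; stop rather than loop forever
        yield header
        pos += header["frame_length"]
```

Since every frame carries its own length, a reader steps from header to header without decoding any audio.
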

Video pixel data

For a fuller treatment of this topic, see: vector diagram, bitmap, dot matrix, RGB, YUV
1. Video pixel data function: save the pixel value of each pixel on the screen.
2. Format: Common pixel data formats are RGB24, RGB32, YUV420P, YUV422P, YUV444P, etc. The pixel data in the YUV format is generally used in compression coding, and the most common format is YUV420P.
3. Features: the volume of video pixel data is very large. For example, 1 hour of high-definition video in RGB24 format takes:

3600*25*1920*1080*3=559.9GByte				// PS: a frame rate of 25 Hz and a sampling precision of 8 bits are assumed here.

4. YUV format pixel data viewing tool: YUV Player
5. Introduction to RGB format:

  • The three colors Red, Green and Blue can be mixed to produce nearly any color a display can show.
  • Each point in a color image is composed of three components: R, G, and B.
  • Taking RGB24 as an example, each pixel takes 3 bytes (one each for R, G, and B), and the pixels are stored one after another.
    PS: BMP files store pixel data in RGB format.
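
The data-volume figure and the RGB24 layout can both be checked with a few lines of Python. This is only a sketch: the function names are invented here, and the frame is assumed to be tightly packed with rows stored top to bottom and no padding at the end of each row.

```python
def raw_video_bytes(width, height, bytes_per_pixel, fps, seconds):
    """Size in bytes of uncompressed video with the given parameters."""
    return width * height * bytes_per_pixel * fps * seconds


def rgb24_pixel(frame: bytes, width: int, x: int, y: int):
    """Return (R, G, B) of the pixel at column x, row y of a packed RGB24 frame."""
    offset = (y * width + x) * 3  # 3 bytes per pixel, rows stored back to back
    r, g, b = frame[offset:offset + 3]
    return r, g, b


# One hour of 1920x1080 RGB24 video at 25 frames per second:
one_hour = raw_video_bytes(1920, 1080, 3, 25, 3600)
print(one_hour / 1e9)  # size in GByte (10^9 bytes)
```
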

Introduction to YUV format

Experiments show that the human eye is sensitive to brightness but much less so to color. The luminance information (Y) and the chrominance information (U, V) can therefore be separated, and a more aggressive compression scheme can be applied to the chrominance information, improving compression efficiency.
YUV data viewing tool: YUV Player
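
As an illustration of how much the chrominance reduction saves, the Python sketch below computes the plane sizes of a YUV420P frame and splits a frame into its planes. It relies on the usual definition of YUV420P, which the text above does not spell out: a full-resolution Y plane, followed by a U plane and a V plane that each hold one sample per 2x2 block of pixels. Even width and height are assumed, and the function names are made up.

```python
def yuv420p_plane_sizes(width: int, height: int):
    """Return the byte sizes of the Y, U and V planes (8 bits per sample)."""
    y_size = width * height
    chroma_size = (width // 2) * (height // 2)  # one sample per 2x2 pixel block
    return y_size, chroma_size, chroma_size


def split_yuv420p(frame: bytes, width: int, height: int):
    """Split one planar YUV420P frame into its Y, U and V planes."""
    y_size, u_size, v_size = yuv420p_plane_sizes(width, height)
    if len(frame) != y_size + u_size + v_size:
        raise ValueError("frame size does not match the given dimensions")
    y = frame[:y_size]
    u = frame[y_size:y_size + u_size]
    v = frame[y_size + u_size:]
    return y, u, v
```

A 1920x1080 frame comes to 1.5 bytes per pixel in YUV420P, half of the 3 bytes per pixel that RGB24 needs, before any actual compression is applied.
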

Audio sample data

1. Audio sampling data function: save the value of each sampling point in the audio. The continuously varying waveform of a sound is analog data, which a computer cannot store directly, so it must be sampled.
2. Features: the volume of audio sampling data is very large. For example, a 4-minute song in PCM format takes:
4 * 60 * 44100 * 2 * 2 = 42.3MByte
PS: a sampling rate of 44100 Hz (the human ear can only hear frequencies up to about half of this), two channels, and a sampling precision of 16 bits (2 bytes) are assumed.
3. Audio sampling data viewing tool: Adobe Audition
4. Introduction to PCM format
▫ In the case of mono, the data of each sampling point is stored in order (the waveform looks like a continuous curve, but magnified far enough it is actually many discrete sampling points).
▫ In the case of two channels, the data of the two channels for each sampling point is stored in the order "left, right, left, right".
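
The size calculation and the "left, right" ordering can be sketched in Python as follows. The sample format is an assumption: signed 16-bit little-endian, which is common for PCM files but is not stated above. The function names are invented for illustration.

```python
import struct


def pcm_bytes(seconds, sample_rate, channels, bytes_per_sample):
    """Size in bytes of uncompressed PCM audio with the given parameters."""
    return seconds * sample_rate * channels * bytes_per_sample


def split_stereo_s16le(data: bytes):
    """Separate interleaved 16-bit stereo PCM into left and right sample lists."""
    frames = len(data) // 4  # one frame = left sample + right sample, 2 bytes each
    samples = struct.unpack("<%dh" % (frames * 2), data[:frames * 4])
    return list(samples[0::2]), list(samples[1::2])


# A 4-minute song at 44100 Hz, 2 channels, 16 bits per sample:
song = pcm_bytes(4 * 60, 44100, 2, 2)
print(song / 1e6)  # size in MByte (10^6 bytes)
```
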

Origin blog.csdn.net/weixin_37921201/article/details/89367419