Audio and video basics - YUV and RGB

1. The concept of audio and video nouns

1.1 pixels

A pixel is the basic unit of a picture, pixel, referred px
to as a combination of countless pixels to form a picture.

insert image description here

1.2 resolution

分辨率 = 垂直像素*水平像素, (theoretically) the higher the resolution of the image, the sharper the image.
For example, in the picture on the left below, one square of latex represents one pixel, so the resolution is that 15px*16px, and what we see in the end is a very small picture.

insert image description here

In fact, the higher the resolution, the clearer the image is not necessarily, because the image itself may be blurred

1.3 bit depth

The color pictures we see all have three channels, which are red®, green (G), and blue (B) channels.
(with weight if transparency is required alpha)

Usually each channel is 8bitrepresented by , 8bitwhich can represent 256a color, so it can form 256*256*256=16777216=1677thousands of colors

Here 8bitis what we are talking about bit depth.

insert image description here

The greater the bit depth of each channel, the greater the color value that can be represented. For example, the color mentioned by high-end TVs means 10bitthat each channel is 10bitrepresented by a color, and each channel 1024has a color. 1024*1024*1024About 107374million colors, is 8bitthe 64times.

8bitMost common colors

1.4 frame rate

The frame rate FPSis how many frames per second. The higher the frame rate, the more streamlined the picture, and the lower the frame rate, the more stuck.

Due to the temporary stay of the visual image on the retina, the frame rate of the image can reach 24 frames, so we consider the image to be continuous and dynamic.

  • The TV frame rate is generally24FPS
  • TV shows are generally25FPS
  • Commonly used in monitoring industry25FPS
  • Commonly used for audio and video calls15FPS

The higher the frame rate, the more streamlined the picture, and the higher the performance of the required equipment (involving encoding and decoding).

1.5 code rate

  • The data traffic used by video files per unit time, such as1Mbps
  • In most cases, the higher the bit rate, the clearer it will be. However, the size (bit rate) of the blurred video file can also be very large, and the video file with a small resolution may not be clear as the video file played with the resolution.
  • For the same original image source, the same encoding algorithm, the higher the bit rate, the smaller the distortion of the image, and the clearer the video picture will be.

2. The difference between RGB and YUV

  • RGB: Red R, green G, blue Bthree primary colors
  • YUV: YIndicates the brightness ( Luminance或Luma), that is, the grayscale value, U和Vand the chroma ( Chrominance或Chroma)

2.1 RGB

The usual image pixels are arranged in RGB order, but some image processing needs to be converted into other orders, such as the OpenCVfrequently converted BGRarrangement.

insert image description here

2.2

And RGBsimilarly, YUVit is also a color coding method, which refers to a pixel coding format that expresses brightness parameters Y(Luma)and chrominance parameters separately.U和V (Chroma)

The advantage of this separation is that it can not only avoid mutual interference, but also UVcan display a complete image without information, thus solving the compatibility problem of color TV and black and white TV, and can also reduce the color sampling rate without affecting the image quality too much.

  • Reduce the sampling rate of the color without affecting the image quality too much: YYshare a set of UVcomponents (depending on YUVthe type)

insert image description here
YUVIt is a relatively general statement, and it can be divided into many kinds according to its specific arrangement.

  • Packing ( packed) format: The components of each pixel are Y、U、Varranged crosswise and stored in the same array in units of pixels. Usually, several adjacent pixels form a macro pixel ( macro-pixel)
  • Flat ( ) format: Use three arrays to store three components planarseparately and continuously , that is , store them in their own arraysY、U、VY、U、V

insert image description here

3. YUV sampling representation

YUVUsing A:B:Cthe representation to describe Y、U、Vthe sampling frequency ratio

The black dots in the figure below indicate the amount of sampling pixels Y分, and the hollow circles indicate the sampling pixel components UV.

insert image description here

  • 4:4:4Indicates that the chroma channel is not downsampled: that is, a Y component corresponds to a U component and a V component
  • 4:2:2Indicates 2:1 horizontal downsampling without vertical downsampling, that is, every two Y components share a U component and a V component
  • 4:2:0Indicates 2:1 horizontal downsampling, 2:1 vertical downsampling, that is, every four Y components share a U component and a V component

4. YUV data storage

4.1 YUV data storage 4:4:4 format

For example YUV444, the format corresponds to FFmpegthe pixel representation AV_PIX_FMT_YUV444P, accounting for24bit

insert image description here

4.2 YUV data storage 4:2:2 format

For example YUV422P, the format corresponds to FFmpegthe pixel representation AV_PIX_FMT_YUV422P, accounting for16bit
insert image description here

4.3 YUV data storage 4:2:0 format

One of the most commonly used formats, such as YUV420Pformat, corresponds to FFmpegpixel representation AV_PIX_FMT_YUV420P, accounting for12bit

insert image description here

Such as NV12format, corresponding FFmpegpixel representationAV_PIX_FMT_NV12

insert image description here

4:2:0format reference
insert image description here

I420 is also called YU12, and it is called I420 format under the android platform. The
preview data collected by the android phone from the camera is generally NV21.

5. Conversion of RGB and YUV

  • Usually, the RGBdirect YUVmutual conversion is implemented by calling the interface, such as FFmpegor swscaleother libyuvlibraries
  • The main conversion criteria are BT601andBT709
    • 8bitIn case of bit depth
      • TV rangeYes 16-235(Y), 16-240(UV), also calledLimited Range
      • PC rangeyes 0-255, also calledFull Range
      • And RGBnothing range, all2-255

BT601 TV Rangeconversion formula

insert image description here

From YUV to RGB, if the value is less than 0, take 0, if it is greater than 255, take 255

5.1 Why does the decoding error display a green screen

Because during the conversion of the RGBsum YUV, YUVthe components are filled with 0values ​​when the decoding fails, and then according to the formula:

R = 1.402 * (-128) = -126.598
G = -0.34414 * (-128) - 0.71414*(-128) = 135.45984
B = 1772 * (-128) = -126.228

RGBThe value range is [0,255], so the final value is:

R = 0
G = 135.45984
B = 0

At this time, only Gthe component has a value, so it is green

6. YUV Stride alignment problem

For example, the storage format of the resolution 638*480image YUV420Pwe
already know YUV420Pis as shown in the figure below. If we need to align bytes
insert image description here
when processing memory , all the numbers must be divisible by 16, and they cannot be divisible. We need to fill a byte animation at the end of each line. At this time, the image is in bytes.16Y
638162640Y stride640
insert image description here

7. Other

This article develops a zero-based series for audio and video study notes for watching the following video
-image articles YUV-RGB-up | YUV format YUV sampling method YUV storage method RGB to YUV conversion audio and video development zero-based series-image article YUV-RGB-bottom | YUV format YUV sampling method YUV storage method RGB to YUV
conversion

Guess you like

Origin blog.csdn.net/EthanCo/article/details/131529006