1. Basic audio and video concepts
1.1 Pixels
A pixel (px) is the basic unit of a picture. A picture is formed by combining a very large number of pixels.
1.2 Resolution
Resolution = vertical pixels * horizontal pixels
In theory, the higher the resolution of an image, the sharper the image.
For example, in the picture on the left below, each small square represents one pixel, so the resolution is 15px*16px, and what we end up seeing is a very small picture.
In practice, a higher resolution does not necessarily mean a clearer image, because the image itself may be blurred.
1.3 Bit depth
The color pictures we see all have three channels: red (R), green (G), and blue (B). (An alpha component is added if transparency is required.)
Each channel is usually represented with 8bit. 8bit can represent 256 values per channel, so the three channels together can form 256*256*256 = 16777216 colors, about 16.77 million. The 8bit here is what we call the bit depth.
The greater the bit depth of each channel, the more color values can be represented. For example, the 10bit color advertised for high-end TVs means that each channel is represented with 10bit, so each channel has 1024 values. 1024*1024*1024 = 1073741824, about 1.07 billion colors, which is 64 times as many as with 8bit.
8bit is the most common bit depth.
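The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch; the function name `color_count` is made up for illustration.

```python
def color_count(bits_per_channel: int, channels: int = 3) -> int:
    """Number of distinct colors for a given per-channel bit depth."""
    levels = 2 ** bits_per_channel  # values a single channel can take
    return levels ** channels       # all combinations across the channels


print(color_count(8))                      # 16777216
print(color_count(10))                     # 1073741824
print(color_count(10) // color_count(8))   # 64
```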
1.4 Frame rate
The frame rate (FPS) is the number of frames shown per second. The higher the frame rate, the smoother the picture; the lower the frame rate, the more it stutters.
Because an image persists briefly on the retina (persistence of vision), once the frame rate reaches about 24 frames per second we perceive the images as continuous motion.
- Movies are generally 24FPS
- TV shows are generally 25FPS
- The surveillance industry commonly uses 25FPS
- Audio and video calls commonly use 15FPS
A higher frame rate also demands more performance from the device, because more frames must be encoded and decoded.
1.5 Bit rate
- The bit rate is the amount of data a video file uses per unit of time, for example 1Mbps.
- In most cases, the higher the bit rate, the clearer the video. However, a blurry video file can also have a large size (bit rate), and a video file with a lower resolution may still look clearer than one with a higher resolution.
- For the same source image and the same encoding algorithm, the higher the bit rate, the smaller the distortion and the clearer the picture.
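To get a feel for the numbers, the relation between bit rate, duration, and size can be sketched as follows. The function names are hypothetical. The sketch assumes 1Mbps means 1,000,000 bits per second, and the 12 bits per pixel in the example is an assumption (it matches uncompressed 8bit 4:2:0 video).

```python
def stream_size_bytes(bitrate_bps: float, duration_s: float) -> float:
    """Approximate size of a stream: bits per second * seconds / 8."""
    return bitrate_bps * duration_s / 8


def raw_video_bitrate_bps(width: int, height: int, fps: int, bits_per_pixel: int) -> int:
    """Bit rate of uncompressed video, before any encoder is applied."""
    return width * height * fps * bits_per_pixel


# One minute of video at 1Mbps (assumed to be 1,000,000 bits per second).
print(stream_size_bytes(1_000_000, 60))          # 7500000.0 bytes

# Uncompressed 1920x1080 at 25FPS with 12 bits per pixel, for comparison.
print(raw_video_bitrate_bps(1920, 1080, 25, 12))  # 622080000 bits per second
```

The gap between the two figures is what the encoder has to remove, which is why a lower bit rate means more distortion for the same source and algorithm.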
2. The difference between RGB and YUV
- RGB: the three primary colors red (R), green (G), and blue (B)
- YUV: Y indicates the brightness (Luminance or Luma), that is, the grayscale value; U and V indicate the chroma (Chrominance or Chroma)
2.1 RGB
Image pixels are usually arranged in RGB order, but some image processing requires a different order; OpenCV, for example, commonly uses the BGR arrangement.
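Reordering the channels is just a byte swap within each pixel. The sketch below uses only the standard library and assumes a tightly packed buffer of 8bit R, G, B triples; `rgb_to_bgr` is a made-up helper name, not an OpenCV function.

```python
def rgb_to_bgr(rgb: bytes) -> bytes:
    """Swap the first and third byte of every 3-byte pixel."""
    if len(rgb) % 3 != 0:
        raise ValueError("buffer length must be a multiple of 3")
    out = bytearray(rgb)
    out[0::3] = rgb[2::3]  # B goes where R was
    out[2::3] = rgb[0::3]  # R goes where B was
    return bytes(out)


# Two pixels: pure red followed by pure blue.
pixels = bytes([255, 0, 0, 0, 0, 255])
print(list(rgb_to_bgr(pixels)))  # [0, 0, 255, 255, 0, 0]
```

Applying the same function twice returns the original buffer, so it also converts BGR back to RGB.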
2.2 YUV
Like RGB, YUV is also a color coding method. It refers to pixel coding formats that express the brightness parameter Y (Luma) and the chrominance parameters U and V (Chroma) separately.
The advantage of this separation is that it not only avoids mutual interference, but also allows a complete (grayscale) image to be displayed without the UV information, which solved the compatibility problem between color TV and black-and-white TV. It also makes it possible to reduce the chroma sampling rate without affecting the image quality too much.
- Reducing the chroma sampling rate without affecting the image quality too much: several Y components share one set of UV components (how many depends on the YUV type)
YUV is a fairly general term, and it can be divided into many formats according to how the components are arranged.
- Packed format: the Y, U, and V components of each pixel are interleaved and stored in a single array, pixel by pixel. Usually, several adjacent pixels form a macro-pixel.
- Planar format: three arrays store the Y, U, and V components separately and contiguously, each component in its own array.
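The difference between the two layouts can be illustrated by splitting a packed buffer into planes. The sketch below assumes the packed YUYV 4:2:2 byte order (Y0 U0 Y1 V0 per two-pixel macro-pixel) and no row padding; the function name is hypothetical.

```python
def yuyv_to_planar(packed: bytes) -> tuple[bytes, bytes, bytes]:
    """Split packed YUYV (Y0 U0 Y1 V0 ...) into separate Y, U and V arrays."""
    if len(packed) % 4 != 0:
        raise ValueError("YUYV data must consist of whole 4-byte macro-pixels")
    y = packed[0::2]  # every second byte is a Y sample
    u = packed[1::4]  # one U per macro-pixel
    v = packed[3::4]  # one V per macro-pixel
    return y, u, v


# Two macro-pixels, i.e. four pixels in a row.
packed = bytes([10, 100, 11, 200, 12, 101, 13, 201])
y, u, v = yuyv_to_planar(packed)
print(list(y), list(u), list(v))  # [10, 11, 12, 13] [100, 101] [200, 201]
```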
3. YUV sampling representation
YUV uses the A:B:C notation to describe the sampling frequency ratio of Y, U, and V.
In the figure below, the black dots indicate pixels whose Y component is sampled, and the hollow circles indicate pixels whose UV components are sampled.
- 4:4:4: the chroma channels are not downsampled; each Y component corresponds to one U component and one V component.
- 4:2:2: 2:1 horizontal downsampling with no vertical downsampling; every two Y components share one U component and one V component.
- 4:2:0: 2:1 horizontal downsampling and 2:1 vertical downsampling; every four Y components share one U component and one V component.
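The three schemes can be compared by counting samples per frame. This is a minimal sketch that assumes 8bit samples; the table `CHROMA_DIVISORS` and both function names are made up for illustration.

```python
# (horizontal divisor, vertical divisor) applied to the chroma planes
CHROMA_DIVISORS = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
}


def plane_sample_counts(width: int, height: int, scheme: str) -> tuple[int, int, int]:
    """Return the number of Y, U and V samples in one frame."""
    h_div, v_div = CHROMA_DIVISORS[scheme]
    # Round up so odd dimensions still get a chroma sample for the last column/row.
    chroma_w = (width + h_div - 1) // h_div
    chroma_h = (height + v_div - 1) // v_div
    return width * height, chroma_w * chroma_h, chroma_w * chroma_h


def bits_per_pixel(scheme: str, bit_depth: int = 8) -> float:
    """Average bits per pixel, measured on a 2x2 block of pixels."""
    y, u, v = plane_sample_counts(2, 2, scheme)
    return (y + u + v) * bit_depth / 4


for scheme in CHROMA_DIVISORS:
    print(scheme, plane_sample_counts(4, 4, scheme), bits_per_pixel(scheme))
```

For a 4x4 image this gives 16 Y samples in every case, with 16, 8, or 4 samples in each chroma plane, which works out to 24, 16, and 12 bits per pixel.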
4. YUV data storage
4.1 YUV data storage 4:4:4 format
For example, the YUV444P format corresponds to the FFmpeg pixel format AV_PIX_FMT_YUV444P and takes 24bit per pixel.
4.2 YUV data storage 4:2:2 format
For example, the YUV422P format corresponds to the FFmpeg pixel format AV_PIX_FMT_YUV422P and takes 16bit per pixel.
4.3 YUV data storage 4:2:0 format
This is one of the most commonly used formats. For example, the YUV420P format corresponds to the FFmpeg pixel format AV_PIX_FMT_YUV420P and takes 12bit per pixel.
Another example is the NV12 format, which corresponds to the FFmpeg pixel format AV_PIX_FMT_NV12.
4:2:0 format reference:
I420 is also called YU12; on the Android platform it is referred to as the I420 format. The preview data that an Android phone collects from the camera is generally NV21.
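The 4:2:0 variants differ only in how the chroma samples are laid out after the Y plane. As a rough sketch, assuming even width and height, 8bit samples, and no row padding, an I420 buffer (Y plane, then U plane, then V plane) can be rearranged into NV12 (Y plane, then interleaved UV). The function name is hypothetical.

```python
def i420_to_nv12(i420: bytes, width: int, height: int) -> bytes:
    """Rearrange I420 (planar Y, U, V) into NV12 (planar Y + interleaved UV)."""
    y_size = width * height
    c_size = (width // 2) * (height // 2)  # assumes even dimensions
    if len(i420) != y_size + 2 * c_size:
        raise ValueError("buffer size does not match an unpadded I420 frame")
    y = i420[:y_size]
    u = i420[y_size:y_size + c_size]
    v = i420[y_size + c_size:]
    uv = bytearray(2 * c_size)
    uv[0::2] = u  # NV12 stores U first in each pair
    uv[1::2] = v  # swapping these two lines would give NV21 (V first)
    return y + bytes(uv)


frame = bytes(range(8)) + bytes([50, 51]) + bytes([60, 61])  # 4x2 I420 frame
print(list(i420_to_nv12(frame, 4, 2)))
# [0, 1, 2, 3, 4, 5, 6, 7, 50, 60, 51, 61]
```

Either way the frame occupies width * height * 3 / 2 bytes, which is the 12bit per pixel figure for 4:2:0.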
5. Conversion between RGB and YUV
- Conversion between RGB and YUV is usually done by calling a library interface, such as FFmpeg's swscale or libyuv.
- The main conversion standards are BT601 and BT709.
- With 8bit bit depth, TV range is 16-235 (Y) and 16-240 (UV), also called Limited Range.
- PC range is 0-255, also called Full Range.
- RGB has no such range distinction; it is always 0-255.
With the BT601 TV Range conversion formula, when converting from YUV to RGB, a result less than 0 is set to 0, and a result greater than 255 is set to 255.
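As an illustration of a TV range conversion with clamping, the sketch below uses commonly cited, rounded BT601 limited-range coefficients. These coefficients are an assumption and may differ slightly from the exact formula the notes refer to; real projects would normally rely on swscale or libyuv instead. Both function names are made up.

```python
def clamp_to_byte(x: float) -> int:
    """Round and clamp a value to the 8bit range [0, 255]."""
    return max(0, min(255, round(x)))


def yuv_bt601_tv_to_rgb(y: int, u: int, v: int) -> tuple[int, int, int]:
    """Convert one limited-range (TV range) BT601 YUV pixel to RGB.

    Coefficients are the commonly cited rounded values (an assumption).
    """
    c = 1.164 * (y - 16)  # stretch Y from 16-235 to 0-255
    r = c + 1.596 * (v - 128)
    g = c - 0.392 * (u - 128) - 0.813 * (v - 128)
    b = c + 2.017 * (u - 128)
    return clamp_to_byte(r), clamp_to_byte(g), clamp_to_byte(b)


print(yuv_bt601_tv_to_rgb(16, 128, 128))   # (0, 0, 0): TV range black
print(yuv_bt601_tv_to_rgb(235, 128, 128))  # (255, 255, 255): TV range white
```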
5.1 Why does a decoding error display a green screen
When decoding fails, the YUV components are filled with 0. Converting those values from YUV to RGB then gives, according to the formula:
R = 1.402 * (-128) = -179.456
G = -0.34414 * (-128) - 0.71414 * (-128) = 135.45984
B = 1.772 * (-128) = -226.816
The RGB value range is [0,255], so the final values are:
R = 0
G = 135.45984
B = 0
Only the G component has a nonzero value, so the screen appears green.
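The same calculation can be reproduced in code. This sketch uses the coefficients 1.402, 0.34414, 0.71414, and 1.772 from the calculation above, with Y added unscaled and 128 subtracted from U and V; the function names are hypothetical.

```python
def clamp8(x: float) -> int:
    """Round and clamp a value to [0, 255]."""
    return max(0, min(255, round(x)))


def yuv_to_rgb(y: int, u: int, v: int) -> tuple[int, int, int]:
    """Convert one YUV pixel to RGB using the coefficients from the text."""
    r = y + 1.402 * (v - 128)
    g = y - 0.34414 * (u - 128) - 0.71414 * (v - 128)
    b = y + 1.772 * (u - 128)
    return clamp8(r), clamp8(g), clamp8(b)


# A frame whose Y, U and V were all left at 0 after a failed decode.
print(yuv_to_rgb(0, 0, 0))  # (0, 135, 0): a dark green pixel
```

A truly black pixel needs U and V at 128 rather than 0, which is why a zero-filled buffer does not come out black.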
6. YUV Stride alignment problem
For example, take a YUV420P image with a resolution of 638*480. We already know the YUV420P storage layout, which is as shown in the figure below.
If memory must be aligned to 16 bytes during processing, the number of bytes in each row must be divisible by 16. The Y row width of 638 is not divisible by 16, so 2 padding bytes are added at the end of each row, making each Y row 640 bytes. In this case the Y stride is 640.
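Rounding a row width up to the next multiple of the alignment is a one-line calculation. The sketch below is only illustrative, with made-up function names, and covers the Y plane discussed above.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the nearest multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def row_padding(width: int, alignment: int = 16) -> int:
    """Number of padding bytes appended to each row of 8bit samples."""
    return align_up(width, alignment) - width


print(align_up(638, 16))  # 640, the Y stride
print(row_padding(638))   # 2 padding bytes per Y row
```

When reading such a frame, each row starts at row_index * stride, and only the first 638 bytes of each row are image data.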
7. Other
This article is a set of study notes taken while watching the following videos from the zero-based audio and video development series:
- Audio and video development zero-based series, image article, YUV-RGB (part 1) | YUV formats, YUV sampling methods, YUV storage methods, RGB to YUV conversion
- Audio and video development zero-based series, image article, YUV-RGB (part 2) | YUV formats, YUV sampling methods, YUV storage methods, RGB to YUV conversion