New-generation video compression coding standard: H.264/AVC

2. Digital video

2.1.2. Digital TV PCM principle

The input analog signal is converted into the output digital TV signal by an A/D converter in three steps: sampling, quantization, and encoding.

2.1.2.1. Sampling

Sampling converts the continuously varying analog signal into discrete values along the time axis.

2.1.2.2. Quantization

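After sampling, the pulse signal is discrete in time but still continuous in amplitude, so it can take infinitely many values and must be rounded. Converting the signal amplitude from a continuous quantity into a discrete one is called quantization.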

2.1.2.3. PCM encoding

The quantized signal is generally represented with the binary symbols "0" and "1"; this encoding is called PCM encoding.

The binary sequence obtained by passing the analog electrical signal through sampling -> quantization -> encoding is the digital TV signal. Generally, the more bits used to represent each sample, the smaller the quantization noise and the closer the digital signal is to the analog signal.
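The three PCM steps can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the original notes: the input is a 1 Hz sine wave with amplitude in [-1, 1], the sampling rate is 8 samples per second, and the quantizer is uniform. The function names are hypothetical.

```python
import math

def pcm_encode(samples, bits):
    """Uniformly quantize samples in [-1.0, 1.0] and return fixed-width binary codewords."""
    levels = 2 ** bits
    step = 2.0 / levels  # width of one quantization interval
    codes = []
    for x in samples:
        index = int((x + 1.0) / step)            # interval that x falls into
        index = min(max(index, 0), levels - 1)   # clamp so x == 1.0 stays in range
        codes.append(format(index, f"0{bits}b"))
    return codes

def pcm_decode(codes, bits):
    """Map each codeword back to the midpoint of its quantization interval."""
    step = 2.0 / (2 ** bits)
    return [-1.0 + (int(c, 2) + 0.5) * step for c in codes]

# Sampling: evaluate the (assumed) 1 Hz sine wave 8 times per second.
sample_rate = 8
samples = [math.sin(2 * math.pi * n / sample_rate) for n in range(sample_rate)]

# Quantization + encoding with two different word lengths.
for bits in (3, 8):
    decoded = pcm_decode(pcm_encode(samples, bits), bits)
    max_error = max(abs(a - b) for a, b in zip(samples, decoded))
    print(bits, "bits -> max quantization error", round(max_error, 4))
```

With this midpoint reconstruction the error is bounded by half a quantization step, so each extra bit halves the worst-case quantization error.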

2.2 Digital TV signal

2.2.1.1 Time sampling of television signals:

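A moving picture is made up of a number of still frames per second. Suppose, for example, that the TV frame rate is set to 20 frames per second; this kind of sampling is temporal sampling.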

2.2.1.2 Spatial sampling of television signals:

Within a single TV frame, each line consists of a number of sample points called pixels; this kind of sampling is spatial sampling.

2.2.2.2 YUV

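One way to achieve video compression: human vision is more sensitive to luminance than to color, so the luminance information can be separated from the color information. Luminance is then kept at higher resolution for sharpness, while the color information is represented more coarsely, which significantly reduces bandwidth.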
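A minimal sketch of this idea follows. It assumes the BT.601 luma weights for the RGB-to-YUV conversion and a simple 2x2 averaging of the chroma planes (a 4:2:0-style layout); neither detail is stated in the notes, and the test image is made up.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using the BT.601 luma weights (assumed here)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    u = 0.492 * (b - y)                      # blue color difference
    v = 0.877 * (r - y)                      # red color difference
    return y, u, v

def subsample_2x2(plane):
    """Average each 2x2 block, halving the resolution in both directions.
    Assumes the plane has even width and height."""
    h, w = len(plane), len(plane[0])
    return [[(plane[i][j] + plane[i][j + 1] +
              plane[i + 1][j] + plane[i + 1][j + 1]) / 4.0
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]

# Hypothetical 4x4 test image: left half red, right half blue.
image = [[(255, 0, 0)] * 2 + [(0, 0, 255)] * 2 for _ in range(4)]
yuv = [[rgb_to_yuv(*px) for px in row] for row in image]

y_plane = [[p[0] for p in row] for row in yuv]                 # full resolution
u_plane = subsample_2x2([[p[1] for p in row] for row in yuv])  # quarter resolution
v_plane = subsample_2x2([[p[2] for p in row] for row in yuv])  # quarter resolution

samples_rgb = 4 * 4 * 3
samples_yuv = sum(len(row) for plane in (y_plane, u_plane, v_plane) for row in plane)
print(samples_rgb, "->", samples_yuv)  # 48 -> 24
```

Keeping Y at full resolution and storing U and V at a quarter of the resolution halves the number of samples before any further coding is applied.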

2.2.4.1 The size of the quantized value qp

Generally, each sample value is represented by 8 bits, that is, 256 gray levels. If the quantization value is too large, the picture looks coarse; if it is too small, the quality is good but too much bandwidth is wasted.

2.2.4.2 Sampling frequency

The sampling frequency differs for different screen types.

2.3 Preprocessing of video signals

A basic video processing and communication system includes acquisition, preprocessing, video encoding, communication, image processing, and so on.

2.3.1 Color interpolation

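Generally, a sensor pixel can only record a single intensity value on a scale from black to white and cannot by itself give all three RGB colors. To obtain a color image, a color filter array is required, and the color components missing at each pixel are interpolated from neighboring pixels.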

2.3.2 Color calibration

There will still be differences between the image obtained through color interpolation and the real scene, so the image pixel values need to be linearly transformed to reduce the difference as much as possible.

2.3.3 Gamma correction

2.3.4 Image enhancement

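Image enhancement includes histogram equalization, smoothing filtering, median filtering, sharpening, and so on. It can be performed in the spatial domain.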

2.3.4.1 Smooth filtering

1. The main purpose of image smoothing: to remove noise introduced by imperfections of the image acquisition system while preserving image details.
2. Image smoothing includes spatial domain methods and frequency domain methods.
   2.1 Commonly used spatial domain methods: mean filtering and median filtering.
   2.2 Commonly used frequency domain method: low-pass filtering.

2.3.4.2 Weighted mean filter

Method: take an n×n window and replace the original value of the center pixel with the weighted average of the n² pixels in the window. (Commonly used weighting templates can be looked up.)
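The method can be sketched as follows. The 3x3 template used here (weights 1-2-1 / 2-4-2 / 1-2-1, summing to 16) is an assumed, commonly seen example and not necessarily the one the book uses. Border pixels are simply left unchanged to keep the sketch short.

```python
# Assumed 3x3 weighting template; the center pixel gets the largest weight.
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]

def weighted_mean_filter(image, kernel=KERNEL):
    """Replace each interior pixel by the weighted average of its n x n window."""
    h, w = len(image), len(image[0])
    n = len(kernel)
    r = n // 2
    total = sum(sum(row) for row in kernel)
    out = [row[:] for row in image]          # border pixels stay as they are
    for y in range(r, h - r):
        for x in range(r, w - r):
            acc = 0
            for dy in range(n):
                for dx in range(n):
                    acc += kernel[dy][dx] * image[y + dy - r][x + dx - r]
            out[y][x] = acc / total
    return out

# A flat patch with one noisy pixel in the middle (made-up data).
noisy = [[10, 10, 10],
         [10, 26, 10],
         [10, 10, 10]]
print(weighted_mean_filter(noisy)[1][1])  # 14.0
```

The noisy value 26 is pulled toward its neighbors: (12 * 10 + 4 * 26) / 16 = 14.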

2.3.4.3 Principle of median filtering

Sort the gray levels of all pixels in a small window centered on a point (x, y) from large to small, and use the middle value as the gray level at (x, y). A general-purpose sorting algorithm is normally used.
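A minimal sketch of median filtering, assuming a square window of odd size and leaving border pixels untouched (both choices are simplifications made for this example):

```python
def median_filter(image, size=3):
    """Replace each interior pixel by the median of its size x size window."""
    h, w = len(image), len(image[0])
    r = size // 2
    out = [row[:] for row in image]          # border pixels stay as they are
    for y in range(r, h - r):
        for x in range(r, w - r):
            window = sorted(image[y + dy][x + dx]
                            for dy in range(-r, r + 1)
                            for dx in range(-r, r + 1))
            out[y][x] = window[len(window) // 2]   # middle element of the sorted window
    return out

# An isolated "salt" pixel (made-up data) is removed completely.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
print(median_filter(noisy)[1][1])  # 10
```

Unlike a mean filter, the outlier 255 does not leak into the result at all, which is why median filtering is often preferred for impulse noise.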

2.3.4.4 Image sharpening

Purpose: reduce the blurring of boundaries and contours in an image so that changes become clear.
The root cause of image blurring: averaging or integration, which can be countered by the inverse operations.
Two methods of image sharpening: 1. Differential methods (gradient sharpening and Laplacian sharpening). 2. High-pass filtering.
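As an illustration of the differential approach, here is a sketch of Laplacian sharpening, g = f - ∇²f, using the 4-neighbor discrete Laplacian. The 8-bit clamping range and the handling of borders are assumptions of this example.

```python
def laplacian_sharpen(image):
    """Sharpen interior pixels with g = f - laplacian(f), clamped to 0..255."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]          # border pixels stay as they are
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # 4-neighbor discrete Laplacian (a second-order difference)
            lap = (image[y - 1][x] + image[y + 1][x] +
                   image[y][x - 1] + image[y][x + 1] - 4 * image[y][x])
            out[y][x] = min(255, max(0, image[y][x] - lap))
    return out

# A soft vertical edge between gray levels 10 and 20 (made-up data).
edge = [[10, 10, 20, 20] for _ in range(3)]
print(laplacian_sharpen(edge)[1])  # [10, 0, 30, 20]
```

The two interior pixels on either side of the edge move apart (10 -> 0 and 20 -> 30), so the transition becomes steeper, while flat regions are unchanged because their Laplacian is zero.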

2.3.4.5 Histogram equalization

Concept: a histogram is an image analysis tool that describes the distribution of gray levels in an image.
Histogram modification: use a gray-level mapping function s = F(r) to change the original gray-level histogram into the histogram you want.
The most commonly used form of histogram modification is histogram equalization, that is, changing the histogram of a given image into a uniform distribution.
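A minimal sketch of histogram equalization, assuming the usual choice of mapping function: s = F(r) is the cumulative distribution of the gray levels scaled to the output range. The function name and the tiny test image are made up.

```python
def equalize_histogram(image, levels=256):
    """Map each gray level r to s = (levels - 1) * CDF(r), rounded to an integer."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    mapping = []
    cumulative = 0
    for count in hist:
        cumulative += count                  # running sum = unnormalized CDF
        mapping.append(round((levels - 1) * cumulative / len(pixels)))
    return [[mapping[p] for p in row] for row in image]

# A low-contrast image whose gray levels are squeezed into 100..104.
dull = [[100, 101, 102, 103, 104]]
print(equalize_histogram(dull))  # [[51, 102, 153, 204, 255]]
```

The five neighboring gray levels are spread evenly over the full 0..255 range, which is the contrast stretch that equalization is meant to produce.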

2.3.4.6 White balance

White balance is related to color temperature. The higher the color temperature, the stronger the blue component; the lower the color temperature, the stronger the red component.
Automatic white balance algorithms: 1. Global white balance method. 2. Local white balance method.
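The notes do not say which global algorithm is meant. One well-known global approach is the gray-world method, sketched below purely as an illustration: it assumes the average color of the scene should be neutral gray and scales each channel accordingly. The function name and the pixel data are hypothetical.

```python
def gray_world_balance(pixels):
    """Gray-world white balance on a flat list of (r, g, b) tuples.
    Each channel is scaled so that all three channel means become equal."""
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    gray = sum(means) / 3                    # target mean shared by all channels
    gains = [gray / m if m > 0 else 1.0 for m in means]
    return [tuple(min(255, round(p[c] * gains[c])) for c in range(3))
            for p in pixels]

# A bluish cast (high color temperature): blue is twice as strong as red and green.
bluish = [(100, 100, 200), (50, 50, 100)]
print(gray_world_balance(bluish))  # [(133, 133, 133), (67, 67, 67)]
```

After the per-channel gains are applied, both pixels come out neutral, which removes the blue cast. A local method would compute such gains per region instead of once for the whole image.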

3. Video compression coding principle

3.1 Predictive coding

3.1.1 Basic concepts of predictive coding

A commonly used method for video compression coding is prediction: what is transmitted after compression coding is the difference between the actual value of a sample and its predicted value. Because adjacent pixels in the same image are strongly correlated, this difference is usually small.
How it works in practice: correlation varies with distance, so the pixels near the current pixel X are given different weights according to their distance from X. The weighted values are added to obtain the predicted value P, which is then subtracted from X to get the difference q.
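A minimal one-dimensional sketch of this scheme follows. The weights 0.75 and 0.25 for the two previous pixels are assumed for illustration only; no quantizer is applied to q, so the decoder reconstructs the row exactly.

```python
# Assumed weights: the nearer pixel gets the larger weight.
WEIGHTS = (0.75, 0.25)

def predict(prev1, prev2):
    """Predicted value P as a weighted sum of the two previous pixels."""
    return round(WEIGHTS[0] * prev1 + WEIGHTS[1] * prev2)

def dpcm_encode(row):
    """Return the differences q = X - P for every pixel X in the row."""
    residuals = []
    for i, x in enumerate(row):
        prev1 = row[i - 1] if i >= 1 else 0
        prev2 = row[i - 2] if i >= 2 else prev1
        residuals.append(x - predict(prev1, prev2))
    return residuals

def dpcm_decode(residuals):
    """Rebuild the row by forming the same prediction and adding q back."""
    row = []
    for i, q in enumerate(residuals):
        prev1 = row[i - 1] if i >= 1 else 0
        prev2 = row[i - 2] if i >= 2 else prev1
        row.append(q + predict(prev1, prev2))
    return row

row = [100, 102, 103, 103, 105, 110]        # made-up pixel values
residuals = dpcm_encode(row)
print(residuals)
print(dpcm_decode(residuals) == row)        # True
```

Apart from the first sample, the residuals are much smaller than the pixel values themselves, so they can be coded with fewer bits.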

3.1.2 Intra prediction coding

3.1.2.1 One-dimensional best prediction
3.1.2.2 Two-dimensional best prediction
3.1.2.3 Predictive coding gain
3.1.2.4 Quantizer of predictive coding

Generally, an image contains more flat areas than areas of abrupt change (the nose in a human face is an example of the latter). The human eye is not sensitive to larger quantization errors in areas of abrupt change, so coarse quantization can be used there. In flat areas, by contrast, finer quantization should be used.

3.1.3 Inter-frame predictive coding

3.1.3.1 One-way prediction
(1) Prediction principle:

Using the previous frame, displaced by the motion vector, as the prediction of the current frame is called unidirectional prediction. The current frame F(x, y) and the previous frame F(x1, y1) held in the frame store are fed into the motion parameter estimator at the same time, and comparing them yields the motion vector MV. This MV is then fed into the motion-compensated predictor to obtain the predicted image. The predicted image is generally not identical to the actual image, so there is always a prediction error e(x, y).

(2) Motion vector estimation based on block matching:

If unidirectional prediction is done pixel by pixel, a motion vector has to be transmitted for every pixel in addition to the frame difference, which lowers coding efficiency. Therefore a frame is usually divided into M×N blocks and motion vectors are assigned per block, which reduces the total bit rate.

(3) Search method:

1. Exhaustive search method.
2. Fast search method.
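Block matching with an exhaustive search can be sketched as below. The matching cost used here is the sum of absolute differences (SAD), which is an assumption of this example, as are the function names, the block size and the search range.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in cur
    and the block displaced by (dx, dy) in ref."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))

def full_search(cur, ref, bx, by, n, search_range):
    """Try every displacement in the search window and keep the cheapest one."""
    h, w = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + n > w or y + n > h:
                continue                     # candidate block leaves the frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost

def make_frame(px, py, size=8):
    """Hypothetical test frame: black background with a bright 2x2 patch at (px, py)."""
    frame = [[0] * size for _ in range(size)]
    for j in range(2):
        for i in range(2):
            frame[py + j][px + i] = 200
    return frame

ref = make_frame(2, 3)   # previous frame
cur = make_frame(4, 4)   # current frame: patch moved 2 right, 1 down
mv, cost = full_search(cur, ref, bx=4, by=4, n=2, search_range=3)
print(mv, cost)  # (-2, -1) 0
```

The motion vector (-2, -1) points from the block in the current frame back to its position in the previous frame, and a cost of 0 means the prediction error e(x, y) for this block is zero. A fast search would visit only a subset of these candidates.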

3.1.3.2 Bidirectional prediction

1. The pixels of the previous frame and the next frame are used for prediction at the same time.
2. Predicting the current frame from a forward reference frame is called forward motion compensation, predicting it from a backward reference frame is called backward motion compensation, and predicting it from both at once is called bidirectional motion compensation.
3. This kind of prediction is useful for content that is not yet visible in frame t-1 but appears in frame t+1.
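In its simplest form, the bidirectional prediction is the rounded average of the forward and backward motion-compensated blocks. The sketch below assumes exactly that (equal weights, integer rounding) and uses made-up blocks that have already been motion compensated.

```python
def bidirectional_predict(forward_block, backward_block):
    """Average the forward and backward predictions, rounding to the nearest integer."""
    return [[(f + b + 1) // 2 for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(forward_block, backward_block)]

# Hypothetical blocks during a fade: frame t-1 is darker, frame t+1 is brighter.
forward = [[100, 100], [100, 100]]    # from frame t-1
backward = [[120, 121], [120, 120]]   # from frame t+1
current = [[110, 110], [110, 110]]    # actual block in frame t

prediction = bidirectional_predict(forward, backward)
residual = [[c - p for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(current, prediction)]
print(prediction)  # [[110, 111], [110, 110]]
print(residual)    # [[0, -1], [0, 0]]
```

Either reference alone would leave a residual of about 10 per pixel here, while the average leaves almost nothing to transmit.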

3.1.3.3 Overlapping block motion compensation OBMC

It mainly solves the problem of inaccurate estimation caused by block-based motion compensation. When using OBMC, the prediction of a pixel is not only based on its own MV estimate, but also based on adjacent MV estimates.

3.1.4 Motion estimation

3.1.4.1 Basic concepts

In inter-frame predictive coding, adjacent frames of a moving image are correlated. The image can therefore be divided into blocks; for each block, the encoder tries to find its position in an adjacent frame and obtains the offset between the two spatial positions. This offset is the motion vector, and the process of obtaining motion vectors is called motion estimation.
Advantage: motion estimation removes inter-frame redundancy, greatly reducing the number of bits needed for video transmission.

3.1.4.2 Methods of motion estimation
3.1.4.3 Motion representation
3.1.4.3.1 Block-based motion representation
3.1.4.3.2 Interpolation of sub-pixel positions
3.1.4.3.3 Prediction method of motion vectors in the spatiotemporal domain
  (1) Spatial-domain prediction of motion vectors:

a. Motion vector median prediction (Median Prediction)
b. Upper block mode motion vector in the spatial domain (Uplayer Prediction)

  (2) Temporal-domain prediction of motion vectors:

a. Corresponding-block motion vector prediction of the previous frame (Corresponding-block Prediction)
b. Motion vector prediction of adjacent reference frame in the time domain (Neighboring Reference-frame Prediction)
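Of the methods listed above, median prediction is the easiest to sketch: the predicted motion vector is the component-wise median of the motion vectors of three neighboring blocks, and only the difference from the real motion vector needs to be coded. The choice of neighbors (left, top, top-right) and all values below are assumptions of this example.

```python
def median_mv(mv_a, mv_b, mv_c):
    """Component-wise median of three neighboring motion vectors."""
    xs = sorted(v[0] for v in (mv_a, mv_b, mv_c))
    ys = sorted(v[1] for v in (mv_a, mv_b, mv_c))
    return (xs[1], ys[1])

# Hypothetical motion vectors of the left, top and top-right neighbor blocks.
left, top, top_right = (4, 0), (5, -1), (12, 3)
predicted = median_mv(left, top, top_right)

actual = (5, 1)                                   # MV found by the motion search
mv_difference = (actual[0] - predicted[0], actual[1] - predicted[1])
print(predicted, mv_difference)  # (5, 0) (0, 1)
```

The median ignores the outlier (12, 3), so the difference that actually has to be transmitted stays small.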

3.1.4.3.4 Prediction method of matching error in spatiotemporal domain

3.1.4.4 Classification of motion estimation criteria

The purpose of motion search is to find, within the search window, the data block that best matches the current block. This raises the question of how to judge whether two blocks match, that is, how to define a matching criterion.
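The notes do not list the criteria themselves. Two that are widely used in practice are the mean absolute difference (MAD) and the mean squared error (MSE); they are sketched here as examples rather than as the book's classification.

```python
def mad(block_a, block_b):
    """Mean absolute difference between two equally sized blocks."""
    diffs = [abs(a - b) for ra, rb in zip(block_a, block_b) for a, b in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def mse(block_a, block_b):
    """Mean squared error between two equally sized blocks."""
    diffs = [(a - b) ** 2 for ra, rb in zip(block_a, block_b) for a, b in zip(ra, rb)]
    return sum(diffs) / len(diffs)

current = [[1, 2], [3, 4]]        # made-up 2x2 blocks
candidate = [[1, 4], [3, 0]]
print(mad(current, candidate))    # 1.5
print(mse(current, candidate))    # 5.0
```

MSE penalizes large individual errors more heavily, while MAD needs no multiplications, which is why absolute-difference criteria are popular in hardware and real-time encoders. In both cases a smaller value means a better match.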

3.1.4.5 Motion search algorithm

Several main search algorithms:
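① Global search algorithm: within a predefined search area, the current block is compared with every candidate block in the reference frame to find the one with the smallest matching error. The displacement between these two blocks is the estimated MV.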
② Fractional precision search algorithm:
③ Fast search algorithm:
④ Hierarchical search range (DSR) algorithm:
⑤ Hybrid search algorithm:

3.2 Transform coding

3.2.1 Basic concepts

In most images, flat areas and slowly varying content account for the majority, while details and abrupt changes account for only a small portion. In other words, DC and low-frequency components dominate and high-frequency components are few. Transforming the image from the spatial domain to the frequency domain therefore yields mostly small transform coefficients, which can be compressed efficiently. This is transform coding.
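The energy compaction described here can be seen with a small separable 2D DCT. The sketch below assumes the orthonormal DCT-II and a 4x4 block; it is a direct, unoptimized implementation for illustration, not the integer transform that H.264 actually specifies.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n)))
    return out

def dct_2d(block):
    """Separable 2D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d([rows[i][j] for i in range(len(rows))])
            for j in range(len(rows[0]))]
    # cols[j][k] holds vertical frequency k of column j; transpose back to [k][j]
    return [[cols[j][k] for j in range(len(cols))] for k in range(len(cols[0]))]

# A completely flat 4x4 block (made-up data).
flat = [[100] * 4 for _ in range(4)]
coeffs = dct_2d(flat)
print([[round(c, 6) + 0.0 for c in row] for row in coeffs])
```

For the flat block, all of the energy ends up in the single DC coefficient (400 for this normalization) and every other coefficient is zero up to rounding, so 16 pixel values are described by one number.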

3.2.2 KL transform
3.2.3 Discrete cosine transform DCT
3.2.4 Zigzag scan and run-length encoding
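As a rough illustration of this step: after the transform and quantization, most nonzero coefficients sit in the low-frequency (top-left) corner of the block. A zigzag scan reads the block along anti-diagonals so that the trailing zeros end up together, and run-length encoding then stores each nonzero value with the number of zeros before it. The (run, value) pair format, the "EOB" end-of-block marker and the coefficient values are assumptions of this sketch.

```python
def zigzag_order(n):
    """Return the (row, col) visiting order of a zigzag scan over an n x n block."""
    order = []
    for s in range(2 * n - 1):                       # s = row + col of one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)                    # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order

def run_length_encode(coeffs):
    """Encode a coefficient list as (zeros_before, value) pairs plus an end marker."""
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    pairs.append("EOB")                              # trailing zeros are implied
    return pairs

# Hypothetical quantized 4x4 coefficient block.
block = [[12, 3, 0, 0],
         [2, 1, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
scanned = [block[r][c] for r, c in zigzag_order(4)]
print(scanned[:6])                 # [12, 3, 2, 0, 1, 0]
print(run_length_encode(scanned))  # [(0, 12), (0, 3), (0, 2), (1, 1), 'EOB']
```

Sixteen coefficients are reduced to four pairs and an end marker, because the long run of zeros at the end of the scan costs nothing.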

3.3 Comparison between transform coding and predictive coding

① Transform coding is relatively complex to implement, while predictive coding is relatively easy to implement.
② Errors in predictive coding propagate to the samples that follow, forming regional errors. In transform coding they do not; the effect of an error is confined to a single block.

///Continually updated

Origin blog.csdn.net/weixin_43917045/article/details/126667553