A Preliminary Study on Video Compression

I recently studied this area, hoping to combine it with my own compression method to form my own video format. The experiments are summarized below. (I had not been exposed to this field before and picked it up from the Internet as I went, learning while experimenting, and obtained preliminary results.)

1. Most video today is compressed after DCT (Discrete Cosine Transform) processing. The video processing (encoding) pipeline is as follows:

Video frame -> decompose into YUV color-difference components -> offset U and V by +128 -> per-block DCT -> quantization -> compression -> pack compressed frame. Decompression (decoding) runs in reverse:

Video frame <- synthesize RGB from YUV <- offset U and V by -128 <- inverse DCT to restore YUV blocks <- inverse quantization <- decompression <- unpack compressed frame.
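
To make the encoding chain concrete, here is a minimal sketch (mine, not part of the original experiment) of encoding one 8x8 luma block, using the MY macro, the bzYQTable table, and the CDCT routines listed under point 6 below. The free-function wrapper and the -128 level shift (so luma fits DCT's signed char input) are assumptions on my part, and the lossless compression step is left abstract.

// Sketch: encode one 8x8 luma block (assumes MY, bzYQTable,
// CDCT::DCT and CDCT::Quant from point 6; the -128 level shift
// fitting luma into DCT's signed char input is an assumption).
void EncodeLumaBlock(CDCT &dct, const BYTE rgb[8][8][3], int coeff[8][8])
{
    char  block[8][8];  // level-shifted luma samples
    float freq[8][8];   // DCT coefficients

    for (int i = 0; i < 8; i++)
        for (int j = 0; j < 8; j++)
            block[i][j] = (char)(MY(rgb[i][j][0], rgb[i][j][1], rgb[i][j][2]) - 128);

    dct.DCT(freq, block);               // sub-block DCT processing
    dct.Quant(coeff, freq, bzYQTable);  // quantization with the Y table
    // coeff[][] would then go on to the compressor and frame packer
}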

2. In theory the DCT is lossless: it produces floating-point coefficients, and a floating-point inverse DCT restores the original numbers (with losses only at the level of floating-point precision). Because compression requires integers, the coefficients are converted to integers; this step is called "quantization" and causes some loss. If the floating-point coefficients are "amplified" after the DCT (multiplied by a factor to improve precision), recovery quality improves but the compression ratio drops. This factor is generally called the quantization quality. At 10, the compression rate is about 20% and the image quality is acceptable; without any "amplification" the result looks extremely poor and is unacceptable.
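
For reference, here is a minimal sketch of the matching inverse quantization, which is not in the original listing; byQuantizationQuality mirrors the member used by CDCT::Quant under point 6. The truncation performed in Quant cannot be undone, which is exactly the loss described above.

// Inverse quantization sketch (assumed counterpart to CDCT::Quant):
// undo the quality scaling, then multiply the table entry back in.
void Dequant(float fOutPut[][8], const int czInPut[][8],
             const BYTE bzQTable[][8], int byQuantizationQuality)
{
    for (int x = 0; x < 8; x++)
        for (int y = 0; y < 8; y++)
            fOutPut[x][y] = (float)czInPut[x][y]
                          / byQuantizationQuality * bzQTable[x][y];
}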

3. DCT input range: I tested both byte and word input; the characteristics are the same. Signed and unsigned numbers also behave essentially identically.
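
Reproducing such round-trip tests requires an inverse DCT, which the listing under point 6 omits; below is a minimal sketch of one (my addition), mirroring CDCT::DCT there and assuming the same PI and math routines. IDCT(DCT(x)) should recover x up to floating-point error, for byte or word input alike.

// Inverse 8x8 DCT sketch for round-trip testing.
void IDCT(float fOutPut[][8], const float fInPut[][8])
{
    for (int i = 0; i < 8; i++)
        for (int j = 0; j < 8; j++)
        {
            float tmp = 0.0f;
            for (int u = 0; u < 8; u++)
                for (int v = 0; v < 8; v++)
                {
                    // same normalization factors as the forward transform
                    float alpha = (u == 0) ? sqrt(1.0 / 8.0) : sqrt(2.0 / 8.0);
                    float beta  = (v == 0) ? sqrt(1.0 / 8.0) : sqrt(2.0 / 8.0);
                    tmp += alpha * beta * fInPut[u][v]
                         * cos((2 * i + 1) * u * PI / 16.0)
                         * cos((2 * j + 1) * v * PI / 16.0);
                }
            fOutPut[i][j] = tmp;  // should match the original sample up to float error
        }
}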

4. Traditional video (almost all video today) is first converted to the YUV color-difference form and then DCT-transformed and compressed, for the following three reasons:

(1) The earliest black-and-white TVs carried a grayscale signal. When color TV appeared, color-difference signals were chosen for compatibility with black-and-white sets: Y is a luminance signal synthesized from the three primaries, usable by both color and black-and-white sets.

(2) Compression: because the eye is not sensitive to color detail, reducing chroma detail alone already yields some compression.

(3) Because the industry grew out of black-and-white TV sets, cameras, video recorders, and broadcast systems, the compatible mode was chosen. Although RGB video and raw video signals exist today, they are uncompressed and cannot be used directly, so they are converted to color-difference form for compression.

5. For self-developed multimedia software, video is indispensable. Existing video players all call other people's playback libraries, with no autonomy. Although free libraries exist, for the sake of complete self-control and simplicity I still want to take the road of independent development. The ideas are as follows:

(1) One reason video is decomposed into color-difference form is to stay "compatible" with the traditional black-and-white system. Independent software has no need for such "compatibility"; it can adopt its own format and its own player.

(2) Video compression differs from general-purpose compression: it demands speed while also considering the compression ratio. Having read through almost all of today's compression principles, I will try new compression methods that integrate my own ideas.

(3) Converting to color-difference form, compressing the chroma, and making full use of visual characteristics is also a good solution; whether it has an advantage over RGB-based compression needs more thought. So far it is clear that YUV 4:2:0 looks poor, YUV 4:1:1 looks better but is rarely used, and YUV 4:2:2 strikes a relatively good balance between quality and compression (a subsampling sketch follows this list). The drawback of YUV is that the video must be separated into planes, which costs computation and time. Another problem is that the three primaries become interdependent, which is bad for restoring color purity.

(4) If the three RGB primaries are compressed separately, then because they are unrelated to each other, differing compression ratios may cause color confusion. Therefore RGB is compressed as a single variable: the most common RGB888 (RGB24) raw color format treated as one 24-bit unsigned value. But there is no corresponding quantization table, which is a dilemma.

(5) Making my own quantization table proved somewhat confusing. The standard tables are based on psycho-visual factors and deliberately trade away the high-frequency components of the image (that is, fine detail), which is indeed the case.

(6) After going around in a circle, I returned to the color-difference method, not for compatibility but to compress the chroma and make use of existing resources: first be independent and fully in control, and take the first step.
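
As mentioned in item (3), here is a minimal 4:2:2 chroma subsampling sketch (my addition). The function name and the choice of averaging each horizontal pair of chroma samples, rather than simply dropping one, are illustrative assumptions; the width is assumed to be even.

// YUV 4:2:2 sketch: every Y sample is kept, but each chroma plane
// (U or V) stores one sample per two horizontal pixels, halving
// chroma storage. Assumes an even width.
void SubsampleChroma422(BYTE *dst, const BYTE *src, int width, int height)
{
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x += 2)
            *dst++ = (BYTE)((src[y * width + x] + src[y * width + x + 1] + 1) / 2);
}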

6. Related references:

#include <math.h>   // for sqrt() and cos() used by the DCT below

#define PI 3.14159265358979

// BYTE is unsigned char (as in windows.h / the VCL headers)
// RGB-to-YUV conversion macros (arguments parenthesized for safety)
#define MY(R, G, B) ((R) *   0.2989  + (G) *   0.5866  + (B) *   0.1145)
#define MU(R, G, B) ((R) * (-0.1688) + (G) * (-0.3312) + (B) *   0.5000 + 128)
#define MV(R, G, B) ((R) *   0.5000  + (G) * (-0.4184) + (B) * (-0.0816) + 128)

const BYTE bzYQTable[8][8] = {   // luminance (Y) quantization table
    { 16,  11,  10,  16,  24,  40,  51,  61 },
    { 12,  12,  14,  19,  26,  58,  60,  55 },
    { 14,  13,  16,  24,  40,  57,  69,  56 },
    { 14,  17,  22,  29,  51,  87,  80,  62 },
    { 18,  22,  37,  56,  68, 109, 103,  77 },
    { 24,  35,  55,  64,  81, 104, 113,  92 },
    { 49,  64,  78,  87, 103, 121, 120, 101 },
    { 72,  92,  95,  98, 112, 100, 103,  99 }};

const BYTE bzUVQTable[8][8] = {  // chroma (U/V) quantization table
    { 17, 18, 24, 47, 99, 99, 99, 99 },
    { 18, 21, 26, 66, 99, 99, 99, 99 },
    { 24, 26, 56, 99, 99, 99, 99, 99 },
    { 47, 66, 99, 99, 99, 99, 99, 99 },
    { 99, 99, 99, 99, 99, 99, 99, 99 },
    { 99, 99, 99, 99, 99, 99, 99, 99 },
    { 99, 99, 99, 99, 99, 99, 99, 99 },
    { 99, 99, 99, 99, 99, 99, 99, 99 }};

// Forward 8x8 discrete cosine transform (direct O(n^4) form)
void __fastcall CDCT::DCT(float fOutPut[][8], char czInPut[][8])
{
    float ALPHA, BETA, tmp;
    int u, v, i, j;

    for (u = 0; u < 8; u++)
    {
        for (v = 0; v < 8; v++)
        {
            // normalization: sqrt(1/8) for the DC term, sqrt(2/8) otherwise
            ALPHA = (u == 0) ? sqrt(1.0 / 8.0) : sqrt(2.0 / 8.0);
            BETA  = (v == 0) ? sqrt(1.0 / 8.0) : sqrt(2.0 / 8.0);

            tmp = 0.0;
            for (i = 0; i < 8; i++)
                for (j = 0; j < 8; j++)
                    tmp += czInPut[i][j]
                         * cos((2 * i + 1) * u * PI / (2 * 8))
                         * cos((2 * j + 1) * v * PI / (2 * 8));

            fOutPut[u][v] = ALPHA * BETA * tmp;
        }
    }
}

// Quantization: divide each DCT coefficient by its table entry, scale by
// the quality factor, then truncate to an integer (this is where loss occurs)
void __fastcall CDCT::Quant(int czOutPut[][8], float InMatrix[][8], const BYTE bzQTable[][8])
{
    int x, y;
    float fQuant;

    for (x = 0; x < 8; x++)
        for (y = 0; y < 8; y++)
        {
            fQuant = InMatrix[x][y] / bzQTable[x][y];  // divide by quantization table
            fQuant = fQuant * byQuantizationQuality;   // apply quality ("amplification") factor
            czOutPut[x][y] = (int)fQuant;              // truncate to integer
        }
}

 
