RGB and YUV conversion in OpenCV

1 Basic concepts

        The YUV color space has been widely used for color image transmission and processing since the era of analog television. It linearly transforms RGB pixels, via a 3x3 matrix, into one luminance (luma) component Y and two chrominance (chroma) components U and V. Because there have been many analog TV standards, such as NTSC and PAL, each shaped by its own hardware and technical constraints, different standards use different conversion matrix coefficients. Even in today's digital TV era the industry retains these differences for compatibility, while continuing to define new coefficient sets as needs arise. As a result, "YUV color space" is actually a rather muddled concept: YUV itself is only a conventional umbrella name, and the format at hand may really be YCbCr, Y'CbCr, Y'UV, YPbPr, YCC, or some other standard. If the specific coefficients are not known, converting RGB to YUV and back may therefore cause color shifts, because the two conversion matrices are not inverses of each other. A complete survey of the various YUV coefficient sets is beyond the scope of this article; see Wikipedia and related references for details. This article takes the practical angle and sorts out YUV color space conversion in OpenCV, which is, after all, very commonly used in image preprocessing.

        The first thing to note is that although the conversion between RGB and YUV is a 3x3 matrix transformation, the 9 entries of that matrix are not chosen independently. Another common name for YUV is YCbCr, where U corresponds to Cb and V to Cr. The luma Y in YCbCr is a weighted sum of R/G/B and corresponds to a single-channel grayscale image; Cb reflects the difference between the blue component B and luma Y, while Cr reflects the difference between the red component R and luma Y. Cb and Cr are collectively called color difference, or chroma. The conversion formula is as follows:

Y  = R2Y * R + G2Y * G + B2Y * B
Cb = (B - Y) * YCB + delta
Cr = (R - Y) * YCR + delta

Here [ R2Y, G2Y, B2Y, YCB, YCR ] determines the particular transformation matrix; all five coefficients are non-negative. Since Cb and Cr are color differences, their values are distributed symmetrically around 0, so a delta offset is added to bring them into roughly the same range as Y; delta = 128 is the common choice. In general, the RGB-to-YUV conversion is not an orthogonal transformation, so the 256 x 256 x 256 cube of RGB values does not map onto a cube of YUV values, but usually onto a truncated cone. If YUV values outside this cone are fed into a YUV-to-RGB conversion, the resulting RGB values correspond to no real color and cannot be displayed.
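As a minimal sketch of the formula above in plain Python, using the BT.601 full-range coefficients (one of several possible choices, discussed later in this article) as the illustrative values:

```python
# Generic RGB -> YUV formula from the text, instantiated with the
# BT.601 full-range coefficients as one illustrative choice.
R2Y, G2Y, B2Y = 0.299, 0.587, 0.114
YCB = 0.564   # == 1/2/(1 - B2Y)
YCR = 0.713   # == 1/2/(1 - R2Y)
DELTA = 128

def rgb_to_ycbcr(r, g, b):
    y = R2Y * r + G2Y * g + B2Y * b      # weighted sum: luma
    cb = (b - y) * YCB + DELTA           # blue color difference + offset
    cr = (r - y) * YCR + DELTA           # red color difference + offset
    return y, cb, cr

y_red, cb_red, cr_red = rgb_to_ycbcr(255, 0, 0)      # saturated red
y_gray, cb_gray, cr_gray = rgb_to_ycbcr(100, 100, 100)  # neutral gray
```

For a neutral gray input, both chroma channels land exactly on the delta offset of 128, which is why gray pixels carry no color information in YUV.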

        For RGB to YUV or YUV to RGB, OpenCV provides dst = cvtColor(src, code), where src and dst are the input and output images and code selects the conversion coefficients used. The main YUV-related codes can be found in imgproc.hpp in the OpenCV source. Note that OpenCV usually stores channels in BGR rather than RGB order, though it still supports RGB-ordered input; the two differ only in memory layout, so this article does not distinguish between BGR and RGB. The relevant codes are:

// YCbCr 4:4:4  <-->  BGR
COLOR_BGR2YCrCb    = 36,
COLOR_YCrCb2BGR    = 38,
// YUV 4:4:4  <-->  BGR
COLOR_BGR2YUV      = 82, 
COLOR_YUV2BGR      = 84,
// YUV_FOCC 4:2:0  -->  BGR
COLOR_YUV2BGR_NV12  = 91,
COLOR_YUV2BGR_NV21  = 93,
COLOR_YUV420sp2BGR  = COLOR_YUV2BGR_NV21,
COLOR_YUV2BGR_YV12  = 99,
COLOR_YUV2BGR_IYUV  = 101,
COLOR_YUV2BGR_I420  = COLOR_YUV2BGR_IYUV,
COLOR_YUV420p2BGR   = COLOR_YUV2BGR_YV12,
// YUV_FOCC 4:2:2  -->  BGR
COLOR_YUV2BGR_UYVY = 108,
COLOR_YUV2BGR_Y422 = COLOR_YUV2BGR_UYVY,
COLOR_YUV2BGR_UYNV = COLOR_YUV2BGR_UYVY,
COLOR_YUV2BGR_YUY2 = 116,
COLOR_YUV2BGR_YVYU = 118,
COLOR_YUV2BGR_YUYV = COLOR_YUV2BGR_YUY2,
COLOR_YUV2BGR_YUNV = COLOR_YUV2BGR_YUY2,
// BGR  <-->  YUV_FOCC 4:2:0
COLOR_BGR2YUV_I420  = 128,
COLOR_BGR2YUV_IYUV  = COLOR_BGR2YUV_I420,
COLOR_BGR2YUV_YV12  = 132,

As you can see, the naming of the RGB/YUV conversion codes is complicated and confusing, but the conversion coefficients involved in the code above fall into three categories. The first category is COLOR_BGR2YUV and its inverse COLOR_YUV2BGR; the second is COLOR_BGR2YCrCb and its inverse COLOR_YCrCb2BGR; the third is COLOR_BGR2YUV_FOCC with a FOCC (Four-Character Code, also written FourCC) suffix and its inverse COLOR_YUV2BGR_FOCC. Digging a little further shows that the first two categories convert non-downsampled YUV, while the third converts downsampled YUV. A brief introduction to YUV downsampling formats will help with what follows.

        YUV consists of a luma component Y and two chroma components U and V. Color science research shows that the human eye is more sensitive to changes in brightness than to changes in chroma. Therefore, to reduce the bandwidth needed to transmit image data, the chroma components U and V are usually downsampled while Y is kept intact, preserving the same subjective visual experience at the lowest possible cost. Generally the downsampling positions of U and V coincide. Chroma downsampling usually treats every 4 adjacent pixels as one unit and retains 1, 2, or 4 of them; however, both the grouping of the 4 pixels and the choice of retained pixels vary, resulting in a complicated set of downsampling formats, conventionally written YUV 4:m:n. The figure below shows three commonly used downsampling formats.

YUV 4:m:n is largely a conventional notation; m and n do not always have a literal meaning. YUV 4:4:4 retains all chroma pixels and is usually only relevant when converting YUV to RGB. YUV 4:2:2 downsamples the chroma components by 1/2 horizontally but not vertically, i.e. 2 of every 4 pixels are kept; the horizontal downsampling usually keeps the left pixel of each pair and discards the right one, but this is not mandatory, and specific hardware often offers several modes, such as keeping the right pixel or averaging the two. YUV 4:2:2 is typically found only in higher-end camera image processing chips. YUV 4:2:0 downsamples by 1/2 both horizontally and vertically, i.e. only 1 of every 4 pixels is kept, usually the top-left one, though as with 4:2:2 the choice is implementation-dependent. YUV 4:2:0 is the format used by most image processing chips. YUV 4:1:1 also keeps 1 of 4 pixels, but downsamples by 1/4 horizontally while keeping the vertical direction intact; this format is comparatively rare. Beyond these, YUV 4:0:0 conventionally denotes a grayscale image: only luma Y is kept and chroma U and V are discarded entirely. To restore a downsampled format to full 4:4:4, it usually suffices to copy each retained pixel into the discarded positions. This inevitably introduces distortion, but since the human eye is insensitive to chroma changes, the distortion is acceptable.
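As a minimal sketch of the 4:2:0 scheme just described (keeping the top-left pixel of each 2x2 block, then restoring 4:4:4 by replication), in plain Python:

```python
# YUV 4:2:0 chroma handling on a plain nested list: keep the top-left
# pixel of every 2x2 block, then restore 4:4:4 by replication. Keeping
# the top-left pixel is the usual convention noted above; real hardware
# may average or pick a different position.
def downsample_420(c):
    return [row[::2] for row in c[::2]]          # 1/2 in both directions

def upsample_420(c_small):
    out = []
    for row in c_small:
        wide = [p for p in row for _ in (0, 1)]  # repeat each column
        out.append(wide)
        out.append(list(wide))                   # repeat each row
    return out

u = [[0, 1, 2, 3],
     [4, 5, 6, 7],
     [8, 9, 10, 11],
     [12, 13, 14, 15]]
u_small = downsample_420(u)      # 2x2: one value per 2x2 block
u_back = upsample_420(u_small)   # 4x4 again, blocky approximation
```

The restored plane is a blocky approximation of the original; that blockiness is exactly the acceptable chroma distortion mentioned above.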

2 YUV 4:4:4 conversion

        For non-downsampled YUV, also known as YUV 4:4:4 format, OpenCV provides two types of conversion coefficients.

        For the first category, COLOR_BGR2YUV and its inverse COLOR_YUV2BGR: although its name matches our intuition best, according to Wikipedia it belongs to the relatively old BT.470 standard and is no longer in common use. In color_yuv.simd.hpp of the OpenCV source, you can see

//to YUV
static const float B2YF = 0.114f;
static const float G2YF = 0.587f;
static const float R2YF = 0.299f;
static const float B2UF = 0.492f;
static const float R2VF = 0.877f;

Substituting into the previous conversion formula, the specific conversion coefficient and formula can be obtained as follows:

$$\begin{bmatrix} Y \\ U \\ V \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.147 & -0.289 & 0.436 \\ 0.615 & -0.515 & -0.100 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} + \begin{bmatrix} 0 \\ 128 \\ 128 \end{bmatrix}$$

$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1 & 0 & 1.140 \\ 1 & -0.395 & -0.581 \\ 1 & 2.032 & 0 \end{bmatrix} \begin{bmatrix} Y \\ U - 128 \\ V - 128 \end{bmatrix}$$

Note that when $R = G = 0,\ B = 255$, $U$ reaches its maximum $U_{max} \approx 239$. Carrying out the same calculation for each component gives $Y \in [0, 255]$, $U \in [17, 239]$, $V \in [-29, 285]$. The ranges of Y/U/V are therefore not the same under the BT.470 standard, probably to compensate for the different responses of analog TV circuitry to the individual components. In digital signal processing, images usually store pixel values as uint8, so V must be truncated before storage, losing some color information.
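Because the chroma transforms are linear, their extremes over the RGB cube occur at its 8 corners; scanning the corners reproduces the ranges quoted above:

```python
from itertools import product

# U and V are linear in (R, G, B), so their min/max over the full
# 256^3 cube occur at the 8 corners of the cube.
u_vals, v_vals = [], []
for r, g, b in product((0, 255), repeat=3):
    u_vals.append(-0.147 * r - 0.289 * g + 0.436 * b + 128)
    v_vals.append( 0.615 * r - 0.515 * g - 0.100 * b + 128)

u_min, u_max = round(min(u_vals)), round(max(u_vals))
v_min, v_max = round(min(v_vals)), round(max(v_vals))
```

The V range spills outside [0, 255] on both sides, which is exactly why truncation is unavoidable when storing BT.470 V values in uint8.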

        For the second category, COLOR_BGR2YCrCb and its inverse COLOR_YCrCb2BGR: note that OpenCV orders the channels YCrCb rather than the YCbCr order corresponding to YUV, so take care when using it; the exact reason for this choice is unknown. The conversion coefficients used by COLOR_BGR2YCrCb are in fact the familiar and widely used BT.601 standard. In the OpenCV source color_yuv.simd.hpp, you can see

//to YCbCr
static const float B2YF = 0.114f;
static const float G2YF = 0.587f;
static const float R2YF = 0.299f;
static const float YCBF = 0.564f; // == 1/2/(1-B2YF)
static const float YCRF = 0.713f; // == 1/2/(1-R2YF)

Substituting into the conversion formula, the specific conversion coefficient and formula can be obtained as follows:

$$\begin{bmatrix} Y \\ Cb \\ Cr \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.169 & -0.331 & 0.500 \\ 0.500 & -0.419 & -0.081 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} + \begin{bmatrix} 0 \\ 128 \\ 128 \end{bmatrix}$$

$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1 & 0 & 1.402 \\ 1 & -0.344 & -0.714 \\ 1 & 1.772 & 0 \end{bmatrix} \begin{bmatrix} Y \\ Cb - 128 \\ Cr - 128 \end{bmatrix}$$

Comparing with the COLOR_BGR2YUV formulas, the luma components are identical; the difference is that each YCbCr component has range [0, 255], so it can be stored in the uint8 type without truncation. This is precisely the chroma-range adjustment made to accommodate digital signal processing. Ignoring the loss of storage precision, converting with COLOR_BGR2YCrCb and back with COLOR_YCrCb2BGR causes no color distortion. Note that older versions of OpenCV may still define COLOR_BGR2YCR_CB; it is the same thing as COLOR_BGR2YCrCb and is generally not used.
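As a quick sanity check, the forward and inverse BT.601 matrices above should multiply to approximately the identity (exactly, up to the 3-decimal rounding of the printed coefficients). A short NumPy sketch:

```python
import numpy as np

# The two BT.601 full-range matrices from the text; their product should
# be the identity up to coefficient rounding.
rgb2ycbcr = np.array([[ 0.299,  0.587,  0.114],
                      [-0.169, -0.331,  0.500],
                      [ 0.500, -0.419, -0.081]])
ycbcr2rgb = np.array([[1.0,  0.000,  1.402],
                      [1.0, -0.344, -0.714],
                      [1.0,  1.772,  0.000]])

round_trip = ycbcr2rgb @ rgb2ycbcr          # ~ 3x3 identity
err = np.max(np.abs(round_trip - np.eye(3)))
```

The residual error is on the order of 1e-3, coming purely from the rounded coefficients; this is why the round trip is lossless apart from uint8 quantization.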

3 YUV downsampling format conversion

        In addition to color space conversion in the YUV 4:4:4 format, OpenCV also supports converting RGB directly to a downsampled YUV format, and a downsampled YUV format directly to RGB; this is the third category of codes mentioned above. The third category can be written as COLOR_BGR2YUV_FOCC and its inverse COLOR_YUV2BGR_FOCC, where FOCC refers to a Four-Character Code, i.e. a code composed of 4 ASCII characters, somewhat similar to an enum in C/C++. A FOCC can be defined arbitrarily in private programs, but there are also conventional or standardized names, such as JPEG, MPG4, H264, and so on. As mentioned in the first section, the chroma components in YUV are usually downsampled to save system bandwidth. Consequently, the numbers of luma and chroma pixels in downsampled YUV data differ, so it cannot be represented as an [H, W, 3] matrix the way RGB can. Instead there are many different storage layouts for downsampled YUV, which make up the bewildering variety of YUV formats identified by FOCC.

        In order to use OpenCV correctly for color space conversion of downsampled YUV data, here is a brief introduction to YUV storage layouts. There are three common ones: Planar, Semi-Planar, and Interleave.

  • Planar storage lays out all Y pixels contiguously and in order in memory, then all downsampled U data, and finally all downsampled V data, for example Y1 Y2 Y3 … Yend U1 U2 U3 … Uend V1 V2 V3 … Vend. Its advantages are that it suits any downsampling format, and that operating on a single component such as Y requires no access to the others. The disadvantage is that the Y/U/V data lie far apart in memory, so when Y/U/V must be processed together it hinders continuous burst transfers, or requires multiple parallel data-access paths. Note that sometimes V is stored before U; a specific FOCC is needed to tell the two orders apart.

  • Semi-Planar storage also lays out all Y pixels contiguously and in order. However, since U and V have equal amounts of data and most algorithms need both at once, there is no need to store them separately; instead they are interleaved, for example Y1 Y2 Y3 … Yend U1 V1 U2 V2 U3 V3 … Uend Vend. It retains the adaptability of the Planar format and the advantage of keeping luma and chroma separate, while merging the U and V accesses to improve memory burst performance, so the Semi-Planar layout is more common in hardware. As with Planar, sometimes V is stored before U, and a specific FOCC is needed to distinguish the orders.

  • Interleave storage, unlike Semi-Planar which only interleaves U and V, interleaves all Y/U/V data, maximizing the burst transfer performance of YUV data. The disadvantage is that the chroma components must be transferred even when only luma is needed. It is usually used when the amounts of luma and chroma data are equal, e.g. the YUV 4:2:2 format, where every two Y values correspond to one U and one V. Depending on the order of these four values there are multiple Interleave layouts, each with its own FOCC. Common YUV 4:2:2 interleave orders are YUYV, YVYU, UYVY, VYUY, etc., with the two Y values arranged in pixel order; for YUYV, for example, memory holds Y1 U1 Y2 V1 Y3 U2 Y4 V2 …. In practice most hardware supports parallel memory access, so for YUV 4:2:2 interleaving luma and chroma does not add much transfer burden, and YUV 4:2:2 data is typically processed in this layout.
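The three layouts can be illustrated with a toy example in plain Python; the labels Y1, U1, etc. are just placeholders for byte values, with 4 luma samples sharing 2 (U, V) pairs as in 4:2:2:

```python
# Toy illustration of the three YUV storage layouts described above,
# for 4 luma samples and 2 (U, V) pairs (4:2:2-like proportions).
Y = ['Y1', 'Y2', 'Y3', 'Y4']
U = ['U1', 'U2']
V = ['V1', 'V2']

# Planar: all Y, then all U, then all V
planar = Y + U + V

# Semi-Planar: all Y, then U and V interleaved
semi_planar = Y + [c for uv in zip(U, V) for c in uv]

# Interleave (YUYV order): Y/U/V interleaved per pixel pair
interleave = [c for grp in zip(Y[0::2], U, Y[1::2], V) for c in grp]
```

Printing the three lists reproduces the byte orders given in the bullet points above.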

        In the first section we listed the FOCC values supported by OpenCV, but note that the FOCC supported by COLOR_BGR2YUV_FOCC and COLOR_YUV2BGR_FOCC are not symmetrical. For example, COLOR_BGR2YUV_FOCC does not support converting RGB to YUV 4:2:2; if that is needed, use COLOR_BGR2YCrCb from Section 2 to convert RGB to YUV 4:4:4 and downsample manually. Be aware, however, that the conversion coefficients of COLOR_BGR2YCrCb and COLOR_BGR2YUV_FOCC differ, as discussed later. In addition, several FOCC are synonyms. Let us first pin down the downsampling and storage format corresponding to each FOCC:

  • NV12: YUV 4:2:0 Semi-Planar format, chroma stored with U before V.
  • NV21 / 420S: YUV 4:2:0 Semi-Planar format, chroma stored with V before U.
  • IYUV / I420: YUV 4:2:0 Planar format, chroma stored with U before V.
  • YV12 / 420P: YUV 4:2:0 Planar format, chroma stored with V before U.
  • UYVY / Y422 / UYNV: YUV 4:2:2 Interleave format, stored in the order UYVY.
  • YUY2 / YUYV / YUNV: YUV 4:2:2 Interleave format, stored in the order YUYV.
  • YVYU: YUV 4:2:2 Interleave format, stored in the order YVYU.
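Since COLOR_BGR2YUV_FOCC offers no 4:2:2 target, a UYVY buffer has to be packed by hand from full 4:4:4 YUV data, as noted above. A minimal NumPy sketch, where random data stands in for the output of a real full-resolution conversion:

```python
import numpy as np

# Pack a UYVY (4:2:2 Interleave) buffer by hand from 4:4:4 YUV data.
# Random data stands in for a real converted image here.
h, w = 4, 8
yuv444 = np.random.randint(0, 256, (h, w, 3), dtype=np.uint8)
y, u, v = yuv444[..., 0], yuv444[..., 1], yuv444[..., 2]

uyvy = np.zeros((h, w, 2), dtype=np.uint8)
uyvy[:, 0::2, 0] = u[:, ::2]   # U on even columns of plane 0 (keep left pixel)
uyvy[:, 1::2, 0] = v[:, ::2]   # V on odd columns of plane 0
uyvy[:, :,    1] = y           # full-resolution Y on plane 1
```

Reading the array in memory order then yields U1 Y1 V1 Y2 U2 Y3 V2 Y4 …, matching the UYVY byte order.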

        In OpenCV, even though the YUV has been downsampled, the data is still packed into a single matrix for convenient handling. For YUV 4:2:0, the chroma components U and V are each 1/4 the size of luma Y, so to pack them together OpenCV uses a two-dimensional matrix of shape [H*3/2, W], where [:H, :W] holds the Y data and [H : H+H/2, :W] holds the U and V data according to the Planar or Semi-Planar layout. Following the FOCC formats listed above, their arrangement in the two-dimensional matrix is as shown below:

For YUV 4:2:2, the chroma components U and V are each 1/2 the size of luma Y, i.e. the total amount of chroma data equals the amount of luma data, so OpenCV stores it as a three-dimensional matrix of shape [H, W, 2]. Because matrix memory indexing runs fastest over the last dimension, storing YUV 4:2:2 data in FOCC order means writing alternately across the two [H, W] planes. Taking UYVY as an example, the arrangement in the three-dimensional matrix is as shown in the figure below; the other FOCC layouts can be inferred analogously:

        Having understood the storage layouts of downsampled YUV and the corresponding FOCC, we can use COLOR_BGR2YUV_FOCC and COLOR_YUV2BGR_FOCC in OpenCV to convert downsampled YUV data directly to RGB, or RGB directly to YUV with a specific downsampling and storage layout. The third category COLOR_BGR2YUV_FOCC can in fact be regarded as derived from the second category COLOR_BGR2YCrCb, i.e. BT.601, but with a difference. In signal processing, a very important concept is the system response: operations such as frequency-domain filtering usually introduce ringing or overshoot in the time or spatial domain, the most familiar example being the Gibbs phenomenon of the ideal low-pass filter. A digital YUV signal ultimately has to be converted into analog levels to drive a display, e.g. controlling the deflection of liquid crystals with a voltage. When the discrete-time signal is restored to a continuous-time analog signal, a low-pass filter must remove the spectral images introduced by sampling, and at that point signal overshoot can occur. Since the YUV data produced by COLOR_BGR2YCrCb spans [0, 255], if 0 maps to the zero level and 255 to the highest level during digital-to-analog conversion, the filtered signal level may exceed what the system can handle. To solve this, the ITU-R scaled and offset the COLOR_BGR2YCrCb result based on the BT.601 coefficients, reserving roughly 10% of headroom for overshoot. This yields the conversion coefficients of OpenCV's third category of codes, which are used together with specific YUV downsampling and storage formats. Looking at the OpenCV source color_yuv.simd.hpp, you can see

// Coefficients for RGB to YUV420p conversion
static const int ITUR_BT_601_SHIFT = 20;
static const int ITUR_BT_601_CRY =  269484;
static const int ITUR_BT_601_CGY =  528482;
static const int ITUR_BT_601_CBY =  102760;
static const int ITUR_BT_601_CRU = -155188;
static const int ITUR_BT_601_CGU = -305135;
static const int ITUR_BT_601_CBU =  460324;
static const int ITUR_BT_601_CGV = -385875;
static const int ITUR_BT_601_CBV = -74448;
//R = 1.164(Y - 16) + 1.596(V - 128)
//G = 1.164(Y - 16) - 0.813(V - 128) - 0.391(U - 128)
//B = 1.164(Y - 16)                  + 2.018(U - 128)

The unlisted ITUR_BT_601_CRV is identical to ITUR_BT_601_CBU (both correspond to the coefficient 0.439). Converting the integers to floating point yields the RGB/YUV conversion coefficients of the FOCC-suffixed codes:

$$\begin{bmatrix} Y' \\ U' \\ V' \end{bmatrix} = \begin{bmatrix} 0.257 & 0.504 & 0.098 \\ -0.148 & -0.291 & 0.439 \\ 0.439 & -0.368 & -0.071 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} + \begin{bmatrix} 16 \\ 128 \\ 128 \end{bmatrix}$$

$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1.164 & 0 & 1.596 \\ 1.164 & -0.391 & -0.813 \\ 1.164 & 2.018 & 0 \end{bmatrix} \begin{bmatrix} Y' - 16 \\ U' - 128 \\ V' - 128 \end{bmatrix}$$
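The integer constants listed earlier are fixed-point values: dividing by 2**ITUR_BT_601_SHIFT recovers the floating-point matrix entries. A quick check in Python:

```python
# The OpenCV integers are fixed-point coefficients with a 20-bit shift;
# dividing by 2**20 recovers the floating-point matrix entries above.
ITUR_BT_601_SHIFT = 20
coeffs = {
    'CRY':  269484, 'CGY':  528482, 'CBY':  102760,
    'CRU': -155188, 'CGU': -305135, 'CBU':  460324,
    'CGV': -385875, 'CBV':  -74448,
}
as_float = {k: round(v / 2**ITUR_BT_601_SHIFT, 3) for k, v in coeffs.items()}
```

For example CRY maps back to 0.257 and CBU to 0.439, matching the first and second rows of the forward matrix.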

It is easy to compute that now $Y \in [16, 235]$ while $U, V \in [16, 240]$. In other words, some margin is reserved on both ends of the YUV range, so ordinary operations will not overflow the data, reducing truncation logic in hardware. The price is a reduced representation range: the precision available in the uint8 type is not fully used. Overall, however, the advantages outweigh the disadvantages, and this non-full-range YUV was also promoted as a standard by the ITU-R. We mentioned earlier that COLOR_BGR2YUV_FOCC is the result of scaling and offsetting COLOR_BGR2YCrCb; this can be seen by comparing the two transformation matrices:

$$\begin{bmatrix} 0.257 & 0.504 & 0.098 \\ -0.148 & -0.291 & 0.439 \\ 0.439 & -0.368 & -0.071 \end{bmatrix} \approx \frac{1}{255} \begin{bmatrix} 219 & & \\ & 224 & \\ & & 224 \end{bmatrix} \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.169 & -0.331 & 0.500 \\ 0.500 & -0.419 & -0.081 \end{bmatrix}$$
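This relation is easy to verify numerically with NumPy:

```python
import numpy as np

# Check that scaling the BT.601 full-range rows by 219/255 (luma) and
# 224/255 (chroma) reproduces the limited-range matrix above.
full = np.array([[ 0.299,  0.587,  0.114],
                 [-0.169, -0.331,  0.500],
                 [ 0.500, -0.419, -0.081]])
limited = np.array([[ 0.257,  0.504,  0.098],
                    [-0.148, -0.291,  0.439],
                    [ 0.439, -0.368, -0.071]])

scaled = np.diag([219, 224, 224]) / 255 @ full
err = np.max(np.abs(scaled - limited))
```

The entries agree to the third decimal place, i.e. within the rounding of the published coefficients.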

4 Verification

To verify that the above analysis is correct, a test script is provided here; interested readers can copy and run it themselves.

# -*- coding: utf-8 -*-
import numpy as np
import cv2

rgb2yuv_470_full = np.array([
    [ 0.299,  0.587,  0.114],
    [-0.147, -0.289,  0.436],
    [ 0.615, -0.515, -0.100]
])

rgb2yuv_601_full = np.array([
    [ 0.299,  0.587,  0.114],
    [-0.169, -0.331,  0.500],
    [ 0.500, -0.419, -0.081]
])

rgb2yuv_709_full = np.array([
    [ 0.2126,  0.7152,  0.0722],
    [-0.1146, -0.3854,  0.5000],
    [ 0.5000, -0.4542, -0.0458]
])

yuv2rgb_470_full = np.linalg.inv(rgb2yuv_470_full)
yuv2rgb_601_full = np.linalg.inv(rgb2yuv_601_full)
yuv2rgb_709_full = np.linalg.inv(rgb2yuv_709_full)

'''
print('rgb2yuv_470_full')
print(rgb2yuv_470_full)
print('yuv2rgb_470_full')
print(yuv2rgb_470_full)
print('rgb2yuv_601_full')
print(rgb2yuv_601_full)
print('yuv2rgb_601_full')
print(yuv2rgb_601_full)
print('rgb2yuv_709_full')
print(rgb2yuv_709_full)
print('yuv2rgb_709_full')
print(yuv2rgb_709_full)
'''

rgb2yuv_601_comp = 1./255 * np.diag([219, 224, 224]).dot(rgb2yuv_601_full)
rgb2yuv_709_comp = 1./255 * np.diag([219, 224, 224]).dot(rgb2yuv_709_full)

yuv2rgb_601_comp = np.linalg.inv(rgb2yuv_601_comp)
yuv2rgb_709_comp = np.linalg.inv(rgb2yuv_709_comp)

'''
print('rgb2yuv_601_comp')
print(rgb2yuv_601_comp)
print('yuv2rgb_601_comp')
print(yuv2rgb_601_comp)
print('rgb2yuv_709_comp')
print(rgb2yuv_709_comp)
print('yuv2rgb_709_comp')
print(yuv2rgb_709_comp)
'''

def convert_bgr_to_yuv(bgr):
    h, w = bgr.shape[:2]
    assert h % 4 == 0 and w % 4 == 0
    b, g, r = bgr.transpose(2, 0, 1)
    y =  0.299 * r + 0.587 * g + 0.114 * b
    # COLOR_BGR2YUV
    u = -0.147 * r - 0.289 * g + 0.436 * b + 128
    v =  0.615 * r - 0.515 * g - 0.100 * b + 128
    yuv_470 = np.clip(np.array([y, u, v]).transpose(1, 2, 0), 0, 255).astype('uint8')
    # COLOR_BGR2YCrCb
    u = -0.169 * r - 0.331 * g + 0.500 * b + 128
    v =  0.500 * r - 0.419 * g - 0.081 * b + 128
    yvu_601 = np.clip(np.array([y, v, u]).transpose(1, 2, 0), 0, 255).astype('uint8')
    # COLOR_BGR2YUV_FOCC: compress to limited range; chroma keeps its 128
    # offset, so only the color-difference part is scaled by 224/255
    y = y * 219. / 255 + 16
    u = (u - 128) * 224. / 255 + 128
    v = (v - 128) * 224. / 255 + 128
    yuv_fcc = np.clip(np.array([y, u, v]).transpose(1, 2, 0), 0, 255).astype('uint8')
    return yuv_470, yvu_601, yuv_fcc

def convert_yuv_to_fcc(yuv, fcc):
    fcc_list_420 = ['NV12', 'NV21', '420S', 'YV12', 'IYUV', 'I420', '420P']
    fcc_list_422 = ['UYVY', 'Y422', 'UYNV', 'YUY2', 'YVYU', 'YUYV', 'YUNV']
    assert fcc in fcc_list_420 + fcc_list_422
    h, w = yuv.shape[:2]
    assert h % 4 == 0 and w % 4 == 0
    y, u, v = yuv.transpose(2, 0, 1)
    if fcc in fcc_list_420:
        yuv420 = np.zeros([3*h//2, w], dtype='uint8')
        yuv420[:h] = y
        if fcc == 'NV12':
            yuv420[h:, 0::2] = u[::2, ::2]
            yuv420[h:, 1::2] = v[::2, ::2]
        elif fcc == 'NV21' or fcc == '420S':
            yuv420[h:, 0::2] = v[::2, ::2]
            yuv420[h:, 1::2] = u[::2, ::2]
        elif fcc == 'YV12' or fcc == '420P':
            yuv420[h: h+h//4] = v[::2, ::2].reshape([-1, w])
            yuv420[-h//4:   ] = u[::2, ::2].reshape([-1, w])
        elif fcc == 'IYUV' or fcc == 'I420':
            yuv420[h: h+h//4] = u[::2, ::2].reshape([-1, w])
            yuv420[-h//4:   ] = v[::2, ::2].reshape([-1, w])
        return yuv420
    elif fcc in fcc_list_422:
        yuv422 = np.zeros([h, w, 2], dtype='uint8')
        if fcc == 'UYVY' or fcc == 'Y422' or fcc == 'UYNV':
            yuv422[:, 0::2, 0] = u[:, ::2]
            yuv422[:, 1::2, 0] = v[:, ::2]
            yuv422[:,    :, 1] = y
        elif fcc == 'YUY2' or fcc == 'YUYV' or fcc == 'YUNV':
            yuv422[:,    :, 0] = y
            yuv422[:, 0::2, 1] = u[:, ::2]
            yuv422[:, 1::2, 1] = v[:, ::2]
        elif fcc == 'YVYU':
            yuv422[:,    :, 0] = y
            yuv422[:, 0::2, 1] = v[:, ::2]
            yuv422[:, 1::2, 1] = u[:, ::2]
        return yuv422
    else:
        return yuv

def convert_yuv470_to_bgr(yuv470):
    rgb = (yuv470 - np.array([0, 128, 128])).dot(yuv2rgb_470_full.T)
    bgr = np.clip(rgb[..., ::-1], 0, 255).astype('uint8')
    return bgr

def convert_yvu601_to_bgr(yvu601):
    rgb = (yvu601[..., [0, 2, 1]] - np.array([0, 128, 128])).dot(yuv2rgb_601_full.T)
    bgr = np.clip(rgb[..., ::-1], 0, 255).astype('uint8')
    return bgr

def test_yuv2bgr_fcc(yuv):
    fcc_dict_420 = {
        'NV12': cv2.COLOR_YUV2BGR_NV12,
        'NV21': cv2.COLOR_YUV2BGR_NV21, '420S': cv2.COLOR_YUV420sp2BGR,
        'YV12': cv2.COLOR_YUV2BGR_YV12, '420P': cv2.COLOR_YUV420p2BGR,
        'IYUV': cv2.COLOR_YUV2BGR_IYUV, 'I420': cv2.COLOR_YUV2BGR_I420
    }
    fcc_dict_422 = {
        'UYVY': cv2.COLOR_YUV2BGR_UYVY, 'Y422': cv2.COLOR_YUV2BGR_Y422, 'UYNV': cv2.COLOR_YUV2BGR_UYNV,
        'YUY2': cv2.COLOR_YUV2BGR_YUY2, 'YUYV': cv2.COLOR_YUV2BGR_YUYV, 'YUNV': cv2.COLOR_YUV2BGR_YUNV,
        'YVYU': cv2.COLOR_YUV2BGR_YVYU
    }
    yuv420up = yuv.copy()
    yuv420up[::2, 1::2, 1:] = yuv420up[::2, ::2, 1:]
    yuv420up[1::2, :, 1:] = yuv420up[::2, :, 1:]
    rgb = (yuv420up - np.array([16, 128, 128])).dot(yuv2rgb_601_comp.T)
    bgr420 = np.clip(rgb[..., ::-1], 0, 255).astype('int')
    yuv422up = yuv.copy()
    yuv422up[:, 1::2, 1:] = yuv422up[:, ::2, 1:]
    rgb = (yuv422up - np.array([16, 128, 128])).dot(yuv2rgb_601_comp.T)
    bgr422 = np.clip(rgb[..., ::-1], 0, 255).astype('int')
    for fcc, code in fcc_dict_420.items():
        yuv_fcc = convert_yuv_to_fcc(yuv, fcc)
        bgr_cv = cv2.cvtColor(yuv_fcc, code)
        print('{} bgr max diff: {}'.format(fcc, np.max(np.abs(bgr420 - bgr_cv))))
    for fcc, code in fcc_dict_422.items():
        yuv_fcc = convert_yuv_to_fcc(yuv, fcc)
        bgr_cv = cv2.cvtColor(yuv_fcc, code)
        print('{} bgr max diff: {}'.format(fcc, np.max(np.abs(bgr422 - bgr_cv))))

if __name__ == '__main__':
    w, h = 128, 128
    bgr = np.random.randint(0, 256, [h, w, 3], dtype='uint8')
    yuv470, yvu601, yuv_fcc = convert_bgr_to_yuv(bgr)

    # test COLOR_BGR2YUV
    y0 = cv2.cvtColor(bgr, cv2.COLOR_BGR2YUV)
    x0 = cv2.cvtColor(y0, cv2.COLOR_YUV2BGR)
    x1 = convert_yuv470_to_bgr(yuv470)
    d0 = np.max(np.abs(yuv470.astype('int') - y0))
    d1 = np.max(np.abs(x1.astype('int') - x0))
    print('470 max diff: yuv={}, bgr={}'.format(d0, d1))

    # test COLOR_BGR2YCrCb
    y0 = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    x0 = cv2.cvtColor(y0, cv2.COLOR_YCrCb2BGR)
    x1 = convert_yvu601_to_bgr(yvu601)
    d0 = np.max(np.abs(yvu601.astype('int') - y0))
    d1 = np.max(np.abs(x1.astype('int') - x0))
    print('601 max diff: yuv={}, bgr={}'.format(d0, d1))

    # test fcc yuv2bgr
    test_yuv2bgr_fcc(yuv_fcc)

    # test fcc bgr2yuv
    y0 = cv2.cvtColor(bgr, cv2.COLOR_BGR2YUV_I420)
    y1 = convert_yuv_to_fcc(yuv_fcc, 'I420')
    print('I420 yuv max diff: {}'.format(np.max(np.abs(y1.astype('int') - y0))))

    y0 = cv2.cvtColor(bgr, cv2.COLOR_BGR2YUV_IYUV)
    y1 = convert_yuv_to_fcc(yuv_fcc, 'IYUV')
    print('IYUV yuv max diff: {}'.format(np.max(np.abs(y1.astype('int') - y0))))

    y0 = cv2.cvtColor(bgr, cv2.COLOR_BGR2YUV_YV12)
    y1 = convert_yuv_to_fcc(yuv_fcc, 'YV12')
    print('YV12 yuv max diff: {}'.format(np.max(np.abs(y1.astype('int') - y0))))



Origin blog.csdn.net/qq_33552519/article/details/131663618