Image Processing: Discrete Cosine Transform

What is DCT

One-dimensional DCT transform

The one-dimensional DCT is the basis of the two-dimensional DCT, so let's discuss the one-dimensional DCT first. There are 8 forms of the one-dimensional DCT, among which the second form (DCT-II) is the most commonly used because it is simple to compute and widely applicable. We only discuss this form here; its expression is as follows:
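
$$F(u) = c(u)\sum_{i=0}^{N-1} f(i)\cos\left[\frac{(2i+1)u\pi}{2N}\right], \qquad u = 0, 1, \dots, N-1$$

$$c(u) = \begin{cases}\sqrt{1/N}, & u = 0\\ \sqrt{2/N}, & u \neq 0\end{cases}$$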

Here f(i) is the original signal, F(u) is the coefficient after the DCT, N is the number of points in the original signal, and c(u) can be regarded as a compensation (normalization) coefficient that makes the DCT matrix orthogonal.
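
The definition can be checked numerically. The following is a minimal sketch, assuming the orthonormal DCT-II with c(0) = sqrt(1/N) and c(u) = sqrt(2/N) otherwise; the names A, f, F and the ramp test signal are illustrative choices, not part of the original article. It builds the transform matrix directly from the formula and confirms that it is orthogonal, which is what makes the inverse a simple transpose.

```matlab
% Build the N-point DCT-II matrix from the definition:
%   A(u+1, n+1) = c(u) * cos((2n+1) * u * pi / (2N))
N = 8;
[u, n] = ndgrid(0:N-1, 0:N-1);      % u: frequency index, n: sample index
A = sqrt(2/N) * cos((2*n + 1) .* u * pi / (2*N));
A(1, :) = sqrt(1/N);                % c(0) = sqrt(1/N) for the DC row

f = (1:N)';                         % example signal (a simple ramp)
F = A * f;                          % forward DCT
f_rec = A' * F;                     % inverse DCT, valid because A is orthogonal

orthErr = max(max(abs(A * A' - eye(N))));   % deviation of A*A' from the identity
recErr  = max(abs(f_rec - f));              % reconstruction error
fprintf('orthogonality error: %g, reconstruction error: %g\n', orthErr, recErr);
```

Both printed errors should be at the level of floating-point round-off.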

Two-dimensional DCT transformation

The two-dimensional DCT simply applies the one-dimensional DCT a second time on top of a one-dimensional DCT (along one dimension, then along the other). The formula is as follows:

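$$F(u,v) = c(u)\,c(v)\sum_{i=0}^{N-1}\sum_{j=0}^{N-1} f(i,j)\cos\left[\frac{(2i+1)u\pi}{2N}\right]\cos\left[\frac{(2j+1)v\pi}{2N}\right]$$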

As the formula shows, the above only covers the case where the two-dimensional image data is a square matrix. In practice, if the data is not square, it is generally padded before the transform, and the padding is removed after reconstruction to recover the original image. Try it yourself; it makes this easier to understand.

In addition, because the DCT is highly symmetric, a simpler matrix formulation can be used when working in MATLAB:

$$F = A\,f\,A^{T}, \qquad A(u,i) = c(u)\cos\left[\frac{(2i+1)u\pi}{2N}\right]$$
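
As a hedged illustration of the matrix form, the sketch below assumes the orthonormal DCT matrix A with entries c(u)·cos((2i+1)uπ/(2N)) and uses magic(N) as a stand-in for an image block (both are assumptions for the example). It shows that transforming the columns and then the rows gives the same result as the single expression A*f*A'.

```matlab
N = 8;
[u, n] = ndgrid(0:N-1, 0:N-1);
A = sqrt(2/N) * cos((2*n + 1) .* u * pi / (2*N));
A(1, :) = sqrt(1/N);

f = magic(N);                   % stand-in for an N-by-N image block

% Separable computation: 1D DCT of every column, then of every row
colsDone = A * f;               % transforms each column of f
F_sep    = (A * colsDone')';    % transforms each row of the intermediate result

% Matrix form in a single expression
F_mat = A * f * A';

diffMax = max(abs(F_sep(:) - F_mat(:)));
fprintf('max difference between the two computations: %g\n', diffMax);
```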

Two-dimensional DCT inverse transform

At the receiving end, because the DCT is invertible, the original image information can be restored with the inverse DCT. The formula is as follows:

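$$f(i,j) = \sum_{u=0}^{N-1}\sum_{v=0}^{N-1} c(u)\,c(v)\,F(u,v)\cos\left[\frac{(2i+1)u\pi}{2N}\right]\cos\left[\frac{(2j+1)v\pi}{2N}\right]$$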

In the same way, the matrix form of the inverse DCT can be derived from the earlier matrix formula. Because the DCT coefficient matrix A is orthogonal, its transpose is its inverse, so f = A^T F A.
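
A short round-trip check of the inverse matrix form, again as a sketch under the assumption of the orthonormal DCT matrix (the variable names and the magic(N) test block are illustrative):

```matlab
N = 8;
[u, n] = ndgrid(0:N-1, 0:N-1);
A = sqrt(2/N) * cos((2*n + 1) .* u * pi / (2*N));
A(1, :) = sqrt(1/N);

f     = magic(N);          % stand-in image block
F     = A * f * A';        % forward 2D DCT in matrix form
f_rec = A' * F * A;        % inverse 2D DCT: A' acts as the inverse of A

roundTripErr = max(abs(f_rec(:) - f(:)));
fprintf('round-trip error: %g\n', roundTripErr);
```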

Performing the discrete cosine transform on two-dimensional images
From the above definition of two-dimensional discrete cosine transform and formula (7), it can be seen that the following steps are required to obtain the discrete cosine transform of a two-dimensional image:

1. Obtain the two-dimensional data matrix f(x,y) of the image;
2. Find the coefficient matrix [A] of the discrete cosine transform;
3. Find the transpose [A]T of the coefficient matrix;
4. Compute the discrete cosine transform according to formula (7): [F(u,v)]=[A][f(x,y)][A]T.

Advantages

1. The DCT has better energy compaction in the frequency domain than the DFT (in plain words, it gathers the important information of the image together), so the unimportant frequency regions and coefficients can simply be discarded (a bit like panning for gold: you keep the gold and throw away the rest of the stones). The DCT is therefore very well suited to image compression algorithms; for example, the well-known JPEG format uses the DCT in its compression algorithm.

2. The DCT is a separable transform whose kernel is the cosine function. Besides having the general properties of an orthogonal transform, the basis vectors of its transform matrix describe the correlation characteristics of human speech signals and image signals well. Therefore, for speech and image signals, the DCT is considered a quasi-optimal transform.
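
Energy compaction can be made concrete with a small experiment. The sketch below is only an illustration under stated assumptions: it uses a synthetic smooth 64×64 image (a Gaussian bump on a constant background) instead of a real photograph, builds the orthonormal DCT matrix by hand, and keeps only a hypothetical 8×8 low-frequency corner of the coefficients.

```matlab
N = 64;
[u, n] = ndgrid(0:N-1, 0:N-1);
A = sqrt(2/N) * cos((2*n + 1) .* u * pi / (2*N));
A(1, :) = sqrt(1/N);

% Smooth synthetic image (assumption: stands in for a low-detail photo)
[x, y] = meshgrid(linspace(0, 1, N));
img = 128 + 100 * exp(-((x - 0.5).^2 + (y - 0.5).^2) / 0.1);

F = A * img * A';                     % 2D DCT

k = 8;                                % side of the low-frequency corner to keep
kept = zeros(N);
kept(1:k, 1:k) = F(1:k, 1:k);         % discard everything else

energyFraction = sum(kept(:).^2) / sum(F(:).^2);
rec = A' * kept * A;                  % reconstruct from the kept coefficients
maxErr = max(abs(rec(:) - img(:)));

fprintf('kept %d of %d coefficients, energy fraction %.6f, max pixel error %.3f\n', ...
        k*k, N*N, energyFraction, maxErr);
```

For a smooth image like this one, nearly all of the energy sits in the kept corner, which is the "gold" in the analogy above.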

Spectrum Characteristic Analysis of Two-Dimensional DFT and Two-Dimensional DCT


1. Experiment on an image with little detail (few high-frequency components):

Conclusion:
For relatively smooth images or data, the DFT output is concentrated in the middle (the low-frequency region), and the DCT output is concentrated in the upper left corner. Here it is almost impossible to see where the advantage of the DCT lies.

2. Experiment on an image with rich detail:

Conclusion:
After the DFT the data is very spread out, whereas after the DCT it is still relatively concentrated. If the original image is to be restored from the spectrum, the DCT is the more reasonable choice, because fewer data points need to be stored. It is for this reason that the DCT is widely used in image compression.
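
One way to quantify "fewer data points" is to count how many of the largest coefficients are needed to reach a given share of the total energy. The sketch below is an illustration only: it uses a synthetic horizontal ramp image (an assumption, chosen because its left and right edges differ strongly, which penalizes the DFT), a hand-built orthonormal DCT matrix, and a unitary scaling of fft2 so that both transforms preserve energy.

```matlab
N = 64;
[u, n] = ndgrid(0:N-1, 0:N-1);
A = sqrt(2/N) * cos((2*n + 1) .* u * pi / (2*N));
A(1, :) = sqrt(1/N);

img = repmat(0:N-1, N, 1);            % synthetic image: horizontal ramp

Fdct = A * img * A';                  % orthonormal 2D DCT
Fdft = fft2(img) / N;                 % 2D DFT scaled to be unitary

% Sort coefficient energies from largest to smallest
eDct = sort(Fdct(:).^2, 'descend');
eDft = sort(abs(Fdft(:)).^2, 'descend');

target = 0.99;                        % share of total energy to capture
nDct = find(cumsum(eDct) >= target * sum(eDct), 1);
nDft = find(cumsum(eDft) >= target * sum(eDft), 1);

fprintf('coefficients needed for %.0f%% energy: DCT %d, DFT %d\n', ...
        100 * target, nDct, nDft);
```

The DFT count includes conjugate-symmetric pairs, so roughly half of those coefficients are redundant for a real image; even so, the DCT needs fewer for this test image.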

Applications

DCT Applied to Image Compression

With 16*16 blocks, each block is DCT-transformed, and the data is then stored and reconstructed using different templates (masks that select which coefficients to keep). If too few coefficients are kept, blocking artifacts appear.

With 64*64 blocks, the blocking artifacts are more obvious, so more coefficients have to be kept in each block.
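
The effect of the template can be sketched as follows. This is a minimal illustration, not the article's original experiment: the synthetic 64×64 test image, the block size B = 16, and the square k×k templates in keepList are all assumptions chosen for the example. It measures the mean squared error of the reconstruction for each template.

```matlab
B = 16;                                   % block size
[u, n] = ndgrid(0:B-1, 0:B-1);
A = sqrt(2/B) * cos((2*n + 1) .* u * pi / (2*B));
A(1, :) = sqrt(1/B);

% Synthetic 64x64 test image (assumption: stands in for a real photo)
[x, y] = meshgrid(linspace(0, 1, 64));
img = 255 * (0.5 + 0.5 * sin(6*pi*x) .* cos(4*pi*y));

keepList = [1 4 16];                      % side of the kept low-frequency corner
mse = zeros(size(keepList));
for t = 1:numel(keepList)
    k = keepList(t);
    mask = zeros(B);
    mask(1:k, 1:k) = 1;                   % template: keep the k-by-k corner
    rec = zeros(size(img));
    for r = 1:B:size(img, 1)
        for c = 1:B:size(img, 2)
            blk  = img(r:r+B-1, c:c+B-1);
            coef = (A * blk * A') .* mask;           % block DCT, then template
            rec(r:r+B-1, c:c+B-1) = A' * coef * A;   % inverse block DCT
        end
    end
    mse(t) = mean((rec(:) - img(:)).^2);
end
disp([keepList; mse]);                    % first row: k, second row: MSE
```

With k = 1 only the block average survives, so every block becomes a flat tile and the block boundaries are plainly visible; with k = 16 all coefficients are kept and the reconstruction is exact up to round-off.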

Application of DCT in JPEG Compression Coding

The DCT (discrete cosine transform) is a block transform that expresses a signal using cosine functions only and is closely related to the Fourier transform. It is often used for compressing image data. The image is divided into blocks of equal size (generally 8*8), and each block is transformed by the DCT to obtain a more compact representation. Because neighboring image pixels are strongly correlated, the DCT can greatly reduce this correlation, so that the energy of the image is concentrated in the upper left corner, which benefits data compression. The transformed data are called DCT coefficients. This step is lossless.

According to the theory of the human visual system, the human eye is more sensitive to changes in the smooth regions of an image and less sensitive to changes in textured regions. After the discrete cosine transform, the image information is concentrated in a few low-frequency coefficients, while texture and edge information lies in the mid- and high-frequency coefficients. Changes to the low-frequency coefficients therefore affect the image visually more than changes to the high-frequency coefficients.

The DCT maps the image signal from the spatial domain to the frequency domain and is the core step of JPEG (a lossy digital image compression technique).

In JPEG compression, in order to obtain a higher compression ratio without a noticeable loss of image quality, the low-frequency coefficients that matter to human vision are retained and most of the high-frequency coefficients are set to zero. JPEG compression therefore leaves the low-frequency coefficients largely intact but strongly alters the high-frequency coefficients. Information embedded in the high-frequency part may be lost under lossy compression. As a trade-off, information can be embedded in the mid-frequency coefficients of the image.

After an image is transformed by the DCT, the total energy of the spatial domain is preserved in the transform domain, but the correlation between pixels decreases and the energy is redistributed: it goes from being spread out in the spatial domain to being relatively concentrated in the frequency domain, mainly in the low-frequency coefficients.


The main calculation steps of the JPEG algorithm:

Forward discrete cosine transform (FDCT)
Quantization
Zigzag scan
Encode the DC coefficients with differential pulse code modulation (DPCM)
Encode the AC coefficients with run-length encoding (RLE)
Entropy coding
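
The first five steps can be sketched on a single 8×8 block. This is a simplified illustration under stated assumptions, not a conforming JPEG encoder: Q is the example luminance quantization table commonly reproduced from Annex K of the JPEG standard, the test block is a synthetic ramp, prevDC is a hypothetical value for the previous block's DC coefficient, the run-length pairs omit the size categories used by real JPEG, and entropy coding is left out.

```matlab
N = 8;
[u, n] = ndgrid(0:N-1, 0:N-1);
A = sqrt(2/N) * cos((2*n + 1) .* u * pi / (2*N));
A(1, :) = sqrt(1/N);

% Example luminance quantization table (JPEG standard, Annex K)
Q = [16 11 10 16  24  40  51  61;
     12 12 14 19  26  58  60  55;
     14 13 16 24  40  57  69  56;
     14 17 22 29  51  87  80  62;
     18 22 37 56  68 109 103  77;
     24 35 55 64  81 104 113  92;
     49 64 78 87 103 121 120 101;
     72 92 95 98 112 100 103  99];

% Zigzag order: walk the anti-diagonals, alternating direction
[c, r] = meshgrid(1:N, 1:N);
s = r + c;                                   % anti-diagonal index
dirKey = r .* (2 * mod(s, 2) - 1);           % odd s: rows ascending, even s: descending
[~, zigzag] = sortrows([s(:), dirKey(:)]);   % linear indices in zigzag order

block = 100 + 10 * repmat(0:N-1, N, 1);      % synthetic 8x8 block (horizontal ramp)

coef = A * (block - 128) * A';               % 1. FDCT after level shift
q    = round(coef ./ Q);                     % 2. quantization (the lossy step)
zz   = q(zigzag)';                           % 3. zigzag scan -> 1x64 row vector

prevDC = 0;                                  % hypothetical DC of the previous block
dcDiff = zz(1) - prevDC;                     % 4. DPCM of the DC coefficient

ac = zz(2:end);                              % 5. RLE of the AC coefficients
lastNz = find(ac ~= 0, 1, 'last');
if isempty(lastNz), lastNz = 0; end
pairs = zeros(0, 2);                         % rows of [zero run, value]
zeroRun = 0;
for v = ac(1:lastNz)
    if v == 0
        zeroRun = zeroRun + 1;
    else
        pairs(end+1, :) = [zeroRun, v];
        zeroRun = 0;
    end
end
% Everything after lastNz is zero and would be signalled by an end-of-block code.

rec = A' * (q .* Q) * A + 128;               % decoder side: dequantize, inverse DCT
disp(dcDiff); disp(pairs);
```

The quantization table divides high-frequency coefficients by larger numbers, so after rounding most of them become zero and end up in one long run at the tail of the zigzag sequence.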

Application of DCT in digital watermarking

Digital watermarking embeds specific information into digital content such that the embedded information cannot be easily removed and, under certain conditions, can be extracted to confirm the author's copyright.
[Figure: watermark embedding block diagram]

[Figure: watermark detection block diagram]

Experiments

MATLAB basics

DCT transform 1

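% Read the test image
mypicture=imread('input.jpg');
% Display the image. To keep a later image from overwriting an earlier one, call figure to open a new window for each display
figure(),imshow(mypicture),title('Original input image');

Figure 1

Convert to grayscale:

grayImage=rgb2gray(mypicture); % convert a color input to grayscale (skip this step if the input is already grayscale)
figure(),imshow(grayImage),title('Input image converted to grayscale');

Figure 2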

Apply the DCT to the image:

% DCT of the image
dctgrayImage=dct2(grayImage);
figure(), imshow(log(abs(dctgrayImage)),[]),title('DCT of the grayscale image'), colormap(gray(4)), colorbar;

Figure 3

Quantize the DCT coefficient matrix:

% Quantize: set coefficients with magnitude below 0.1 to zero
dctgrayImage(abs(dctgrayImage)<0.1)=0;

Inverse DCT:

% Inverse DCT
I=idct2(dctgrayImage)/255;
figure(), imshow(I), title('Grayscale image after DCT and inverse DCT');

Figure 4

Compare the images before and after the DCT round trip:

% Compare the image before the transform with the one after the inverse transform
figure(), subplot(121), imshow(grayImage), title('Original grayscale image'),
subplot(122), imshow(I), title('Image after inverse DCT');

Figure 5

Result analysis: The discrete cosine transform of the original image is shown in Figure 3. The energy of the DCT coefficients is mainly concentrated in the upper left corner, and most of the remaining coefficients are close to zero, which shows that the DCT has properties well suited to image compression. The DCT coefficients are then thresholded: coefficients smaller than a certain value are set to zero. This plays the role of the quantization step in image compression. The inverse DCT is then applied to obtain the compressed image, shown in Figure 4. Comparing the images before and after the transform in Figure 5, it is difficult to tell the difference with the naked eye, so the compression result is good.
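
"Difficult to tell the difference" can also be put in numbers. The sketch below is an illustration under assumptions: it uses a synthetic 64×64 image and a hand-built orthonormal DCT matrix rather than input.jpg and dct2, with the same threshold of 0.1. Because the transform is orthonormal, the squared error of the reconstruction equals the energy of the zeroed coefficients, so the mean squared error cannot exceed the threshold squared.

```matlab
N = 64;
[u, n] = ndgrid(0:N-1, 0:N-1);
A = sqrt(2/N) * cos((2*n + 1) .* u * pi / (2*N));
A(1, :) = sqrt(1/N);

% Synthetic 8-bit style image (assumption: stands in for the grayscale photo)
[x, y] = meshgrid(linspace(0, 1, N));
img = round(128 + 100 * sin(2*pi*x) .* cos(3*pi*y));

F = A * img * A';                 % forward DCT
thr = 0.1;                        % same threshold as in the experiment
Fq = F;
Fq(abs(Fq) < thr) = 0;            % zero the small coefficients

rec = A' * Fq * A;                % inverse DCT
zeroedShare = nnz(Fq == 0) / numel(Fq);
mse = mean((rec(:) - img(:)).^2);
psnrDb = 10 * log10(255^2 / mse); % Inf if no coefficient was zeroed

fprintf('zeroed %.1f%% of coefficients, MSE %.3g, PSNR %.1f dB\n', ...
        100 * zeroedShare, mse, psnrDb);
```

A threshold this small changes each pixel by far less than one gray level, which is consistent with the two images in Figure 5 looking identical; it also means the threshold would have to be much larger to discard a substantial share of the coefficients.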

DCT transform 2

image=imread('input.jpg');
figure;
subplot(2,4,1),imshow(image),title("Original image");
 
grayI=rgb2gray(image);
subplot(2,4,2),imshow(grayI),title("Grayscale image");
 
DCTI=dct2(grayI);
subplot(2,4,3),imshow(DCTI),title("DCT");
 
ADCTI=abs(DCTI); % magnitude of the DCT coefficients
subplot(2,4,4),imshow(ADCTI),title("Absolute value");
 
top=max(ADCTI(:));
subplot(2,4,5),imshow(top),title("Maximum");
 
bottom=min(ADCTI(:));
subplot(2,4,6),imshow(bottom),title("Minimum");
 
ADCTI=(ADCTI-bottom)/(top-bottom)*100; % rescale the magnitudes to the range 0..100
subplot(2,4,7),imshow(ADCTI),title("After quantization");
 
IDCTI=idct2(DCTI)/255;
subplot(2,4,8),imshow(IDCTI),title("Inverse DCT");

As the result shows, the energy is mainly concentrated in the low-frequency components in the upper left corner.

DCT image compression

Open an image, DCT transform it, zero out the high frequencies and inverse transform

image=imread('input.jpg');
figure;
subplot(2,2,1),imshow(image),title("Original image");
 
grayI=rgb2gray(image);
subplot(2,2,2),imshow(grayI),title("Grayscale image");
 
DCTI=dct2(grayI);
subplot(2,2,3),imshow(DCTI),title("DCT");
 
[h,w]=size(DCTI);
cf=60; % keep only the cf-by-cf low-frequency corner (the image must be at least cf pixels in each dimension)
FDCTI=zeros(h,w);
FDCTI(1:cf,1:cf)=DCTI(1:cf,1:cf);
grayOut=uint8(abs(idct2(FDCTI)));
subplot(2,2,4),imshow(grayOut),title("After compression and reconstruction");


Color image to grayscale image

There are two ways to convert a color image to a grayscale image:

1. Use rgb2gray()

rgb2gray() combines the R, G and B components of each pixel into a single luminance value, so the output is a grayscale image.

2. Use rgb2ycbcr()

Extract the Y component. In the YCbCr format the Y component represents the luminance (brightness) of the image, so outputting only the Y component gives the grayscale image.

Specifically:

YCbCr is an ordered triple consisting of Y (luminance), Cb (blue chrominance) and Cr (red chrominance), where Y represents the brightness of the color, and Cb and Cr represent its blue and red chroma offsets, respectively. The human eye is more sensitive to the Y component of video coded in the YCbCr color space, and small changes in Cb and Cr do not cause visible differences. Based on this, the amount of image data is reduced by subsampling Cb and Cr, which greatly lowers the storage and transmission bandwidth an image needs. Compression is thus achieved with almost no visual loss, which in turn makes image transmission faster and storage more convenient. If we want a grayscale image, we first convert the captured color image to YCbCr.

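$$\begin{aligned} Y &= 0.299R + 0.587G + 0.114B \\ C_b &= -0.172R - 0.339G + 0.511B + 128 \\ C_r &= 0.511R - 0.428G - 0.083B + 128 \end{aligned}$$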

This is the RGB888-to-YCbCr formula given in the OV7725 manual. It is simple and clear: extract the R, G and B components of a picture, compute the Y, Cb and Cr components with the formula above, and recombine them for display. The picture displayed this way is the picture in the YCbCr color space. If we use only the Y component for all three channels of the new picture, we get the grayscale version of the color picture.
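
As a quick worked check of the conversion, the sketch below applies the coefficients used in this article's code (Y = 0.299R + 0.587G + 0.114B, and so on) to two single pixels. The matrix name M and the two test pixels are illustrative choices.

```matlab
% Conversion written as a matrix product: [Y; Cb; Cr] = M * [R; G; B] + offset
M = [ 0.299  0.587  0.114;
     -0.172 -0.339  0.511;
      0.511 -0.428 -0.083];
offset = [0; 128; 128];

white = [255; 255; 255];
red   = [255;   0;   0];

ycbcrWhite = M * white + offset;   % gray pixels have no chroma: Cb = Cr = 128
ycbcrRed   = M * red   + offset;   % Y of pure red is 0.299 * 255

disp(ycbcrWhite');
disp(ycbcrRed');
```

For white the chroma rows sum to zero, so Cb and Cr land exactly on the 128 offset and Y equals 255. For pure red, Cr comes out above 255, so the result has to be clipped when it is stored as an 8-bit value.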

% Convert a 640*480 color image to grayscale for display
clc;
clear all;
close all;
 
RGB_data = imread('input.jpg'); % read the image
 
% Convert to double first: arithmetic on uint8 saturates, so the negative terms in Cb and Cr would be clipped to 0
R_data = double(RGB_data(:,:,1));
G_data = double(RGB_data(:,:,2));
B_data = double(RGB_data(:,:,3));
figure;
subplot(1,2,1),imshow(RGB_data),title("Original image");
 
% RGB888 to YCbCr, applied to all pixels at once
Y_data  =  0.299*R_data + 0.587*G_data + 0.114*B_data;
Cb_data = -0.172*R_data - 0.339*G_data + 0.511*B_data + 128;
Cr_data =  0.511*R_data - 0.428*G_data - 0.083*B_data + 128;
 
% Use Y for all three channels to obtain the grayscale image
Gray_data = uint8(cat(3, Y_data, Y_data, Y_data));
subplot(1,2,2),imshow(Gray_data),title("Grayscale via YCbCr");


DCT block transform

1. Use dct2()

% Read the grayscale image
img = imread('huidu.jpg');
% dct2 is the 2D DCT function; it returns a 2D matrix of the same size as the image
dct_mtx = dct2(img);
% idct2 is the inverse 2D DCT function; it recovers the original image matrix
img_idct = idct2(dct_mtx)/255;
figure;
subplot(1,3,1),imshow(img),title("Original image");
subplot(1,3,2),imshow(dct_mtx),title("DCT");
subplot(1,3,3),imshow(img_idct),title("Restored from DCT");

2. Use dctmtx()

io = double(imread("huidu.jpg"));
T = dctmtx(8);
% Block DCT of the carrier image
DCT_org = blkproc(io,[8 8], 'P1*x*P2',T, T');
% Inverse block DCT of the DCT matrix
DCT_reverse = blkproc(DCT_org,[8 8], 'P1*x*P2',T', T);
figure;
% io and DCT_reverse are doubles in the range 0..255, so scale to 0..1 for imshow
subplot(1,3,1),imshow(io/255),title("Original image");
subplot(1,3,2),imshow(DCT_org),title("Block DCT");
subplot(1,3,3),imshow(DCT_reverse/255),title("Restored from block DCT");
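
The MathWorks documentation recommends blockproc as the replacement for blkproc, which may not be available in recent releases. The following sketch shows the same block transform written with blockproc; it assumes the Image Processing Toolbox is installed and uses a random matrix as a stand-in for the image data. The handle names fwdDct and invDct are illustrative.

```matlab
% Assumes the Image Processing Toolbox (dctmtx, blockproc) is available.
io = 255 * rand(64);                      % random stand-in for double(imread(...))
T  = dctmtx(8);                           % 8x8 DCT matrix

% blockproc passes each block as a struct; the pixel data is in the .data field
fwdDct = @(blk) T  * blk.data * T';       % forward 8x8 DCT of one block
invDct = @(blk) T' * blk.data * T;        % inverse 8x8 DCT of one block

DCT_org     = blockproc(io,      [8 8], fwdDct);
DCT_reverse = blockproc(DCT_org, [8 8], invDct);

roundTripErr = max(abs(DCT_reverse(:) - io(:)));
fprintf('round-trip error: %g\n', roundTripErr);
```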

DCT Reversible Information Hiding

Principle: There are many ways to hide information by operating on the DCT coefficient matrix. Here is one of them. With this method the carrier image is modified (lossy), but the hidden information can be extracted without loss.

Hiding method: The relative size of two specific coefficients in the carrier represents the hidden information. The sender and the receiver agree in advance on the positions of the two DCT coefficients used for embedding (for robustness and imperceptibility, these two coefficients should be chosen among the mid-frequency DCT coefficients). For example, let (u,v) and (m,n) be the coordinates of the two selected coefficients in block Bi. The embedding rule is: Bi(u,v) > Bi(m,n) represents the hidden bit "1", and Bi(u,v) < Bi(m,n) represents the hidden bit "0".

If the bit to be hidden is 1 but Bi(u,v) < Bi(m,n), the two coefficients are swapped (and likewise for a 0 bit when Bi(u,v) > Bi(m,n)). Finally, the sender converts the image back to the spatial domain with the two-dimensional inverse DCT for transmission.

Extraction method: After receiving the image, the receiver applies the two-dimensional DCT to it and compares the DCT coefficients at the agreed positions in each block to obtain the bit string of hidden information, thereby extracting the secret information.
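
The rule can be sketched on a single 8×8 block before looking at a full script. This is a minimal illustration under stated assumptions: it follows the rule as written in the text (first coefficient larger means "1"), uses the positions (5,2) and (4,3) as the agreed pair, builds the DCT matrix by hand, uses magic(8) as a stand-in carrier block, and widens the gap on both sides by a hypothetical margin alpha. The full script in this article uses the same two positions with the opposite bit assignment.

```matlab
N = 8;
[u, n] = ndgrid(0:N-1, 0:N-1);
A = sqrt(2/N) * cos((2*n + 1) .* u * pi / (2*N));
A(1, :) = sqrt(1/N);

p1 = [5 2];                 % agreed position (u,v)
p2 = [4 3];                 % agreed position (m,n)
alpha = 0.02;               % margin added to the gap between the two coefficients

block = 3 * magic(N);       % stand-in 8x8 carrier block
bits = [1 0];               % embed each bit into a fresh copy of the block
extracted = zeros(size(bits));

for t = 1:numel(bits)
    % --- sender ---
    C = A * block * A';                       % block DCT
    a = C(p1(1), p1(2));
    b = C(p2(1), p2(2));
    hi = max(a, b) + alpha;                   % larger value, pushed up
    lo = min(a, b) - alpha;                   % smaller value, pushed down
    if bits(t) == 1
        C(p1(1), p1(2)) = hi;  C(p2(1), p2(2)) = lo;   % "1": first > second
    else
        C(p1(1), p1(2)) = lo;  C(p2(1), p2(2)) = hi;   % "0": first < second
    end
    stego = A' * C * A;                       % back to the spatial domain

    % --- receiver ---
    C2 = A * stego * A';
    extracted(t) = C2(p1(1), p1(2)) > C2(p2(1), p2(2));
end

disp(bits); disp(extracted);
```

The sketch keeps the stego block in double precision. If the pixels are rounded to 8-bit integers before transmission, a margin as small as 0.02 may not survive, so a larger alpha would likely be needed in that case.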

% Information hiding with the DCT
io = imread("huidu.jpg");
figure;
subplot(1,3,1),imshow(io),title("Original image");
 
io = double(io);
% Secret message msg to embed
msg = [1,0,1,1];
% Used for counting; embedding stops once all bits are embedded.
count = length(msg);
org_msg = [1,0,1,1];
T = dctmtx(8); % the image is split into 8*8 blocks
DCTrgb = blkproc(io,[8 8], 'P1*x*P2',T, T'); % block DCT of the carrier image
subplot(1,3,2),imshow(DCTrgb),title("Block DCT");
 
[row,col]=size(DCTrgb);
row=floor(row/8);
col=floor(col/8);
alpha=0.02;
k = 1;
temp=0;
for i=0:(row - 1)
    for j=0: (col -1)
        irow = i * 8;
        jcol = j * 8;
        if k <= count
            if msg(k) == 0
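                % Use the pair of coefficients at (5,2) and (4,3).
                % In this code, bit 0 is represented by (5,2) > (4,3) and bit 1 by (5,2) < (4,3), matching the extraction step below.
                % For bit 0: if (5,2) < (4,3), swap the two coefficients so that (5,2) > (4,3).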
                if DCTrgb(irow + 5, jcol + 2) < DCTrgb(irow + 4,jcol + 3)
                    temp = DCTrgb(irow + 5, jcol + 2);
                    DCTrgb(irow + 5, jcol + 2) = DCTrgb(irow + 4,jcol + 3);
                    DCTrgb(irow + 4, jcol + 3) = temp;
                end
            else
                if DCTrgb(irow + 5, jcol + 2) > DCTrgb(irow + 4,jcol + 3)
                    temp = DCTrgb(irow + 5, jcol + 2);
                    DCTrgb(irow + 5, jcol + 2) = DCTrgb(irow + 4,jcol + 3);
                    DCTrgb(irow + 4, jcol + 3) = temp;
                end
            end
            % Make the smaller coefficient even smaller so that the gap between the two widens
            if DCTrgb(irow + 5, jcol + 2) < DCTrgb(irow + 4,jcol +3)
                DCTrgb(irow + 5, jcol + 2) = DCTrgb(irow +5, jcol +2) - alpha;
            else
                DCTrgb(irow + 4, jcol + 3) = DCTrgb(irow + 4, jcol +3) - alpha;
            end
            k = k + 1;
        end
    end
end
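wi=blkproc(DCTrgb,[8 8],'P1*x*P2',T',T); % inverse block DCT of the carrier with the embedded message, restoring the image
orgin_wi=wi/255;
subplot(1,3,3),imshow(orgin_wi),title("After embedding");
 
% Extract the message
% ext_msg is the extracted secret message
ext_msg = [];
T=dctmtx(8);
DCTcheck=blkproc(wi,[8 8],'P1*x*P2',T,T'); % block DCT of the stego image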
[row,col]=size(DCTcheck);
row=floor(row/8);
col=floor(col/8);
k = 1;
for i=0:(row - 1)
    for j=0: (col -1)
        irow = i * 8;
        jcol = j * 8;
        % Compare the coefficients at (5,2) and (4,3) to decide whether the hidden bit is 1 or 0
        if k <= count
            if DCTcheck(irow + 5, jcol + 2) < DCTcheck(irow + 4,jcol + 3)
                ext_msg(k,1)=1;
            end
            if DCTcheck(irow + 5, jcol + 2) > DCTcheck(irow + 4,jcol + 3)
                ext_msg(k,1)=0;
            end
            k = k + 1;
        end
    end
end
fprintf("Embedded message: ");
disp(msg);
fprintf("Extracted message: ");
disp(ext_msg);

Origin blog.csdn.net/VinagerJoe/article/details/129023880