OpenCV basics 58: Fourier transform with cv2.dft() (image enhancement, image denoising, edge detection, feature extraction, image compression and encryption)

What is the Fourier transform?

The Fourier transform is named after the French mathematician Jean-Baptiste Joseph Fourier (1768-1830) in honor of his contributions to this mathematical tool. Fourier lived at the end of the 18th century and the beginning of the 19th century. He was a versatile scientist who made outstanding achievements in mathematics and important contributions to the theory of heat conduction.

The earliest form of the Fourier transform was introduced by Fourier in 1822. In his book "The Analytical Theory of Heat" (Théorie analytique de la chaleur), he described a method for decomposing a periodic function into a series of sine and cosine functions. His work laid the foundation for modern frequency-domain analysis in signal processing, image processing, and other fields.

The transform carries Fourier's name because he was the first to systematically study and apply this mathematical tool. His work has profoundly influenced modern science, especially signal processing, image processing, and communication, where the Fourier transform is widely used to analyze and process many types of signals and data.

What is the transformation in the Fourier transform?

In the Fourier transform, "transform" refers to converting a function from a time-domain representation to a frequency-domain representation. In other words, the Fourier transform changes the representation of a function from describing its behavior at different points in time to describing its components at different frequencies.

In the Fourier transform, we transform a function (usually a signal or image). This function can be continuous or discrete. The Continuous Fourier Transform (CFT) is applied to a continuous-time signal, transforming it from the time domain to the continuous frequency domain. The Discrete Fourier Transform (DFT) is applied to a discrete-time signal, transforming it from the time domain to the discrete frequency domain.
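To make the discrete case concrete, here is a minimal sketch that evaluates the textbook DFT sum X[k] = sum over n of x[n]*exp(-2*pi*i*k*n/N) directly and compares it with numpy.fft.fft(). The helper name naive_dft and the sample signal are assumptions made for illustration only.

```python
import numpy as np

def naive_dft(x):
    """Evaluate the DFT sum directly (O(N^2)); hypothetical helper for illustration."""
    x = np.asarray(x, dtype=complex)
    n = np.arange(x.size)
    k = n.reshape(-1, 1)
    # Each row of the matrix holds exp(-2*pi*i*k*n/N) for one output frequency k.
    basis = np.exp(-2j * np.pi * k * n / x.size)
    return basis @ x

signal = np.array([1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 1.5])
spectrum = naive_dft(signal)
matches_fft = bool(np.allclose(spectrum, np.fft.fft(signal)))
print(matches_fft)
```

The direct sum is only practical for tiny inputs; the library functions compute the same result with a fast algorithm.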

What is transformed in opencv?

In OpenCV, the Fourier transform is applied to an image, converting it from the spatial domain (the image counterpart of the time domain) to the frequency domain. Specifically, the Fourier transform in OpenCV is the discrete Fourier transform (DFT), which converts a discrete image from the spatial domain to the discrete frequency domain.

In the Fourier transform, OpenCV applies a mathematical transformation to the input image, expressing the image as a sum of sine and cosine waves, each with its own amplitude and phase. This converts an image from its raw pixel representation to a complex-valued representation of the different frequency components in the image.

After reading all of the above, I still don't know what the Fourier transform actually is; I only know why it is called the Fourier transform.


The Fourier transform is very abstract. Many people have used the Fourier transform for many years in engineering and have not fully understood what the Fourier transform is all about. In order to better illustrate the Fourier transform, let's look at an example in life.

Table 14-1 shows the recipe for a certain drink, written as a table of operations ordered by time. The table is very long, so only part of it is shown here. It records the operations from time "00:00" to time "00:11".

[Table 14-1: the drink recipe as a time-ordered table, from 00:00 to 00:11]

Careful analysis of the table reveals the following recipe:

  • Put 1 piece of rock sugar every 1 minute.
  • Put 3 red beans every 2 minutes.
  • Put 2 mung beans every 3 minutes.
  • Place 4 tomatoes every 4 minutes.
  • Put 1 cup of purified water every 5 minutes.

The list above describes the recipe from the perspective of operating frequency.

In data processing, information is often expressed as graphs. From the time-domain perspective, the recipe table can be drawn as Figure 14-1. Figure 14-1 only shows the first 11 minutes of the recipe; to express the recipe fully, the operations over the whole duration would have to be charted.

[Figure 14-1: the recipe in the time domain]

From the perspective of frequency (period), the recipe table can be drawn as Figure 14-2, where the abscissa is the period (the reciprocal of frequency) and the ordinate is the amount of each ingredient.

It can be seen that Figure 14-2 can completely represent the operation process of this recipe.
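As a rough sketch of why the short period description is enough, the code below rebuilds the minute-by-minute table from the five (period, amount) pairs listed earlier. It assumes every ingredient is first added at 00:00; the names recipe and operations_at are hypothetical.

```python
# Period in minutes -> (amount, ingredient), taken from the recipe list.
recipe = {
    1: (1, "rock sugar"),
    2: (3, "red beans"),
    3: (2, "mung beans"),
    4: (4, "tomatoes"),
    5: (1, "cup of purified water"),
}

def operations_at(minute):
    """Return what is added at a given minute (assumes everything starts at 00:00)."""
    return [item for period, item in recipe.items() if minute % period == 0]

# Rebuild the time-domain table for the first 12 minutes (00:00 to 00:11).
timeline = {minute: operations_at(minute) for minute in range(12)}
for minute, ops in timeline.items():
    print(f"00:{minute:02d}", ops)
```

Five pairs describe the recipe for any length of time, whereas the time-domain table grows by one row per minute.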

[Figure 14-2: the recipe in the frequency (period) domain]

A function can likewise be transformed from the time domain to the frequency domain.

Figure 14-3 is a sinusoid with frequency 5 (5 cycles in 1 second) and amplitude 1.

[Figure 14-3: a sinusoid with frequency 5 and amplitude 1]

From the perspective of frequency, it can be drawn as the frequency-domain diagram shown in Figure 14-4, where the abscissa is the frequency and the ordinate is the amplitude.

[Figure 14-4: frequency-domain diagram of the sinusoid]

Figure 14-3 and Figure 14-4 are equivalent; they are different representations of the same function. The corresponding time domain representation can be obtained through the frequency domain representation, and the corresponding frequency domain representation can also be obtained through the time domain representation.
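The equivalence can be checked numerically. The sketch below samples one second of a sinusoid with frequency 5 and amplitude 1, then reads the frequency and amplitude back from its spectrum. The sampling rate of 100 samples per second is an assumption chosen for the example.

```python
import numpy as np

fs = 100                         # samples per second (assumed)
t = np.arange(fs) / fs           # exactly 1 second of samples
y = np.sin(2 * np.pi * 5 * t)    # frequency 5, amplitude 1

spectrum = np.fft.rfft(y)
freqs = np.fft.rfftfreq(y.size, d=1 / fs)
# Scale the bin magnitudes so that a pure sinusoid reports its own amplitude.
amplitudes = 2 * np.abs(spectrum) / y.size

peak = int(np.argmax(amplitudes))
peak_freq = float(freqs[peak])
peak_amp = float(amplitudes[peak])
print(peak_freq, round(peak_amp, 6))
```

The single non-zero bar at frequency 5 with height 1 is the information that Figure 14-4 displays.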

French mathematician Fourier pointed out that any periodic function can be expressed as the sum of sinusoidal functions of different frequencies. In today's view, this theory is taken for granted, but this theory is difficult to understand and was greatly questioned at the time.

Let's look at the specific process of Fourier transform. For example, the curve of a periodic function is shown in the upper left plot of Figure 14-5. This periodic function can be expressed as:

y = 3*np.sin(0.8*x) + 7*np.sin(0.5*x) + 2*np.sin(0.2*x)

Therefore, the function can be seen as the sum of the following three functions:

  • y1 = 3*np.sin(0.8*x) (function 1)
  • y2 = 7*np.sin(0.5*x) (function 2)
  • y3 = 2*np.sin(0.2*x) (function 3)

The function curves corresponding to the above three functions are shown in the upper right corner, lower left corner and lower right corner of Figure 14-5 respectively.

[Figure 14-5: the function y (upper left) and its three sine components]

From the perspective of the frequency domain, the three sine functions can be drawn as the three bars in Figure 14-6, where the abscissa is the frequency and the ordinate is the amplitude.

[Figure 14-6: frequency-domain diagram of the three sine components]

From the above analysis, we can see that the function curve in the upper left corner of Figure 14-5 can be expressed as the frequency domain diagram shown in Figure 14-6.

The process of constructing the frequency domain graph shown in Figure 14-6 from the time domain function graph in the upper left corner of Figure 14-5 is the Fourier transform.
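A minimal sketch of that process with Numpy: sample the summed function over one common period and read the three amplitudes back from the spectrum. The sample count N and the choice of 20*pi as the sampling window are assumptions; 20*pi works because the angular frequencies 0.8, 0.5 and 0.2 are all multiples of 0.1.

```python
import numpy as np

N = 1000
# One common period of the three terms, sampled at N evenly spaced points.
x = np.arange(N) * (20 * np.pi / N)
y = 3 * np.sin(0.8 * x) + 7 * np.sin(0.5 * x) + 2 * np.sin(0.2 * x)

amplitudes = 2 * np.abs(np.fft.rfft(y)) / N
# On this grid, bin k corresponds to angular frequency 0.1 * k.
strong_bins = np.flatnonzero(amplitudes > 0.5)
found = {round(0.1 * int(k), 1): round(float(amplitudes[k]), 6) for k in strong_bins}
print(found)
```

Only three bins stand out, one per sine term, and their heights match the coefficients in the formula for y.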

  • Figure 14-1 and Figure 14-2 represent the same information: Figure 14-1 is a time-domain diagram and Figure 14-2 is a frequency-domain diagram.
  • The time-domain function graph in the upper left corner of Figure 14-5 carries exactly the same information as the frequency-domain graph shown in Figure 14-6.
  • The Fourier transform expresses time-domain information completely from the perspective of the frequency domain.

In addition to the frequency and amplitude mentioned above, the time offset must also be considered. For example, a drink recipe requires tight control over when each ingredient is added in order to control the flavor. Under finer control, the operation at "00:00" in Table 14-1 actually looks like Table 14-2.

[Table 14-2: the operations at 00:00 under finer time control]

If the timing of adding the ingredients is changed, the flavor of the drink will change. Therefore, in actual processing, the time difference must also be considered. This time difference is the phase in the Fourier transform. Phase represents information related to time difference.

For example, the function corresponding to the upper left corner of Figure 14-7 can be expressed as:

y = 3*np.sin(0.8*x) + 7*np.sin(0.5*x+2) + 2*np.sin(0.2*x+3)

Therefore, the function can be seen as the sum of the following three functions:

  • y1 = 3*np.sin(0.8*x) (function 1)
  • y2 = 7*np.sin(0.5*x+2) (function 2)
  • y3 = 2*np.sin(0.2*x+3) (function 3)

The function curves corresponding to the above three functions are shown in the upper right corner, lower left corner and lower right corner of Figure 14-7 respectively.

[Figure 14-7: the function y with phase offsets (upper left) and its three sine components]

In this example, if the abscissa is regarded as the start time, the three sine functions that make up y do not all start from time 0; there is a time difference between them. If the functions without a time difference are used directly, the result is not the function in the upper left corner of Figure 14-7 but the one in the upper left corner of Figure 14-5.
Therefore, the phase difference is also a very important condition in the Fourier transform.
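The phase can be read from the spectrum in the same way as the amplitude. The sketch below is an illustration under the same assumptions as before (N samples over a window of 20*pi); it recovers both values for each of the three terms. The pi/2 correction is needed because the DFT measures phase relative to a cosine, while the formula above is written with sines.

```python
import numpy as np

N = 1000
x = np.arange(N) * (20 * np.pi / N)   # one common period of the three terms
y = 3 * np.sin(0.8 * x) + 7 * np.sin(0.5 * x + 2) + 2 * np.sin(0.2 * x + 3)

spectrum = np.fft.rfft(y)
components = {}
for k in (8, 5, 2):                   # bins for angular frequencies 0.8, 0.5, 0.2
    amplitude = 2 * np.abs(spectrum[k]) / N
    # Convert the cosine-referenced DFT phase to the sine form used in the text.
    phase = np.angle(spectrum[k]) + np.pi / 2
    components[k] = (float(amplitude), float(phase))

for k, (amplitude, phase) in components.items():
    print(f"w={0.1 * k:.1f}  amplitude={amplitude:.3f}  phase={phase:.3f}")
```

The amplitudes are the same as for the function without offsets; only the phases distinguish Figure 14-7 from Figure 14-5.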

The above uses examples of beverage recipes and functions to introduce the feasibility of time domain and frequency domain conversion. I hope it will be helpful for everyone to understand Fourier transform.

In image processing, the Fourier transform decomposes the image into sine and cosine components, that is, it converts the image from the spatial domain to the frequency domain. After a digital image undergoes the Fourier transform, the resulting frequency-domain values are complex numbers.

Therefore, to display the result of the Fourier transform, it is necessary to use either a real image plus an imaginary image (together, a complex image), or a magnitude image plus a phase image.

Because the magnitude image contains most of the information we need in the original image, usually only the magnitude image is used during image processing. Of course, if you want to process the image in the frequency domain first, and then obtain the modified spatial domain image through inverse Fourier transform, you must preserve the magnitude image and phase image at the same time.
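The need to keep both parts can be illustrated with a small sketch. It uses a random 8 x 8 array as a stand-in for a grayscale image (an assumption, so that no image file is required), splits the spectrum into magnitude and phase, and attempts the inverse transform with and without the phase.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(8, 8)).astype(np.float64)  # stand-in grayscale image

f = np.fft.fft2(img)
magnitude = np.abs(f)
phase = np.angle(f)

# Magnitude and phase together rebuild the complex spectrum and therefore the image.
restored = np.fft.ifft2(magnitude * np.exp(1j * phase)).real
# Dropping the phase (treating it as zero) does not give the image back.
magnitude_only = np.fft.ifft2(magnitude).real

full_ok = bool(np.allclose(restored, img))
magnitude_only_ok = bool(np.allclose(magnitude_only, img))
print(full_ok, magnitude_only_ok)
```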

After performing a Fourier transform on the image, we obtain its low-frequency and high-frequency information. Low-frequency information corresponds to slowly changing grayscale components in the image. High-frequency information corresponds to grayscale components that change rapidly, which are caused by sharp transitions in gray level.

For example, if there is a lion in an image of a prairie, the low-frequency information corresponds to regions such as the vast grassland with its consistent color, while the high-frequency information corresponds to edges and noise, such as the outline of the lion.
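A tiny numeric sketch of this idea follows, using two synthetic 8 x 8 arrays in place of real photographs (an assumption): a uniform patch standing in for the grassland, and the same patch with one abrupt edge. The helper non_dc_energy is hypothetical.

```python
import numpy as np

flat = np.full((8, 8), 100.0)   # uniform region, like the grassland
edge = flat.copy()
edge[:, 4:] = 200.0             # abrupt vertical edge, like an outline

def non_dc_energy(image):
    """Sum of spectral magnitudes excluding the zero-frequency term (illustrative helper)."""
    magnitude = np.abs(np.fft.fft2(image))
    magnitude[0, 0] = 0.0
    return float(magnitude.sum())

flat_energy = non_dc_energy(flat)
edge_energy = non_dc_energy(edge)
print(flat_energy, edge_energy)
```

The uniform patch has essentially nothing outside the zero-frequency term, while the edge spreads energy into the non-zero frequencies.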

The purpose of Fourier transform is to convert the image from the spatial domain to the frequency domain, and realize the processing of specific objects in the image in the frequency domain, and then inverse Fourier transform the processed frequency domain image to obtain the spatial domain image. Fourier transform plays a very critical role in the field of image processing, which can realize image enhancement, image denoising, edge detection, feature extraction, image compression and encryption, etc.


Looking at a bunch of text descriptions burns my brain, let's take a look at the code

Implementing the Fourier transform with Numpy

The Numpy module provides Fourier transform functions; its fft2() function computes the Fourier transform of an image.

The function is numpy.fft.fft2(), and its syntax format is:

return value = numpy.fft.fft2(original image)

Note that the parameter "original image" should be a grayscale image, and the return value of the function is a complex ndarray.

After processing with this function, the spectrum of the image is obtained. At this point the zero-frequency component is located in the upper left corner of the spectrum image (frequency-domain image). For easier observation, the numpy.fft.fftshift() function is usually used to move the zero-frequency component to the center of the frequency-domain image, as shown in Figure 14-8.

[Figure 14-8: moving the zero-frequency component to the center of the spectrum]

The syntax format of the function numpy.fft.fftshift() is:

return value = numpy.fft.fftshift(raw spectrum)

After processing with this function, the zero-frequency component in the image spectrum will be moved to the center of the frequency-domain image, which is very effective for observing the zero-frequency part in the spectrum after Fourier transform.
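The move is easy to see on a small array. In this sketch a constant 4 x 4 array is used as an assumed test input, because all of its energy sits in the zero-frequency term.

```python
import numpy as np

img = np.ones((4, 4))            # constant image: all energy at zero frequency
f = np.fft.fft2(img)
fshift = np.fft.fftshift(f)

# Locate the largest spectral value before and after the shift.
corner_index = tuple(int(i) for i in np.unravel_index(np.argmax(np.abs(f)), f.shape))
center_index = tuple(int(i) for i in np.unravel_index(np.argmax(np.abs(fshift)), fshift.shape))
print(corner_index, center_index)
```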
After performing the Fourier transform on the image, the result is an array of complex numbers. To display it as an image, take the magnitude and compress its very large dynamic range with a logarithm, so that the values can be shown as a grayscale image (plt.imshow() then scales them for display). The formula used is:

Pixel new value = 20*np.log(np.abs(spectrum value))
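Two practical details are worth a sketch. The logarithm of an exactly zero magnitude is -inf, and the log-scaled values are not guaranteed to lie inside [0, 255]. The hypothetical helper spectrum_to_uint8 below uses np.log1p(), which computes log(1 + x) and so differs slightly from the formula above, and then stretches the result to 0..255 explicitly.

```python
import numpy as np

def spectrum_to_uint8(fshift):
    """Log-scale a complex spectrum and stretch it to 0..255 (hypothetical helper)."""
    # log1p(x) = log(1 + x) avoids -inf where the magnitude is exactly zero.
    scaled = 20 * np.log1p(np.abs(fshift))
    span = scaled.max() - scaled.min()
    if span == 0:
        return np.zeros(scaled.shape, dtype=np.uint8)
    return np.round(255 * (scaled - scaled.min()) / span).astype(np.uint8)

img = np.zeros((8, 8))
img[2:6, 2:6] = 255.0                      # small synthetic test image
fshift = np.fft.fftshift(np.fft.fft2(img))
display = spectrum_to_uint8(fshift)
print(display.dtype, int(display.min()), int(display.max()))
```

When the spectrum is only shown with plt.imshow(), the explicit stretch is optional, because imshow normalizes the data by default.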

Code example: implement the Fourier transform with Numpy and observe the resulting spectrum image.

The code is as follows:

import cv2
import numpy as np
import matplotlib.pyplot as plt
img = cv2.imread('lena.png',0)
f = np.fft.fft2(img)
fshift = np.fft.fftshift(f)
magnitude_spectrum = 20*np.log(np.abs(fshift))
plt.subplot(121)
plt.imshow(img, cmap = 'gray')
plt.title('original')
plt.axis('off')
plt.subplot(122)
plt.imshow(magnitude_spectrum, cmap = 'gray')
plt.title('result')
plt.axis('off')
plt.show()

Running result:

[Figure: the original image and its spectrum image]

Implement the inverse Fourier transform

Note that if the numpy.fft.fftshift() function was used to shift the zero-frequency component during the Fourier transform, then before the inverse Fourier transform the numpy.fft.ifftshift() function must be used to move the zero-frequency component back to its original position. Only then is the inverse Fourier transform performed. The process is shown in Figure 14-10.

[Figure 14-10: undoing the shift before the inverse Fourier transform]

The function numpy.fft.ifftshift() is the inverse function of numpy.fft.fftshift(), and its syntax format is:

adjusted spectrum = numpy.fft.ifftshift(raw spectrum)
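A short sketch shows why ifftshift() rather than a second fftshift() is used to undo the shift. For even-sized axes the two happen to coincide, so an odd-length array is used here to make the difference visible.

```python
import numpy as np

x = np.arange(5)                          # odd length makes the difference visible
shifted = np.fft.fftshift(x)

undone = np.fft.ifftshift(shifted)        # correct way to undo the shift
shifted_twice = np.fft.fftshift(shifted)  # not the original order for odd lengths

print(shifted, undone, shifted_twice)
```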

The numpy.fft.ifft2() function can implement the inverse Fourier transform and return an array of spatial complex numbers. It is the inverse function of numpy.fft.fft2(), the syntax of this function is:

Return value = numpy.fft.ifft2(frequency domain data)

The return value of the function numpy.fft.ifft2() is still a complex ndarray.
To display the spatial-domain result as a grayscale image, take its magnitude to obtain real values. The formula used is:

iimg = np.abs(inverse Fourier transform result)
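The round trip can be verified without an image file. In this sketch a random 6 x 6 array stands in for the grayscale image (an assumption); the imaginary part of the inverse result turns out to be floating-point noise only, which is why np.abs() gives back the original non-negative pixel values.

```python
import numpy as np

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(6, 6)).astype(np.float64)  # stand-in grayscale image

fshift = np.fft.fftshift(np.fft.fft2(img))
restored_complex = np.fft.ifft2(np.fft.ifftshift(fshift))

# The imaginary part is only rounding error, so the magnitude recovers the pixels.
max_imag = float(np.max(np.abs(restored_complex.imag)))
iimg = np.abs(restored_complex)
round_trip_ok = bool(np.allclose(iimg, img))
print(max_imag < 1e-9, round_trip_ok)
```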

Example: Implement Fourier transform and inverse Fourier transform in Numpy, and observe the result image of inverse Fourier transform.

import cv2
import numpy as np
import matplotlib.pyplot as plt
img = cv2.imread('lena.png',0)
# Fourier transform of the image; the result is complex
f = np.fft.fft2(img)
# Move the zero-frequency (low-frequency) component to the center
fshift = np.fft.fftshift(f)
# Log-scaled magnitude (real-valued) for display
magnitude_spectrum = 20*np.log(np.abs(fshift))
# Move the zero-frequency component back to its original position
ishift = np.fft.ifftshift(fshift)

# Inverse Fourier transform; the result is still complex
iimg = np.fft.ifft2(ishift)

# Take the magnitude to get real pixel values
iimg = np.abs(iimg)

plt.subplot(131),plt.imshow(img, cmap = 'gray')
plt.title('original'),plt.axis('off')

plt.subplot(132),plt.imshow(magnitude_spectrum, cmap = 'gray')
plt.title('magnitude_spectrum'),plt.axis('off')

plt.subplot(133),plt.imshow(iimg, cmap = 'gray')
plt.title('iimg'),plt.axis('off')

plt.show()

Running result:

[Figure: the original image, its magnitude spectrum, and the image restored by the inverse transform]

High-pass filtering example

In an image, high-frequency signals and low-frequency signals exist at the same time.

  • Low-frequency signals correspond to slowly changing grayscale components in the image. For example, in an image of a savannah, the low-frequency signal corresponds to a broad expanse of grassland that tends to be uniform in color.
  • High-frequency signals correspond to grayscale components that change rapidly, which are caused by sharp transitions in gray level. If there is also a lion in the savannah image, the high-frequency signal corresponds to information such as the edges of the lion.

The filter can allow a certain frequency component to pass or reject it, and can be divided into a low-pass filter and a high-pass filter according to its mode of action.

  • A filter that allows low-frequency signals to pass is called a low-pass filter. A low-pass filter attenuates high-frequency signals and passes low-frequency signals, which blurs the image.
  • A filter that allows high-frequency signals to pass is called a high-pass filter. A high-pass filter attenuates low-frequency signals and passes high-frequency signals, which will enhance sharp details in the image, but will cause the image to lose contrast.
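As a complement to the high-pass example in this article, here is a minimal low-pass sketch. It keeps only a small square around the center of the shifted spectrum and drops everything else. The synthetic 64 x 64 test image and the cut-off radius of 8 are assumptions chosen only for illustration.

```python
import numpy as np

rows, cols = 64, 64
img = np.zeros((rows, cols))
img[:, cols // 2:] = 255.0            # synthetic image with one sharp vertical edge

fshift = np.fft.fftshift(np.fft.fft2(img))

# Low-pass mask: keep a symmetric square around the center (the low frequencies).
crow, ccol = rows // 2, cols // 2
radius = 8                            # assumed cut-off
mask = np.zeros((rows, cols))
mask[crow - radius:crow + radius + 1, ccol - radius:ccol + radius + 1] = 1

blurred = np.fft.ifft2(np.fft.ifftshift(fshift * mask)).real

# The largest jump between neighbouring columns shrinks, i.e. the edge is softened.
sharp_step = float(np.abs(np.diff(img, axis=1)).max())
soft_step = float(np.abs(np.diff(blurred, axis=1)).max())
print(sharp_step, soft_step)
```

A hard square mask like this one also produces ringing near edges; smoother masks are a common alternative, but that is beyond this sketch.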

The Fourier transform can separate the high-frequency and low-frequency signals of an image.

[Figure 14-12]

To set all the values in the central region of the right-hand image in Figure 14-12 to zero, first calculate the coordinates of the center, then select the area that extends 30 pixels up, down, left and right from that center, and set the values in this area to zero.

The implementation of this filter is:

rows, cols = img.shape
crow,ccol = int(rows/2) , int(cols/2)
fshift[crow-30:crow+30, ccol-30:ccol+30] = 0
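One effect of this filter can be checked numerically. The zero-frequency term at the center carries the average brightness of the image, so removing it leaves a result whose mean is essentially zero. The sketch below uses a random 128 x 128 array as a stand-in for the grayscale image (an assumption).

```python
import numpy as np

rng = np.random.default_rng(2)
img = rng.integers(0, 256, size=(128, 128)).astype(np.float64)  # stand-in grayscale image

fshift = np.fft.fftshift(np.fft.fft2(img))
rows, cols = img.shape
crow, ccol = rows // 2, cols // 2
fshift[crow - 30:crow + 30, ccol - 30:ccol + 30] = 0   # zero the central 60 x 60 block

filtered = np.fft.ifft2(np.fft.ifftshift(fshift))

# With the zero-frequency term removed, the average of the result is ~0.
mean_before = float(img.mean())
mean_after = float(filtered.real.mean())
print(round(mean_before, 2), abs(mean_after) < 1e-9)
```

This is consistent with the loss of contrast that the text attributes to high-pass filtering.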

Example: Fourier transform an image with Numpy to get its frequency-domain image. Then set the low-frequency components to 0 in the frequency domain to implement high-pass filtering. Finally, apply the inverse Fourier transform to obtain the high-pass filtered image in the spatial domain.

Observe the difference between the image before and after filtering.

import cv2
import numpy as np
import matplotlib.pyplot as plt
img = cv2.imread('lena.png',0)
f = np.fft.fft2(img)
fshift = np.fft.fftshift(f)
rows, cols = img.shape
crow,ccol = int(rows/2) , int(cols/2)
fshift[crow-30:crow+30, ccol-30:ccol+30] = 0
ishift = np.fft.ifftshift(fshift)
iimg = np.fft.ifft2(ishift)
iimg = np.abs(iimg)
plt.subplot(121),plt.imshow(img, cmap = 'gray')
plt.title('original'),plt.axis('off')
plt.subplot(122),plt.imshow(iimg, cmap = 'gray')
plt.title('iimg'),plt.axis('off')
plt.show()

Running result:

[Figure: the original image and the high-pass filtered image]

OpenCV implements Fourier transform

OpenCV provides the functions cv2.dft() and cv2.idft() to implement the Fourier transform and the inverse Fourier transform, which are introduced separately below.

Implement the Fourier transform

The syntax format of the function cv2.dft() is:

Return result = cv2.dft(original image, conversion flag)

When using this function, you need to pay attention to the usage specifications of the parameters:

  • For the parameter "original image", first use the np.float32() function to convert the image into np.float32 format.
  • The value of "conversion flag" is usually "cv2.DFT_COMPLEX_OUTPUT", which is used to output a complex array.

The result returned by the function cv2.dft() is consistent with the result obtained by using Numpy for the Fourier transform, but it is returned as a two-channel array: the first channel is the real part of the result, and the second channel is the imaginary part.

After the transformation of the function cv2.dft(), we get the spectral information of the original image.
At this time, the zero-frequency component is not at the center. For the convenience of processing, it needs to be moved to the center, which can be realized by the function numpy.fft.fftshift().

For example, the following statement moves the zero-frequency component of the spectrum dft to the center, giving the spectrum dftShift with the zero-frequency component at the center. Passing axes=(0, 1) restricts the shift to rows and columns, so the channel axis that holds the real and imaginary parts is left alone.

dftShift = np.fft.fftshift(dft, axes=(0, 1))
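One caveat with a two-channel cv2.dft() result: numpy.fft.fftshift() shifts every axis by default, including the channel axis of length 2, which swaps the real and imaginary channels. The magnitude is unaffected, and a matching ifftshift() swaps them back, so round trips still work, but channel 0 is then no longer the real part. The sketch below demonstrates this on a synthetic array that stands in for a cv2.dft() result.

```python
import numpy as np

# Stand-in for a cv2.dft() result: channel 0 = real part (all 1), channel 1 = imaginary part (all 2).
dft = np.zeros((4, 4, 2), dtype=np.float32)
dft[:, :, 0] = 1.0
dft[:, :, 1] = 2.0

shifted_all_axes = np.fft.fftshift(dft)               # also shifts the channel axis
shifted_spatial = np.fft.fftshift(dft, axes=(0, 1))   # shifts rows and columns only

channels_swapped = bool(np.all(shifted_all_axes[:, :, 0] == 2.0))
channels_kept = bool(np.all(shifted_spatial[:, :, 0] == 1.0))
print(channels_swapped, channels_kept)
```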

After the above processing, the spectral image is only a value composed of real part and imaginary part. To display it, further processing is required.

The function cv2.magnitude() computes the magnitude of the spectrum. The syntax of this function is:

Return value = cv2.magnitude(parameter 1, parameter 2)

The meanings of the two parameters in the formula are as follows:

  • Parameter 1: The x-coordinate value of floating point type, that is, the real part.
  • Parameter 2: The y-coordinate value of floating point type, that is, the imaginary part. It must have the same size as parameter 1 (the array dimensions, not the magnitude of the values).

The return value of the function cv2.magnitude() is the square root of the sum of the squares of parameter 1 and parameter 2. The formula is:

dst(I) = sqrt(x(I)^2 + y(I)^2)

In the formula, x and y are parameter 1 and parameter 2, I indexes the elements of the arrays, and dst is the output array.

After obtaining the magnitude of the spectrum, it usually needs further conversion before it can be displayed as an image, because the raw magnitudes span a very large range. A logarithm compresses them into a range that is practical to show as a grayscale image. The result is not guaranteed to lie within [0, 255]; plt.imshow() rescales it for display.

The formula used here is:

result = 20*np.log(cv2.magnitude(real part, imaginary part))

The following code helps readers observe this processing. It performs a Fourier transform on the image "lena", calculates the magnitude, and log-scales it:

import numpy as np
import cv2
img = cv2.imread('lena.png',0)
dft = cv2.dft(np.float32(img),flags = cv2.DFT_COMPLEX_OUTPUT)
print(dft)
dftShift = np.fft.fftshift(dft, axes=(0, 1))  # shift rows and columns only, not the real/imaginary channel axis
print(dftShift)
result = 20*np.log(cv2.magnitude(dftShift[:,:,0],dftShift[:,:,1]))
print(result)

The resulting value ranges are shown in Figure 14-14. Among them:

  • The figure on the left shows the spectral values obtained by the function cv2.dft(), which are composed of real and imaginary parts.
  • The middle figure shows the spectral magnitude values calculated by the function cv2.magnitude(), which lie far outside the standard image grayscale space [0, 255].
  • The figure on the right shows the result of log-scaling the spectral magnitude values calculated by the function cv2.magnitude(); the values now fall in a much narrower range that suits grayscale display.

[Figure 14-14: value ranges at each stage of the processing]

Example: Use OpenCV function to perform Fourier transform on an image and display its spectral information

import numpy as np
import cv2
import matplotlib.pyplot as plt
img = cv2.imread('lena.png',0)
dft = cv2.dft(np.float32(img),flags = cv2.DFT_COMPLEX_OUTPUT)
dftShift = np.fft.fftshift(dft, axes=(0, 1))  # shift rows and columns only, not the real/imaginary channel axis
result = 20*np.log(cv2.magnitude(dftShift[:,:,0],dftShift[:,:,1]))
plt.subplot(121),plt.imshow(img, cmap = 'gray')
plt.title('original'),plt.axis('off')
plt.subplot(122),plt.imshow(result, cmap = 'gray')
plt.title('result'), plt.axis('off')
plt.show()

Running result:

  • The left image is the original image.
  • The image on the right is a spectrum image, which is the result of using the function np.fft.fftshift() to shift the zero-frequency component to the center of the spectrum image.

[Figure: the original image and its spectrum image]

Implement the inverse Fourier transform

In OpenCV, the function cv2.idft() implements the inverse Fourier transform; it is the inverse of the Fourier transform function cv2.dft(). Its syntax format is:

Return result = cv2.idft(original data)

After Fourier transforming an image, the zero-frequency component is usually shifted to the center of the spectral image. If the
zero-frequency component is moved by using the function numpy.fft.fftshift(), then before inverse Fourier transform, the function
numpy.fft.ifftshift() should be used to restore the zero-frequency component to its original position.

Also note that after the inverse Fourier transform, the resulting value is still a complex number, and
its magnitude needs to be calculated using the function cv2.magnitude().

Example: Use the OpenCV function to perform Fourier transform and inverse Fourier transform on the image, and display the original image and the image obtained after the inverse Fourier transform.

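import numpy as np
import cv2
import matplotlib.pyplot as plt
img = cv2.imread('lena.png',0)
# Fourier transform of the image (two-channel result: real and imaginary parts)
dft = cv2.dft(np.float32(img),flags = cv2.DFT_COMPLEX_OUTPUT)
# Move the zero-frequency component to the center (rows and columns only)
dftShift = np.fft.fftshift(dft, axes=(0, 1))
# Move the zero-frequency component back to its original position
ishift = np.fft.ifftshift(dftShift, axes=(0, 1))
# Inverse Fourier transform; the result is still complex (two channels)
iImg = cv2.idft(ishift)

# Take the magnitude to get real values; cv2.idft() does not rescale by default,
# so the values are large, but plt.imshow() normalizes them for display
iImg= cv2.magnitude(iImg[:,:,0],iImg[:,:,1])
print(iImg.shape)
plt.subplot(121),plt.imshow(img, cmap = 'gray')
plt.title('original'), plt.axis('off')
plt.subplot(122),plt.imshow(iImg, cmap = 'gray')
plt.title('inverse'), plt.axis('off')
plt.show()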

Running result:

[Figure: the original image and the image obtained after the inverse Fourier transform]

The image on the right is the image obtained after performing Fourier transform and inverse Fourier transform on the original image img.

Origin blog.csdn.net/hai411741962/article/details/132213252