Chapter 5 Image Processing


Foreword

This chapter covers image processing topics including image pyramids, image contours, template matching, histograms, and the image Fourier transform.

1. Image Pyramid

  • What it means: An image pyramid is a technique used in image processing and computer vision. It is a collection of images in which each image has a lower resolution than the previous one, forming a pyramid-like structure. The levels are generated as progressively lower-resolution versions of the same original image.
  • Function: In image enhancement, an image can be decomposed into a series of images at different resolutions, each resolution level processed separately, and the results synthesized into an enhanced image. In object detection, pyramids enable scale-invariant detection of objects. In object tracking, an object can be searched for at different scales.

1. Gaussian Pyramid

  • Downsampling (shrinking the picture):
    1. Gaussian filtering is performed first
    2. Remove even-numbered rows and columns
  • Upsample (enlarge the picture):
    1. Pad even rows, columns with zeros
      $\begin{bmatrix} 10 & 30\\ 56 & 96 \end{bmatrix} \Rightarrow \begin{bmatrix} 10 & 0 & 30 & 0\\ 0 & 0 & 0 & 0\\ 56 & 0 & 96 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix}$
    2. Perform a Gaussian convolution on the enlarged image so that the zero values are filled in
# Upsample (enlarge)
cv2.pyrUp(src[, dst[, dstsize[, borderType]]]) -> dst
# Downsample (shrink)
cv2.pyrDown(src[, dst[, dstsize[, borderType]]]) -> dst
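
A minimal sketch of building and inspecting a couple of Gaussian pyramid levels (the file path is a placeholder; substitute your own image):

import cv2

img = cv2.imread("F:/MyOpenCV/ai.jpg")  # placeholder path

down1 = cv2.pyrDown(img)    # halve width and height (Gaussian blur + drop even rows/cols)
down2 = cv2.pyrDown(down1)  # one more pyramid level, a quarter of the original size
up1 = cv2.pyrUp(down1)      # upsample back; blurrier than the original due to lost detail

cv2.imshow("down1", down1)
cv2.imshow("up1", up1)
cv2.waitKey(0)
cv2.destroyAllWindows()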

2. Laplacian Pyramid

$I_{i+1} = I_i - \mathrm{pyrUp}(\mathrm{pyrDown}(I_i))$
Applying this formula at each level yields the images of the Laplacian pyramid.
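
A sketch of computing one Laplacian layer from the formula above. The dstsize argument to cv2.pyrUp is my addition to guard against odd image dimensions, so the subtraction lines up:

import cv2

img = cv2.imread("F:/MyOpenCV/ai.jpg")

down = cv2.pyrDown(img)
# Upsample back to the original size; dstsize is (width, height)
up = cv2.pyrUp(down, dstsize=(img.shape[1], img.shape[0]))

# Laplacian layer: original minus the blurred reconstruction (keeps the lost detail)
laplacian = cv2.subtract(img, up)

cv2.imshow("laplacian", laplacian)
cv2.waitKey(0)
cv2.destroyAllWindows()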

Laplacian pyramid and Gaussian pyramid are two important concepts of image pyramid, and they have the following differences:

  1. The algorithm steps are different: the Gaussian pyramid is obtained by repeatedly Gaussian-filtering and downsampling the original image, with each downsampling halving the image size. The Laplacian pyramid is obtained from the Gaussian pyramid by upsampling each level and subtracting the result from the corresponding higher-resolution level.

  2. Different pyramid purposes: Gaussian pyramids are mainly used for image downsampling (reduction) and scale-space analysis, and can be used for scale-invariant feature description of images, as in the SIFT and SURF algorithms. The Laplacian pyramid is mainly used for image enhancement, edge detection, and image fusion.

  3. Pyramid layers and size: the number of layers of a Gaussian pyramid depends on the size of the original image; the larger the image, the more layers. The Laplacian pyramid has the same number of layers and level sizes as the Gaussian pyramid it is derived from.

  4. Local feature information: the Laplacian pyramid stores the detail removed between Gaussian levels, so it is reversible (the original image can be reconstructed from it) and usually provides richer local feature information than the Gaussian pyramid alone.

Therefore, according to different application scenarios, suitable pyramid structures such as Gaussian pyramid or Laplacian pyramid can be selected to process images.

2. Image Contours

1. Contour extraction

contours, hierarchy = cv2.findContours(image, mode, method[, contours[, hierarchy[, offset]]])

Where:

  • image: The input image should be a binary image, the value of each pixel is either 0 or 255, representing two colors of foreground and background.
  • mode: Contour retrieval mode, which is an enumeration value, and its value range includes:
    • cv2.RETR_EXTERNAL: Only detect the outermost contour, that is, only return the edge contour;
    • cv2.RETR_LIST: Detect all contours and return their complete list;
    • cv2.RETR_CCOMP: Detect all contours, but only return two-level contour structure, namely outer contour and inner contour;
    • cv2.RETR_TREE: Detect all contours and return the complete contour tree structure.
  • method: Contour approximation method, which is an enumeration value, and the value range includes:
    • cv2.CHAIN_APPROX_NONE: Store all contour points, and the coordinates (x, y) of each point are stored;
    • cv2.CHAIN_APPROX_SIMPLE: Only keep the endpoints of the contour, just store their coordinates, for example, only need the coordinates of 4 points for a rectangular contour.
  • contours: Output parameter, returns the detected contours, each contour is a Numpy array represented by pixel coordinates.
  • hierarchy: Output parameter, returns the hierarchical relationship of the contours. Each contour is described by 4 values: (index of the next contour at the same level, index of the previous contour at the same level, index of the first child contour, index of the parent contour).
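
A minimal sketch of contour extraction, assuming OpenCV 4.x (where findContours returns two values) and a placeholder file path:

import cv2

img = cv2.imread("F:/MyOpenCV/hello.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Binarize first: findContours expects a binary image
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
print("number of contours:", len(contours))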

2. Contour drawing

cv2.drawContours() is a function provided by OpenCV that draws contours on an image.

The syntax of this function is as follows:

cv2.drawContours(image, contours, contourIdx, color, thickness=None, lineType=None, hierarchy=None, maxLevel=None, offset=None)

The parameters are explained as follows:

  • image: The image to draw the outline on.
  • contours: A list of contour point sets, as obtained from cv2.findContours(). This parameter can be a single contour (a collection of points) or a list of multiple contours.
  • contourIdx: Specifies the number of the contour to be drawn. If negative, it means to draw all contours.
  • color: Outline color, given as a BGR tuple (OpenCV uses BGR channel order).
  • thickness: Contour line thickness, default value is 1.
  • lineType: Line type; the default is 8-connected (cv2.LINE_8), and it can also be set to 4-connected (cv2.LINE_4) or anti-aliased (cv2.LINE_AA).
  • hierarchy: Contour level information, optional parameter.
  • maxLevel: The maximum level of contours that can be drawn, optional parameter.
  • offset: Offset for contour calculation, optional parameter.
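
Continuing the findContours sketch above, a minimal drawing example:

# Draw all contours in green on a copy, so the source image stays intact
canvas = img.copy()
cv2.drawContours(canvas, contours, -1, (0, 255, 0), 2)  # contourIdx=-1 draws every contour

cv2.imshow("contours", canvas)
cv2.waitKey(0)
cv2.destroyAllWindows()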

3. Contour features

In OpenCV, many features can be used to describe and analyze the contours found in an image. Some of the main contour features are listed below:

  1. Contour area: the size of the region enclosed by the contour curve, computed with cv2.contourArea().

  2. Contour perimeter: the length of the contour curve, computed with cv2.arcLength().

  3. Contour approximation: approximate the contour shape with fewer points, done with cv2.approxPolyDP().

  4. Contour centroid: the centroid coordinates of the region enclosed by the contour, computed from the moments returned by cv2.moments().

  5. Contour direction: the orientation of the contour, computed by fitting an ellipse with cv2.fitEllipse() or a minimum-area rectangle with cv2.minAreaRect().

  6. Contour convex hull: the convex shape containing all points of the contour, computed with cv2.convexHull().

  7. Contour defects: the gaps between the convex hull and the contour curve, computed with cv2.convexityDefects().

These contour features can be used in combination to analyze contour shapes and features in images. In practical applications, contour features are often used for tasks such as object detection, image classification, and image recognition.
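
A sketch of computing a few of these features, reusing the contours list from the extraction example above:

cnt = contours[0]  # take one contour from the findContours sketch

area = cv2.contourArea(cnt)           # 1. area
perimeter = cv2.arcLength(cnt, True)  # 2. perimeter (True = closed contour)

M = cv2.moments(cnt)                  # 4. centroid from the spatial moments
if M["m00"] != 0:
    cx, cy = M["m10"] / M["m00"], M["m01"] / M["m00"]

hull = cv2.convexHull(cnt)            # 6. convex hull

print(area, perimeter)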

4. Contour Approximation

  • Principle (Douglas–Peucker algorithm): first connect the two endpoints A and B of the curve, then find the point C on arc AB whose distance d to the line AB is the largest. If d is less than the specified threshold, the line AB replaces the arc AB; otherwise, split the curve at C and apply the same test to arcs AC and CB.

In OpenCV, the cv2.approxPolyDP() function approximates a contour to reduce the number of contour points and simplify the curve, improving the efficiency of image processing. The syntax of this function is as follows:

cv2.approxPolyDP(curve, epsilon, closed[, approxCurve])

The parameters are explained as follows:

  • curve: The input contour, generally a list or Numpy array of points.
  • epsilon: The degree of approximation, i.e. the maximum allowed distance between the original contour and its approximation.
  • closed: Whether the curve is closed; True means the curve is closed, otherwise it is an open curve.
  • approxCurve: Optional output array for the approximated contour. If no output array is specified, the function returns the array of approximated points.

The return value is the array of points of the approximated contour; the number of points on the approximated curve can be read from the size of the output array. The value of epsilon depends on many factors and must be set according to the specific situation.

The result of the approximation is sensitive to this parameter. If epsilon is set too large, the approximation is too aggressive and important contour information may be lost; if it is set too small, too many contour points are retained, which increases the amount of computation and may lead to false or missed detections of the contour. Therefore, when performing contour approximation, the parameter must be tuned to the specific situation to achieve the best result.
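
A minimal sketch, again reusing the contours list from above; the 2% of the perimeter used for epsilon is an arbitrary starting point, not a recommendation from the source:

cnt = contours[0]
# A common heuristic: epsilon as a fraction of the perimeter
epsilon = 0.02 * cv2.arcLength(cnt, True)
approx = cv2.approxPolyDP(cnt, epsilon, True)
print("points before:", len(cnt), "after:", len(approx))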

5. Contour marking

  • Function: Mark the outline with a shape (rectangle, circle, etc.).
# Background canvas
canvabg = img.copy()
# Get a contour
cnt0 = contours[0]
# Bounding rectangle
startx, starty, width, height = cv2.boundingRect(cnt0)
cv2.rectangle(canvabg, (startx, starty), (startx + width, starty + height), (0, 255, 0), 2)
# Get another contour
cnt1 = contours[2]
# Minimum enclosing circle
(cx, cy), radius = cv2.minEnclosingCircle(cnt1)
cv2.circle(canvabg, (int(cx), int(cy)), int(radius), (255, 0, 0), 2)


3. Template matching

  • Idea: Slide the template image over the searched image like a convolution kernel; at each position, a specific matching algorithm computes a confidence score, and the position of the template in the searched image is determined from these scores.

In OpenCV, image matching can be performed with the cv2.matchTemplate() function. Image matching means finding the location(s) of a region of interest (usually a template image) within a larger image.

The syntax of cv2.matchTemplate() is as follows:

cv2.matchTemplate(image, templ, method[, result[, mask]]) → result

The parameters are explained as follows:

  • image: The input image should be an 8-bit or 32-bit floating-point grayscale image.
  • templ: template image, the image region to look for in the input image. Has the same data type and number of channels as the input image.
  • method:
    • Squared Difference Matching ( cv2.TM_SQDIFF): This matching method compares the square of the difference between the pixels in the input image and the template image in turn, and returns the sum of the differences. The smaller the matching result, the higher the matching degree.

    • Normalized square difference matching ( cv2.TM_SQDIFF_NORMED): This matching method is similar to the square difference matching method, but the matching results will be normalized, that is, the smaller the matching result, the higher the matching degree.

    • Correlation matching ( cv2.TM_CCORR): This matching method performs a cross-correlation operation on the input image and the template image, and returns the maximum value of the correlation coefficient. The larger the matching result, the higher the matching degree.

    • Normalized related matching ( cv2.TM_CCORR_NORMED): This matching method is similar to the related matching method, but the matching results are normalized, that is, the larger the matching result, the higher the matching degree.

    • Coefficient Matching ( cv2.TM_CCOEFF): This matching method calculates the correlation coefficient between the input image and the template image, and then finds the maximum value from the correlation coefficient image. The larger the matching result, the higher the matching degree.

    • Normalized coefficient matching ( cv2.TM_CCOEFF_NORMED): Computes the normalized correlation coefficient; the closer the value is to 1, the better the match.

  • result: The matching result. Generally, this parameter does not need to be specified, and the function will be created automatically. Its size is the size of the input image minus the size of the template image plus 1, which is a two-dimensional array.
  • mask: mask, if specified, template matching will only be done within the masked area.
import numpy as np
import cv2


img = cv2.imread("F:/MyOpenCV/ai.jpg")
imgTmp = cv2.imread("F:/MyOpenCV/aitemp.jpg")

# With TM_SQDIFF_NORMED, the smallest value marks the best match
result = cv2.matchTemplate(img, imgTmp, cv2.TM_SQDIFF_NORMED)
minVal, maxVal, minLoc, maxLoc = cv2.minMaxLoc(result)
x, y = minLoc
h, w, t = imgTmp.shape
# Draw a rectangle of the template's size at the best-match position
cv2.rectangle(img, minLoc, (x + w, y + h), (255, 0, 0), 2)

cv2.imshow("img", img)
cv2.imshow("imgsim", imgTmp)
cv2.waitKey(0)
cv2.destroyAllWindows()
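
Which extremum marks the best match depends on the method: the example uses minLoc because TM_SQDIFF_NORMED treats smaller values as better. A small helper (hypothetical, not part of OpenCV) capturing this rule:

import cv2

def best_match_loc(method, minLoc, maxLoc):
    # For TM_SQDIFF / TM_SQDIFF_NORMED the minimum marks the best match;
    # for the correlation-based methods the maximum does.
    if method in (cv2.TM_SQDIFF, cv2.TM_SQDIFF_NORMED):
        return minLoc
    return maxLoc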


4. Histogram

1. Contrast

  • Definition: Contrast is a measure of the difference between the lightest and darkest areas of an image, i.e. the degree of difference between white and black. In digital image processing, contrast is adjusted by changing the brightness and color values of the pixels. Higher contrast makes an image look crisper and more vivid, while lower contrast makes it appear soft, blurry, or washed out.
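
As a sketch, a simple linear contrast adjustment can be done with cv2.convertScaleAbs, where alpha scales contrast and beta shifts brightness; the specific values here are arbitrary illustrations, and the file path is a placeholder:

import cv2

img = cv2.imread("F:/MyOpenCV/hello.jpg")
# new_pixel = |alpha * pixel + beta|, saturated to [0, 255]
higher_contrast = cv2.convertScaleAbs(img, alpha=1.5, beta=0)
lower_contrast = cv2.convertScaleAbs(img, alpha=0.6, beta=40)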

2. Draw a histogram

The abscissa of the histogram is the value range of the pixel channel value, and the ordinate is the number of occurrences of the value.

cv2.calcHist() can be used to compute the histogram of a grayscale image as well as of a color image.

The syntax of cv2.calcHist() is as follows:

hist = cv2.calcHist(images, channels, mask, histSize, ranges[, hist[, accumulate]])

The parameters are described as follows:

  • images: The input image(s), given as numpy arrays. A grayscale image is two-dimensional; a color image is three-dimensional, with the third dimension being the color channels. The images must be passed as a list, so even a single image must be wrapped in a list.
  • channels: List of indices of the channels to count. For a grayscale image this is [0]; for a color image, use [0], [1], or [2] to select the B, G, or R channel (passing several indices computes a multi-dimensional histogram).
  • mask: An optional mask image specifying which pixels participate in the histogram calculation. A pixel is counted only if the mask value at its position is non-zero.
  • histSize: The number of bins in the histogram, i.e. the number of intervals, given as a list with one element per channel.
  • ranges: The pixel value range of the histogram, given as a list covering each channel.
  • hist: Optional output histogram array.
  • accumulate: Optional accumulation flag.

cv2.calcHist() returns a numpy array holding the computed histogram. For a single channel it has shape (histSize, 1); if several channels are passed, a multi-dimensional histogram is returned.

import cv2
import matplotlib.pyplot as plt

img = cv2.imread("F:/MyOpenCV/hello.jpg")
b, g, r = cv2.split(img)  # color channels, used for per-channel histograms below
imgGray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

histGray = cv2.calcHist([imgGray], [0], None, [256], [0, 256])

# Plot the histogram
plt.plot(histGray)
plt.xlim([0, 256])
plt.xlabel("Gray level")
plt.ylabel("Pixel count")
plt.show()
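
The b, g, r channels split off above can be plotted the same way; a minimal sketch reusing those variables:

# Per-channel histograms of the color image, reusing b, g, r from cv2.split above
for channel, color in zip((b, g, r), ("b", "g", "r")):
    hist = cv2.calcHist([channel], [0], None, [256], [0, 256])
    plt.plot(hist, color=color)
plt.xlim([0, 256])
plt.show()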


Note: You can also use matplotlib's plt.hist(data, bins), where data must be a one-dimensional array of pixel values, so flatten the image first with ravel(), e.g. plt.hist(img.ravel(), 256).

3. Equalization

3.1 Theory

  • Purpose: Transform the original image so that the gray values in the histogram of the new image are spread uniformly. Gray levels occupied by many pixels are stretched apart, and gray levels occupied by few pixels are compressed, making the image clearer. Ideally every gray level would have the same probability after the transform, but in practice the distribution can never be made perfectly uniform.

3.2 Code

cv2.equalizeHist(src[, dst]) -> dst
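
A minimal usage sketch (equalizeHist expects a single-channel image; the file path is a placeholder):

import cv2

img = cv2.imread("F:/MyOpenCV/hello.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # equalizeHist needs one channel
equalized = cv2.equalizeHist(gray)

cv2.imshow("before", gray)
cv2.imshow("after", equalized)
cv2.waitKey(0)
cv2.destroyAllWindows()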


4. CLAHE

  • Histogram equalization problem:

    • Equalization is global, so bright areas may become even brighter and dark areas even darker, losing detail.
    • May lead to amplification of noise.
  • Idea: Split the picture, then equalize each part separately, and limit the histogram probability distribution of each part.

  • Algorithm implementation:

    1. Image Blocking
    2. Find the center point of each block
    3. Compute the gray histogram of each block separately, clipping it at the threshold
    4. Equalize the center point of each block according to the histogram equalization algorithm. Equalizing only the center points is done to speed up the computation.
    5. From the equalized gray values of the center points, compute the gray values of the remaining pixels of each block by interpolation.
  • code:

    # Create the adaptive equalization object
    # clipLimit: threshold; 1 means no limiting. The larger the value, the higher the contrast
    # tileGridSize: how the image is split into tiles
    clahe = cv2.createCLAHE([, clipLimit[, tileGridSize]]) -> retval
    # Apply adaptive equalization to a pixel channel
    dst = clahe.apply(src)

    The cv2.createCLAHE function accepts two parameters, clipLimit and tileGridSize:

    • clipLimit: The contrast-limiting threshold; the larger the value, the stronger the contrast enhancement;
    • tileGridSize: The size of the grid of rectangular tiles the image is divided into, given as a tuple such as (8, 8).
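
A minimal end-to-end sketch; the clipLimit and tileGridSize values are arbitrary illustrations, and the file path is a placeholder:

import cv2

img = cv2.imread("F:/MyOpenCV/hello.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
dst = clahe.apply(gray)  # adaptive equalization of the single channel

cv2.imshow("CLAHE", dst)
cv2.waitKey(0)
cv2.destroyAllWindows()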

5. Image Fourier Transform

5.1 Sine plane wave


  • Intuitive definition: Stretch a one-dimensional sine curve along a direction in the plane to obtain a three-dimensional waveform, project the waveform's amplitude variation onto the two-dimensional plane, and draw the result as a grayscale image: peaks are white (255), troughs are black (0), and the values in between are gray transitions.
  • Math parameters:
    • Sine wave: frequency $w$, amplitude $A$, phase $\varphi$
    • Stretching direction: in two-dimensional coordinates, the vector $\vec{n} = (u, v)$

5.2 Two-dimensional Fourier transform

  • Idea: The two-dimensional Fourier transform treats the two-dimensional data as a superposition of countless sine plane waves.
  • Discrete Fourier transform formula:
    $F(u,v) = \frac{1}{MN}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)\, e^{-i 2\pi \left(\frac{ux}{M} + \frac{vy}{N}\right)}$
  • Explanation of parameters: $M$ and $N$ are the image width and height, $f(x, y)$ is the pixel value at $(x, y)$, and $F(u, v)$ is the complex coefficient of the plane wave with frequency indices $(u, v)$.

5.3 The Two-Dimensional Fourier Transform Result $F(u, v)$

  • $(u, v)$: the stretching-direction vector of the plane wave
  • $w = \sqrt{u^2 + v^2}$: the magnitude of the vector, which is the frequency of the sine wave
  • $F(u, v)$: a complex number that encodes the amplitude $A$ and phase $\varphi$ of the sine wave.

5.4 Implementing the Fourier Transform

OpenCV provides cv2.dft(src[, dst[, flags[, nonzeroRows]]]) -> dst for the Fourier transform; the parameters are as follows:

  • src: The input single-channel image, which must be floating-point
  • dst: Output in complex form, the same size as src
  • flags: Additional transform options; usually set to cv2.DFT_COMPLEX_OUTPUT so the output is complex
  • nonzeroRows: A hint that only the first nonzeroRows rows of the input contain non-zeros, letting the function skip the rest for speed
  • The return value is a two-channel array: the first channel is the real part, and the second channel is the imaginary part
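
A sketch of turning the two-channel result into a viewable spectrogram; the log scaling is my addition (a common choice, since the raw magnitudes span many orders of magnitude), and the file path is a placeholder:

import numpy as np
import cv2

img = cv2.imread("F:/MyOpenCV/ai.jpg", cv2.IMREAD_GRAYSCALE)
f = np.float32(img)  # cv2.dft needs floating-point input

dft = cv2.dft(f, flags=cv2.DFT_COMPLEX_OUTPUT)

# Magnitude of the complex result, log-scaled for display (+1 avoids log(0))
magnitude = 20 * np.log(cv2.magnitude(dft[:, :, 0], dft[:, :, 1]) + 1)
spectrum = np.uint8(magnitude / magnitude.max() * 255)

cv2.imshow("spectrum", spectrum)
cv2.waitKey(0)
cv2.destroyAllWindows()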


Note: since the discrete Fourier transform of a real image has conjugate symmetry, half of the spectrum is redundant; it is a mirrored (conjugated) copy of the other half.

  • Spectrum centering:
    shifting the zero-frequency component to the center of the spectrum makes filtering operations convenient.
    # Center the spectrum
    shiftA = np.fft.fftshift(A)

insert image description here

5.5 Fourier filtering

  • Idea:
    1. Apply the Fourier transform to the grayscale image to obtain the frequency-domain result
    2. Set the Fourier coefficients of the frequencies to be removed to 0 + 0i
    3. Apply the inverse Fourier transform to the modified result

5.5.1 Low-pass filtering

Keep only the low-frequency part (the center of the shifted spectrum) and set the high-frequency coefficients to zero:

import numpy as np
import cv2


img = cv2.imread("F:/MyOpenCV/ai.jpg")
yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
yfloat = np.float32(yuv[:, :, 0])  # effectively the grayscale (luminance) channel

dft = cv2.dft(yfloat, flags=cv2.DFT_COMPLEX_OUTPUT)  # compute the spectrum
dftShift = np.fft.fftshift(dft)  # center the spectrum

centerRow = int(dftShift.shape[0] / 2)  # center row
centerCol = int(dftShift.shape[1] / 2)  # center column

mask = np.zeros(dftShift.shape, dtype=np.uint8)  # build a mask
mask[centerRow - 30 : centerRow + 30, centerCol - 30 : centerCol + 30, :] = 1
dftShift = dftShift * mask  # element-wise multiply: keep only the central low frequencies

dft = np.fft.ifftshift(dftShift)  # undo the centering first
idft = cv2.idft(dft)  # then apply the inverse Fourier transform

iyDft = cv2.magnitude(idft[:, :, 0], idft[:, :, 1])  # magnitude: complex -> real
iy = np.uint8(iyDft / iyDft.max() * 255)  # rescale to 0-255

yuv[:, :, 0] = iy
imgRes = cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR)

cv2.imshow("imgRes", imgRes)
cv2.waitKey(0)
cv2.destroyAllWindows()


5.5.2 High-pass filtering

Similar to low-pass filtering; only the mask changes, and the effect is shown below (the grayscale result looks a bit eerie).
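
A sketch of the only lines that change relative to the low-pass code above (reusing its variables): invert the mask so the central low-frequency block is zeroed instead of kept.

# High-pass: keep everything except a square around the spectrum center
mask = np.ones(dftShift.shape, dtype=np.uint8)
mask[centerRow - 30 : centerRow + 30, centerCol - 30 : centerCol + 30, :] = 0
dftShift = dftShift * mask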

Summary

I have been busy with a lab project recently, so this article came together slowly; I will finish polishing it over the next couple of days.


Origin blog.csdn.net/zhanghongbin159/article/details/130607590