Image Processing Practice 01-OpenCV Getting Started Guide

Python OpenCV Getting Started Guide

OpenCV is a powerful computer vision library for processing image and video data and for tasks such as object detection and tracking. In this guide, you will learn how to write OpenCV code in Python for basic and advanced image processing and analysis.

Learning OpenCV helps you master fundamental image processing techniques, including image reading and processing, thresholding, morphological operations, template matching, filtering, graphics processing, video processing, and face detection. These techniques are the foundations of computer vision and image processing, and they also underlie convolutional neural networks: by learning OpenCV you can better understand how convolutional neural networks work and where they apply. OpenCV is also a very popular image processing library, and mastering it helps you process and analyze image data more effectively.

Reference book: Python OpenCV from Entry to Mastery

Install OpenCV

Before we start writing OpenCV code, we need to install the OpenCV library first. We can install it through the pip package manager:

pip install opencv-python

You can also use conda or micromamba to create a virtual environment and install a notebook environment in it.

Print the OpenCV version

import cv2

print("OpenCV version:")
print(cv2.__version__)

Output:

OpenCV version:
4.7.0

Basics

Image reading and display

Before we start working with images, we need to learn how to read and display them. The code below demonstrates how to read and display an image using the OpenCV library:

import cv2

# read image
img = cv2.imread('image.jpg')

# display image with OpenCV
cv2.imshow('Image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

In the above code, we first read an image file named image.jpg using the cv2.imread() function. We then display it with cv2.imshow(), and use cv2.waitKey() and cv2.destroyAllWindows() to wait for the user to press any key and then close the display window.

Note that if you execute waitKey(0) in a notebook, the window may fail to display on the second run. You can instead pass a timeout, e.g. cv2.waitKey(3), so that the call ends automatically after the specified time.

import cv2
import matplotlib
import matplotlib.pyplot as plt

matplotlib.rcParams['font.sans-serif'] = ['SimHei']  # use SimHei so Chinese titles render
matplotlib.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
img = cv2.imread('image.jpg')
rgbimg = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV stores pixels in BGR order
plt.title('Image')
plt.imshow(rgbimg)
plt.show()

Crop image

Starting at the coordinates (200, 100) of the upper left corner, crop a rectangular area with a width of 400 pixels and a height of 400 pixels.

cropped = img[100:500, 200:600]

Pixel operations

In OpenCV, an image can be represented as a multi-dimensional array, where each element is a number representing a pixel value. The dimensions of the array depend on the size of the image and the number of channels. For a color image of size height × width, the array shape is (height, width, 3), where 3 is the number of color channels, namely BGR. BGR refers to the blue, green, and red channels: in OpenCV the color channels of an image are arranged in the order B, G, R.

To access and modify pixel values in an image, you can use NumPy array indexing, for example:

import numpy as np

img = cv2.imread("image.jpg")

# get the image height and width
height, width = img.shape[:2]

# get the BGR value of one pixel
# In OpenCV, pixels are accessed as img[y, x], where y is the row coordinate and
# x is the column coordinate. So in img[20, 100] below, 20 is y and 100 is x.
b, g, r = img[20, 100]

# set the BGR value of one pixel
img[100, 100] = (255, 255, 255)

# get all pixel values of one channel
blue_channel = img[:, :, 0]
green_channel = img[:, :, 1]
red_channel = img[:, :, 2]

# modify all pixel values of one channel
img[:, :, 0] = 0  # set the blue channel to 0

Note that in OpenCV, the channel order is BGR instead of RGB.

Color spaces and channels

OpenCV supports multiple color spaces, such as RGB, HSV, YCrCb, Lab, etc. Different color spaces correspond to different channels. For example, the RGB color space has three channels, namely red, green, and blue channels. In order to perform image processing, we usually need to convert the color space and channels of the image.
The following are some commonly used color space and channel conversion functions in OpenCV:

  1. cv2.cvtColor(src, code[, dst[, dstCn]]): Convert an image from one color space to another. Among them, src is the input image, code is the color space conversion code, dst is the output image, and dstCn is the number of channels of the output image.

For example, convert an image in BGR format to a grayscale image:


img = cv2.imread('test.jpg')
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

  2. cv2.split(src[, mv]): Split a multi-channel image into single-channel images. Here, src is the input image and mv is an optional list of output single-channel images.

For example, split an image in BGR format into three channels:


img = cv2.imread('test.jpg')
b, g, r = cv2.split(img)

  3. cv2.merge(mv[, dst]): Merge multiple single-channel images into one multi-channel image. Here, mv is a list of single-channel images, and dst is the output multi-channel image.

For example, merge three single-channel images into a BGR format image:


b = cv2.imread('test_b.jpg', cv2.IMREAD_GRAYSCALE)
g = cv2.imread('test_g.jpg', cv2.IMREAD_GRAYSCALE)
r = cv2.imread('test_r.jpg', cv2.IMREAD_GRAYSCALE)
img = cv2.merge([b, g, r])

  4. cv2.addWeighted(src1, alpha, src2, beta, gamma[, dst]): Blend two images with given weights. Here, src1 and src2 are the two input images, alpha and beta are their weights, gamma is a brightness offset added to the result, and dst is the output blended image.

For example, fuse two grayscale images in a ratio of 1:2:


img1 = cv2.imread('test1.jpg', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('test2.jpg', cv2.IMREAD_GRAYSCALE)
img = cv2.addWeighted(img1, 1, img2, 2, 0)

The BGR color space is based on the three primary colors: blue, green, and red.

The HSV color space is based on hue, saturation, and value (brightness).

Hue (H) is the color of light; for example, red, orange, yellow, green, cyan, blue, and purple in a rainbow are different hues. In OpenCV, hue takes values in the interval [0, 180]; for example, the hue values for red, yellow, green, and blue are 0, 30, 60, and 120 respectively.
Saturation (S) is the depth of a color. In OpenCV, saturation takes values in the interval [0, 255]; when the saturation is 0, the image becomes grayscale.
Value (V) is the brightness or darkness of light. Like saturation, it takes values in the interval [0, 255]; the larger the value, the brighter the image, and when the value is 0 the image is pure black.
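As a quick check of these ranges, here is a minimal sketch (the file name image.jpg and the exact bounds are illustrative assumptions) that converts a BGR image to HSV and masks out roughly-red pixels:

import cv2
import numpy as np

img = cv2.imread('image.jpg')               # BGR image
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)  # convert to the HSV color space

# illustrative "red" range: hue near 0, reasonably saturated and bright
lower = np.array([0, 100, 100])
upper = np.array([10, 255, 255])
mask = cv2.inRange(hsv, lower, upper)       # 255 where a pixel falls in the range

red_only = cv2.bitwise_and(img, img, mask=mask)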

Image geometric transformation

OpenCV provides many basic image transformation functions, which can be used for image resizing, rotation, translation, cropping and other operations. The following code demonstrates how to use these functions:

Scale image

Reduce the img image object to half its size and assign the result to resized:

resized = cv2.resize(img, (int(img.shape[1]/2), int(img.shape[0]/2)))

Affine transformation

Affine transformation is a geometric deformation confined to a two-dimensional plane. The transformed image preserves the "straightness" and "parallelism" of lines: straight lines remain straight lines after the transformation, and parallel lines remain parallel. Common affine effects include translation, rotation, and tilt (shear).
OpenCV uses the cv2.warpAffine() method to achieve the affine transformation effect. Its syntax is as follows:

 dst = cv2.warpAffine(src, M, dsize, flags, borderMode, borderValue)

Parameter description:
 - src: original image.
 - M: A matrix with 2 rows and 3 columns. The pixel position in the original image is transformed according to the value of this matrix.
 - dsize: The size of the output image.
 - flags: optional parameter, interpolation method, default value is recommended.
 - borderMode: optional parameter, border type, it is recommended to use the default value.
 - borderValue: optional parameter, border value, default is 0, it is recommended to use the default value.
Return value description:

- dst: Output image after the affine transformation.

M is also called the affine matrix. It is a 2×3 array with the following format:

 M = [[a, b, c],[d, e, f]]

What kind of affine transformation is performed on the image depends entirely on the value of M. The image output by the affine transformation is calculated according to the following formula:

 new_x = x × a + y × b + c
 new_y = x × d + y × e + f

Here, x and y are the coordinates of a pixel in the original image, and new_x and new_y are the coordinates of the same pixel in the new image after the affine transformation.

Pan the image

Translation moves all pixels in the image horizontally or vertically at the same time. To achieve this effect, set M in the following format:

 M = [[1, 0, horizontal shift], [0, 1, vertical shift]]

The pixels of the original image are then transformed according to:

 new_x = x × 1 + y × 0 + horizontal shift = x + horizontal shift
 new_y = x × 0 + y × 1 + vertical shift = y + vertical shift

M = np.float32([[1, 0, 100], [0, 1, 50]])
translated = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))


Rotate image

First, get the number of rows and columns of the img image object and assign them to the rows and cols variables. Then use cv2.getRotationMatrix2D() to generate a rotation matrix M, where:
- the first parameter is the rotation center, here the image center (cols/2, rows/2);
- the second parameter is the rotation angle, here 30 degrees;
- the third parameter is the scaling factor applied after rotation, here 1, meaning no scaling.
Finally, cv2.warpAffine() rotates the original image object img according to M and assigns the result to rotated. Its first parameter is the image to rotate, the second is the rotation matrix, and the third is the size of the output image, here the original width and height. The function returns the rotated image object.

rows, cols = img.shape[:2]
M = cv2.getRotationMatrix2D((cols/2, rows/2), 30, 1)
rotated = cv2.warpAffine(img, M, (cols, rows))


tilt image

OpenCV locates three points of the image to compute the tilt effect, as shown in the figure: the "upper left" point A, the "upper right" point B, and the "lower left" point C. OpenCV computes the position changes of all other pixels from the position changes of these three points. Because the "straightness" and "parallelism" of the image must be preserved, there is no need to pass the "lower right" point as a fourth parameter; its position is calculated automatically from the changes of A, B, and C.

"Flatness" means that the straight lines in the image are still straight lines after affine transformation. "Parallelism" means that parallel lines in an image are still parallel lines after affine transformation.
Tilt the image also needs to be achieved through the M matrix, but obtaining this matrix requires very complex operations, so OpenCV provides the getAffineTransform() method to automatically calculate the M matrix of the tilted image. The syntax of the getRotationMatrix2D() method is as follows:
M = cv2.getAffineTransform(src, dst)
parameter description:

- src: 3 point coordinates of the original image, formatted as a list of 32-bit floating point numbers with 3 rows and 2 columns, for example: [[0, 1], [1, 0], [1, 1]].
 - dst: 3 point coordinates of the tilted image, the format is the same as src.
Return value description:

- M: Affine matrix calculated by getAffineTransform() method.

image = cv2.imread('image.jpg')
rows, cols = image.shape[:2]
src = np.float32([[0, 0], [cols-1, 0], [0, rows-1]])   # points A, B, C in the original image
dst = np.float32([[0, 50], [cols-1, 0], [0, rows-1]])  # A is moved down by 50 pixels
M = cv2.getAffineTransform(src, dst)
destImg = cv2.warpAffine(image, M, (cols, rows))
plt.title("图像倾斜")
plt.imshow(cv2.cvtColor(destImg, cv2.COLOR_BGR2RGB))
plt.show()
perspective image

If affine allows the image to deform in a two-dimensional plane, then perspective allows the image to deform in a three-dimensional space. Observing objects from different angles, you will see different deformation pictures. For example, a rectangle will become an irregular quadrilateral, a right angle will become an acute or obtuse angle, a circle will become an ellipse, and so on. The picture after this deformation is a perspective view.

When an image is viewed from below, the eye is closer to the bottom of the image, so the width of the bottom stays the same; the top of the image is farther from the eye, so its width shrinks proportionally, and the observer sees the perspective effect shown.
In OpenCV, the perspective effect needs to be calculated by locating 4 points of the image. The positions of the 4 points are shown in Figure 7.16. OpenCV calculates the position changes of other pixels based on the position changes of these 4 points. The perspective effect cannot guarantee the "straightness" and "parallelism" of the image.
The warpPerspective() method also needs to calculate the perspective effect through the M matrix, but obtaining this matrix requires very complex operations, so OpenCV provides the getPerspectiveTransform() method to automatically calculate the M matrix. The syntax of the getPerspectiveTransform() method is as follows:

 M = cv2.getPerspectiveTransform(src, dst)

Parameter Description:

- src: coordinates of 4 points of the original image, formatted as a 4-row, 2-column list of 32-bit floating point numbers, for example: [[0, 0], [1, 0], [0, 1], [1, 1]].
 - dst: 4 point coordinates of the perspective view, the format is the same as src.
Return value description:

- M: Perspective matrix calculated by the getPerspectiveTransform() method.

rows = len(image)
cols = len(image[0])

M = cv2.getPerspectiveTransform(
    np.array([[0, 0], [cols-1, 0], [0, rows-1], [cols-1, rows-1]], dtype=np.float32),
    np.array([[100, 0], [cols-1-100, 0], [0, rows-1], [cols-1, rows-1]], dtype=np.float32)
)
dImag = cv2.warpPerspective(image, M, (cols, rows))
plt.imshow(cv2.cvtColor(dImag, cv2.COLOR_BGR2RGB))
plt.title("透视")
plt.show()


Thresholding

A threshold is a very important concept in image processing, acting like a "baseline" for pixel values. Every pixel value is compared against this baseline, with three possible outcomes: the pixel value is greater than, less than, or equal to the threshold. The program groups pixels by these outcomes and then darkens or lightens each group, making the outline of the image more distinct and easier for a computer or the naked eye to recognize.

threshold processing function

In the process of image processing, the use of thresholds makes the pixel values ​​of the image more uniform, thereby making the image effect simpler. First, convert a color image into a grayscale image, so that the pixel value range of the image can be simplified to 0~255. Then, a threshold value is used to make the converted grayscale image present a visual effect of only pure black and pure white. For example, when the threshold is 127, all pixel values ​​less than 127 are converted to 0 (that is, pure black), and all pixel values ​​greater than 127 are converted to 255 (that is, pure white). Although some grayscale details will be lost, the outline of the grayscale image subject will be more obviously preserved.

Threshold processing occupies a very important position in computer vision technology. It is one of the underlying processing logics of many advanced algorithms. Because binary images ignore details and magnify features, and many advanced algorithms analyze object features based on their contours, binary images are very suitable for complex recognition operations. Before performing the recognition operation, the image should be converted into a grayscale image and then binarized, so that the (rough) contour image of the object required by the algorithm is obtained.

The threshold() method provided by OpenCV is used to threshold the image. The syntax of the threshold() method is as follows:

 retval, dst = cv2.threshold(src, thresh, maxval, type)

Parameter Description:

- src: The image being processed, which can be a multi-channel image.
 - thresh: Threshold; values between 125 and 150 usually give the best results.
 - maxval: the maximum value used for threshold processing.
 - type: threshold processing type. Common types and meanings.
 Return value description:
 - retval: the threshold used during processing.
 - dst: Thresholded image.

In OpenCV, there are the following threshold processing types, as well as corresponding enumeration values:

  1. THRESH_BINARY: Binarization threshold processing, pixels greater than the threshold are set to the maximum value, and pixels less than or equal to the threshold are set to 0. The enumeration value is 0.
  2. THRESH_BINARY_INV: Anti-binarization threshold processing, pixels smaller than the threshold are set to the maximum value, and pixels greater than or equal to the threshold are set to 0. The enumeration value is 1.
  3. THRESH_TRUNC: Truncate threshold processing, set pixels greater than the threshold to the threshold, and pixels less than or equal to the threshold remain unchanged. The enumeration value is 2.
  4. THRESH_TOZERO: Threshold processing is 0, pixels less than the threshold are set to 0, and pixels greater than or equal to the threshold remain unchanged. The enumeration value is 3.
  5. THRESH_TOZERO_INV: Inverse threshold processing is 0, pixels greater than the threshold are set to 0, and pixels less than or equal to the threshold are unchanged. The enumeration value is 4.

Binarization

Binarization processing is also called binarization threshold processing. After this processing the image retains only two pixel values; that is, every pixel takes one of two values.

When performing binarization processing, each pixel value will be compared with the threshold, and the pixel value greater than the threshold will be changed to the maximum value, and the pixel value less than or equal to the threshold will be changed to 0. The calculation formula is as follows:

   if pixel <= threshold: pixel = 0
   if pixel > threshold:  pixel = max_value

Binarization usually uses 255 as the maximum value because, in grayscale images, 255 represents pure white, which contrasts clearly with pure black (0). After binarization the grayscale image therefore shows an "either black or white" effect.

import matplotlib
import matplotlib.pyplot as plt
import cv2
grayimage=cv2.imread("../../images/demo1.png",0) # read the image directly as grayscale
_,dst=cv2.threshold(grayimage,127,255,cv2.THRESH_BINARY)  
plt.subplot(121)
plt.imshow(grayimage,cmap="gray")
plt.title("灰度图")
plt.subplot(122)
plt.imshow(dst,cmap="gray")
plt.title("二值化图")
plt.show()

Insert image description here

Note that the larger the pixel value, the whiter it is, and the smaller the pixel value, the darker it is.

Anti-binarization processing

Inverse binarization (also called inverse binarization threshold processing) produces the opposite result of binarization: pixel values greater than the threshold become 0, and pixel values less than or equal to the threshold become the maximum value. White parts of the original image become black, and black parts become white. It is calculated as follows:

     if pixel <= threshold: pixel = max_value
     if pixel > threshold:  pixel = 0

code

_,dst=cv2.threshold(grayimage,127,255,cv2.THRESH_BINARY_INV)
plt.subplot(121)
plt.imshow(grayimage,cmap="gray")
plt.title("灰度图")
plt.subplot(122)
plt.imshow(dst,cmap="gray")
plt.title("二值化图")
plt.show()

Insert image description here

OCR pipelines often highlight text through inverse binarization: it yields white text on a black background, which can then be dilated (dilation grows white regions, since white is the larger pixel value).
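A minimal sketch of that idea, assuming grayimage is the grayscale page image loaded earlier: inverse-binarize so the text becomes white, then dilate to thicken the strokes:

import cv2
import numpy as np

# grayimage: a grayscale image loaded earlier, e.g. cv2.imread(path, 0)
_, white_text = cv2.threshold(grayimage, 127, 255, cv2.THRESH_BINARY_INV)

# dilation grows white regions, so the (now white) strokes get thicker
kernel = np.ones((3, 3), np.uint8)
thick_text = cv2.dilate(white_text, kernel, iterations=1)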

Zero processing

Below-threshold zero processing

Below-threshold zero processing is also called low-threshold zero processing. This processing changes the pixel value below or equal to the threshold to 0, and the pixel value above the threshold keeps the original value. The calculation formula is as follows:

 if pixel <= threshold: pixel = 0
 if pixel > threshold:  pixel = original value

_,dst=cv2.threshold(grayimage,127,255,cv2.THRESH_TOZERO)
plt.subplot(121)
plt.imshow(grayimage,cmap="gray")
plt.title("灰度图")
plt.subplot(122)
plt.imshow(dst,cmap="gray")
plt.title("低于阈值零[置黑]处理")
plt.show()

Insert image description here

Above-threshold zero processing

Above-threshold zero processing (also called super-threshold zero processing) changes pixel values greater than the threshold to 0, while pixel values less than or equal to the threshold keep their original values. It is calculated as follows:

 if pixel <= threshold: pixel = original value
 if pixel > threshold:  pixel = 0

_,dst=cv2.threshold(grayimage,127,255,cv2.THRESH_TOZERO_INV)
plt.subplot(121)
plt.imshow(grayimage,cmap="gray")
plt.title("灰度图")
plt.subplot(122)
plt.imshow(dst,cmap="gray")
plt.title("超阈值零[置黑]处理")
plt.show()

Insert image description here

Truncation

Truncation processing is also called truncation threshold processing. It changes pixel values greater than the threshold to the threshold value, while pixels less than or equal to the threshold keep their original values. The formula is as follows:

 if pixel <= threshold: pixel = original value
 if pixel > threshold:  pixel = threshold

_,dst=cv2.threshold(grayimage,127,255,cv2.THRESH_TRUNC)
plt.subplot(121)
plt.imshow(grayimage,cmap="gray")
plt.title("灰度图")
plt.subplot(122)
plt.imshow(dst,cmap="gray")
plt.title("截断阈值处理")
plt.show()

Insert image description here

adaptive processing

OpenCV provides an improved thresholding technique: different thresholds are used for different areas in the image. This improved threshold processing technology is called adaptive threshold processing, also known as adaptive processing. The adaptive threshold is calculated according to a specified algorithm based on all pixel values ​​in a square area in the image. Compared with the five threshold processing types explained previously, adaptive processing can better handle images with uneven light and dark distribution and obtain simpler image effects.

OpenCV provides the adaptiveThreshold() method for adaptive processing of images. The syntax of the adaptiveThreshold() method is as follows:
dst = cv2.adaptiveThreshold(src, maxValue, adaptiveMethod, thresholdType, blockSize, C)
Parameter description:

  • src: the image being processed. It should be noted that the image needs to be a grayscale image.
  • maxValue: The maximum value used for threshold processing.
  • adaptiveMethod: Calculation method of adaptive threshold. The calculation method and meaning of the adaptive threshold are shown in Table 8.2.
    Adaptive threshold calculation method and its meaning
    ADAPTIVE_THRESH_MEAN_C: weights all pixels in a square area equally.
    ADAPTIVE_THRESH_GAUSSIAN_C: All pixels in a square area are weighted according to the distance between the pixel and the center point based on the Gaussian function.
  • thresholdType: threshold processing type; it should be noted that the threshold processing type must be one of cv2.THRESH_BINARY or cv2.THRESH_BINARY_INV.
  • blockSize: The size of a square area. For example, 5 refers to a 5×5 area.
  • C: constant. The threshold is equal to the mean or weighted value minus this constant.
    Return value description:
  • dst: image after threshold processing.
    Adaptive processing retains more detailed information in the image and more obviously preserves the outline of the grayscale image subject.
plt.subplot(221)
plt.imshow(grayimage,cmap="gray")
plt.title("灰度图")
plt.subplot(222)
meanImg=cv2.adaptiveThreshold(grayimage,255,cv2.ADAPTIVE_THRESH_MEAN_C,cv2.THRESH_BINARY,5,3)
plt.imshow(meanImg,cmap="gray")
plt.title("ADAPTIVE_THRESH_MEAN_C图")
plt.subplot(223)
guassImg=cv2.adaptiveThreshold(grayimage,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY,5,3)
plt.imshow(guassImg,cmap="gray")
plt.title("ADAPTIVE_THRESH_GAUSSIAN_C图")
plt.show()

Insert image description here

Otsu method

In the process of the previous five threshold processing types, the threshold set for each instance is 127, which is not calculated by algorithm. For some images, when the threshold is set to 127, the effect obtained is not good. In this case, you need to try one by one until you find the most suitable threshold.

Finding the most appropriate thresholds one by one is not only a lot of work, but also inefficient. For this purpose, OpenCV provides the Otsu method. The Otsu method can traverse all possible thresholds and find the most appropriate threshold.

The syntax of the Otsu method is basically the same as that of the threshold() method, except that when passing parameters for type, one more parameter must be passed, namely cv2.THRESH_OTSU. The function of cv2.THRESH_OTSU is to implement threshold processing of the Otsu method. The syntax of the Otsu method is as follows:
retval, dst = cv2.threshold(src, thresh, maxval, type)
parameter description:

- src: the image being processed. It should be noted that the image needs to be a grayscale image.
 - thresh: Threshold, and set the threshold to 0.
 - maxval: The maximum value used for threshold processing, which is 255.
 - type: threshold processing type. In addition to selecting a threshold processing type in Table 8.1, one more parameter must be passed, namely cv2.THRESH_OTSU. For example, cv2.THRESH_BINARY+cv2.THRESH_OTSU.
Return value description:

- retval: The most appropriate threshold calculated and used by the Otsu method.
 - dst: Thresholded image.

plt.subplot(221)
plt.imshow(grayimage,cmap="gray")
plt.title("灰度图")
plt.subplot(222)
_,dst=cv2.threshold(grayimage,127,255,cv2.THRESH_BINARY)
plt.imshow(dst,cmap="gray")
plt.title("二值化图")
plt.subplot(223)
_,otsuImg=cv2.threshold(grayimage,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
plt.imshow(otsuImg,cmap="gray")
plt.title("OTSU图")
plt.show()

Insert image description here

Advanced chapter

template matching

The template is the image of the target being searched, and the process of finding the position of the template in the original image is called template matching. The matchTemplate() method provided by OpenCV is the template matching method, and its syntax is as follows:

 result = cv2.matchTemplate(image, templ, method, mask)

Parameter Description:

  • image: original image.
  • templ: Template image, the size must be smaller than or equal to the original image.
  • method: matching method, available parameter values ​​are shown in Table 10.1.
  • mask: optional parameter. Mask, only cv2.TM_SQDIFF and cv2.TM_CCORR_NORMED support this parameter, it is recommended to use the default value.
    Return value description:
  • result: calculated matching result. If the width and height of the original image are W and H respectively, and the width and height of the template image are w and h respectively, the result is a 32-bit floating point array with W-w+1 columns and H-h+1 rows. Each floating point number in the array is the matching result of the corresponding pixel position in the original image, and its meaning needs to be interpreted according to the method parameter.
    During template matching, the template slides across the original image; at each position, the template is compared pixel by pixel with the overlapping region, and the comparison result is saved at the array position corresponding to the template's upper-left pixel.

OpenCV's matchTemplate function is used to find matches in one image to another image. During the matching process, you can choose different matching methods, which are the method parameters. Commonly used method parameters include the following:
- cv2.TM_SQDIFF: squared-difference matching, the simplest method; it computes the sum of squared differences. The smaller the value, the better the match.
- cv2.TM_SQDIFF_NORMED: normalized squared-difference matching; also computes the sum of squared differences, but normalizes the result. Smaller values mean better matches.
- cv2.TM_CCORR: correlation matching; the larger the value, the better the match.
- cv2.TM_CCORR_NORMED: normalized correlation matching; the result is normalized to [0, 1], where 1 indicates a perfect match and 0 indicates no match.
- cv2.TM_CCOEFF: correlation-coefficient matching; computes the correlation coefficient of the two images. The larger the value, the better the match.
- cv2.TM_CCOEFF_NORMED: normalized correlation-coefficient matching; normalizes the correlation coefficient.
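For single-target matching, cv2.minMaxLoc() picks the best score out of the result array. A minimal sketch, assuming hypothetical files scene.jpg and template.jpg:

import cv2

scene = cv2.imread('scene.jpg', 0)      # hypothetical file names
templ = cv2.imread('template.jpg', 0)
h, w = templ.shape

result = cv2.matchTemplate(scene, templ, cv2.TM_CCOEFF_NORMED)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)

# For TM_CCOEFF_NORMED larger is better, so take max_loc;
# for TM_SQDIFF / TM_SQDIFF_NORMED you would take min_loc instead.
top_left = max_loc
bottom_right = (top_left[0] + w, top_left[1] + h)
matched = cv2.rectangle(cv2.cvtColor(scene, cv2.COLOR_GRAY2BGR),
                        top_left, bottom_right, (0, 0, 255), 2)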

Assume we have an original image and a template image cropped from it.

# multi-target matching
image=cv2.imread("./images/2.jpg")
grayImg=cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
matchImg=cv2.imread("./images/2_match_1.jpg",0)
height,width=matchImg.shape
result=cv2.matchTemplate(grayImg,matchImg,cv2.TM_CCORR_NORMED)
showImage=image.copy()
for y in range(len(result)):
    for x in range(len(result[y])):
        if result[y][x]>0.999:  # keep only near-perfect matches
            cv2.rectangle(showImage, (x,y), (x + width, y + height), (255, 0, 0), 1)

plt.imshow(cv2.cvtColor(showImage, cv2.COLOR_BGR2RGB))

The matched regions are outlined in the displayed result.

filter

A series of processes such as removing noise in the image and reducing the level of detail information while retaining the original image information as much as possible is called image smoothing (or image blurring). The most commonly used tool for smoothing is the filter. By adjusting the parameters of the filter, you can control the smoothness of the image. OpenCV provides a wide variety of filters. Each filter uses different algorithms, but they can fine-tune the pixel values ​​in the image to give the image a smooth effect. This chapter will introduce the use of mean filter, median filter, Gaussian filter and bilateral filter.

A pixel may differ so much from its surroundings that it visibly fails to form recognizable image information with neighboring pixels, reducing the quality of the whole image. Such an "out of place" pixel is image noise. If the noise consists of random pure-black or pure-white pixels, it is called "salt-and-pepper noise". For example, an image containing only noise, as in Figure 7.1, is often said to show "snowflake" dots.
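To see what salt-and-pepper noise looks like, here is a minimal sketch that sprinkles random black and white pixels over a grayscale image (the file name and the 2% noise ratio are arbitrary assumptions):

import cv2
import numpy as np

img = cv2.imread('image.jpg', 0)        # grayscale image
noisy = img.copy()
n = int(noisy.size * 0.02)              # corrupt about 2% of the pixels

# random coordinates for "salt" (white) and "pepper" (black)
ys = np.random.randint(0, noisy.shape[0], n)
xs = np.random.randint(0, noisy.shape[1], n)
noisy[ys[: n // 2], xs[: n // 2]] = 255
noisy[ys[n // 2 :], xs[n // 2 :]] = 0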

mean filter

With a pixel as the center, its surrounding pixels form a matrix of n rows and n columns (n×n for short). In filtering operations such a matrix is called a "filter kernel", and its numbers of rows and columns determine its size: a 3×3 kernel contains 9 pixels, and a 5×5 kernel contains 25 pixels.
The mean filter (also called a low-pass filter) treats each pixel of the image in turn as the center of the filter kernel, calculates the average of all pixels covered by the kernel, and sets the center pixel to that average.
OpenCV encapsulates the mean filter into the blur() method, whose syntax is as follows:

 dst = cv2.blur(src, ksize, anchor, borderType)

Parameter Description:

  • src: the image being processed.
  • ksize: Filter kernel size, in the format (height, width). Odd side lengths with equal width and height, such as (3, 3), (5, 5), (7, 7), are recommended. The larger the filter kernel, the blurrier the processed image.
  • Anchor: Optional parameter, anchor point of the filter kernel. It is recommended to use the default value, which can automatically calculate the anchor point.
  • borderType: optional parameter, border style, the default value is recommended.
    Return value description:
  • dst: Image after mean filtering.
import matplotlib.pyplot as plt
import matplotlib
import cv2
matplotlib.rcParams['font.sans-serif'] = ['SimHei']  # use SimHei so Chinese titles render
matplotlib.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
image=cv2.imread("./images/1.png");
plt.subplot(221)
plt.title("原始图")
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.subplot(222)
plt.title("滤波核(3,3)")
plt.imshow(cv2.cvtColor(cv2.blur(image,(3,3)), cv2.COLOR_BGR2RGB))
plt.subplot(223)
plt.title("滤波核(5,5)")
plt.imshow(cv2.cvtColor(cv2.blur(image,(5,5)), cv2.COLOR_BGR2RGB))
plt.subplot(224)
plt.title("滤波核(9,9)")
plt.imshow(cv2.cvtColor(cv2.blur(image,(9,9)), cv2.COLOR_BGR2RGB))

Insert image description here

median filter

The principle of the median filter is very similar to the mean filter. The only difference is that it does not calculate the average of the pixels, but sorts all pixel values, takes out the middle pixel value, and assigns it to the core pixel.
OpenCV encapsulates the median filter into the medianBlur() method, whose syntax is as follows:

 dst = cv2.medianBlur(src, ksize)

Parameter Description:

  • src: the image being processed.
  • ksize: The side length of the filter kernel, which must be an odd number greater than 1, such as 3, 5, 7, etc. This method automatically creates a square filter kernel based on this side length.
    Return value description:
  • dst: image after median filtering.
import matplotlib.pyplot as plt
import matplotlib
import cv2
matplotlib.rcParams['font.sans-serif'] = ['SimHei']  # use SimHei so Chinese titles render
matplotlib.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
image=cv2.imread("./images/1.png");
plt.subplot(221)
plt.title("原始图")
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.subplot(222)
plt.title("滤波核(3,3)")
plt.imshow(cv2.cvtColor(cv2.medianBlur(image,3), cv2.COLOR_BGR2RGB))
plt.subplot(223)
plt.title("滤波核(5,5)")
plt.imshow(cv2.cvtColor(cv2.medianBlur(image,5), cv2.COLOR_BGR2RGB))
plt.subplot(224)
plt.title("滤波核(9,9)")
plt.imshow(cv2.cvtColor(cv2.medianBlur(image,9), cv2.COLOR_BGR2RGB))

Insert image description here

Gaussian filter

Gaussian filtering, also known as Gaussian blur or Gaussian smoothing, is currently the most widely used smoothing algorithm. Gaussian filtering can effectively reduce image noise and detail levels while retaining more image information. The processed image presents a "frosted glass" filter effect.
In mean filtering, every pixel around the center has equal weight, so a plain average suffices. In Gaussian filtering, however, pixels closer to the center carry larger weights and pixels farther away carry smaller weights; the weight diagram of a 5×5 Gaussian convolution kernel is shown in Figure 11.8. With unequal weights a plain average no longer works: more information is taken from heavily weighted pixels and less from lightly weighted ones. In short, the closer a pixel is to the center, the more it influences the result.

The calculation process of Gaussian filter involves convolution operation, and there will be a convolution kernel equal to the size of the filter kernel. This section only takes the 3×3 filter kernel as an example to briefly describe the calculation process of Gaussian filtering.

The values stored in the convolution kernel are the weights of the area covered by the kernel, following the pattern of Figure 11.8; all the weights in the kernel sum to 1. For example, a 3×3 convolution kernel can take the values shown in Figure 11.9. As the kernel size and the standard deviation σ change, the kernel values change substantially; Figure 11.9 is only the simplest case.
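You can inspect such weights yourself: cv2.getGaussianKernel() returns a 1-D kernel, and the outer product with itself gives the 2-D kernel (σ below is chosen arbitrarily; the weights sum to 1):

import cv2
import numpy as np

k1d = cv2.getGaussianKernel(3, 0.8)  # 3x1 column of Gaussian weights
k2d = np.outer(k1d, k1d)             # 3x3 kernel, largest weight at the center
print(k2d, k2d.sum())                # the weights sum to 1.0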

import matplotlib.pyplot as plt
import matplotlib
import cv2
matplotlib.rcParams['font.sans-serif'] = ['SimHei']  # use SimHei so Chinese titles render
matplotlib.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
image=cv2.imread("./images/1.png");
plt.subplot(221)
plt.title("原始图")
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.subplot(222)
plt.title("滤波核(3,3)")
plt.imshow(cv2.cvtColor(cv2.GaussianBlur(image,(3,3),0,0), cv2.COLOR_BGR2RGB))
plt.subplot(223)
plt.title("滤波核(5,5)")
plt.imshow(cv2.cvtColor(cv2.GaussianBlur(image,(5,5),0,0), cv2.COLOR_BGR2RGB))
plt.subplot(224)
plt.title("滤波核(9,9)")
plt.imshow(cv2.cvtColor(cv2.GaussianBlur(image,(9,9),0,0), cv2.COLOR_BGR2RGB))

Insert image description here

bilateral filter

Whether it is mean filtering, median filtering or Gaussian filtering, the entire image will be smoothed and the boundaries in the image will become blurred. Bilateral filtering is a filtering operation method that can effectively protect boundary information during smoothing.

A bilateral filter automatically determines whether the filter kernel is in a "flat" region or on an "edge": in flat regions it filters with an algorithm similar to Gaussian filtering; on edges it increases the weight of the edge pixels, keeping their values as unchanged as possible.

import matplotlib.pyplot as plt
import matplotlib
import cv2
matplotlib.rcParams['font.sans-serif'] = ['SimHei']  # use SimHei so Chinese titles render
matplotlib.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
image=cv2.imread("./images/1.png");
plt.subplot(221)
plt.title("原始图")
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.subplot(222)
plt.title("高斯滤波核(15,15)")
plt.imshow(cv2.cvtColor(cv2.GaussianBlur(image,(15,15),0,0), cv2.COLOR_BGR2RGB))
plt.subplot(223)
plt.title("双边滤波核(15,15)")
plt.imshow(cv2.cvtColor(cv2.bilateralFilter(image,15,120,100), cv2.COLOR_BGR2RGB))

Insert image description here

Morphological operations

Erosion and dilation are the basic operations of morphology. Besides the opening and closing operations, morphology also includes several more specialized operations. OpenCV provides the morphologyEx() method, which covers all of the commonly used operations. Its syntax is as follows:
dst = cv2.morphologyEx(src, op, kernel, anchor, iterations, borderType, borderValue)
Parameter description:

  • src: original image.
  • op: operation type, specific values ​​are shown in Table 12.1.
    The specific enumeration values ​​are as follows:
  1. MORPH_ERODE: erosion operation
  2. MORPH_DILATE: dilation operation
  3. MORPH_OPEN: opening operation
  4. MORPH_CLOSE: closing operation
  5. MORPH_GRADIENT: morphological gradient
  6. MORPH_TOPHAT: top hat operation
  7. MORPH_BLACKHAT: black hat operation
  • kernel: The core used during the operation.
  • anchor: Optional parameter, anchor point position of the kernel.
  • iterations: Optional parameter, number of iterations, default value is 1.
  • borderType: optional parameter, border style, default recommended.
  • borderValue: Optional parameter, border value, default recommended.
    Return value description:
  • dst: The image obtained after the operation.

Erosion

The erosion operation shrinks the image inward along its own boundaries. OpenCV implements the shrinkage through a "kernel", which in morphology can be understood as "a block of n pixels". The block contains an anchor (usually at the center, though it can be defined elsewhere). As the kernel moves along the edges of the image, it erases the edge pixels that the kernel overlaps without fully fitting inside. The effect resembles the process shown in Figure 12.1: like peeling a potato, the image is "thinned" layer by layer.
OpenCV encapsulates the erosion operation in the erode() method. The syntax of this method is as follows:

 dst = cv2.erode(src, kernel, anchor, iterations, borderType, borderValue)

Parameter Description:

  • src: original image.
  • kernel: The kernel used for erosion.
  • anchor: Optional parameter, anchor point position of the kernel.
  • iterations: Optional parameter, the number of erosion iterations, the default value is 1.
  • borderType: optional parameter, border style, default recommended.
  • borderValue: Optional parameter, border value, default recommended.
    Return value description:
  • dst: image after erosion.

Erosion erases some external details. Figure 12.2 shows a cartoon spider; eroding it with a 5×5 pixel block as the kernel gives the result in Figure 12.3: the spider's legs are erased as external detail, while its eyes become larger because they are eroded from the inside.

import matplotlib.pyplot as plt
import matplotlib
import cv2
import numpy as np
matplotlib.rcParams['font.sans-serif'] = ['SimHei']  # use SimHei so Chinese titles render
matplotlib.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
image=cv2.imread("./images/1.jpg")
plt.subplot(221)
plt.title("原始图")
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.subplot(222)
plt.title("腐蚀")
kernel=np.ones((3,3),np.uint8)
plt.imshow(cv2.cvtColor(cv2.erode(image, kernel), cv2.COLOR_BGR2RGB))
plt.subplot(223)
plt.title("腐蚀")
dst = cv2.morphologyEx(image, cv2.MORPH_ERODE, kernel)  # the morphology method gives the same result as erode()
plt.imshow(cv2.cvtColor(dst, cv2.COLOR_BGR2RGB))

Insert image description here

Dilation

The dilation operation is the opposite of erosion: it expands the image outward along its own boundaries. It is also computed through a kernel. As the kernel moves along the edges of the image, it fills the edges with new pixels. The effect resembles the process shown in Figure 12.6: like plastering a wall again and again, the wall grows thicker and thicker.
OpenCV encapsulates the dilation operation in the dilate() method. The syntax of this method is as follows:
dst = cv2.dilate(src, kernel, anchor, iterations, borderType, borderValue)
Parameter description:

  • src: original image.
  • kernel: The kernel used for dilation.
  • anchor: Optional parameter, anchor point position of the kernel.
  • iterations: Optional parameter, the number of dilation iterations, the default value is 1.
  • borderType: optional parameter, border style, default recommended.
  • borderValue: Optional parameter, border value, default recommended.
    Return value description:
  • dst: image after dilation.

Dilation enlarges some external details. Dilating the cartoon spider of Figure 12.7(a) with a 5×5 pixel block as the kernel gives Figure 12.7(b): the spider's legs become thicker, while its eyes become smaller.

matplotlib.rcParams['font.sans-serif'] = ['SimHei']  # use SimHei so Chinese titles render
matplotlib.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
image=cv2.imread("./images/2.jpg")
plt.subplot(221)
plt.title("原始图")
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.subplot(222)
plt.title("膨胀")
kernel=np.ones((9,9),np.uint8)
plt.imshow(cv2.cvtColor(cv2.dilate(image, kernel), cv2.COLOR_BGR2RGB))
plt.subplot(223)
plt.title("膨胀")
dst = cv2.morphologyEx(image, cv2.MORPH_DILATE, kernel)  # the morphology method gives the same result as dilate()
plt.imshow(cv2.cvtColor(dst, cv2.COLOR_BGR2RGB))

Insert image description here

Opening operation

The opening operation first erodes the image and then dilates it. Opening can be used to erase details (or noise) outside the image subject.

import matplotlib.pyplot as plt
import matplotlib
import cv2
import numpy as np
matplotlib.rcParams['font.sans-serif'] = ['SimHei']  # use SimHei so Chinese titles render
matplotlib.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
image=cv2.imread("./images/2.jpg")
plt.subplot(221)
plt.title("原始图")
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.subplot(222)
plt.title("开运算")
kernel=np.ones((5,5),np.uint8)
# erode the noise away first, then dilate to restore the main features
dest=cv2.erode(image, kernel)
dest=cv2.dilate(dest, kernel)
plt.imshow(cv2.cvtColor(dest, cv2.COLOR_BGR2RGB))
plt.subplot(223)
plt.title("开运算")
dst = cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)  # the morphology method gives the same result
plt.imshow(cv2.cvtColor(dst, cv2.COLOR_BGR2RGB))


Closing operation

The closing operation first dilates the image and then erodes it. Closing can erase details (or noise) inside the image subject.
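There is no dedicated close() method analogous to erode() and dilate(); a minimal sketch using morphologyEx(), under the same assumptions as the opening example above (hypothetical file ./images/2.jpg):

import cv2
import numpy as np

image = cv2.imread("./images/2.jpg")
kernel = np.ones((5, 5), np.uint8)

# dilate first (fills small dark holes), then erode back to the original outline
closed = cv2.morphologyEx(image, cv2.MORPH_CLOSE, kernel)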

Gradient operation

The gradient here refers to the image gradient, which can be simply understood as the degree of change of pixels. If the pixel value span of several consecutive pixels is larger, the gradient value will be larger.

The gradient operation (Figure 12.15) subtracts the eroded image from the dilated image. Because the dilated image is larger than the original and the eroded image is smaller, hollowing out the dilated image with the eroded image leaves the outline of the original image.
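A minimal sketch of the gradient operation under the same assumptions; by definition it equals the dilated image minus the eroded image:

import cv2
import numpy as np

image = cv2.imread("./images/2.jpg")
kernel = np.ones((5, 5), np.uint8)

gradient = cv2.morphologyEx(image, cv2.MORPH_GRADIENT, kernel)
# equivalent by definition: dilation minus erosion
manual = cv2.subtract(cv2.dilate(image, kernel), cv2.erode(image, kernel))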

top hat operation

The top hat operation (Figure 12.17) subtracts the opening of the image from the original image. Opening erases external details, so subtracting the image "without external details" from the one "with external details" leaves only the external details: after the top hat operation, only the spider's legs remain.

black hat operation

The black hat operation (Figure 12.19) subtracts the original image from the closing of the image. Closing erases internal details, so subtracting the image "with internal details" from the one "without internal details" leaves only the internal details: after the black hat operation, only the spider's spots, patterns, and eyes remain.
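Corresponding sketches for the top hat and black hat operations, under the same assumptions (original minus opening, and closing minus original, respectively):

import cv2
import numpy as np

image = cv2.imread("./images/2.jpg")
kernel = np.ones((5, 5), np.uint8)

tophat = cv2.morphologyEx(image, cv2.MORPH_TOPHAT, kernel)      # original - opening
blackhat = cv2.morphologyEx(image, cv2.MORPH_BLACKHAT, kernel)  # closing - original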

Graphic detection

image outline

Contours refer to the outer edge lines of figures or objects in an image. The outline of a simple geometric figure is composed of smooth lines and is easy to identify, but the outline of an irregular figure may be composed of many points, making it difficult to identify.

The findContours() method provided by OpenCV determines the edges of the image from the image gradient, packs the edge points into arrays, and returns them. The syntax of the findContours() method is as follows:

 contours, hierarchy = cv2.findContours(image, mode, method)

Parameter Description:

  • image: The image being detected must be an 8-bit single-channel binary image. If the original image is a color image, it must be converted to grayscale and binarized.
  • mode: the retrieval mode of the outline, the specific values ​​are shown in the table.
    cv2.RETR_EXTERNAL: Retrieve only external contours.
    cv2.RETR_LIST: Retrieve all contours and store them in a list.
    cv2.RETR_CCOMP: Retrieve all contours and organize them into a two-level hierarchy. In the top layer, there are only outer contours, while in the second layer, there are inner contours. If the inner contour also has holes, it is considered level 3.
    cv2.RETR_TREE: Retrieve all contours and organize them into a complete hierarchical tree.
  • method: The method used when detecting contours. The specific values are shown in Table 13.2.
    cv2.CHAIN_APPROX_NONE: Store all contour points, and the pixel position difference between two adjacent contour points does not exceed 1.
    cv2.CHAIN_APPROX_SIMPLE: Compress redundant points in the horizontal, vertical and diagonal directions, retaining only adjacent endpoints. For example, a rectangular outline only needs to store its four vertices.
    cv2.CHAIN_APPROX_TC89_L1 or cv2.CHAIN_APPROX_TC89_KCOS: Applying one of the Teh-Chin chain approximation algorithms can further reduce the number of points in the contour, but requires longer calculation time.

Return value description:

  • contours: All detected contours, a list whose elements are arrays of the pixel coordinates of one contour.
  • hierarchy: hierarchical relationship between contours.

After finding the image contours through the findContours() method, in order to facilitate the developer's observation, it is best to draw the contours, so OpenCV provides the drawContours() method to draw these contours. The syntax of the drawContours() method is as follows:

  image = cv2.drawContours(image, contours, contourIdx, color, thickness, lineType, hierarchy, maxLevel, offset)

Parameter Description:

  • image: The original image on which the contours are drawn; it can be a multi-channel image.
  • contours: The list of contours obtained by the findContours() method.
  • contourIdx: the index of drawing the contour, if it is -1, all contours are drawn.
  • color: Draw color, using BGR format.
  • thickness: Optional parameter, the thickness of the brush. If the value is -1, a solid outline will be drawn.
  • lineType: Optional parameter, line type for drawing the outline.
  • hierarchy: Optional parameter, hierarchical relationship obtained by findContours() method.
  • maxLevel: Optional parameter, the depth of the layer for drawing the outline, the deepest drawing is the maxLevel layer.
  • offset: Optional parameter, offset, which can change the position of the drawing result.

Return value description:

  • image: The same as the image in the parameter. After execution, the original image will contain the drawn outline. You can save the result without using this return value.
matplotlib.rcParams['font.sans-serif'] = ['SimHei']  # use SimHei so Chinese titles render
matplotlib.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
img=cv2.imread("./images/2.jpg")
grayImg=cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
plt.subplot(221)
plt.title("灰度图")
plt.imshow(grayImg,cmap="gray")
# binarize
_,dst=cv2.threshold(grayImg,127,255,cv2.THRESH_BINARY)
contours, hierarchy = cv2.findContours(dst, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
plt.subplot(222)
plt.imshow(cv2.cvtColor(cv2.drawContours(img.copy(), contours, 2, (0, 0, 255), 5), cv2.COLOR_BGR2RGB))
# boundingRect() computes the minimal upright bounding rectangle of a contour and
# returns it as a tuple (x, y, w, h): top-left corner coordinates, width, and height.
x,y,w,h = cv2.boundingRect(contours[2])
print(x,y,w,h)
dstImg=img.copy()
plt.subplot(223)
cv2.rectangle(dstImg,(x,y),(x+w,y+h),(0,0,255),2)
plt.imshow(cv2.cvtColor(dstImg, cv2.COLOR_BGR2RGB))
plt.show()

Insert image description here

Contour fitting

Fitting refers to connecting a series of points in the plane with a smooth curve. Contour fitting expresses an uneven contour with a regular geometric shape. This section explains how to draw rectangular and circular bounding boxes from an object's outline.

rectangular bounding box

The rectangular bounding box refers to the smallest rectangular boundary of the image outline. The boundingRect() method provided by OpenCV can automatically calculate the coordinates, width and height of the minimum rectangular boundary of the outline. The syntax of the boundingRect() method is as follows:
retval = cv2.boundingRect(array)
parameter description:

  • array: contour array.

Return value description:

  • retval: a tuple of 4 integers describing the minimal bounding rectangle: the x coordinate of the top-left vertex, the y coordinate of the top-left vertex, and the width and height of the rectangle. It can therefore also be written as x, y, w, h = cv2.boundingRect(array).

Same as the image outline example above

circular bounding box

A circular bounding box, like a rectangular one, is the smallest circular boundary of the image outline. The minEnclosingCircle() method provided by OpenCV automatically calculates the center and radius of the minimal circular boundary of a contour. The syntax of the minEnclosingCircle() method is as follows:
center, radius = cv2.minEnclosingCircle(points)
Parameter description:

  • points: outline array.

Return value description:

  • center: a tuple of 2 floats, the x and y coordinates of the center of the minimal circular bounding box.
  • radius: Floating point type, the radius of the minimum circular bounding box.
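A small sketch of the effect, reusing img and the contours list from the image-outline example above (and the same arbitrary contour index 2); minEnclosingCircle() returns floats, so they must be converted to integers before drawing:

(cx, cy), radius = cv2.minEnclosingCircle(contours[2])
circled = img.copy()
cv2.circle(circled, (int(cx), int(cy)), int(radius), (0, 0, 255), 2)
plt.imshow(cv2.cvtColor(circled, cv2.COLOR_BGR2RGB))
plt.show()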
polygonal bounding box

The cv2.approxPolyDP function is a function for contour approximation in OpenCV. It can approximate the points in the contour according to certain accuracy requirements, thereby simplifying the number of points in the contour and facilitating subsequent processing.
The syntax of this function is as follows:

epsilon = cv2.arcLength(curve, closed)
approx = cv2.approxPolyDP(curve, epsilon, closed)

Here, curve is the input contour, epsilon is the approximation accuracy, and closed indicates whether the contour is closed. The function returns the approximated contour.
The working principle of the cv2.approxPolyDP function is implemented through the Douglas-Peucker algorithm. The basic idea of ​​this algorithm is to find the longest line segment in the contour, use it as an approximate line segment of the contour, and divide the contour into two parts. The two parts are then processed recursively until the accuracy requirements are met. After such processing, the number of contour points obtained will be greatly reduced, but the shape of the contour can still be retained.

Note that the smaller the epsilon, the more points the approximated contour retains and the more accurately it preserves the shape; conversely, the larger the epsilon, the fewer the points and the lower the shape accuracy. Choosing an appropriate epsilon is therefore important for a good approximation.

img=cv2.imread("./images/2.jpg")
grayImg=cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_,dst=cv2.threshold(grayImg,127,255,cv2.THRESH_BINARY)
contours, hierarchy = cv2.findContours(dst, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
# contour approximation: approximate every contour at several accuracies
for i,e in enumerate([0.01,0.05,0.1,1]):
    img1=img.copy()
    for cnt in contours:
        epsilon = e * cv2.arcLength(cnt, True)  # accuracy as a fraction of the contour perimeter
        approx = cv2.approxPolyDP(cnt, epsilon, True)

        # draw the approximated contour
        cv2.drawContours(img1, [approx], 0, (0, 255, 0), 3)
    plt.subplot(2, 2, i+1)
    plt.title(e)
    plt.imshow(cv2.cvtColor(img1, cv2.COLOR_BGR2RGB))
plt.show()

Insert image description here

convex hull

We have introduced rectangular bounding boxes and circular bounding boxes before. Although these two bounding boxes have approached the edges of the graphics, in order to maintain the geometric shape, these bounding boxes have poor fit with the real contours of the graphics. If you can find the outermost endpoints of the figure and connect these endpoints, you can form a minimum bounding box surrounding the figure. This bounding box is called a convex hull.

The convex hull is the polygon that most closely approximates the contour while remaining convex everywhere, that is, every interior angle is less than 180°.
The convexHull() method provided by OpenCV can automatically find the convex hull of the contour. The syntax of this method is as follows:

  hull = cv2.convexHull(points, clockwise, returnPoints)

Parameter Description:

  • points: outline array.
  • clockwise: optional parameter, Boolean type. When the value is True, the points in the convex hull are arranged clockwise, and when it is False, the points are arranged counterclockwise.
  • returnPoints: optional parameter, Boolean type. When the value is True, the point coordinates are returned, and when it is False, the point index is returned. The default value is True.

Return value description:

  • hull: array of convex-hull points (or point indices, when returnPoints is False).
import cv2
import matplotlib.pyplot as plt

img = cv2.imread("./images/2.jpg")
grayImg = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, dst = cv2.threshold(grayImg, 127, 255, cv2.THRESH_BINARY)
contours, hierarchy = cv2.findContours(dst, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
# Sort the contours by area, largest first
sorted_contours = sorted(contours, key=cv2.contourArea, reverse=True)
hull = cv2.convexHull(sorted_contours[0])
img1 = img.copy()  # the original snippet used img1 without defining it
cv2.polylines(img1, [hull], True, (0, 0, 255), 2)
plt.imshow(cv2.cvtColor(img1, cv2.COLOR_BGR2RGB))
plt.show()


Canny edge detection

The Canny edge detection algorithm is a multi-stage edge detection algorithm developed by John F. Canny in 1986. It finds image edges from the gradient changes of pixels and ultimately produces a very fine binary edge image.

OpenCV encapsulates the Canny edge detection algorithm in the Canny() method. The syntax of this method is as follows:

     edges = cv2.Canny(image, threshold1, threshold2, apertureSize, L2gradient)

Parameter Description:

  • image: The image to be detected.
  • threshold1: the first threshold used in the calculation. Either threshold can serve as the minimum or the maximum, but threshold1 is usually used as the minimum threshold.
  • threshold2: the second threshold used in the calculation, usually used as the maximum threshold.
  • apertureSize: Optional parameter, aperture size of Sobel operator.
  • L2gradient: Optional parameter, the identifier for calculating the image gradient, the default value is False. When the value is True, a more accurate algorithm is used for calculation.

Return value description:

  • edges: the computed edge map, a binary grayscale image.
import cv2
import matplotlib.pyplot as plt
import matplotlib

matplotlib.rcParams['font.sans-serif'] = ['SimHei']  # use a font that can render CJK labels
matplotlib.rcParams['axes.unicode_minus'] = False    # render minus signs correctly with this font
img = cv2.imread("./images/1.jpg")
plt.subplot(221)
plt.title("Original image")
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
# Canny edge detection
r1 = cv2.Canny(img, 10, 50)
plt.subplot(222)
plt.title("Canny")
plt.imshow(r1, cmap="gray")  # the result is single-channel, so display it as grayscale
plt.show()


Hough transform

The Hough transform is a feature detection technique that identifies particular shapes in an image, such as straight lines and circles.

Line detection

The Hough line transform maps points in the Cartesian (image) coordinate system to curves in the Hough (parameter) coordinate system to determine whether a set of image points forms a straight line. OpenCV encapsulates this algorithm in two methods: cv2.HoughLines(), which detects infinitely extended straight lines, and cv2.HoughLinesP(), which detects line segments (note the capital P at the end of the name). HoughLinesP() can only operate on binary grayscale images, that is, black-and-white images with just two pixel values, and it stores the coordinates of the two endpoints of every line segment it finds in an array.

The syntax of the HoughLinesP() method is as follows:

 lines = cv2.HoughLinesP(image, rho, theta, threshold, minLineLength, maxLineGap)

Parameter Description:

  • image: The image to be detected (a binary edge image, such as the output of Canny).
  • rho: the distance resolution of the accumulator, in pixels. A value of 1 checks every possible pixel step.
  • theta: the angle resolution used when searching for lines, in radians. A value of np.pi/180 (one degree) checks all angles in one-degree steps.
  • threshold: accumulator threshold; the smaller the value, the more lines are detected.
  • minLineLength: the minimum length of a line segment; shorter segments are not recorded in the result.
  • maxLineGap: the maximum allowed gap between collinear points for them to be linked into a single line segment.
    Return value description:
  • lines: an array whose elements are all detected line segments. Each line segment is an array, representing the horizontal and vertical coordinates of the two endpoints of the line segment. The format is [[[x1, y1, x2, y2], [x1, y1, x2, y2]]].
import cv2
import matplotlib.pyplot as plt
import matplotlib
import numpy as np

def show(image, title, cmap=None, debug=False):
    # Display an image only when debug is True
    if debug:
        plt.title(title)
        plt.imshow(image, cmap=cmap)
        plt.show()

matplotlib.rcParams['font.sans-serif'] = ['SimHei']  # use a font that can render CJK labels
matplotlib.rcParams['axes.unicode_minus'] = False    # render minus signs correctly with this font
img = cv2.imread("./images/1.jpg")
grayImg = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
show(grayImg, "Original image", cmap="gray", debug=True)
edges = cv2.Canny(grayImg, 20, 40)
show(edges, "Edge map", cmap="gray", debug=True)
# Pass minLineLength and maxLineGap as keywords: positionally, the fifth argument
# of HoughLinesP is the optional output array `lines`, not minLineLength
lines = cv2.HoughLinesP(edges, 1, np.pi/180, 15, minLineLength=100, maxLineGap=18)
img1 = img.copy()
if lines is not None:
    for line in lines:
        x1, y1, x2, y2 = line[0]
        cv2.line(img1, (x1, y1), (x2, y2), (0, 0, 255), 2)
show(cv2.cvtColor(img1, cv2.COLOR_BGR2RGB), "Detected lines", debug=True)


Circle detection

The principle of the Hough circle transform is similar to the Hough line transform. The HoughCircles() method provided by OpenCV detects circles in an image. The method performs two rounds of screening: the first round finds candidate circle-center coordinates, and the second round computes the radius corresponding to each candidate center. The center coordinates and radii are finally packed into a floating-point array.

The syntax of the HoughCircles() method is as follows:

 circles = cv2.HoughCircles(image, method, dp, minDist, param1, param2, minRadius, maxRadius)

Parameter Description:

  • image: The image to be detected.
  • method: detection method. In OpenCV 4.0.0 and earlier, cv2.HOUGH_GRADIENT is the only available method.
  • dp: The reciprocal of the ratio of the accumulator resolution to the original image resolution. A value of 1 gives the accumulator the same resolution as the original image; a value of 2 makes the accumulator have 1/2 the resolution of the original image. Usually 1 is used as parameter.
  • minDist: Minimum distance between circle centers.
  • param1: optional parameter, the higher threshold passed to Canny edge detection.
  • param2: optional parameter, the vote threshold for circle detection. Only candidate circles that receive more votes than this value in the first round of screening enter the second round. The larger the value, the fewer circles are detected, but the higher the accuracy.
  • minRadius: optional parameter, the minimum radius of a circle.
  • maxRadius: optional parameter, the maximum radius of a circle.

Return value description:

  • circles: an array whose elements are all detected circles. Each circle is also an array. The content is the horizontal and vertical coordinates and radius length of the center of the circle. The format is: [[[x1,y1, r1], [x2, y2, r2]]].
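HoughCircles() can be used in the same pattern as the line detection above. The following is a minimal sketch; the input file ./images/1.jpg, the median blur, and all parameter values are illustrative assumptions rather than values from the original:

import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread("./images/1.jpg")
grayImg = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
grayImg = cv2.medianBlur(grayImg, 5)  # smoothing reduces spurious circle detections
circles = cv2.HoughCircles(grayImg, cv2.HOUGH_GRADIENT, 1, 100,
                           param1=100, param2=30, minRadius=10, maxRadius=200)
if circles is not None:
    circles = np.uint16(np.around(circles))  # round the float results
    for x, y, r in circles[0]:
        cv2.circle(img, (int(x), int(y)), int(r), (0, 0, 255), 2)  # the circle itself
        cv2.circle(img, (int(x), int(y)), 2, (0, 255, 0), 3)       # the center point
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()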

video processing

OpenCV can process not only images but also videos. A video is a sequence of frames; by grabbing frames from the video at regular intervals, each frame can be processed with the image-processing methods described above, which amounts to processing the video. To process a video you first need related operations such as reading, displaying, and saving it, and for this purpose OpenCV provides the VideoCapture and VideoWriter classes.
The VideoCapture class provides the constructor VideoCapture() to initialize a camera or open a video file. The syntax of VideoCapture() is:

capture = cv2.VideoCapture(index | filename)

Parameter Description:

  • index: the device index of the camera to open, e.g. 0 for the default camera.
  • filename: the file name of the video to open, e.g. company promotion.avi.
import cv2

# Open the video file
cap = cv2.VideoCapture('video.avi')
while True:
    # Read one frame
    ret, frame = cap.read()
    # Exit the loop when the video ends or a read fails
    if not ret:
        break
    # Show the current frame
    cv2.imshow('frame', frame)
    # Wait briefly for a key press
    key = cv2.waitKey(1) & 0xFF
    # Exit the loop when 'q' is pressed
    if key == ord('q'):
        break
# Release resources
cap.release()
cv2.destroyAllWindows()
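The VideoWriter class mentioned above saves processed frames back to disk. A minimal sketch, assuming an input file video.avi and an output name output.avi (both illustrative), with a grayscale conversion standing in for whatever per-frame processing you need:

import cv2

cap = cv2.VideoCapture('video.avi')
# Match the writer's frame size and rate to the source
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = cap.get(cv2.CAP_PROP_FPS)
fourcc = cv2.VideoWriter_fourcc(*'XVID')  # a codec commonly used for .avi files
writer = cv2.VideoWriter('output.avi', fourcc, fps, (width, height))
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # example per-frame processing
    frame = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)   # this writer expects 3 channels
    writer.write(frame)
cap.release()
writer.release()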

Face Detection

Face recognition is a biometric identification technology based on facial feature information and a major focus of computer vision development. With machine learning algorithms, computers can automatically analyze the content of images captured through input devices such as cameras. As the technology has developed, a variety of face recognition algorithms now exist. This chapter introduces several image tracking techniques and the three face recognition technologies that come with OpenCV.

Cascade classifier

A cascade classifier is built by chaining a series of simple classifiers together in a fixed order, so a program using one can identify samples through a sequence of simple judgments. For example, a sample that satisfies, in order, the three conditions "has 6 legs", "has wings", and "has a head, thorax, and abdomen" can be preliminarily judged to be an insect; if any one of the conditions fails, it is not considered an insect.

OpenCV provides a number of pre-trained cascade classifiers, saved as XML files under a path of the form: ...
\Python\Lib\site-packages\cv2\data
On my Windows machine, for example, they are located at D:/condaenv/tensorflowcpu/Library/etc/haarcascades/.
OpenCV requires two steps to implement face detection: loading the cascade classifier and using the classifier to identify the image. There are corresponding methods for these two steps.

The first is to load the cascade classifier. OpenCV creates a classifier object through the CascadeClassifier() method. The syntax is as follows:
<CascadeClassifier object> = cv2.CascadeClassifier(filename)
Parameter Description:

  • filename: XML file name of the cascade classifier.
    Return value description:
  • object: classifier object.

Then use the created classifier to identify the image. This process requires calling the detectMultiScale() method of the classifier object. Its syntax is as follows:

objects = cascade.detectMultiScale(image, scaleFactor, minNeighbors, flags, minSize, maxSize)

Object description:

cascade: existing classifier object.
Parameter Description:

  • image: The image to be analyzed.
  • scaleFactor: Optional parameter, the scaling ratio when scanning the image.
  • minNeighbors: optional parameter; the number of neighboring detections a candidate region must accumulate before it is accepted as a face. The larger the value, the smaller the error of the analysis.
  • flags: optional parameters, parameters of the old version of OpenCV, it is recommended to use the default values.
  • minSize: Optional parameter, minimum target size.
  • maxSize: optional parameter, the maximum target size.
    Return value description:
  • objects: array of detected target regions. Each element of the array is one target region described by 4 values: the abscissa of the upper-left corner, the ordinate of the upper-left corner, the region width, and the region height. The format of objects is: [[244 203 111 111] [432 81 133 133]].
import cv2
import matplotlib.pyplot as plt

# Load the face detector (path to the Haar cascade XML files; adjust to your environment)
xml_dir = "D:/condaenv/tensorflowcpu/Library/etc/haarcascades/"
face_cascade = cv2.CascadeClassifier(xml_dir + 'haarcascade_frontalface_alt2.xml')

# Read the image to process
img = cv2.imread('../images/people.png')

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect faces
faces = face_cascade.detectMultiScale(gray)

# Draw a rectangle around each detected face
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

# Show the result
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()

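Because the cascade detector works on single images, it combines naturally with the VideoCapture loop from the video section. A minimal sketch, assuming the default camera (index 0) and the same cascade path as above; the scaleFactor and minNeighbors values are illustrative:

import cv2

xml_dir = "D:/condaenv/tensorflowcpu/Library/etc/haarcascades/"  # adjust to your environment
face_cascade = cv2.CascadeClassifier(xml_dir + 'haarcascade_frontalface_alt2.xml')
cap = cv2.VideoCapture(0)  # default camera
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # minNeighbors=5 trades a few missed faces for fewer false positives
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow('faces', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()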

Source: blog.csdn.net/liaomin416100569/article/details/131205012