Usage of some functions in OpenCV

1.cv2.bitwise_and()

cv2.bitwise_and() is one of the bitwise operation functions in OpenCV. It performs a bitwise AND on two input images of the same size and type: for each pixel, the output value is the bitwise AND of the pixel values at the corresponding positions of the two inputs. On binary images (pixel values 0 or 255), this amounts to taking the intersection of the two white regions.

The syntax of cv2.bitwise_and() is as follows:

dst = cv2.bitwise_and(src1, src2[, mask])

Here, src1 and src2 are the two input images, and mask is an optional 8-bit single-channel array. If a mask is specified, the bitwise AND is computed only at the pixels where the mask is non-zero. The return value dst holds the result of the operation.

import cv2
import numpy as np

# Create two single-channel test images (gray level 100 and 255 on a black background)
img1 = np.zeros((300, 300), dtype=np.uint8)
img1[100:200, 100:200] = 100
img2 = np.zeros((300, 300), dtype=np.uint8)
img2[150:250, 150:250] = 255

# Bitwise AND of the two images
result = cv2.bitwise_and(img1, img2)

# Show the input images and the result
cv2.imshow('img1', img1)
cv2.imshow('img2', img2)
cv2.imshow('result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()


Here is an example using the mask parameter:

import cv2
import numpy as np

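# Read the image
img = cv2.imread(r"C:\Users\asus\Desktop\20230104200344_6edae.thumb.1000_0.jpg")
# Create a mask with the same height and width as the image
mask = np.zeros(img.shape[:2], dtype=np.uint8)
# Draw a filled circle of radius 100 at the image center (any non-zero value works as a mask)
mask = cv2.circle(mask, (img.shape[1]//2, img.shape[0]//2), 100, 100, -1)
# AND the image with itself, keeping only the pixels under the mask
masked_img = cv2.bitwise_and(img, img, mask=mask)

# Show the results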
cv2.imshow('image', img)
cv2.imshow('mask', mask)
cv2.imshow('masked_image', masked_img)
cv2.waitKey(0)
cv2.destroyAllWindows()


In this example, an image is first read from disk, and a mask with the same height and width as the image is created. cv2.circle() then draws a filled circle with a radius of 100 at the center of the mask. Next, cv2.bitwise_and() combines the image with itself under the mask, producing an image masked_img that keeps only the circular region and is black everywhere else. Finally, the image, the mask and the masked result are displayed.
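The effect of the mask can be reproduced without any image file. The following is a minimal NumPy sketch, not OpenCV's implementation: masked_and is a hypothetical helper, and it assumes a freshly created output in which pixels outside the mask are 0.

```python
import numpy as np

def masked_and(src1, src2, mask=None):
    """NumPy sketch of a masked bitwise AND (hypothetical helper, not an OpenCV API)."""
    out = np.bitwise_and(src1, src2)
    if mask is not None:
        keep = mask != 0                 # the mask selects pixels wherever it is non-zero
        if out.ndim == 3:
            keep = keep[..., None]       # broadcast the 2-D mask over the color channels
        out = np.where(keep, out, 0).astype(src1.dtype)
    return out

img1 = np.zeros((300, 300), dtype=np.uint8)
img1[100:200, 100:200] = 100
img2 = np.zeros((300, 300), dtype=np.uint8)
img2[150:250, 150:250] = 255

result = masked_and(img1, img2)
print(result[175, 175])   # inside the overlap: 100 & 255 = 100
print(result[120, 120])   # outside the overlap: 100 & 0 = 0

# AND an image with itself under a small square mask
mask = np.zeros((300, 300), dtype=np.uint8)
mask[160:170, 160:170] = 1
masked = masked_and(img2, img2, mask)
print(masked[165, 165], masked[200, 200])   # kept pixel, masked-out pixel
```

Because 100 is 0b01100100 and 255 has all eight bits set, the overlap keeps the value 100 rather than becoming 255.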

 2.cv2.threshold()/cv2.adaptiveThreshold()

(1) cv2.threshold() is an OpenCV function that applies a fixed-level threshold to every pixel of an image. It is typically applied to grayscale images.

This function accepts the following parameters:

  • src: Input image.
  • thresh: Threshold value.
  • maxval: The value assigned to pixels that pass the threshold test. For 8-bit grayscale images, this is usually set to 255.
  • type: The type of threshold to apply. The two most common types are:
    • cv2.THRESH_BINARY: Pixels greater than the threshold are set to maxval; all other pixels are set to 0.
    • cv2.THRESH_BINARY_INV: Pixels greater than the threshold are set to 0; all other pixels are set to maxval.
    Other types, such as cv2.THRESH_TRUNC, cv2.THRESH_TOZERO and cv2.THRESH_TOZERO_INV, are also available.
  • dst: Output image (optional).

The function returns a tuple containing two values:

  • retval: The threshold value that was used
  • dst: The thresholded image

import cv2
import numpy as np

# Read the image and convert it to grayscale
img2 = cv2.imread("C:/Users/asus/Desktop/20220802203106_eaafd.jpeg")
img2gray = cv2.cvtColor(img2,cv2.COLOR_BGR2GRAY)
# Pixels above 100 become 255, all others become 0
ret, pic = cv2.threshold(img2gray, 100, 255, cv2.THRESH_BINARY)
cv2.imshow('pic',pic)
cv2.waitKey(0)
cv2.destroyAllWindows()
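The per-pixel rule behind the two binary threshold types can be sketched in a few lines of NumPy. This is only an illustration of the rule for 8-bit input; threshold_binary is a hypothetical helper name, not an OpenCV function.

```python
import numpy as np

def threshold_binary(src, thresh, maxval, inverse=False):
    """Sketch of the THRESH_BINARY / THRESH_BINARY_INV rule on a uint8 array."""
    above = src > thresh          # strictly greater than the threshold
    if inverse:
        above = ~above
    dst = np.where(above, maxval, 0).astype(np.uint8)
    return thresh, dst            # same (retval, dst) shape as the cv2.threshold result

gray = np.array([[10, 100, 101],
                 [200, 99, 255]], dtype=np.uint8)
ret, binary = threshold_binary(gray, 100, 255)
print(binary)
```

Note that the pixel equal to the threshold (100) ends up at 0, because the comparison is strict.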


(2) cv2.adaptiveThreshold() is a function in OpenCV that performs adaptive thresholding on a given grayscale image.

Thresholding is a common image processing technique used to segment an image into two regions, usually to separate an object of interest from the background. In thresholding, the pixel value is compared with a threshold and if it exceeds the threshold, it is assigned a certain value, otherwise it is assigned another value.

In adaptive thresholding, the threshold value is not fixed but changes locally based on image features near each pixel. This allows for better handling of changes in lighting, contrast, and noise in images.

The cv2.adaptiveThreshold() function has the following parameters:

  • src: Input image.
  • maxValue: The maximum value to be assigned when the pixel exceeds the threshold. Typically set to 255.
  • adaptiveMethod: The adaptive thresholding algorithm. There are two options: cv2.ADAPTIVE_THRESH_MEAN_C and cv2.ADAPTIVE_THRESH_GAUSSIAN_C. The former uses the mean of the pixel values in the local neighborhood as the threshold, and the latter uses a Gaussian-weighted sum of the pixel values in the local neighborhood.
  • thresholdType: The thresholding type. There are two options: cv2.THRESH_BINARY and cv2.THRESH_BINARY_INV. The former sets pixels above the threshold to maxValue and all other pixels to 0, while the latter does the opposite.
  • blockSize: The size of the local neighborhood used to calculate the threshold. This must be an odd number greater than 1, such as 3, 5 or 7.
  • C: A constant subtracted from the mean or weighted sum of the pixel values when calculating the threshold. It adjusts the sensitivity of the thresholding.

import cv2

# Read the image as grayscale
img = cv2.imread('image.jpg', 0)

# Apply adaptive thresholding to the image
img_thresh = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)

# Show the original image and the thresholded image
cv2.imshow('Original', img)
cv2.imshow('Adaptive Threshold', img_thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()

In this example, we first read a grayscale image and then perform adaptive thresholding on it using the cv2.adaptiveThreshold function. We chose the cv2.ADAPTIVE_THRESH_MEAN_C algorithm and set blockSize to 11 and C to 2, which means that the threshold for each pixel is the mean of its 11x11 local neighborhood minus 2. Finally, we set thresholdType to cv2.THRESH_BINARY, which means pixels above the threshold are assigned 255 and all other pixels are assigned 0.

After running the above code, the original image and the processed image are displayed in two separate windows. In the processed image, pixels whose values are greater than their local threshold are white (255), and the remaining pixels are black (0).
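To see why a local threshold copes with uneven lighting, the mean-based variant can be sketched directly in NumPy. This is an illustrative sketch under stated assumptions, not OpenCV's implementation: adaptive_mean_threshold is a hypothetical helper, it replicates border pixels for padding, and it uses a small 3x3 neighborhood on synthetic data.

```python
import numpy as np

def adaptive_mean_threshold(src, max_value, block_size, c):
    """Sketch of mean adaptive thresholding with THRESH_BINARY semantics."""
    pad = block_size // 2
    # Replicate the border so that every pixel has a full neighborhood
    padded = np.pad(src.astype(np.float64), pad, mode="edge")
    dst = np.zeros_like(src, dtype=np.uint8)
    for y in range(src.shape[0]):
        for x in range(src.shape[1]):
            window = padded[y:y + block_size, x:x + block_size]
            local_thresh = window.mean() - c      # threshold = local mean minus C
            if src[y, x] > local_thresh:
                dst[y, x] = max_value
    return dst

# Dark left half, bright right half, one darker spot in each half
img = np.full((5, 10), 20, dtype=np.uint8)
img[:, 5:] = 200
img[2, 2] = 5
img[2, 7] = 150

out = adaptive_mean_threshold(img, 255, 3, 2)
global_binary = np.where(img > 100, 255, 0)

print(out[2, 1], out[2, 2])                        # background next to the spot, then the spot (left)
print(out[2, 7], out[2, 8])                        # the spot, then background next to it (right)
print(global_binary[2, 1], global_binary[2, 2])    # a global threshold of 100 cannot separate these
```

Both dark spots come out black against a white local background, even though the two halves differ strongly in brightness; a single global threshold of 100 turns the whole left half black.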

 3.HSV image space

HSV (i.e. Hue, Saturation, Value) is a commonly used color space that is often used for color analysis and recognition in image processing. The HSV color space represents color as three components: Hue, Saturation, and Value.

The HSV image space in OpenCV has the following characteristics:

  1. The HSV image space separates the three elements of a color (hue, saturation and brightness), which makes color analysis and processing easier.

  2. The brightness of the color is represented by the V (value) channel, with a value range of 0-255. A value of 0 represents black, and a value of 255 represents the brightest version of the color.

  3. The saturation of a color is represented by the S (saturation) channel, with a value range of 0-255, indicating the purity of the color. A saturation of 0 represents gray, and a saturation of 255 represents the most vivid color.

  4. The hue of a color is represented by the H (hue) channel, indicating the color's position on the color wheel, such as red, green or blue. Hue is an angle in the range 0-360 degrees, but for 8-bit images OpenCV divides it by 2 so that it fits in one byte, giving a value range of 0-179.

Therefore, HSV image space is widely used in image processing, such as color segmentation, target tracking, image enhancement, etc.
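The value ranges above can be checked with Python's standard colorsys module, rescaled to the 8-bit convention described in the list. This is a sketch only: bgr_to_opencv_hsv is a hypothetical helper, and its rounding may differ by one unit from what cv2.cvtColor returns for arbitrary colors.

```python
import colorsys

def bgr_to_opencv_hsv(b, g, r):
    """Convert one 8-bit BGR pixel to the 8-bit HSV scale (H in 0-179, S and V in 0-255)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # colorsys returns h, s, v in [0, 1]; hue in degrees is h * 360, stored as degrees / 2
    return round(h * 360 / 2) % 180, round(s * 255), round(v * 255)

print(bgr_to_opencv_hsv(255, 0, 0))   # pure blue  -> (120, 255, 255)
print(bgr_to_opencv_hsv(0, 255, 0))   # pure green -> (60, 255, 255)
print(bgr_to_opencv_hsv(0, 0, 255))   # pure red   -> (0, 255, 255)
```

Pure blue lands at H = 120, which is why hue ranges centered on 120 are commonly used to select blue objects.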

4.cv2.inRange

The cv2.inRange() function is an image segmentation function in OpenCV, which is used to divide the pixels in the input image into two categories according to the set threshold range: pixels that meet the conditions and pixels that do not meet the conditions.

The syntax of the cv2.inRange() function is as follows:

cv2.inRange(src, lowerb, upperb, dst=None)

Among them, the meaning of the parameters is as follows:

  • src: Input image (single channel or three channels)
  • lowerb: lower threshold bound (can be a scalar or array)
  • upperb: upper threshold value (can be a scalar or array)
  • dst: The output image. It has the same height and width as the input image, a single channel, and the data type np.uint8

The return value of the function is the output image dst.

The cv2.inRange() function returns a binary image, in which the value of the pixel is 0 or 255, indicating whether the corresponding pixel in the input image meets the set threshold range. If the value of a pixel is within the threshold range, its corresponding pixel value in the output image is 255; otherwise, its corresponding pixel value in the output image is 0.

The output image is always a single-channel image of type np.uint8, regardless of the number of channels in the input. If the input image has a single channel, each pixel is compared with the scalar bounds. If the input image has three channels, a pixel is set to 255 only when all three of its channel values lie within the corresponding bounds.
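The range test itself is easy to sketch in NumPy, which also makes the shape of the result visible. The helper in_range below is hypothetical and only mirrors the rule described above; the pixel values are made-up HSV triples.

```python
import numpy as np

def in_range(src, lowerb, upperb):
    """Sketch of the inRange test: 255 where every channel lies within its bounds."""
    inside = (src >= lowerb) & (src <= upperb)   # both bounds are inclusive
    if src.ndim == 3:
        inside = inside.all(axis=2)              # all channels must pass
    return inside.astype(np.uint8) * 255

hsv = np.array([[[120, 200, 200], [120, 30, 200]],
                [[90, 200, 200], [130, 255, 255]]], dtype=np.uint8)
lower_blue = np.array([110, 50, 50])
upper_blue = np.array([130, 255, 255])

mask = in_range(hsv, lower_blue, upper_blue)
print(mask.shape)   # (2, 2): one channel, even though the input has three
print(mask)
```

The second pixel fails because its saturation (30) is below the lower bound, and the third fails because its hue (90) is out of range.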

Example: Tracking blue objects

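import cv2
import numpy as np

cap = cv2.VideoCapture(0)
while True:
    # Grab one frame; stop if the camera returns nothing
    ret, frame = cap.read()
    if not ret:
        break
    # Convert from BGR to HSV
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Define the HSV range for blue
    lower_blue = np.array([110, 50, 50])
    upper_blue = np.array([130, 255, 255])
    # Build a mask from the range
    mask = cv2.inRange(hsv, lower_blue, upper_blue)
    # AND the original frame with itself under the mask
    res = cv2.bitwise_and(frame, frame, mask=mask)
    # Show the images
    cv2.imshow('frame', frame)
    cv2.imshow('mask', mask)
    cv2.imshow('res', res)
    # Exit when Esc is pressed
    k = cv2.waitKey(5) & 0xFF
    if k == 27:
        break
# Release the camera and close the windows
cap.release()
cv2.destroyAllWindows()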


 4.Affine transformation

Original link: https://blog.csdn.net/m0_50294896/article/details/120577389

An affine transformation is a linear transformation (multiplication by a matrix) followed by a translation (addition of a vector), mapping one vector space to another.
An affine transformation describes the mapping between two images. It is represented by a 2x3 matrix M, formed by placing a 2x2 matrix A next to a 2x1 vector B. B performs the translation; the diagonal entries of A control scaling, and the off-diagonal entries control rotation or shearing.

$$A = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix}, \quad B = \begin{bmatrix} b_{0} \\ b_{1} \end{bmatrix}, \quad M = \begin{bmatrix} A & B \end{bmatrix} = \begin{bmatrix} a_{00} & a_{01} & b_{0} \\ a_{10} & a_{11} & b_{1} \end{bmatrix}$$

If the original pixel has coordinates (x, y) and T denotes its coordinates after the affine transformation, the transformation is computed as:

$$T = A \begin{bmatrix} x \\ y \end{bmatrix} + B = M \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

 (1)cv2.getAffineTransform

cv2.getAffineTransform() is the function used in OpenCV to obtain the affine transformation matrix.

This function accepts three point pairs, representing three points in the source image and three points in the target image. From these point pairs, the function can calculate a 2 x 3 affine transformation matrix that maps points in the source image to points in the target image. The matrix returned by the function can be used in the cv2.warpAffine() function to perform affine transformations.
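Three point pairs are enough because each pair contributes two equations and the 2x3 matrix has six unknowns. The following NumPy sketch solves that linear system directly; affine_from_points is a hypothetical helper used for illustration, and the point coordinates are sample values.

```python
import numpy as np

def affine_from_points(src_pts, dst_pts):
    """Solve the 2x3 matrix M such that M @ [x, y, 1] = [x', y'] for three point pairs."""
    src = np.asarray(src_pts, dtype=np.float64)
    dst = np.asarray(dst_pts, dtype=np.float64)
    # Each source point becomes a row [x, y, 1]; the three points must not be collinear
    homogeneous = np.hstack([src, np.ones((3, 1))])
    # Solve homogeneous @ X = dst, then transpose X (3x2) into M (2x3)
    return np.linalg.solve(homogeneous, dst).T

src_pts = [[50, 50], [200, 50], [50, 200]]
dst_pts = [[10, 100], [200, 50], [100, 250]]
M = affine_from_points(src_pts, dst_pts)

print(M.shape)                               # (2, 3)
print(M @ np.array([50.0, 50.0, 1.0]))       # maps the first source point onto (10, 100)
```

If the three source points lie on one line, the system is singular and no unique affine matrix exists.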

(2)cv2.warpAffine()

cv2.warpAffine() is the OpenCV function that applies an affine transformation to an image. An affine transformation combines a linear transformation (rotation, scaling, shearing) with a translation. The function accepts an input image, a 2x3 affine transformation matrix, and the output image size given as (width, height). It transforms the input image with the specified matrix and returns the transformed image.

import cv2
import numpy as np

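# Define three points in the source image and the three matching points in the target image
src_pts = np.float32([[50,50], [200,50], [50,200]])
dst_pts = np.float32([[10,100], [200,50], [100,250]])

# Compute the affine transformation matrix
M = cv2.getAffineTransform(src_pts, dst_pts)

# Apply the affine transformation to the source image
img = cv2.imread('image.jpg')
img_out = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))

# Show the results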
cv2.imshow('Input', img)
cv2.imshow('Output', img_out)
cv2.waitKey(0)
cv2.destroyAllWindows()

(3)cv2.getRotationMatrix2D

cv2.getRotationMatrix2D() is the OpenCV function that builds a rotation matrix. It returns a 2x3 matrix that rotates an image around a given center by a given angle, optionally scaling it at the same time.

The function takes three parameters: the rotation center, the rotation angle and the scaling factor. The rotation center is a tuple (x, y) giving the coordinates of the center. The rotation angle is a floating-point number in degrees; positive values rotate counter-clockwise. The scaling factor is a floating-point number giving the size of the output content relative to the input.

import cv2
import numpy as np

# Read the image
img = cv2.imread('image.jpg')

# Build the rotation matrix: 45 degrees around the image center, no scaling
(h, w) = img.shape[:2]
center = (w // 2, h // 2)
angle = 45
scale = 1.0
M = cv2.getRotationMatrix2D(center, angle, scale)

# Rotate the image
img_rotated = cv2.warpAffine(img, M, (w, h))

# Show the results
cv2.imshow('Input', img)
cv2.imshow('Output', img_rotated)
cv2.waitKey(0)
cv2.destroyAllWindows()
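The 2x3 matrix produced for a rotation can be written out by hand, which shows how the center and the scale enter it. The sketch below follows the formula given in the OpenCV documentation, with alpha = scale * cos(angle) and beta = scale * sin(angle); rotation_matrix_2d and apply_affine are hypothetical helper names.

```python
import math

def rotation_matrix_2d(center, angle_deg, scale):
    """Build the 2x3 rotation matrix around `center` (angle in degrees)."""
    cx, cy = center
    alpha = scale * math.cos(math.radians(angle_deg))
    beta = scale * math.sin(math.radians(angle_deg))
    return [
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ]

def apply_affine(m, x, y):
    """Map the point (x, y) through the 2x3 matrix m."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

M = rotation_matrix_2d((100, 50), 45, 1.0)
print(apply_affine(M, 100, 50))   # the rotation center stays where it is

R = rotation_matrix_2d((0, 0), 90, 1.0)
x, y = apply_affine(R, 1, 0)
print(round(x, 6), round(y, 6))   # (1, 0) moves to (0, -1)
```

Since the image y axis points down, moving from (1, 0) to (0, -1) is a counter-clockwise turn on screen. The translation column is what keeps the chosen center fixed.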

5.Otsu’s binarization

Otsu's binarization is an automatic image binarization method. Its core idea is to split the image into two regions, foreground and background, by finding the threshold that maximizes the inter-class variance between them, thereby achieving the best binarization effect.

Specifically, the implementation process of Otsu’s binarization can be divided into the following steps:

  1. Calculate the grayscale histogram of the image and count the number of pixels at each grayscale level.

  2. Normalize the grayscale histogram to obtain the pixel occurrence probability of each grayscale level.

  3. From gray level 0 to 255, iteratively calculate the inter-class variance when each gray level is used as a threshold, and find the maximum inter-class variance and its corresponding threshold.

    Specifically, suppose the current iteration is at gray level t. The image is then divided into two parts: the foreground, made up of pixels with gray level less than or equal to t, and the background, made up of pixels with gray level greater than t. Let W1 and W2 be the proportions of the foreground and background pixels, μ1 and μ2 the average gray values of the foreground and background, and G the average gray value of the entire image. The inter-class variance var is then:

    $$var = W_1 (\mu_1 - G)^2 + W_2 (\mu_2 - G)^2 = W_1 W_2 (\mu_1 - \mu_2)^2$$

    When var takes the maximum value, the corresponding threshold is the optimal threshold.

  4. Binarize the image based on the optimal threshold.
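The steps above can be sketched in NumPy as follows. This is a direct, unoptimized illustration rather than OpenCV's implementation: otsu_threshold is a hypothetical helper, it uses the equivalent form W1 * W2 * (μ1 - μ2)^2 of the inter-class variance, and it keeps the first threshold when several give the same maximum.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the gray level t that maximizes the inter-class variance."""
    # Steps 1 and 2: histogram, normalized to probabilities
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    best_t, best_var = 0, -1.0
    # Step 3: try every gray level as the threshold
    for t in range(256):
        w1 = prob[:t + 1].sum()          # proportion of pixels <= t
        w2 = 1.0 - w1                    # proportion of pixels > t
        if w1 == 0 or w2 == 0:
            continue                     # one class is empty, no split
        mu1 = (levels[:t + 1] * prob[:t + 1]).sum() / w1
        mu2 = (levels[t + 1:] * prob[t + 1:]).sum() / w2
        var = w1 * w2 * (mu1 - mu2) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

# Synthetic image: 60 dark pixels (50) and 40 bright pixels (200)
gray = np.array([50] * 60 + [200] * 40, dtype=np.uint8).reshape(10, 10)
t = otsu_threshold(gray)
# Step 4: binarize with the threshold that was found
binary = np.where(gray > t, 255, 0).astype(np.uint8)
print(t, int((binary == 255).sum()))
```

Any threshold between the two gray levels separates this synthetic image perfectly, so exactly the 40 bright pixels end up white.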

Another explanation:

"Minimum intra-class variance" is another interpretation of Otsu's algorithm: the gray levels are split into two parts so that the weighted sum of the variances within the two parts is minimized. Minimizing this quantity is equivalent to maximizing the inter-class variance, because the two always add up to the total variance of the image.

Under this explanation, suppose again that the current iteration is at gray level t, so that the image is divided into a foreground of pixels with gray level less than or equal to t and a background of pixels with gray level greater than t. Let $w_1$ and $w_2$ be the proportions of the foreground and background pixels, μ1 and μ2 their average gray values, G the average gray value of the entire image, and $\sigma_1^2$ and $\sigma_2^2$ the gray-level variances of the foreground and background. The intra-class variance can then be expressed as:

$$\sigma_w^2 = w_1 \sigma_1^2 + w_2 \sigma_2^2$$

In OpenCV, we can use the cv2.threshold function to implement Otsu’s binarization, which has the following parameters:

  • src: Input image.
  • thresh: Manually specified threshold. Here we set it to 0: with Otsu's method enabled, the algorithm ignores this value, finds the optimal threshold itself, and returns it as retVal. (If Otsu's binarization is not used, the returned retVal is equal to the threshold that was passed in.)
  • maxval: The value assigned to pixels that exceed the threshold. Typically set to 255.
  • type: The thresholding type. For Otsu's binarization we use cv2.THRESH_BINARY+cv2.THRESH_OTSU, where cv2.THRESH_OTSU enables Otsu's automatic threshold selection.
  • dst: Output image. It does not need to be supplied here (it can be left as None), because the thresholded image is returned by the function.

import cv2

# Read the image as grayscale
img = cv2.imread('image.jpg', 0)

# Apply Otsu's binarization; ret receives the threshold that was chosen
ret, img_thresh = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)

# Show the original image and the thresholded image
cv2.imshow('Original', img)
cv2.imshow('Otsu Threshold', img_thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()
    

6.cv2.filter2D

cv2.filter2D is a function in OpenCV that applies a two-dimensional linear filter to an image, an operation usually described as convolution. Strictly speaking it computes a correlation (the kernel is not flipped), which gives the same result as convolution for symmetric kernels. The function can perform general linear filtering, including average filtering, Gaussian filtering, sharpening, etc.

dst = cv2.filter2D(src, ddepth, kernel[, dst[, anchor[, delta[, borderType]]]])

Parameter Description:

  • src: Input image.
  • ddepth: The depth of the output image. If it is -1, the depth of the output image is the same as the input image.
  • kernel: Convolution kernel, which can be a numpy array or Mat object.
  • dst: Output image, can be empty.
  • anchor: Anchor point position, the default is (-1, -1) indicating the center point.
  • delta: Optional value added to every filtered pixel before it is stored, default is 0.
  • borderType: Border extrapolation method, which can be cv2.BORDER_CONSTANT (constant filling), cv2.BORDER_REPLICATE (copy the border pixels), cv2.BORDER_REFLECT (mirror reflection at the border), etc.

The function works by using the kernel as a sliding window that traverses the image starting from the upper left corner. At each position, the pixels in the window are multiplied by the kernel weights at the corresponding positions and the products are added up; the sum is written to the pixel at the corresponding location in the output image.

import cv2
import numpy as np

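# Read the image
img = cv2.imread('test.jpg')
# 5x5 averaging kernel: every weight is 1/25
kernel = np.ones((5, 5), np.float32) / 25
# Apply the kernel; -1 keeps the depth of the input image
dst = cv2.filter2D(img, -1, kernel)
cv2.imshow('Original', img)
cv2.imshow('Filtered', dst)
cv2.waitKey(0)
cv2.destroyAllWindows()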
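The sliding-window computation can be written out explicitly for a single-channel image. The sketch below is for illustration only: correlate2d is a hypothetical helper, it is far slower than the library routine, and it assumes mirror padding that does not repeat the edge pixel.

```python
import numpy as np

def correlate2d(src, kernel):
    """Slide the kernel over a 2-D image and sum the element-wise products."""
    kh, kw = kernel.shape
    # Mirror the image at the borders so every pixel has a full window
    padded = np.pad(src.astype(np.float64),
                    ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="reflect")
    out = np.empty(src.shape, dtype=np.float64)
    for y in range(src.shape[0]):
        for x in range(src.shape[1]):
            window = padded[y:y + kh, x:x + kw]
            out[y, x] = (window * kernel).sum()   # multiply and accumulate
    return out

# A constant image is unchanged by an averaging kernel
flat = np.full((6, 6), 7.0)
print(np.allclose(correlate2d(flat, np.ones((5, 5)) / 25), 7.0))

# A single bright pixel is spread evenly over a 3x3 block by a 3x3 box filter
impulse = np.zeros((5, 5))
impulse[2, 2] = 9
print(correlate2d(impulse, np.ones((3, 3)) / 9))
```

The impulse of value 9 becomes a 3x3 block of ones, which is exactly what averaging over nine pixels should give.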

7.cv2.GaussianBlur()

dst = cv2.GaussianBlur(src, ksize, sigmaX[, dst[, sigmaY[, borderType]]])

Among them, the parameter description is as follows:

  • src: Input image, which can be a grayscale image or a color image. The data type is uint8, float32 or float64.
  • ksize: The size of the Gaussian kernel as (width, height). Both values must be positive odd numbers and may differ. If it is (0, 0), the size is calculated automatically from the sigma values.
  • sigmaX: The standard deviation of the Gaussian kernel in the horizontal direction. If it is 0, it is calculated automatically from ksize.
  • dst: Output image, with the same size and data type as the input image, optional parameter.
  • sigmaY: The standard deviation of the Gaussian kernel in the vertical direction. If it is 0, the default is the same as sigmaX. Optional parameter.
  • borderType: Boundary filling method, optional parameter.

import cv2

# Read the input image
img = cv2.imread('input.png')

# Gaussian smoothing with sigmaX = sigmaY = 3
# (sigmaY is passed by keyword because the fourth positional argument is dst)
blur1 = cv2.GaussianBlur(img, (5, 5), sigmaX=3, sigmaY=3)

# Gaussian smoothing with sigmaX = 5, sigmaY = 1
blur2 = cv2.GaussianBlur(img, (5, 5), sigmaX=5, sigmaY=1)

# Gaussian smoothing with sigmaX = 1, sigmaY = 5
blur3 = cv2.GaussianBlur(img, (5, 5), sigmaX=1, sigmaY=5)

# Show the resulting images
cv2.imshow('input', img)
cv2.imshow('output1', blur1)
cv2.imshow('output2', blur2)
cv2.imshow('output3', blur3)
cv2.waitKey(0)
cv2.destroyAllWindows()
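How ksize and the two sigma values turn into filter weights can be sketched with the standard library alone. This is an illustration, not the library's exact kernel: gaussian_kernel_1d is a hypothetical helper, the fallback rule for a non-positive sigma is the one documented for cv2.getGaussianKernel, and OpenCV may substitute precomputed kernels for small sizes.

```python
import math

def gaussian_kernel_1d(ksize, sigma):
    """Normalized 1-D Gaussian weights for an odd kernel size."""
    if sigma <= 0:
        # Documented rule for deriving sigma from the kernel size
        sigma = 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8
    center = (ksize - 1) / 2
    weights = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(ksize)]
    total = sum(weights)
    return [w / total for w in weights]   # weights sum to 1, so brightness is preserved

kernel_x = gaussian_kernel_1d(5, 5)   # wide in x: nearly flat weights
kernel_y = gaussian_kernel_1d(5, 1)   # narrow in y: weight concentrated in the center

# The 2-D kernel is the outer product of the vertical and horizontal kernels
kernel_2d = [[ky * kx for kx in kernel_x] for ky in kernel_y]

print([round(w, 3) for w in kernel_x])
print([round(w, 3) for w in kernel_y])
```

A larger sigma flattens the weights and blurs more strongly in that direction, which is why unequal sigmaX and sigmaY values produce direction-dependent blur.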
    

8.cv2.medianBlur

cv2.medianBlur is a function in the OpenCV library that performs median filtering on images. Median filtering is a nonlinear filtering technique used to remove noise from images. It is particularly useful for removing salt-and-pepper noise, which appears as random black and white pixels in an image.

This function accepts the following parameters:

  • src: The input image to be filtered.
  • ksize: The size of the square window used for filtering. It must be an odd number greater than 1, for example 3, 5 or 7.
  • dst: Filtered output image.

This function applies median filtering to the input image using a square window of size ksize x ksize and returns the filtered image. Median filtering replaces each pixel with the median of the pixel values in the neighborhood centered on it; the size of that neighborhood is set by ksize.

import cv2
import numpy as np

# Load the image as grayscale
img = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)

# Apply median filtering with a 5x5 window
filtered_img = cv2.medianBlur(img, 5)

# Show the filtered image
cv2.imshow('Filtered Image', filtered_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
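Why the median removes isolated outliers while a mean would only smear them can be seen on a tiny synthetic image. The sketch below is a slow, illustrative version: median_filter is a hypothetical helper, and it assumes border pixels are replicated for padding.

```python
import numpy as np

def median_filter(src, ksize):
    """Replace each pixel with the median of its ksize x ksize neighborhood."""
    pad = ksize // 2
    padded = np.pad(src, pad, mode="edge")    # replicate the border pixels
    out = np.empty_like(src)
    for y in range(src.shape[0]):
        for x in range(src.shape[1]):
            out[y, x] = np.median(padded[y:y + ksize, x:x + ksize])
    return out

img = np.full((5, 5), 100, dtype=np.uint8)
img[2, 2] = 255   # "salt" pixel
img[1, 3] = 0     # "pepper" pixel

clean = median_filter(img, 3)
print(clean)
```

Every 3x3 window contains at most two outliers among nine values, so the median is always 100 and both noisy pixels disappear completely.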

9.cv2.Laplacian

dst = cv2.Laplacian(src, ddepth[, dst[, ksize[, scale[, delta[, borderType]]]]])
  • src: The input image to be Laplacian filtered. It can have any number of channels, and its data type can be uint8, float32 or float64.
  • ddepth: Depth of the output image, usually set to -1 (same as the input image) or cv2.CV_64F (64-bit float). A floating-point depth keeps the negative responses that an unsigned 8-bit output would clip to 0.
  • dst: Output image with the same size and number of channels as src. If not specified, a new output image is created.
  • ksize: The aperture size used to compute the second derivatives. It must be a positive odd number (1, 3, 5, 7, etc.); the default value is 1.
  • scale: Scale factor applied to the computed Laplacian values, the default value is 1.
  • delta: Optional value added to the results before they are stored, default value is 0.
  • borderType: Border extrapolation method, such as cv2.BORDER_DEFAULT (the default), cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, etc.

Example:
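
import cv2

# Read the image
img = cv2.imread('input_image.jpg')

# Convert the image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Apply the Laplacian filter, keeping negative responses
laplacian = cv2.Laplacian(gray, cv2.CV_64F, ksize=3)

# Take absolute values and convert to uint8 so the result displays correctly
laplacian_abs = cv2.convertScaleAbs(laplacian)

# Show the output image
cv2.imshow('Laplacian Filter', laplacian_abs)
cv2.waitKey(0)
cv2.destroyAllWindows()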
    

The discrete form of the Laplacian filter can be expressed as a two-dimensional difference operator, as follows:

[0  1  0]
[1 -4  1]
[0  1  0]

This operator approximates the sum of the second derivatives in x and y at a pixel, using its four direct neighbors. In an image, the purpose of the Laplacian filter is to find the places where the pixel values change most abruptly, which are usually edges.

In OpenCV, cv2.Laplacian uses exactly this kernel when ksize is 1 (the default); for larger ksize values it adds up the second derivatives in x and y computed with Sobel kernels. In either case, each output pixel is a weighted sum of the input pixels around it, with the operator's coefficients as weights, that is, the response of the Laplacian operator at that pixel position.

It should be noted that the Laplacian filter amplifies high-frequency noise in the image, so in practical applications the image usually needs to be smoothed. This can be achieved by applying a Gaussian blur to the image before Laplacian filtering, or by adding a smoothness prior to the model.
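The response of the 3x3 operator at an edge can be worked through on a small synthetic step. The sketch below applies the kernel to interior pixels only, so no border handling is involved; laplacian_valid is a hypothetical helper name.

```python
import numpy as np

def laplacian_valid(src):
    """Apply the [0 1 0; 1 -4 1; 0 1 0] kernel to interior pixels only."""
    src = src.astype(np.float64)
    # up + down + left + right - 4 * center, written with array slices
    return (src[:-2, 1:-1] + src[2:, 1:-1]
            + src[1:-1, :-2] + src[1:-1, 2:]
            - 4 * src[1:-1, 1:-1])

# Step edge: columns 0-2 are dark (0), columns 3-5 are bright (10)
step = np.zeros((5, 6), dtype=np.uint8)
step[:, 3:] = 10

response = laplacian_valid(step)
print(response[0])   # one interior row, covering columns 1 to 4
```

The row reads 0, 10, -10, 0: the response is zero in flat regions, positive on the dark side of the edge and negative on the bright side. The sign change marks the edge position, and the negative value is why a floating-point output depth is needed.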

10.cv2.Sobel

cv2.Sobel is one of the functions used in OpenCV to compute image gradients. Based on the Sobel operator, it can compute first, second or mixed derivatives of the image in the x and y directions.

Specifically, the parameters of the cv2.Sobel function include:

  • Input image: It can be a single-channel grayscale image or a multi-channel color image.
  • Depth of the output image: usually set to cv2.CV_64F, so that the output is floating point and negative gradient values are preserved.
  • Derivative order in the x direction and y direction: each can be set to 0, 1 or 2. For example, a setting of 1 computes the first derivative, and a setting of 2 computes the second derivative.
  • The kernel size of the Sobel operator: specified through the ksize parameter. Usually, a 3x3 or 5x5 kernel is sufficient. The default value is 3. If it is set to -1, the 3x3 Scharr kernel is used instead of the Sobel kernel.
  • Scale factor and delta value: The scaling factor and offset of the output image can be controlled through the scale and delta parameters. By default, scale is 1 and delta is 0, which means no scaling or offset.

import cv2
import numpy as np

# Read the input image as grayscale
img = cv2.imread('lena.jpg', cv2.IMREAD_GRAYSCALE)

# Compute the first derivatives in the x and y directions
sobelx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
sobely = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)

# Take absolute values and convert the results to uint8 images
sobelx = cv2.convertScaleAbs(sobelx)
sobely = cv2.convertScaleAbs(sobely)

# Blend the x and y derivatives with equal weights to get the gradient image
sobel = cv2.addWeighted(sobelx, 0.5, sobely, 0.5, 0)

# Show the resulting image
cv2.imshow('Sobel', sobel)
cv2.waitKey(0)
cv2.destroyAllWindows()
    

In practical applications, we can compute the derivative in only the x direction or only the y direction as needed. The x derivative, cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3), where 1 requests the first derivative in x and 0 means no derivative in y, responds to intensity changes along the horizontal direction, so it highlights vertical edges. To respond to changes along the vertical direction, which highlights horizontal edges, swap the two arguments: cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3).
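The direction sensitivity of the two derivatives can be checked on a synthetic vertical edge using the standard 3x3 Sobel kernels. The sketch below handles interior pixels only; correlate_valid is a hypothetical helper, not an OpenCV function.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T   # the y kernel is the transpose of the x kernel

def correlate_valid(src, kernel):
    """Apply a 3x3 kernel to interior pixels only (no border handling)."""
    src = src.astype(np.float64)
    h, w = src.shape
    out = np.zeros((h - 2, w - 2))
    for dy in range(3):
        for dx in range(3):
            # Add the shifted image, weighted by one kernel coefficient
            out += kernel[dy, dx] * src[dy:dy + h - 2, dx:dx + w - 2]
    return out

# Vertical edge: dark on the left (0), bright on the right (10)
vertical_edge = np.zeros((5, 6), dtype=np.uint8)
vertical_edge[:, 3:] = 10

gx = correlate_valid(vertical_edge, SOBEL_X)
gy = correlate_valid(vertical_edge, SOBEL_Y)
print(gx[0])              # strong response on both sides of the edge
print(np.abs(gy).max())   # no response: nothing changes along y
```

The x derivative gives 40 next to the edge and 0 elsewhere, while the y derivative is zero everywhere, because the image does not change from row to row.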

10.cv2.Scharr

In cv2.Sobel(), if ksize=-1, the 3x3 Scharr filter is used. It gives more accurate results than the 3x3 Sobel filter at the same speed, so the Scharr filter is worth preferring whenever a 3x3 kernel is needed.

cv2.Scharr() is used in much the same way as cv2.Sobel() and accepts the following parameters:

  • src: Input image.
  • ddepth: The depth of the output image, you can use -1 to mean the same as the input image.
  • dx: Indicates the order of the derivative to be calculated in the x direction.
  • dy: Indicates the order of the derivative to be calculated in the y direction.
  • scale: Scaling factor, which can be used to scale the pixel values ​​of the output image.
  • delta: Offset, which can be used to adjust the brightness of the output image.
  • borderType: Indicates the processing method of image boundaries. You can refer to the parameters in cv2.Sobel().

Here is a sample code for edge detection using the cv2.Scharr() function:

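import cv2

# Read the image as grayscale
img = cv2.imread('image.jpg', 0)
# First derivatives in x and y using the Scharr operator
scharrx = cv2.Scharr(img, cv2.CV_64F, 1, 0)
scharry = cv2.Scharr(img, cv2.CV_64F, 0, 1)
# Take absolute values and convert to uint8 so the result displays correctly
scharrx = cv2.convertScaleAbs(scharrx)
scharry = cv2.convertScaleAbs(scharry)
# Blend the two derivative images with equal weights
scharr = cv2.addWeighted(scharrx, 0.5, scharry, 0.5, 0)

cv2.imshow('Scharr', scharr)
cv2.waitKey(0)
cv2.destroyAllWindows()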

11.cv2.findContours

cv2.findContours is a function in OpenCV used to find contours in binary images. This function returns information about all contours in the image, including the coordinates and hierarchical relationship of each contour. Its syntax is as follows:

contours, hierarchy = cv2.findContours(image, mode, method[, contours[, hierarchy[, offset]]])

Among them, the parameter description is as follows:

  • image: The input binary image must be an 8-bit single-channel image (such as grayscale image).

  • mode: Contour retrieval mode, which specifies how contours are retrieved. The following four modes are available:

    1. cv2.RETR_EXTERNAL: Detect only the outermost contours.

    2. cv2.RETR_LIST: Detect all contours, but do not establish a hierarchical relationship between them.

    3. cv2.RETR_CCOMP: Detect all contours and organize them into a two-level hierarchy. The top level holds the outer boundaries of the objects, and the second level holds the boundaries of the holes inside them.

    4. cv2.RETR_TREE: Detect all contours and reconstruct the full hierarchy of nested contours.

  • method: Contour approximation method, which specifies how each contour is stored. There are three methods available:

    1. cv2.CHAIN_APPROX_NONE: Store all boundary points.

    2. cv2.CHAIN_APPROX_SIMPLE: Compress horizontal, vertical and diagonal segments and store only their endpoints.

    3. cv2.CHAIN_APPROX_TC89_L1 or cv2.CHAIN_APPROX_TC89_KCOS: Apply the Teh-Chin chain approximation algorithm, which outputs fewer points, but the result may be less accurate.

  • contours (optional): The detected contours. This is a list of contours, each stored as an array of the point coordinates along that contour. If this parameter is not required, it can be omitted.

  • hierarchy (optional): Hierarchy information for the contours. For each contour it holds four indices, in the order next contour, previous contour, first child contour, and parent contour; an index of -1 means there is no such contour. If this parameter is not required, it can be omitted.

  • offset (optional): An offset added to the coordinates of every contour point. If this parameter is not required, it can be omitted.

This function returns two values: the list of detected contours, contours, and the hierarchy information, hierarchy. (This is the behavior in OpenCV 4.x; OpenCV 3.x returned three values, with the image first.) This information can be used for further processing of the contours, such as drawing, cropping, etc.

import cv2
import numpy as np

# load an image
image = cv2.imread('image.png')

# convert the image to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# threshold the image to obtain a binary image
ret, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

# find the contours
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

# draw the contours on the original image
cv2.drawContours(image, contours, -1, (0, 255, 0), 3)

# display the image
cv2.imshow('image', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

The code above first loads an image using the cv2.imread function and then converts it to a grayscale image using the cv2.cvtColor function. Next, use the cv2.threshold function to threshold the image to obtain a binary image. This function sets all pixels with a value greater than the threshold to white (255) and all pixels with a value less than or equal to the threshold to black (0). In this example, the threshold is set to 127, which means that all pixels with a value greater than 127 will be set to white.

Then use the cv2.findContours function to find contours in the image. This function takes a binary image as input and returns a list of contours and a hierarchy. The contour list contains all found contours, each contour is a Numpy array containing the coordinates of all points of the contour. Hierarchy describes the relationship between contours. In this example, the cv2.RETR_TREE mode is used, which means that all contours will be found and a complete contour hierarchy will be generated. The contour is approximated using the cv2.CHAIN_APPROX_SIMPLE method, which means that only the endpoints of horizontal, vertical and diagonal segments are stored, thus reducing the number of stored contour points.

Finally, use the cv2.drawContours function to draw all the contours on the original image. The function takes an image, a list of contours, the index of the contour to draw (-1 means drawing all contours), the color of the contour, and the line width as arguments. In this example, the contours are drawn as green lines with a line width of 3. The cv2.imshow function then displays the result, cv2.waitKey waits for the user to press any key, and cv2.destroyAllWindows closes all windows.

 


Origin blog.csdn.net/weixin_63062756/article/details/130472181