OpenCV - Canny edge detection algorithm

Problem Description

Image segmentation is the process of subdividing a digital image into multiple sub-regions and is widely used in computer vision and machine vision. Its purpose is to simplify or change the representation of an image so that it is easier to understand and analyze. Common image segmentation methods include thresholding, clustering, edge detection, and region growing. Solving a segmentation problem well usually requires combining these methods with domain knowledge.

Edge detection is a commonly used image segmentation method that works by extracting the discontinuities in an image. Common edge detection operators include the difference operator, Roberts operator, Sobel operator, Prewitt operator, LoG (Laplacian of Gaussian) operator, and Canny operator. The Canny operator was proposed by John F. Canny in 1986 and is still regarded as one of the most complete edge detection algorithms. Many common image processing tools (such as MATLAB and OpenCV) provide a built-in Canny API.

Canny edge detection

Canny edge detection is a classic edge detection algorithm, which was proposed by John F. Canny in 1986. It is widely used in the fields of computer vision and image processing to detect edge information in images.

Paper information:

Title: A Computational Approach to Edge Detection

Author: John F. Canny

Year of publication: 1986

The steps of the Canny edge detection algorithm

The goal of the Canny edge detection algorithm is to find strong edges in the image and try to eliminate noise and weak edges. The steps of the algorithm are as follows:

Noise suppression: First, the image is smoothed with a Gaussian filter to reduce the effect of noise.
Gradient calculation: Then, the gradient strength and direction are calculated for each pixel, for example by applying the Sobel operator.
Non-maximum suppression: Next, non-maximum suppression is applied to the gradient strength image to thin the edges and remove spurious edge responses.
Dual thresholding: Then, according to the chosen high and low thresholds, the pixels are divided into strong edge, weak edge, and non-edge pixels.
Edge connection: Finally, strong edge pixels are connected with adjacent weak edge pixels to form complete edges.

The Canny edge detection algorithm remains widely used in image processing because it can detect subtle edges while staying robust to noise.

1. Apply Gaussian filtering to remove image noise

The Gaussian filter is a smoothing filter that reduces the difference between pixel values by performing convolution operations on the image, thereby reducing the influence of noise.

The kernel (or template) of the Gaussian filter is a two-dimensional weight matrix, where the weight values are calculated by the Gaussian function. The size and standard deviation (σ) of this weight matrix are two key parameters of the Gaussian filter, which determine the smoothness of the filter.

A two-dimensional Gaussian function can be expressed as:

$G(x, y) = \frac{1}{2\pi \sigma^{2}} \exp\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right)$

Here, G(x, y) is the value of the two-dimensional Gaussian function at point (x, y), and σ is the standard deviation.

For each pixel in the image, applying a Gaussian filter can be achieved with the following convolution operation:

I' = G * I

Here, I' is the filtered image, G is the Gaussian filter kernel, and I is the original image.

Specifically, for each pixel, the Gaussian filter kernel is convolved with the neighborhood centered on the pixel, and the weighted average of the pixel values in the neighborhood is calculated as the value of the pixel in the filtered image. In this way, the Gaussian filter blurs the image and reduces the effect of noise.

Note that the kernel size and standard deviation of the Gaussian filter must be chosen for the specific application. A larger kernel with a larger standard deviation smooths more strongly but may blur edge information. Conversely, a smaller kernel with a smaller standard deviation preserves more detail but may not suppress noise effectively.

The standard deviation σ of the 2D Gaussian filter is the same in the horizontal and vertical directions, which keeps the filter isotropic. The appropriate value depends on the noise level and the image characteristics: a larger standard deviation smooths more strongly but may blur edges and details.
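As a rough illustration of the formula and the convolution I' = G * I above, the following NumPy sketch samples G(x, y) on a small grid and applies it to an image. The function names `gaussian_kernel` and `smooth`, the 5×5 size, σ = 1.4, and the border replication are assumptions made for this example, not values taken from OpenCV.

```python
import numpy as np

def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Sample G(x, y) on a size x size grid centred on 0 and normalise it."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a centre pixel")
    half = size // 2
    # Integer offsets from the kernel centre, e.g. -2..2 for size 5.
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    kernel = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    # Renormalise so the weights sum to 1 and overall brightness is preserved.
    return kernel / kernel.sum()

def smooth(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Compute I' = G * I, replicating border pixels (an assumption here)."""
    h, w = image.shape
    half = kernel.shape[0] // 2
    padded = np.pad(image.astype(np.float64), half, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    # Accumulate one shifted, weighted copy of the image per kernel entry.
    # The kernel is symmetric, so correlation and convolution coincide.
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

kernel = gaussian_kernel(5, 1.4)
noisy = np.zeros((9, 9))
noisy[4, 4] = 100.0  # a single bright "noise" pixel
smoothed = smooth(noisy, kernel)
```

The isolated bright pixel is spread over its neighborhood, so its peak drops while the total intensity stays the same. In practice, OpenCV's cv2.GaussianBlur() performs this step much faster.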

2. Use the Sobel operator to calculate the pixel gradient

The Sobel operator is a commonly used discrete differential operator, which can calculate the gradient strength and direction of each pixel in the image. In Canny edge detection, the Sobel operator is usually used to calculate the gradient of the image in the horizontal and vertical directions.

For grayscale images, we can calculate the gradient of pixels through the following Sobel operator:

Gradient in the horizontal direction (Gx):

      -1  0  1
Gx =  -2  0  2
      -1  0  1

Gradient in the vertical direction (Gy):

      -1 -2 -1
Gy =   0  0  0
       1  2  1

The following are the specific steps to calculate the pixel gradient using the Sobel operator:

Convert the image to grayscale (if not grayscale).
Apply the Sobel operator in the horizontal direction (Gx) and the Sobel operator in the vertical direction (Gy) to the image.
Compute the gradient strength and direction for each pixel:
```
G = sqrt(Gx^2 + Gy^2)       (gradient strength)
θ = arctan(Gy / Gx)         (gradient direction)
```
Here, G is the gradient strength and θ is the gradient direction.

In this way, we can get the gradient strength and direction of each pixel in the image. This gradient information will be used in subsequent steps of the Canny edge detection algorithm to detect and connect edges.
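The steps above can be sketched in NumPy as follows. This is a minimal illustration rather than OpenCV's implementation: the helper names `correlate3x3` and `sobel_gradient` are made up for this example, borders are replicated, and `np.arctan2` is used in place of arctan(Gy / Gx) so that pixels with Gx = 0 do not cause a division by zero.

```python
import numpy as np

# The two Sobel kernels given above.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=np.float64)

def correlate3x3(image, kernel):
    """Slide a 3x3 kernel over the image (no flip), replicating borders."""
    h, w = image.shape
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def sobel_gradient(image):
    """Return (gradient strength, gradient direction in degrees)."""
    gx = correlate3x3(image, SOBEL_X)
    gy = correlate3x3(image, SOBEL_Y)
    magnitude = np.hypot(gx, gy)                # sqrt(Gx^2 + Gy^2)
    direction = np.degrees(np.arctan2(gy, gx))  # quadrant-aware arctan
    return magnitude, direction

# Vertical step edge: dark left half, bright right half.
step = np.zeros((5, 6))
step[:, 3:] = 10.0
magnitude, direction = sobel_gradient(step)
```

For this step edge the response is concentrated in the two columns next to the intensity jump, and the direction there is 0°, that is, the gradient points horizontally across the edge. OpenCV exposes the same operator as cv2.Sobel().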

3. Non-maximum pixel gradient suppression

In this step, we examine the neighbors of each pixel along its gradient direction. A pixel is kept only if its gradient strength is the largest among them; otherwise it is suppressed to zero.

To describe non-maximum suppression more concretely, assume that the gradient direction is discretized into four main directions: 0° (horizontal direction), 45° (diagonal direction), 90° (vertical direction) and 135° (diagonal direction). For each pixel, we compare its gradient strength with that of the two neighboring pixels along the gradient direction.

If the gradient direction of a pixel is 0° (horizontal direction), we compare its gradient strength with the gradient strength of its two neighbors to the left and right.

Suppose the current pixel point is (x, y), its gradient direction is 0°, and its gradient strength is G(x, y). We compare G(x, y) with the gradient strengths of two neighboring pixels: G(x-1, y) and G(x+1, y).

If G(x, y) is the largest of the three values, we keep the pixel; otherwise we suppress it to zero.

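More accurate implementations do not quantize the direction. Instead, they compare G(x, y) with gradient strengths obtained by linear interpolation along the actual gradient direction θ. For example, when θ lies between 0° and 45° (with y increasing downward), the gradient line passes between the neighbors (x+1, y) and (x+1, y+1), and the interpolation weight w follows from the direction:

$w = \tan\theta = \frac{|G_y|}{|G_x|}$

The interpolated gradient strengths on the two sides of the pixel are $G_{+} = w \cdot G(x+1, y+1) + (1 - w) \cdot G(x+1, y)$ and $G_{-} = w \cdot G(x-1, y-1) + (1 - w) \cdot G(x-1, y)$. The conditions under which we preserve the pixel value are then:

$G(x, y) \geq G_{+}$ and $G(x, y) \geq G_{-}$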

The same procedure is applied to the other orientations (45°, 90°, and 135°), each time using the pair of neighbors that lies along that direction.
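The quantized four-direction comparison can be sketched as below. This is an illustrative version only: the function name `non_max_suppression` is made up, rows are assumed to increase downward, border pixels are simply left at zero, and ties are broken with a strict comparison on one side so that a two-pixel plateau is thinned to a single pixel.

```python
import numpy as np

def non_max_suppression(magnitude, direction_deg):
    """Keep a pixel only if it is a local maximum along its gradient direction."""
    h, w = magnitude.shape
    out = np.zeros_like(magnitude)
    # Fold directions into [0, 180): opposite directions share the same neighbours.
    angle = np.mod(direction_deg, 180.0)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            a = angle[y, x]
            if a < 22.5 or a >= 157.5:   # about 0 degrees: left / right
                n1, n2 = magnitude[y, x - 1], magnitude[y, x + 1]
            elif a < 67.5:               # about 45 degrees
                n1, n2 = magnitude[y - 1, x - 1], magnitude[y + 1, x + 1]
            elif a < 112.5:              # about 90 degrees: up / down
                n1, n2 = magnitude[y - 1, x], magnitude[y + 1, x]
            else:                        # about 135 degrees
                n1, n2 = magnitude[y - 1, x + 1], magnitude[y + 1, x - 1]
            # Tie-break (assumption of this sketch): ">=" on one side, ">" on the other.
            if magnitude[y, x] >= n1 and magnitude[y, x] > n2:
                out[y, x] = magnitude[y, x]
    return out

# A blurry vertical edge: the response is three pixels wide in every row.
magnitude = np.tile([0.0, 2.0, 5.0, 3.0, 0.0], (5, 1))
direction = np.zeros((5, 5))  # gradient points horizontally everywhere
thin = non_max_suppression(magnitude, direction)
```

After suppression only the column holding the value 5 survives in the interior rows, so the three-pixel-wide response becomes a one-pixel-wide edge.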

4. Threshold hysteresis processing

After the above steps, the strong edges of the image are already present in the edge image. However, the edge image may also contain spurious edges. These may be produced by real image content or by noise, and those caused by noise must be removed.

In this step, we use dual thresholding to classify edge pixels by their strength.

According to the chosen high and low thresholds, the pixels of the gradient strength image are divided into three categories: strong edges, weak edges, and non-edges.
Strong edge pixels have a gradient strength above the high threshold.
Non-edge pixels have a gradient strength below the low threshold.
Weak edge pixels have a gradient strength between the two thresholds.
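The classification can be written in a few lines of NumPy. The label values, the function name `double_threshold`, and the thresholds 50 and 100 are arbitrary choices for this sketch. How values exactly equal to a threshold are treated is also an assumption: here a value equal to the high threshold counts as strong and a value equal to the low threshold counts as weak.

```python
import numpy as np

NON_EDGE, WEAK, STRONG = 0, 1, 2  # labels chosen for this sketch

def double_threshold(magnitude, low, high):
    """Label each pixel as non-edge, weak edge, or strong edge."""
    labels = np.full(magnitude.shape, NON_EDGE, dtype=np.uint8)
    labels[magnitude >= low] = WEAK      # at or above the low threshold
    labels[magnitude >= high] = STRONG   # at or above the high threshold
    return labels

magnitude = np.array([[10.0, 60.0, 120.0],
                      [40.0, 99.0, 100.0]])
labels = double_threshold(magnitude, low=50.0, high=100.0)
```

Applying the high threshold after the low one lets strong pixels overwrite the weak label, so no explicit "between the two thresholds" test is needed.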

5. Edge connections

In the last step, an edge connection algorithm links strong edge pixels with adjacent weak edge pixels to form complete edges. In general, a weak edge pixel is kept as part of an edge if it is connected to at least one strong edge pixel in its 8-neighborhood; otherwise it is discarded. This process can be implemented recursively or iteratively.
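An iterative version of this step might look like the following sketch, which uses a queue instead of recursion. The name `link_edges` is hypothetical, and the sketch assumes that a weak pixel reached through other accepted weak pixels is also kept, which is how edge tracking is commonly implemented.

```python
from collections import deque
import numpy as np

def link_edges(strong, weak):
    """Grow edges from strong pixels through 8-connected weak pixels."""
    h, w = strong.shape
    edges = strong.copy()
    # Start the search from every strong pixel.
    queue = deque(zip(*np.nonzero(strong)))
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < h and 0 <= nx < w
                if inside and weak[ny, nx] and not edges[ny, nx]:
                    edges[ny, nx] = True     # promote the weak pixel
                    queue.append((ny, nx))   # and continue from it
    return edges

strong = np.zeros((4, 6), dtype=bool)
weak = np.zeros((4, 6), dtype=bool)
strong[1, 0] = True              # one strong pixel
weak[1, 1] = weak[1, 2] = True   # weak chain attached to it
weak[3, 5] = True                # isolated weak pixel
edges = link_edges(strong, weak)
```

The weak chain next to the strong pixel is kept, while the isolated weak pixel is dropped because no path of edge pixels leads to it.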

Program flow chart

Image link: https://pic4.zhimg.com/80/v2-7400b8669ef4c750aeff27ed699ba7c7_720w.webp

Canny function and use

OpenCV provides the function cv2.Canny() to implement Canny edge detection, and its syntax is as follows:

edges = cv2.Canny(image, threshold1, threshold2[, apertureSize[, L2gradient]])

where:

edges is the calculated edge image.
image is an 8-bit input image.
threshold1 is the first threshold for the hysteresis procedure.
threshold2 is the second threshold for the hysteresis procedure. The smaller of the two thresholds is used for edge linking, and the larger one is used to find the initial strong edge segments.
apertureSize is the aperture size of the Sobel operator. Its default value is 3.
L2gradient is the flag that selects how the gradient magnitude of the image is calculated. Its default value is False. If True, the more accurate L2 norm is used (the square root of the sum of the squared derivatives in both directions); otherwise the L1 norm is used (the sum of the absolute values of the two derivatives).

Example:
Use the function cv2.Canny() to get the edges of an image, trying different values of threshold1 and threshold2.

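import cv2

o = cv2.imread("lena.bmp", cv2.IMREAD_GRAYSCALE)  # load as 8-bit grayscale
if o is None:  # imread returns None when the file cannot be read
    raise FileNotFoundError("lena.bmp")
r1 = cv2.Canny(o, 128, 200)  # higher thresholds
r2 = cv2.Canny(o, 32, 128)   # lower thresholds
cv2.imshow("original", o)
cv2.imshow("result1", r1)
cv2.imshow("result2", r2)
cv2.waitKey(0)  # wait for a key press before closing the windows
cv2.destroyAllWindows()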

Running result:

The results show that when the parameters threshold1 and threshold2 of cv2.Canny() are smaller, more edge information is captured, because more pixels pass the two thresholds.