Canny edge detection algorithm: Python implementation (with code)


Abstract: The Canny edge detection algorithm was proposed by computer scientist John F. Canny in 1986. Besides the algorithm itself, Canny's work provides a theory of edge detection that explains how to carry it out in stages. The Canny detection algorithm consists of the following stages:

  • Image grayscale
  • Gaussian Blur
  • Image gradient, gradient magnitude, gradient direction calculation
  • NMS (Non-Maximum Suppression)
  • Boundary Selection with Dual Thresholds

1. Calling OpenCV for Canny edge detection

If you only want to apply Canny to get the edges of a picture, there is no need to read about its principles and implementation, because Python's OpenCV library already provides a function for this. Let's use an example to show how to use OpenCV for Canny edge detection:

The original image and the result of Canny edge detection with OpenCV are shown below:

[Figure: original image and its Canny edge map]

Here is the code implementation:

import cv2  # import the OpenCV library

# Read the image
img = cv2.imread("images/2007_000032.jpg")
# Run Canny edge detection
edge = cv2.Canny(img, 50, 150)
# Save the result
cv2.imwrite('test.jpg', edge)

The key to these four lines of code is the cv2.Canny function, whose parameters are explained in detail below. The prototype of the Canny function in OpenCV-Python is:

cv2.Canny(image, threshold1, threshold2[, edges[, apertureSize[, L2gradient]]])

Required parameters:

  • The first parameter is the image to be processed, an 8-bit image (usually a single-channel grayscale image);
  • The second parameter is threshold 1;
  • The third parameter is threshold 2.

Of the two, the larger threshold 2 is used to detect the obvious edges in the image. Edges found this way are usually incomplete and broken into segments, so the smaller threshold 1 is used to link these discontinuous edges.

Among the optional parameters, apertureSize is the aperture size of the Sobel operator (3 by default). L2gradient is a Boolean value (False by default). If it is True, the more accurate L2 norm is used to compute the gradient magnitude (the square root of the sum of the squared derivatives in the two directions); otherwise the L1 norm is used (the sum of the absolute values of the derivatives in the two directions).

At this point, you can actually apply canny edge detection to your project. Of course, if you want to understand the principle of canny edge detection, please continue reading.

2. Image grayscale

When we only care about the boundaries in a picture, a single-channel image provides enough information to detect them. Therefore, we can convert 3-channel R, G, B images, and even higher-dimensional hyperspectral remote sensing images, to grayscale. Grayscale conversion is in effect a dimensionality reduction that removes redundant data and lowers the computational cost. The following method converts an RGB image to grayscale:

import cv2
import numpy as np

# Grayscale conversion
def gray(self, img_path):
	"""
	Formula:
	Gray(i,j) = [R(i,j) + G(i,j) + B(i,j)] / 3
	or:
	Gray(i,j) = 0.299 * R(i,j) + 0.587 * G(i,j) + 0.114 * B(i,j)
	"""
	
	# Read the image (cv2.imread returns the channels in BGR order)
	img = cv2.imread(img_path)
	# Convert BGR to RGB
	img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
	# Weighted sum of the R, G and B channels
	img_gray = np.dot(img_rgb[...,:3], [0.299, 0.587, 0.114])
	return img_gray
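
To make the difference between the two formulas in the docstring concrete, here is a small sketch on a made-up 1x3 image holding one pure red, one pure green and one pure blue pixel. The array and variable names are illustrative only.

```python
import numpy as np

# A tiny 1x3 RGB "image": one pure red, one pure green, one pure blue pixel.
rgb = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.float64)

# Plain average: every channel contributes equally.
gray_mean = rgb.mean(axis=-1)

# Weighted sum: green counts most and blue least.
weights = np.array([0.299, 0.587, 0.114])
gray_weighted = rgb @ weights

print(gray_mean)      # all three pixels map to 85
print(gray_weighted)  # red, green and blue map to different gray levels
```

With the plain average the three colors become indistinguishable, while the weighted sum keeps them apart.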

3. Gaussian blur processing

Gaussian blur denoises the grayscale image. Mathematically, Gaussian blurring is the convolution of the image with a normal (Gaussian) distribution. Before Gaussian filtering, a Gaussian filter (kernel) must be obtained first. How? The Gaussian function is discretized: the horizontal and vertical coordinate indices of each filter element are substituted into the Gaussian function to obtain the corresponding value. Filters of different sizes have different values. The following are the formulas of the two-dimensional Gaussian function and of the (2k+1)x(2k+1) filter, where 1 ≤ i, j ≤ 2k+1:

$$G(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)$$

$$H_{ij} = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{(i-(k+1))^2 + (j-(k+1))^2}{2\sigma^2}\right)$$

A commonly used Gaussian filter is 5x5 with σ = 1.4. The following code implements the 5x5 Gaussian filter:

import math
import numpy as np

# Denoise with a 5x5 Gaussian filter
def smooth(self, img_gray):
	# Build the Gaussian filter
	"""
	For a (2k+1)x(2k+1) Gaussian filter, each element (1-based i, j) is:
	H[i, j] = (1/(2*pi*sigma**2)) * exp(-((i-k-1)**2 + (j-k-1)**2) / (2*sigma**2))
	"""
	sigma1 = sigma2 = 1.4
	gau_sum = 0
	gaussian = np.zeros([5, 5])
	for i in range(5):
		for j in range(5):
			# i and j are 0-based here, so the kernel center is at index 2
			gaussian[i, j] = math.exp((-1/(2*sigma1*sigma2))*(np.square(i-2)+ np.square(j-2)))/(2*math.pi*sigma1*sigma2)
			gau_sum = gau_sum + gaussian[i, j]
	
	# Normalize so that the weights sum to 1
	gaussian = gaussian / gau_sum
	
	# Gaussian filtering (no padding, so the output shrinks by 4 in each dimension)
	W, H = img_gray.shape
	new_gray = np.zeros([W-4, H-4])
	
	for i in range(W-4):
		for j in range(H-4):
			new_gray[i, j] = np.sum(img_gray[i:i+5, j:j+5] * gaussian)
	
	return new_gray
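
The kernel can also be built without explicit loops. The following is a minimal sketch, assuming a hypothetical helper named gaussian_kernel that is not part of the code above; it uses offsets from the kernel center, so the center element always gets the largest weight.

```python
import numpy as np

def gaussian_kernel(k=2, sigma=1.4):
    """Return a normalized (2k+1)x(2k+1) Gaussian kernel (hypothetical helper)."""
    # Offsets from the kernel center: -k, ..., 0, ..., k
    ax = np.arange(-k, k + 1)
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    # Normalizing makes the weights sum to 1, so the constant factor
    # 1 / (2 * pi * sigma^2) cancels out and can be left out.
    return kernel / kernel.sum()

kernel = gaussian_kernel()
print(kernel.shape)
print(np.round(kernel, 4))
```

A quick sanity check on any such kernel is that it is symmetric, sums to 1, and peaks at the center.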

4. Calculation of image gradient, gradient magnitude and gradient direction

This step is crucial. Intuitively, pixel values change sharply near a boundary in an image, while most pixel values inside an object are similar. We can therefore compute the difference between the current pixel and its neighboring pixels to decide whether the pixel lies inside an object or on its boundary. This difference is called the image gradient. The gradient magnitude and gradient direction are computed from the image gradient.
Specifically, we use the first derivative to compute the gradient:

$$\frac{\partial f}{\partial x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$

In practice, this formula amounts to subtracting the current pixel from the next pixel, with Δx = 1.

The gradient has a component in the x direction and a component in the y direction, written g_x and g_y here (dx and dy in the code). The gradient magnitude is the length of the vector (g_x, g_y):

$$M = \sqrt{g_x^2 + g_y^2}$$

Since the gradient is a vector, we also need to calculate its direction:

$$\theta = \arctan\left(\frac{g_y}{g_x}\right)$$

We use the following code to achieve this:

import math
import numpy as np

# Compute the gradients, the gradient magnitude and the gradient direction
def gradients(self, new_gray):
	"""
	:type: the smoothed image
	:rtype:
		dx: gradient in the x direction
		dy: gradient in the y direction
		M: gradient magnitude
		theta: gradient direction
	"""
	W, H = new_gray.shape
	dx = np.zeros([W-1, H-1])
	dy = np.zeros([W-1, H-1])
	M = np.zeros([W-1, H-1])
	theta = np.zeros([W-1, H-1])
	
	for i in range(W-1):
		for j in range(H-1):
			# x runs along the columns (second index) and y along the rows
			# (first index), the convention the NMS step below relies on
			dx[i, j] = new_gray[i, j+1] - new_gray[i, j]
			dy[i, j] = new_gray[i+1, j] - new_gray[i, j]
			# The gradient magnitude serves as the edge strength
			M[i, j] = np.sqrt(np.square(dx[i, j]) + np.square(dy[i, j]))
			# theta = arctan(dy/dx); the small constant avoids division by zero
			theta[i, j] = math.atan(dy[i, j] / (dx[i, j] + 0.000000001))
	return dx, dy, M, theta
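
The same forward differences can be computed with array slicing instead of nested loops. The sketch below is only an illustration on a made-up image with a single vertical step edge; the names d_rows and d_cols are chosen here to avoid any ambiguity about which array axis each difference runs along.

```python
import numpy as np

# Toy image with a vertical step edge: left half dark, right half bright.
img = np.zeros((5, 6))
img[:, 3:] = 10.0

# Forward differences, cropped to a common (rows-1, cols-1) shape.
d_rows = img[1:, :-1] - img[:-1, :-1]   # change between vertically adjacent pixels
d_cols = img[:-1, 1:] - img[:-1, :-1]   # change between horizontally adjacent pixels

# Gradient magnitude: sqrt(d_rows**2 + d_cols**2), element by element.
magnitude = np.hypot(d_rows, d_cols)

print(magnitude)
```

Only the column where the intensity jumps has a non-zero magnitude, which is exactly the boundary between the dark and bright halves.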

The computed gradient magnitude (M in the function's return value) already gives the boundaries of the image, as shown below:
[Figure: gradient magnitude M of the example image]

However, it's easy to see that there are two problems with this edge:

  • thicker edges;
  • Lots of choppy edges.

These two problems are addressed by the next two steps: NMS and double-threshold boundary selection.

5. NMS (non-maximum suppression)

Ideally, the resulting edges should be very thin, so non-maximum suppression is performed to thin them. The principle is simple: traverse all points of the gradient magnitude matrix and keep only the pixels that are local maxima along the gradient direction. In the figure below, the black and gray cells represent the boundary. NMS finds the local maxima (the black cells) and sets the values at the other positions (the gray cells) to 0.

[Figure: boundary pixels before NMS, with the local maxima in black and the suppressed pixels in gray]
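
The thinning effect is easiest to see in one dimension. The sketch below uses made-up gradient magnitudes sampled across a blurred edge and keeps only the sample that is a local maximum; it is a simplified illustration, not the two-dimensional procedure described next.

```python
import numpy as np

# Gradient magnitudes sampled across a blurred edge (made-up numbers).
profile = np.array([0.0, 1.0, 3.0, 5.0, 3.0, 1.0, 0.0])

thinned = np.zeros_like(profile)
for i in range(1, len(profile) - 1):
    # Keep a sample only if it is non-zero and at least as large as both neighbors.
    if profile[i] > 0 and profile[i] >= profile[i - 1] and profile[i] >= profile[i + 1]:
        thinned[i] = profile[i]

print(thinned)  # [0. 0. 0. 5. 0. 0. 0.]
```

The five-sample-wide response collapses to a single sample at the peak, which is what NMS does along the gradient direction of every pixel.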

Now for the details of NMS. NMS works in the 8-neighborhood of a pixel: up, down, left, right, upper left, lower left, upper right and lower right. The current point does not need to be compared with all eight neighbors; it is compared only with the points along its gradient direction. This is easy to understand: the current value only needs to be a local maximum on the edge it belongs to, not on other edges. The figure below shows the 8 neighboring points.

[Figure: a pixel and its 8-neighborhood]

NMS looks for local maxima, so the gradient magnitude of the current pixel must be compared with the magnitudes along its gradient direction. Let g1, g2, g3 and g4 be four points in the 8-neighborhood of C, and let the blue line in the figure be the gradient direction of C. If C is a local maximum, its gradient magnitude is greater than the gradient magnitudes at the two points where the gradient direction line intersects g1g2 and g4g3, that is, at dTemp1 and dTemp2. However, dTemp1 and dTemp2 are not integer pixel positions but sub-pixels, that is, points lying between two physical pixels. So how do we obtain the gradient magnitude of a sub-pixel? Linear interpolation can be used: compute the weight of dTemp1 between g1 and g2, and then interpolate its gradient magnitude. The calculation is as follows:

weight = |gx| / |gy| or |gy| / |gx|
dTemp1 = weight*g1 + (1-weight)*g2
dTemp2 = weight*g3 + (1-weight)*g4

There are two cases in the calculation. In both, the current pixel is compared with dTemp1 and dTemp2: if it is at least as large as both, it is kept; if it is smaller than either one, its value is set to 0.

  • The following two figures show the case where the gradient in the y direction is larger, that is, the gradient direction is close to the y axis. g2 and g4 are therefore above and below C, and weight = |gx| / |gy|. The left figure is the case where the gradients in the x and y directions have the same sign, and the right figure is the case where they have opposite signs.
    [Figure: gradient direction close to the y axis, same signs (left) and opposite signs (right)]

  • The following two figures show the case where the gradient in the x direction is larger, that is, the gradient direction is close to the x axis. g2 and g4 are therefore to the left and right of C, and weight = |gy| / |gx|. The left figure is the case where the gradients in the x and y directions have the same sign, and the right figure is the case where they have opposite signs.
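
A worked example for a single pixel may help. All numbers below are made up, and the helper interpolate is a hypothetical name used only here; the weight follows the convention of the NMS code that follows, where the weight multiplies the diagonal neighbor (g1 or g3).

```python
def interpolate(weight, g_diag, g_axis):
    """Linear interpolation between a diagonal and an axis-aligned neighbor."""
    return weight * g_diag + (1 - weight) * g_axis

# Made-up values for one pixel C whose gradient leans toward the y axis.
gx, gy = 1.0, 4.0
weight = abs(gx) / abs(gy)              # 0.25, because |gy| > |gx|

c = 10.0                                # gradient magnitude at C
g1, g2 = 6.0, 8.0                       # diagonal and axis neighbor on one side
g3, g4 = 4.0, 14.0                      # diagonal and axis neighbor on the other side

d_temp1 = interpolate(weight, g1, g2)   # 0.25 * 6 + 0.75 * 8  = 7.5
d_temp2 = interpolate(weight, g3, g4)   # 0.25 * 4 + 0.75 * 14 = 11.5
keep = c >= d_temp1 and c >= d_temp2

print(d_temp1, d_temp2, keep)           # 7.5 11.5 False
```

Here C exceeds dTemp1 but not dTemp2, so it is not a local maximum along its gradient direction and would be set to 0.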

The code is implemented as follows:

def NMS(self, M, dx, dy):
	d = np.copy(M)
	W, H = M.shape
	NMS = np.copy(d)
	# Border pixels have no complete neighborhood, so they are set to 0
	NMS[0, :] = NMS[W-1, :] = NMS[:, 0] = NMS[:, H-1] = 0
	for i in range(1, W-1):
		for j in range(1, H-1):
			# A pixel with zero gradient cannot be an edge point
			if M[i, j] == 0:
				NMS[i, j] = 0
			else:
				gradX = dx[i, j] # derivative in the x direction at this pixel
				gradY = dy[i, j] # derivative in the y direction at this pixel
				gradTemp = d[i, j] # gradient magnitude at this pixel
				
				# The y component is larger: the gradient leans toward the y axis
				if np.abs(gradY) > np.abs(gradX):
					weight = np.abs(gradX) / np.abs(gradY) # interpolation weight
					grad2 = d[i-1, j]
					grad4 = d[i+1, j]

					# The x and y derivatives have the same sign
					# Pixel layout:
					# g1  g2
					#     c
					#     g4  g3

					if gradX * gradY > 0:
						grad1 = d[i-1, j-1]
						grad3 = d[i+1, j+1]

					# The x and y derivatives have opposite signs
					# Pixel layout:
					#     g2  g1
					#     c
					# g3  g4
					
					else:
						grad1 = d[i-1, j+1]
						grad3 = d[i+1, j-1]

				# The x component is larger: the gradient leans toward the x axis
				else:
					weight = np.abs(gradY) / np.abs(gradX)
					grad2 = d[i, j-1]
					grad4 = d[i, j+1]
					
					# The x and y derivatives have the same sign
					# (rows grow downward, so the gradient line runs from
					# the upper left to the lower right)
					# Pixel layout:
					# g1
					# g2 c g4
					#      g3
					if gradX * gradY > 0:
						grad1 = d[i-1, j-1]
						grad3 = d[i+1, j+1]
					
					# The x and y derivatives have opposite signs
					# Pixel layout:
					#      g3
					# g2 c g4
					# g1
					else:
						grad1 = d[i+1, j-1]
						grad3 = d[i-1, j+1]

				# Interpolate the gradient magnitude from grad1 to grad4
				gradTemp1 = weight * grad1 + (1 - weight) * grad2
				gradTemp2 = weight * grad3 + (1 - weight) * grad4
				
				# The pixel is a local maximum, so it may be an edge point
				if gradTemp >= gradTemp1 and gradTemp >= gradTemp2:
					NMS[i, j] = gradTemp
				else:
					# It cannot be an edge point
					NMS[i, j] = 0

	return NMS

6. Boundary selection of double threshold

This stage decides which edges are real edges and which are not. For this, two thresholds need to be set, minVal and maxVal. Any edge pixel with a gradient magnitude greater than maxVal is definitely a true edge, while one below minVal is definitely not an edge and is discarded. Pixels lying between the two thresholds are classified by connectivity: they are considered part of an edge if they are connected to a "reliable edge" pixel, and are discarded otherwise. The code looks like this:

def double_threshold(self, NMS):
	W, H = NMS.shape
	DT = np.zeros([W, H])
	
	# Define the low and high thresholds
	TL = 0.1 * np.max(NMS)
	TH = 0.3 * np.max(NMS)
	
	for i in range(1, W-1):
		for j in range(1, H-1):
			# Double-threshold selection
			if (NMS[i, j] < TL):
				DT[i, j] = 0
			elif (NMS[i, j] > TH):
				DT[i, j] = 1
			# Linking: keep a weak pixel if any of its 8 neighbors is a strong edge
			elif ((NMS[i-1, j-1:j+2] > TH).any() or (NMS[i+1, j-1:j+2] > TH).any()
					or (NMS[i, [j-1, j+1]] > TH).any()):
				DT[i, j] = 1
	return DT
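
The connectivity rule described above can also be followed through chains of in-between pixels, not just direct neighbors. The following is a minimal sketch of that idea, assuming a breadth-first traversal starting from the strong pixels; the function name hysteresis, its parameters and the test array are illustrative and not part of the code above.

```python
from collections import deque
import numpy as np

def hysteresis(nms, low, high):
    """Keep weak pixels only if they are 8-connected to a strong pixel."""
    strong = nms > high
    weak = (nms >= low) & ~strong
    edges = strong.copy()
    # Start from every strong pixel and grow into adjacent weak pixels.
    queue = deque(zip(*np.nonzero(strong)))
    rows, cols = nms.shape
    while queue:
        i, j = queue.popleft()
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols and weak[ni, nj] and not edges[ni, nj]:
                    edges[ni, nj] = True
                    queue.append((ni, nj))
    return edges.astype(np.uint8)

# One strong pixel (0.9), a chain of two weak pixels, and an isolated weak pixel.
nms = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.9, 0.5, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 0.0],
])
result = hysteresis(nms, low=0.3, high=0.7)
print(result)
```

In this example the two weak pixels chained to the strong one are kept, while the isolated weak pixel at the right end is discarded.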

After completing all the steps, the result is shown in the figure below:


Origin blog.csdn.net/xijuezhu8128/article/details/129856373