Image processing in OpenCV - image gradient + Canny edge detection + image pyramid

1. Image gradient

First, let's look at what an image gradient is: an image can be regarded as a two-dimensional discrete function, and the image gradient is the derivative of that function. Edge detection is generally realized by performing gradient operations on the image.

In this part we will learn to find image gradients, edges, and so on. Three main functions are involved: cv.Sobel(), cv.Scharr(), and cv.Laplacian(), corresponding to the three types of gradient filters (high-pass filters) provided by OpenCV: Sobel, Scharr, and Laplacian.

In the earlier part on 2D convolution (image filtering), we discussed the main applications of low-pass filters (LPF) and high-pass filters (HPF): an LPF is used to remove noise, while an HPF is used to find edges. In this part we use the three high-pass filters above to find edges in an image.

1.1 Sobel and Scharr operators

The Sobel operator combines Gaussian smoothing with differentiation, so it is quite resistant to noise. We can set the derivative direction (xorder or yorder) and the convolution kernel size ksize.

When the kernel size is set to -1, a 3x3 Scharr filter is used instead. It gives better results than a 3x3 Sobel kernel at the same processing speed, so whenever a 3x3 Sobel filter would be used, the Scharr filter should be used instead.

In other words, a Sobel filter with a 3x3 kernel is not the same as the Scharr filter: Scharr is a more accurate filter with a 3x3 kernel. If you need a 3x3 Sobel filter, it is recommended to use Scharr instead, i.e. call cv.Sobel() with the kernel size set to -1.

Having looked at the kernels of the Sobel and Scharr high-pass filters, let's look at the parameters of the cv.Sobel() and cv.Scharr() functions. cv.Sobel(img, cv.CV_64F, dx, dy, ksize) takes: the source image; the output image depth (cv.CV_64F here; -1 means the same depth as the source); dx and dy, the derivative orders in the x and y directions; and ksize, the kernel size.

Since the Scharr high-pass filter always uses a 3x3 kernel, cv.Scharr() takes the same parameters as cv.Sobel() except that there is no ksize parameter.

1.2 Laplacian operator

The Laplacian operator actually builds on the Sobel operator: it computes the second derivatives of the image in the x and y directions via Sobel operations and sums them (Laplacian(src) = d2src/dx2 + d2src/dy2), so it can be seen as an extension of the Sobel operator.

Here are two examples to aid understanding:

import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

img = cv.imread(r'E:\image\test06.png', 0)          # read as grayscale
laplacian = cv.Laplacian(img, cv.CV_64F)            # Laplacian operator
sobelx = cv.Sobel(img, cv.CV_64F, 1, 0, ksize=5)    # first derivative in x
sobely = cv.Sobel(img, cv.CV_64F, 0, 1, ksize=5)    # first derivative in y
plt.subplot(2, 2, 1), plt.imshow(img, cmap='gray')
plt.title('Original'), plt.xticks([]), plt.yticks([])
plt.subplot(2, 2, 2), plt.imshow(laplacian, cmap='gray')
plt.title('Laplacian'), plt.xticks([]), plt.yticks([])
plt.subplot(2, 2, 3), plt.imshow(sobelx, cmap='gray')
plt.title('Sobel X'), plt.xticks([]), plt.yticks([])
plt.subplot(2, 2, 4), plt.imshow(sobely, cmap='gray')
plt.title('Sobel Y'), plt.xticks([]), plt.yticks([])
plt.show()


NOTE: If the output data type is cv.CV_8U (np.uint8), there is a problem: a black-to-white transition has a positive slope (a positive value), while a white-to-black transition has a negative slope (a negative value). When the data is converted to np.uint8, all negative values are clipped to zero, which means we lose that edge information.

When both transition directions need to be detected, a better option is to keep the output in a higher data type such as cv.CV_64F, take its absolute value, and then convert back to cv.CV_8U.

This is an important issue that cannot be ignored, so let's look at it with an example:

import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

img = cv.imread('E:/image/test07.png', 0)
# Output forced to 8 bits: negative slopes are clipped to zero
sobelx8u = cv.Sobel(img, cv.CV_8U, 1, 0, ksize=5)

# Output kept in 64-bit float, then absolute value, then back to uint8
sobelx64f = cv.Sobel(img, cv.CV_64F, 1, 0, ksize=5)
abs_sobel64f = np.absolute(sobelx64f)
sobel_8u = np.uint8(abs_sobel64f)
plt.subplot(1, 3, 1), plt.imshow(img, cmap='gray')
plt.title('Original'), plt.xticks([]), plt.yticks([])
plt.subplot(1, 3, 2), plt.imshow(sobelx8u, cmap='gray')
plt.title('Sobel CV_8U'), plt.xticks([]), plt.yticks([])
plt.subplot(1, 3, 3), plt.imshow(sobel_8u, cmap='gray')
plt.title('Sobel abs(CV_64F)'), plt.xticks([]), plt.yticks([])
plt.show()


2. Canny edge detection

Canny edge detection is a popular edge detection algorithm developed by John F. Canny. It is a multi-stage algorithm, consisting mainly of: Gaussian filtering, gradient calculation, non-maximum suppression, and double-threshold (hysteresis) detection.

2.1 Multi-stage Canny edge detection algorithm

Gaussian filtering (noise reduction)

Since edge detection is easily affected by noise in the image, the first step of the Canny algorithm is to remove noise with a 5x5 Gaussian filter.

Concretely, Gaussian filtering generates a Gaussian template (kernel) and convolves it with the image in the spatial domain.

Gradient calculation

The smoothed image is filtered with a Sobel kernel in the horizontal and vertical directions to obtain the first derivatives Gx and Gy, from which the gradient magnitude and direction at each pixel are computed.

Non-maximum suppression

After obtaining the gradient magnitude and direction, the image is scanned completely to remove unwanted pixels that may not form an edge: at each pixel, we check whether it is a local maximum in its neighborhood along the gradient direction.


(The image comes from the official Chinese document of OpenCV4.1)

Point A is on the edge (in the vertical direction). The gradient direction is perpendicular to the edge. Points B and C lie along the gradient direction, so point A is checked against points B and C to see whether it is a local maximum. If so, it is kept for the next stage; otherwise it is suppressed (set to zero). The result is a binary image with "thin edges".

Hysteresis threshold (dual threshold detection)

At this stage it is decided which edges are really edges. For this we need two thresholds, minVal and maxVal. Any edge with an intensity gradient above maxVal is certainly an edge, and any edge below minVal is certainly not an edge and is discarded. Edges falling between the two thresholds are classified by connectivity: if they are connected to "sure edge" pixels, they are treated as part of an edge; otherwise they are discarded.

Edge A is above maxVal, so it is considered a "sure edge". Although C is below maxVal, it is connected to A, so it is also considered valid, and we get the full curve.

Edge B, although above minVal and in the same region as C, is not connected to any "sure edge", so it is discarded.

Note: We have to choose appropriate values of minVal and maxVal to get the correct result.

2.2 Canny Edge detection in OpenCV

OpenCV wraps all four stages of the Canny edge detection algorithm in a single function, cv.Canny(); we only need to call it correctly to perform edge detection.

Let's look at the parameters of cv.Canny(). The first argument is the input image; the second and third are the two thresholds minVal and maxVal used in the hysteresis (double threshold) stage. The fourth argument is apertureSize, the size of the Sobel kernel used to find the image gradients; it defaults to 3. The fifth argument is L2gradient, which specifies the equation used to compute the gradient magnitude: if True, the more accurate L2 norm is used; if False (the default), the L1 norm is used.

Let's take a look at an example

import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

img = cv.imread('E:/image/test08.png', 0)
edges = cv.Canny(img, 100, 200)
plt.subplot(121), plt.imshow(img, cmap='gray')
plt.title('Original Image'), plt.xticks([]), plt.yticks([])
plt.subplot(122), plt.imshow(edges, cmap='gray')
plt.title('Edge Image'), plt.xticks([]), plt.yticks([])
plt.show()


3. Image pyramid

3.1 Pyramid theory basis

So far we have worked with images of a fixed resolution, but in some cases we do not know in advance at what size the object of interest will appear in the image.

In such cases we need to create a set of copies of the same image at different resolutions and search for the target object across all of them. This set of images at different resolutions is called an image pyramid.

(The name comes from stacking the images with the highest-resolution image at the bottom and the lowest-resolution image at the top, which makes the stack look like a pyramid.)

Generally speaking, there are two kinds of pyramids: Gaussian pyramid and Laplacian pyramid

3.1.1 Gaussian Pyramid

A higher level (lower resolution) in a Gaussian pyramid is formed by first convolving the image with a Gaussian kernel and then removing the even rows and columns; each pixel of the higher level is a Gaussian-weighted contribution of 5 pixels from the level below. Through this operation, an M x N image becomes an M/2 x N/2 image, so the area is reduced to a quarter of the original. Each such level is called an octave, and the same pattern continues as we go further up the pyramid.

Down-sampling method: 1. Convolve the image with a Gaussian kernel; 2. Remove all even rows and columns.

A lower level (higher resolution) is produced from a higher level (lower resolution) by doubling the image in each dimension, filling the newly added rows and columns (the even rows and columns) with 0, and then convolving with the specified filter to estimate approximate values for the missing pixels.

Up-sampling method: 1. Expand the image to twice its original size in each dimension, filling the new rows and columns with 0; 2. Convolve the expanded image with the same kernel as before (multiplied by 4) to obtain approximate values for the new pixels.

Some information is lost during scaling. To reduce this loss, the Laplacian pyramid is needed.

The relevant OpenCV functions are cv.pyrUp() and cv.pyrDown():

cv.pyrUp(src) function: only one parameter needs to be passed in, the source image; it upsamples the image.

cv.pyrDown() function: takes the same parameters as cv.pyrUp(); it downsamples the image, and can also be used simply to blur an image.

3.1.2 Laplacian Pyramid

A Laplacian pyramid is formed from a Gaussian pyramid; there is no dedicated function for it. Laplacian pyramid images are edge-like images in which most elements are 0, and they are commonly used in image compression.

A level of the Laplacian pyramid is formed by the difference between a level of the Gaussian pyramid and the expanded (upsampled) version of the level above it in the Gaussian pyramid.

We show the Laplacian pyramid through an example

import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

img = cv.imread(r'E:\image\test06.png')
lower_reso = cv.pyrDown(img)
higher_reso = cv.pyrUp(lower_reso)
# cv.subtract saturates instead of wrapping around in uint8
lapPyr = cv.subtract(img, higher_reso)            # first Laplacian level
lower_reso2 = cv.pyrDown(higher_reso)
higher_reso2 = cv.pyrUp(lower_reso2)
lapPyr2 = cv.subtract(higher_reso, higher_reso2)  # second Laplacian level
cv.imshow('lapPyr', lapPyr)
cv.imshow('lapPyr2', lapPyr2)
cv.waitKey(0)
cv.destroyAllWindows()


3.2 Image Fusion Using Image Pyramid

One application of image pyramids is image blending. With simple image stitching, where we just place two images side by side, the result does not look good because of the discontinuity between the images. In such cases, pyramid blending gives seamless blending without leaving a visible seam and without keeping large amounts of extra data.

To achieve the effect of image blending, the following steps need to be completed:

  • Load the two images to be blended
  • Find the Gaussian pyramid of each image, then build its Laplacian pyramid from the Gaussian pyramid
  • At each Laplacian pyramid level, join the left half of one image with the right half of the other
  • Finally, reconstruct the blended image from this joint pyramid
import cv2 as cv
import numpy as np

# Both images are assumed to be the same size, divisible by 2**6
A = cv.imread('E:/image/horse.png')
B = cv.imread('E:/image/cow.png')
# Generate the Gaussian pyramid for A
G = A.copy()
gpA = [G]
for i in range(6):
    G = cv.pyrDown(G)
    gpA.append(G)
# Generate the Gaussian pyramid for B
G = B.copy()
gpB = [G]
for i in range(6):
    G = cv.pyrDown(G)
    gpB.append(G)
# Generate the Laplacian pyramid for A
lpA = [gpA[5]]
for i in range(5, 0, -1):
    GE = cv.pyrUp(gpA[i])
    L = cv.subtract(gpA[i - 1], GE)
    lpA.append(L)
# Generate the Laplacian pyramid for B
lpB = [gpB[5]]
for i in range(5, 0, -1):
    GE = cv.pyrUp(gpB[i])
    L = cv.subtract(gpB[i - 1], GE)
    lpB.append(L)
# Join the left half of A and the right half of B at each level
LS = []
for la, lb in zip(lpA, lpB):
    rows, cols, dpt = la.shape
    ls = np.hstack((la[:, 0:cols // 2], lb[:, cols // 2:]))
    LS.append(ls)
# Now reconstruct the blended image
ls_ = LS[0]
for i in range(1, 6):
    ls_ = cv.pyrUp(ls_)
    ls_ = cv.add(ls_, LS[i])
# For comparison: directly joining one half of each image
real = np.hstack((A[:, :cols // 2], B[:, cols // 2:]))
cv.imwrite('Pyramid_blending2.jpg', ls_)
cv.imwrite('Direct_blending.jpg', real)

(Note: For the content of the article, refer to the official Chinese document of OpenCV4.1)
If this article helped you, remember to like, favorite, and follow.

Origin blog.csdn.net/qq_50587771/article/details/123680995