OpenCV image processing (Part 1): geometric transformations + morphological operations

1. Geometric transformation

1. Image scaling

Scaling is the resizing of an image, that is, the image is enlarged or reduced.

API

cv2.resize(src,dsize,fx=0,fy=0,interpolation=cv2.INTER_LINEAR)

parameter:

  • src : input image
  • dsize: absolute size, directly specify the size of the adjusted image
  • fx, fy: Relative size, set dsize to None, then set fx and fy as scale factors
  • interpolation: interpolation method; common options include cv2.INTER_NEAREST (nearest-neighbor), cv2.INTER_LINEAR (bilinear, the default), cv2.INTER_CUBIC (bicubic) and cv2.INTER_AREA (pixel-area resampling, well suited to shrinking)

example

import cv2 as cv
import matplotlib.pyplot as plt
# 1. Read the image
img1 = cv.imread("./image/dog.jpeg")
# 2. Scale the image
# 2.1 Absolute size
rows,cols = img1.shape[:2]
res = cv.resize(img1,(2*cols,2*rows),interpolation=cv.INTER_CUBIC)

# 2.2 Relative size
res1 = cv.resize(img1,None,fx=0.5,fy=0.5)

# 3 Display the images
# 3.1 Display with OpenCV (not recommended here)
cv.imshow("original",img1)
cv.imshow("enlarge",res)
cv.imshow("shrink",res1)
cv.waitKey(0)

# 3.2 Display with matplotlib
fig,axes=plt.subplots(nrows=1,ncols=3,figsize=(10,8),dpi=100)
axes[0].imshow(res[:,:,::-1])
axes[0].set_title("Absolute size (enlarged)")
axes[1].imshow(img1[:,:,::-1])
axes[1].set_title("Original")
axes[2].imshow(res1[:,:,::-1])
axes[2].set_title("Relative size (shrunk)")
plt.show()


2. Image translation

Image translation moves the image to the corresponding position according to the specified direction and distance.

API

cv.warpAffine(img,M,dsize)

parameter:

  • img: input image

  • M: 2×3 transformation matrix

  • For the pixel at (x, y), to move it to (x+t_x, y+t_y), the matrix M should be set as follows:

    M = [[1, 0, t_x],
         [0, 1, t_y]]

    Note: M must be a NumPy array of type np.float32.

  • dsize: the size of the output image
    Note: the output size must be given in the form (width, height). Remember that width is the number of columns and height is the number of rows.

example

The requirement is to shift the image by 100 pixels along the x axis and 50 pixels along the y axis:

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
# 1. Read the image
img1 = cv.imread("./image/image2.jpg")

# 2. Translate the image
rows,cols = img1.shape[:2]
M = np.float32([[1,0,100],[0,1,50]])  # translation matrix
dst = cv.warpAffine(img1,M,(cols,rows))

# 3. Display the images
fig,axes=plt.subplots(nrows=1,ncols=2,figsize=(10,8),dpi=100)
axes[0].imshow(img1[:,:,::-1])
axes[0].set_title("Original")
axes[1].imshow(dst[:,:,::-1])
axes[1].set_title("Translated")
plt.show()


3. Image rotation

Image rotation rotates an image by a certain angle about a given point while keeping its original size. After rotation, the horizontal axis of symmetry, the vertical axis of symmetry and the coordinate origin of the image may all change, so the coordinates used in image rotation need to be converted accordingly.
How is the image rotated? As shown below:

(figure: rotating an image point about the origin)

Assume the image is rotated counterclockwise by θ. For a point with original coordinates (x0, y0), let r be its distance from the origin and α the angle between this radius and the x axis. According to the coordinate transformation, the rotated coordinates (x, y) are:

    x = r·cos(α − θ),  y = r·sin(α − θ)

where:

    x0 = r·cos α,  y0 = r·sin α

Substituting into the formula above gives:

    x = x0·cos θ + y0·sin θ,  y = −x0·sin θ + y0·cos θ

It can also be written in matrix form:

    [x, y, 1] = [x0, y0, 1] · [[cos θ, −sin θ, 0],
                               [sin θ,  cos θ, 0],
                               [0,      0,     1]]
At the same time, we need to correct the position of the origin: in the original image the coordinate origin is at the upper left corner, and the size of the image changes after rotation, so the origin also has to be corrected.

If the rotation center is used as the coordinate origin during the rotation, then after the rotation is finished the origin must be moved back to the upper left corner of the image, i.e. one more translation is required.

(figure: moving the origin back to the upper left corner after rotation)

In OpenCV, image rotation therefore first builds a rotation matrix from the rotation angle and the rotation center, and then applies that matrix with an affine warp, which makes it possible to rotate by any angle about any center.
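
As a sanity check of this description, here is a minimal sketch that composes translate-to-center, rotate and translate-back by hand and compares the result with cv.getRotationMatrix2D; the center point and the 45° angle are arbitrary illustrative values:

import numpy as np
import cv2 as cv

# illustrative values: rotate 45 degrees about the point (100, 50) with no scaling
cx, cy, angle = 100.0, 50.0, 45.0
theta = np.deg2rad(angle)
c, s = np.cos(theta), np.sin(theta)

# move the rotation center to the origin, rotate, then move the origin back
T1 = np.array([[1, 0, -cx], [0, 1, -cy], [0, 0, 1]])   # translate center -> origin
R  = np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]])      # rotation (image y axis points down)
T2 = np.array([[1, 0, cx], [0, 1, cy], [0, 0, 1]])     # translate origin -> center
M_manual = (T2 @ R @ T1)[:2, :]                         # keep the 2x3 part used by warpAffine

M_cv = cv.getRotationMatrix2D((cx, cy), angle, 1.0)
print(np.allclose(M_manual, M_cv))  # expected: True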

API

cv2.getRotationMatrix2D(center, angle, scale)

parameter:

  • center: center of rotation
  • angle: rotation angle, in degrees (positive values rotate counterclockwise)
  • scale: scale factor

return:

  • M: rotation matrix
    Pass this matrix to cv.warpAffine to complete the rotation of the image.

example

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
# 1 Read the image
img = cv.imread("./image/image2.jpg")

# 2 Rotate the image
rows,cols = img.shape[:2]
# 2.1 Build the rotation matrix
M = cv.getRotationMatrix2D((cols/2,rows/2),90,1)
# 2.2 Apply the rotation
dst = cv.warpAffine(img,M,(cols,rows))

# 3 Display the images
fig,axes=plt.subplots(nrows=1,ncols=2,figsize=(10,8),dpi=100)
axes[0].imshow(img[:,:,::-1])
axes[0].set_title("Original")
axes[1].imshow(dst[:,:,::-1])
axes[1].set_title("Rotated")
plt.show()


4. Affine transformation

An affine transformation changes the shape, position and angle of an image. It is frequently used in deep-learning preprocessing, and is essentially a combination of operations such as scaling, rotation, flipping and translation.

So what is an affine transformation of an image? As shown in the figure below, points 1, 2 and 3 in Figure 1 are mapped one to one onto the three points in Figure 2; they still form a triangle, but its shape has changed considerably. From these two groups of three points (points of interest) we can find the affine transformation, and then apply it to all points of the image to complete the affine transformation of the image.

(figure: three-point mapping that defines an affine transformation)

In OpenCV, the matrix of an affine transformation is a 2×3 matrix:

    M = [A B] = [[a00, a01, b0],
                 [a10, a11, b1]]

where the left 2×2 sub-matrix A is the linear transformation part and the 2×1 sub-matrix B is the translation term.

For any position (x, y) on the image, the affine transformation computes:

    x' = a00·x + a01·y + b0
    y' = a10·x + a11·y + b1

It should be noted that for an image the width direction is x and the height direction is y, and this coordinate order is consistent with the subscripts of the image pixels. As a result, the origin is not at the lower left corner but at the upper left corner, and the y axis points downward rather than upward.

In an affine transformation, all parallel lines in the original image remain parallel in the resulting image. To create the transformation matrix we need three points from the original image and their positions in the output image. cv2.getAffineTransform then creates a 2×3 matrix, which is finally passed to the function cv2.warpAffine.

example

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
# 1 Read the image
img = cv.imread("./image/image2.jpg")

# 2 Affine transformation
rows,cols = img.shape[:2]
# 2.1 Create the transformation matrix
pts1 = np.float32([[50,50],[200,50],[50,200]])
pts2 = np.float32([[100,100],[200,50],[100,250]])
M = cv.getAffineTransform(pts1,pts2)
# 2.2 Apply the affine transformation
dst = cv.warpAffine(img,M,(cols,rows))

# 3 Display the images
fig,axes=plt.subplots(nrows=1,ncols=2,figsize=(10,8),dpi=100)
axes[0].imshow(img[:,:,::-1])
axes[0].set_title("Original")
axes[1].imshow(dst[:,:,::-1])
axes[1].set_title("Affine result")
plt.show()


5. Perspective transformation

A perspective transformation is the result of a change of viewpoint. It relies on the fact that the perspective center, the image point and the target point are collinear: following the law of perspective rotation, the image-bearing (perspective) plane is rotated by a certain angle around the trace line (the perspective axis). This changes the original bundle of projection rays while keeping the projected geometric figure on the bearing plane unchanged.


In essence, it projects the image onto a new viewing plane, and its general transformation formula is:

    [x', y', z'] = [u, v, w] · [[a11, a12, a13],
                                [a21, a22, a23],
                                [a31, a32, a33]]

where (u, v) are the pixel coordinates of the original image, w is taken as 1, and (x = x'/z', y = y'/z') is the result of the perspective transformation. The 3×3 matrix is called the perspective transformation matrix. In general, we divide it into three parts:

    T1 = [[a11, a12], [a21, a22]],  T2 = [a31, a32],  T3 = [a13, a23]^T

where T1 represents the linear transformation of the image, T2 represents the translation of the image, and T3 produces the projective (perspective) part of the transformation; a33 is generally set to 1.

In OpenCV, we need to find four points in the input image, no three of which are collinear, together with their corresponding points in the output image, then obtain the transformation matrix T and apply the perspective transformation. The transformation matrix is found with the function cv.getPerspectiveTransform, and the resulting 3×3 matrix is passed to cv.warpPerspective.
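
To make the division by z' concrete, here is a minimal sketch that applies the matrix returned by cv.getPerspectiveTransform to a single point by hand (OpenCV itself uses the column-vector convention dst = T·[u, v, 1]^T); the point values are simply the ones reused in the example below:

import numpy as np
import cv2 as cv

# illustrative point correspondences (same values as in the example below)
pts1 = np.float32([[56,65],[368,52],[28,387],[389,390]])
pts2 = np.float32([[100,145],[300,100],[80,290],[310,300]])
T = cv.getPerspectiveTransform(pts1, pts2)   # 3x3 perspective matrix

# apply T to the first source point manually
u, v = pts1[0]
xp, yp, zp = T @ np.array([u, v, 1.0])       # homogeneous result (x', y', z')
x, y = xp / zp, yp / zp                      # normalize by z'
print((x, y))                                # expected to be close to pts2[0] = (100, 145)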

example

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
# 1 Read the image
img = cv.imread("./image/image2.jpg")
# 2 Perspective transformation
rows,cols = img.shape[:2]
# 2.1 Create the transformation matrix
pts1 = np.float32([[56,65],[368,52],[28,387],[389,390]])
pts2 = np.float32([[100,145],[300,100],[80,290],[310,300]])

T = cv.getPerspectiveTransform(pts1,pts2)
# 2.2 Apply the transformation
dst = cv.warpPerspective(img,T,(cols,rows))

# 3 Display the images
fig,axes=plt.subplots(nrows=1,ncols=2,figsize=(10,8),dpi=100)
axes[0].imshow(img[:,:,::-1])
axes[0].set_title("Original")
axes[1].imshow(dst[:,:,::-1])
axes[1].set_title("Perspective result")
plt.show()


6. Image Pyramid

An image pyramid is a multi-scale representation of an image and is mainly used for image segmentation. It is an effective but conceptually simple structure for interpreting an image at multiple resolutions.

Image pyramids are used in machine vision and image compression. An image pyramid is a series of images, derived from the same original image, arranged in a pyramid shape with progressively lower resolution. It is obtained by repeated down-sampling, which stops when a certain termination condition is reached.

The bottom of the pyramid is a high-resolution representation of the image to be processed, while the top is a low-resolution approximation; the higher the level, the smaller the image and the lower the resolution.
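
As a minimal sketch of this step-by-step down-sampling (the input file name and the fixed number of levels are illustrative assumptions), a pyramid can be built with a simple loop over cv.pyrDown:

import cv2 as cv

img = cv.imread("./image/image2.jpg")     # assumed input image
levels = [img]
for i in range(4):                        # termination condition: a fixed number of levels
    levels.append(cv.pyrDown(levels[-1])) # each level has roughly half the width and height

for i, lvl in enumerate(levels):
    print(f"level {i}: {lvl.shape[1]}x{lvl.shape[0]}")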


API

cv.pyrUp(img)        # up-sample the image
cv.pyrDown(img)      # down-sample the image

example

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
# 1 Read the image
img = cv.imread("./image/image2.jpg")
# 2 Sample the image
up_img = cv.pyrUp(img)   # up-sampling
img_1 = cv.pyrDown(img)  # down-sampling
# 3 Display the images
cv.imshow('enlarge', up_img)
cv.imshow('original', img)
cv.imshow('shrink', img_1)
cv.waitKey(0)
cv.destroyAllWindows()


2. Morphological operations

1. Connectivity

In an image, the smallest unit is a pixel, and there are 8 adjacent pixels around each pixel. There are three common adjacency relationships: 4-adjacency, 8-adjacency, and D-adjacency. As shown in the figure below:

(figure: 4-adjacency, D-adjacency and 8-adjacency)

  • 4-adjacency: the 4-neighborhood of pixel p(x, y) consists of the points (x+1, y), (x-1, y), (x, y+1), (x, y-1); it is denoted N_4(p).

  • D-adjacency: the D-neighborhood of pixel p(x, y) consists of the diagonal points (x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1); it is denoted N_D(p).

  • 8-adjacency: the 8-neighborhood of pixel p(x, y) is the union of the 4-neighborhood and the D-neighborhood; it is denoted N_8(p).

Connectivity is an important concept to describe regions and boundaries. The two necessary conditions for two pixels to be connected are:

  1. the positions of the two pixels are adjacent;

  2. the gray values of the two pixels satisfy a certain similarity criterion (or are equal).

According to the definition of connectivity, we distinguish 4-connectivity, 8-connectivity and m-connectivity.

  • 4-connected: for two pixels p and q with value V, if q is in the set N_4(p), the two pixels are said to be 4-connected.
  • 8-connected: for two pixels p and q with value V, if q is in the set N_8(p), the two pixels are said to be 8-connected.

(figure: examples of 4-connectivity and 8-connectivity)

  • m-connected: for two pixels p and q with value V, if:

    q is in N_4(p), or q is in N_D(p) and the set N_4(p) ∩ N_4(q) contains no pixels with value V,

then the two pixels are said to be m-connected, that is, a mixture of 4-connectivity and D-connectivity.

(figure: example of m-connectivity)
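
To make these neighborhood definitions concrete, here is a small illustrative sketch (the helper names and the tiny test array are assumptions made up for this example) that builds N_4, N_D and N_8 for a pixel and checks adjacency between two pixels of equal value:

import numpy as np

def n4(p):
    """4-neighborhood of pixel p = (x, y)."""
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(p):
    """D-neighborhood (diagonal neighbors) of pixel p = (x, y)."""
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def n8(p):
    """8-neighborhood = 4-neighborhood plus D-neighborhood."""
    return n4(p) | nd(p)

def connected(img, p, q, kind="8"):
    """Adjacency check for two pixels of equal value."""
    if img[p[1], p[0]] != img[q[1], q[0]]:
        return False                      # similarity criterion: equal values
    neigh = n4(p) if kind == "4" else n8(p)
    return q in neigh

# tiny binary image, indexed as img[y, x]
img = np.array([[0, 1, 0],
                [0, 1, 1],
                [0, 0, 0]], dtype=np.uint8)

print(connected(img, (1, 0), (1, 1), "4"))  # True: vertically adjacent, both 1
print(connected(img, (1, 0), (2, 1), "4"))  # False: diagonal, not 4-adjacent
print(connected(img, (1, 0), (2, 1), "8"))  # True: diagonal counts for 8-adjacency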

2. Morphological operations

Morphological transformations are simple operations based on the shape of an image, usually performed on binary images. Erosion and dilation are the two basic morphological operators; their variants include the opening operation, the closing operation, top hat and black hat.

2.1 Erosion and dilation

Erosion and dilation are the most basic morphological operations, and both act on the white (highlighted) part of the image.

Dilation expands the highlighted part of the image, so the result has a larger highlighted area than the original; erosion eats away at the highlighted area, so the result has a smaller highlighted area than the original. Dilation is a local-maximum operation, and erosion is a local-minimum operation.

(1) Erosion

The specific operation is: scan every pixel of the image with a structuring element and perform an "AND" operation between each element of the structuring element and the pixel it covers; if all of them are 1, the output pixel is 1, otherwise it is 0. As shown in the figure below, structure A is eroded by structure B:

(figure: structure A eroded by structure B)

The function of erosion is to eliminate the boundary points of objects, shrink the target, and eliminate noise points smaller than structural elements.

API

cv.erode(img,kernel,iterations)

parameter:

  • img: the image to process
  • kernel: kernel structure
  • iterations: the number of erosion iterations, default 1

(2) Dilation

The specific operation is: scan every pixel of the image with a structuring element and combine each element of the structuring element with the pixel it covers; if all of them are 0, the output pixel is 0, otherwise it is 1 (an "OR"-like rule). As shown in the figure below, structure A is dilated by structure B:

(figure: structure A dilated by structure B)

Dilation merges all background points that touch the object into the object, enlarging the target and filling holes inside it.
API:

cv.dilate(img,kernel,iterations)

Parameters :

  • img: the image to process
  • kernel: kernel structure
  • iterations: the number of dilation iterations, default 1
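
Before the image-based example below, a tiny numeric sketch (the 7×7 array is made up for illustration) shows how erosion and dilation act on a small binary array with a 3×3 kernel:

import numpy as np
import cv2 as cv

# small binary "image": a 3x3 block of ones plus a single-pixel spur
img = np.zeros((7, 7), dtype=np.uint8)
img[2:5, 2:5] = 1
img[1, 3] = 1                      # the spur

kernel = np.ones((3, 3), np.uint8)
print(cv.erode(img, kernel))       # only the center of the 3x3 block survives
print(cv.dilate(img, kernel))      # the block and the spur both grow by one pixel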

Example
We use a 5×5 kernel to perform the erosion and dilation operations:

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
# 1 Read the image
img = cv.imread("./image/image3.png")
# 2 Create the kernel
kernel = np.ones((5, 5), np.uint8)

# 3 Erode and dilate the image
erosion = cv.erode(img, kernel)  # erosion
dilate = cv.dilate(img,kernel)   # dilation

# 4 Display the images
fig,axes=plt.subplots(nrows=1,ncols=3,figsize=(10,8),dpi=100)
axes[0].imshow(img)
axes[0].set_title("Original")
axes[1].imshow(erosion)
axes[1].set_title("Eroded")
axes[2].imshow(dilate)
axes[2].set_title("Dilated")
plt.show()


2.2 Opening and closing operation

The opening and closing operations apply erosion and dilation in a particular order. The two are not inverses of each other, i.e., opening an image and then closing it does not recover the original image.

(1) Open operation

The opening operation erodes first and then dilates. It is used to separate objects and remove small regions. Characteristics: it eliminates noise and removes small interfering patches without affecting the rest of the image.

(figure: opening operation)

(2) Close operation

The closing operation is the opposite of the opening operation: it dilates first and then erodes. It is used to eliminate ("close") holes inside objects. Characteristics: it can fill enclosed regions.

(figure: closing operation)

API

cv.morphologyEx(img, op, kernel)

parameter:

  • img: the image to process
  • op: processing method: use cv.MORPH_OPEN for the opening operation and cv.MORPH_CLOSE for the closing operation
  • kernel: kernel structure

example

Implement the opening and closing operations using a 10×10 kernel:

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
# 1 Read the images
img1 = cv.imread("./image/image5.png")
img2 = cv.imread("./image/image6.png")
# 2 Create the kernel
kernel = np.ones((10, 10), np.uint8)
# 3 Opening and closing operations
cvOpen = cv.morphologyEx(img1,cv.MORPH_OPEN,kernel)   # opening
cvClose = cv.morphologyEx(img2,cv.MORPH_CLOSE,kernel) # closing
# 4 Display the images
fig,axes=plt.subplots(nrows=2,ncols=2,figsize=(10,8))
axes[0,0].imshow(img1)
axes[0,0].set_title("Original")
axes[0,1].imshow(cvOpen)
axes[0,1].set_title("Opening result")
axes[1,0].imshow(img2)
axes[1,0].set_title("Original")
axes[1,1].imshow(cvClose)
axes[1,1].set_title("Closing result")
plt.show()


2.3 Top hat and black hat

(1) Top hat operation

The top hat is the difference between the original image and the result of the opening operation:

    dst = tophat(src) = src − open(src)

Because the opening operation enlarges cracks and locally low-brightness areas, subtracting the opened image from the original yields an image that highlights regions brighter than their surroundings in the original; the effect depends on the size of the chosen kernel.

The top hat operation is used to separate patches that are brighter than their neighborhood. When an image has a large background and the small objects are fairly regular, it can be used for background extraction.

(2) Black hat operation

The black hat is the difference between the result of the closing operation and the original image. The mathematical expression is:

    dst = blackhat(src) = close(src) − src

The result of the black hat operation highlights regions darker than their surroundings in the original image; again, the effect depends on the size of the chosen kernel.

The black hat operation is used to separate patches that are darker than their neighborhood.

API

cv.morphologyEx(img, op, kernel)

Parameters :

  • img: the image to process
  • op: processing method: cv.MORPH_TOPHAT for the top hat operation, cv.MORPH_BLACKHAT for the black hat operation (cv.MORPH_OPEN and cv.MORPH_CLOSE are the opening and closing operations described above)
  • kernel: kernel structure

example

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
# 1 Read the images
img1 = cv.imread("./image/image5.png")
img2 = cv.imread("./image/image6.png")
# 2 Create the kernel
kernel = np.ones((10, 10), np.uint8)
# 3 Top hat and black hat operations
cvOpen = cv.morphologyEx(img1,cv.MORPH_TOPHAT,kernel)    # top hat
cvClose = cv.morphologyEx(img2,cv.MORPH_BLACKHAT,kernel) # black hat
# 4 Display the images
fig,axes=plt.subplots(nrows=2,ncols=2,figsize=(10,8))
axes[0,0].imshow(img1)
axes[0,0].set_title("Original")
axes[0,1].imshow(cvOpen)
axes[0,1].set_title("Top hat result")
axes[1,0].imshow(img2)
axes[1,0].set_title("Original")
axes[1,1].imshow(cvClose)
axes[1,1].set_title("Black hat result")
plt.show()



Source: https://blog.csdn.net/mengxianglong123/article/details/125904390