4. Advanced operations of opencv-python image processing (1) - Geometric transformation

Geometric transformation

Learning Content:

Master image scaling, translation, and rotation
Understand affine and projective transformations of digital images

This article mainly covers geometric transformations of images.

1. Image scaling

Scaling adjusts the size of an image, that is, it enlarges or shrinks the image.

1. API

cv2.resize(src, dsize, fx=0, fy=0, interpolation=cv2.INTER_LINEAR)

Parameters:
src: input image
dsize: absolute size; directly specifies the size of the output image as (width, height)
fx, fy: relative size; set dsize to None, then use fx and fy as the horizontal and vertical scale factors
interpolation: interpolation method; the default is cv2.INTER_LINEAR (bilinear interpolation)

2. Sample code

import cv2 as cv
import matplotlib.pyplot as plt
# 1. Read the image
img1 = cv.imread("./image/dog.jpeg")
# 2. Scale the image
# 2.1 Absolute size: dsize is (width, height)
rows,cols = img1.shape[:2]
res = cv.resize(img1,(2*cols,2*rows),interpolation=cv.INTER_CUBIC)

# 2.2 Relative size
res1 = cv.resize(img1,None,fx=0.5,fy=0.5)

# 3. Display the images
# 3.1 Display with OpenCV (not recommended)
cv.imshow("original",img1)
cv.imshow("enlarge",res)
cv.imshow("shrink",res1)
cv.waitKey(0)
cv.destroyAllWindows()

# 3.2 Display with matplotlib ([:,:,::-1] converts BGR to RGB)
fig,axes=plt.subplots(nrows=1,ncols=3,figsize=(10,8),dpi=100)
axes[0].imshow(res[:,:,::-1])
axes[0].set_title("Absolute size (enlarged)")
axes[1].imshow(img1[:,:,::-1])
axes[1].set_title("Original")
axes[2].imshow(res1[:,:,::-1])
axes[2].set_title("Relative size (shrunk)")
plt.show()

Feel free to try it out with your own pictures here.

2. Image translation

Image translation moves the image to the corresponding position according to the specified direction and distance.

1. API

cv.warpAffine(img,M,dsize)

Parameters:
img: input image
M: 2×3 translation matrix.
For the pixel at (x, y), to move it to (x+tx, y+ty), set the matrix M as follows:
$$
M = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix}
$$
Note: Set M to a NumPy array of type np.float32.
dsize: size of the output image, given as (width, height)

2. Example

Suppose you want to shift every pixel of the image by 100 in the x direction and 50 in the y direction (tx = 100, ty = 50). Analyze it first: the image content should move toward the lower right corner. Try to picture the result before running the code.

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
# 1. Read the image
img1 = cv.imread("./image/image2.jpg")

# 2. Translate the image
rows,cols = img1.shape[:2]
M = np.float32([[1,0,100],[0,1,50]]) # translation matrix: tx=100, ty=50
dst = cv.warpAffine(img1,M,(cols,rows)) # the output keeps the original width and height

# 3. Display the images
fig,axes=plt.subplots(nrows=1,ncols=2,figsize=(10,8),dpi=100)
axes[0].imshow(img1[:,:,::-1])
axes[0].set_title("Original")
axes[1].imshow(dst[:,:,::-1])
axes[1].set_title("After translation")
plt.show()

I won't include the rendered output here; the code above should run successfully as written.

3. Image rotation

Image rotation is the process of rotating an image by a given angle about a given point. The image keeps its original size during rotation. After rotation, the horizontal axis of symmetry, the vertical axis of symmetry, and the origin of the center coordinates may all change, so the coordinates in image rotation need to be converted accordingly.
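So how is the image rotated?
[Figure: rotating a point by the angle θ]
Assuming that the image is rotated counterclockwise by θ about the origin, the coordinate transformation gives the rotation:
$$
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
$$
where (x, y) and (x', y') are the pixel coordinates before and after rotation, measured in image coordinates (origin at the top-left corner, y axis pointing down).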

1. API

cv2.getRotationMatrix2D(center, angle, scale)

Parameters:
center: rotation center, given as (x, y)
angle: rotation angle in degrees; positive values rotate counterclockwise
scale: scaling ratio
Return:
M: 2×3 rotation matrix
Once the API has generated the rotation matrix M, pass it to cv.warpAffine to complete the rotation of the image.

2. Example

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
# 1 Read the image
img = cv.imread("./image/image2.jpg")

# 2 Rotate the image
rows,cols = img.shape[:2]
# 2.1 Generate the rotation matrix (center of the image, 90 degrees, scale 1)
M = cv.getRotationMatrix2D((cols/2,rows/2),90,1)
# 2.2 Apply the rotation
dst = cv.warpAffine(img,M,(cols,rows))

# 3 Display the images
fig,axes=plt.subplots(nrows=1,ncols=2,figsize=(10,8),dpi=100)
axes[0].imshow(img[:,:,::-1])
axes[0].set_title("Original")
axes[1].imshow(dst[:,:,::-1])
axes[1].set_title("After rotation")
plt.show()

3. Result

[Figure: the original image next to the rotated result]

4. Affine transformation

Affine transformation of an image changes its shape, position, and angle. It is a common operation in deep learning preprocessing. An affine transformation is mainly a combination of operations such as scaling, rotation, flipping, and translation of the image.
So what is the affine transformation of an image? As shown in the figure below, points 1, 2, and 3 in Figure 1 are mapped one-to-one to three points in Figure 2. They still form a triangle, but its shape has changed considerably. From these two groups of three points (points of interest) we can find the affine transformation. We can then apply it to every point in the image, completing the affine transformation of the image.
[Figure: the triangle formed by points 1, 2, 3 in Figure 1 and its mapped triangle in Figure 2]
In an affine transformation, all lines that are parallel in the original image remain parallel in the resulting image. To create the transformation matrix we need three points from the original image and their positions in the output image. cv2.getAffineTransform then creates a 2x3 matrix, which is finally passed to the function cv2.warpAffine.

1. Example

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
# 1 Read the image
img = cv.imread("./image/image2.jpg")

# 2 Affine transformation
rows,cols = img.shape[:2]
# 2.1 Create the transformation matrix from three point pairs
pts1 = np.float32([[50,50],[200,50],[50,200]])
pts2 = np.float32([[100,100],[200,50],[100,250]])
M = cv.getAffineTransform(pts1,pts2)
# 2.2 Apply the affine transformation
dst = cv.warpAffine(img,M,(cols,rows))

# 3 Display the images
fig,axes=plt.subplots(nrows=1,ncols=2,figsize=(10,8),dpi=100)
axes[0].imshow(img[:,:,::-1])
axes[0].set_title("Original")
axes[1].imshow(dst[:,:,::-1])
axes[1].set_title("After affine transformation")
plt.show()

5. Perspective transformation

1. Introduction to perspective transformation

A perspective transformation is the result of a change in viewing angle. It uses the condition that the center of projection, the image point, and the target point are collinear: following the law of perspective rotation, the image-bearing plane (perspective plane) is rotated by some angle about the trace line (perspective axis). This breaks up the original bundle of projection rays while keeping the projected geometry on the image-bearing plane unchanged.
[Figure: illustration of the perspective transformation]
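Its essence is to project the image onto a new plane. Written with column vectors (the convention OpenCV uses), its general transformation formula is:
$$
\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \\ a_{20} & a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} u \\ v \\ w \end{bmatrix}
$$
where (u, v) are the pixel coordinates in the original image, w takes the value 1, and (x = x'/z', y = y'/z') is the result of the perspective transformation. The 3×3 matrix is called the perspective transformation matrix. Generally, we divide it into three parts:
$$
T = \begin{bmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \\ a_{20} & a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} T_1 & T_2 \\ T_3 & a_{22} \end{bmatrix}
$$
where T1 (the upper-left 2×2 block) represents the linear transformation of the image, T2 (the upper-right 2×1 column) represents the translation of the image, and T3 (the lower-left 1×2 row) represents the projective part of the transformation. a22 is generally 1.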

In OpenCV, we need to find four points, no three of which are collinear, together with their positions in the output image. From these we obtain the transformation matrix T and then perform the perspective transformation. The function cv.getPerspectiveTransform computes the 3x3 transformation matrix, and cv.warpPerspective applies it to the image.

2. Example

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
# 1 Read the image
img = cv.imread("./image/image2.jpg")
# 2 Perspective transformation
rows,cols = img.shape[:2]
# 2.1 Create the transformation matrix from four point pairs
pts1 = np.float32([[56,65],[368,52],[28,387],[389,390]])
pts2 = np.float32([[100,145],[300,100],[80,290],[310,300]])

T = cv.getPerspectiveTransform(pts1,pts2)
# 2.2 Apply the transformation
dst = cv.warpPerspective(img,T,(cols,rows))

# 3 Display the images
fig,axes=plt.subplots(nrows=1,ncols=2,figsize=(10,8),dpi=100)
axes[0].imshow(img[:,:,::-1])
axes[0].set_title("Original")
axes[1].imshow(dst[:,:,::-1])
axes[1].set_title("After perspective transformation")
plt.show()

6. Image Pyramid

1. Introduction

An image pyramid is a multi-scale representation of an image, used mainly for image segmentation. It is a conceptually simple structure for interpreting an image at multiple resolutions.
It is used in machine vision and image compression. The pyramid of an image is a collection of images with progressively lower resolution, arranged in a pyramid shape and all derived from the same original image. It is obtained by successive downsampling, which continues until a termination condition is reached.

The base of the pyramid is a high-resolution representation of the image to be processed, while the top is a low-resolution approximation. The higher the level, the smaller the image and the lower the resolution.
[Figure: image pyramid with the high-resolution image at the base and smaller levels above it]

2. API

cv.pyrUp(img)       # upsample the image
cv.pyrDown(img)        # downsample the image

3. Example

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
# 1 Read the image
img = cv.imread("./image/image2.jpg")
# 2 Sample the image
up_img = cv.pyrUp(img)  # upsampling
img_1 = cv.pyrDown(img)  # downsampling
# 3 Display the images
cv.imshow('enlarge', up_img)
cv.imshow('original', img)
cv.imshow('shrink', img_1)
cv.waitKey(0)
cv.destroyAllWindows()

Origin blog.csdn.net/weixin_44463519/article/details/125907709