Image processing 002_Geometric transformation of images in OpenCV

This article is based on the image-processing section of the OpenCV-Python tutorial. Its main content is as follows:

Goals

  • Learn to apply different geometric transformations to images, such as translation, rotation, affine transformation, etc.
  • We will see these functions: cv.getPerspectiveTransform.

Transformations

OpenCV provides two transformation functions, cv.warpAffine and cv.warpPerspective, with which all kinds of transformations can be performed. cv.warpAffine takes a 2x3 transformation matrix, while cv.warpPerspective takes a 3x3 transformation matrix as its parameter.
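To make the 2x3 vs 3x3 distinction concrete, here is a minimal pure-NumPy sketch (the helper names apply_affine and apply_homography are illustrative, not OpenCV API): an affine matrix maps a point directly, while a perspective matrix works in homogeneous coordinates and needs a final divide by w.

```python
import numpy as np

# A 2x3 affine matrix maps (x, y) -> (M11*x + M12*y + M13, M21*x + M22*y + M23).
A = np.float32([[1, 0, 100],
                [0, 1, 50]])      # a pure translation by (100, 50)

# A 3x3 perspective matrix works in homogeneous coordinates:
# (x, y, 1) -> (x', y', w), and the output pixel is (x'/w, y'/w).
H = np.float32([[1, 0, 100],
                [0, 1, 50],
                [0, 0, 1]])       # the same translation written as a 3x3 matrix

def apply_affine(M, x, y):
    return M @ np.array([x, y, 1.0])

def apply_homography(H, x, y):
    xp, yp, w = H @ np.array([x, y, 1.0])
    return xp / w, yp / w

print(apply_affine(A, 10, 20))       # [110.  70.]
print(apply_homography(H, 10, 20))   # (110.0, 70.0)
```

For a pure translation the two give the same answer; a general homography has a non-trivial bottom row, which is exactly what the extra divide handles.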

Scaling

Scaling simply changes the size of an image. OpenCV provides the function cv.resize() for this. The target size can be specified manually, or a scaling factor can be given. Different interpolation methods can be used: the preferred method for shrinking is cv.INTER_AREA, and for enlarging cv.INTER_CUBIC (slow) or cv.INTER_LINEAR. By default, cv.INTER_LINEAR is used for all resizing. We can resize an input image in either of the following ways:

import cv2 as cv
import numpy as np
from matplotlib import pyplot as plt


def scaling():
    cv.samples.addSamplesDataSearchPath("/media/data/my_multimedia/opencv-4.x/samples/data")
    img = cv.imread(cv.samples.findFile('messi5.jpg'))

    # Scale by a factor of 2 in both directions.
    res = cv.resize(img, None, fx=2, fy=2, interpolation=cv.INTER_CUBIC)
    cv.imshow('frame', res)

    # OR: pass the target size explicitly as (width, height).
    height, width = img.shape[:2]
    res = cv.resize(img, (2 * width, 2 * height), interpolation=cv.INTER_CUBIC)
    cv.waitKey()

    cv.destroyAllWindows()

Scaling is useful for operations that take multiple images as input. When such an operation constrains the sizes of its inputs and the actual images do not meet that constraint, scaling lets us resize some of them to satisfy the requirement: for example, adding multiple images together, or concatenating images horizontally or vertically.

Translation

Translation shifts an object's position. If the offset in the (x, y) direction is $(t_x, t_y)$, we can create the following transformation matrix:

$$M = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix}$$

$$dst(x, y) = src(M_{11} x + M_{12} y + M_{13},\; M_{21} x + M_{22} y + M_{23}) = src(1 \cdot x + 0 \cdot y + t_x,\; 0 \cdot x + 1 \cdot y + t_y)$$
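This mapping can be checked by hand. Note that by default cv.warpAffine first inverts M internally, so with M = [[1, 0, tx], [0, 1, ty]] the image visually shifts right by tx and down by ty. The pure-NumPy loop below (an illustrative sketch, not OpenCV code) reproduces that effect on a tiny array:

```python
import numpy as np

# Pure-NumPy sketch of what translating by (tx, ty) does to pixel positions;
# pixels shifted in from outside the source default to 0, matching
# warpAffine's default constant border.
src = np.arange(16, dtype=np.uint8).reshape(4, 4)
tx, ty = 1, 2
dst = np.zeros_like(src)
for y in range(4):
    for x in range(4):
        xs, ys = x - tx, y - ty          # where dst(x, y) samples from
        if 0 <= xs < 4 and 0 <= ys < 4:
            dst[y, x] = src[ys, xs]

print(dst)
```

The original values appear shifted one column right and two rows down, with zeros filling the uncovered border.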

We can put this matrix into a NumPy array of type np.float32 and pass it to the cv.warpAffine() function. The following example shifts the image by (100, 50):

def translation():
    cv.samples.addSamplesDataSearchPath("/media/data/my_multimedia/opencv-4.x/samples/data")
    img = cv.imread(cv.samples.findFile('messi5.jpg'))

    rows, cols, _ = img.shape
    M = np.float32([[1, 0, 100], [0, 1, 50]])
    dst = cv.warpAffine(img, M, (cols, rows))
    
    dst = cv.hconcat([img, dst])

    cv.imshow('frame', dst)
    cv.waitKey()

    cv.destroyAllWindows()

The third argument of cv.warpAffine() is the size of the output image, in the form (width, height). Remember: width = number of columns, height = number of rows.

The result you see should look like this:

Translation

Rotation

Rotating an image by an angle θ can be achieved with the following transformation matrix:

$$M = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

But OpenCV provides scaled rotation with an adjustable center of rotation, so you can rotate about any point you like. The modified transformation matrix is given by:

$$\begin{bmatrix} \alpha & \beta & (1 - \alpha) \cdot center.x - \beta \cdot center.y \\ -\beta & \alpha & \beta \cdot center.x + (1 - \alpha) \cdot center.y \end{bmatrix}$$

where:

$$\alpha = scale \cdot \cos\theta, \quad \beta = scale \cdot \sin\theta$$
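These formulas can be turned into a small helper that shows what cv.getRotationMatrix2D computes (rotation_matrix_2d is a hypothetical name for this sketch, not an OpenCV function):

```python
import numpy as np

def rotation_matrix_2d(center, angle_deg, scale):
    # alpha = scale * cos(theta), beta = scale * sin(theta), as above.
    a = scale * np.cos(np.radians(angle_deg))
    b = scale * np.sin(np.radians(angle_deg))
    cx, cy = center
    return np.array([[a,  b, (1 - a) * cx - b * cy],
                     [-b, a, b * cx + (1 - a) * cy]])

# A 90-degree rotation about the origin sends (1, 0) to (0, -1):
# counter-clockwise on screen, since image y points down.
M = rotation_matrix_2d((0, 0), 90, 1.0)
print(M @ np.array([1.0, 0.0, 1.0]))   # ≈ [0. -1.]
```

cv.getRotationMatrix2D((0, 0), 90, 1.0) should produce the same 2x3 matrix up to floating-point precision.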

To obtain this rotation matrix, OpenCV provides the function cv.getRotationMatrix2D. The example below rotates the image by 120 degrees about its center and scales it by 1.2x.

def rotation():
    cv.samples.addSamplesDataSearchPath("/media/data/my_multimedia/opencv-4.x/samples/data")
    img = cv.imread(cv.samples.findFile('messi5.jpg'))

    rows, cols, _ = img.shape

    # cols-1 and rows-1 are the coordinate limits.
    M = cv.getRotationMatrix2D(((cols - 1) / 2.0, (rows - 1) / 2.0), 120, 1.2)
    dst = cv.warpAffine(img, M, (cols, rows))

    dst = cv.hconcat([img, dst])
    cv.imshow('frame', dst)
    cv.waitKey()

    cv.destroyAllWindows()

Let’s take a look at the results:

Rotation

Affine transformation

In an affine transformation, all parallel lines in the original image remain parallel in the output image. To find the transformation matrix, we need three points in the input image and their corresponding positions in the output image. cv.getAffineTransform creates a 2x3 matrix, which is then passed to cv.warpAffine.
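For intuition, cv.getAffineTransform just solves a small linear system: each of the three point pairs contributes two equations in the six unknowns of M. A pure-NumPy sketch, using the same point pairs as the example code further below:

```python
import numpy as np

# Three source points and their target positions.
pts1 = np.float32([[50, 50], [200, 50], [50, 200]])
pts2 = np.float32([[10, 100], [200, 50], [100, 250]])

# Each pair (x, y) -> (x', y') gives two equations:
#   x' = M11*x + M12*y + M13,   y' = M21*x + M22*y + M23
A = np.array([[x, y, 1] for x, y in pts1], dtype=np.float64)
row1 = np.linalg.solve(A, pts2[:, 0].astype(np.float64))  # M11, M12, M13
row2 = np.linalg.solve(A, pts2[:, 1].astype(np.float64))  # M21, M22, M23
M = np.vstack([row1, row2])

# Sanity check: M maps the first source point onto its target.
print(M @ np.array([50.0, 50.0, 1.0]))   # ≈ [10. 100.]
```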

In the example below, the three selected source points are marked in green:

def affine_transformation():
    img = np.zeros((512, 512, 3), np.uint8)
    cv.rectangle(img, (0, 0), (512, 512), (255, 255, 255), -1)

    cv.line(img, (0, 50), (512, 50), (0, 0, 0), 3)
    cv.line(img, (0, 150), (512, 150), (0, 0, 0), 3)
    cv.line(img, (0, 300), (512, 300), (0, 0, 0), 3)
    cv.line(img, (0, 450), (512, 450), (0, 0, 0), 3)

    cv.line(img, (100, 0), (100, 512), (0, 0, 0), 3)
    cv.line(img, (256, 0), (256, 512), (0, 0, 0), 3)
    cv.line(img, (412, 0), (412, 512), (0, 0, 0), 3)

    cv.rectangle(img, (60, 170), (430, 400), (0, 0, 0), 3)

    # Mark the three selected source points (pts1) in green.
    # Arguments: img, center, radius, color, thickness (-1 fills the circle).
    cv.circle(img, (50, 50), 8, (0, 255, 0), -1)
    cv.circle(img, (200, 50), 8, (0, 255, 0), -1)
    cv.circle(img, (50, 200), 8, (0, 255, 0), -1)

    rows, cols, ch = img.shape

    pts1 = np.float32([[50, 50], [200, 50], [50, 200]])
    pts2 = np.float32([[10, 100], [200, 50], [100, 250]])

    M = cv.getAffineTransform(pts1, pts2)

    dst = cv.warpAffine(img, M, (cols, rows))

    plt.subplot(121), plt.imshow(img), plt.title('Input')
    plt.subplot(122), plt.imshow(dst), plt.title('Output')
    plt.show()


if __name__ == "__main__":
    affine_transformation()

You can see the following results:

image

Perspective transformation

For a perspective transformation, we need a 3x3 transformation matrix. Straight lines remain straight after the transformation. To find this transformation matrix, we need 4 points on the input image and the corresponding points on the output image; among these 4 points, no 3 should be collinear. The transformation matrix can then be found with the function cv.getPerspectiveTransform, and applied with cv.warpPerspective.
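Under the hood, cv.getPerspectiveTransform solves for the 8 unknowns of H (with the bottom-right entry fixed to 1) from the four correspondences. A pure-NumPy sketch of that solve, using the same points as the code further below:

```python
import numpy as np

# Four source points and their target positions.
pts1 = np.float32([[70, 80], [490, 70], [30, 510], [515, 515]])
pts2 = np.float32([[0, 0], [515, 0], [0, 515], [515, 515]])

# Each pair (x, y) -> (u, v) gives two linear equations in the 8 unknowns
# h11..h32 (with h33 = 1), derived from u = (h11*x + h12*y + h13) / w, etc.
A, b = [], []
for (x, y), (u, v) in zip(pts1, pts2):
    A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
    A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
h = np.linalg.solve(np.array(A, np.float64), np.array(b, np.float64))
H = np.append(h, 1.0).reshape(3, 3)

# Check: the first source corner lands on (0, 0) after the divide by w.
p = H @ np.array([70.0, 80.0, 1.0])
print(p[:2] / p[2])   # ≈ [0. 0.]
```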

You can look at the following code:

def perspective_transformation():
    cv.samples.addSamplesDataSearchPath("/media/data/my_multimedia/opencv-4.x/samples/data")
    img = cv.imread(cv.samples.findFile('sudoku.png'))
    rows, cols, ch = img.shape

    pts1 = np.float32([[70, 80], [490, 70], [30, 510], [515, 515]])
    pts2 = np.float32([[0, 0], [515, 0], [0, 515], [515, 515]])
    M = cv.getPerspectiveTransform(pts1, pts2)
    dst = cv.warpPerspective(img, M, (515, 515))

    cv.line(img, (0, int(rows / 2)), (cols, int(rows / 2)), (0, 255, 0), 3)
    cv.line(img, (int(cols / 2), 0), (int(cols / 2), rows), (0, 255, 0), 3)

    cv.circle(img, (70, 80), 8, (0, 255, 0), -1)
    cv.circle(img, (490, 70), 8, (0, 255, 0), -1)
    cv.circle(img, (30, 510), 8, (0, 255, 0), -1)
    cv.circle(img, (515, 515), 8, (0, 255, 0), -1)

    plt.subplot(121), plt.imshow(img), plt.title('Input')

    cv.line(dst, (0, int(rows / 2)), (cols, int(rows / 2)), (0, 255, 0), 3)
    cv.line(dst, (int(cols / 2), 0), (int(cols / 2), rows), (0, 255, 0), 3)
    plt.subplot(122), plt.imshow(dst), plt.title('Output')
    plt.show()


if __name__ == "__main__":
    perspective_transformation()

The final result is as shown below:

image

Other resources

  1. “Computer Vision: Algorithms and Applications”, Richard Szeliski

Reference documentation

Geometric Transformations of Images

Commonly used LaTex mathematical formulas in Markdown

Markdown math formula syntax

Cmd Markdown formula guide

Done.


Origin blog.csdn.net/tq08g2z/article/details/123930589