[OpenCV-Python] 20 image pyramid

OpenCV-Python: IV Image Processing in OpenCV

20 Image pyramids

Goals
  • Learn about image pyramids
  • Use image pyramids to create a new fruit: the "orange apple"
  • Learn the functions cv2.pyrUp() and cv2.pyrDown().

20.1 Principle

Normally we work with an image at a single, fixed resolution. In some cases, however, we need to process the same image at several different resolutions. For example, when searching an image for a target such as a face, we do not know in advance how large the target will appear. In that case we create a set of copies of the original image at different resolutions and search all of them. This set of images is called an image pyramid. If we put the largest image at the bottom and the smallest at the top, the stack looks like a pyramid, hence the name.
  There are two types of image pyramids: Gaussian pyramids and Laplacian pyramids.
  A higher (lower-resolution) level of a Gaussian pyramid is obtained from the level below it by removing every other row and column. Each pixel in the higher level is a Gaussian-weighted average of pixels in the level below, taken over a window 5 pixels wide. In this way an M x N image becomes an M/2 x N/2 image, so its area is one quarter of the original. Each such halving step is called an octave. Repeating the operation yields an image pyramid of steadily decreasing resolution. We can use the functions cv2.pyrDown() and cv2.pyrUp() to build an image pyramid.
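To make the reduction step concrete, here is a minimal NumPy-only sketch of one octave. It assumes the 5-tap weights 1, 4, 6, 4, 1 (divided by 16) applied along each axis and reflected borders; the helper name reduce_once is hypothetical, and the result is not guaranteed to match cv2.pyrDown() bit for bit.

```python
import numpy as np

# Assumed 5-tap Gaussian-like weights; they sum to 1.
KERNEL_1D = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0

def reduce_once(image):
    """Blur a 2-D grayscale array, then keep every other row and column."""
    height, width = image.shape
    padded = np.pad(image.astype(np.float64), 2, mode="reflect")
    # Separable filtering: weighted sum along columns, then along rows.
    horizontal = sum(w * padded[:, k:k + width] for k, w in enumerate(KERNEL_1D))
    blurred = sum(w * horizontal[k:k + height, :] for k, w in enumerate(KERNEL_1D))
    # Dropping every other row and column halves each dimension.
    return blurred[::2, ::2]

flat = np.full((8, 12), 10.0)
smaller = reduce_once(flat)
print(smaller.shape)
```

An 8 x 12 input comes out as 4 x 6, one quarter of the area, and a constant image stays constant because the weights sum to 1.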
  The function cv2.pyrDown() builds the pyramid upward from a large, high-resolution image: each call produces a smaller image with lower resolution.

import cv2

higher_reso = cv2.imread('messi5.jpg')
lower_reso = cv2.pyrDown(higher_reso)

The figure below shows a four-level image pyramid.

[Figure: four-level image pyramid]

The function cv2.pyrUp() goes in the opposite direction, from a small, low-resolution image toward the bottom of the pyramid: the image becomes larger, but its resolution (level of detail) does not increase.

higher_reso2 = cv2.pyrUp(lower_reso)

Gaussian Pyramid

Remember that higher_reso2 and higher_reso are not the same image. Once cv2.pyrDown() has been applied, the resolution is reduced and information is lost, and cv2.pyrUp() cannot bring it back. The accompanying figure shows the third level (counting from the bottom) of the pyramid generated by cv2.pyrDown(), enlarged again with cv2.pyrUp(). It is much blurrier than the original image.
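  The Laplacian pyramid is computed from the Gaussian pyramid. Each level is the difference between a Gaussian level and the enlarged version of the next level above it:
    L_i = G_i - pyrUp(G_{i+1})
  where G_i is level i of the Gaussian pyramid and L_i is the corresponding level of the Laplacian pyramid.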

The levels of a Laplacian pyramid look like edge maps, and most of their pixels are 0. Laplacian pyramids are often used in image compression. The picture below is a three-level Laplacian pyramid:

Laplacian Pyramid

20.2 Using pyramids for image fusion

One application of image pyramids is image fusion (blending). For example, in image stitching you need to join two images, but the abrupt change in pixel values along the seam makes the result look poor. Blending with image pyramids gives a seamless transition. A classic example is merging two fruits into one. Look at the picture below and you will see what I mean.
   Pyramid Blending
You can learn more about image fusion and the details of the Laplacian pyramid from additional reading on the topic.
The steps to achieve the above effect are as follows:

  1. Read in the two images, the apple and the orange
  2. Build a Gaussian pyramid for the apple and for the orange (6 levels)
  3. Compute the Laplacian pyramids from the Gaussian pyramids
  4. Fuse the two Laplacian pyramids level by level (the left half of the apple with the right half of the orange)
  5. Reconstruct the final image from the fused pyramid.
import cv2
import numpy as np

# Both images are assumed to have the same size, with dimensions
# divisible by 32 so that every pyrUp() result matches the level below it.
A = cv2.imread('apple.jpg')
B = cv2.imread('orange.jpg')

# generate Gaussian pyramid for A
G = A.copy()
gpA = [G]
for i in range(6):
    G = cv2.pyrDown(G)
    gpA.append(G)

# generate Gaussian pyramid for B
G = B.copy()
gpB = [G]
for i in range(6):
    G = cv2.pyrDown(G)
    gpB.append(G)

# generate Laplacian pyramid for A, smallest level first
# (note: cv2.subtract saturates at 0 for 8-bit images)
lpA = [gpA[5]]
for i in range(5, 0, -1):
    GE = cv2.pyrUp(gpA[i])
    L = cv2.subtract(gpA[i-1], GE)
    lpA.append(L)

# generate Laplacian pyramid for B, smallest level first
lpB = [gpB[5]]
for i in range(5, 0, -1):
    GE = cv2.pyrUp(gpB[i])
    L = cv2.subtract(gpB[i-1], GE)
    lpB.append(L)

# Now add left and right halves of images in each level
LS = []
for la, lb in zip(lpA, lpB):
    rows, cols, dpt = la.shape
    ls = np.hstack((la[:, :cols//2], lb[:, cols//2:]))
    LS.append(ls)

# now reconstruct, from the smallest level up to full size
ls_ = LS[0]
for i in range(1, 6):
    ls_ = cv2.pyrUp(ls_)
    ls_ = cv2.add(ls_, LS[i])

# image made by directly joining the two halves, for comparison
cols = A.shape[1]
real = np.hstack((A[:, :cols//2], B[:, cols//2:]))

cv2.imwrite('Pyramid_blending2.jpg', ls_)
cv2.imwrite('Direct_blending.jpg', real)

Origin blog.csdn.net/yegeli/article/details/113422402