OpenCV study notes 6 -- image features [Harris + SIFT] + feature matching

Image Features (SIFT: Scale-Invariant Feature Transform)

image scale space

Within a certain range, the human eye can recognize an object whether it appears large or small, but it is hard for a computer to do the same. Therefore, for a machine to form a consistent understanding of objects at different scales, it has to consider the features an image exhibits at different scales.

The acquisition of scale space is usually achieved using Gaussian blur.


Gaussian functions with different σ determine the smoothness of the image. A larger σ value corresponds to a blurrier image.
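For reference, SIFT builds the scale space L by convolving the input image I with Gaussians of increasing σ (the standard formulation from Lowe's paper, as the original figures here illustrated):

L(x, y, σ) = G(x, y, σ) ∗ I(x, y)
G(x, y, σ) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))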

multi-resolution pyramid


Difference of Gaussian Pyramid (DoG)
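Each DoG layer is simply the difference of two adjacent Gaussian layers (k is the scale step between them):

D(x, y, σ) = L(x, y, kσ) − L(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) ∗ I(x, y)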


DoG space extreme value detection

To find the extreme points of the scale space, each pixel is compared with all of its neighbors in both the image domain (the same scale) and the scale domain (the adjacent scales); a point is an extreme point only when it is greater than (or smaller than) all of these neighbors. Concretely, the middle detection point is compared with the 8 pixels in its own 3×3 neighborhood plus the 2×9 = 18 pixels in the 3×3 regions of the layers above and below, 26 pixels in total.
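A minimal sketch of this 26-neighbor test, assuming below, current and above are three adjacent DoG layers held as NumPy arrays (hypothetical names, not from the original post):

import numpy as np

def is_extremum(below, current, above, r, c):
    # gather the 3x3x3 cube around (r, c): the center plus its 26 neighbors
    cube = np.stack([below[r-1:r+2, c-1:c+2],
                     current[r-1:r+2, c-1:c+2],
                     above[r-1:r+2, c-1:c+2]])
    val = current[r, c]
    # strict extremum: the center is the unique maximum (or unique minimum) of the cube
    return (val == cube.max() and (cube == val).sum() == 1) or \
           (val == cube.min() and (cube == val).sum() == 1)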


Precise positioning of key points

These candidate key points are local extreme points of the DoG space, but they are found only at discrete sample positions. One way to locate the extremum accurately is to fit a curve to the scale-space DoG function and compute its extremum analytically, achieving sub-pixel positioning of the key points.
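The standard way to do this (from Lowe's paper) is a second-order Taylor expansion of the DoG function D around the sample point x = (x, y, σ)ᵀ:

D(x) ≈ D + (∂D/∂x)ᵀ · x + ½ · xᵀ · (∂²D/∂x²) · x

Setting its derivative to zero gives the sub-pixel offset of the extremum:

x̂ = −(∂²D/∂x²)⁻¹ · (∂D/∂x)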


Eliminate edge responses
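In Lowe's paper, key points lying along edges are rejected by thresholding the ratio of principal curvatures, computed from the 2×2 Hessian H of D at the key point:

Tr(H)² / Det(H) < (r + 1)² / r

with r = 10 suggested in the paper; points that fail this test are discarded.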


The main direction of the feature point
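The gradient magnitude and direction of every pixel in the key point's neighborhood are computed on the Gaussian-smoothed image L in the standard way, then collected into an orientation histogram whose peak gives the main direction:

m(x, y) = √[(L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))²]
θ(x, y) = arctan[(L(x, y+1) − L(x, y−1)) / (L(x+1, y) − L(x−1, y))]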


Each feature point thus carries three pieces of information, (x, y, σ, θ): position, scale and direction. A key point with multiple dominant directions can be copied into several feature points, each assigned one of the direction values; one key point then produces multiple feature points with the same coordinates and scale but different directions.

Generate the feature descriptor

After the gradient computation around a key point is complete, a histogram is used to accumulate the gradient magnitudes and directions of the pixels in its neighborhood.


To ensure rotation invariance of the feature vector, the coordinate axes of the neighborhood around the feature point are rotated by the angle θ, i.e., aligned with the main direction of the feature point.


Take an 8×8 window centered on the key point, with axes rotated to the main direction, and compute the gradient magnitude and direction of each pixel (the direction of an arrow represents the gradient direction, its length the magnitude). Weight them with a Gaussian window, then draw an 8-direction gradient histogram on each 4×4 patch and accumulate the value of each gradient direction to form a seed point. Each feature here is thus composed of 4 seed points, and each seed point carries vector information for 8 directions.


The paper recommends describing each key point with 4×4 = 16 seed points, so one key point generates a 16×8 = 128-dimensional SIFT feature vector.


OpenCV SIFT functions

import cv2
import numpy as np
import matplotlib.pyplot as plt  # Matplotlib expects RGB order

img = cv2.imread('test_1.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
def cv_show(img,name):
    # convert BGR (OpenCV) to RGB for matplotlib display
    b,g,r = cv2.split(img)
    img_rgb = cv2.merge((r,g,b))
    plt.imshow(img_rgb)
    plt.show()
def cv_show1(img,name):
    # show with matplotlib and with an OpenCV window
    plt.imshow(img)
    plt.show()
    cv2.imshow(name,img)
    cv2.waitKey()
    cv2.destroyAllWindows()


cv2.__version__  # needs 3.4.1.15: pip install opencv-python==3.4.1.15 opencv-contrib-python==3.4.1.15 (SIFT lives in the contrib xfeatures2d module)
'3.4.1'

Get feature points

sift = cv2.xfeatures2d.SIFT_create()
kp = sift.detect(gray, None)            # detect key points on the grayscale image
img = cv2.drawKeypoints(gray, kp, img)  # draw the key points
cv_show(img,'drawKeypoints')
# cv2.imshow('drawKeypoints', img)
# cv2.waitKey(0)
# cv2.destroyAllWindows()

(output: image with the detected key points drawn)

Compute features

kp, des = sift.compute(gray, kp)  # compute the 128-d descriptor for each key point
print (np.array(kp).shape)
(6827,)
des.shape
(6827, 128)
des[0]
array([  0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,  21.,   8.,   0.,
         0.,   0.,   0.,   0.,   0., 157.,  31.,   3.,   1.,   0.,   0.,
         2.,  63.,  75.,   7.,  20.,  35.,  31.,  74.,  23.,  66.,   0.,
         0.,   1.,   3.,   4.,   1.,   0.,   0.,  76.,  15.,  13.,  27.,
         8.,   1.,   0.,   2., 157., 112.,  50.,  31.,   2.,   0.,   0.,
         9.,  49.,  42., 157., 157.,  12.,   4.,   1.,   5.,   1.,  13.,
         7.,  12.,  41.,   5.,   0.,   0., 104.,   8.,   5.,  19.,  53.,
         5.,   1.,  21., 157.,  55.,  35.,  90.,  22.,   0.,   0.,  18.,
         3.,   6.,  68., 157.,  52.,   0.,   0.,   0.,   7.,  34.,  10.,
        10.,  11.,   0.,   2.,   6.,  44.,   9.,   4.,   7.,  19.,   5.,
        14.,  26.,  37.,  28.,  32.,  92.,  16.,   2.,   3.,   4.,   0.,
         0.,   6.,  92.,  23.,   0.,   0.,   0.], dtype=float32)


Image features - Harris corner detection


Basic principle
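As a compact reference (the original figures walked through this derivation): for a window shift (u, v), the weighted intensity change is

E(u, v) = Σ w(x, y) · [I(x + u, y + v) − I(x, y)]²    (sum over the window)

which, for small shifts, is approximated through the structure matrix built from the image derivatives Ix, Iy:

M = Σ w(x, y) · [Ix²  Ix·Iy; Ix·Iy  Iy²]

The corner response is then

R = det(M) − k · (trace(M))²,  with det(M) = λ1·λ2 and trace(M) = λ1 + λ2

Both eigenvalues large (R large and positive) indicates a corner, one large eigenvalue (R negative) an edge, and both small (|R| small) a flat region.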


cv2.cornerHarris()

  • img: input image, single-channel, data type float32
  • blockSize: size of the neighborhood considered for corner detection
  • ksize: aperture size of the Sobel kernel used for the derivatives
  • k: Harris free parameter, typically in the range [0.04, 0.06]
import cv2 
import numpy as np
import matplotlib.pyplot as plt  # Matplotlib expects RGB order

img = cv2.imread('chessboard.jpg')
print ('img.shape:',img.shape)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# gray = np.float32(gray)
dst = cv2.cornerHarris(gray, 2, 3, 0.04)  # blockSize=2, ksize=3, k=0.04
print ('dst.shape:',dst.shape)
img.shape: (512, 512, 3)
dst.shape: (512, 512)
def cv_show(img,name):
    b,g,r = cv2.split(img)
    img_rgb = cv2.merge((r,g,b))
    plt.imshow(img_rgb)
    plt.show()
def cv_show1(img,name):
    plt.imshow(img)
    plt.show()
    cv2.imshow(name,img)
    cv2.waitKey()
    cv2.destroyAllWindows()


img[dst>0.01*dst.max()]=[255,255,255]  # mark pixels whose response exceeds 1% of the maximum
cv_show(img,'dst')
# cv2.imshow('dst',img) 
# cv2.waitKey(0) 
# cv2.destroyAllWindows()

(output: chessboard with the detected corners marked in white)


feature matching

Brute-Force matching

import cv2 
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
img1 = cv2.imread('box.png', 0)           # 0: read as grayscale
img2 = cv2.imread('box_in_scene.png', 0)  # 0: read as grayscale
def cv_show(img,name):
    b,g,r = cv2.split(img)
    img_rgb = cv2.merge((r,g,b))
    plt.imshow(img_rgb)
    plt.show()
def cv_show1(img,name):
    plt.imshow(img)
    plt.show()
    cv2.imshow(name,img)
    cv2.waitKey()
    cv2.destroyAllWindows()


cv_show1(img1,'img1')

(output: box.png)

cv_show1(img2,'img2')

(output: box_in_scene.png)

sift = cv2.xfeatures2d.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)
# crossCheck means the two feature points must match each other: the i-th feature point in A has the j-th
# feature point in B as its nearest match, and the j-th feature point in B likewise has the i-th in A as its nearest
# NORM_L2: (Euclidean) distance on the normalized descriptors; other feature types may call for a different norm
bf = cv2.BFMatcher(crossCheck=True)  # brute-force matcher

1 to 1 match

matches = bf.match(des1, des2)
matches = sorted(matches, key=lambda x: x.distance)  # sort by distance
img3 = cv2.drawMatches(img1, kp1, img2, kp2, matches[:10], None,flags=2)
cv_show(img3,'img3')

(output: the 10 best matches drawn between the two images)

k best matches

bf = cv2.BFMatcher()
matches = bf.knnMatch(des1, des2, k=2)  # 1-to-k matching
good = []
for m, n in matches:
    if m.distance < 0.75 * n.distance:
        good.append([m])
img3 = cv2.drawMatchesKnn(img1,kp1,img2,kp2,good,None,flags=2)
cv_show(img3,'img3')

(output: the good matches kept by the ratio test)

If you need faster matching, you can try cv2.FlannBasedMatcher.
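A minimal sketch of the FLANN variant, reusing des1 and des2 from above (the KD-tree index parameters are typical values, not taken from the original post):

FLANN_INDEX_KDTREE = 1  # KD-tree index, suited to SIFT's float descriptors
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)  # more checks: better accuracy, slower search
flann = cv2.FlannBasedMatcher(index_params, search_params)
matches = flann.knnMatch(des1, des2, k=2)
good = [[m] for m, n in matches if m.distance < 0.75 * n.distance]  # same ratio test as above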

Random sample consensus algorithm (RANSAC)


Select a random initial sample of points and fit a model to them, set a tolerance range, and keep iterating.


Each fit leaves a certain number of data points inside the tolerance range; the fit with the largest number of such inliers is taken as the final result.


homography matrix
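In standard form, a homography H maps points of one plane onto another up to scale:

[x', y', 1]ᵀ ~ H · [x, y, 1]ᵀ,  H = [h11 h12 h13; h21 h22 h23; h31 h32 h33]

A minimal sketch of estimating H with RANSAC from the good matches above (the 5.0-pixel reprojection tolerance and the minimum match count are assumed values, not from the original post):

if len(good) > 4:  # at least 4 point pairs are needed to solve for a homography
    src_pts = np.float32([kp1[m[0].queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst_pts = np.float32([kp2[m[0].trainIdx].pt for m in good]).reshape(-1, 1, 2)
    # mask marks the inliers that RANSAC kept within the tolerance
    H, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)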




Origin blog.csdn.net/weixin_41756645/article/details/125549354