OpenCV study notes 6: image features [Harris + SIFT] + feature matching
Image features: SIFT (Scale-Invariant Feature Transform)
image scale space
Within a certain range, the human eye can recognize an object whether it appears large or small. It is difficult for a computer to do the same. For the machine to have a unified understanding of objects at different scales, it has to consider the characteristics of the image at different scales.
The scale space is usually built with Gaussian blur.
The σ of the Gaussian function controls how strongly the image is smoothed. A larger σ value corresponds to a blurrier image.
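The effect of σ can be checked with a small NumPy sketch. It is only an illustration, not the SIFT implementation: the helper names `gaussian_kernel_1d` and `gaussian_blur`, the ±3σ kernel radius, the reflected borders and the random test image are all assumptions made for this example. In practice `cv2.GaussianBlur` does the same job.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Return a normalized 1-D Gaussian kernel covering about +/-3 sigma."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_blur(image, sigma):
    """Separable Gaussian blur with reflected borders (illustrative helper)."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image.astype(np.float64), radius, mode="reflect")
    # Filter along rows, then along columns; "valid" trims the padding again.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

# Hypothetical test image: random noise, so smoothing is easy to measure.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
sigmas = [1.0, 2.0, 4.0]
scale_space = [gaussian_blur(image, s) for s in sigmas]

for s, blurred in zip(sigmas, scale_space):
    # The standard deviation drops as sigma grows: the image gets smoother.
    print(f"sigma={s}: std={blurred.std():.4f}")
```

Each entry of `scale_space` has the same size as the input; only the amount of detail that survives changes with σ.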
multi-resolution pyramid
Difference of Gaussian Pyramid (DOG)
DoG space extreme value detection
To find the extreme points of the scale space, each pixel is compared with all of its neighbors in the image domain (same scale) and in the scale domain (adjacent scales). When it is greater than (or smaller than) all of these neighbors, the point is an extreme point. As shown in the figure below, the middle detection point is compared with the 8 pixels in its 3×3 neighborhood in its own image and with the 18 pixels in the 3×3 regions of the adjacent upper and lower layers, 26 pixels in total.
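The 26-neighbor comparison can be sketched in NumPy as follows. This assumes the DoG layers are stacked into an array `dog` of shape (scales, height, width), obtained by subtracting adjacent Gaussian-blurred layers; the function name `is_extremum` and the tiny synthetic stack are hypothetical and only serve the example.

```python
import numpy as np

def is_extremum(dog, s, y, x):
    """True if dog[s, y, x] is strictly above or below all 26 neighbours.

    Assumes (s, y, x) is an interior point, so the 3x3x3 cube is complete.
    """
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    center = cube[1, 1, 1]
    # Flattened index 13 is the center of a 3x3x3 cube; drop it.
    neighbours = np.delete(cube.ravel(), 13)
    return bool(np.all(center > neighbours) or np.all(center < neighbours))

# Synthetic stack of three DoG layers with a single bump in the middle layer.
dog = np.zeros((3, 5, 5))
dog[1, 2, 2] = 1.0

print(is_extremum(dog, 1, 2, 2))  # the bump itself
print(is_extremum(dog, 1, 2, 3))  # a flat pixel next to the bump
```

The 8 neighbors in the same layer plus 9 in each adjacent layer give the 26 values held in `neighbours`.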
Precise positioning of key points
These candidate key points are local extreme points in the DoG space, and they are discrete points. One way to locate the extreme points accurately is to fit a curve to the scale-space DoG function and compute the extreme point of the fitted curve, which gives the precise position of the key point.
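A one-dimensional version of this idea is easy to write down. The sketch below fits a parabola through three neighboring samples and returns the sub-sample offset of its extremum. It is a simplification: the full method fits in position and scale together, and the function name `refine_1d` and the test function are assumptions made for this example.

```python
def refine_1d(f_prev, f_center, f_next):
    """Fit a parabola through samples at x = -1, 0, 1.

    Returns (offset, value): the position of the parabola's extremum
    relative to the center sample, and the interpolated value there.
    """
    a = (f_prev + f_next - 2.0 * f_center) / 2.0  # curvature term
    b = (f_next - f_prev) / 2.0                   # slope term
    if a == 0:
        return 0.0, f_center                      # flat: nothing to refine
    offset = -b / (2.0 * a)
    value = f_center - b * b / (4.0 * a)
    return offset, value

# Hypothetical signal with its true peak at x = 0.3, sampled on the integer grid.
def f(x):
    return -(x - 0.3) ** 2

offset, value = refine_1d(f(-1), f(0), f(1))
print(offset, value)
```

The discrete maximum sits at x = 0, while the fitted parabola recovers the peak between the samples.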
Eliminate edge responses
The main direction of the feature point
Each feature point carries three pieces of information, (x, y), σ and θ: position, scale and direction. A key point with several dominant directions is duplicated, and each copy is assigned one of the direction values. One feature point therefore generates several feature points with the same coordinates and scale but different directions.
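The duplication of key points with several directions can be sketched with a gradient orientation histogram. The 36 bins and the 80% peak threshold below are the values commonly quoted for SIFT and are treated here as assumptions; the function name `dominant_orientations` and the sample gradients are hypothetical.

```python
import numpy as np

def dominant_orientations(magnitudes, angles_deg, num_bins=36, peak_ratio=0.8):
    """Return bin-center angles (degrees) of local histogram peaks
    that reach at least peak_ratio times the highest peak."""
    bin_width = 360.0 / num_bins
    bins = (np.asarray(angles_deg) % 360.0 // bin_width).astype(int) % num_bins
    # Each pixel votes for its direction bin, weighted by gradient magnitude.
    hist = np.bincount(bins, weights=magnitudes, minlength=num_bins)
    threshold = peak_ratio * hist.max()
    peaks = []
    for i in range(num_bins):
        left, right = hist[i - 1], hist[(i + 1) % num_bins]  # circular neighbours
        if hist[i] >= threshold and hist[i] > left and hist[i] > right:
            peaks.append((i + 0.5) * bin_width)
    return peaks

# Hypothetical neighbourhood: two strong directions and one weak one.
magnitudes = np.array([5.0, 5.0, 4.5, 4.5, 1.0])
angles_deg = np.array([10.0, 12.0, 95.0, 97.0, 200.0])
peaks = dominant_orientations(magnitudes, angles_deg)

# One key point at (x, y, sigma) becomes one key point per dominant direction.
x, y, sigma = 40.0, 25.0, 1.6
keypoints = [(x, y, sigma, theta) for theta in peaks]
print(keypoints)
```

The weak direction near 200° falls below the threshold, so only two copies of the key point are created.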
Generate feature descriptions
After the gradients around the key point have been computed, a histogram is used to accumulate the gradient magnitudes and directions of the pixels in the neighborhood.
In order to ensure the rotation invariance of the feature vector, the feature point should be taken as the center and the coordinate axis should be rotated by an angle θ in the nearby neighborhood, that is, the coordinate axis should be rotated to the main direction of the feature point.
After rotating to the main direction, take an 8x8 window centered on the key point and compute the gradient magnitude and direction of each pixel. The direction of each arrow represents the gradient direction, and its length represents the gradient magnitude. The gradients are then weighted with a Gaussian window. Finally, an 8-direction gradient histogram is built on each 4x4 patch, and the accumulated value of each direction forms a seed point. In this illustration each feature consists of 4 seed points, and each seed point holds a vector of 8 direction values.
The paper recommends describing each key point with 4x4 = 16 seed points, so that one key point yields a 16 × 8 = 128-dimensional SIFT feature vector.
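The 4x4 seed points with 8 directions each can be illustrated with a simplified descriptor. This sketch assumes a 16x16 patch that has already been rotated to the main direction, and it leaves out the Gaussian weighting and the interpolation between bins, so it is not the real SIFT descriptor. The function name `simple_descriptor` is hypothetical.

```python
import numpy as np

def simple_descriptor(patch, cells=4, num_bins=8):
    """Build a cells*cells*num_bins vector from a square, already-rotated patch."""
    patch = patch.astype(np.float64)
    gy, gx = np.gradient(patch)                  # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    bins = (angle // (360.0 / num_bins)).astype(int) % num_bins
    cell_size = patch.shape[0] // cells
    descriptor = np.zeros((cells, cells, num_bins))
    for cy in range(cells):
        for cx in range(cells):
            rows = slice(cy * cell_size, (cy + 1) * cell_size)
            cols = slice(cx * cell_size, (cx + 1) * cell_size)
            # One seed point: an 8-bin histogram of magnitude-weighted directions.
            descriptor[cy, cx] = np.bincount(
                bins[rows, cols].ravel(),
                weights=magnitude[rows, cols].ravel(),
                minlength=num_bins,
            )
    vector = descriptor.ravel()
    norm = np.linalg.norm(vector)
    return vector / norm if norm > 0 else vector

rng = np.random.default_rng(1)
patch = rng.random((16, 16))                     # hypothetical input patch
desc = simple_descriptor(patch)
print(desc.shape)                                # 4 * 4 * 8 values
```

Normalizing the vector to unit length is a common choice that makes the descriptor less sensitive to overall contrast.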
opencv SIFT function
import cv2
import numpy as np
import matplotlib.pyplot as plt  # Matplotlib expects RGB
img = cv2.imread('test_1.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
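def cv_show(img,name):
    # OpenCV stores channels as BGR; reorder to RGB for Matplotlib
    b,g,r = cv2.split(img)
    img_rgb = cv2.merge((r,g,b))
    plt.imshow(img_rgb)
    plt.show()
def cv_show1(img,name):
    # Show with Matplotlib first, then in an OpenCV window until a key is pressed
    plt.imshow(img)
    plt.show()
    cv2.imshow(name,img)
    cv2.waitKey()
    cv2.destroyAllWindows()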
cv2.__version__  # 3.4.1.15; installed with: pip install opencv-python==3.4.1.15 opencv-contrib-python==3.4.1.15
'3.4.1'
Get feature points
sift = cv2.xfeatures2d.SIFT_create()  # on OpenCV 4.4 and later, use cv2.SIFT_create()
kp = sift.detect(gray, None)
img = cv2.drawKeypoints(gray, kp, img)
cv_show(img,'drawKeypoints')
# cv2.imshow('drawKeypoints', img)
# cv2.waitKey(0)
# cv2.destroyAllWindows()
Compute features
kp, des = sift.compute(gray, kp)
print (np.array(kp).shape)
(6827,)
des.shape
(6827, 128)
des[0]
array([ 0., 0., 0., 0., 0., 0., 0., 0., 21., 8., 0.,
0., 0., 0., 0., 0., 157., 31., 3., 1., 0., 0.,
2., 63., 75., 7., 20., 35., 31., 74., 23., 66., 0.,
0., 1., 3., 4., 1., 0., 0., 76., 15., 13., 27.,
8., 1., 0., 2., 157., 112., 50., 31., 2., 0., 0.,
9., 49., 42., 157., 157., 12., 4., 1., 5., 1., 13.,
7., 12., 41., 5., 0., 0., 104., 8., 5., 19., 53.,
5., 1., 21., 157., 55., 35., 90., 22., 0., 0., 18.,
3., 6., 68., 157., 52., 0., 0., 0., 7., 34., 10.,
10., 11., 0., 2., 6., 44., 9., 4., 7., 19., 5.,
14., 26., 37., 28., 32., 92., 16., 2., 3., 4., 0.,
0., 6., 92., 23., 0., 0., 0.], dtype=float32)
Image features: Harris corner detection
Basic principle
cv2.cornerHarris()
- img: Input image, single-channel, 8-bit or float32
- blockSize: The size of the neighborhood considered for corner detection
- ksize: Aperture size used for the Sobel derivative
- k: Free parameter of the Harris response, typically in the range [0.04, 0.06]
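To show where `blockSize` and `k` enter, here is a rough NumPy sketch of the Harris response R = det(M) - k * trace(M)^2. It is an approximation for illustration and not what `cv2.cornerHarris` does internally: `np.gradient` stands in for the Sobel derivative, a plain box window stands in for the `blockSize` neighborhood, and the names `window_sum` and `harris_response` are hypothetical.

```python
import numpy as np

def window_sum(a, radius=1):
    """Sum of a over a (2*radius+1)^2 window around each pixel (zero padding)."""
    padded = np.pad(a, radius, mode="constant")
    h, w = a.shape
    out = np.zeros_like(a)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out

def harris_response(gray, k=0.04, radius=1):
    """Harris response map R = det(M) - k * trace(M)^2."""
    gray = gray.astype(np.float64)
    iy, ix = np.gradient(gray)           # image derivatives along rows, columns
    sxx = window_sum(ix * ix, radius)    # entries of the structure matrix M
    syy = window_sum(iy * iy, radius)
    sxy = window_sum(ix * iy, radius)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2

# Synthetic image: a white square on a black background.
gray = np.zeros((20, 20))
gray[5:15, 5:15] = 1.0
response = harris_response(gray)

print("corner:", response[5, 5])    # positive at a corner of the square
print("edge:  ", response[5, 10])   # negative along a straight edge
print("flat:  ", response[1, 1])    # zero in a flat region
```

A larger `k` penalizes edge-like regions more strongly, which is why only pixels with a clearly positive response are kept as corners.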
import cv2
import numpy as np
import matplotlib.pyplot as plt  # Matplotlib expects RGB
img = cv2.imread('chessboard.jpg')
print ('img.shape:',img.shape)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# gray = np.float32(gray)
dst = cv2.cornerHarris(gray, 2, 3, 0.04)
print ('dst.shape:',dst.shape)
img.shape: (512, 512, 3)
dst.shape: (512, 512)
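def cv_show(img,name):
    # OpenCV stores channels as BGR; reorder to RGB for Matplotlib
    b,g,r = cv2.split(img)
    img_rgb = cv2.merge((r,g,b))
    plt.imshow(img_rgb)
    plt.show()
def cv_show1(img,name):
    # Show with Matplotlib first, then in an OpenCV window until a key is pressed
    plt.imshow(img)
    plt.show()
    cv2.imshow(name,img)
    cv2.waitKey()
    cv2.destroyAllWindows()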
# Mark every pixel whose response exceeds 1% of the maximum response in white
img[dst > 0.01 * dst.max()] = [255, 255, 255]
cv_show(img,'dst')
# cv2.imshow('dst',img)
# cv2.waitKey(0)
# cv2.destroyAllWindows()
feature matching
Brute-force matching
import cv2
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
img1 = cv2.imread('box.png', 0)
img2 = cv2.imread('box_in_scene.png', 0)
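def cv_show(img,name):
    # OpenCV stores channels as BGR; reorder to RGB for Matplotlib
    b,g,r = cv2.split(img)
    img_rgb = cv2.merge((r,g,b))
    plt.imshow(img_rgb)
    plt.show()
def cv_show1(img,name):
    # Show with Matplotlib first, then in an OpenCV window until a key is pressed
    plt.imshow(img)
    plt.show()
    cv2.imshow(name,img)
    cv2.waitKey()
    cv2.destroyAllWindows()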
cv_show1(img1,'img1')
cv_show1(img2,'img2')
sift = cv2.xfeatures2d.SIFT_create()  # on OpenCV 4.4 and later, use cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)
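# crossCheck=True requires the two feature points to match each other: the i-th feature in A is nearest to the j-th feature in B, and the j-th feature in B is also nearest to the i-th feature in A
# NORM_L2: Euclidean distance (the default norm); descriptors computed by other methods may need a different distance measure
bf = cv2.BFMatcher(crossCheck=True)  # brute-force matcher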
One-to-one matching
matches = bf.match(des1, des2)
matches = sorted(matches, key=lambda x: x.distance)  # sort by distance, best matches first
img3 = cv2.drawMatches(img1, kp1, img2, kp2, matches[:10], None, flags=2)  # flags=2: do not draw unmatched key points
cv_show(img3,'img3')
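What crossCheck does can be reproduced in a few lines of NumPy. The sketch below is an illustration of the idea rather than OpenCV's implementation: the function name `cross_check_match` and the toy 2-dimensional descriptors are assumptions, and real SIFT descriptors would be 128-dimensional.

```python
import numpy as np

def cross_check_match(des_a, des_b):
    """Return (i, j, distance) for mutual nearest neighbours under L2 distance."""
    # Pairwise Euclidean distances, shape (len(des_a), len(des_b)).
    dists = np.linalg.norm(des_a[:, None, :] - des_b[None, :, :], axis=2)
    nearest_b = dists.argmin(axis=1)   # best j for every i
    nearest_a = dists.argmin(axis=0)   # best i for every j
    matches = [
        (i, int(j), float(dists[i, j]))
        for i, j in enumerate(nearest_b)
        if nearest_a[j] == i           # keep only pairs that choose each other
    ]
    return sorted(matches, key=lambda m: m[2])

# Toy descriptors: the third row of des_a has no mutual partner in des_b.
des_a = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 5.0]])
des_b = np.array([[0.0, 1.0], [10.0, 2.0]])
matches = cross_check_match(des_a, des_b)
print(matches)
```

The third descriptor in `des_a` is closest to the second one in `des_b`, but that descriptor prefers a different partner, so the pair is dropped.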
k best matches
bf = cv2.BFMatcher()
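matches = bf.knnMatch(des1, des2, k=2)  # 1-to-k matching: the k nearest neighbors for each descriptor
good = []
for m, n in matches:
    # Ratio test: keep the best match only if it is clearly better than the second best
    if m.distance < 0.75 * n.distance:
        good.append([m])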
img3 = cv2.drawMatchesKnn(img1,kp1,img2,kp2,good,None,flags=2)
cv_show(img3,'img3')
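The filtering step above is often called a ratio test. A NumPy sketch of the same logic, under the assumption of toy 2-dimensional descriptors and a hypothetical function name `ratio_test_match`, looks like this:

```python
import numpy as np

def ratio_test_match(des_a, des_b, ratio=0.75):
    """Keep (i, j) only if the best distance is below ratio * second-best distance."""
    dists = np.linalg.norm(des_a[:, None, :] - des_b[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)[:, :2]    # two nearest neighbours per row
    good = []
    for i, (best, second) in enumerate(order):
        if dists[i, best] < ratio * dists[i, second]:
            good.append((i, int(best)))
    return good

des_a = np.array([[0.0, 0.0], [5.0, 5.0]])
des_b = np.array([[0.0, 1.0], [10.0, 0.0], [5.0, 4.0], [5.0, 6.0]])
good = ratio_test_match(des_a, des_b)
print(good)
```

The second descriptor in `des_a` has two equally close candidates, so its match is ambiguous and is rejected, while the first one has a single clear winner.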
If you need the matching to run faster, you can try cv2.FlannBasedMatcher.
Random sample consensus algorithm (RANSAC)
Select an initial set of sample points, fit a model to them, set a tolerance range, and iterate.
After each fit, count the data points that fall within the tolerance range. The fit with the largest number of such points is the final result.
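The loop described above can be sketched for the simplest case, fitting a straight line. This is a minimal illustration under stated assumptions: the function name `ransac_line`, the iteration count, the tolerance and the synthetic data are all made up for this example, and the final least-squares refit on the inliers is a common extra step rather than something the notes prescribe.

```python
import numpy as np

def ransac_line(points, num_iters=200, tolerance=0.5, seed=0):
    """Fit y = slope * x + intercept while ignoring outliers."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(num_iters):
        # 1. Pick a minimal sample: two points define a line.
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        # Line through the two samples in the form a*x + b*y + c = 0.
        a, b = y2 - y1, x1 - x2
        c = -(a * x1 + b * y1)
        norm = np.hypot(a, b)
        if norm == 0:
            continue
        # 2. Count the points that lie within the tolerance of this line.
        distances = np.abs(a * points[:, 0] + b * points[:, 1] + c) / norm
        inliers = distances < tolerance
        # 3. Keep the candidate supported by the most points.
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the inliers of the best candidate with least squares.
    slope, intercept = np.polyfit(points[best_inliers, 0], points[best_inliers, 1], 1)
    return slope, intercept, best_inliers

# Synthetic data: 20 points on y = 2x + 1 plus 5 gross outliers.
x = np.arange(20.0)
line_points = np.column_stack([x, 2.0 * x + 1.0])
outliers = np.array([[3.0, 40.0], [7.0, -20.0], [12.0, 60.0], [15.0, -5.0], [18.0, 90.0]])
points = np.vstack([line_points, outliers])

slope, intercept, inliers = ransac_line(points)
print(slope, intercept, inliers.sum())
```

For image matching the same scheme is applied with a different model: instead of a line through two points, a transformation is estimated from a small sample of matched key points.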
homography matrix