Python computer vision (2) - local image descriptors

2.1 Harris corner detector

The main idea of the Harris corner detection algorithm: if a pixel has edges in more than one direction around it, that pixel is considered an interest point, called a corner.

Corner features

A corner is a point where moving a local window in any direction produces an obvious change; a point where the local curvature of the image changes abruptly; an intersection between contours; a point whose properties usually remain stable for the same scene even when the viewing angle changes; and a point whose neighborhood pixels show large changes in both gradient direction and gradient magnitude.

Basic idea of corner detection

When a small window moves in any direction, the gray values in the corner's neighborhood change significantly, as the short sketch below illustrates.
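
A minimal numpy sketch of this idea (the synthetic image and window positions are chosen purely for illustration): near a corner, shifting the window in any direction produces a large sum of squared differences, while in a flat region it produces none.

import numpy as np

# Synthetic image: a bright quadrant on a dark background, so the pixel
# near (10, 10) sits at a corner.
img = np.zeros((20, 20))
img[:10, :10] = 1.0

def ssd(img, y, x, u, v, half=3):
    # Sum of squared differences between the window centered at (y, x)
    # and the same window shifted by (u, v).
    w0 = img[y - half:y + half + 1, x - half:x + half + 1]
    w1 = img[y + v - half:y + v + half + 1, x + u - half:x + u + half + 1]
    return float(((w1 - w0) ** 2).sum())

for (u, v) in [(2, 0), (0, 2), (2, 2)]:
    print('corner', (u, v), ssd(img, 10, 10, u, v))  # large in every direction
    print('flat  ', (u, v), ssd(img, 4, 4, u, v))    # ~0: window sees no change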

Mathematical formulation of the Harris corner detector

A function E(u,v) can be used to determine whether a point is a corner:

E(u,v)=\sum_{x,y} w(x,y)\left [ I(x+u,y+v)-I(x,y) \right ]^{2}

w(x, y) represents a rectangular window, I(x, y) represents the gray value of the point, u and v represent the offset of the window, if there is a large gray change in any direction, then judge the Points are corner points. We can use the Gaussian weight matrix to highlight the points with large changes and give them greater weights.
We can simplify this function with the first-order Taylor expansion I(x+u,y+v)\approx I(x,y)+I_{x}u+I_{y}v:

E(u,v)\approx \sum_{x,y} w(x,y)\left [ I_{x}u+I_{y}v \right ]^{2}
Represented as a matrix:

E(u,v)\approx \begin{pmatrix} u & v \end{pmatrix}\left ( \sum_{x,y} w(x,y)\begin{pmatrix} I_{x}^{2} & I_{x}I_{y}\\ I_{x}I_{y} & I_{y}^{2} \end{pmatrix} \right )\begin{pmatrix} u\\ v \end{pmatrix}

Let M=\begin{pmatrix} I_{x}^{2} & I_{x}I_{y}\\ I_{x}I_{y} & I_{y}^{2} \end{pmatrix} and solve \left | \lambda E-M \right |=0 (E here denotes the identity matrix) to obtain the eigenvalues \lambda_{1},\lambda_{2}.
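
From these eigenvalues, the standard Harris corner response (the response function R referred to in the result analysis below) is:

R=\det M-k\left ( \operatorname{trace} M \right )^{2}=\lambda_{1}\lambda_{2}-k\left ( \lambda_{1}+\lambda_{2} \right )^{2}

If both eigenvalues are small, the region is flat; if one is much larger than the other, the point lies on an edge; if both are large, R is large and the point is a corner.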

Features:

The Harris corner detection operator is rotation invariant.

The Harris corner detection operator is insensitive to gray-level shifts and gray-level scaling.

The Harris corner detection operator is not scale invariant.

Corner detection code

Harris corner detection implemented with python-opencv:

import numpy as np
import cv2 as cv

# Read the image
im_name = 'jmu/3.jpg'
im = cv.imread(im_name)
# Convert to grayscale and to floating point
im_gray = cv.cvtColor(im, cv.COLOR_BGR2GRAY)
im_gray = np.float32(im_gray)
# Harris corner detection
'''
cv2.cornerHarris(src, blockSize, ksize, k)
Parameters:
src: input image, usually grayscale
blockSize: size of the neighborhood window used by the corner detector
ksize: aperture of the Sobel kernel used to compute image gradients, usually odd
k: free parameter of the corner response function, typically in [0.04, 0.06]
'''
# Corner response image
dst = cv.cornerHarris(im_gray, 5, 5, 0.06)
# Normalize the response image
dst_norm = np.empty_like(dst)
cv.normalize(dst, dst_norm, alpha=0, beta=255, norm_type=cv.NORM_MINMAX)

# Convert the response image to 8-bit unsigned integers
dst_norm = np.uint8(dst_norm)

# Display the response image
cv.imshow('Corner Response Image', dst_norm)
cv.waitKey(0)

# Mark corners whose response exceeds the threshold t in red on the original image
im[dst > 0.005 * dst.max()] = [0, 0, 255]

# Keep the display window from being too large
cv.namedWindow('t=0.005', cv.WINDOW_KEEPRATIO)
cv.imshow('t=0.005', im)
cv.waitKey(0)
cv.destroyAllWindows()

Result: the corner response image, and the marked corners at thresholds t=0.001, t=0.005, and t=0.05.

Result analysis:
The value 0.06 in dst = cv.cornerHarris(im_gray, 5, 5, 0.06) is the free parameter k of the corner response function, and changing k changes the number of detected corners. Increasing k lowers the corner response value R, making detection less sensitive and yielding fewer corners; decreasing k raises R, making detection more sensitive and yielding more corners. In addition, as the threshold t increases, the number of marked corners decreases.
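
To reproduce the threshold comparison above, a minimal sketch (it reuses im_name and dst from the code above; the image is re-read on each pass so markings do not accumulate):

# Sketch: effect of the threshold t on the number of marked corners.
for t in (0.001, 0.005, 0.05):
    im_t = cv.imread(im_name)
    im_t[dst > t * dst.max()] = [0, 0, 255]  # mark corners in red (BGR)
    cv.namedWindow('t=%g' % t, cv.WINDOW_KEEPRATIO)
    cv.imshow('t=%g' % t, im_t)
cv.waitKey(0)
cv.destroyAllWindows()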

2.2 SIFT (Scale Invariant Feature Transform)

SIFT features include an interest point detector and a descriptor, which are invariant to scale, rotation, and brightness changes.

Interest points

SIFT locates interest points using a difference-of-Gaussian (DoG) function:

D(x,\sigma )=\left [ G_{k\sigma }(x)-G_{\sigma }(x) \right ]*I(x)=\left [ G_{k\sigma }-G_{\sigma } \right ]*I(x)=I_{k\sigma }-I_{\sigma }

Here G_{\sigma } is a two-dimensional Gaussian kernel, I_{\sigma } is the grayscale image blurred with G_{\sigma }, and k is a constant that determines the scale separation of the difference.

The interest points are the maxima and minima of D(x,\sigma ) over both image position and scale. These candidate locations are then filtered to remove unstable points.
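
A minimal OpenCV sketch of this difference-of-Gaussian computation (the values of sigma and k here are illustrative choices, not prescribed by the text):

import cv2 as cv

im_gray = cv.imread('jmu/3.jpg', cv.IMREAD_GRAYSCALE).astype('float32')
sigma, k = 1.6, 2 ** 0.5  # illustrative scale and scale multiplier
I_sigma = cv.GaussianBlur(im_gray, (0, 0), sigma)        # I_sigma = G_sigma * I
I_ksigma = cv.GaussianBlur(im_gray, (0, 0), k * sigma)   # I_ksigma = G_ksigma * I
D = I_ksigma - I_sigma  # D(x, sigma) = I_ksigma - I_sigma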

Descriptor

The interest point location descriptor gives the position and scale of the interest point. To achieve rotation invariance, the SIFT descriptor introduces a reference direction based on the direction and magnitude of the image gradient around each point; this reference direction is the dominant orientation, measured with a gradient-magnitude-weighted orientation histogram. To be robust to changes in image brightness, the SIFT descriptor is built from image gradients: it selects a grid of sub-regions around each point, computes a histogram of gradient orientations in each sub-region, and concatenates the sub-region histograms into the descriptor vector. The standard setting uses 4 × 4 sub-regions with an 8-bin orientation histogram each, giving a 128-dimensional descriptor (4 × 4 × 8 = 128).
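
The 128-dimensional layout can be checked directly on the descriptors OpenCV returns (a quick check using one of the images from this post):

import cv2 as cv

img = cv.imread('jmu/3.jpg')
sift = cv.SIFT_create()
kp, des = sift.detectAndCompute(img, None)
print(des.shape)  # (number of keypoints, 128): one 4x4x8 histogram vector per keypoint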


Steps of SIFT feature matching

Detect interest points and compute descriptors

sift = cv.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

Match descriptors

bf = cv.BFMatcher(cv.NORM_L2)
matches = bf.knnMatch(des1, des2, k=2)

Feature matching code

SIFT feature matching implemented with python-opencv:

import cv2 as cv

# Read the images
img1 = cv.imread(r'jmu/3.jpg')
img2 = cv.imread(r'jmu/4.jpg')

# Create the SIFT feature detector
sift = cv.SIFT_create()

# Detect interest points and compute descriptors
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Match features with OpenCV's BFMatcher, returning the nearest and
# second-nearest neighbor for each descriptor
bf = cv.BFMatcher(cv.NORM_L2)
matches = bf.knnMatch(des1, des2, k=2)

# Keep only the best-quality match pairs
'''Select good match pairs with a distance ratio test: if the distance of the
nearest neighbor m is less than 0.65 times the distance of the second-nearest
neighbor n, the pair is considered good and stored in the goodMatchs list.'''
goodMatchs = []
for m, n in matches:
    if m.distance < 0.65 * n.distance:
        goodMatchs.append(m)

# Visualize the feature matching result and save it
pic3 = cv.drawMatches(img1=img1, keypoints1=kp1, img2=img2, keypoints2=kp2, matches1to2=goodMatchs, outImg=None)
cv.imwrite(r'/Users/xionglulu/Downloads/project1/m1.jpg', pic3)

Result: feature matching image.

Result analysis: Compared with the Harris detector, SIFT extracts image feature points more comprehensively, but mismatches still occur, and matching performs poorly on points that look too similar. Changing the distance-ratio threshold changes the number of matched pairs: the smaller the threshold, the fewer the matches and the higher their accuracy.
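
To see the effect of the ratio threshold, a small sketch that reuses matches from the code above and counts the accepted pairs at several thresholds:

# Sketch: the smaller the ratio threshold, the fewer (and more reliable) the matches.
for ratio in (0.5, 0.65, 0.8):
    good = [m for m, n in matches if m.distance < ratio * n.distance]
    print('ratio %.2f: %d matches' % (ratio, len(good)))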

2.3 Matching geotagged images

Input a set of images taken at three different locations, and use SIFT feature matching to match and group the images of the same location.

The code

First, save thumbnails (these are used later as the node images in the match graph):

import os
import cv2 as cv

maxsize = (100, 100)  # thumbnail size
path = r'/Users/xionglulu/Downloads/project1/jmu'

# Read every image in the folder and save a thumbnail for each
def read_path(pathname):
    imgname_list = os.listdir(pathname)
    img_list = []
    i = 0
    for imgname in imgname_list:
        if imgname.endswith('.jpg'):
            img = cv.imread(pathname + '/' + imgname)
            img_n = cv.resize(img, maxsize, interpolation=cv.INTER_AREA)
            filename = path + str(i) + '.png'
            cv.imwrite(filename, img_n)
            img_list.append(img_n)
            i = i + 1
    return img_list

thumb_list = read_path(r'/Users/xionglulu/Downloads/project1/jmu')
print(len(thumb_list), 'thumbnails saved')

Next, match the geotagged images:

import os
import cv2 as cv
from pylab import *
import pydotplus as pydot

# Read every image in the folder
def read_path(pathname):
    imgname_list = os.listdir(pathname)
    img_list = []
    for imgname in imgname_list:
        if imgname.endswith('.jpg'):
            img = cv.imread(pathname + '/' + imgname)
            img_list.append(img)
    return img_list


img_list = read_path(r'/Users/xionglulu/Downloads/project1/jmu')
nbr_images = len(img_list)
match_scores = zeros((nbr_images, nbr_images))

sift = cv.SIFT_create()
for i in range(nbr_images):
    for j in range(i, nbr_images):
        print('comparing ', i, j)
        kp1, des1 = sift.detectAndCompute(img_list[i], None)
        kp2, des2 = sift.detectAndCompute(img_list[j], None)
        # BFMatcher matching
        bf = cv.BFMatcher(cv.NORM_L2)
        matches = bf.knnMatch(des1, des2, k=2)
        # Keep the good matches (ratio test)
        goodMatches = []
        for m, n in matches:
            if m.distance < 0.5 * n.distance:
                goodMatches.append(m)
        # The number of good matches is the match score
        nbr_matches = len(goodMatches)
        print('number of matches = ', nbr_matches)
        match_scores[i, j] = nbr_matches

# Mirror the matrix (the self-matching diagonal need not be copied)
for i in range(nbr_images):
    for j in range(i + 1, nbr_images):
        match_scores[j, i] = match_scores[i, j]

# Visualization
threshold = 2  # more than 2 matching points counts as a connection
g = pydot.Dot(graph_type='graph')  # undirected graph
maxsize = (100, 100)  # thumbnail size
path = r'/Users/xionglulu/Downloads/project1/jmu'
# Connect every pair whose score exceeds the threshold
for i in range(nbr_images):
    for j in range(i + 1, nbr_images):
        if match_scores[i, j] > threshold:
            filename = path + str(i) + '.png'
            g.add_node(pydot.Node(str(i), fontcolor='transparent', shape='rectangle', image=filename))
            filename = path + str(j) + '.png'
            g.add_node(pydot.Node(str(j), fontcolor='transparent', shape='rectangle', image=filename))
            g.add_edge(pydot.Edge(str(i), str(j)))
# Draw the geotagged SIFT match graph
g.write_jpg('jmuv.jpg')

Result: the geotagged image match graph.

Result analysis: The result image shows some mismatches: the clock tower is wrongly matched with the Sun Yat-sen Memorial Hall, and the Lu Building is also wrongly matched with the Sun Yat-sen Memorial Hall. This is probably because the three buildings look very similar.

Origin: blog.csdn.net/summer_524/article/details/130114804