OpenCV image feature point extraction

       

Table of contents

Classification of feature points

1 ORB

① Feature point detection

② Computing the feature descriptor

2 SIFT

1 Steps of SIFT feature detection

①. Find extreme points in the DoG scale space; these are the key points

②. Feature point orientation estimation

③ Generating the feature descriptor

④. Code implementation

3 SURF

①. Introduction to SURF

②. SURF algorithm steps

③. Comparison of SIFT and SURF effects

④ Code implementation

4 FAST corner detection with an adjustable threshold

Supplement

Image pyramid

Grayscale centroid method

Implementation ideas


An image is itself a matrix of brightness and color values. The core problem of VO (visual odometry) is how to estimate camera motion from images. A digital image is stored in the computer as a matrix of gray values, but raw gray values are unstable, so instead of matching gray values directly we work with feature points; the corners, edges and blobs in an image are its representative places and can serve as image features.

       Feature points are composed of key points and descriptors. "Calculating SIFT feature points" refers to two things: extracting the SIFT key points and computing the SIFT descriptors.

       Key point: the position, size and orientation of the feature point in the image. Descriptor: a vector describing the pixel information around the key point.
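As a quick illustration of these two notions, here is a minimal sketch (the image path and feature count are placeholders) that prints the fields of one OpenCV key point and the shape of the descriptor matrix:

import cv2 as cv

# placeholder image; any 8-bit grayscale image will do
img = cv.imread('./data/home.jpg', cv.IMREAD_GRAYSCALE)

orb = cv.ORB_create(100)                      # ask for up to 100 feature points
keypoints, descriptors = orb.detectAndCompute(img, None)

kp = keypoints[0]
print(kp.pt)              # key point position (x, y) in the image
print(kp.size)            # diameter of the meaningful neighbourhood
print(kp.angle)           # orientation of the key point, in degrees
print(descriptors.shape)  # one 32-byte (256-bit) binary descriptor per key point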

Classification of feature points

1 ORB

       An ORB feature point uses Oriented FAST for its key point, which computes a main direction for the feature point, and BRIEF for its descriptor, which uses the key point's orientation information to gain rotation invariance.

FAST corner: a place where the local pixel gray level changes obviously; detection is fast because it only computes brightness differences between pixels. ORB achieves scale invariance by constructing an image pyramid and detecting corners on every layer of the pyramid; rotation invariance of the ORB feature is achieved with the gray-scale centroid method.

The BRIEF descriptor is a binary descriptor, so matching uses the Hamming distance; the ORB descriptor is BRIEF rotated to the key point's direction.

       ORB detects key points with the FAST (Features from Accelerated Segment Test) algorithm. The core idea of FAST is to compare a point with the points around it: if it differs from most of them, it can be considered a feature point. Concretely, we compare the point with all the pixels on a circle of radius 3 around it, as shown in the following figure:

① Feature point detection

[Figure: the 16 pixels on a radius-3 circle around the candidate point P]

 

With this strategy in place, let's look at the concrete calculation.

The specific calculation process of FAST:

① Select a pixel P from the image that we want to test as a feature point, and denote its intensity (gray value) by Ip.

② Set an appropriate threshold t: when the absolute difference between the gray values of two points is greater than t, we consider the two points different.

③ Consider the 16 pixels on the circle around the pixel (see the figure above).

④ If there are n consecutive points among the 16 that all differ from the selected pixel (all brighter than Ip + t or all darker than Ip - t), then P is a corner. Here n is set to 12.

⑤ To quickly exclude the large majority of non-corners, a high-speed test checks only the pixels at positions 1, 9, 5 and 13. If P is a corner, at least 3 of these four pixels must all be brighter than Ip + t or all darker than Ip - t; otherwise P cannot be a corner and is rejected immediately. (A small sketch of this segment test follows.)
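Below is a minimal pure-Python sketch of this segment test for a single pixel. It is not OpenCV's implementation; the offsets are the standard radius-3 circle of 16 pixels, and the threshold t and n are the illustrative values used above. The input img is assumed to be a 2-D uint8 NumPy array (e.g. a grayscale image from cv2.imread).

# Offsets of the 16 pixels on the radius-3 circle around the candidate pixel
CIRCLE = [( 0, 3), ( 1, 3), ( 2, 2), ( 3, 1), ( 3, 0), ( 3, -1), ( 2, -2), ( 1, -3),
          ( 0, -3), (-1, -3), (-2, -2), (-3, -1), (-3, 0), (-3, 1), (-2, 2), (-1, 3)]

def is_fast_corner(img, r, c, t=40, n=12):
    """Segment test: True if n contiguous circle pixels are all brighter than
    Ip + t or all darker than Ip - t (r, c must be at least 3 pixels from the border)."""
    Ip = int(img[r, c])
    # label each circle pixel: +1 brighter, -1 darker, 0 similar
    labels = [1 if int(img[r + dr, c + dc]) > Ip + t
              else (-1 if int(img[r + dr, c + dc]) < Ip - t else 0)
              for dr, dc in CIRCLE]
    # high-speed rejection test on pixels 1, 9, 5, 13 (indices 0, 8, 4, 12)
    quick = [labels[i] for i in (0, 8, 4, 12)]
    if max(quick.count(1), quick.count(-1)) < 3:
        return False
    # full test: look for n contiguous equal, non-zero labels (the circle wraps around)
    run, prev = 0, 0
    for lab in labels + labels:
        run = run + 1 if (lab != 0 and lab == prev) else (1 if lab != 0 else 0)
        if run >= n:
            return True
        prev = lab
    return False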

After this process the image has many feature points, which we mark in red:

[Figure: detected FAST corners marked in red]

 

② Computing the feature descriptor

After obtaining the feature points, we need to describe their attributes in some way. The output of these attributes is called the descriptor of the feature point. ORB uses the BRIEF algorithm to compute the descriptor of a feature point. The core idea of BRIEF is to select N point pairs around the key point P in a certain pattern, and combine the comparison results of these N point pairs into the descriptor.

Next, let's look at the specific operation:                              

  • Make a circle O with the key point P as the center and d as the radius.

  • Select N point pairs in a certain pattern within the circle O. For the convenience of explanation here, N=4, and N can be 512 in practical applications.

  • Assume that the currently selected 4 point pairs are marked as P1(A, B), P2(A, B), P3(A, B), P4(A, B), where A and B are the two points of each pair.

  • Define the operation T: T(P(A, B)) = 1 if I_A > I_B, else 0, where I_A and I_B are the gray values at the two points of the pair.

  • Perform the T operation on each selected point pair and concatenate the results.

For example, if T(P1) = 1, T(P2) = 0, T(P3) = 1, T(P4) = 1,

then the final descriptor is: 1011
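As a toy illustration of the test T and the resulting bit string (the patch values and the four point-pair positions below are made up; real BRIEF samples 128-512 pairs from a fixed pattern around the key point):

import numpy as np

def brief_bit(patch, p, q):
    # the test T: 1 if the intensity at point p is greater than at point q, else 0
    return 1 if patch[p] > patch[q] else 0

rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(7, 7), dtype=np.uint8)  # toy patch around a key point

# four hand-picked point pairs inside the patch (N = 4 as in the text)
pairs = [((1, 2), (5, 4)), ((0, 0), (6, 6)), ((3, 1), (2, 5)), ((4, 4), (1, 6))]

descriptor = ''.join(str(brief_bit(patch, p, q)) for p, q in pairs)
print(descriptor)  # a 4-bit string such as '1011'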

#include<opencv2/opencv.hpp>
#include<iostream>
#include<opencv2/xfeatures2d.hpp>
#include <opencv2/highgui/highgui_c.h>
#include<opencv2/xfeatures2d/nonfree.hpp>
#include<vector>

using namespace cv;
using namespace std;
using namespace cv::xfeatures2d;

Mat src;
int main(int argc, char** argv)
{
	src = imread("./data2/101.png", IMREAD_GRAYSCALE); // load the image as grayscale
	//src = imread("./data2/101.png"); // load the image in color
	if (!src.data)
	{
		cout << "图片加载失败" << endl;
		return -1;
	}
	namedWindow("加载的灰度图像", CV_WINDOW_NORMAL); // window size can be changed freely
	imshow("加载的灰度图像", src);
	int numfeature = 400;  // number of feature points to detect
	Ptr<ORB> detector = ORB::create(numfeature);
	//auto detector = ORB::create(); // or let ORB use its default number of feature points
	vector<KeyPoint> keypoints;
	detector->detect(src, keypoints, Mat());
	printf("所有的特征点个数:%d\n", (int)keypoints.size());
	Mat resultImg;
	drawKeypoints(src, keypoints, resultImg, Scalar::all(-1), DrawMatchesFlags::DEFAULT); // random colors for the key points
	imshow("特征点提取", resultImg);
	imwrite("./效果图/特征点提取.jpg", resultImg);
	waitKey(0);
	return 0;
}
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
img1 = cv.imread('./data/box.png',0)          # queryImage
img2 = cv.imread('./data/box_in_scene.png',0) # trainImage
# Initiate ORB detector
orb = cv.ORB_create()
# find the keypoints and descriptors with ORB
kp1, des1 = orb.detectAndCompute(img1,None)
kp2, des2 = orb.detectAndCompute(img2,None)

# create BFMatcher object
bf = cv.BFMatcher(cv.NORM_HAMMING, crossCheck=True)
# Match descriptors.
matches = bf.match(des1,des2)
# Sort them in the order of their distance.
matches = sorted(matches, key = lambda x:x.distance)
# Draw the first 20 matches.
img3 = cv.drawMatches(img1,kp1,img2,kp2,matches[:20],None, flags=2)
plt.imshow(img3),plt.show()

2 SIFT

       SIFT (Scale-Invariant Feature Transform) features remain invariant to rotation, scale change, brightness change, etc., and are very stable local features. They also maintain a certain degree of stability under viewpoint changes, affine transformations and noise.

1 Steps of SIFT feature detection

①. Find extreme points in the DoG scale space; these are the key points

       For LoG (Laplacian of Gaussian pyramid) and DoG (Difference of Gaussian pyramid), see https://blog.csdn.net/dcrmg/article/details/52561656

       Determine extreme points: when looking for DoG extrema, each pixel is compared with all of its neighbours; it is an extreme point when it is greater (or smaller) than all adjacent points in both its image domain and its scale domain. As shown in the figure below, the comparison covers a 3×3×3 cube: the detection point in the middle, its 8 neighbours at the same scale, and the 9×2 points at the adjacent scales above and below, 26 points in total, which ensures that extrema are detected in both scale space and the 2D image space.

[Figure: a DoG sample point compared with its 26 neighbours across scale and space]
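A rough sketch of this 26-neighbour check, built from a handful of Gaussian-blurred images (the σ values, image path and test coordinates are illustrative, and the sample point is assumed to be away from the border):

import cv2 as cv
import numpy as np

img = cv.imread('./data/home.jpg', cv.IMREAD_GRAYSCALE).astype(np.float32)

# one illustrative octave: a few Gaussian-blurred versions of the image
sigmas = [1.6 * (2 ** (k / 3)) for k in range(5)]
gauss = [cv.GaussianBlur(img, (0, 0), s) for s in sigmas]

# DoG layers are differences of adjacent Gaussian images
dog = [gauss[k + 1] - gauss[k] for k in range(len(gauss) - 1)]

def is_extremum(dog, s, r, c):
    """True if dog[s][r, c] is strictly larger (or smaller) than all 26 neighbours
    in the 3x3x3 cube spanning the previous, same and next DoG layer."""
    cube = np.stack([dog[s + d][r - 1:r + 2, c - 1:c + 2] for d in (-1, 0, 1)])
    centre = dog[s][r, c]
    others = np.delete(cube.ravel(), 13)   # drop the centre sample itself
    return centre > others.max() or centre < others.min()

print(is_extremum(dog, 1, 100, 100))       # test one interior sample point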

②. Feature point orientation estimation

       Compute the gradient over the region centred on the feature point with radius 3×1.5σ. The gradient magnitude m(x, y) and direction θ(x, y) at each point of L(x, y) are obtained from:

m(x, y) = sqrt( (L(x+1, y) - L(x-1, y))^2 + (L(x, y+1) - L(x, y-1))^2 )

θ(x, y) = arctan( (L(x, y+1) - L(x, y-1)) / (L(x+1, y) - L(x-1, y)) )

       After the gradient has been computed, a histogram is used to collect the gradient directions and magnitudes of the pixels in the neighbourhood of the feature point. The horizontal axis of the orientation histogram is the gradient direction angle (the direction ranges over 0-360°; the histogram has one bin per 36°, 10 bins in total, or in a simplified version one bin per 45°, 8 bins in total), and the vertical axis is the accumulated gradient magnitude for each direction bin. The peak of the histogram is the main direction of the feature point. If another bin reaches 80% of the energy of the main peak, its direction is kept as an auxiliary direction of the feature point, so one feature point may be assigned several directions (equivalently, one position and scale may generate several key points that differ only in direction).
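A minimal sketch of this main-direction computation for one key point. The Gaussian weighting of the samples and the interpolation of the peak are omitted, the window is assumed to lie entirely inside the image, L is the Gaussian-smoothed image at the key point's scale, and the 10-bin (36° per bin) histogram follows the description above:

import numpy as np

def main_orientation(L, r, c, sigma, nbins=10):
    """Histogram of gradient orientations, weighted by gradient magnitude, over a
    window of radius 3 * 1.5 * sigma around (r, c)."""
    radius = int(round(3 * 1.5 * sigma))
    bin_width = 360.0 / nbins
    hist = np.zeros(nbins)
    for y in range(r - radius, r + radius + 1):
        for x in range(c - radius, c + radius + 1):
            # finite-difference gradient, exactly as in the formulas above
            dx = float(L[y, x + 1]) - float(L[y, x - 1])
            dy = float(L[y + 1, x]) - float(L[y - 1, x])
            m = np.hypot(dx, dy)
            theta = np.degrees(np.arctan2(dy, dx)) % 360.0
            hist[int(theta // bin_width) % nbins] += m
    peak = int(np.argmax(hist))
    main_dir = peak * bin_width
    # bins reaching 80% of the peak define auxiliary directions for the key point
    aux_dirs = [i * bin_width for i, v in enumerate(hist)
                if i != peak and v >= 0.8 * hist[peak]]
    return main_dir, aux_dirs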

③ Generating the feature descriptor

       To keep the feature vector rotation-invariant, the coordinate axes in the neighbourhood of the feature point are rotated by θ (the main direction of the feature point), i.e. the axes are aligned with the feature point's main direction.

[Figure: rotating the coordinate axes of the neighbourhood to the key point's main direction]

       Divide the rotated region into d×d sub-regions (d is 2 or 4, usually 4; each sub-region spans mσ pixels), compute an 8-direction gradient histogram in each sub-region, and accumulate the gradient magnitude in each direction to form one seed point.
Unlike the main-direction computation, here the gradient orientation histogram of each sub-region divides 0°-360° into 8 intervals of 45° each, so every seed point carries gradient strength information for 8 direction intervals. With d×d = 4×4 sub-regions there are finally 4×4×8 = 128 values, forming the 128-dimensional SIFT feature vector.

[Figure: 4×4 sub-regions with 8-bin histograms, giving the 128-dimensional descriptor]
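A simplified sketch of assembling the 128-dimensional vector. Rotation of the neighbourhood to the main direction, Gaussian weighting and trilinear interpolation are all omitted; the window is assumed to lie inside the image, and the 4-pixel cell size is illustrative:

import numpy as np

def sift_like_descriptor(L, r, c, d=4, nbins=8, cell=4):
    """Concatenate an 8-bin orientation histogram from each of the d x d sub-regions
    of a (d*cell) x (d*cell) window centred on (r, c): d*d*nbins = 128 values."""
    half = d * cell // 2
    bin_width = 360.0 / nbins
    desc = []
    for by in range(d):                      # sub-region row
        for bx in range(d):                  # sub-region column
            hist = np.zeros(nbins)
            for i in range(cell):
                for j in range(cell):
                    y = r - half + by * cell + i
                    x = c - half + bx * cell + j
                    dx = float(L[y, x + 1]) - float(L[y, x - 1])
                    dy = float(L[y + 1, x]) - float(L[y - 1, x])
                    theta = np.degrees(np.arctan2(dy, dx)) % 360.0
                    hist[int(theta // bin_width) % nbins] += np.hypot(dx, dy)
            desc.extend(hist)                # one "seed point" = 8 accumulated bins
    v = np.asarray(desc)
    return v / (np.linalg.norm(v) + 1e-12)   # normalized 4*4*8 = 128-dimensional vector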

 Reference blog: Image Feature Point Extraction (SIFT, SURF, ORB), A Fei Damowang's Blog, CSDN

④. Code implementation

import numpy as np
import cv2 as cv
img = cv.imread('./data/home.jpg')
gray= cv.cvtColor(img,cv.COLOR_BGR2GRAY)
sift = cv.xfeatures2d.SIFT_create()
kp = sift.detect(gray,None)
img=cv.drawKeypoints(gray,kp,img)

cv.imshow("SIFT", img)
cv.imwrite('sift_keypoints.jpg',img)
cv.waitKey(0)
cv.destroyAllWindows()
#include<opencv2/opencv.hpp>
#include<iostream>
#include<opencv2/xfeatures2d.hpp>
#include <opencv2/highgui/highgui_c.h>
#include<opencv2/xfeatures2d/nonfree.hpp>
#include<vector>

using namespace cv;
using namespace std;
using namespace cv::xfeatures2d;

Mat src;
int main(int argc, char** argv)
{
	src = imread("./data2/101.png", IMREAD_GRAYSCALE); // load the image as grayscale
	//src = imread("./data2/101.png"); // load the image in color
	if (!src.data)
	{
		cout << "图片加载失败" << endl;
		return -1;
	}
	//namedWindow("加载的灰度图像", CV_WINDOW_NORMAL); // window size can be changed freely
	imshow("加载的灰度图像", src);
	int numfeature = 400;  // number of feature points to detect
	Ptr<SIFT> detector = SIFT::create(numfeature);
	//auto detector = SIFT::create(); // or let SIFT use its default number of feature points
	vector<KeyPoint> keypoints;
	detector->detect(src, keypoints, Mat());
	printf("所有的特征点个数:%d\n", (int)keypoints.size());
	Mat resultImg;
	drawKeypoints(src, keypoints, resultImg, Scalar::all(-1), DrawMatchesFlags::DEFAULT); // random colors for the key points
	imshow("SIFT特征点提取", resultImg);
	imwrite("./效果图/SIFT特征点提取.jpg", resultImg);
	waitKey(0);
	return 0;
}

Image Feature Point Matching


import numpy as np
import cv2

img1_gray = cv2.imread("./data/000029.jpg")
img2_gray = cv2.imread("./data/000087.jpg")

# sift = cv2.xfeatures2d.SIFT_create()
sift = cv2.xfeatures2d.SURF_create()

kp1, des1 = sift.detectAndCompute(img1_gray, None)
kp2, des2 = sift.detectAndCompute(img2_gray, None)

# BFMatcher with default params
bf = cv2.BFMatcher(cv2.NORM_L2)
matches = bf.knnMatch(des1, des2, k=2)

goodMatch = []
for m, n in matches:
    if m.distance < 0.50 * n.distance:
        goodMatch.append(m)

goodMatch = np.expand_dims(goodMatch, 1)
# print(goodMatch[:20])

# drawMatchesKnn_cv2(img1_gray, kp1, img2_gray, kp2, goodMatch[:20])
res = cv2.drawMatchesKnn(img1_gray, kp1, img2_gray, kp2, goodMatch[:100], None, flags=2)

cv2.namedWindow('res', cv2.WINDOW_NORMAL)
cv2.resizeWindow('res', 1080, 720)
cv2.imshow('res', res)

cv2.waitKey(0)
cv2.destroyAllWindows()

[Figure: feature point extraction result]

 Result: (the first 100 matching points are shown)
[Figure: the first 100 matches between the two images]

 


3 SURF

①. Introduction to SURF

       SURF (Speeded-Up Robust Features) improves the way features are extracted and described, completing both in a more efficient way. The strength of SIFT is that its features are stable: invariant to rotation, scale change and brightness, and fairly robust to viewpoint change and noise. Its weaknesses are that it is not real-time and that it is poor at extracting feature points from objects with smooth edges. SURF is an improvement on SIFT that runs roughly 3× faster, mainly by simplifying some of SIFT's operations.

②. SURF algorithm steps

Scale-space extremum detection: search the image over all scales and use the determinant of the Hessian matrix to identify potential interest points that are invariant to scale and rotation.
Feature point filtering and precise localization.
Orientation assignment: compute Haar wavelet responses in a circular neighbourhood of the feature point. A 60° sector is swept around the circle in steps of 0.2 rad; the responses inside the sector are summed at each step, and the direction of the sector with the largest sum is taken as the main direction of the feature point.
Feature point description: along the main direction of the feature point, take a 4×4 grid of small square regions in the neighbourhood, compute the Haar wavelet features of each small region to get a 4-dimensional vector per region, so a feature point ends up with a 4×4×4 = 64-dimensional vector as its SURF descriptor.
For details, see https://www.cnblogs.com/zyly/p/9531907.html (a well-written reference).

③. Comparison of SIFT and SURF effects

  • About 3 times faster than SIFT
  • Handles brightness changes well
  • Performs better than SIFT on blurred images
  • Scale invariance is not as good as SIFT's
  • Rotation invariance is much worse

④ Code implementation

import numpy as np
import cv2 as cv
img = cv.imread('./data/000029.jpg',0)

surf = cv.xfeatures2d.SURF_create(2000)   # Hessian threshold = 2000

kp, des = surf.detectAndCompute(img,None)
img1 = cv.drawKeypoints(img,kp,None,(0,0,255),4)  # flag 4 = DRAW_RICH_KEYPOINTS: draw size and orientation
cv.imshow('surf', img1)
cv.waitKey(0)
cv.destroyAllWindows()


surf.setHessianThreshold(20000)  # raise the threshold: far fewer, stronger key points are kept

kp, des = surf.detectAndCompute(img,None)

img2 = cv.drawKeypoints(img,kp,None,(0,0,255),4)
cv.imshow('surf',img2)
cv.waitKey(0)
cv.destroyAllWindows()

[Figure: SURF key points detected at the two Hessian thresholds]

4 FAST corner detection with an adjustable threshold

  • The threshold can be adjusted with a trackbar; it starts from an initial value of 40.
  • Print out the number of feature points and the threshold.

#include<opencv2/opencv.hpp>
#include<iostream>
#include<opencv2/xfeatures2d.hpp>
#include <opencv2/highgui/highgui_c.h>
#include<opencv2/xfeatures2d/nonfree.hpp>
#include<vector>

//FAST角点检测
using namespace std;
using namespace cv;
int thre = 40;
Mat src;
void trackBar(int, void*);

int main(int argc, char** argv)
{
	//src = imread("./data2/88.jpg");
	src = imread("./data2/88.jpg", IMREAD_GRAYSCALE); // load the image as grayscale
	if (src.empty())
	{
		printf("无图像加载 \n");
		return -1;
	}
	namedWindow("input", WINDOW_NORMAL);
	imshow("input", src);

	namedWindow("output", WINDOW_NORMAL);
	createTrackbar("threshold", "output", &thre, 255, trackBar);
	trackBar(0, 0); // run once so a result is shown before the slider is first moved
	waitKey(0);
	return 0;
}

void trackBar(int, void*)
{
	std::vector<KeyPoint> keypoints;
	Mat dst = src.clone();
	Ptr<FastFeatureDetector> detector = FastFeatureDetector::create(thre);
	printf("阈值:%d\n", thre); // print the current threshold
	detector->detect(src, keypoints);
	printf("检测到的所有的特征点个数:%d\n", (int)keypoints.size()); // number of detected key points, changes with the threshold
	drawKeypoints(dst, keypoints, dst, Scalar::all(-1), DrawMatchesFlags::DRAW_OVER_OUTIMG); // draw with random colors
	imshow("output", dst);
	imwrite("./效果图/角点检测图.jpg", dst);
}


Supplement

ORB adds scale and rotation descriptions to FAST corners:

Scale invariance is achieved by building an image pyramid and detecting corners at each level of the pyramid.

Feature rotation invariance is achieved by the grayscale centroid method.

Image pyramid to achieve ORB scale invariance

The image pyramid is a common structure in computer vision; the bottom layer of the pyramid is the original image.

Each time we go up one layer, the image is scaled down by a fixed factor, giving images at different resolutions.

A smaller image can be seen as the same scene viewed from farther away.

In feature matching we can match images across different pyramid levels, which gives scale invariance.

For example, if the camera moves backward, the scene becomes smaller in the new image, so a match can be found between an upper (coarser) level of the previous image's pyramid and a lower (finer) level of the current image's pyramid.

  • Theoretical basis: downsampling, upsampling, and filtering

  • Downsampling: the image pyramid is generated from bottom to top; the bottom is usually the original image, and each image going up is halved in size.
  • Upsampling: the image pyramid is generated from top to bottom; the top is usually a small feature map, and each image going down is doubled in size.

 If we generate a pyramid by downsampling, the simplest way is to repeatedly delete the even-numbered rows and columns of the image; repeating this process gives the pyramid.
If we generate a pyramid by upsampling, the simplest way is to insert a column of zeros to the right of every column of pixels and a row of zeros below every row of pixels, and repeat this to build the pyramid.
Summary:
1. Downsampling makes the image smaller; upsampling makes it bigger.
2. If an image is down-sampled once and then up-sampled once, its size is restored, but its pixel values have changed: the two operations are not inverses of each other.

Up and down sampling API:
down sampling:

cv2.pyrDown(img [, dstsize, borderType])

Upsample:

cv2.pyrUp(img [, dstsize, borderType])

By default the size is halved (pyrDown) or doubled (pyrUp) in each dimension.
The default filter in both cases is a Gaussian filter.
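A short usage example (the image path is a placeholder) showing the halving and doubling of the size, and that the round trip does not restore the original pixel values:

import cv2 as cv
import numpy as np

img = cv.imread('./data/home.jpg', cv.IMREAD_GRAYSCALE)      # placeholder path

down = cv.pyrDown(img)                                        # Gaussian blur, then drop even rows/columns
up = cv.pyrUp(down, dstsize=(img.shape[1], img.shape[0]))     # interpolate back to the original size

print(img.shape, down.shape, up.shape)                        # (h, w), (~h/2, ~w/2), (h, w)
print(np.abs(img.astype(int) - up.astype(int)).mean())        # non-zero: detail was lost, the operations are not inverses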

Filter

To reduce the loss of image information, the image is filtered first, so that the filtered image is an approximation of the original image.

Why use a filter?
When we downsample to generate a pyramid image, it is an operation to directly delete even-numbered rows and even-numbered columns, but this operation means directly discarding the information in the image! In order to reduce the loss of image information , we use a filter to filter the original image before the down-sampling operation, so that the filtered image is an approximate image of the original image. At this time, we delete even-numbered rows and even-numbered columns, so there is no direct information. lost. There are many ways to filter the original image. For example, we can use the neighborhood filter to operate, so that the generated image is an average pyramid . If we process it with a Gaussian filter , we generate a Gaussian pyramid  .
In the same way, when we upsample to generate an image pyramid, we directly insert the row operation under the column. This operation will generate a large number of 0-value pixels. These 0-value pixels are meaningless, so we need to insert the 0-value pixels Click to assign. The assignment is the interpolation process. There are also many methods for interpolation processing, such as supplementing with the regional mean value , which generates an average pyramid , and if filled with a Gaussian kernel , it is a Gaussian pyramid

Laplacian pyramid

The Laplacian pyramid is built on top of the Gaussian pyramid.
Why was the Laplacian pyramid invented? Because the Gaussian pyramid, even though it is filtered with a Gaussian kernel, still loses some information at every level, and this lost information is exactly what the Laplacian pyramid stores. The Laplacian pyramid therefore serves to restore image detail: after extracting features from a small high-level image, the Laplacian pyramid data lets us go back to a higher-resolution image and recover the details corresponding to those high-level pixels.
Li = Gi - PyrUp( PyrDown(Gi) )
where Gi is the Gaussian pyramid image at level i (the original image at the bottom level) and Li is the corresponding Laplacian pyramid image.
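A sketch of one Laplacian layer computed exactly as in the formula above. Signed arithmetic is used so negative detail values are not clipped, and the image path is a placeholder:

import cv2 as cv
import numpy as np

G = cv.imread('./data/home.jpg', cv.IMREAD_GRAYSCALE).astype(np.int16)   # Gi, as a signed image

down = cv.pyrDown(G)                                    # PyrDown(Gi): next Gaussian level
up = cv.pyrUp(down, dstsize=(G.shape[1], G.shape[0]))   # PyrUp(PyrDown(Gi)), expanded back to Gi's size
L = G - up                                              # Li = Gi - PyrUp(PyrDown(Gi))

# Gi is recovered exactly as up + L: the Laplacian layer stores precisely the
# detail that the down-sample / up-sample round trip threw away.
print(np.array_equal(up + L, G))                        # True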

Gray-scale centroid method: achieving ORB rotation invariance

For the rotation part of the ORB feature, the gray-scale centroid of the image patch around the feature point is computed; the centroid is the centre of the patch weighted by the gray values of its pixels.

Implementation ideas:

Purpose: starting from the key point (the geometric centre O of the image patch), find the intensity centroid C.

1. To guarantee rotation invariance, the computation must be carried out inside a circle;

2. To compute inside the circle, the circle's boundary must be known before indexing the pixels;

3. To know the circle's boundary, we need the maximum abscissa umax for every ordinate;

4. Finally, the IC_Angle function accumulates the pixel sums, processing two symmetric rows at a time.

Specific steps:

① In a small image block B, define the moments of the image block as:

m_pq = Σ_{(x,y)∈B} x^p · y^q · I(x, y),  p, q ∈ {0, 1}

② The centroid of the image block is then found from the moments:

C = ( m10 / m00 , m01 / m00 )

③ Connect the geometric centre O of the image block with its centroid C to get the direction vector \overrightarrow{OC}; the direction of the feature point is then defined as:

θ = arctan( m01 / m10 )
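A minimal sketch of the gray-scale centroid orientation on a square patch. ORB itself works on a circular patch bounded by the umax values described above and accumulates the sums row by row (IC_Angle); a square window is used here only to keep the example short:

import numpy as np

def orientation_by_centroid(img, r, c, half=15):
    """Direction from the patch centre O at (r, c) to the intensity centroid C,
    computed from the image moments of the patch."""
    patch = img[r - half:r + half + 1, c - half:c + half + 1].astype(np.float64)
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]   # coordinates relative to the centre O
    m00 = patch.sum()
    cx = (xs * patch).sum() / m00    # centroid x = m10 / m00
    cy = (ys * patch).sum() / m00    # centroid y = m01 / m00
    # the feature direction is the angle of the vector O -> C
    return np.degrees(np.arctan2(cy, cx))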

 With these additions, FAST corners have descriptions of both scale and rotation, which greatly improves the robustness of their representation across different images. In ORB, this improved FAST is therefore called Oriented FAST.


 

References

Image Feature Point Extraction (SIFT, SURF, ORB), A Fei Damowang's Blog, CSDN

Summary of Several Methods of Image Feature Point Extraction and Matching, Based on C++ and OpenCV (SIFT, SURF, ORB, FAST), Fighting_XH's Blog, CSDN


Origin: blog.csdn.net/Yangy_Jiaojiao/article/details/127635775