Study Notes: how the SIFT image feature extraction algorithm works

References: Tang Yu Di's OpenCV project course, zddhub's blog, and the SIFT paper ("Scale Invariant Feature Transform").

1 Overview

SIFT (Scale Invariant Feature Transform) is a scale- and rotation-invariant image feature matching algorithm.
Features of the SIFT algorithm

  1. SIFT features are local features of an image. They remain invariant to rotation, scaling, and brightness changes, and stay reasonably stable under viewpoint changes, affine transformations, and noise.
  2. They are highly distinctive, suitable for fast and accurate matching against large feature databases.
  3. They are abundant: even a few objects can produce large numbers of SIFT feature vectors.
  4. They are fast: an optimized SIFT matching algorithm can even meet real-time requirements.
  5. They are extensible: SIFT vectors can easily be combined with other kinds of feature vectors.

The SIFT algorithm searches for key points (feature points) in different scale spaces and computes their orientations. The key points SIFT looks for are highly distinctive points that do not change under illumination, noise, or affine transformations: for example corners, edge points, bright spots in dark regions, and dark spots in bright regions.

Steps of the SIFT algorithm

  1. Build the scale space of the image
  2. Build a multi-resolution pyramid
  3. Build the difference-of-Gaussian pyramid
  4. Precisely localize the key points
  5. Eliminate edge responses
  6. Assign the main orientation of each feature point
  7. Generate the feature descriptors

2 Algorithm steps

2.1 The scale space of an image

Within a certain range, the human eye can recognize an object whether it appears large or small. The goal of SIFT is to give a computer the same unified perception of an object at different scales: features should remain invariant to rotation, scaling, and brightness changes, and stay reasonably stable under viewpoint changes, affine transformations, and noise.
So we must consider the characteristics of the image at different scales.
The scale space is obtained by Gaussian filtering.

The larger the standard deviation of the Gaussian (normal) distribution, the more blurred the filtered image.

The two-dimensional Gaussian is a bell-shaped surface:
[Figure: surface of a two-dimensional Gaussian]
Gaussian weighting is centrally symmetric: the value of each pixel becomes a weighted average of the surrounding pixel values.
Here we do not call OpenCV's Gaussian filter function gaussian = cv2.GaussianBlur(img, (5, 5), 1) directly; instead we vary the standard deviation (the blur radius) to change the degree of image blur.

from PIL import Image
from PIL import ImageFilter

im = Image.open('ggg.jpg')
# apply each blur to the original image, so the radii are not compounded
im.filter(ImageFilter.GaussianBlur(radius=3)).show()
im.filter(ImageFilter.GaussianBlur(radius=5)).show()
im.filter(ImageFilter.GaussianBlur(radius=7)).show()

The results are shown below (original image, then blurs with radius 3, 5, and 7):
[Figures: original image and Gaussian blurs with radius 3, 5, 7]

2.2 Difference of Gaussian pyramid (DOG)

The multi-resolution image pyramid:
[Figure: multi-resolution image pyramid]
Principle of the image pyramid (used for feature extraction): the pyramid is built from the bottom up by downsampling: convolve the image with a Gaussian kernel, then remove all even-numbered rows and columns.
The difference-of-Gaussian pyramid is defined by

D(x, y, σ) = L(x, y, kσ) − L(x, y, σ)

where L(x, y, σ) is the image blurred with a Gaussian of standard deviation σ, and k is the constant factor between adjacent scales.
[Figure: schematic of the difference-of-Gaussian pyramid]
Images in the same layer of the pyramid have the same size but are blurred with different Gaussian standard deviations, so their degrees of blur differ. The difference-of-Gaussian (DOG) pyramid is obtained by subtracting adjacent blurred images of the same size.
If X is to be an extreme point, it must be an extremum not only among its 8 surrounding neighbors in the same layer but also, as the schematic shows, compared with the 9 points each in the layer above and the layer below. So point X is compared with 8 + 9 + 9 = 3 × 9 − 1 = 26 points besides itself to determine whether it is an extreme point.

2.3 Precise localization of the extreme points

The candidate key points obtained by the method above are not necessarily at the exact locations of extreme points; they may only lie near the true extremum.
So we fit a curve to the DOG function in scale space to localize the key points precisely.
[Figure: hollow dots are the candidate key points from the previous step; solid dots are the true extreme points]
The candidate key points are local extrema of the difference-of-Gaussian (DOG) space, but these extrema are discrete. The problem therefore becomes: given a number of discrete samples of a function, find the true extreme points of that function.

The candidate key point from the previous step and the true extremum lie on the same underlying function curve.
We fit the original function f(x) with a Taylor expansion around the discrete sample, then take the derivative, set it equal to 0, and solve for the true extreme point.

At each candidate key point of the DOG space, the DOG function can be written as a Taylor expansion in matrix (vector) form:

D(x) = D + (∂D/∂x)ᵀ x + ½ xᵀ (∂²D/∂x²) x

Setting the derivative to zero and simplifying gives the offset of the true extremum from the sample point:

x̂ = −(∂²D/∂x²)⁻¹ (∂D/∂x)
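Numerically, the derivatives in the expansion are taken by finite differences over the 3×3×3 neighborhood, and the offset x̂ is found by solving a 3×3 linear system. A sketch under those assumptions (the function and variable names are mine, not from any library):

```python
import numpy as np

def refine_offset(cube):
    """Subpixel offset x_hat = -H^(-1) g for the centre of a 3x3x3
    DOG neighbourhood, cube[s, y, x] with indices 0..2."""
    c = cube[1, 1, 1]
    # gradient by central differences at the centre voxel
    g = 0.5 * np.array([cube[2, 1, 1] - cube[0, 1, 1],
                        cube[1, 2, 1] - cube[1, 0, 1],
                        cube[1, 1, 2] - cube[1, 1, 0]])
    # Hessian by central second differences
    dss = cube[2, 1, 1] - 2 * c + cube[0, 1, 1]
    dyy = cube[1, 2, 1] - 2 * c + cube[1, 0, 1]
    dxx = cube[1, 1, 2] - 2 * c + cube[1, 1, 0]
    dsy = 0.25 * (cube[2, 2, 1] - cube[2, 0, 1] - cube[0, 2, 1] + cube[0, 0, 1])
    dsx = 0.25 * (cube[2, 1, 2] - cube[2, 1, 0] - cube[0, 1, 2] + cube[0, 1, 0])
    dyx = 0.25 * (cube[1, 2, 2] - cube[1, 0, 2] - cube[1, 2, 0] + cube[1, 0, 0])
    H = np.array([[dss, dsy, dsx],
                  [dsy, dyy, dyx],
                  [dsx, dyx, dxx]])
    return -np.linalg.solve(H, g)  # offset in (scale, y, x)
```

If any component of the offset exceeds 0.5, the extremum actually lies closer to a neighboring sample, so the candidate is typically moved there and re-fitted.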

2.4 Eliminating edge responses

The DOG operator responds strongly to edges, so these unstable edge points need to be eliminated. At each feature point we compute the 2×2 Hessian matrix H of the DOG image; the ratio of its principal curvatures is used to eliminate edge responses: a point is kept only if Tr(H)² / Det(H) < (r + 1)² / r (Lowe's paper uses r = 10).
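As a sketch, this test can be computed from finite differences of the DOG image (the function name and the synthetic inputs in the check are mine; r = 10 follows the paper):

```python
import numpy as np

def passes_edge_test(dog_img, y, x, r=10.0):
    """Keep a keypoint only if the principal-curvature ratio of the
    2x2 spatial Hessian at (y, x) is below (r + 1)^2 / r."""
    d = dog_img
    dxx = d[y, x + 1] - 2 * d[y, x] + d[y, x - 1]
    dyy = d[y + 1, x] - 2 * d[y, x] + d[y - 1, x]
    dxy = 0.25 * (d[y + 1, x + 1] - d[y + 1, x - 1]
                  - d[y - 1, x + 1] + d[y - 1, x - 1])
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:          # curvatures of opposite signs: reject
        return False
    return bool(tr * tr / det < (r + 1) ** 2 / r)
```

A blob-like point has similar curvature in both directions and passes; a point on a straight edge has large curvature across the edge and almost none along it, so the ratio blows up and the point is rejected.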

2.5 The main orientation of the feature points

[Figure: assigning the main orientation of a feature point]
Each feature point thus carries three kinds of information, (x, y, σ, θ): location, scale, and orientation.

2.6 Generating the feature descriptor

As shown in the figure, draw a window of suitable size centered on a key point and build an 8-direction histogram of the gradients of the neighboring pixels; the peak of the histogram represents the dominant gradient direction of the neighborhood, and this maximum is taken as the main orientation of the key point.
[Figure: 8-direction gradient histogram around a key point]
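A sketch of that orientation histogram (the helper name is mine; note that Lowe's paper actually uses a 36-bin, 10-degree histogram for this step — 8 bins, as in the figure, keeps the sketch simple):

```python
import numpy as np

def main_orientation(patch, nbins=8):
    """Dominant gradient direction (degrees) of a grayscale patch,
    from a magnitude-weighted histogram of gradient directions."""
    # gradients by central differences; the 1-pixel border is dropped
    dy = patch[2:, 1:-1] - patch[:-2, 1:-1]
    dx = patch[1:-1, 2:] - patch[1:-1, :-2]
    mag = np.hypot(dx, dy)
    ang = np.degrees(np.arctan2(dy, dx)) % 360.0
    hist = np.zeros(nbins)
    bins = (ang // (360.0 / nbins)).astype(int) % nbins
    np.add.at(hist, bins.ravel(), mag.ravel())  # magnitude-weighted votes
    return (np.argmax(hist) + 0.5) * (360.0 / nbins)  # bin centre
```

In full SIFT the votes are additionally Gaussian-weighted by distance from the key point, and any bin above 80% of the peak spawns an extra key point with that orientation.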

Rotation invariance
[Figure: rotating the neighborhood to the main orientation]
To ensure rotation invariance, the coordinate axes are rotated about the feature point so that the main orientation of the feature point becomes the coordinate axis.

After rotating to the main orientation, take an 8×8 window centered on the key point (as in the illustration) and compute the gradient magnitude and direction at each pixel: each arrow's direction represents that pixel's gradient direction, and its length the gradient magnitude. Then apply Gaussian weighting over the window and, in each 4×4 block, accumulate the gradient magnitudes into an 8-direction histogram to form a seed point; the 8×8 window in the figure thus yields 2×2 = 4 seed points, each carrying 8 direction values.
[Figure: 8×8 window reduced to 2×2 seed points with 8-direction histograms]
In practice each key point is described by 4 × 4 = 16 seed points (computed from a 16×16 window), which produces a 16 × 8 = 128-dimensional SIFT feature vector.
[Figure: 4×4 array of seed points forming the 128-dimensional descriptor]
The code:

import cv2
import numpy as np

img = cv2.imread('box.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(cv2.__version__)

# needs opencv-contrib-python; in OpenCV >= 4.4 use cv2.SIFT_create() instead
sift = cv2.xfeatures2d.SIFT_create()
kp = sift.detect(gray, None)            # find the key points
img = cv2.drawKeypoints(gray, kp, img)  # draw them on the image

cv2.imshow('drawKeypoints', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

kp, des = sift.compute(gray, kp)        # compute the 128-d descriptors
print(np.array(kp).shape)
print(des.shape)
print(des[0])

The function cv2.xfeatures2d.SIFT_create() instantiates a SIFT object.
The function sift.detect(gray, None) finds the key points in the image.
Then cv2.drawKeypoints(gray, kp, img) draws the key points.
After kp, des = sift.compute(gray, kp), kp is a list of key points and des is a NumPy array of shape (number_of_keypoints, 128).
Output:
[Figure: the image with key points drawn, and the printed descriptor shapes]
At this point, the SIFT image feature extraction walkthrough is complete.
There is still room for improvement. Thanks to the blogs that helped me and to Tang Yu Di's OpenCV project course.



Origin blog.csdn.net/weixin_43227526/article/details/105020660