Illustration of meanshift algorithm and its application in image clustering and target tracking

Recently I have been studying tracking algorithms, and my understanding of meanshift comes from papers and blogs. This post summarizes the meanshift algorithm: its principle and formula derivation with diagrams, its application to image clustering and target tracking, and its advantages and disadvantages.

Algorithm principle

The core of the meanshift algorithm can be read off its name: mean (average) and shift (offset). In short, take a point x with many points around it. We compute the offset needed to move x to each surrounding point, sum these offsets, and average them to obtain the mean offset. This offset has both a magnitude and a direction, and its direction points toward where the surrounding points are most densely distributed. The point then moves along the mean offset, and the new position becomes the starting point for the next iteration. This repeats until a stopping condition is met.

The diagram is as follows:


The center point is the point x described above, the small red dots around it are the sample points x_i, and the yellow arrow is the mean offset vector we solved for. So what is the "big circle" in the picture? What exactly counts as "around" the point? Something has to limit it. That circle is our constraint, or, in image processing, the size of the search window in which we iterate. In general it is not really a circle but a high-dimensional ball. In opencv, where the data is a 2-dimensional image, we usually use a rectangular window.

Step 1. First set the starting point x. As we said, the region is a ball, so it has a radius h. All the points inside the ball are the x_i. Each black arrow is a vector x_i - x that we calculate. Summing all of these vectors and taking the average gives the meanshift vector, which is the yellow vector in the figure.

Step 2. Take the end point of the meanshift vector as the new center and draw a high-dimensional ball around it, as shown in the figure below. Repeat the steps above, and the center finally converges to the place where the points are most densely distributed.


The final result is as follows:


Mathematical derivation

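Given $n$ sample points $x_i$, $i = 1, \dots, n$, in the $d$-dimensional space $\mathbb{R}^d$, choose a point $x$ in the space. The basic form of the meanshift vector is then defined as:

$$M_h(x) = \frac{1}{k} \sum_{x_i \in S_h} (x_i - x)$$

where $S_h$ is a high-dimensional ball of radius $h$ centered at $x$, defined as follows:

$$S_h(x) = \{\, y : (y - x)^T (y - x) \le h^2 \,\}$$

and $k$ is the number of the $n$ sample points $x_i$ that fall into the region $S_h$.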
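As a rough illustration of this basic definition, the sketch below computes the flat (unweighted) meanshift vector for one point. The function name `mean_shift_vector` and the sample data are made up for this example, and points are assumed to be plain tuples of floats.

```python
import math

def mean_shift_vector(x, points, h):
    """Average of (x_i - x) over the sample points within distance h of x."""
    # S_h: the points that fall inside the ball of radius h around x
    inside = [p for p in points if math.dist(p, x) <= h]
    if not inside:
        # No neighbours in the ball: there is nothing to shift towards
        return [0.0] * len(x)
    k = len(inside)
    return [sum(p[d] - x[d] for p in inside) / k for d in range(len(x))]

# Hypothetical sample data: three points near the origin and one far away
points = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (5.0, 5.0)]
shift = mean_shift_vector((0.0, 0.0), points, h=2.0)
print(shift)
```

Only the three nearby points fall inside the ball, so the vector points toward them and the far point at (5, 5) has no influence.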
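Then we extend the basic meanshift vector by adding a kernel function (such as a Gaussian kernel). The kernel density estimate at $x$ becomes:

$$\hat f_{h,K}(x) = \frac{c_{k,d}}{n h^d} \sum_{i=1}^{n} k\left( \left\| \frac{x - x_i}{h} \right\|^2 \right)$$

Here $k(\cdot)$ is the profile of the kernel $K$ (not the point count $k$ used earlier), $h$ is the radius, and $\frac{c_{k,d}}{n h^d}$ is the normalizing factor (the unit density). To maximize $\hat f_{h,K}$, the most natural idea is to differentiate it, and that is exactly what meanshift does:

$$\nabla \hat f_{h,K}(x) = \frac{2 c_{k,d}}{n h^{d+2}} \sum_{i=1}^{n} (x - x_i)\, k'\left( \left\| \frac{x - x_i}{h} \right\|^2 \right)$$

Writing $g(s) = -k'(s)$, we can get:

$$\nabla \hat f_{h,K}(x) = \frac{2 c_{k,d}}{n h^{d+2}} \left[ \sum_{i=1}^{n} g\left( \left\| \frac{x - x_i}{h} \right\|^2 \right) \right] \left[ \frac{\sum_{i=1}^{n} x_i\, g\left( \left\| \frac{x - x_i}{h} \right\|^2 \right)}{\sum_{i=1}^{n} g\left( \left\| \frac{x - x_i}{h} \right\|^2 \right)} - x \right]$$

Since we are using a Gaussian kernel, $g$ is again a Gaussian profile, so the first term is, up to a constant factor, the density estimate with kernel $G$:

$$\hat f_{h,G}(x) = \frac{c_{g,d}}{n h^d} \sum_{i=1}^{n} g\left( \left\| \frac{x - x_i}{h} \right\|^2 \right)$$

The second term is the meanshift vector:

$$m_{h,G}(x) = \frac{\sum_{i=1}^{n} x_i\, g\left( \left\| \frac{x - x_i}{h} \right\|^2 \right)}{\sum_{i=1}^{n} g\left( \left\| \frac{x - x_i}{h} \right\|^2 \right)} - x$$

Then the above formula can be expressed as:

$$\nabla \hat f_{h,K}(x) = \hat f_{h,G}(x)\, \frac{2 c_{k,d}}{h^2 c_{g,d}}\, m_{h,G}(x)$$

The structure of the formula is illustrated as follows:


Of course, the place with the highest density that the meanshift vector leads to is also an extreme point, where the gradient is 0, that is $\nabla \hat f_{h,K}(x) = 0$. This holds if and only if $m_{h,G}(x) = 0$, which gives the new center coordinates:

$$x = \frac{\sum_{i=1}^{n} x_i\, g\left( \left\| \frac{x - x_i}{h} \right\|^2 \right)}{\sum_{i=1}^{n} g\left( \left\| \frac{x - x_i}{h} \right\|^2 \right)}$$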
That completes the derivation of the formula. It is worth working through it yourself until it makes sense. Reference:
http://www.cnblogs.com/liqizhou/archive/2012/05/12/2497220.html

In an actual project, the algorithm flow of meanshift is:

1. Select the center point $x$ and draw a high-dimensional ball of radius $h$ around it (in image or video processing this is a 2-dimensional window, and it is not limited to a sphere, it can be a rectangle). Mark all the points that fall into the window as $x_i$.
2. Calculate the meanshift vector $m_h(x)$. If $\|m_h(x)\|$ is less than a threshold $\varepsilon$, or the number of iterations reaches a set limit, stop the algorithm. Otherwise move the center to $x + m_h(x)$ and go back to step 1.
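The two steps above can be sketched roughly as follows, assuming a Gaussian kernel. With a Gaussian kernel every sample point contributes a weight, so the hard window of step 1 is replaced here by weights that decay with distance. The function name `gaussian_mean_shift`, the default thresholds and the sample data are assumptions made for this example.

```python
import math

def gaussian_mean_shift(x, points, h, eps=1e-6, max_iter=100):
    """Iterate x <- x + m_h(x) until the shift is smaller than eps."""
    x = list(x)
    for _ in range(max_iter):
        # Gaussian profile: the weight decays with squared distance over h^2
        weights = [math.exp(-0.5 * (math.dist(p, x) / h) ** 2) for p in points]
        total = sum(weights)
        if total == 0.0:
            break  # every weight underflowed, so x is too far from the data
        # Weighted mean of the samples, i.e. the new center x + m_h(x)
        new_x = [
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(len(x))
        ]
        shift = math.dist(new_x, x)
        x = new_x
        if shift < eps:
            break  # step 2: the meanshift vector is below the threshold
    return x

# Hypothetical data: two well separated clusters
points = [
    (0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (0.1, -0.2),
    (10.0, 10.0), (10.1, 9.9), (9.9, 10.2),
]
mode = gaussian_mean_shift((1.0, 1.0), points, h=1.0)
print(mode)
```

Starting near the first cluster, the iteration settles close to that cluster's center, because the points of the second cluster are too far away to carry any meaningful weight.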

Clustering and tracking of meanshift in image processing

As we saw above, each step of the meanshift algorithm moves in the direction of greatest density. Meanshift can therefore be applied wherever points are distributed with varying density in some space, and an image is made up of densely packed pixels. So how do we use differences in density for clustering?

The first thing we think of is still distance: the closer two pixels are, the more likely they belong to the same category. So we define a spatial probability density for each pixel:

$$k\left( \left\| \frac{x^s - x_i^s}{h_s} \right\|^2 \right)$$

The closer the pixel $x_i$ is to the center point $x$, the higher this probability density.

Next, colors classified into the same category are generally close to each other, so we define a color probability density:

$$k\left( \left\| \frac{x^r - x_i^r}{h_r} \right\|^2 \right)$$

The more similar the color of $x_i$ is to that of the center point, the higher the probability.

Then the probability density of each point can be obtained by the following formula:

$$K_{h_s,h_r}(x - x_i) = \frac{C}{h_s^2 h_r^p}\, k\left( \left\| \frac{x^s - x_i^s}{h_s} \right\|^2 \right) k\left( \left\| \frac{x^r - x_i^r}{h_r} \right\|^2 \right)$$

Here the first factor carries the spatial position information: the closer the pixel is to the center point, the greater its value. The second factor carries the color information: the more similar the color, the greater its value. $h_s$ and $h_r$ are the spatial and color bandwidths, $p$ is the number of color channels, and $C$ is a normalization constant.

Next, the meanshift algorithm can be used for clustering.
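To make the combined spatial and color density concrete, here is a small sketch that evaluates it for a single pixel, assuming Gaussian profiles for both factors. The function name `joint_kernel_weight`, the pixel layout `(x, y, (r, g, b))` and the bandwidth values are assumptions made for this example. The normalization constant is left out because it cancels in the meanshift update.

```python
import math

def joint_kernel_weight(center, pixel, hs, hr):
    """Spatial weight times color weight for one pixel (unnormalized)."""
    (cx, cy, c_color), (px, py, p_color) = center, pixel
    # Squared spatial distance scaled by the spatial bandwidth hs
    spatial_sq = ((px - cx) ** 2 + (py - cy) ** 2) / hs ** 2
    # Squared color distance scaled by the color bandwidth hr
    color_sq = sum((a - b) ** 2 for a, b in zip(p_color, c_color)) / hr ** 2
    return math.exp(-0.5 * spatial_sq) * math.exp(-0.5 * color_sq)

# Hypothetical pixels, written as (x, y, (r, g, b))
center = (10, 10, (200, 30, 30))
near_similar = (11, 10, (198, 32, 29))
near_different = (11, 10, (20, 200, 40))
far_similar = (40, 40, (200, 30, 30))

w_near_similar = joint_kernel_weight(center, near_similar, hs=8, hr=16)
w_near_different = joint_kernel_weight(center, near_different, hs=8, hr=16)
w_far_similar = joint_kernel_weight(center, far_similar, hs=8, hr=16)
print(w_near_similar, w_near_different, w_far_similar)
```

A pixel gets a large weight only when it is both close to the center and similar in color, which is what pulls neighbouring pixels of the same color into one cluster.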

meanshift algorithm target tracking

There are already library functions that can be called in opencv:

int cvMeanShift(const CvArr* prob_image,CvRect window,CvTermCriteria criteria,CvConnectedComp* comp );

Looking at the API, we can see that the image passed to this function must be a back-projection image. What is a back-projection image? A brief introduction:

The back projection of an image replaces each pixel value (multi-channel or grayscale) of the input image with the value of the histogram bin that the pixel falls into, so the resulting back-projection map is single-channel. In statistical terms, the value of each output pixel is the probability that the observed pixel belongs to the distribution described by the histogram.
Still not clear? Here is an example:

  • (1) Suppose the grayscale image is as follows

Image=

0 1 2 3
4 5 6 7
8 9 10 11
8 9 14 15

  • (2) The histogram of the grayscale image is (the bin intervals are [0,4), [4,8), [8,12), [12,16))

Histogram=

4 4 6 2

  • (3) Back-projection map

Back_Projection=

4 4 4 4
4 4 4 4
6 6 6 6
6 6 2 2

For example, the pixel value at position (0,0) is 0 and the corresponding bin is [0,4), so the value of the back-projection map at this position is the value of that bin, which is 4.
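The example above can be reproduced with a few lines of plain Python. This is only a sketch: the function name `back_project` is made up here, and the histogram is computed from the same image as in the example, whereas in tracking it would normally come from the target region.

```python
def back_project(image, bin_edges):
    """Return (histogram, back projection) for a 2D list of gray values."""
    def bin_index(value):
        # Bins are half-open intervals [edge_i, edge_{i+1})
        for i in range(len(bin_edges) - 1):
            if bin_edges[i] <= value < bin_edges[i + 1]:
                return i
        raise ValueError(f"value {value} is outside the histogram range")

    # Count how many pixels fall into each bin
    hist = [0] * (len(bin_edges) - 1)
    for row in image:
        for value in row:
            hist[bin_index(value)] += 1

    # Replace every pixel with the count of the bin it falls into
    back = [[hist[bin_index(value)] for value in row] for row in image]
    return hist, back

image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [8, 9, 14, 15],
]
hist, back = back_project(image, bin_edges=[0, 4, 8, 12, 16])
print(hist)
for row in back:
    print(row)
```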

Now it is clear: the back-projection map is really a pixel density distribution map of the image, and the meanshift algorithm works by following a density distribution to the place where the density is largest. So it is easy to see why the function takes this as its input.

Semi-automatic tracking idea: input a video, draw a box around the target to be tracked, and then track the object.

Anyone who has used opencv knows that this is the working process of camshiftdemo.

Step 1: Select the object, and record the box you drew and the object inside it.
Step 2: Compute the back-projection map of the object in the video frame.
Step 3: Perform meanshift iteration using the back-projection map and the input box. Since the box moves toward the center of gravity, that is, toward the place with high probability in the back-projection map, it keeps moving onto the target.
Step 4: Use the box output from the previous frame as the starting box for the iteration in the next frame.
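Steps 3 and 4 can be sketched in plain Python as follows. This is a simplified stand-in and not the opencv implementation: the window is moved to the center of gravity of the back-projection values inside it until it stops moving, and the result is reused as the starting window in the next frame. The names `mean_shift_window` and `make_frame` and the synthetic frames are assumptions made for this example.

```python
import math

def mean_shift_window(prob, window, max_iter=10):
    """Shift window (x, y, w, h) toward the center of gravity of prob."""
    rows, cols = len(prob), len(prob[0])
    x, y, w, h = window
    for _ in range(max_iter):
        m00 = m10 = m01 = 0.0
        for r in range(y, y + h):
            for c in range(x, x + w):
                p = prob[r][c]
                m00 += p       # total mass inside the window
                m10 += p * c   # first moment along x
                m01 += p * r   # first moment along y
        if m00 == 0.0:
            break  # nothing to follow inside the window
        cx, cy = m10 / m00, m01 / m00
        # Re-center the window on the center of gravity, rounding half up
        new_x = math.floor(cx - (w - 1) / 2 + 0.5)
        new_y = math.floor(cy - (h - 1) / 2 + 0.5)
        # Keep the window inside the image
        new_x = min(max(new_x, 0), cols - w)
        new_y = min(max(new_y, 0), rows - h)
        if (new_x, new_y) == (x, y):
            break  # converged: the window no longer moves
        x, y = new_x, new_y
    return (x, y, w, h)

def make_frame(size, top, left, blob):
    """Synthetic back projection: a square blob of ones on a zero background."""
    return [
        [1.0 if top <= r < top + blob and left <= c < left + blob else 0.0
         for c in range(size)]
        for r in range(size)
    ]

frame1 = make_frame(8, top=4, left=4, blob=3)
frame2 = make_frame(8, top=5, left=5, blob=3)  # the target moved by one pixel

box1 = mean_shift_window(frame1, (2, 2, 3, 3))  # box chosen by the user
box2 = mean_shift_window(frame2, box1)          # step 4: reuse the last box
print(box1, box2)
```

The window only finds the target if it overlaps some of the target's probability mass at the start, which is why a fast-moving target can be lost.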

Fully automatic tracking idea: input a video and track the moving objects in it.

Step 1: Use a motion detection algorithm to separate the moving objects from the background.
Step 2: Extract the outline of the moving object, and obtain the information of the moving object from the original image.
Step 3: Back-project this information to obtain a back-projection map.
Step 4: Perform meanshift iteration using the back-projection map and the outline of the object (that is, the input box). Since the box moves toward the center of gravity, that is, toward the place with high probability in the back-projection map, it keeps moving onto the object.
Step 5: Use the box output from the previous frame as the starting box for the iteration in the next frame.

Summary

When the meanShift algorithm is used for video target tracking, it uses the color histogram of the target as the search feature and iterates the meanShift vector so that the algorithm converges to the real position of the target, which achieves tracking.

In object tracking, meanshift has the following advantages:

(1) The algorithm has a small amount of calculation, and can achieve real-time tracking when the target area is known;
(2) The kernel function histogram model is used, which is insensitive to edge occlusion, target rotation, deformation and background motion.

At the same time, the meanShift algorithm has the following shortcomings:

(1) It lacks the necessary template update;
(2) Since the window width stays fixed during tracking, tracking fails when the scale of the target changes;
(3) When the target moves fast, the tracking effect is poor;
(4) The histogram is a somewhat weak description of the target's color and carries no spatial information.

Because meanShift is fast to compute and has some robustness to target deformation and occlusion, it is still widely useful, and some of its shortcomings can be improved in engineering practice as follows:

(1) Introduce a mechanism that predicts changes in the target position, which further reduces the search time of meanShift tracking and the amount of calculation;
(2) Use additional "features" for target matching;
(3) Replace the fixed bandwidth of the kernel function in the traditional meanShift algorithm with a dynamically changing bandwidth;
(4) Learn and update the overall template in some way.
