A popular explanation of the meanshift algorithm

Principle of meanshift algorithm

The principle of the meanshift algorithm is very simple. Suppose you have a set of points and a small window, say a circular one. You want to move this window to the area where the points are densest.

As shown below:

The principle of meanshift algorithm

The first window is the blue circle, named C1. The center of the blue circle is marked with a blue rectangle, named C1_o.

The centroid of all the points inside the window is the blue solid point C1_r; clearly the window center and the centroid do not coincide. So move the blue window so that its center coincides with the centroid just computed. In the newly moved circle, compute the centroid of the enclosed points again and move the window once more; usually the window center and the centroid still do not coincide. Repeat this moving process until the window center and the centroid roughly coincide. In this way, the final circular window lands on the region where the point (pixel) distribution is densest, i.e. the green circle in the figure, named C2.
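To make this window-moving process concrete, here is a minimal NumPy sketch on a synthetic 2D point set (the data, radius, and function name are illustrative, not from the original):

import numpy as np

def mean_shift_window(points, center, radius, max_iter=100, eps=1e-3):
    """Shift a circular window toward the densest region of a 2D point set."""
    for _ in range(max_iter):
        # Points currently inside the circular window
        inside = points[np.linalg.norm(points - center, axis=1) <= radius]
        if len(inside) == 0:
            break
        centroid = inside.mean(axis=0)              # centroid of the enclosed points
        if np.linalg.norm(centroid - center) < eps:
            break                                   # window center ~ centroid: converged
        center = centroid                           # move the window and repeat
    return center

# A dense cluster around (5, 5) plus uniform background noise
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(5, 0.5, (200, 2)), rng.uniform(0, 10, (100, 2))])
print(mean_shift_window(pts, center=np.array([2.0, 2.0]), radius=4.0))

Starting from (2, 2), the window drifts toward the dense cluster around (5, 5), which is exactly the behavior described above.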

In addition to video tracking, the meanshift algorithm has important applications in clustering, smoothing, and other data analysis and unsupervised learning tasks; it is a widely used algorithm.

An image is a matrix of values. How can the meanshift algorithm be used to track a moving object in a video? The general process is as follows:

1. First select a target area on the image

2. Calculate the histogram distribution of the selected area, generally the histogram of the HSV color space.

3. Calculate the same histogram distribution for the next frame, image b.

4. Find the area in image b whose histogram distribution is most similar to that of the selected area: use the meanshift algorithm to move the selected window toward the most similar region until it is found, which completes tracking of the target in image b.

5. Repeat steps 3 and 4 to track the target through the entire video.

Usually we use the histogram back projection image together with the starting position of the target object in the first frame. When the motion of the target object shows up in the back projection image, the meanshift algorithm moves our window to the area of the back projection image with the highest gray-level density. As shown below:

The process of histogram backprojection is:

Suppose we have a 100x100 input image and a 10x10 template image, the search process is as follows:

1. Starting from the upper left corner (0,0) of the input image, cut a temporary image from (0,0) to (10,10);

2. Generate a histogram of the temporary image;

3. Use the histogram of the temporary image and the histogram of the template image to compare, and record the comparison result as c;

4. The histogram comparison result c is the pixel value at (0,0) of the result image;

5. Cut the temporary image of the input image from (0,1) to (10,11), compare the histogram, and record it to the result image;

6. Repeat steps 1 to 5 until the window reaches the lower right corner of the input image; the resulting image is the histogram back projection (a rough code sketch of this process follows below).
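Below is a rough sketch of this sliding-window comparison (the file names are hypothetical, and using the hue histogram with correlation comparison is just one possible choice):

import cv2 as cv
import numpy as np

# Hypothetical file names; any input/template image pair will do
image = cv.imread('input.jpg')        # e.g. a 100x100 input image
template = cv.imread('template.jpg')  # e.g. a 10x10 template image

hsv_img = cv.cvtColor(image, cv.COLOR_BGR2HSV)
hsv_tpl = cv.cvtColor(template, cv.COLOR_BGR2HSV)

# Hue histogram of the template
tpl_hist = cv.calcHist([hsv_tpl], [0], None, [180], [0, 180])
cv.normalize(tpl_hist, tpl_hist, 0, 1, cv.NORM_MINMAX)

th, tw = template.shape[:2]
H, W = image.shape[:2]
result = np.zeros((H - th + 1, W - tw + 1), dtype=np.float32)

# Slide a template-sized window over the input image
for y in range(H - th + 1):
    for x in range(W - tw + 1):
        patch = hsv_img[y:y + th, x:x + tw]
        patch_hist = cv.calcHist([patch], [0], None, [180], [0, 180])
        cv.normalize(patch_hist, patch_hist, 0, 1, cv.NORM_MINMAX)
        # Higher correlation = more similar histograms
        result[y, x] = cv.compareHist(tpl_hist, patch_hist, cv.HISTCMP_CORREL)

In practice, OpenCV's cv.calcBackProject computes a per-pixel likelihood map directly from the target histogram, which is what the tracking example below uses.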

Implementation of meanshift video tracking

The API for implementing Meanshift in OpenCV is:

cv.meanShift(probImage, window, criteria)

Parameters:

·probImage: the histogram back projection image of the target (the ROI)

·window: the initial search window, i.e. the rect that defines the ROI

·criteria: the criteria for stopping the window search, mainly the number of iterations reaching a set maximum, or the shift of the window center becoming smaller than a set threshold

The main process of implementing Meanshift is:

1. Read the video file: cv.VideoCapture()

2. Set the region of interest: get the first frame and specify the target area, i.e. the region of interest

3. Calculate the histogram: Calculate the HSV histogram of the region of interest and normalize it

4. Target tracking: set the window search stop condition, back-project the histogram, perform target tracking, and draw a rectangular frame at the target position.

Example:

import numpy as np
import cv2 as cv
# 1. Open the video
cap = cv.VideoCapture('DOG.wmv')

# 2. Get the first frame and specify the target position
ret, frame = cap.read()
# 2.1 Target position (row, height, column, width)
r, h, c, w = 197, 141, 0, 208
track_window = (c, r, w, h)
# 2.2 Region of interest (ROI) of the target
roi = frame[r:r+h, c:c+w]

# 3. Calculate the histogram
# 3.1 Convert the color space to HSV
hsv_roi = cv.cvtColor(roi, cv.COLOR_BGR2HSV)
# 3.2 Remove low-brightness values (optional mask)
# mask = cv.inRange(hsv_roi, np.array((0., 60., 32.)), np.array((180., 255., 255.)))
# 3.3 Calculate the hue histogram of the ROI
roi_hist = cv.calcHist([hsv_roi], [0], None, [180], [0, 180])
# 3.4 Normalize the histogram to [0, 255]
cv.normalize(roi_hist, roi_hist, 0, 255, cv.NORM_MINMAX)

# 4. Target tracking
# 4.1 Termination criteria for the window search: maximum number of iterations, minimum window-center shift
term_crit = (cv.TERM_CRITERIA_EPS | cv.TERM_CRITERIA_COUNT, 10, 1)

while True:
    # 4.2 Read each frame
    ret, frame = cap.read()
    if ret == True:
        # 4.3 Calculate the histogram back projection
        hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
        dst = cv.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)

        # 4.4 Apply meanshift tracking
        ret, track_window = cv.meanShift(dst, track_window, term_crit)

        # 4.5 Draw the tracked position on the frame and display it
        x, y, w, h = track_window
        img2 = cv.rectangle(frame, (x, y), (x+w, y+h), 255, 2)
        cv.imshow('frame', img2)

        if cv.waitKey(60) & 0xFF == ord('q'):
            break
    else:
        break
# 5. Release resources
cap.release()
cv.destroyAllWindows()

The following are the tracking results for three frames of the video:

meanshift video tracking effect

Camshift algorithm

Look carefully at the results above. There is a problem: the size of the detection window is fixed, while the dog appears smaller and smaller as it moves from near to far, so a fixed-size window is not suitable. We need to adjust the size and angle of the window according to the size and orientation of the target. CamShift helps us solve this problem.

The full name of the CamShift algorithm is "Continuously Adaptive Mean-Shift" (continuously adaptive MeanShift algorithm), which is an improved algorithm for the MeanShift algorithm. It can adjust the size of the search window in real time as the size of the tracking target changes, and has a better tracking effect.

The Camshift algorithm first applies a meanshift, and once the meanshift converges, it updates the size of the window and also calculates the orientation of the best-fit ellipse, thereby updating the search window according to the location and size of the target. As shown below:

Camshift algorithm

To implement CamShift in OpenCV, just change the meanShift call in the code above to CamShift.

That is, change:

        # 4.4 Apply meanshift tracking
        ret, track_window = cv.meanShift(dst, track_window, term_crit)
        # 4.5 Draw the tracked position on the frame and display it
        x, y, w, h = track_window
        img2 = cv.rectangle(frame, (x, y), (x+w, y+h), 255, 2)

Change to:

        # Apply CamShift tracking; ret is a rotated rectangle ((cx, cy), (w, h), angle)
        ret, track_window = cv.CamShift(dst, track_window, term_crit)

        # Draw the tracking result: convert the rotated rect to 4 corner points
        pts = cv.boxPoints(ret)
        pts = np.int32(pts)
        img2 = cv.polylines(frame, [pts], True, 255, 2)

Algorithm summary

meanshift

Principle: an iterative procedure. Compute the offset mean of the current point, move the point to that offset mean, then take the new position as the starting point and repeat until a stopping condition is met (a formula is given below).

API: cv.meanShift()

Advantages and disadvantages: simple and needs few iterations, but it cannot handle occlusion of the target and cannot adapt to changes in the shape and size of the moving target.
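For reference, the offset mean in the meanshift step can be written (a standard formulation, not from the original text) as follows, for sample points xi in a window around x and kernel weights K:

m(x) = [ Σi K(xi − x) · xi ] / [ Σi K(xi − x) ] − x,   then update x ← x + m(x)

The iteration stops when ‖m(x)‖ falls below a threshold or the maximum number of iterations is reached, which is exactly what the termination criteria passed to cv.meanShift encode.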

camshift

Principle: an improvement of the meanshift algorithm. First apply meanshift; once it converges, update the size of the window and compute the orientation of the best-fitting ellipse, thereby updating the search window according to the position and size of the target.

API: cv.CamShift()

Advantages and disadvantages: it adapts to changes in the size and shape of the moving target and tracks well, but when the background color is close to the target color the tracked area tends to grow, which can eventually cause the target to be lost.
