Traditional target tracking: the Kalman filter (MeanShift + Kalman)

Table of contents

1. Kalman

2. Process

3. Code

3.1 MeanShift + Kalman implementation


The Kalman filter describes the motion model of the target, not its appearance: it models how the target moves, and is often used to estimate the target's position in the next frame.

1. Kalman

        In the previous tracking articles, MeanShift and CamShift (which builds on MeanShift) were introduced. Whichever of the two we track with, we can improve the result the same way: by introducing a Kalman filter.

        The theory behind Kalman filtering is fairly involved. For our purposes it is enough to understand it as: given a noisy stream of measurements, produce a statistically good estimate of the underlying state.

        Let's take a simple example. Suppose an object is moving away from me at 1 m/s (a value measured in the past) and is currently 1 m away. Because we know its speed, we can predict that one second later it will be 2 m away; this prediction is obvious. But if the object hits something and its speed changes, we correct our guess using the measured position (a noisy observation). In OpenCV we generally set up the filter with four state parameters, two of which can be obtained directly from measurements. For example, we create a Kalman filter with:
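The predict-then-correct cycle in this example can be sketched in a few lines of plain Python. This is a hand-rolled 1-D illustration, not OpenCV's API; the blending weight and the measured value are made up for the example:

```python
# 1-D predict-then-correct sketch.
position, velocity = 1.0, 1.0   # currently 1 m away, moving away at 1 m/s
gain = 0.5                      # illustrative blending weight ("Kalman gain")

# Predict: after one second we expect the object at 2 m.
predicted = position + velocity * 1.0

# Correct: suppose the object hit something and we measure only 1.6 m.
measured = 1.6
estimate = predicted + gain * (measured - predicted)
print(predicted, estimate)  # 2.0 1.8
```

The real filter does the same thing, except the gain is computed automatically each step from the noise covariances instead of being fixed by hand.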

klm=cv2.KalmanFilter(4,2)

        Here 4 means the state has four parameters (x, y, dx, dy). The first two are the coordinates of the center of the object we are tracking; dx and dy are the object's velocity components, which cannot be measured directly. Since the only values we can feed in as measurements are the coordinates x and y, the filter is initialized as (4, 2): four state variables, two measured ones. Next we set up the filter's matrices:

klm.measurementMatrix = np.array([[1,0,0,0],[0,1,0,0]],np.float32)
klm.transitionMatrix = np.array([[1,0,1,0],[0,1,0,1],[0,0,1,0],[0,0,0,1]],np.float32)
klm.processNoiseCov = np.array([[1,0,0,0],[0,1,0,0],[0,0,1,0],[0,0,0,1]],np.float32)

        Parameters:

  • measurementMatrix, the measurement matrix. Of the state (x, y, dx, dy), only the first two components can be measured directly, so the rows [1,0,0,0] and [0,1,0,0] pick out x and y.
  • transitionMatrix, the state transition matrix, is [1,0,1,0], [0,1,0,1], [0,0,1,0], [0,0,0,1], which encodes (x+dx, y+dy, dx, dy): the constant-velocity prediction for the next time step.
  • processNoiseCov, the process noise covariance, which appears in the prediction equations of the Kalman filter; its rows and columns follow the same order x, y, dx, dy.
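To see what the transition matrix does, we can apply it to a state vector by hand. This is plain NumPy mirroring the matrix above, not a call into OpenCV, and the example state values are made up:

```python
import numpy as np

# Constant-velocity transition matrix from the snippet above,
# applied to a state vector (x, y, dx, dy).
F = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], np.float32)
state = np.array([10, 20, 2, 3], np.float32)  # at (10, 20), moving (2, 3) per frame

predicted = F @ state
print(predicted)  # [12. 23.  2.  3.] -> (x+dx, y+dy, dx, dy)
```

The position advances by the velocity while the velocity itself is carried over unchanged, which is exactly the "estimate of the next moment" described above.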

2. Process

3. Code

3.1 MeanShift + Kalman implementation

        Now we add a Kalman filter on top of the MeanShift tracker above. We feed the center of the tracking rectangle into the filter as a measurement, predict the object's position, and draw a circle centered on the predicted position. Sample code:

import cv2
import numpy as np


def center(points):
    x = (points[0][0] + points[1][0] + points[2][0] + points[3][0]) / 4
    y = (points[0][1] + points[1][1] + points[2][1] + points[3][1]) / 4
    return np.array([np.float32(x), np.float32(y)], np.float32)


cap = cv2.VideoCapture('E:\Python-Code/videodataset/enn.mp4')
ret, frame = cap.read()
# The frame is too large here, so optionally scale it down
# frame = cv2.resize(frame, None, None, fx=1 / 2, fy=1 / 2, interpolation=cv2.INTER_CUBIC)
# Tracking window
track_window = cv2.selectROI('img', frame)

# Histogram of the green hue range
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, np.array((35, 43, 46)), np.array((77, 255, 255)))
hist = cv2.calcHist([hsv], [0], mask, [181], [0, 180])
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

# Set up the Kalman filter
klm = cv2.KalmanFilter(4, 2)
klm.measurementMatrix = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], np.float32)
klm.transitionMatrix = np.array([[1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0], [0, 0, 0, 1]], np.float32)
klm.processNoiseCov = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]], np.float32)

while True:
    ret, frame = cap.read()
    # frame = cv2.resize(frame, None, None, fx=1 / 2, fy=1 / 2, interpolation=cv2.INTER_CUBIC)
    if ret:
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        dst = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
        ret, track_window = cv2.meanShift(dst, track_window, term_crit)
        x, y, w, h = track_window

        # Center of the tracking window
        cent = center([[x, y], [x + w, y], [x, y + h], [x + w, y + h]])
        # Correct the filter with the measured center
        klm.correct(cent)
        # Predict the next position
        c = klm.predict()
        # Draw the predicted position
        cv2.circle(frame, (int(c[0]), int(c[1])), 30, (255, 255, 0), -1)
        # Draw the tracking rectangle
        img = cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.imshow('img', img)
        if cv2.waitKey(1) == ord('q'):
            break
    else:
        break
cap.release()
cv2.destroyAllWindows()

Origin blog.csdn.net/weixin_45823221/article/details/128484856