[OpenCV] Calculating optical flow in video and tracking objects with calcOpticalFlowPyrLK

1. Introduction

        cv2.calcOpticalFlowPyrLK is a function in the OpenCV library for calculating sparse optical flow. It implements the Lucas-Kanade method, a commonly used optical flow algorithm.

        Optical flow is an approximate representation of the movement of objects in an image. It describes the movement of each pixel in the image between two consecutive frames. The Lucas-Kanade method assumes that all pixels within a small neighborhood in the image are uniform in motion (i.e. have the same optical flow).

2. Principle

        The following is the basic working principle of cv2.calcOpticalFlowPyrLK:

        1. Select feature points: select some feature points in the first frame. These are usually corner points, because corners vary in all directions and are easier to track.

        2. Window: define a surrounding window for each feature point.

        3. Optical flow: for each window, assume that all of its pixels share the same optical flow, then solve for that flow by minimizing the brightness difference of the window's pixels between the first frame and the second frame.

        4. Pyramid: since motion can occur at different scales, cv2.calcOpticalFlowPyrLK repeats the above steps at multiple scales of the image (each level of a pyramid). This is the so-called pyramidal Lucas-Kanade method.

        Note that because the Lucas-Kanade method assumes all pixels within a window share the same optical flow, it can only handle small, smooth motions. Large or complex motions require more sophisticated methods, such as the Horn-Schunck method.
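        Before the full video example in the next section, here is a minimal two-frame sketch of these steps. The file names frame1.png and frame2.png are hypothetical stand-ins for any two consecutive frames:

import cv2

# Hypothetical inputs: any two consecutive frames of a video
prev = cv2.imread('frame1.png', cv2.IMREAD_GRAYSCALE)
curr = cv2.imread('frame2.png', cv2.IMREAD_GRAYSCALE)

# Step 1: pick corner-like feature points in the first frame
p0 = cv2.goodFeaturesToTrack(prev, maxCorners=100, qualityLevel=0.3, minDistance=7)

# Steps 2-4: track them with pyramidal Lucas-Kanade (window + pyramid)
p1, st, err = cv2.calcOpticalFlowPyrLK(prev, curr, p0, None, winSize=(15, 15), maxLevel=2)

# Displacement vectors of the successfully tracked points
print(p1[st == 1] - p0[st == 1])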

3. Code implementation

        The following code detects feature points, tracks them from frame to frame, and draws the motion trajectory of each point.

import numpy as np
import cv2

src_path = r'input.mp4'
target_path = r'output.mp4'

fps = 25  # output frame rate; ideally read from the source via cap.get(cv2.CAP_PROP_FPS)
cap = cv2.VideoCapture(src_path)

# Parameters for Shi-Tomasi corner detection
feature_params = dict(maxCorners=300,
                      qualityLevel=0.3,
                      minDistance=7,
                      blockSize=7)
# Parameters for Lucas-Kanade optical flow
# maxLevel: 0-based maximal pyramid level (0 means a single level, i.e. no pyramid)
lk_params = dict(winSize=(10, 10),
                 maxLevel=2,
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))

ret, old_frame = cap.read()                             # read the first frame of the video
old_gray = cv2.cvtColor(old_frame, cv2.COLOR_BGR2GRAY)  # convert to grayscale
p0 = cv2.goodFeaturesToTrack(old_gray, mask=None, **feature_params)
mask = np.zeros_like(old_frame)                         # blank mask image for drawing trajectories
h, w, _ = old_frame.shape
# Note: the "H264" fourcc requires codec support in your OpenCV build; "mp4v" is a common fallback
target_video = cv2.VideoWriter(target_path, cv2.VideoWriter_fourcc(*"H264"), fps, (w, h))
while True:
    res, frame = cap.read()
    if not res:
        break
    frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Compute optical flow to get the new positions of the points
    p1, st, err = cv2.calcOpticalFlowPyrLK(old_gray, frame_gray, p0, None, **lk_params)
    # Keep only the successfully tracked points
    good_new = p1[st == 1]
    good_old = p0[st == 1]
    for new, old in zip(good_new, good_old):
        a, b = new.ravel()
        c, d = old.ravel()
        mask = cv2.line(mask, (int(a), int(b)), (int(c), int(d)), (0, 255, 0), 3)  # trajectory segment
        frame = cv2.circle(frame, (int(a), int(b)), 5, (0, 255, 0), -1)            # current position
    img = cv2.add(frame, mask)
    target_video.write(img)
    # Use the current frame and the surviving points as the start of the next round
    old_gray = frame_gray.copy()
    p0 = good_new.reshape(-1, 1, 2)
    if len(p0) == 0:  # all points lost; stop (or re-detect features here)
        break

target_video.release()
cap.release()

4. Parameter explanation

1. Explanation of calcOpticalFlowPyrLK input parameters

1. prevImg: the previous frame.

2. nextImg: the current frame.

3. prevPts: the feature points (a vector of 2D points) detected in the previous frame.

4. nextPts: the output vector of 2D points containing the calculated new positions of the features in the current frame.

5. winSize: the size of the search window at each pyramid level. The default value is Size(21, 21).

        The image pyramid is a series of images obtained by continuously downsampling the original image. The image of each layer is smaller than the image of the previous layer. It's like a pyramid with the original large image at the bottom and the smallest image at the top. Image pyramids are used to handle motion at different scales. This is because in real images, the motion of objects may occur at different scales. For example, objects that are far away from the camera may have relatively small motion in the image, while objects that are close to the camera may have relatively large motion in the image. This motion at different scales can be handled by computing optical flow at each level of the image pyramid.

6. maxLevel: the 0-based maximal pyramid level number. If set to 0, no pyramid is used (a single level); if set to 1, two levels are used, and so on. The default value is 3.

        When building an image pyramid, starting from the original image, each layer is half the size of the previous one, and maxLevel is the maximal level number. maxLevel=0 means only the original image is used; maxLevel=1 adds its half-size version; maxLevel=2 adds the quarter-size version as well, and so on.
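        As a quick illustration (a minimal sketch, separate from the tracking code above, with a hypothetical input.jpg), you can build such a pyramid yourself with cv2.pyrDown and watch the sizes halve at each level:

import cv2

img = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE)  # hypothetical input image
level = img
for i in range(4):                      # levels 0..3, i.e. maxLevel=3
    print(f"level {i}: {level.shape}")  # each level is half the size of the previous one
    level = cv2.pyrDown(level)          # Gaussian blur + downsample by 2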

7. criteria: an important parameter that deserves a detailed explanation. It is a tuple that specifies the termination criterion of the iterative search algorithm. It contains three elements:
        (1) The type of termination criterion: cv2.TERM_CRITERIA_EPS, cv2.TERM_CRITERIA_COUNT, or a combination of both:
- cv2.TERM_CRITERIA_EPS: iteration stops when the specified precision epsilon is reached.
- cv2.TERM_CRITERIA_COUNT: iteration stops when the specified maximum number of iterations is reached.
- cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT: iteration stops when either of the above conditions is met.
        (2) The maximum number of iterations. In this example, it is 10.
        (3) epsilon, the required accuracy. In this example, it is 0.03.
For example, criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03) means the iteration stops after 10 iterations or once the accuracy reaches 0.03, whichever comes first.
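        For instance (a hypothetical variation of the lk_params used above), a stricter criterion trades speed for accuracy:

import cv2

# Stop after at most 30 iterations, or once the result changes by less than 0.01
strict_criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01)
lk_params = dict(winSize=(15, 15), maxLevel=2, criteria=strict_criteria)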

8. flags: operation flags. Possible values:

        (1) 0: no special behavior.

        (2) cv2.OPTFLOW_USE_INITIAL_FLOW: if this flag is set, the function uses the points in the nextPts parameter as the initial estimate and refines them. Otherwise, it copies the points from prevPts as the initial values.

        (3) cv2.OPTFLOW_LK_GET_MIN_EIGENVALS: if this flag is set, the function computes the minimum eigenvalue for each point and stores it in the err output.


        For example, if you want to use the points in nextPts as an initial approximation, you can call the function like this:

p1, st, err = cv2.calcOpticalFlowPyrLK(old_gray, frame_gray, p0, None, flags=cv2.OPTFLOW_USE_INITIAL_FLOW, **lk_params)

9. minEigThreshold: the threshold on the minimum eigenvalue. Features whose minimum eigenvalue (normalized by the window size) falls below this threshold are filtered out and not tracked. The default value is 1e-4.

The main purpose of this function is to track a set of feature points (prevPts) between two images using the Lucas-Kanade method. The new positions of these feature points are stored in nextPts. If a point cannot be tracked (for example, because it moves out of the image), the corresponding element of the status vector is set to 0.

2. Explanation of calcOpticalFlowPyrLK output parameters

1. nextPts: an array containing the new positions of the input feature points found in the next frame.

2. status: an array the same size as the input feature point array. If status[i] is 1, the new position of the i-th feature point was found; if it is 0, it was not.

3. err: an array the same size as the input feature point array. err[i] is the tracking error of the i-th feature point. By default it measures the intensity difference between the patch around the original position and the patch around the new one; with the cv2.OPTFLOW_LK_GET_MIN_EIGENVALS flag it holds the minimum eigenvalue instead. If a point was not found, its error is undefined.

These three outputs let you inspect and evaluate the optical flow results. For example, check the status array to see which feature points were successfully found in the next frame, or check the err array to assess the tracking accuracy.
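        A small sketch (reusing the p0/p1/st/err names from the code above; the 5.0 threshold is an arbitrary illustration) that combines status with an error threshold to keep only high-quality tracks:

# Keep points that were found AND whose tracking error is small
ok = (st.ravel() == 1) & (err.ravel() < 5.0)
good_new = p1[ok].reshape(-1, 2)
good_old = p0[ok].reshape(-1, 2)
print(f"kept {ok.sum()} of {len(st)} points")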

        That wraps up this brief introduction to the usage of calcOpticalFlowPyrLK. Follow along so you don't get lost (#^.^#)
