A detailed explanation of the LK optical flow method (including pyramidal multi-layer optical flow) and the inverse optical flow method (with code)

The LK optical flow method can be used to track the positions of feature points.
For example, feature points in img1 appear at different positions in img2 because the camera or the object has moved. Below, img1 will be called the template T, and img2 will be called I.

The optical flow method relies on one hypothesis:
the gray-level constancy assumption: the gray value of the same spatial point is constant across images; that is, the gray level at a feature point in T is still the same at the corresponding point in I.

This requires constant illumination and constant object reflectance, which is a strong assumption.
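
Written out, for a point that moves by (dx, dy) between the two images, this assumption says:

$$T(x, y) = I(x + dx,\; y + dy)$$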

What we want to estimate is the motion offset (dx, dy), which is the optical flow. It cannot be solved from a single point; generally the pixels within a small window are taken together, assuming they all share the same motion.

Least squares is used to solve for the motion of the pixels:

$$\min_p \sum_x \left[ I(W(x;p)) - T(x) \right]^2$$

Here W is the warp that maps a pixel coordinate to its moved position, so $I(W(x;p))$ samples image I at the moved position and is compared with T, the image before the motion, to see how large the error is. The sum runs over the pixels within the window.
p is the motion displacement (dx, dy).

This is a nonlinear optimization problem, because pixel values are not linearly related to coordinates.
We can therefore assume that p is already known (for example, start from the point's original coordinates, i.e. zero displacement, or from some given value) and repeatedly add an increment to adjust it.
The problem then becomes the following optimization, whose objective is called the error:

$$\sum_x \left[ I(W(x;p+\Delta p)) - T(x) \right]^2 \tag{4}$$

Each time the increment $\Delta p$ is estimated, p is updated:

$$p \leftarrow p + \Delta p \tag{5}$$

Steps (4) and (5) are iterated until a convergence condition is reached; the condition can be that $\Delta p$ is smaller than a threshold, or that the cost in (4) becomes larger than in the previous iteration (in theory the cost decreases gradually).

The above are the general steps for finding the optical flow by least squares. The following shows specifically how to solve for $\Delta p$ with the Gauss-Newton method and iterate out the optical flow (dx, dy):

  1. Take formula (4) above (the cost function) and perform a first-order Taylor expansion of $I(W(x;p+\Delta p))$, which gives:
    $$\sum_x \left[ I(W(x;p)) + \nabla I \frac{\partial W}{\partial p} \Delta p - T(x) \right]^2 \tag{6}$$
    Here $\nabla I$ is the gradient of image I (i.e. img2) in the x and y directions, evaluated at the warped coordinates W(x;p).
    $\partial W / \partial p$ is the partial derivative of W with respect to p.
    For example, for the pure translation used here, W is
    $$W(x;p) = \begin{bmatrix} x + p_1 \\ y + p_2 \end{bmatrix}$$
    so
    $$\frac{\partial W}{\partial p} = \begin{bmatrix} \partial W_x/\partial p_1 & \partial W_x/\partial p_2 \\ \partial W_y/\partial p_1 & \partial W_y/\partial p_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
    Here $(p_1, p_2)$ is the optical flow (dx, dy) we want. This $\partial W / \partial p$ is the Jacobian of the warp; since it is the identity here, the term $\nabla I \, \partial W/\partial p$ used below reduces to the image gradient, which is what the code stores in J.

  2. Take the partial derivative of formula (6) above with respect to $\Delta p$, which gives:
    $$2 \sum_x \left[ \nabla I \frac{\partial W}{\partial p} \right]^T \left[ I(W(x;p)) + \nabla I \frac{\partial W}{\partial p} \Delta p - T(x) \right] \tag{9}$$
    First, the relationship between $\Delta p$ and p: p is the optical flow (dx, dy) we want, and it is hard to solve for directly. So assume an initial (dx, dy) is known, e.g. (0, 0), and at each iteration solve for the increment $(\Delta dx, \Delta dy)$, i.e. $\Delta p$, which keeps correcting (dx, dy) until convergence.
    Setting the partial derivative (9) to zero gives:
    $$\Delta p = H^{-1} \sum_x \left[ \nabla I \frac{\partial W}{\partial p} \right]^T \left[ T(x) - I(W(x;p)) \right]$$
    This is the least-squares solution.
    In fact, the general form of a least-squares increment problem is
    $$H \Delta x = b$$
    where $H = J J^{T}$, $b = -J e$, and $e$ is the error function.

Similarly, the $H = J J^{T}$ above is
$$H = \sum_x \left[ \nabla I \frac{\partial W}{\partial p} \right]^T \left[ \nabla I \frac{\partial W}{\partial p} \right]$$
Once $\Delta p$ is obtained, we iterate until the final optical flow (dx, dy), which is p, is obtained.
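
For the pure-translation warp, $\partial W / \partial p$ is the identity, so the Gauss-Newton step reduces to the familiar 2×2 system. Writing $I_x, I_y$ for the gradients of I at the shifted position, it reads:

$$H = \sum_{x \in \text{window}} \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}, \qquad \Delta p = H^{-1} \sum_{x \in \text{window}} \begin{bmatrix} I_x \\ I_y \end{bmatrix} \big[ T(x) - I(x + p) \big]$$

This is exactly what the code below accumulates into H and b.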

The following example illustrates how to track feature points with optical flow.

First extract the feature points kp1 in img1, which is T:

Mat img1 = imread("../imgs/LK1.png", 0); //T
Mat img2 = imread("../imgs/LK2.png", 0);  //I

//key points
vector<KeyPoint> kp1;
FAST(img1, kp1, 40); // other feature detectors could also be used

For each feature point kp1[i], set the initial (dx, dy) = (0, 0):

auto kp = kp1[i];
double dx = 0, dy = 0;  // to be estimated

Several matrices that will be used:

Eigen::Matrix2d H = Eigen::Matrix2d::Zero();  //Hessian
Eigen::Vector2d b = Eigen::Vector2d::Zero();  //bias
Eigen::Vector2d J;   //Jacobian

As mentioned earlier, a single point is not enough: the pixels within a small window are used, assuming their motion is the same.
The inverse optical flow method also appears in the code below and will be explained later.
H and b are computed in the following code, and then $\Delta p = H^{-1} b$.
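
The code below reads pixel values through a helper GetPixelValue that is not shown in this excerpt. Since dx and dy are real-valued, it has to sample the images at sub-pixel positions, typically with bilinear interpolation. A minimal sketch, assuming a single-channel 8-bit image (the boundary handling in the original code may differ):

#include <algorithm>
#include <opencv2/opencv.hpp>

// read a pixel at a real-valued position with bilinear interpolation (assumes CV_8UC1)
inline float GetPixelValue(const cv::Mat &img, float x, float y) {
    // clamp so that the 2x2 neighbourhood stays inside the image
    x = std::min(std::max(x, 0.0f), (float)(img.cols - 2));
    y = std::min(std::max(y, 0.0f), (float)(img.rows - 2));
    int x0 = (int)x, y0 = (int)y;
    float xx = x - x0, yy = y - y0;
    const uchar *row0 = img.ptr<uchar>(y0);
    const uchar *row1 = img.ptr<uchar>(y0 + 1);
    return (1 - xx) * (1 - yy) * row0[x0] + xx * (1 - yy) * row0[x0 + 1]
         + (1 - xx) * yy * row1[x0] + xx * yy * row1[x0 + 1];
}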

//compute cost and Jacobian over the window
for(int x = -half_patch_size; x < half_patch_size; x++) {
    for(int y = -half_patch_size; y < half_patch_size; y++) {

        //photometric error between T (img1) and the shifted position in I (img2)
        double error = GetPixelValue(img1, kp.pt.x + x, kp.pt.y + y) -
                GetPixelValue(img2, kp.pt.x + x + dx, kp.pt.y + y + dy);
        if(inverse == false) {

            //derivative of the error w.r.t. dx and dy; the image is discrete, so use the
            //central difference between dx+1 and dx-1 (same for dy)
            //x, y are the offsets within the window
            //img1 does not depend on dx, dy, so only img2 appears here
            J = -1.0 * Eigen::Vector2d(
                    0.5 * (GetPixelValue(img2, kp.pt.x + x + dx + 1, kp.pt.y + y + dy) -
                           GetPixelValue(img2, kp.pt.x + x + dx - 1, kp.pt.y + y + dy)),
                    0.5 * (GetPixelValue(img2, kp.pt.x + x + dx, kp.pt.y + y + dy + 1) -
                           GetPixelValue(img2, kp.pt.x + x + dx, kp.pt.y + y + dy - 1))
                    );
        } else if (iter == 0) {

            //inverse optical flow: compute H only in the first iteration, then keep it fixed
            J = -1.0 * Eigen::Vector2d(
                    0.5 * (GetPixelValue(img1, kp.pt.x + x + 1, kp.pt.y + y) -
                           GetPixelValue(img1, kp.pt.x + x - 1, kp.pt.y + y)),
                    0.5 * (GetPixelValue(img1, kp.pt.x + x, kp.pt.y + y + 1) -
                           GetPixelValue(img1, kp.pt.x + x, kp.pt.y + y - 1))
                    );
        }

        //accumulate H, b and the cost
        b += -error * J;
        cost += error * error;
        //the forward method has to update H in every iteration
        if(inverse == false || iter == 0) {
            H += J * J.transpose();
        }
    }
}

$\Delta p = H^{-1} b$:

Eigen::Vector2d update = H.ldlt().solve(b);  // solve via LDLT to avoid explicitly inverting H

//least squares has converged when the cost stops decreasing
if(iter > 0 && cost > lastCost) {
    break;
}

Then iteratively update (dx, dy):

dx += update[0];
dy += update[1];
lastCost = cost;

Finally, (dx, dy) is the optical flow we want. From the keypoint kp1[i] in img1, the corresponding keypoint in img2 is tracked as:

kp2[i].pt = kp.pt + Point2f(dx, dy);
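
For context, here is a sketch of how these fragments fit into the per-keypoint Gauss-Newton loop; the variable iterations and the convergence thresholds are assumptions for illustration, not necessarily the values used in the original code:

int iterations = 10;           // maximum Gauss-Newton iterations (assumed)
double cost = 0, lastCost = 0;

for (int iter = 0; iter < iterations; iter++) {
    if (inverse == false) {
        // forward method: H and b are rebuilt in every iteration
        H = Eigen::Matrix2d::Zero();
        b = Eigen::Vector2d::Zero();
    } else {
        // inverse method: keep H from iter == 0, rebuild only b
        b = Eigen::Vector2d::Zero();
    }
    cost = 0;

    // ... the double loop over the patch shown above fills H, b and cost ...

    Eigen::Vector2d update = H.ldlt().solve(b);
    if (std::isnan(update[0])) break;         // H was singular, e.g. a textureless patch
    if (iter > 0 && cost > lastCost) break;   // cost increased: stop iterating
    dx += update[0];
    dy += update[1];
    lastCost = cost;
    if (update.norm() < 1e-2) break;          // increment small enough: converged
}

kp2[i].pt = kp.pt + Point2f(dx, dy);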

The above is single-layer optical flow. Now let's look at pyramidal multi-layer optical flow.
(Figure: image pyramids of img1 and img2.) Assume the bottom layer on the left (No. 3) is the original image img1 (T); the layers above it (No. 2, No. 1) are scaled-down copies of img1. Likewise, the bottom layer on the right is img2, and the layers above it are scaled-down copies of img2.

int pyramids = 4;              // number of pyramid levels used here
double pyramid_scale = 0.5;    // each level is half the size of the one below
//create pyramids
vector<Mat> pyr1, pyr2;  //image pyramids
for(int i = 0; i < pyramids; i++) {
    if(i == 0) {
        pyr1.push_back(img1);
        pyr2.push_back(img2);
    } else {
        Mat img1_pyr, img2_pyr;
        //cv::Size(cols, rows)
        resize(pyr1[i - 1], img1_pyr,
               Size(pyr1[i-1].cols * pyramid_scale, pyr1[i-1].rows * pyramid_scale));
        resize(pyr2[i - 1], img2_pyr,
                Size(pyr2[i-1].cols * pyramid_scale, pyr2[i-1].rows * pyramid_scale));
        pyr1.push_back(img1_pyr);
        pyr2.push_back(img2_pyr);
    }
}
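
As a side note, a half-scale pyramid could also be built with cv::pyrDown, which applies a Gaussian blur before downsampling (plain resize does not); a minimal alternative sketch:

vector<Mat> pyr1_alt;
pyr1_alt.push_back(img1);
for (int i = 1; i < pyramids; i++) {
    Mat down;
    pyrDown(pyr1_alt[i - 1], down);   // Gaussian blur + downsample by 2 in each dimension
    pyr1_alt.push_back(down);
}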

The optical flow is computed starting from the top layer (No. 1).
The scaled img1 at layer No. 1 on the left is T, and the scaled img2 at layer No. 1 on the right is I; the optical flow is computed for the keypoints kp1 of img1 to obtain kp2.
When calculating the lower layers, the optical flow result of the previous (coarser) layer is used as the initial guess, instead of assuming (dx, dy) = (0, 0) as at the top layer:

double dx = 0, dy = 0;  // to be estimated
if(has_initial) {
    dx = kp2[i].pt.x - kp.pt.x;  // displacement in x from the coarser level
    dy = kp2[i].pt.y - kp.pt.y;  // displacement in y from the coarser level
}

What is the benefit of doing this? Suppose a point moves 20 pixels: single-layer optical flow may fall into a local minimum because the displacement is too large, but after scaling the images down the motion may be only 5 pixels. This is a coarse-to-fine idea; a sketch of the level loop follows.
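
Here is a sketch of the coarse-to-fine level loop, assuming the single-level routine above is wrapped in a function called OpticalFlowSingleLevel (the name and signature are assumptions for illustration); note that keypoint coordinates have to be rescaled whenever the level changes:

// scale the img1 keypoints down to the top (coarsest) level first
double top_scale = pow(pyramid_scale, pyramids - 1);
vector<KeyPoint> kp1_pyr = kp1;
for (auto &kp : kp1_pyr) kp.pt *= top_scale;

vector<KeyPoint> kp2_pyr;
for (int level = pyramids - 1; level >= 0; level--) {
    // kp2_pyr from the coarser level serves as the initial guess on this level
    bool has_initial = (level < pyramids - 1);
    OpticalFlowSingleLevel(pyr1[level], pyr2[level], kp1_pyr, kp2_pyr, has_initial);

    if (level > 0) {
        // moving one level finer: all coordinates grow by 1 / pyramid_scale
        for (auto &kp : kp1_pyr) kp.pt /= pyramid_scale;
        for (auto &kp : kp2_pyr) kp.pt /= pyramid_scale;
    }
}
// kp2_pyr now holds the tracked keypoints at the original resolution of img2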

Now let's talk about the inverse optical flow method.
The forward optical flow above has to recompute H in every iteration, which is computationally heavy. The question is whether H can be computed only once and then reused.
The inverse optical flow method reverses the direction of the forward method: the roles are exchanged, and the warp now goes from I back to T, i.e. the image I after the motion is warped back towards the template T before the motion.
(Figure: illustration of the warp direction in the inverse method, from I back to T.)
The error function is as follows, obtained by exchanging the roles of T and I:

$$\sum_x \left[ T(W(x;\Delta p)) - I(W(x;p)) \right]^2$$

where

$$H = \sum_x \left[ \nabla T \frac{\partial W}{\partial p} \right]^T \left[ \nabla T \frac{\partial W}{\partial p} \right]$$

It can be seen that, since T is the image before the motion, it does not depend on p = (dx, dy), so H has nothing to do with p: H stays constant while $\Delta p$ is computed at each iteration, and therefore it only needs to be computed once, in the first iteration. This corresponds to the following part of the code above:

} else if (iter == 0) {

    //inverse optical flow: compute H only in the first iteration, then keep it fixed
    //H depends only on T, i.e. img1
    J = -1.0 * Eigen::Vector2d(
            0.5 * (GetPixelValue(img1, kp.pt.x + x + 1, kp.pt.y + y) -
                   GetPixelValue(img1, kp.pt.x + x - 1, kp.pt.y + y)),
            0.5 * (GetPixelValue(img1, kp.pt.x + x, kp.pt.y + y + 1) -
                   GetPixelValue(img1, kp.pt.x + x, kp.pt.y + y - 1))
            );
}

So in the inverse mode H does not need to be updated in every iteration:

if(inverse == false || iter == 0) {
    H += J * J.transpose();
}
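
As a sanity check, the hand-written tracker can be compared against OpenCV's built-in pyramidal LK implementation; a minimal sketch:

// compare with OpenCV's own pyramidal LK tracker
vector<Point2f> pt1, pt2;
for (auto &kp : kp1) pt1.push_back(kp.pt);
vector<uchar> status;
vector<float> err;
calcOpticalFlowPyrLK(img1, img2, pt1, pt2, status, err, Size(8, 8), 3);
// status[i] == 1 means point i was tracked; pt2[i] is its position in img2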

The above covers the forward and inverse optical flow methods. For more detailed information, please refer to the full paper and the complete code at the link.

Origin blog.csdn.net/level_code/article/details/123459188