Binocular Stereo Vision: SAD Algorithm

Algorithm principle

SAD (Sum of Absolute Differences) is an image matching algorithm. The basic idea is to sum the absolute values of pixel differences. It is often used in image block matching: the absolute differences between corresponding pixels are summed to evaluate the similarity of two image blocks, and a smaller sum means a closer match. The algorithm is fast but not very precise, and is often used for initial screening in multistage processing.
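As a minimal sketch of the idea, the function below sums the absolute differences of two equally sized grayscale blocks. The name sadBlock, the flat row-major storage and the -1 return value for mismatched sizes are assumptions made for this illustration, not part of the original post.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Sum of absolute differences between two grayscale blocks stored
// row-major as 8-bit values. A lower result means more similar blocks.
// Returns -1 when the sizes differ (a convention chosen for this sketch).
long sadBlock(const std::vector<unsigned char>& a,
              const std::vector<unsigned char>& b)
{
    if (a.size() != b.size()) return -1;
    long total = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        total += std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
    return total;
}
```

Identical blocks give 0, and the value grows as the blocks diverge, which is why the smallest SAD is treated as the best match.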

Common Stereo Matching Algorithm Process

The common stereo matching pipeline consists of the following four steps:

  1. Matching cost calculation

  2. Cost aggregation

  3. Disparity calculation or optimization

  4. Disparity refinement

Matching cost calculation often uses methods such as SAD, which are based on the absolute difference between the pixel values of candidate matching points in the left and right images.

Cost aggregation often uses a fixed window and sums the matching costs of all pixels inside the window.

The most intuitive way to compute the disparity is the WTA (Winner Takes All) method, which directly selects the disparity value that minimizes the aggregated cost.

Summary of BM algorithm

Put simply, stereo matching finds the same point in two aligned images: given a point in the Reference image, search for the corresponding point in the Target image, as shown in the figure below.

According to the epipolar constraint, the match for the red pixel (x, y) in the left image above is searched for in the right image. In practice, matching a single pixel directly causes various problems, so the point is replaced by a fixed window, as shown in the figure below.

Doing so implies the assumption that all pixels inside the window have the same disparity. This assumption often fails, for example at object boundaries where depth changes abruptly, which limits the quality of the results.

The BM algorithm, also often called the SAD (Sum of Absolute Differences) algorithm, is the most basic algorithm in binocular stereo matching.

SAD basic theory

The SAD algorithm consists of 3 steps.

  1. Matching cost calculation

  2. Cost aggregation

  3. Disparity calculation

Matching Cost Computation

The matching cost of SAD is simple to compute. The pixels of the Reference image and the Target image are subtracted directly and the absolute value is taken, that is, \(|I_R(x,y)-I_T(x+d,y)|\).

The disparity space image (DSI) is a three-dimensional matrix defined as

\[ c(x,y,d) = |I_R(x,y)-I_T(x+d,y)| \]

It can be understood as the cost of point \((x,y)\) in the Reference image when the searched disparity is \(d\).
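The DSI can be sketched as a flat array indexed by (d, y, x). The GrayImage struct, the function name buildCostVolume and the memory layout below are assumptions made for this illustration. The sketch follows the formula's x + d indexing literally; whether the target is sampled at x + d or x - d depends on which image serves as the reference (the code later in this post uses the left image as reference and samples the right image at x - d). Entries whose target column falls outside the image are filled with 255, the largest possible 8-bit difference, which is also an assumption of this sketch.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical minimal grayscale image: row-major 8-bit pixels.
struct GrayImage
{
    int width;
    int height;
    std::vector<unsigned char> data;
    int at(int x, int y) const
    {
        return data[static_cast<std::size_t>(y) * width + x];
    }
};

// Builds c(x, y, d) = |I_R(x, y) - I_T(x + d, y)| for d in [0, numDisp).
// Layout: cost[(d * height + y) * width + x].
// Entries with x + d outside the target image keep the fill value 255.
std::vector<int> buildCostVolume(const GrayImage& ref, const GrayImage& tgt,
                                 int numDisp)
{
    std::vector<int> cost(
        static_cast<std::size_t>(numDisp) * ref.height * ref.width, 255);
    for (int d = 0; d < numDisp; ++d)
        for (int y = 0; y < ref.height; ++y)
            for (int x = 0; x < ref.width && x + d < tgt.width; ++x)
            {
                std::size_t idx =
                    (static_cast<std::size_t>(d) * ref.height + y) * ref.width + x;
                cost[idx] = std::abs(ref.at(x, y) - tgt.at(x + d, y));
            }
    return cost;
}
```

For a one-row reference {10, 20, 30, 40} and target {99, 10, 20, 30}, the d = 1 slice is zero wherever the target is in range, which reflects a shift of one pixel between the two rows.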

Cost Aggregation

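Cost aggregation in SAD simply sums the costs inside a fixed window (FW), as illustrated in the figure below.

The aggregated cost inside the FW at disparity \(d\) is

\[ C(x,y,d)=\sum_{(u,v)\in S(x,y)}|I_R(u,v)-I_T(u+d,v)| \]

where \(S(x,y)\) is the set of pixels in the fixed window anchored at \((x,y)\).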
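A minimal sketch of fixed-window aggregation for a single disparity slice follows. The function name aggregateFixedWindow, the flat row-major cost layout and the choice to anchor the window at its top-left corner (as the code later in this post does) are assumptions for illustration. This is the brute-force form; it recomputes every window sum from scratch.

```cpp
#include <cstddef>
#include <vector>

// Sums per-pixel costs of one disparity slice over a win x win window
// whose top-left corner is (x, y). Only positions where the window fits
// entirely inside the image are produced, so the output has
// (width - win + 1) x (height - win + 1) entries, stored row-major.
std::vector<long> aggregateFixedWindow(const std::vector<int>& cost,
                                       int width, int height, int win)
{
    const int outW = width - win + 1;
    const int outH = height - win + 1;
    if (win <= 0 || outW <= 0 || outH <= 0) return {};

    std::vector<long> agg(static_cast<std::size_t>(outW) * outH, 0);
    for (int y = 0; y < outH; ++y)
        for (int x = 0; x < outW; ++x)
        {
            long sum = 0;
            for (int v = y; v < y + win; ++v)
                for (int u = x; u < x + win; ++u)
                    sum += cost[static_cast<std::size_t>(v) * width + u];
            agg[static_cast<std::size_t>(y) * outW + x] = sum;
        }
    return agg;
}
```

Running this once per disparity slice gives the aggregated cost C(x, y, d) for every window position.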

Disparity Computation

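Disparity computation in SAD is very simple. Following the WTA principle, for a given \((x,y)\), find the \(d\) that minimizes \(C(x,y,d)\); this \(d\) is taken as the disparity at that point.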
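The WTA step can be sketched as an argmin over the aggregated costs of one pixel. The function name winnerTakesAll and the -1 return value for an empty input are assumptions of this sketch; ties are resolved in favor of the smallest disparity because std::min_element returns the first minimum.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// costs[d] is the aggregated cost of one pixel at disparity d.
// Returns the disparity with the smallest cost, or -1 for an empty input.
int winnerTakesAll(const std::vector<long>& costs)
{
    if (costs.empty()) return -1;
    auto best = std::min_element(costs.begin(), costs.end());
    return static_cast<int>(std::distance(costs.begin(), best));
}
```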

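Basic procedure

Input: two images, a left image and a right image, already rectified so that their rows are aligned.

Scan the left image in order and select an anchor point:

(1) Set the size of the SAD window (the gray area in the figure below), the position (p,q) in left_image where matching starts, and the range D over which the SAD window moves in right_image.

(2) In left_image, determine the position (x,y) of the pixel to be matched and use it as the anchor of the SAD window. Cover the region of left_image anchored at (x,y) with the SAD window to obtain regionl.

(3) In right_image, choose the starting point for matching at position (m,n) and use it as the anchor of the SAD window. Cover right_image with the SAD window to obtain the image region regionr anchored at (m,n).

(4) Define difference = regionr - regionl and compute the sum of the absolute values of its elements.

(5) Move the SAD window along the row direction in right_image (the number of moves equals the size of the matching range), repeat steps (3) and (4), and record each resulting difference sum in the matrix mat.

(6) Find the smallest difference sum in mat. Its position is the disparity between right_image and left_image.

Code implementation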


#include <iostream>
#include <limits>
#include "opencv2/opencv.hpp"

class SAD
{
public:
    SAD() :winSize(7), DSR(30) {}
    SAD(int _winSize, int _DSR) :winSize(_winSize), DSR(_DSR) {}
    cv::Mat computerSAD(cv::Mat& L, cv::Mat& R); // compute the SAD disparity map
private:
    int winSize; // side length of the matching window (kernel)
    int DSR;     // disparity search range

};

cv::Mat SAD::computerSAD(cv::Mat& L, cv::Mat& R)
{
    int Height = L.rows;
    int Width = L.cols;
    cv::Mat Kernel_L(cv::Size(winSize, winSize), CV_8U, cv::Scalar::all(0));
    cv::Mat Kernel_R(cv::Size(winSize, winSize), CV_8U, cv::Scalar::all(0));
    cv::Mat Disparity(Height, Width, CV_8U, cv::Scalar(0)); // disparity map

    for (int i = 0; i < Width - winSize; i++)
    {
        for (int j = 0; j < Height - winSize; j++)
        {
            Kernel_L = L(cv::Rect(i, j, winSize, winSize));
            // MM is a 1 x DSR matrix holding the SAD value of each candidate disparity.
            // It starts at the largest float so that disparities skipped below (x < 0) can never win.
            cv::Mat MM(1, DSR, CV_32F, cv::Scalar(std::numeric_limits<float>::max()));

            for (int k = 0; k < DSR; k++)
            {
                int x = i - k; // the match in the right image lies k pixels to the left (see the notes below)
                if (x >= 0)
                {
                    Kernel_R = R(cv::Rect(x, j, winSize, winSize));
                    cv::Mat Dif;
                    cv::absdiff(Kernel_L, Kernel_R, Dif); // per-pixel absolute difference
                    cv::Scalar ADD = cv::sum(Dif);
                    float a = static_cast<float>(ADD[0]); // SAD of the two windows at disparity k
                    MM.at<float>(k) = a; // column k of MM holds the SAD at disparity k; the minimum is picked below
                    // std::cout << "i,j: " << i << ", " << j << "; MM " << MM << std::endl; // debug output, very slow
                }
            }

            cv::Point minLoc; // 2D point with x and y coordinates
            double min = 0.0;
            cv::minMaxLoc(MM, &min, NULL, &minLoc, NULL); // location of the minimum of MM

            int loc = minLoc.x; // column index of the minimum, i.e. the disparity
            // Disparity is CV_8U, so access it as uchar; scale to 0..255 for display only.
            Disparity.at<uchar>(j, i) = cv::saturate_cast<uchar>(loc * 255 / DSR);
        }

        double rate = double(i) / (Width);
        //cout << "done " << setprecision(2) << rate * 100 << "%" << endl; // progress
    }
    return Disparity;
}

int main()
{
    cv::Mat Img_L = cv::imread("SAD\\left_0.jpg", cv::IMREAD_GRAYSCALE);
    cv::Mat Img_R = cv::imread("SAD\\right_0.jpg", cv::IMREAD_GRAYSCALE);
    if (Img_L.empty() || Img_R.empty())
    {
        std::cerr << "Failed to load the input images" << std::endl;
        return -1;
    }
    cv::Mat Disparity;    // disparity map

    //SAD mySAD;
    SAD mySAD(7, 30);
    Disparity = mySAD.computerSAD(Img_L, Img_R);

    cv::imshow("Img_L", Img_L);
    cv::imshow("Img_R", Img_R);
    cv::imshow("Disparity", Disparity);
    cv::waitKey();

    return 0;
}

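Notes:

The SAD algorithm yields the disparity between the left and right images, and further processing gives a depth map; depth is inversely proportional to disparity. Try an experiment: hold a finger at different distances from your eyes and alternately open and close your left and right eyes. The apparent shift of the finger differs with distance, and the closer the finger, the larger the disparity; that distance is the depth. You can also observe that with the left eye the finger appears further to the right in your view, and with the right eye it appears further to the left. Assume both eyes see fields of view of the same size. If (x,y) is the position of a point in the left-eye view, the same point in the right-eye view is at (x-d,y), where d is the disparity. Then (x,y) and (x-d,y) are the best matching points. In practice, however, we do not know d. The SAD algorithm is a way to find the disparity d.

SAD algorithm: we search over the disparity range starting from 0 and find the best-matching point between the left and right images, which fixes the disparity. How is the best match determined? Suppose the disparity were 0, that is, the left and right images were identical. Then the pixels surrounding the point would be the same in both images and their differences would all be 0. Because of the disparity (put simply, the object is viewed from a different angle, and lighting also changes the pixel values), the pixels surrounding the point will not be exactly equal. The idea still applies: define a small window and compute the sum of the absolute pixel differences between the two images inside it. Following the epipolar constraint, slide the window over the pixels of the right image; if the disparity search range is 0-50, this gives 51 results. The disparity d at which the sum is smallest is the disparity of the window's center point. The depth map then follows from the relation between disparity and depth.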
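The inverse relation between depth and disparity can be made concrete for a rectified stereo pair with the standard pinhole relation Z = f * B / d. The focal length f (in pixels) and the baseline B (the distance between the two cameras) are not given in this post; they are assumptions of this sketch, as is the function name depthFromDisparity.

```cpp
#include <limits>

// Depth of a point from its disparity for a rectified stereo pair.
// focalPx:     focal length in pixels (assumed known from calibration)
// baseline:    distance between the two camera centers
// disparityPx: disparity in pixels
// The result is in the same unit as baseline. A disparity of zero or
// less is treated as a point at infinity (a convention of this sketch).
double depthFromDisparity(double focalPx, double baseline, double disparityPx)
{
    if (disparityPx <= 0.0) return std::numeric_limits<double>::infinity();
    return focalPx * baseline / disparityPx;
}
```

Doubling the disparity halves the depth, which matches the finger experiment: the closer the finger, the larger the shift between the two views.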

https://jiweibo.github.io/StereoBM/

Origin blog.csdn.net/qq_30460949/article/details/129087841