Introduction:

Stereo vision is a technique for inferring depth from images taken by two or more cameras.
Stereo matching is the process of taking two or more images and estimating a 3D model of the scene by finding matching pixels in the images and converting their 2D positions into 3D depths.

In this project, I will use the sum of absolute differences (SAD) algorithm to build a sparse or dense depth map that assigns relative depths to pixels in the input images.

Epipolar constraint:

(epipolar constraint)

Why we use the epipolar constraint:

A point P projects onto the left and right projection planes as Pl and Pr. To match Pl on the right plane without any constraint, we would have to search the whole right plane (a 2D search) to find Pr. The epipolar constraint reduces this to a 1D search, because the corresponding point Pr must lie on the epipolar line.
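
To give a rough sense of the saving, the sketch below counts the candidate positions for a single left pixel with and without the constraint. The image size is hypothetical, and the images are assumed to be rectified so that the epipolar line is a single image row; neither assumption comes from the project itself.

```matlab
% Hypothetical image size, chosen only for illustration
rows = 480;
cols = 640;

% Without the epipolar constraint: every pixel of the right image is a candidate
candidates2D = rows * cols;

% With the constraint (rectified images assumed): only the pixels on one row
candidates1D = cols;

fprintf('2D search: %d candidates\n', candidates2D);
fprintf('1D search: %d candidates\n', candidates1D);
```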

SAD algorithm:

In digital image processing, the sum of absolute differences (SAD) is a measure of the similarity between image blocks. It is calculated by taking the absolute difference between each pixel in the original block and the corresponding pixel in the block being used for comparison, and then summing these differences. The lower the SAD, the more similar the two blocks are.
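
As a minimal sketch of that calculation, the snippet below computes the SAD of two small blocks. The block values and the variable names (`blockA`, `blockB`, `sadValue`) are made up for illustration.

```matlab
% Two hypothetical 3x3 grayscale blocks (values chosen for illustration)
blockA = [10 20 30; 40 50 60; 70 80 90];
blockB = [12 18 30; 45 50 55; 70 85 90];

% Convert to double so that differences of uint8 image data cannot saturate at 0
diffs = abs(double(blockA) - double(blockB));

% SAD is the sum of all per-pixel absolute differences
sadValue = sum(diffs(:));
disp(sadValue)
```

Identical blocks give a SAD of 0, so smaller values indicate a better match.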

Steps:

1. Construct a small window, such as 3x3, 5x5, or 9x9.
2. Place the window on the left image, like a filter, and select all the pixels inside it.
3. Place the window on the right image in the same way and select all the pixels inside it.
4. Subtract the pixels in the right window from those in the left window, and sum the absolute values of the differences.
5. Move the right window and repeat steps 3 and 4.
6. Find the minimum sum. The right window with the minimum sum is the best match for the left window.

Because of the epipolar constraint and the left-right image pair, we only need to move the right window horizontally.
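
The steps above can be sketched roughly as follows. This is not the project's actual code: the synthetic image pair, the window size, and the search range `maxDisp` are all assumptions chosen so that the correct answer is known in advance. Real uint8 images should be converted with `double()` before subtracting.

```matlab
% Synthetic rectified pair (an assumption for illustration): the right image
% is the left image shifted 2 pixels to the left, so the true disparity is 2.
[x, y] = meshgrid(1:20, 1:10);
left = x.^2 + 3 * y;                 % textured image, distinct along each row
trueDisp = 2;
right = circshift(left, [0, -trueDisp]);

win = 3;                             % window size (3x3), must be odd
half = floor(win / 2);
maxDisp = 5;                         % hypothetical search range in pixels
[rows, cols] = size(left);
dispMap = nan(rows, cols);           % NaN marks border pixels that are skipped

for r = 1 + half : rows - half
    for c = 1 + half + maxDisp : cols - half
        % Window around (r, c) in the left image
        leftWin = left(r-half:r+half, c-half:c+half);
        sadValues = zeros(1, maxDisp + 1);
        for d = 0:maxDisp
            % Slide the right window horizontally along the same row
            rightWin = right(r-half:r+half, c-d-half:c-d+half);
            sadValues(d + 1) = sum(sum(abs(leftWin - rightWin)));
        end
        % The shift with the minimum SAD is taken as the best match
        [~, best] = min(sadValues);
        dispMap(r, c) = best - 1;
    end
end

disp(dispMap(5, 10))
```

On this synthetic pair every pixel that is not skipped should come out with a disparity of 2, which makes the sketch easy to check before trying it on real images.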

Some Results:

Extensions:

1. Compute the depth map for many different window sizes

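Combine the results of the 6x6 and 30x30 windows by taking the mean of the two images:

% Read the depth maps produced with the two window sizes
im1 = imread('re1.png');
im2 = imread('re2.png');
% Average in double precision: adding integer images directly saturates (e.g. at 255 for uint8)
im3 = cast((double(im1) + double(im2)) / 2, class(im1));
imshow(im3)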

2. One challenge is images that have large blank regions; develop some other heuristics or rules to "guess" what the best correspondence is for these large regions.

Matching windows with the SAD algorithm does not leave blank regions in the result, because every left window is always assigned the right window with the minimum sum of absolute differences.

If another algorithm is used and some regions cannot find a matching region, we can use interpolation to fill each blank region from its adjacent regions.
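
One possible way to do this, shown only as a sketch, is to interpolate linearly along each scanline. The disparity values below are invented, and marking unmatched pixels with NaN is an assumed convention rather than something a particular matching algorithm guarantees.

```matlab
% Hypothetical disparity values along one scanline; NaN marks unmatched pixels
scanline = [4 4 NaN NaN NaN 10 10 NaN 12];

known = ~isnan(scanline);            % pixels that found a match
cols = 1:numel(scanline);

% Linearly interpolate the unmatched pixels from their matched neighbours
filled = scanline;
filled(~known) = interp1(cols(known), scanline(known), cols(~known), 'linear');

disp(filled)
```

With the `'linear'` method, `interp1` does not extrapolate, so a blank region that touches the start or end of the scanline would stay NaN and would need a separate rule, such as copying the nearest matched value.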

References:

http://vision.deis.unibo.it/~smatt/Seminars/StereoVision.pdf

https://blog.csdn.net/u012507022/article/details/51446891

http://www.cnblogs.com/Crazy-Dog123/articles/5043864.html

http://www.cnblogs.com/Crazy-Dog123/articles/5041950.html

https://blog.csdn.net/ccblogger/article/details/72900316

https://pdfs.semanticscholar.org/5ff4/a905eeb744129d17a3039f229802749c3edc.pdf

https://github.com/luosch/stereo-matching/tree/master/ALL-2views

http://vision.middlebury.edu/stereo/data/

https://blog.csdn.net/liulina603/article/details/53302168
