Disparity optimization for binocular stereo matching


This article is part of my own study notes, recording the problems and takeaways from learning stereo matching. I share it here; please attach a link to the original text when reprinting.

Disparity optimization

  The previous article introduced cost aggregation; this one mainly describes disparity optimization. After the aggregated cost matrix S is obtained, the next operation is actually disparity computation, but that part is simple (essentially the winner-take-all algorithm) and was already introduced in the third section of the binocular stereo matching steps, so it is not elaborated here.
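  As a quick reminder, winner-take-all is just a per-pixel argmin over the cost volume. Below is a minimal sketch, assuming the aggregated cost matrix S is stored as a NumPy array `cost_volume` of shape (height, width, max_disparity) — the name and layout are my own conventions, not the article's:

```python
import numpy as np

def wta_disparity(cost_volume: np.ndarray) -> np.ndarray:
    """Winner-take-all: for each pixel, pick the disparity with the
    smallest aggregated cost."""
    # cost_volume[i, j, d] = aggregated cost of pixel (i, j) at disparity d
    return np.argmin(cost_volume, axis=2)
```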

  Disparity optimization, as the name suggests, is about making the disparity map perform better!!! To make it better, we first need to know what causes bad disparities. The reason is simple: the matching costs computed during cost aggregation are not accurate, which produces wrong disparities, and this inaccuracy is usually caused by image noise, occlusion, weak texture, or repeated texture. Since wrong disparities arise, we must find a way to correct them. Our approach: first remove these nasty mismatched points, turning them into invalid points, and then fill the invalid points back in.

  Having said that, let's first introduce the methods for eliminating these false matches.

Left-right consistency check

  The left-right consistency check is currently the most commonly used disparity-optimization method; almost every pipeline uses it. It is a uniqueness constraint on disparity: each pixel has only one correct disparity. The basic idea: cost aggregation has already given us the disparity map of the left image. Now swap the left and right images — the left becomes the right and vice versa — and run stereo matching again to obtain the disparity map of the new left image (the original right image). Then, for each pair of same-name points, compare the disparities in the two maps. If they are consistent, the point passes the check and is kept; points that fail the check are eliminated. In practice we allow a certain tolerance when comparing the disparities of same-name points: the two values need not be identical, but if their difference is below a threshold (usually 1 pixel) the uniqueness constraint is considered satisfied and the point is retained; otherwise it is rejected. The consistency check is given in Equation 1:

$$D_p = \begin{cases} D_{bp}, & \text{if } \left| D_{bp} - D_{mq} \right| \le 1 \\ D_{inv}, & \text{otherwise} \end{cases} \tag{1}$$

  Here b refers to the left (base) view, m to the right (match) view; p and q are a pair of same-name points, and $D_{inv}$ marks an invalid disparity.
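  Below is a minimal sketch of the comparison itself, assuming both disparity maps are already computed as NumPy arrays; `disp_left`, `disp_right`, the threshold default, and the `INVALID` marker are my own conventions:

```python
import numpy as np

INVALID = -1  # marker for rejected disparities (my own convention)

def lr_consistency_check(disp_left: np.ndarray, disp_right: np.ndarray,
                         threshold: float = 1.0) -> np.ndarray:
    """Keep a left-image disparity only if the right image agrees with it."""
    h, w = disp_left.shape
    out = disp_left.astype(np.float64).copy()
    for i in range(h):
        for j in range(w):
            d = disp_left[i, j]
            jr = int(round(j - d))  # same-name pixel q in the right view
            # Reject if q falls outside the image or the two disparities
            # differ by more than the tolerance (usually 1 pixel).
            if jr < 0 or jr >= w or abs(d - disp_right[i, jr]) > threshold:
                out[i, j] = INVALID
    return out
```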

  That is the left-right consistency check — easy, isn't it? The variant that swaps the left and right images and runs stereo matching a second time is called the external check. Note that after swapping, the overlapping area of the two images lies at the outer edges, which does not fit the assumptions of the original algorithm, so after the swap I usually also flip both images to their horizontal mirrors, which moves the overlap back to the middle. The figure below shows how the overlapping area differs after swapping the left and right images: [Figure: overlap regions of the left and right images before and after swapping]

  The external check is logically clear, but running stereo matching twice costs a lot of time. Is there a more efficient algorithm? ...enmmm... of course there is, so let's introduce this magical method in detail — the internal check. Well... it's not actually magical. The internal check computes the cost array of the right image directly from the cost array of the left image, and from that the right disparity map. See — no second pass of stereo matching, which greatly improves efficiency. The concrete operation is given below, starting with the formula for the right image's cost:

      $\text{Cost}_{right}(i,\ j,\ d) = \text{Cost}_{left}(i,\ j + d,\ d)$

  In case that is not clear, here is an explanation: for a pixel (i, j) of the right image and a disparity value d, the corresponding pixel in the left image is (i, j + d); the cost of the left image at (i, j + d) under the same disparity d is taken as the cost of the right image at (i, j) under disparity d.

  With this formula we can obtain the cost Cost(i, j, d) of every candidate disparity d for every pixel of the right image, then search for the disparity with the minimum cost, and finally obtain the right disparity map.
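  A sketch of the internal check under the same assumed (height, width, max_disparity) layout, additionally assuming a float cost volume so that missing entries can be padded with infinity:

```python
import numpy as np

def right_cost_from_left(cost_left: np.ndarray) -> np.ndarray:
    """Build the right image's cost volume from the left one via
    Cost_right(i, j, d) = Cost_left(i, j + d, d)."""
    h, w, dmax = cost_left.shape
    cost_right = np.full_like(cost_left, np.inf)
    for d in range(dmax):
        # Columns j with j + d >= w have no counterpart in the left image;
        # they keep the infinite (invalid) cost.
        cost_right[:, :w - d, d] = cost_left[:, d:, d]
    return cost_right

# The right disparity map then follows by winner-take-all:
# disp_right = np.argmin(right_cost_from_left(cost_left), axis=2)
```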

  After the right disparity map is computed, the consistency check runs as before: from the left disparity map, pixel (i_left, j_left) matches pixel (i_left, j_left − d) in the right image; if the disparity of (i_left, j_left − d) is also approximately equal to d, consistency is satisfied.

  Reading this, you may wonder: since the right image's costs are computed from the left image's costs, won't the disparities of same-name points always come out exactly equal? I think having this question is excellent — at least it shows we are thinking!!!

  This doubt really stems from being unclear about the cost-computation process: cost computation produces a cost value for a point at each candidate disparity. Let's explain with the figure below. Suppose pixel A = (i, j) in the left image matches pixel A′ = (i, j − d) in the right image. A′'s disparity relative to A is indeed d, but can d be taken as A′'s final disparity? Obviously not necessarily, because the matching cost at that disparity is not necessarily the lowest; the costs at the other disparities must be examined too, and the disparity with the minimum cost is ultimately chosen as the pixel's disparity. If you still have not got it, the fault must be my unclear wording — read it a few more times and see whether my meaning comes through. [Figure: left-image pixel A and its right-image match A′ = (i, j − d), with A′'s costs at other candidate disparities]

    

Removing small connected regions

  Removing small connected regions means eliminating tiny connected patches in the disparity map; in the image they usually appear as small, glaringly inconsistent blobs, like the white regions inside the red boxes of the figure below. [Figure: disparity map with small connected regions shown as white patches inside red boxes]

  The main idea is to compare the disparities within a connected region against those of its neighborhood; if the difference exceeds a set threshold, the region is removed, i.e. the whole block is marked as an invalid disparity region.
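  For reference, OpenCV ships a ready-made version of this idea, cv2.filterSpeckles: it groups neighboring pixels whose disparities differ by at most maxDiff into connected regions and paints regions smaller than maxSpeckleSize with newVal. A usage sketch, assuming a CV_16SC1 disparity map as produced by cv2.StereoSGBM (values scaled by 16); the particular thresholds here are illustrative:

```python
import cv2
import numpy as np

def remove_speckles(disp: np.ndarray) -> np.ndarray:
    """Invalidate small connected regions in a CV_16SC1 disparity map."""
    disp = disp.copy()
    # Arguments: image, newVal (value painted over removed speckles),
    # maxSpeckleSize (regions with at most this many pixels are removed),
    # maxDiff (neighbors within this disparity difference are grouped
    # into the same connected region). Operates in place.
    cv2.filterSpeckles(disp, -16, 100, 16)
    return disp
```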

Uniqueness check

  The uniqueness check computes, for each pixel, the minimum and the second-minimum cost; if their relative difference is below a certain threshold, the pixel is rejected. This is easy to understand with an example. When searching for a pixel's disparity, we look for the point of minimum cost in the cost matrix. Suppose the minimum cost found is 20, but there is also a second-minimum cost of 21. The two are very close, and since cost computation is affected by noise and other factors, we cannot confidently say which of these two near-identical costs corresponds to the optimal disparity. Points that are this hard to decide are therefore rejected. That is the uniqueness check of disparity.
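  A sketch of this check on the assumed cost-volume layout; the 0.95 threshold and the names are illustrative choices of mine, not values from the article:

```python
import numpy as np

INVALID = -1  # my own invalid-disparity marker

def uniqueness_check(cost_volume: np.ndarray,
                     ratio: float = 0.95) -> np.ndarray:
    """Reject pixels whose best and second-best costs are too close."""
    sorted_costs = np.sort(cost_volume, axis=2)
    best, second = sorted_costs[..., 0], sorted_costs[..., 1]
    disp = np.argmin(cost_volume, axis=2)
    # If the minimum cost is not clearly below the second minimum
    # (relative difference under the threshold), invalidate the pixel.
    disp[best > ratio * second] = INVALID
    return disp
```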

  The sections above introduced several methods for eliminating false matches; all of this serves disparity optimization. Below is one more disparity-optimization method.

 

Subpixel fitting

  The disparity map obtained from the cost matrix has whole-pixel precision, which cannot satisfy many applications. We therefore apply subpixel fitting to the disparity map to raise its precision further. Subpixel fitting? How do we fit? It sounds rather fancy, but... er... there is really nothing deep about it; you will surely understand at a glance. Let's explain with the figure below:

[Figure: matching cost of one pixel at each candidate disparity; the minimum and its two neighbors are shown as three bars]

  The figure shows a pixel's matching cost at each candidate disparity. The minimum cost is reached at disparity 18, but instead of fixing the pixel's disparity at 18 we also take the cost value on each side of it, record the three values (the three bars on the right of the figure), and fit a quadratic through them; the abscissa of the lowest point of the curve is the subpixel position of the disparity.

  Abstracting the figure above: knowing the coordinates of three points (the abscissa d and ordinate cost of the three bars), we find the extremum of a fitted quadratic curve, as shown below. [Figure: quadratic curve fitted through three (d, cost) points, with its minimum at d_sub]

  This d_sub is also easy to derive. Let the quadratic be $y = ax^2 + bx + c$; the equation has three unknowns, and we know the coordinates of three points, so the three unknowns can all be solved for. We also know that a quadratic attains its minimum at $x = -{b \over 2a}$, and that x is the value of d_sub.
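  Working the algebra out — place the origin at the integer winner d and solve for a and b from the three points — gives a closed-form offset. A small sketch:

```python
def subpixel_disparity(c_prev: float, c_min: float, c_next: float,
                       d: int) -> float:
    """Fit y = a*x^2 + b*x + c through the costs at d-1, d, d+1 and
    return the abscissa of the parabola's minimum (the subpixel disparity)."""
    denom = c_prev - 2.0 * c_min + c_next  # equals 2a of the fitted parabola
    if denom <= 0:  # flat or degenerate curve: keep the integer disparity
        return float(d)
    return d + (c_prev - c_next) / (2.0 * denom)
```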

     

  At the start of the article we said we would first eliminate those nasty false matches, turn them into invalid points, and then fill the invalid points back in. The elimination part is now essentially done, so next we introduce disparity filling.

 

Disparity filling

  Straight to the point: how should we fill the disparity? A natural thought is to fill an invalid disparity with the disparities around it. I think this method works, but it is not thorough enough. Consider a pixel p that is visible in the left view but not in the right view (i.e. it lies in an occluded region). The occluded pixel p belongs to the background, so its disparity should be fairly small (by $z = {bf \over d}$), and p should lie in a disparity-discontinuous area: foreground (large disparity) on one side, background (small disparity) on the other. Therefore, when choosing a disparity for an occluded point like p, we should prefer the disparity of background pixels, i.e. the relatively smaller disparity.

  From the analysis above, before filling we should first decide whether a point is an occluded point or an ordinary mismatched point (disparity continuous), and then apply a different strategy to each class. So how do we tell whether a pixel belongs to an occluded region? Use the following criteria:

  (1) pixel p has been judged an invalid pixel by the various optimization steps above;

  (2) the matching pixel of left-image pixel p in the right image is q = p − d; take q's value dr in the right disparity map, use dr to find q's matching point p′ in the left image, and take p′'s disparity d′; if d′ > d, then p belongs to an occluded region.

  The second criterion is a bit convoluted — read it a few more times. Here is an alternative phrasing that may help: suppose q is the same-name point found through disparity d; if there exists another pixel p′ in the left image that is also a same-name point of q and whose disparity is larger than d, then p is in an occluded region.
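  A sketch of this test, assuming `disp_left`/`disp_right` hold the raw winner-take-all disparities while a separate `invalid_mask` marks the pixels rejected by the earlier checks (all names are my own conventions, not the article's):

```python
import numpy as np

def classify_occlusions(disp_left: np.ndarray, disp_right: np.ndarray,
                        invalid_mask: np.ndarray) -> np.ndarray:
    """Return a mask of invalid pixels judged occluded; the remaining
    invalid pixels are treated as ordinary mismatches."""
    h, w = disp_left.shape
    occluded = np.zeros((h, w), dtype=bool)
    for i, j in zip(*np.nonzero(invalid_mask)):
        d = int(round(disp_left[i, j]))
        jq = j - d                          # match q = p - d in the right view
        if not (0 <= jq < w):
            continue                        # no match at all: keep as mismatch
        dr = int(round(disp_right[i, jq]))
        jp2 = jq + dr                       # q's match p' back in the left view
        if 0 <= jp2 < w and disp_left[i, jp2] > d:
            occluded[i, j] = True           # criterion (2): d' > d
    return occluded
```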

  Now that you know how to judge which points are occluded, the next step is to assign suitable disparities to the occluded points and the mismatched points. We adopt the following strategy. For a pixel in an occluded region: since its identity is a background pixel, it must not take the disparity of surrounding foreground pixels but that of surrounding background pixels; because background disparities are smaller than foreground ones, after collecting the valid disparities around the pixel we should pick one of the smaller ones. Which one exactly? The SGM author chose the second smallest. For a mismatched pixel: it does not lie in an occluded region, so the surrounding pixels are all visible and there is no occlusion-induced disparity discontinuity — it is like a small bump of noise on a continuous surface. The surrounding disparities are all comparable, and none of them deserves special preference or exclusion, so the median is the suitable choice. The formula for picking the disparity is expressed as follows:

$$D_p = \begin{cases} \text{seclo}_i\,(d_i), & p \text{ is occluded} \\ \text{med}_i\,(d_i), & p \text{ is mismatched} \end{cases}$$

  Here the $d_i$ are the nearest valid disparities collected around p (in SGM, along several directions), seclo takes the second-lowest value, and med the median.
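  A sketch of the fill step along those lines; the choice of 8 search directions and all names are my assumptions rather than the article's exact implementation:

```python
import numpy as np

INVALID = -1

# Walk outward along 8 directions to collect the nearest valid disparity
# in each, then pick the second smallest (occluded) or the median (mismatch).
DIRECTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1),
              (-1, -1), (-1, 1), (1, -1), (1, 1)]

def fill_pixel(disp: np.ndarray, i: int, j: int, occluded: bool) -> float:
    h, w = disp.shape
    candidates = []
    for di, dj in DIRECTIONS:
        y, x = i + di, j + dj
        while 0 <= y < h and 0 <= x < w:    # walk until a valid pixel is hit
            if disp[y, x] != INVALID:
                candidates.append(disp[y, x])
                break
            y, x = y + di, x + dj
    if len(candidates) < 2:
        return INVALID                      # not enough neighbors to fill
    candidates.sort()
    # Second smallest for occluded (background) pixels, median for mismatches.
    return candidates[1] if occluded else candidates[len(candidates) // 2]
```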

    Finally, compare the disparity maps before and after filling below: many of the black invalid points visible before disparity filling have disappeared afterwards. [Figure: disparity maps before and after disparity filling]

Whoosh whoosh~~duang~~ give it a like!

Reference article: ethanli.blog.csdn.net/article/det…

Previous: Cost aggregation for binocular stereo matching


Origin juejin.im/post/7082549618440929311