[Paper Summary] Learning Inverse Depth Regression for Pixelwise Visibility-Aware Multi-View Stereo (IJCV 2022)

I. Paper Summary

1. First author: Qingshan Xu

2. Year: 2022

3. Journal: IJCV

4. Keywords: MVS, 3D reconstruction, visibility information, anti-noise training, inverse depth regression, averaged group-wise correlation

5. Motivation: visibility information is ignored by existing learning-based methods.

Therefore, visibility estimation is totally ignored in almost all networks. However, treating each source image equally makes the cost volume susceptible to noise from unrelated source images. This greatly limits the performance of learning-based methods on datasets with wide baselines, such as the ETH3D high-res benchmark. Note that there exist two concurrent works that also estimate visibility information to improve the performance of learning-based MVS. However, these two works still focus on datasets with narrow baselines, so their performance remains limited on datasets with wide baselines.

6. Objective: To make learning-based MVS methods truly feasible in practice, it is significant to learn the pixelwise visibility information of source images in deep neural networks.

7. Core idea: a combined, two-in-one version of two earlier papers, Learning Inverse Depth Regression for Multi-View Stereo with Correlation Cost Volume and PVSNet: Pixelwise Visibility-Aware Multi-View Stereo Network. The main contributions are:

  1. We propose a pixelwise visibility-aware group-wise correlation similarity measure to construct a lightweight cost volume. This measure not only allows our network to be truly applied to datasets with strong viewpoint changes, but also greatly eases the memory burden of our network. (A code sketch of this measure and the visibility-weighted aggregation follows this list.)
  2. We propose a pixelwise visibility estimation network to regress 2D visibility maps from two-view cost volumes and develop an anti-noise training strategy to train the network. The visibility maps can reflect the influence of occlusion, illumination, and unstructured viewing geometry. This allows good views to have larger weights in the final cost volume representation.
  3. We treat the multi-view depth inference problem as an inverse depth regression task and demonstrate that the inverse depth regression can reach more robust and accurate results in large-scale scenes. 
  4. We design an ordinal-based uncertainty estimation strategy for high-resolution depth map refinement. This strategy fits in the 3D reconstruction of large-scale scenes.
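
To make contributions 1 and 2 more concrete, below is a minimal PyTorch-style sketch of a group-wise correlation two-view cost volume and its pixelwise visibility-weighted aggregation. The function names, tensor shapes, and group count are illustrative assumptions, not the authors' released implementation.

```python
import torch

def groupwise_correlation(ref_feat, warped_src_feat, num_groups=8):
    """Two-view group-wise correlation cost volume.

    ref_feat:        (B, C, H, W) reference-view features
    warped_src_feat: (B, C, D, H, W) source features warped to the reference
                     view at D inverse depth hypotheses
    returns:         (B, G, D, H, W) correlation volume, G = num_groups
    """
    B, C, D, H, W = warped_src_feat.shape
    G = num_groups
    ref = ref_feat.view(B, G, C // G, 1, H, W)
    src = warped_src_feat.view(B, G, C // G, D, H, W)
    # average the channel-wise products within each group
    return (ref * src).mean(dim=2)

def visibility_weighted_aggregation(two_view_volumes, visibility_maps, eps=1e-6):
    """Aggregate per-source cost volumes with pixelwise visibility weights.

    two_view_volumes: list of (B, G, D, H, W) volumes, one per source view
    visibility_maps:  list of (B, 1, H, W) weights in [0, 1], one per source view
    """
    weighted_sum, weight_sum = 0.0, 0.0
    for vol, vis in zip(two_view_volumes, visibility_maps):
        w = vis.unsqueeze(2)            # (B, 1, 1, H, W): broadcast over G and D
        weighted_sum = weighted_sum + vol * w
        weight_sum = weight_sum + w
    return weighted_sum / (weight_sum + eps)
```

With a small number of groups, the unified volume keeps only G channels per inverse depth hypothesis, which is where the memory saving over concatenation- or variance-based cost volumes comes from.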

8. Experimental results:

Our network achieves promising reconstruction results on the DTU dataset, the Tanks and Temples dataset, and the ETH3D high-res benchmark.

9. Paper download:

Learning Inverse Depth Regression for Pixelwise Visibility-Aware Multi-View Stereo Networks | SpringerLink

II. Implementation

Overview of PVSNet

PVSNet consists of two parts: a base network and high-resolution estimation.

(a) Base network (VisCIDER): Feature maps are extracted from the reference image and the source images with a weight-sharing deep feature extraction module. The feature maps of each source image are warped into the reference image's coordinates by homographies built from uniformly sampled inverse depth values. A group-wise correlation module then constructs a two-view cost volume between the reference view and each source view, and the pixelwise visibility learning network (the red box in part (a) of the paper's overview figure) regresses a visibility map from each two-view cost volume. The multiple two-view cost volumes are further aggregated into one unified cost volume, weighted by the visibility maps. Filtering and regression on this cost volume yield the predicted depth map.
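
The homography-based warping described above can be written compactly with differentiable grid sampling. The sketch below, assuming pinhole intrinsics and the relative pose (R, t) from the reference to the source camera, back-projects every reference pixel at each uniformly sampled inverse depth, projects it into the source view, and bilinearly samples the source features. All names and shapes here are assumptions for illustration, not the paper's code.

```python
import torch
import torch.nn.functional as F

def warp_src_features(src_feat, K_ref, K_src, R, t, inv_depths):
    """Warp source features into the reference view at each inverse depth.

    src_feat:     (B, C, H, W) source feature map
    K_ref, K_src: (B, 3, 3) intrinsics of the reference / source cameras
    R, t:         (B, 3, 3), (B, 3, 1) relative pose from reference to source
    inv_depths:   (D,) uniformly sampled inverse depth hypotheses
    returns:      (B, C, D, H, W) warped feature volume
    """
    B, C, H, W = src_feat.shape
    device = src_feat.device

    # pixel grid of the reference view in homogeneous coordinates
    y, x = torch.meshgrid(torch.arange(H, device=device, dtype=torch.float32),
                          torch.arange(W, device=device, dtype=torch.float32),
                          indexing="ij")
    ones = torch.ones_like(x)
    pix = torch.stack([x, y, ones], dim=0).view(1, 3, -1).expand(B, 3, H * W)

    rays = torch.inverse(K_ref) @ pix                        # (B, 3, H*W)
    warped = []
    for inv_d in inv_depths:
        # back-project at depth 1/inv_d, transform to the source frame, project
        pts = rays / inv_d                                   # (B, 3, H*W)
        pts_src = K_src @ (R @ pts + t)                      # (B, 3, H*W)
        uv = pts_src[:, :2] / pts_src[:, 2:3].clamp(min=1e-6)
        # normalize to [-1, 1] for grid_sample
        u = 2.0 * uv[:, 0] / (W - 1) - 1.0
        v = 2.0 * uv[:, 1] / (H - 1) - 1.0
        grid = torch.stack([u, v], dim=-1).view(B, H, W, 2)
        warped.append(F.grid_sample(src_feat, grid, align_corners=True))
    return torch.stack(warped, dim=2)                        # (B, C, D, H, W)
```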

(b) High-resolution estimation: To produce adaptive inverse depth hypotheses, an ordinal-based uncertainty is computed from the probability volume obtained at the previous stage. With these adaptive inverse depth values, thin two-view cost volumes are built via homography warping and group-wise correlation. The multiple two-view cost volumes are aggregated into one unified cost volume, weighted by the upsampled visibility maps obtained at the previous scale. The high-resolution depth map of the reference image is then generated by a 3D U-Net and inverse depth regression. This process is iterated until a depth map with the same resolution as the reference image is obtained.
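
The two key operations of this stage can be sketched as follows: soft-argmax regression over inverse depth hypotheses (contribution 3), and per-pixel generation of a narrower hypothesis range for the next scale. Here the standard deviation of the probability distribution stands in for the paper's ordinal-based uncertainty, whose exact formulation differs; all function names and shapes are illustrative assumptions.

```python
import torch

def inverse_depth_regression(prob_volume, inv_depth_values):
    """Soft-argmax regression over inverse depth hypotheses.

    prob_volume:      (B, D, H, W) softmax-normalized probability volume
    inv_depth_values: (D,) or (B, D, H, W) inverse depth hypotheses
    returns:          (B, H, W) regressed inverse depth map
    """
    if inv_depth_values.dim() == 1:
        inv_depth_values = inv_depth_values.view(1, -1, 1, 1)
    return torch.sum(prob_volume * inv_depth_values, dim=1)

def adaptive_inverse_depth_hypotheses(prob_volume, inv_depth_values,
                                      num_new=8, scale=1.0):
    """Build a narrower per-pixel hypothesis range for the next (finer) stage.

    The per-pixel spread of the probability volume (its standard deviation,
    standing in for the paper's ordinal-based uncertainty) sets the width of
    the new search interval around the current estimate.
    """
    if inv_depth_values.dim() == 1:
        inv_depth_values = inv_depth_values.view(1, -1, 1, 1)
    mean = torch.sum(prob_volume * inv_depth_values, dim=1, keepdim=True)   # (B, 1, H, W)
    var = torch.sum(prob_volume * (inv_depth_values - mean) ** 2, dim=1, keepdim=True)
    sigma = scale * torch.sqrt(var.clamp(min=1e-10))
    # num_new uniformly spaced hypotheses inside [mean - sigma, mean + sigma]
    steps = torch.linspace(-1.0, 1.0, num_new,
                           device=prob_volume.device).view(1, num_new, 1, 1)
    return mean + steps * sigma                                             # (B, num_new, H, W)
```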

For a detailed walkthrough of the architecture, see: https://zhuanlan.zhihu.com/p/558191511


Reposted from blog.csdn.net/qq_43307074/article/details/129618261