Robust Monocular Depth Estimation under Challenging Conditions
Authors: Stefano Gasperini, Nils Morbitzer, HyunJun Jung, Nassir Navab, Federico Tombari
Translated by: Zhai Guangyao
Affiliations: Technical University of Munich, Google
Foreword
Hello everyone, my name is Stefano Gasperini, and I am here to introduce our ICCV 2023 work. More details can be found in our paper (https://arxiv.org/abs/2308.09711) and on our project website: https://md4all.github.io.
Code: https://github.com/md4all/md4all
Reply "md4all" to the CVer WeChat public account to download the PDF and code of this paper.
First, take a look at the following example.
Can you spot the tree in the color image?
Our monocular depth estimation network outputs reliable depth estimates under all conditions, even in darkness!
Background
While state-of-the-art monocular depth estimation methods achieve impressive results in ideal environments, they are highly unreliable under challenging lighting and weather conditions, such as nighttime or rainy days.
In these cases, unfavorable factors such as inherent sensor noise, textureless dark regions, and reflections violate the assumptions underlying both supervised and self-supervised training. Self-supervised methods cannot establish the pixel correspondences needed to learn depth, while supervised methods may learn flaws from the sensor ground truth (e.g., the LiDAR measurements in the nuScenes samples shown in the figure above).
Method
In this paper, we propose md4all to address these safety-critical issues. md4all is a simple and effective solution that works reliably under both adverse and ideal conditions, and it is applicable to different types of supervision.
We achieve this by exploiting the ability of existing methods to work well in ideal settings, which lets us provide an effective training signal independent of the input condition. First, through image translation, we generate a set of adverse samples corresponding to the normal training samples. We then train the network, in either a self-supervised or a fully supervised fashion, by feeding it the generated samples while computing the standard losses on the corresponding original images.
As shown in the figure above, we further distill knowledge from a pretrained baseline model that performs inference only on ideal inputs, while feeding a mixture of ideal and adverse inputs to the model being trained.
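The distillation idea above can be sketched in a few lines. This is a minimal toy illustration, not the authors' implementation: `teacher_depth`, `student_depth`, and `train_step` are hypothetical stand-ins for real depth networks and a real optimization step, and the depth functions are dummies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the pretrained baseline (teacher) and the network
# being trained (student). In md4all these are real depth networks;
# here they are dummies so the sketch runs end to end.
def teacher_depth(image):
    return image.mean(axis=-1)             # dummy per-pixel "depth"

def student_depth(image, scale):
    return image.mean(axis=-1) * scale     # dummy student prediction

def train_step(original, translated, scale, p_adverse=0.5):
    """One distillation step: the teacher always sees the ideal image,
    while the student sees either the ideal or the translated one."""
    student_input = translated if rng.random() < p_adverse else original
    target = teacher_depth(original)        # supervision from easy conditions
    pred = student_depth(student_input, scale)
    return np.abs(pred - target).mean()     # simple L1-style distillation loss

original = rng.random((4, 4, 3))            # toy ideal-condition image
translated = original * 0.2                 # e.g. a darkened "night" version
loss = train_step(original, translated, scale=1.0)
```

The key design choice mirrored here is that the loss target is always computed from the ideal-condition input, so the student learns to produce ideal-quality depth even when it is shown an adverse image.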
The implementation of the proposed method is available in our GitHub repository:
https://github.com/md4all/md4all
Results
With md4all, we substantially outperform prior solutions, providing robust estimates under a wide range of conditions. Notably, md4all uses a single monocular model without any dedicated branches.
The figure above shows predictions in challenging scenes from the nuScenes dataset. The self-supervised Monodepth2 cannot extract valuable features due to the darkness of the scene and the influence of noise (first row). The supervised AdaBins learns imperfections from the sensor data, causing holes in the predicted road (second row). Applied to the same architectures, our md4all improves robustness under both standard and adverse conditions.
In this paper, we demonstrate the effectiveness of md4all with both types of supervision, under both standard and adverse conditions. In extensive experiments on the nuScenes and Oxford RobotCar datasets, md4all significantly outperforms previous works (as shown in the table above).
Image translation
We also show examples of the translated images generated for training md4all (above). We augment the data by feeding the model a mixture of original and translated samples. The resulting model can recover information under diverse conditions without any modification at inference time.
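The mixing scheme above can be sketched as follows. This is only an illustration under stated assumptions: md4all generates its adverse images with a learned translation model, whereas `to_night` here is a crude photometric fake, and `augment` is a hypothetical helper showing how original and translated samples could be mixed.

```python
import numpy as np

def to_night(image, gamma=3.0, blue_shift=0.15):
    """Crude photometric stand-in for a learned day-to-night translation
    (only for illustration): darken via a gamma curve, add a blue cast."""
    dark = np.clip(image, 0.0, 1.0) ** gamma
    dark[..., 2] = dark[..., 2] + blue_shift * (1.0 - dark[..., 2])
    return dark

def augment(image, translate_fns, p_translated=0.5, rng=None):
    """Return either the original image or one of its adverse translations,
    so the model sees a mixture of both during training."""
    rng = rng or np.random.default_rng()
    if rng.random() < p_translated:
        fn = translate_fns[rng.integers(len(translate_fns))]
        return fn(image)
    return image

day = np.random.default_rng(1).random((2, 2, 3))    # toy RGB image in [0, 1]
sample = augment(day, [to_night], p_translated=0.5)
```

Because the augmentation happens only at training time, inference needs no change: the trained model is simply run on whatever image arrives, day or night.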
Here, we openly release all the adverse-condition images generated from the sunny and overcast samples of the nuScenes and Oxford RobotCar training sets:
https://forms.gle/31w2TvtTiVNyPb916
These images can be used to train future robust methods for depth estimation and other tasks.