An in-depth study of the rectifying homography for online self-calibration of stereo cameras

Paper: Dive Deeper into Rectifying Homography for Stereo Camera Online Self-Calibration

Authors: Hongbo Zhao, Yikang Zhang, Qijun Chen, and Rui Fan

Editor: Point Cloud PCL

Welcome to join Knowledge Planet to get PDF copies of papers and share them with your friends. This article is for academic sharing only; if there is any infringement, please contact us to have it deleted.

This public account is dedicated to sharing content on point cloud processing, SLAM, 3D vision, high-precision maps, and related fields. Everyone is welcome to join; if you are interested, please contact [email protected]. For infringement or reprint issues, please contact WeChat cloudpoint9527.

Abstract

Accurate estimation of stereo camera extrinsic parameters is key to the performance of stereo matching algorithms. In previous studies, online self-calibration of stereo cameras has usually been formulated as a specialized visual odometry problem, without considering the principles of stereo rectification. This paper provides the first in-depth exploration of the concept of the rectifying homography, which is the cornerstone of our novel online self-calibration algorithm for stereo cameras, applicable when only a single pair of images is available. Furthermore, a simple yet effective solution is introduced for globally optimal extrinsic parameter estimation when a stereo video sequence is available. We also highlight the impracticality of quantifying performance with the three Euler angles and three translation vector components; instead, we introduce four new evaluation metrics to quantify the robustness and accuracy of extrinsic parameter estimation, applicable to both single-pair and multi-pair cases. The effectiveness of our proposed algorithm is demonstrated through extensive experiments conducted in indoor and outdoor environments under various experimental settings; comprehensive evaluation results show that it outperforms the baseline algorithm.

Main contributions

This paper demonstrates the effectiveness of our proposed algorithm under different extrinsic parameters through extensive experiments on a large-scale dataset containing more than 10,000 pairs of real stereo images from indoor and outdoor environments, as well as on two public datasets, KITTI and Middlebury, perturbed in different directions. The superior performance of our algorithm relative to baseline algorithms is demonstrated. Overall, our contributions are as follows:

• A novel rectifying-homography-based online self-calibration algorithm for stereo cameras in the single-pair case;

• A simple yet effective globally optimal extrinsic parameter estimation solution for cases where multiple pairs of stereo images are available;

• Four practical evaluation metrics designed to comprehensively quantify the performance of stereo camera online self-calibration;

• Extensive experiments conducted in various indoor and outdoor environments under different experimental setups.

Content overview

Rectifying homography

"Rectifying homography" is a concept in stereovision that is used to correct stereoscopic images so that they meet specific geometric relationships. In binocular vision, by using two cameras to capture the same scene simultaneously, objects in the image may exhibit certain perspective distortion and geometric differences. The goal of correcting homography is to reproject the left and right camera images onto a common plane in order to simplify stereo matching and depth estimation. This plane is usually a plane parallel to the camera's baseline. "Rectifying homography" is a homogeneous matrix that describes the geometric relationship between the left and right camera images so that they meet the condition of parallelism after correction. By applying rectified homography to the left and right images, pixels in the same row can be made to have the same ordinate in both images, thus simplifying the matching problem. In stereovision, this correction is very important to improve the effectiveness and accuracy of stereo matching algorithms, as it helps eliminate perspective distortion in the image, making the matching more reliable and accurate.

Energy function in optimization

"Energy function" is a common concept in computer vision and optimization problems. In the text, it refers specifically to the mathematical function used to describe the optimization objective of the problem. For "Energy function and its solution for single-pair cases", this refers to the energy function used to describe a certain problem and the method to solve the function for a single image pair. The issues mentioned in the article are about camera calibration or image correction in binocular stereo vision. An energy function usually consists of a model representing the objective, as well as parameters that need to be adjusted to minimize or maximize the function. In optimization, by changing these parameters, the optimal solution can be achieved so that the energy function obtains the minimum or maximum value. For the case of a single image pair, there is usually an energy function that is related to the specific geometric relationships and calibration parameters of the images. The goal of solving this energy function is to find parameter values ​​that minimize or maximize the function to obtain the optimal solution to the problem.

Global optimization

This refers to the step, or method, for global optimization in the case of multiple image pairs. Unlike local optimization, which only considers the neighborhood of the current parameter values, global optimization considers the entire parameter space. With multiple image pairs, camera configurations and scene geometry can be more complex, so the parameters of all image pairs need to be optimized jointly to obtain more accurate and robust results. This involves globally adjusting parameters such as the camera's extrinsics (rotation matrix and translation vector) to achieve the best reconstruction or calibration of the whole system. Solving such problems usually requires more sophisticated mathematical and computer vision techniques, including global optimization algorithms and nonlinear optimization methods, whose goal is to find a globally optimal solution by accounting for the relationships among all image pairs, achieving better overall performance.
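Since the article notes the implementation uses Ceres, the hedged sketch below shows one way such a joint refinement could be structured: all pairs contribute residual blocks to a single Ceres problem that shares one parameter block. For brevity it optimizes only the relative rotation with a simplified row-alignment residual; Match, refineGlobally, and the intrinsics fx/fy/cx/cy are illustrative, not the authors' code:

```cpp
// Hedged sketch: joint refinement of a shared rotation over many stereo
// pairs with Ceres. The residual is a simplified stand-in for the paper's
// energy term, not its exact formulation.
#include <ceres/ceres.h>
#include <ceres/rotation.h>
#include <vector>

struct Match { double ul, vl, ur, vr; };  // one left/right correspondence

struct RowAlignmentResidual {
  RowAlignmentResidual(const Match& m, double fx, double fy, double cx, double cy)
      : m_(m), fx_(fx), fy_(fy), cx_(cx), cy_(cy) {}

  template <typename T>
  bool operator()(const T* const angle_axis, T* residual) const {
    // Back-project the right pixel to a ray, rotate it by the candidate
    // relative rotation, reproject, and penalize the row mismatch.
    T ray[3] = {T((m_.ur - cx_) / fx_), T((m_.vr - cy_) / fy_), T(1.0)};
    T rot[3];
    ceres::AngleAxisRotatePoint(angle_axis, ray, rot);
    residual[0] = T(m_.vl) - (T(fy_) * rot[1] / rot[2] + T(cy_));
    return true;
  }

  Match m_;
  double fx_, fy_, cx_, cy_;
};

void refineGlobally(const std::vector<Match>& matches,  // pooled from all pairs
                    double fx, double fy, double cx, double cy,
                    double angle_axis[3]) {
  ceres::Problem problem;
  for (const Match& m : matches) {
    problem.AddResidualBlock(
        new ceres::AutoDiffCostFunction<RowAlignmentResidual, 1, 3>(
            new RowAlignmentResidual(m, fx, fy, cx, cy)),
        new ceres::HuberLoss(1.0),  // robustness against outlier matches
        angle_axis);                // one block shared by every pair
  }
  ceres::Solver::Options options;
  ceres::Solver::Summary summary;
  ceres::Solve(options, &problem, &summary);
}
```

Sharing a single parameter block across all residuals is what makes the estimate global: every pair constrains the same extrinsics, so the solver balances all pairs jointly instead of fitting each one in isolation.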

Experiments

Two MindVision MV-SUA202GC global-shutter CMOS cameras were used in our experiments to collect data in indoor and outdoor environments. Camera hardware synchronization was achieved using a 20 Hz synchronization signal provided by an FPGA, combined with an external 24 V power supply. To comprehensively evaluate the performance of our algorithm, experiments were conducted with the left camera mounted at five different viewpoints (middle, top, bottom, left, and right), as shown in Figure 1.


Figure 1: Experimental setup with the left camera mounted at five different viewpoints.

Two public stereo matching datasets, KITTI and Middlebury, were used to further quantify the performance of our algorithm; four additional viewpoints (top, bottom, left, and right) were created manually by rotating the camera pose by 10 degrees, as sketched below. Our algorithm is implemented in C++ using the OpenCV, Sophus, Eigen, and Ceres libraries.
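A minimal Eigen sketch of how such perturbed viewpoints might be generated; the rotation axis and composition order used by the authors are not specified, so treat both as assumptions:

```cpp
// Hedged sketch: perturb ground-truth extrinsics by a 10-degree rotation to
// emulate the manually created top/bottom/left/right viewpoints.
#include <Eigen/Geometry>

Eigen::Matrix3d perturbRotation(const Eigen::Matrix3d& R_gt,
                                const Eigen::Vector3d& axis,  // e.g. unit x or y
                                double degrees = 10.0) {
  const double kPi = 3.14159265358979323846;
  // Compose a small rotation about the chosen axis with the ground truth.
  return Eigen::AngleAxisd(degrees * kPi / 180.0, axis.normalized())
             .toRotationMatrix() * R_gt;
}
```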


Quantitative experiments were conducted on the large-scale dataset we created; the results are shown in Table I and Figure 2.


Figure 2: Comparison of [3] and our proposed algorithm on the large-scale dataset we created.

Our algorithm shows higher accuracy in the multi-pair case, especially in the estimation of t* and θ*. The quantitative results on the KITTI 2015 dataset, shown in Table II and Figure 3, are consistent with the outdoor experiments above. We believe this may be because the image quality in KITTI 2015 is slightly higher than in our dataset and is less affected by motion blur, allowing both algorithms to achieve relatively stable results. Since moving vehicles usually have negligible yaw angles unless they are turning, the estimation of the rotation vector is relatively stable and accurate. From a comprehensive analysis of our algorithm's performance on both our own dataset and KITTI 2015, we conclude that it is less sensitive to image quality and can provide a feasible solution even when images suffer from motion-induced blur.


Figure 3: Comparison of [3] and our proposed algorithm on KITTI and Middlebury datasets.

Experimental results on the Middlebury dataset further support our view of the algorithm's performance in both static and dynamic environments. Our algorithm reduces eθ and et by 35.62% and 66.04% on average, while the σθ and σt we obtain are comparable to the results of [3].
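For reference, the sketch below computes common rotation and translation error measures between estimated and ground-truth extrinsics. The paper defines its own four metrics, so these formulas should be read as conventional stand-ins rather than the authors' exact definitions of eθ, et, σθ, and σt:

```cpp
// Hedged sketch: conventional extrinsic error measures in C++ with Eigen.
#include <Eigen/Geometry>
#include <algorithm>
#include <cmath>

const double kRad2Deg = 180.0 / 3.14159265358979323846;

// Geodesic rotation error (degrees) between estimate and ground truth.
double rotationErrorDeg(const Eigen::Matrix3d& R_est, const Eigen::Matrix3d& R_gt) {
  const double c = ((R_est * R_gt.transpose()).trace() - 1.0) / 2.0;
  return std::acos(std::clamp(c, -1.0, 1.0)) * kRad2Deg;
}

// Angular error (degrees) between translation directions; stereo translation
// is typically compared up to scale.
double translationErrorDeg(const Eigen::Vector3d& t_est, const Eigen::Vector3d& t_gt) {
  const double c = t_est.normalized().dot(t_gt.normalized());
  return std::acos(std::clamp(c, -1.0, 1.0)) * kRad2Deg;
}
```

The σθ and σt reported above would then be the standard deviations of such errors over repeated trials.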


Figure 4: Qualitative disparity estimation results: (a) left image; (b) disparity map estimated from unrectified stereo images; (c) disparity map estimated from stereo images rectified with extrinsic parameters estimated by the Ling and Shen algorithm [3]; (d) disparity map estimated from stereo images rectified with extrinsic parameters estimated by our proposed algorithm.

As shown in Figure 4, the disparity map estimated from unrectified stereo images is of poor quality, whereas the disparity map estimated from stereo images self-calibrated and rectified with our proposed algorithm is more accurate, with fewer erroneous regions; compared with the baseline algorithm [3], the resulting disparity map is significantly improved.

Summary

This paper makes two important algorithmic contributions: (1) an online self-calibration algorithm for stereo cameras in the single-pair case, built on the principle of stereo rectification; (2) an efficient and effective global optimization algorithm for extrinsic parameter estimation when multiple pairs of stereo images are available. Furthermore, the paper introduces four new practical evaluation metrics for quantifying the robustness and accuracy of extrinsic parameter estimation, applicable to both single-pair and multi-pair cases. Through comprehensive experiments on our newly created indoor and outdoor datasets as well as two public datasets, we demonstrate that our algorithm significantly outperforms the state of the art. With further optimization of its efficiency, we are confident the algorithm can be integrated into practical stereo vision systems to provide robust three-dimensional information for autonomous robots.

References

[3] Y. Ling and S. Shen, “High-precision online markerless stereo extrinsic calibration,” in 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2016, pp. 1771–1778.

Resources

Articles on autonomous driving and localization

[Quick Reading of Point Cloud Papers] Lidar-based odometry and localization in 3D point cloud maps

Moving object detection based on optical flow in autonomous driving

Camera extrinsic parameter calibration based on semantic segmentation

Review: Introduction to theoretical models and perception of panoramic fisheye cameras for autonomous driving

A review of autonomous vehicle positioning methods in high-speed scenarios

Patchwork++: A fast and robust ground segmentation method based on point clouds

PaGO-LOAM: Ground-based optimized lidar odometry

Multimodal road edge detection and filtering method

A framework for simultaneous calibration, positioning and mapping of multiple lidars

Extraction, mapping and long-term positioning of rods in dynamic urban environments

Motion distortion correction of non-repetitive scanning lidar

Fast and tightly coupled sparse direct lidar-inertial-visual odometry

3D vehicle detection based on cameras and low-resolution lidar

Annotation tools and city datasets for 3D point cloud semantic segmentation

Basic introduction to getting started with ROS2

Automatic calibration of solid-state lidar and camera systems

Sensor fusion positioning solution of lidar+GPS+IMU+wheel speedometer

Mapping and positioning of road scenes based on sparse semantic visual features

Real-time detection of vehicle roads and sidewalks based on lidar in autonomous driving (code open source)


More articles can be viewed: A summary of historical articles on point cloud learning

Articles on SLAM and AR

Introduction to TOF camera principles

Introduction to TOF time-of-flight depth camera

Structured PLP-SLAM: an efficient sparse mapping and positioning solution for monocular, RGB-D and binocular cameras using points, lines and surfaces

Open source and optimized F-LOAM solution: based on optimized SC-F-LOAM

[Paper Quick Reading] AVP-SLAM: Semantic SLAM in automatic parking systems

[Quick Reading of Point Cloud Papers] StructSLAM: Structured Line Feature SLAM

SLAM and AR Overview

Commonly used 3D depth cameras

Review and evaluation of monocular visual inertial navigation SLAM algorithm for AR equipment

SLAM Overview (4) Laser and Vision Fusion SLAM

Semantic SLAM system for real-time reconstruction by Kimera

Easy-to-extend SLAM framework-OpenVSLAM

Introduction to SLAM method based on fisheye camera

If there are any errors in the above content, please leave a comment; corrections and exchanges are welcome. If there is any infringement, please contact us to have it deleted.

Let's share and learn together! We look forward to friends with ideas who are willing to share joining Knowledge Planet and injecting fresh vitality into the community. Topics include, but are not limited to, 3D vision, point clouds, high-precision maps, autonomous driving, and robotics.

Sharing and cooperation: WeChat "cloudpoint9527" (note: name + school/company + research direction). Contact email: [email protected].
