Paper Express: Visual Odometry Based on a Single Panoramic Camera

Title: 360VO: Visual Odometry Using a Single 360 Camera

Authors: Huajian Huang and Sai-Kit Yeung

Source: 2022 IEEE International Conference on Robotics and Automation (ICRA)

This paper presents a novel direct visual odometry algorithm for robust localization and mapping using a 360-degree camera. The system uses a spherical camera model to process equirectangular images without rectification, and extends direct sparse odometry (DSO) to achieve omnidirectional perception. After adapting the mapping and optimization algorithms to the new model, the camera parameters, including intrinsics and extrinsics, and the 3D map can be jointly optimized within a local window. Furthermore, the authors conduct qualitative and quantitative evaluations of the proposed algorithm in both real-world and large-scale simulated scenes. Extensive experiments show that the system achieves state-of-the-art results.

Figure 1 The top left shows a typical panoramic camera consisting of only two ultra-wide-angle lenses, and the top right is an example image from the 360 camera. As the lens manufacturing industry has matured, ultra-wide field-of-view lenses have become cheaper and produce higher-quality images. The bottom shows a map reconstructed by the 360VO system described in this article.

Figure 2 Overview of the 360VO system. The input to the system is a sequence of equirectangular frames. After initialization, the system tracks new frames and optimizes the relevant model parameters within a local window.

Figure 3 The coordinate system of 360VO, which uses a spherical model to represent the camera projection; the 2D images are equirectangular projections.
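The spherical model maps each equirectangular pixel to a bearing on the unit sphere and back. A minimal sketch of this back-projection and its inverse, under an assumed convention (longitude spans the image width, latitude the height, y-axis pointing down; the paper's exact axis convention may differ):

```python
import numpy as np

def pixel_to_ray(u, v, width, height):
    """Back-project an equirectangular pixel to a unit bearing on the sphere.

    Longitude spans [-pi, pi] across the image width,
    latitude spans [pi/2, -pi/2] down the image height.
    """
    lon = (u / width - 0.5) * 2.0 * np.pi
    lat = (0.5 - v / height) * np.pi
    x = np.cos(lat) * np.sin(lon)
    y = -np.sin(lat)                  # y points down in this convention
    z = np.cos(lat) * np.cos(lon)
    return np.array([x, y, z])

def ray_to_pixel(ray, width, height):
    """Project a 3D ray back to equirectangular pixel coordinates."""
    x, y, z = ray / np.linalg.norm(ray)
    lon = np.arctan2(x, z)
    lat = np.arcsin(-y)
    u = (lon / (2.0 * np.pi) + 0.5) * width
    v = (0.5 - lat / np.pi) * height
    return u, v
```

With this convention the image center maps to the forward axis (0, 0, 1), and projection and back-projection round-trip exactly for pixels away from the poles.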

Figure 4 Epipolar constraints. Candidate points are activated once they are successfully tracked and their inverse depths are refined by triangulation. The correspondence of a point in the host frame Ci lies on an epipolar curve, not a straight line, in the target frame Cj.
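Why the correspondence search follows a curve rather than a line: sweeping candidate depths along the host ray and projecting each depth hypothesis into the target frame traces out the epipolar curve in the equirectangular image. A small numerical check, with an assumed axis convention and an illustrative relative pose (R, t, and the sample ray are made up, not values from the paper):

```python
import numpy as np

def ray_to_pixel(ray, w, h):
    """Project a 3D ray to equirectangular pixel coordinates."""
    x, y, z = ray / np.linalg.norm(ray)
    lon = np.arctan2(x, z)
    lat = np.arcsin(-y)
    return np.array([(lon / (2.0 * np.pi) + 0.5) * w, (0.5 - lat / np.pi) * h])

w, h = 1920, 960
R = np.eye(3)                        # illustrative host-to-target rotation
t = np.array([0.5, 0.2, 0.0])        # illustrative host-to-target translation
ray = np.array([0.3, -0.4, 0.866])   # bearing of a pixel in the host frame Ci
ray /= np.linalg.norm(ray)

# Sweep candidate depths along the host ray, project into the target frame Cj
depths = np.linspace(0.5, 50.0, 40)
pts = np.array([ray_to_pixel(R @ (d * ray) + t, w, h) for d in depths])

# Distance of each sample from the chord through the first and last samples:
# a straight epipolar line would give zero deviation everywhere.
chord = pts[-1] - pts[0]
normal = np.array([-chord[1], chord[0]]) / np.linalg.norm(chord)
deviation = np.abs((pts - pts[0]) @ normal).max()
print(f"max deviation from a straight line: {deviation:.1f} px")
```

The deviation is several pixels even for this mild pose, which is why the matching in 360VO must walk a curve in the target image.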

Figure 5 Representative frames from several sequences of the synthetic dataset. The dataset is rendered from realistic city models and consists of 10 long video sequences.

Figure 6 Comparison of trajectories on sequence 3. The black curve represents the ground truth, the blue curve the trajectory obtained by OpenVSLAM, and the red curve the trajectory obtained by the proposed 360VO. The results show that the trajectory of 360VO is closer to the ground truth.

Figure 7 Results on the synthetic dataset. Each sequence is run 10 times to obtain the root mean square error (RMSE) of the trajectory; the number at the top of each bar is the mean RMSE. Compared with OpenVSLAM, 360VO achieves impressive results. Furthermore, the authors rectified and cropped the 360° images into perspective images with a 90° FOV and used them as input for ORB-SLAM and DSO. Clearly, the methods utilizing 360° cameras are generally more robust and accurate.
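The trajectory RMSE in Figure 7 is the standard absolute trajectory error: the estimated trajectory is first rigidly aligned to the ground truth, then per-pose position errors are averaged. A sketch of that metric, assuming a rigid alignment via the closed-form Kabsch/Umeyama solution (the paper's exact evaluation script may differ):

```python
import numpy as np

def ate_rmse(est, gt):
    """Absolute trajectory error (RMSE) after rigidly aligning est to gt.

    est, gt: (N, 3) arrays of corresponding camera positions.
    Uses the closed-form Kabsch/Umeyama alignment (rotation + translation).
    """
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    H = (est - mu_e).T @ (gt - mu_g)           # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    sign = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, sign]) @ U.T
    aligned = (R @ est.T).T + (mu_g - R @ mu_e)
    return np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1)))
```

An estimate that differs from the ground truth only by a rigid transform yields an RMSE of zero, so the metric measures drift rather than the arbitrary choice of world frame.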

Figure 8 Qualitative results of 360VO tested in an outdoor environment.

Figure 9 The blue lines represent the constraints between active keyframes in the local optimization window, and the magenta curve represents the camera trajectory. The gray sphere marks the current frame position, while the black points represent the 3D map. Since the same landmarks can be observed for a long time, the results obtained by the algorithm have better consistency and lower drift.

Figure 10 Even in narrow indoor environments with untextured floors, white walls, and dynamic objects, a 360 camera can still capture enough features across the full field of view. This inherent advantage allows 360VO to track and map successfully, whereas systems using perspective images are prone to drift. Note: the color on the image indicates the estimated depth of each point, near (red) → far (blue).

Abstract

In this paper, we propose a novel direct visual odometry algorithm to take advantage of a 360-degree camera for robust localization and mapping. Our system extends direct sparse odometry by using a spherical camera model to process equirectangular images without rectification to attain omnidirectional perception. After adapting mapping and optimization algorithms to the new model, camera parameters, including intrinsic and extrinsic parameters, and 3D mapping can be jointly optimized within the local sliding window. In addition, we evaluate the proposed algorithm using both real-world and large-scale simulated scenes for qualitative and quantitative validations. The extensive experiments indicate that our system achieves state-of-the-art results.


Origin blog.csdn.net/qq_41050642/article/details/128298740