Overview | A comprehensive review on point cloud registration (1)

Original | Text by BFT Robot 


01

Summary

Registration is the problem of estimating the transformation between two point clouds and plays a unique and critical role in numerous computer vision applications. The development of optimization-based methods and deep learning methods has improved the robustness and efficiency of registration, and performance has recently been further improved by combining the two. However, the connection between optimization-based methods and deep learning methods remains unclear. In addition, with the development of 3D sensors and 3D reconstruction technology, a new research direction has emerged: cross-source point cloud registration. This review conducts a comprehensive survey covering same-source (homologous) and cross-source registration methods, and summarizes the connections between optimization-based and deep learning methods to provide further research insights. It also establishes a new benchmark for evaluating state-of-the-art registration algorithms against the cross-source challenges. Furthermore, it summarizes benchmark datasets and discusses point cloud registration applications across different domains. Finally, it suggests potential research directions in this rapidly growing field.

02

Contributions from the authors of this article

Comprehensive survey: We provide the most comprehensive overview of same-source point cloud registration, covering both traditional optimization-based methods and modern deep learning methods (1992-2021). We summarize the challenges and analyze the advantages and limitations of each type of registration method. In addition, we summarize the connections between traditional optimization methods and modern deep learning methods.

A survey of cross-source registration: We provide the first literature review on cross-source point cloud registration. This investigation provides insights into data fusion research from different 3D sensors such as Kinect and Lidar.

New comparison: We construct a novel cross-source point cloud benchmark and evaluate and compare the performance of existing state-of-the-art registration algorithms on it. This provides guidance for choosing registration methods in cross-source point cloud applications and for developing new registration methods.

Applications and future directions: We summarize potential applications of point cloud registration and explore research directions for practical applications. In addition, we propose possible future research directions and open issues in the field of point cloud registration.

03

Challenges in registration

3.1 Same-source challenges

Since the point clouds are captured by the same type of sensor but at different times or from different viewpoints, the registration problem involves the following challenges:

• Noise and outliers: Environmental and sensor noise will vary at different acquisition times, and the captured point cloud will contain noise and outliers around the same 3D location.

• Partial overlap: Due to different viewpoints and acquisition times, the captured point clouds only partially overlap.

3.2 Cross-source challenges

Since the point clouds are captured by different types of sensors, which rely on different imaging mechanisms, the cross-source challenges in the registration problem are much more complex than the same-source challenges. They can mainly be divided into:

• Noise and outliers: Due to different acquisition environments, sensor noise, and sensor imaging mechanisms at different acquisition times, the captured point clouds will contain noise and outliers around the same 3D position.

• Partial overlap: Due to different viewpoints and acquisition times, the captured point clouds only partially overlap.

• Density differences: Due to different imaging mechanisms and different resolutions, captured point clouds often contain different densities.

• Scale variation: Since different imaging mechanisms may have different physical measurements, the captured point cloud may contain scale differences.

In this paper, we provide a comprehensive review of point cloud registration and establish a new cross-source point cloud benchmark to evaluate the performance of state-of-the-art registration methods in solving these challenges.


Figure 1 Categories of point cloud registration

04

Categories of point cloud registration

This section introduces our classification of point cloud registration, as shown in Figure 1. We divide point cloud registration into two types: same-source registration and cross-source registration. Same-source registration methods can be further divided into optimization-based methods, feature-learning-based methods, and end-to-end-learning-based methods. Below we briefly introduce each category and analyze its advantages and limitations.

4.1 Optimization-based registration method

Optimization-based registration uses an optimization strategy to estimate the transformation matrix. Most methods in this category consist of two stages: correspondence search and transformation estimation. Figure (2a) summarizes the main pipeline of this class. Correspondence search finds, for each point in one point cloud, its matching point in the other point cloud; transformation estimation then uses these correspondences to estimate the transformation matrix. The two stages are performed iteratively to find the optimal transformation: the initial correspondences may be inaccurate, but as the iterations proceed they become increasingly accurate, and with accurate correspondences the estimated transformation matrix also becomes accurate. Correspondences can be found by comparing either the coordinate differences or the feature differences between points.
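To make this two-stage loop concrete, below is a minimal ICP-style sketch in Python (NumPy + SciPy): nearest-neighbour correspondence search followed by a closed-form SVD (Kabsch) transformation estimate, repeated until the mean correspondence distance stops changing. The function names, iteration count, and convergence threshold are illustrative choices for the sketch, not taken from the survey.

```python
import numpy as np
from scipy.spatial import cKDTree

def estimate_rigid_transform(src, dst):
    """Closed-form (Kabsch/SVD) rigid transform mapping src onto dst."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    U, _, Vt = np.linalg.svd((src - src_c).T @ (dst - dst_c))  # cross-covariance
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                      # avoid a reflection
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    return R, dst_c - R @ src_c

def icp(source, target, max_iter=50, tol=1e-6):
    """Minimal ICP loop: correspondence search + transformation estimation."""
    tree = cKDTree(target)
    R_total, t_total = np.eye(3), np.zeros(3)
    current = source.copy()
    prev_err = np.inf
    for _ in range(max_iter):
        # stage 1: correspondence search (nearest target point for every source point)
        dist, idx = tree.query(current)
        # stage 2: transformation estimation from the current correspondences
        R, t = estimate_rigid_transform(current, target[idx])
        current = current @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
        err = dist.mean()
        if abs(prev_err - err) < tol:             # correspondences have stabilised
            break
        prev_err = err
    return R_total, t_total
```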

This category has two advantages: 1) rigorous mathematical theories can guarantee convergence; 2) no training data is required, so the methods generalize well to unseen scenarios. Its limitation is that many sophisticated strategies are needed to overcome noise, outliers, density differences, and partial overlap, which increases the computational cost.


Figure (2a) Point cloud registration framework based on optimization

4.2 Registration method based on feature learning

Feature-learning-based registration methods differ from classical optimization-based methods in that they use deep neural networks to learn robust features for the correspondence search. The transformation matrix is then estimated in a single step (e.g. with RANSAC), without iteration. Figure (2b) summarizes the main pipeline of this category. For example, AlexNet has been used to learn 3D features from an RGB-D dataset; local PPF features built from the distribution of neighboring points have been fed into a network for deep feature learning; and [35] proposed a rotation-invariant handcrafted feature that is fed into a deep neural network for feature learning. All of these methods use deep learning purely as a feature extraction tool: by designing sophisticated network architectures or loss functions, they aim to estimate robust correspondences from the learned distinctive features.
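To illustrate the one-step estimation stage, the following NumPy sketch assumes per-point descriptors have already been produced by some feature network (not shown here); points are matched by nearest neighbour in feature space, and a small RANSAC loop over 3-point samples then picks the rigid transform with the most inliers. All names and thresholds are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def kabsch(src, dst):
    """Closed-form rigid transform (R, t) mapping src onto dst."""
    sc, dc = src.mean(axis=0), dst.mean(axis=0)
    U, _, Vt = np.linalg.svd((src - sc).T @ (dst - dc))
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    return R, dc - R @ sc

def match_and_ransac(src_pts, dst_pts, src_feat, dst_feat,
                     iters=1000, inlier_thresh=0.05, seed=0):
    """Feature-space nearest-neighbour matching followed by RANSAC pose estimation."""
    rng = np.random.default_rng(seed)
    # correspondence search in the learned feature space (not in coordinates)
    _, idx = cKDTree(dst_feat).query(src_feat)
    matched_dst = dst_pts[idx]
    best = (np.eye(3), np.zeros(3), 0)
    for _ in range(iters):
        sample = rng.choice(len(src_pts), size=3, replace=False)
        R, t = kabsch(src_pts[sample], matched_dst[sample])
        residuals = np.linalg.norm(src_pts @ R.T + t - matched_dst, axis=1)
        inliers = int((residuals < inlier_thresh).sum())
        if inliers > best[2]:
            best = (R, t, inliers)
    return best[0], best[1]
```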

The advantages of this category are twofold: 1) deep-learning-based point features provide robust and accurate correspondence search; 2) with accurate correspondences, accurate registration results can be obtained using the simple RANSAC method.

The limitations of this category lie in three aspects: 1) it requires a large amount of training data; 2) if the distribution of the test scene differs greatly from that of the training data, registration performance drops sharply; 3) the feature extraction network is trained separately, so the learned features are optimized for point-to-point matching rather than for the registration task itself.


Figure (2b) Point cloud registration framework based on feature learning

4.3 Registration method based on end-to-end learning

End-to-end learning-based methods solve the registration problem with a single end-to-end neural network: the input is two point clouds and the output is the transformation matrix that aligns them. Unlike the feature-learning methods above, where the network optimization is separate from the transformation estimation and focuses on point feature learning, this category embeds the transformation estimation into the network optimization itself. Figure (2c) summarizes the main pipeline of this category. The basic idea is to cast registration as a regression problem. For example, [109] learns a feature from the point clouds to be aligned and then regresses the transformation parameters from that feature. [97] proposed a registration network that formulates the correlation between the source and target point sets and uses this correlation to predict the transformation. [27] proposed an auto-encoder registration network for localization that combines super-point extraction with unsupervised feature learning. [64] proposed a keypoint detection method that simultaneously estimates the relative pose. FMR [40] proposed a feature-metric registration method that reformulates registration from minimizing a point-to-point projection error to minimizing a feature difference; it is a pioneering work in feature-metric registration that combines deep learning with the classical Lucas-Kanade optimization.
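As a rough illustration of casting registration as a regression problem (not the architecture of any specific paper cited above), the PyTorch sketch below pools per-point features from both clouds into global descriptors and regresses a quaternion plus a translation, so the transformation estimate is part of the network itself. Layer sizes and the quaternion parameterisation are assumptions made for the sketch.

```python
import torch
import torch.nn as nn

class RegressionRegNet(nn.Module):
    """Toy end-to-end registration network: per-point features -> pooled descriptor -> pose."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(            # shared per-point MLP (PointNet-style)
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        self.head = nn.Sequential(               # regress the pose from both descriptors
            nn.Linear(2 * feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 7),                   # 4 quaternion + 3 translation parameters
        )

    def forward(self, source, target):
        # source, target: (B, N, 3) point clouds
        f_src = self.encoder(source).max(dim=1).values   # global max-pooled descriptors
        f_tgt = self.encoder(target).max(dim=1).values
        pose = self.head(torch.cat([f_src, f_tgt], dim=-1))
        quat = nn.functional.normalize(pose[:, :4], dim=-1)  # unit quaternion (rotation)
        return quat, pose[:, 4:]                              # rotation, translation

# usage sketch: predict a pose for a batch of random clouds
quat, trans = RegressionRegNet()(torch.rand(2, 1024, 3), torch.rand(2, 1024, 3))
```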

The advantages of this category are twofold: 1) the neural network is specifically designed and optimized for the registration task; 2) it can exploit both traditional mathematical theories and deep neural networks.

The limitations of current methods are twofold: 1) regression-based methods treat transformation parameter estimation as a black box, and the distance metric is measured in a coordinate-based Euclidean space that is sensitive to noise and density differences; 2) feature-metric registration methods do not consider local structural information, which is very important for registration.


Figure (2c) Point cloud registration framework based on end-to-end learning

4.4 Cross-source point cloud registration methods

Cross-source point cloud registration is the alignment of point clouds from different types of sensors, such as Kinect and Lidar.

Cross-source point cloud registration is more challenging because large amounts of noise and outliers, density differences, partial overlap, and scale differences occur in combination. Several algorithms use sophisticated optimization strategies to solve the cross-source registration problem by overcoming these challenges. For example, CSGM transforms registration into a graph-matching problem and uses graph-matching theory to overcome the cross-source challenges. Recently, FMR demonstrated that deep learning can be used to align cross-source point clouds. These methods strive to overcome the cross-source challenges by using either optimization strategies or deep neural networks to estimate the transformation matrix.
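In practice, the density and scale gaps are often reduced by simple preprocessing before any registration method is applied. The NumPy sketch below shows generic voxel downsampling to a common resolution and centroid/scale normalisation; it is illustrative preprocessing under our own assumptions, not a step of CSGM or FMR.

```python
import numpy as np

def voxel_downsample(points, voxel_size):
    """Keep the centroid of the points falling in each occupied voxel to equalise density."""
    voxel_ids = np.floor(points / voxel_size).astype(np.int64)
    buckets = {}
    for point, vid in zip(points, map(tuple, voxel_ids)):
        buckets.setdefault(vid, []).append(point)
    return np.array([np.mean(b, axis=0) for b in buckets.values()])

def normalize_scale(points):
    """Centre the cloud and rescale so the mean distance to the centroid is 1."""
    centred = points - points.mean(axis=0)
    scale = np.linalg.norm(centred, axis=1).mean()
    return centred / scale, scale

# e.g. bring a dense Kinect cloud and a sparse Lidar cloud to comparable density and scale
# kinect_ds, _ = normalize_scale(voxel_downsample(kinect_points, voxel_size=0.05))
# lidar_ds, _  = normalize_scale(voxel_downsample(lidar_points, voxel_size=0.05))
```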

The benefit of cross-source point cloud registration is that it combines the strengths of multiple sensors to provide comprehensive 3D visual information for many computer vision tasks, such as augmented reality and building construction. However, existing registration methods suffer from low accuracy and high time complexity and are still in their infancy. With the rapid development of 3D sensor technology in recent years, the lack of cross-source registration research has created a gap between sensor technology and cross-source applications.


Figure (2d) Cross-source point cloud registration framework

Author | Jiangcheng

Typesetting | Xiaohe

Review | Orange

If you have any questions about the content of this article, please contact us and we will respond promptly. If you want to know more cutting-edge information, remember to like and follow~
