Paper Interpretation|Lepard: Learning Partial Point Cloud Matching in Rigid and Deformable Scenes

Original | Written by BFT Robot

01

Background

Point cloud matching and registration have widespread applications in computer vision and robotics, where they are critical to the success of many tasks such as 3D modeling, robot navigation, and virtual reality.

The goal of point cloud matching and registration is to align two or more point clouds for subsequent processing and analysis. In practical applications, point clouds may be affected by noise, occlusion, non-rigid deformation, etc., which makes point cloud matching and registration more difficult.

Therefore, studying how to achieve efficient and accurate point cloud matching and registration in these complex situations is an important research direction.

02

Innovation

1. A new point cloud matching and registration method, Lepard, is proposed. It exploits the 3D positions of points and performs feature extraction and matching through self-attention and cross-attention mechanisms.

2. Two new point cloud matching benchmarks, 4DMatch and 4DLoMatch, are introduced. Both contain non-rigidly deforming point clouds and are valuable for research on matching and registration under deformation.

3. The superiority of Lepard is verified experimentally on multiple point cloud matching and registration benchmarks, including 3DMatch, 3DLoMatch, 4DMatch, and 4DLoMatch, where it achieves state-of-the-art results in both the rigid and the non-rigid case.

4. A new position encoding for point clouds is proposed that explicitly exposes the 3D relative distance between points through the dot product of the encoded vectors, improving the accuracy and robustness of matching and registration.

03

Algorithm introduction

The input point cloud is represented as a matrix in which each row holds the coordinates and feature vector of one point. This matrix is fed into a self-attention mechanism to extract refined per-point feature vectors.
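As a rough illustration of this step (not the paper's exact network), a single self-attention layer over per-point features could look like the sketch below; the tensor shapes, hyper-parameters, and the use of torch.nn.MultiheadAttention are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Minimal sketch (assumed shapes, not the paper's exact architecture).
class PointSelfAttention(nn.Module):
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, feats):                  # feats: (B, N, d_model)
        # Every point attends to every other point in the same cloud,
        # aggregating global context into its feature vector.
        out, _ = self.attn(feats, feats, feats)
        return self.norm(feats + out)          # residual + layer norm

feats = torch.randn(1, 1024, 256)              # one cloud with 1024 points
refined = PointSelfAttention()(feats)          # (1, 1024, 256)
```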

Position information is then encoded into the feature vectors so that the 3D relative distance between points is explicitly revealed by the dot product of the encoded vectors. The encoded features of the two point clouds are fed into a cross-attention mechanism, which is used to compute a similarity matrix between them.
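The paper's full position encoding is more involved, but the minimal sketch below captures the dot-product idea under stated assumptions: each coordinate axis rotates one group of feature channels, so the inner product of two encoded features depends on their relative 3D positions rather than their absolute ones. The function name and channel layout are illustrative, not taken from the paper.

```python
import torch

def rotary_3d_encode(feats, coords, base=10000.0):
    """Rotary-style 3D position encoding (illustrative sketch).

    Channels are split into three groups, one per x/y/z axis; inside each
    group, the two channel halves are rotated pairwise by angles proportional
    to that coordinate.  The dot product of two encoded features then depends
    only on the relative position of the two points.
    feats:  (N, d) point features with d divisible by 6.
    coords: (N, 3) point coordinates.
    """
    N, d = feats.shape
    d_axis = d // 3
    out = []
    for axis in range(3):
        f = feats[:, axis * d_axis:(axis + 1) * d_axis]
        p = coords[:, axis:axis + 1]                   # (N, 1)
        half = d_axis // 2
        freqs = base ** (-torch.arange(half, dtype=feats.dtype) / half)
        ang = p * freqs                                # (N, half) rotation angles
        cos, sin = ang.cos(), ang.sin()
        f1, f2 = f[:, :half], f[:, half:]
        out.append(torch.cat([f1 * cos - f2 * sin,
                              f1 * sin + f2 * cos], dim=-1))
    return torch.cat(out, dim=-1)                      # (N, d)
```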

The similarity matrix is passed through a dual-softmax operation that converts it into a confidence matrix. Matching point pairs are selected according to this confidence matrix and further filtered with a mutual-nearest-neighbour check.
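A minimal sketch of the dual-softmax and mutual-nearest-neighbour filtering, assuming a dense (N, M) similarity matrix; the confidence threshold value is purely illustrative.

```python
import torch

def dual_softmax_matches(sim, threshold=0.2):
    """Turn a similarity matrix into filtered matches (illustrative sketch)."""
    # Softmax over rows times softmax over columns gives the confidence matrix.
    conf = torch.softmax(sim, dim=0) * torch.softmax(sim, dim=1)
    row_best = conf.argmax(dim=1)                      # best target per source
    col_best = conf.argmax(dim=0)                      # best source per target
    src = torch.arange(sim.shape[0])
    mutual = col_best[row_best] == src                 # mutual nearest neighbours
    keep = mutual & (conf[src, row_best] > threshold)
    return torch.stack([src[keep], row_best[keep]], dim=1)   # (K, 2) index pairs
```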

Finally, the ICP algorithm is used to refine the alignment computed from the matched point pairs, yielding the final point cloud registration result.
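The core alignment step behind such a rigid refinement (the update run inside each ICP iteration once correspondences are fixed) is the SVD-based Kabsch fit; a minimal NumPy sketch with illustrative names:

```python
import numpy as np

def rigid_fit(src, tgt):
    """Best-fit rotation R and translation t with R @ src_i + t ≈ tgt_i.

    Kabsch/Procrustes solution via SVD for a set of matched point pairs.
    src, tgt: (K, 3) corresponding points.
    """
    src_c, tgt_c = src.mean(axis=0), tgt.mean(axis=0)
    H = (src - src_c).T @ (tgt - tgt_c)                # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                                 # guard against reflection
    t = tgt_c - R @ src_c
    return R, t
```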

Figure 1 Overview of the proposed method

By decomposing the point cloud representation into a feature space and a 3D position space, Lepard can make better use of 3D position information for matching, thereby improving matching accuracy.

The position-encoding method explicitly reveals 3D relative distance information, which enables Lepard to better handle point cloud matching in non-rigid scenes.

The relocalization technique updates the relative positions used in cross-attention by means of a rigid fit, which further improves matching accuracy.

The combination of these techniques enables Lepard to achieve excellent point cloud matching results in both rigid and non-rigid scenes.

Figure 2. Visualization of self/cross-attention heatmaps and rigid fit-based relocalization

04

Experiment

Dataset: The paper evaluates on the rigid 3DMatch and 3DLoMatch benchmarks and on the newly introduced deformable 4DMatch and 4DLoMatch benchmarks.

Experimental settings: Several evaluation metrics are used to measure matching and registration performance, including EPE, Acc5, and Acc10, and the method is compared against a range of classic point cloud matching and registration baselines, including N-ICP and Predator.
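Assuming EPE, Acc5, and Acc10 follow the usual scene-flow-style definitions (an assumption, not taken verbatim from the paper), they could be computed as follows:

```python
import numpy as np

def flow_metrics(pred, gt):
    """EPE / Acc5 / Acc10 under assumed scene-flow-style definitions.

    pred, gt: (N, 3) predicted and ground-truth per-point displacements.
    """
    err = np.linalg.norm(pred - gt, axis=1)            # per-point end-point error
    rel = err / (np.linalg.norm(gt, axis=1) + 1e-8)    # relative error
    return {
        "EPE":   err.mean(),
        "Acc5":  np.mean((err < 0.05) | (rel < 0.05)),
        "Acc10": np.mean((err < 0.10) | (rel < 0.10)),
    }
```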

Figure 3 4DMatch and 4DLoMatch benchmarks

Figure 3 shows the overlap-ratio histograms of the 4DMatch and 4DLoMatch benchmarks; an overlap-ratio threshold of 45% separates the two splits.

Table 1 Ablation study of 4DMatch

Table 1 presents the ablation study on 4DMatch, where “*” marks the default configuration of the method. The table reports NFMR and IR for the different variants: NFMR (non-rigid feature matching recall) measures the fraction of ground-truth correspondences that can be recovered from the predicted matches, and IR (inlier ratio) measures the fraction of predicted matches that are correct.
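A simplified sketch of the inlier ratio is given below; the distance threshold tau and the ground-truth warp helper gt_warp are illustrative assumptions, and the paper's exact definitions (especially for NFMR) involve additional details.

```python
import numpy as np

def inlier_ratio(src, tgt, matches, gt_warp, tau=0.04):
    """Fraction of predicted matches that are correct (illustrative sketch).

    A match (i, j) counts as an inlier if the ground-truth-warped source
    point lies within tau metres of its matched target point.
    src, tgt: (N, 3) and (M, 3) point clouds; matches: (K, 2) index pairs.
    """
    warped = gt_warp(src[matches[:, 0]])               # (K, 3) warped source points
    err = np.linalg.norm(warped - tgt[matches[:, 1]], axis=1)
    return float(np.mean(err < tau))
```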

Paper title:

Lepard: Learning partial point cloud matching in rigid and deformable scenes

For more exciting content, please follow the official account: BFT Robot.

This is an original article and its copyright belongs to BFT Robot. Please contact us if you wish to reprint it, or if you have any questions about its content; we will respond promptly.

