3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation
This post introduces a CVPR 2020 paper on point cloud recognition.
There is currently no open-source code for the paper.
1. Problem
The main difficulty in 3D object detection is how to predict and process object proposals. One line of work is top-down: first regress a large number of boxes, then refine them in a second stage. Such methods struggle, however, when the regressed boxes deviate substantially from the objects. The other line of work is bottom-up: learn a feature for each point through metric learning, then cluster the points into instances according to those features. Here the clustering parameters have to be tuned by hand, and the pairwise computation is too expensive. This paper combines the two approaches, keeping the low computational cost of the top-down approach together with the robustness of point-level features. Features are expressed at the point level, but clustering is not done at the point level, so the two approaches complement each other.
2. Approach
The method consists of three modules. The proposal generation module learns per-point features, and each point votes for the center of the object it belongs to. Next, the proposal consolidation module refines these proposals. Finally, the object generation module produces the final result. Notably, the authors do not use the common NMS step to select among proposals, because NMS may discard correct results. Instead, the method clusters the high-level proposals, which greatly reduces the computation compared with clustering points directly.
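The idea of merging proposals by clustering, rather than discarding them with NMS, can be sketched as follows. This is only an illustration under assumptions: each proposal is assumed to carry a feature vector and a set of point indices (its mask), and a simple distance threshold with union-find stands in for whatever clustering algorithm the paper actually uses. The names `aggregate_proposals`, `agg_feats`, `masks`, and `eps` are hypothetical.

```python
import math

def aggregate_proposals(agg_feats, masks, eps=0.5):
    """Group proposals whose features lie within `eps` of each other
    and union their point masks into one instance per group.

    agg_feats : list of feature tuples, one per proposal (assumed input)
    masks     : list of sets of point indices, one per proposal (assumed input)
    eps       : hypothetical distance threshold for linking two proposals
    """
    n = len(agg_feats)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of proposals that are close in feature space.
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(agg_feats[i], agg_feats[j]) <= eps:
                parent[find(i)] = find(j)

    # Union the masks of all proposals in the same group.
    instances = {}
    for i in range(n):
        instances.setdefault(find(i), set()).update(masks[i])
    return list(instances.values())
```

The pairwise loop runs over proposals rather than over points, so its cost depends on the number of proposals instead of the size of the point cloud. Unlike NMS, no proposal is thrown away: overlapping proposals contribute their points to the merged instance.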
3. Algorithm
3.1 Proposal Generation
A sparse convolutional network extracts a feature for each point. Two branches then predict the semantic category of each point and regress the center coordinates of the instance the point belongs to.
From the resulting set of predicted centers, K points are randomly sampled as instance proposals. The points within radius r of each proposal center are treated as belonging to the corresponding instance, and a simple PointNet then extracts a feature from each proposal's point set.
Each proposal is represented by a tuple (y_i, g_i, s_i), where y_i is the center position of the proposal, g_i is the proposal feature, and s_i is the set of instance points associated with the proposal.
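The sampling and grouping steps can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: it assumes the per-point center offsets are already predicted, groups points by whether their voted center falls within the radius of the sampled center, and omits the PointNet feature g_i. The function name `generate_proposals` and its parameters are hypothetical.

```python
import math
import random

def generate_proposals(points, offsets, k, radius, seed=0):
    """Sample k voted centers and group points around each of them.

    points  : list of (x, y, z) input coordinates
    offsets : list of (dx, dy, dz) predicted offsets to the instance center
              (assumed to come from the center-regression branch)
    Returns a list of (center, member_indices) pairs, i.e. (y_i, s_i).
    """
    # Each point votes for the center of the instance it belongs to.
    votes = [tuple(p + d for p, d in zip(pt, off))
             for pt, off in zip(points, offsets)]

    # Randomly sample k votes as proposal centers.
    rng = random.Random(seed)
    sampled = rng.sample(range(len(votes)), k)

    proposals = []
    for i in sampled:
        center = votes[i]
        # Assumption: a point joins the proposal if its vote lies within the radius.
        members = [j for j, v in enumerate(votes)
                   if math.dist(center, v) <= radius]
        proposals.append((center, members))
    return proposals
```

Because sampling is random, several proposals may land on the same object. That redundancy is intended here, since the later stages merge proposals instead of suppressing them.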
3.2 Proposal Consolidation
To let proposal features interact with global context, the paper borrows the idea of DGCNN and builds a graph convolutional network (GCN) to refine the proposal features.
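A DGCNN-style refinement step over proposals might look like the sketch below. Several things here are assumptions made for illustration: the graph is built from the k nearest proposal centers, the learned MLP on each edge is replaced by the identity, and neighbors are aggregated with a channel-wise max. The name `edge_conv` and the parameter `k` are hypothetical, and at least two proposals are expected.

```python
import math

def edge_conv(centers, feats, k=2):
    """One EdgeConv-like step over a k-nearest-neighbor proposal graph.

    centers : list of (x, y, z) proposal centers
    feats   : list of feature lists, one per proposal
    Returns refined features with twice the input dimension.
    """
    refined = []
    for i, (ci, gi) in enumerate(zip(centers, feats)):
        # Neighbors: the k closest other proposals by center distance.
        neighbors = sorted(
            (j for j in range(len(centers)) if j != i),
            key=lambda j: math.dist(ci, centers[j]),
        )[:k]
        # Edge feature for (i, j): g_i concatenated with (g_j - g_i).
        # A real network would pass this through a learned MLP.
        edges = [list(gi) + [b - a for a, b in zip(gi, feats[j])]
                 for j in neighbors]
        # Channel-wise max over the neighborhood.
        refined.append([max(channel) for channel in zip(*edges)])
    return refined
```

Stacking several such steps lets information travel between proposals that are not direct neighbors, which is how each proposal feature can pick up context beyond its own point set.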
3.3 Object Generation
Here, the K proposals are passed through an MLP to obtain objectness scores. A proposal whose distance to a ground-truth (gt) center is less than 0.3 m is labeled positive. It is labeled negative if that distance is greater than 0.6 m, or if the proposal is equally far from two gt centers. Negative proposals are not processed further.
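The labeling rule can be written down as a small function. This is a sketch under assumptions: the second negative condition is read as the proposal being equally far from its two nearest gt centers, that check is applied first, equality is tested with a small tolerance, and proposals that fall between the two thresholds get no label. The name `objectness_label` and the parameter `tol` are hypothetical.

```python
import math

def objectness_label(proposal_center, gt_centers, near=0.3, far=0.6, tol=1e-6):
    """Return 1 (positive), 0 (negative), or None (no label).

    near, far : distance thresholds in meters (0.3 and 0.6 in the text)
    tol       : hypothetical tolerance for "equally far from two gt centers"
    """
    dists = sorted(math.dist(proposal_center, c) for c in gt_centers)
    d1 = dists[0]
    # Ambiguous: the two nearest gt centers are equally far away.
    if len(dists) > 1 and abs(dists[1] - d1) <= tol:
        return 0
    if d1 < near:
        return 1
    if d1 > far:
        return 0
    # Assumption: proposals between the thresholds are left unlabeled.
    return None

print(objectness_label((0, 0, 0), [(0.1, 0, 0), (5, 0, 0)]))  # 1
print(objectness_label((0, 0, 0), [(1.0, 0, 0), (5, 0, 0)]))  # 0
```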
For positive proposals, the network first predicts the semantic category, then the aggregation features and a binary instance mask. The aggregation feature has two parts: geometric features and embedding features.
The final loss is a multi-task loss over these outputs.
4. Experimental results
Summary
PointGroup, covered previously, has similarities with this paper, but 3D-MPA should be faster. There are not many papers on instance segmentation yet, and a multi-task loss seems to be unavoidable in them. I always find such losses too time-consuming and hard to optimize, which is off-putting.