Interpretation of the FastPillars paper

This paper, from Meituan, presents an improvement on the PointPillars algorithm.

The main improvements are the following two:

1. An attention mechanism for extracting features within each pillar, which mitigates the loss of fine-grained information caused by direct max pooling.

2. A new lightweight backbone built with reference to CSPNet and RepVGG.

1. Pillar feature extraction with an attention mechanism

Pillar feature extraction differs from PointPillars: all points in each pillar are retained (how is this computed in parallel?). An MLP first lifts each point to a higher dimension, and taking the max over the points yields a global feature, just as in PointPillars. Then, starting again from the original point features, a second MLP produces an N*C matrix, a softmax is applied over the N dimension, and the result is multiplied with the previously lifted features and summed over the points to obtain the attention feature. Finally, the attention feature is summed with the global max feature to form the pillar's final feature.
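A minimal PyTorch sketch of this max-plus-attention encoding, assuming the points have already been grouped into a dense (P, N, D) tensor by zero-padding each pillar to N points (one common answer to the parallelism question above). Module and parameter names here are our own illustration, not the authors' code:

```python
import torch
import torch.nn as nn

class MAPE(nn.Module):
    """Max-and-attention pillar encoding, sketched from the description above.
    points: (P, N, D) = P pillars, N (padded) points each, D raw features."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.lift = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())  # per-point MLP
        self.score = nn.Linear(in_dim, out_dim)  # attention logits from the raw features

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        feats = self.lift(points)                        # (P, N, C) lifted point features
        global_feat = feats.max(dim=1).values            # (P, C) max pooling, as in PointPillars
        attn = torch.softmax(self.score(points), dim=1)  # (P, N, C) softmax over the N points
        attn_feat = (attn * feats).sum(dim=1)            # (P, C) attention-weighted sum
        return global_feat + attn_feat                   # sum of the two as the final pillar feature
```

In practice the padded points would need to be masked out before the softmax and the max; that bookkeeping is omitted here for brevity.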

 

2. Backbone network CRVNet

The backbone network adopts a RepVGG + CSPNet structure. The theory of RepVGG will be covered later; here we mainly discuss the advantages of CSPNet.
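For reference before then, here is a minimal sketch of the RepVGG idea: train with parallel 3x3, 1x1, and identity branches, then fold them into a single 3x3 convolution for deployment. This is a simplified illustration (real RepVGG blocks also carry BatchNorm on each branch, which gets fused as well), not the paper's code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RepVGGBlock(nn.Module):
    """Train-time multi-branch block; reparameterize() fuses it into one conv."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv1 = nn.Conv2d(channels, channels, 1)
        self.act = nn.ReLU()

    def forward(self, x):
        # Multi-branch at training time: 3x3 + 1x1 + identity.
        return self.act(self.conv3(x) + self.conv1(x) + x)

    def reparameterize(self) -> nn.Conv2d:
        """Return a single 3x3 conv computing the same pre-activation output."""
        fused = nn.Conv2d(self.conv3.in_channels, self.conv3.out_channels, 3, padding=1)
        w = self.conv3.weight.detach().clone()
        w += F.pad(self.conv1.weight.detach(), [1, 1, 1, 1])  # embed the 1x1 at the 3x3 center
        for c in range(w.shape[0]):
            w[c, c, 1, 1] += 1.0  # identity branch as a centered delta kernel
        fused.weight.data = w
        fused.bias.data = (self.conv3.bias + self.conv1.bias).detach()
        return fused
```

After training, `reparameterize()` replaces the block with a plain convolution, which is where the inference speedup comes from.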

CSPNet obtains richer gradient flow while reducing computation. It splits the feature map so that gradients propagate through different network paths, and by alternating the concatenation and transition steps, the propagated gradient information can have large correlation differences. In addition, CSPNet greatly reduces computation and improves both inference speed and accuracy.
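A rough sketch of a CSP-style stage, assuming an even channel count: half the channels go through the dense block stack, the other half bypass it, and a transition conv mixes the two paths after concatenation. In CRVNet the inner blocks would be RepVGG-style blocks; plain conv blocks are used here for brevity:

```python
import torch
import torch.nn as nn

class CSPStage(nn.Module):
    """Cross Stage Partial stage: split -> dense path + shortcut path -> concat."""
    def __init__(self, channels: int, num_blocks: int):
        super().__init__()
        half = channels // 2  # assumes an even channel count
        self.blocks = nn.Sequential(*[
            nn.Sequential(nn.Conv2d(half, half, 3, padding=1),
                          nn.BatchNorm2d(half), nn.ReLU())
            for _ in range(num_blocks)
        ])
        self.transition = nn.Conv2d(channels, channels, 1)  # mixes the two paths

    def forward(self, x):
        a, b = x.chunk(2, dim=1)  # split the channel dimension in two
        a = self.blocks(a)        # only half the channels pay for the block stack
        return self.transition(torch.cat([a, b], dim=1))
```

Because only half the channels traverse the block stack, the stage's cost roughly halves, while the two paths carry back gradients with differing correlation.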

In the neck, the paper adopts the enhanced neck design from PillarNet. The neck fuses the 8x and 16x feature maps from the backbone to achieve effective interaction between spatial features at different semantic levels. The paper found that, in this neck design, the number of convolutional layers before the concatenation significantly affects final performance. It therefore changes the block counts of the four VGG stages in FastPillars-s from (4, 6, 16, 1) to (6, 16, 1, 1), and the block counts of the four ResNet-34 stages in FastPillars-m from (3, 4, 6, 3) to (6, 6, 3, 2), while both remove the 2x downsampling of the first stage.
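A hedged sketch of this kind of two-scale fusion, with illustrative channel counts and a configurable number of pre-concatenation convs (the hyperparameter the paper found sensitive); this is our reading of the design, not the authors' code:

```python
import torch
import torch.nn as nn

class FusionNeck(nn.Module):
    """Fuse 8x- and 16x-stride backbone maps: upsample 16x, conv, concat, fuse."""
    def __init__(self, c8: int, c16: int, c_out: int, pre_convs: int = 2):
        super().__init__()
        self.up = nn.ConvTranspose2d(c16, c8, kernel_size=2, stride=2)  # 16x -> 8x stride
        self.pre = nn.Sequential(*[
            nn.Sequential(nn.Conv2d(c8, c8, 3, padding=1), nn.ReLU())
            for _ in range(pre_convs)  # convs before the concatenation
        ])
        self.fuse = nn.Conv2d(2 * c8, c_out, 3, padding=1)

    def forward(self, f8: torch.Tensor, f16: torch.Tensor) -> torch.Tensor:
        # f8: (B, c8, H, W) at 8x stride; f16: (B, c16, H/2, W/2) at 16x stride
        u = self.pre(self.up(f16))
        return self.fuse(torch.cat([f8, u], dim=1))
```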

Source: blog.csdn.net/slamer111/article/details/131613808