Interpretation of PointNet++ source code

1. Start from the main function:

1.1 Determine which GPU to use.

 1.2 Save training parameters and logs
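
As a rough illustration of steps 1.1 and 1.2, here is a minimal sketch of how GPU selection and log saving are typically done in a PointNet++ training script; the function name setup_experiment, the default paths and the log format are placeholders of mine, not the repository's exact code.

```python
import os
import logging
from pathlib import Path

def setup_experiment(gpu: str = "0", log_dir: str = "log/classification"):
    # Step 1.1: restrict the process to the chosen GPU.
    os.environ["CUDA_VISIBLE_DEVICES"] = gpu

    # Step 1.2: create a directory for checkpoints and logs.
    exp_dir = Path(log_dir)
    exp_dir.mkdir(parents=True, exist_ok=True)

    # Training parameters and progress messages are written to a log file.
    logging.basicConfig(
        filename=str(exp_dir / "train.log"),
        level=logging.INFO,
        format="%(asctime)s - %(message)s",
    )
    logging.info("Using GPU %s, logging to %s", gpu, exp_dir)
    return exp_dir
```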

 2. Load data

First, locate the directory where the training and test data are stored, then read the relevant dataset parameters:

The following is the result of the execution:

Next, prepare the training samples:

Assign an integer label to each shape category:

Locate the file for each sample:
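
The data-loading steps above (reading the split list, mapping shape names to integer labels, locating each sample's file) are handled inside the dataset class. Below is a rough sketch of how the loaders are built; the ModelNetDataLoader constructor arguments and the data path are assumptions that vary between versions of the code, while batch_size=16 matches the value used later in this walkthrough.

```python
from torch.utils.data import DataLoader
# ModelNetDataLoader is the dataset class used by the reference training
# script; its exact constructor arguments differ between repo versions.
from data_utils.ModelNetDataLoader import ModelNetDataLoader

data_path = 'data/modelnet40_normal_resampled/'  # assumed dataset location

# The dataset reads the train/test split list, maps every shape name to an
# integer label, and records the file path of each individual sample.
train_dataset = ModelNetDataLoader(root=data_path, npoint=1024, split='train')
test_dataset = ModelNetDataLoader(root=data_path, npoint=1024, split='test')

trainDataLoader = DataLoader(train_dataset, batch_size=16, shuffle=True, num_workers=4)
testDataLoader = DataLoader(test_dataset, batch_size=16, shuffle=False, num_workers=4)
```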

 3. Import the model
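
The training script imports the network dynamically by its module name, so the same script can train different variants (SSG, MSG, ...). A hedged sketch, assuming the pointnet2_cls_ssg model and 10 output classes to match the shapes traced in section 5; the get_model/get_loss entry points follow the reference implementation:

```python
import importlib

# Import the model file by name; "pointnet2_cls_ssg" is one of the standard
# classification models, used here only as an example.
MODEL = importlib.import_module('pointnet2_cls_ssg')

# get_model builds the network, get_loss builds the loss module.
# num_class=10 matches the 16*10 output shape traced in section 5.
classifier = MODEL.get_model(num_class=10, normal_channel=False).cuda()
criterion = MODEL.get_loss().cuda()
```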

 4. Other parameters

4.1 Using GPUs

 4.2 Load the pre-trained model
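
A minimal sketch of loading a pre-trained checkpoint to resume training; the checkpoint path and the key names (epoch, model_state_dict) follow the common convention of the reference script, but treat them as assumptions.

```python
import torch

def load_pretrained(classifier, ckpt_path='log/classification/checkpoints/best_model.pth'):
    """Resume from an existing checkpoint if one is found, otherwise start fresh."""
    try:
        checkpoint = torch.load(ckpt_path)
        classifier.load_state_dict(checkpoint['model_state_dict'])
        start_epoch = checkpoint['epoch']
        print('Loaded pretrained model, resuming from epoch', start_epoch)
    except FileNotFoundError:
        start_epoch = 0
        print('No existing model, starting training from scratch...')
    return start_epoch
```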

 4.3 

 5. Training part

 

optimizer.zero_grad()  # Reset the gradients to zero; without this, gradients from previous iterations keep accumulating.

Next, the point-cloud batch is converted to NumPy format and augmented with random point dropout, random scaling, and random translation.

These augmentations help prevent overfitting and improve the generalization ability of the model.

The NumPy array is then converted back to a tensor, its second and third dimensions are transposed, and the batch is moved onto the GPU (CUDA). A sketch of these steps follows the shape change below.

Data shape change: 16*1024*3 --> 16*3*1024
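
Putting the augmentation and conversion steps together, a rough sketch; the helper names follow the provider.py module shipped with the reference implementation, and augment_and_to_cuda is my own wrapper, not a function from the repository.

```python
import torch
import provider  # augmentation helpers from the reference implementation

def augment_and_to_cuda(points):
    """points: float tensor of shape (B, N, 3), e.g. 16 x 1024 x 3."""
    points = points.data.numpy()
    # Random point dropout, scaling and shifting to fight overfitting.
    points = provider.random_point_dropout(points)
    points[:, :, 0:3] = provider.random_scale_point_cloud(points[:, :, 0:3])
    points[:, :, 0:3] = provider.shift_point_cloud(points[:, :, 0:3])
    points = torch.Tensor(points)
    # Swap point and channel dimensions: 16*1024*3 --> 16*3*1024,
    # the layout expected by the Conv1d/Conv2d layers in the network.
    points = points.transpose(2, 1)
    return points.cuda()
```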

The batch then enters the network and training begins.

Inside the network, the data first passes through the PointNetSetAbstraction module:

The data is first permuted back from 16*3*1024 --> 16*1024*3 (the batch_size is set to 16).

16*1024*3 means 16 samples, each with 1024 points, and each point has 3 coordinates (x, y, z), which can also be viewed as 3 features.

The point cloud samples are then fed into sample_and_group. What does it do? A sketch follows below.
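
In short, sample_and_group picks a set of well-spread centers with farthest point sampling, gathers a fixed number of neighbours around each center with a ball query, and stacks their local coordinates and features into groups. A simplified sketch, using the helper functions of the reference implementation (farthest_point_sample, query_ball_point, index_points) rather than the exact source:

```python
import torch
from pointnet2_utils import farthest_point_sample, index_points, query_ball_point

def sample_and_group_sketch(npoint, radius, nsample, xyz, points=None):
    """xyz: (B, N, 3) coordinates; points: (B, N, D) extra features or None."""
    # 1. Farthest point sampling picks npoint well-spread centers,
    #    e.g. 1024 points -> 512 centers, so new_xyz is 16*512*3.
    fps_idx = farthest_point_sample(xyz, npoint)
    new_xyz = index_points(xyz, fps_idx)
    # 2. Ball query gathers nsample (e.g. 32) neighbours within `radius`
    #    around every center; grouped_xyz is (B, npoint, nsample, 3).
    idx = query_ball_point(radius, nsample, xyz, new_xyz)
    grouped_xyz = index_points(xyz, idx) - new_xyz.unsqueeze(2)  # local coords
    # 3. If the points already carry features, concatenate them with the
    #    local coordinates (this is where 128 + 3 = 131 channels comes from).
    if points is not None:
        new_points = torch.cat([grouped_xyz, index_points(points, idx)], dim=-1)
    else:
        new_points = grouped_xyz
    return new_xyz, new_points
```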

Point cloud changes:

new_xyz: 16*512*3 (center points)
new_points: 16*3*32*512 --> 16*64*32*512 --> 16*128*32*512 --> 16*128*512

new_points holds the 32 points queried around each of the 512 centers; following the idea of PointNet, a shared MLP lifts the features of these 32 points to a higher dimension, and a max over the 32 neighbours then produces one feature vector per center, completing the feature computation.
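
The "PointNet idea" in that step is just a shared MLP (1x1 convolutions) followed by a max over the neighbours. A condensed sketch of this part of PointNetSetAbstraction's forward pass, with channel widths chosen to match the shape trace above (an illustration, not the exact class from the repository):

```python
import torch
import torch.nn as nn

class SetAbstractionSketch(nn.Module):
    """Shared MLP + max pooling over each group of neighbours."""
    def __init__(self, in_channel=3, mlp=(64, 128)):
        super().__init__()
        layers, last = [], in_channel
        for out in mlp:
            # A 1x1 Conv2d acts as a shared MLP applied to every grouped point.
            layers += [nn.Conv2d(last, out, 1), nn.BatchNorm2d(out), nn.ReLU()]
            last = out
        self.mlp = nn.Sequential(*layers)

    def forward(self, new_points):
        # new_points: (B, C, nsample, npoint), e.g. 16*3*32*512.
        new_points = self.mlp(new_points)             # -> 16*128*32*512
        # Max over the 32 neighbours gives one feature vector per center.
        new_points = torch.max(new_points, dim=2)[0]  # -> 16*128*512
        return new_points
```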

new_xyz: 16*128*3
new_points: 16*131*64*128 --> 16*128*64*128 --> 16*256*64*128 --> 16*256*128

The 131 here comes from the 128-dimensional feature concatenated with each point's original 3 coordinates (128 + 3 = 131).

new_xyz: 16*1*3
new_points: 16*259*128*1 --> 16*256*128*1 --> 16*512*128*1 --> 16*1024*128*1 --> 16*1024*1

This last stage is equivalent to grouping all 128 remaining points around a single point after downsampling (259 = 256 + 3, again features plus coordinates), lifting their features with the shared MLP, and then max-pooling to obtain a single global 1024-dimensional feature vector.
x: 16*1024 --> 16*512 --> 16*256 --> 16*10. The forward pass is now complete, and the network returns the predictions from which the loss is computed.
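
After the global 1024-dimensional feature is obtained, the classification head is a stack of fully connected layers. A rough sketch matching the 16*1024 --> 16*512 --> 16*256 --> 16*10 trace above (the dropout rate and layer details are illustrative, not copied from the repository):

```python
import torch.nn as nn
import torch.nn.functional as F

class ClsHeadSketch(nn.Module):
    """Maps the 1024-d global feature to per-class log-probabilities."""
    def __init__(self, num_class=10):
        super().__init__()
        self.fc1 = nn.Linear(1024, 512)
        self.bn1 = nn.BatchNorm1d(512)
        self.fc2 = nn.Linear(512, 256)
        self.bn2 = nn.BatchNorm1d(256)
        self.drop = nn.Dropout(0.4)
        self.fc3 = nn.Linear(256, num_class)

    def forward(self, x):                             # x: 16*1024
        x = F.relu(self.bn1(self.fc1(x)))             # -> 16*512
        x = self.drop(F.relu(self.bn2(self.fc2(x))))  # -> 16*256
        x = self.fc3(x)                               # -> 16*num_class
        return F.log_softmax(x, dim=-1)               # log-probabilities for the loss
```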

For each sample, the class with the maximum probability is taken as the prediction; the loss is computed at the same time.

loss.backward()  # backpropagation: compute the gradients for the current batch

optimizer.step()  # update the network parameters according to the gradients
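
Putting the training part together, one optimization step looks roughly like the sketch below; the two values returned by the classifier and the criterion signature follow the reference training script, so treat them as assumptions:

```python
def train_step(classifier, criterion, optimizer, points, target):
    """One optimization step over a batch of 16*3*1024 points."""
    optimizer.zero_grad()                  # reset accumulated gradients
    pred, trans_feat = classifier(points)  # forward pass, returns log-probs
    loss = criterion(pred, target.long(), trans_feat)
    pred_choice = pred.data.max(1)[1]      # class with the highest probability
    correct = pred_choice.eq(target.long().data).cpu().sum()
    loss.backward()                        # backpropagation: compute gradients
    optimizer.step()                       # update the network parameters
    return loss.item(), correct.item()
```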


Original post: blog.csdn.net/qq_42813620/article/details/130953979