Project scenario:
Recently I wanted to use PaddleSlim to compress the model from my EfficientNetB0 classification task (a PyTorch model), but it went badly: I hit countless errors before training would even start. Note that the trained .pt model must be converted in advance, either to an ONNX model (with the onnx_format attribute in the yaml file set to True) or to Paddle format (with onnx_format set to False).
For the training code, please refer to my other article on training.
If you want to use your own dataset, it needs to be processed into ILSVRC2012 format; for the processing code, refer to my article on dataset processing.
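To make the onnx_format switch mentioned above concrete, here is a minimal sketch of the relevant part of the compression yaml. The model path is a placeholder, and the exact set of Global fields may differ between PaddleSlim versions:

```yaml
Global:
  model_dir: ./efficientnetb0_model   # placeholder: path to your converted model
  onnx_format: True                   # True if you converted to ONNX; False for Paddle format
```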
The problems I encountered and their fixes are as follows:
Question one
Problem Description
Dimension matching failed, as shown in the figure:
Cause Analysis
This happens because the batch size (bs) set when exporting the ONNX model does not match the bs set in PaddleSlim. For example, I exported ONNX with a bs of 1, while PaddleSlim's default is 32.
Solution
1. Set the bs to 32 when exporting ONNX.
2. Change the bs in the corresponding yaml file in PaddleSlim to match the one used when exporting ONNX. In my case that file is PaddleSlim/example/auto_compression/image_classification/configs/EfficientNetB0/qat_dis.yaml; make the following change there:
Option 1 is still the recommendation. If you ask why, I will answer with a question: when training a model, why do you set batch_size to a multiple of 8 rather than 1?
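For Option 2, the change in qat_dis.yaml looks roughly like the fragment below; which section batch_size sits under varies between example configs, so treat the placement as an assumption and match it to your own file:

```yaml
Global:
  batch_size: 1   # default is 32; set this to the bs used when exporting ONNX
```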
Question two
Problem Description
The error is as follows:
Cause Analysis
According to the developers, this comes from the origin_metric parameter in the config file (generally the last parameter of the corresponding yaml file for classification tasks), as shown in the figure:
If the actual test accuracy differs too much from the value you filled in, this error is raised.
Solution
1. Delete the origin_metric parameter; or
2. Change it to the actual measured accuracy (keeping 2-3 decimal places is recommended; when I set it to 0.9 the error persisted until I changed it to 0.939).
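Option 2 amounts to a one-line change in the yaml file; the value below is the accuracy from my own run, and the exact section the parameter lives in depends on your config:

```yaml
# origin_metric must be close to the float model's measured accuracy
origin_metric: 0.939   # e.g. 0.9 was too far off and still triggered the error
```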
Question three
Problem Description
The error is as follows:
Cause Analysis
The official explanation: the distillation node was not set correctly. If the node is left unset, the program automatically takes the output of the last op with parameters as the distillation node for training, as shown in the figure:
Solution
Delete the node parameter, as shown in the figure:
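The fix is a deletion in the Distillation section of the yaml; the alpha and loss values below are illustrative placeholders, only the removal of node matters:

```yaml
Distillation:
  alpha: 1.0
  loss: l2
  # node: ...   <- delete this entry; PaddleSlim will then automatically pick
  #                the output of the last parameterized op as the distillation node
```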
Question four
Problem Description
The error is as follows:
Cause Analysis
This error is most likely because the model's input name differs from the one in the configuration file.
Solution
Set the first parameter in the yaml file, input_name, to x2paddle_input, as shown in the figure:
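In the yaml this is a single line; x2paddle_input is the input name that X2Paddle assigns when converting an ONNX model, which is why it must match here:

```yaml
Global:
  input_name: x2paddle_input   # must match the converted model's actual input name
```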
Summary: the above are all the bugs I encountered. Due to resource constraints my training run only got a little past the halfway point, so I won't show the size of the compressed model here; interested readers can try it for themselves.