Deep learning model (yolov5) compression


Foreword

Once we have trained a model with deep learning, what comes next? Testing it, of course. When the accuracy reaches the target, the model can be released. After release, we often discover a problem -> inference is slow. What can we do about it? The following ideas may help, using yolov5 as the example:

  • Switch to a lighter network, such as YOLOv5s or YOLOv5n
  • Accelerate the model with an inference engine, such as TensorRT or OpenVINO
  • If the budget allows, consider a GPU with higher compute power
  • Model compression
    Next, let me introduce a tool for model compression -> PaddleSlim

1. Environment construction

Frankly, the paddle environment is quite tricky to set up, and in my experience its compatibility is not great. To help you avoid the pits I stepped in, I suggest installing the latest version directly from the official source, but make sure you know whether you need the GPU or the CPU build. The commands are as follows:

pip install paddlepaddle-gpu -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install paddleslim -i https://pypi.tuna.tsinghua.edu.cn/simple

A version compatibility table is available in the official documentation; treat it as a reference only. The most convenient way is to install paddle with Docker, which avoids many pitfalls. After the installation is complete, verify that it succeeded:

  • Verify paddle
import paddle
paddle.utils.run_check()

If the check completes with a success message, the installation worked.

  • Verify paddleslim
import paddleslim

Under normal circumstances, if the import succeeds without errors, the installation is fine.

2. Pulling the code

Pull the PaddleSlim source code from the following link: PaddleSlim

3. Data set preparation

  • The data generally uses the COCO format.
    Of course, we can also use our own dataset, provided the directory layout matches the COCO structure; then simply modify the dataset paths in paddleslim/example/auto_compression/pytorch_yolo_series/configs/yolov5s_qat_dis.yaml.
  • If you only have unlabeled images, you can point the tool at the image folder directly, but evaluating the model's mAP is then not supported (as stated in the official documentation).
If we do not want to convert the data format or label new data, just set the image_path parameter in yolov5s_qat_dis.yaml.
Once image_path is set, you no longer need to worry about the COCO path settings below it.
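Before pointing the config at your own data, it can help to sanity-check that the annotation file really follows the COCO schema. Here is a minimal sketch using only the standard library; the file path and the exact checks are illustrative, not something PaddleSlim requires:

```python
import json
from pathlib import Path

def check_coco_annotations(ann_file: str) -> bool:
    """Verify that a JSON file carries the three top-level keys COCO requires."""
    data = json.loads(Path(ann_file).read_text())
    required = {"images", "annotations", "categories"}
    missing = required - data.keys()
    if missing:
        print(f"{ann_file}: missing keys {sorted(missing)}")
        return False
    # Every image entry should at least have an id and a file_name.
    return all("id" in img and "file_name" in img for img in data["images"])
```

If this returns False, the dataset will most likely fail to load (or evaluate) during compression, so it is worth running once before starting a long job.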

4. Model preparation

Here we need to prepare the .pt model in advance and convert it to ONNX or paddle format. Note that if your project's inference framework will use paddle, either option works; if not, ONNX is recommended, and the model conversion will then be performed automatically during compression. My inference framework uses paddle, so I directly used the paddle-format model I had converted in advance.

  • To export ONNX, you can use the export.py script in YOLOv5 to convert the model, for example:
python export.py --weights yolov5s.pt --include onnx
  • To convert to paddle format, you can refer to my other blog post: onnx2paddle, and then specify the model path in yolov5s_qat_dis.yaml.
    Also, since we no longer need the onnx2paddle or paddle2onnx step, change onnx_format to False.
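If you prefer to script that config change rather than edit it by hand, a stdlib-only patch is enough. This sketch assumes the config contains a line like `onnx_format: True` (as mine did); the file path in the comment is a placeholder:

```python
import re

def set_onnx_format(yaml_text: str, value: bool) -> str:
    """Flip the onnx_format flag in raw YAML text without needing a YAML parser."""
    return re.sub(r"(?m)^(\s*onnx_format:\s*)\w+\s*$",
                  rf"\g<1>{value}", yaml_text)

# Example usage: patch the config file in place (path is illustrative).
# from pathlib import Path
# cfg = Path("configs/yolov5s_qat_dis.yaml")
# cfg.write_text(set_onnx_format(cfg.read_text(), False))
```

A regex keeps the rest of the file byte-for-byte identical, which avoids the reformatting a full YAML round-trip can introduce.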

5. Start compressing

The command is as follows; only single-GPU training is shown here:

cd example/auto_compression/pytorch_yolo_series
CUDA_VISIBLE_DEVICES=0 python run.py --config_path=./configs/yolov5s_qat_dis.yaml --save_dir="./output"

For a detailed training reference, see the following link: Training

6. Comparison of results

1. Model size comparison:

Before compression: (size screenshot)
After compression: (size screenshot)

The reason the model's parameter count is not reduced is explained in this link: here
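To put numbers on the size comparison yourself, a tiny helper is enough; the model file paths in the commented example are placeholders for your own before/after artifacts:

```python
import os

def size_mb(path: str) -> float:
    """Return a file's size in megabytes."""
    return os.path.getsize(path) / (1024 * 1024)

def report(before: str, after: str) -> None:
    """Print before/after sizes and the percentage saved."""
    b, a = size_mb(before), size_mb(after)
    print(f"before: {b:.2f} MB  after: {a:.2f} MB  "
          f"saved: {100 * (1 - a / b):.1f}%")

# Example usage (paths are illustrative):
# report("yolov5s_paddle/model.pdiparams", "output/model.pdiparams")
```

Note that with quantization-aware training the parameter count stays the same; the on-disk size shrinks mainly because weights are stored at lower precision.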

2. Inference speed (TensorRT not enabled, on a 3080 Ti)

Before compression: (timing screenshot)

After compression: (timing screenshot)

7. Error resolution

During the whole process, you may encounter an error like the one below:
(error screenshot)
The cause is likely a version mismatch. This problem tortured me for a long time, so pay close attention to your versions; a reference solution is here: Bug solution
The fastest way to check whether the problem is fixed is simply to import paddleslim; if the error no longer appears, it is solved.


Summary

That is the entire content of this article; corrections are welcome.


Origin blog.csdn.net/qq_55068938/article/details/128132402