YOLOv5 Instance Segmentation TensorRT Deployment in Practice

0. Preface

Ultralytics released an instance segmentation model in version 6.2 of YOLOv5, which enables fast instance segmentation. The result of running the official yolov5s-seg.pt is shown in the figure below:

This blog builds on that release to develop a C++ version of the TensorRT inference code, linked directly: Here. My environment is:

CUDA 10.2, cuDNN 8.2.4, TensorRT 8.0.1.6, OpenCV 4.5.4, etc. The file layout is as follows:

├── CMakeLists.txt
├── images
│   ├── bus.jpg
│   └── zidane.jpg
├── logging.h
├── main1_onnx2trt.cpp
├── main2_trt_infer.cpp
├── models
│   ├── yolov5s-seg.engine
│   └── yolov5s-seg.onnx
├── output.jpg
├── README.md
└── utils.h

1. Generate the ONNX model

First, we clone the latest version of the code, i.e. version 6.2, and download the corresponding .pt model. We take yolov5s-seg.pt as the example throughout.

git clone git@github.com:ultralytics/yolov5.git  # official code
git clone git@github.com:fish-kong/Yolov5-instance-seg-tensorrt.git  # my TensorRT C++ inference code

The official export.py of yolov5-6.2 can generate an engine directly, but I do not recommend that, because the generated engine is tied to the machine environment: an engine built on one computer cannot be used on another unless the two environments are exactly the same. So we only generate the ONNX model, with the following command:

python export.py --data coco128-seg.yaml --weights yolov5s-seg.pt --cfg yolov5s-seg.yaml --include onnx

This generates yolov5s-seg.onnx. Opening it with Netron, we can see that the input is 1x3x640x640, output output0 is 1x25200x117, and output output1 is 1x32x160x160. These shapes matter for the inference step later; they all need to be written into the C++ inference code as parameters.
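To make those shapes concrete, here is a minimal sketch of how they might be recorded as constants in the C++ code; the names are illustrative, not necessarily the ones used in the repo:

// Shapes read from Netron; the names below are hypothetical.
static const int INPUT_W = 640;
static const int INPUT_H = 640;
static const int NUM_BOXES = 25200;  // candidate detections in output0
static const int BOX_SIZE = 117;     // 4 box coords + 1 objectness + 80 class scores + 32 mask coefficients
static const int SEG_CH = 32;        // prototype mask channels in output1
static const int SEG_W = 160;
static const int SEG_H = 160;
static const int OUTPUT0_SIZE = NUM_BOXES * BOX_SIZE;    // 1x25200x117 flattened
static const int OUTPUT1_SIZE = SEG_CH * SEG_W * SEG_H;  // 1x32x160x160 flattened

Note that 117 = 4 + 1 + 80 + 32: box coordinates, objectness, the 80 COCO class scores, and the 32 mask coefficients that are combined with the 32 prototype masks in output1 to recover each instance mask.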

2. Generate the engine model

1. cd into the repo you cloned, i.e. the Yolov5-instance-seg-tensorrt directory.
2. Copy yolov5s-seg.onnx into models/.

3. Run the following commands to build the two executables, onnx2trt (conversion) and trt_infer (inference):

mkdir build
cd build
cmake ..
make  

4. Convert the model:

sudo ./onnx2trt ../models/yolov5s-seg.onnx ../models/yolov5s-seg.engine

After the steps above we obtain yolov5s-seg.engine (provided that CUDA, cuDNN, TensorRT, and OpenCV are installed; the versions listed at the top are recommended).
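For readers curious what the conversion step does internally, here is a hedged sketch of what an ONNX-to-engine converter like onnx2trt does with the TensorRT 8 API; the actual main1_onnx2trt.cpp may differ in details such as error handling and builder flags:

#include <fstream>
#include <NvInfer.h>
#include <NvOnnxParser.h>
#include "logging.h"  // the repo's logging.h provides a Logger implementing nvinfer1::ILogger

static Logger gLogger;

// Parse the ONNX file, build a serialized engine, and write it to disk.
bool onnx2engine(const char* onnxPath, const char* enginePath) {
    auto builder = nvinfer1::createInferBuilder(gLogger);
    const auto flags = 1U << static_cast<uint32_t>(
        nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    auto network = builder->createNetworkV2(flags);
    auto parser = nvonnxparser::createParser(*network, gLogger);
    if (!parser->parseFromFile(onnxPath,
            static_cast<int>(nvinfer1::ILogger::Severity::kWARNING)))
        return false;

    auto config = builder->createBuilderConfig();
    config->setMaxWorkspaceSize(1ULL << 30);  // 1 GB workspace
    // config->setFlag(nvinfer1::BuilderFlag::kFP16);  // optional, if the GPU supports it

    nvinfer1::IHostMemory* serialized = builder->buildSerializedNetwork(*network, *config);
    if (!serialized) return false;
    std::ofstream out(enginePath, std::ios::binary);
    out.write(static_cast<const char*>(serialized->data()), serialized->size());
    return true;
}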

3. Inference

The build in step 2 already produced the trt_infer executable, so we only need to run it:

sudo ./trt_infer ../models/yolov5s-seg.engine ../images/bus.jpg

for (int i = 0; i < 10; i++) {  // measure the inference speed over 10 runs
    auto start = std::chrono::system_clock::now();
    doInference(*context, data, prob, prob1, 1);
    auto end = std::chrono::system_clock::now();
    std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() << "ms" << std::endl;
}

This snippet from main2_trt_infer.cpp runs inference 10 times and prints the time of each run. On my 1080 Ti it stays around 10 ms per frame, which is quite fast.
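The doInference function itself is not shown in this post; below is a hedged sketch of what such a function typically looks like in the tensorrtx-style pattern (the binding names images, output0, and output1 match the exported ONNX; the repo's real implementation may allocate the GPU buffers once outside the timing loop and differ in other details):

#include <cuda_runtime_api.h>
#include <NvInfer.h>

// Hedged sketch, not the repo's exact code: copy the input to the GPU,
// run the network, and copy both outputs back on a single CUDA stream.
void doInference(nvinfer1::IExecutionContext& context,
                 float* input, float* prob, float* prob1, int batchSize) {
    const nvinfer1::ICudaEngine& engine = context.getEngine();
    void* buffers[3];
    const int inputIndex   = engine.getBindingIndex("images");
    const int output0Index = engine.getBindingIndex("output0");
    const int output1Index = engine.getBindingIndex("output1");

    cudaMalloc(&buffers[inputIndex],   batchSize * 3 * 640 * 640 * sizeof(float));
    cudaMalloc(&buffers[output0Index], batchSize * 25200 * 117 * sizeof(float));
    cudaMalloc(&buffers[output1Index], batchSize * 32 * 160 * 160 * sizeof(float));

    cudaStream_t stream;
    cudaStreamCreate(&stream);
    cudaMemcpyAsync(buffers[inputIndex], input,
                    batchSize * 3 * 640 * 640 * sizeof(float),
                    cudaMemcpyHostToDevice, stream);
    context.enqueueV2(buffers, stream, nullptr);  // explicit-batch inference
    cudaMemcpyAsync(prob, buffers[output0Index],
                    batchSize * 25200 * 117 * sizeof(float),
                    cudaMemcpyDeviceToHost, stream);
    cudaMemcpyAsync(prob1, buffers[output1Index],
                    batchSize * 32 * 160 * 160 * sizeof(float),
                    cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    cudaStreamDestroy(stream);
    for (void* buf : buffers) cudaFree(buf);
}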

The final result is shown below. Compared with the picture at the top of the article (the output of the official code running the .pt model directly), it is essentially the same.

4. References

1. tensorrtx by wangxinyu

2. The OpenCV-based inference code by UNeedCryDear

3. 2022.09.29 update: deploying yolov5 and yolov7 instance segmentation models in C++ with OpenCV (6)

The complete code has been uploaded; clone it and it should work directly. Comments are welcome in the comment section, and if you find this useful, please star my GitHub repo. Thank you!
