Reference article: Optimize and deploy a YOLOv5 model with OpenVINO™ 2022.2 on Viper Canyon (openvino)
Previous article: Auto-aiming based on YOLOv5 (with code), RedWhiteLuo's Blog
I previously used PyTorch to run inference on an NVIDIA GPU, but in practice it is best to leave the discrete GPU unoccupied, so we can use the otherwise idle integrated GPU instead. My machine is a laptop with an i7-12700H and an RTX 3060 Laptop GPU.
With Intel's OpenVINO toolkit we can set the inference device to the integrated GPU, which leaves the discrete GPU free.
The inference workflow itself was covered in earlier posts, so I won't repeat it here. This post mainly records the problems I ran into; I hope you can help me improve the code logic, and feel free to discuss.
It is worth mentioning that, according to OpenVINO's official benchmark_app results on this machine, asynchronous inference reaches nearly 90 FPS, and OpenVINO provides an official asynchronous inference API (multiprocessing).
The conclusion first: converting numpy arrays to Tensors takes too long, so the end-to-end frame rate stays below 60 FPS, but the discrete GPU is indeed left free.
OpenVINO environment deployment:
Download Intel® Distribution of OpenVINO™ Toolkit
pip install openvino-dev==2022.3.0
YOLOv5 model conversion:
python export.py --weights yolov5s.pt --include onnx [convert yolov5s.pt to ONNX format]
mo --input_model yolov5s.onnx --data_type FP16 [convert the ONNX file to IR format (.xml and .bin)]
OpenVINO performance test:
benchmark_app -m yolov5s.xml -d AUTO:-CPU -hint cumulative_throughput
benchmark_app -m yolov5s.xml -d AUTO:-CPU -api sync
benchmark_app -m yolov5s.xml -d AUTO:-CPU -api async
Operation process (12700H):
Pre-processing: [win32api screen capture = 5.5 ms], [dimension conversion + image scaling = 2 ms], [numpy to Tensor = 9.5 ms], 17 ms in total
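The dimension-conversion and scaling step above can be sketched roughly as follows. This is not the author's exact code: the function name, the 640×640 input size (YOLOv5's default), and the nearest-neighbour resize via numpy indexing (the real pipeline would more likely use cv2.resize with letterboxing) are all assumptions:

```python
import numpy as np

def preprocess(frame: np.ndarray, size: int = 640) -> np.ndarray:
    """Resize an HWC uint8 frame to the model input and convert to NCHW float32."""
    h, w = frame.shape[:2]
    ys = np.arange(size) * h // size            # source row for each output row
    xs = np.arange(size) * w // size            # source column for each output column
    resized = frame[ys[:, None], xs[None, :]]   # nearest-neighbour resize -> (size, size, 3)
    chw = resized.transpose(2, 0, 1)            # HWC -> CHW
    blob = chw[np.newaxis].astype(np.float32) / 255.0  # add batch dim, normalize to [0, 1]
    return blob

# Example: a dummy 1080p capture
blob = preprocess(np.zeros((1080, 1920, 3), dtype=np.uint8))
print(blob.shape)  # (1, 3, 640, 640)
```

The float conversion and division allocate a new array each frame, which is part of why the numpy-to-Tensor step dominates the pre-processing budget.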
Inference: feeding the data through the network until it produces output takes about 11 ms
Post-processing: parsing the network output, labeling the results, etc. takes about 4 ms
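The core of that post-processing step is non-maximum suppression over the candidate boxes. A minimal greedy NMS in numpy, assuming boxes in x1,y1,x2,y2 format (a sketch, not the repository's implementation):

```python
import numpy as np

def nms(boxes: np.ndarray, scores: np.ndarray, iou_thr: float = 0.45) -> list:
    """Greedy non-maximum suppression; boxes are (N, 4) in x1,y1,x2,y2 order."""
    order = scores.argsort()[::-1]  # highest-confidence first
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        # Intersection of the top box with all remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_o = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_o - inter + 1e-9)
        order = order[1:][iou < iou_thr]  # drop boxes overlapping the kept one
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=np.float32)
scores = np.array([0.9, 0.8, 0.7], dtype=np.float32)
print(nms(boxes, scores))  # [0, 2] — the second box overlaps the first and is suppressed
```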
A complete pass therefore takes 32 ms, but it can be compressed to about 20 ms through the provided asynchronous inference interface
In fact, the OpenVINO pipeline can also be quantized to INT8 with the POT tool, but since the numpy-to-Tensor conversion dominates the frame time, INT8 quantization does little for the frame rate.
Note: this source code does not enable the auto-aiming part, but the function is already included; you can enable it yourself if needed.
YOLO/Auto_Aiming_OV_async.py at main · RedWhiteLuo/YOLO (github.com)