OpenVINO is an excellent CPU inference engine.
We use the Python API here.
You need to compile and install OpenVINO before use; see the getting-started guide:
https://docs.openvinotoolkit.org/latest/openvino_docs_get_started_get_started_windows.html
After installation, run pip install openvino to get the Python bindings.
In fact, the Python API is a wrapper that calls into the C++ OpenVINO runtime, which is why the full toolkit must be compiled and installed even when you only use Python.
Before inference you can convert the model to OpenVINO's IR format, although this step is optional: OpenVINO can read ONNX files directly.
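To confirm both the toolkit and the Python bindings are wired up, a quick check is to list the devices the runtime can see (a minimal sketch; on a typical desktop this prints something like ['CPU']):

```python
try:
    from openvino.inference_engine import IECore
    devices = IECore().available_devices  # e.g. ['CPU']
except ImportError:
    devices = []  # bindings not installed yet
print(devices)
```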
python /opt/intel/openvino_2021/deployment_tools/model_optimizer/mo.py --input_model ctdet_coco_dlav0_512.onnx --output_dir ./ctdet_coco_dlav0_512 --data_type FP16
The ONNX model used here comes from https://download.01.org/opencv/public_models/122019/ctdet_coco_dlav0/
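If you do convert, mo.py emits an IR pair: an .xml file (network topology) and a .bin file (weights). A small sanity check, assuming the --output_dir used in the command above:

```python
from pathlib import Path

# mo.py writes <name>.xml (topology) and <name>.bin (weights) into --output_dir
out_dir = Path("ctdet_coco_dlav0_512")
ir_files = [out_dir / "ctdet_coco_dlav0_512.xml", out_dir / "ctdet_coco_dlav0_512.bin"]
missing = [p.name for p in ir_files if not p.exists()]
print("conversion complete" if not missing else "missing: %s" % missing)
```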
Now you can run inference. First, use the ONNX file directly:
from openvino.inference_engine import IECore
import numpy as np
import cv2
import time

ie = IECore()
model = "ctdet_coco_dlav0_512.onnx"
# model = "ctdet_coco_dlav0_512/ctdet_coco_dlav0_512.xml"
net = ie.read_network(model=model)
input_blob = next(iter(net.input_info))
out_blob = next(iter(net.outputs))
net.batch_size = 16  # batch size
n, c, h, w = net.input_info[input_blob].input_data.shape
print(n, c, h, w)
images = np.ndarray(shape=(n, c, h, w))
for i in range(n):
    image = cv2.imread("123.jpg")
    if image.shape[:-1] != (h, w):
        image = cv2.resize(image, (w, h))
    image = image.transpose((2, 0, 1))  # HWC -> CHW
    images[i] = image
exec_net = ie.load_network(network=net, device_name="CPU")
start = time.time()
res = exec_net.infer(inputs={input_blob: images})
# print(res)
print('infer total time is %.4f s' % (time.time() - start))
Output:
The batch size set in the code above is 16; now try batch size 1 by changing the line to net.batch_size = 1.
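When comparing batch sizes, remember that one infer() call processes the whole batch, so normalize by the number of images. A tiny helper (the name per_image_ms is hypothetical) makes the comparison explicit:

```python
def per_image_ms(total_seconds, batch_size):
    """Convert one infer() wall-clock time into average milliseconds per image."""
    return total_seconds / batch_size * 1000.0

# e.g. if a batch of 16 takes 0.8 s, that is 50 ms per image
print(per_image_ms(0.8, 16))  # → 50.0
```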
Next, run inference with the IR files converted from ONNX:
from openvino.inference_engine import IECore
import numpy as np
import cv2
import time

ie = IECore()
# model = "ctdet_coco_dlav0_512.onnx"
model = "ctdet_coco_dlav0_512/ctdet_coco_dlav0_512.xml"
net = ie.read_network(model=model)
input_blob = next(iter(net.input_info))
out_blob = next(iter(net.outputs))
net.batch_size = 16  # batch size
n, c, h, w = net.input_info[input_blob].input_data.shape
print(n, c, h, w)
images = np.ndarray(shape=(n, c, h, w))
for i in range(n):
    image = cv2.imread("123.jpg")
    if image.shape[:-1] != (h, w):
        image = cv2.resize(image, (w, h))
    image = image.transpose((2, 0, 1))  # HWC -> CHW
    images[i] = image
exec_net = ie.load_network(network=net, device_name="CPU")
start = time.time()
res = exec_net.infer(inputs={input_blob: images})
# print(res)
print('infer total time is %.4f s' % (time.time() - start))
Output:
Likewise, the result with net.batch_size = 1:
You can see that inference with the ONNX file directly takes essentially the same time as inference with the converted IR model.
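A single timed call can be noisy (the first run in particular pays one-off warm-up costs), so the ONNX-vs-IR comparison is more reliable when averaged over several runs. A small benchmarking sketch, with the helper name benchmark being hypothetical and a dummy callable standing in for exec_net.infer:

```python
import time

def benchmark(infer_fn, runs=5, warmup=1):
    """Time a callable: discard warm-up iterations, then average the rest."""
    for _ in range(warmup):
        infer_fn()
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        infer_fn()
        times.append(time.perf_counter() - start)
    return sum(times) / len(times)

# Stand-in workload; replace with lambda: exec_net.infer(inputs={input_blob: images})
avg = benchmark(lambda: sum(range(10000)))
print('avg %.6f s' % avg)
```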