TensorRT Inference for Handwritten Digit Classification (2)

Series Article Directory

(1) Build and train a model with PyTorch
(2) Convert the .pth format to the .onnx format



Foreword

  In the previous section, we built the model, trained it, and saved it as a .pth file. In this section, we introduce how to convert the .pth file to the general-purpose ONNX format, along with some precautions.


1. What is ONNX?

  ONNX is an open file format designed for machine learning to store trained models. It enables different artificial intelligence frameworks (such as PyTorch and MXNet) to store model data in the same format and interoperate. The ONNX specification and code are developed jointly by companies such as Microsoft, Amazon, Facebook, and IBM, and are hosted as an open-source project on GitHub.

2. How to convert .pth to .onnx

  In the previous section, we used the PyTorch framework to train the model and saved the model parameters as a .pth file. What we need to do now is convert this into the unified ONNX format for storage. An ONNX file stores not only the weights of the neural network but also the model's structure, the inputs and outputs of each layer in the network, and some other auxiliary information.
  PyTorch already provides a function, torch.onnx.export(), that can convert the generated .pth file directly into an ONNX file.

The code is as follows (example):

from model import Net
import torch

pth_path = "./model.pth"
onnx_path = "./model.onnx"

net = Net()
net.load_state_dict(torch.load(pth_path, map_location='cpu'))
net.eval()

test_arr = torch.randn(1, 1, 28, 28)  # dummy input with the same shape as a real sample
input_names = ['input']
output_names = ['output']
torch.onnx.export(
    net,
    test_arr,
    onnx_path,
    verbose=False,
    opset_version=11,
    input_names=input_names,
    output_names=output_names,
    # dynamic_axes={"input": {3: "width"}}  # makes the width (W) dimension dynamic; adapt for other dimensions, or leave commented out if dynamic inference is not needed
)
print('->> Model conversion succeeded!')
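
After exporting, it is worth confirming that the file actually carries the opset we requested. The following optional sketch (not part of the original script) reads the exported file back with the onnx package:

import onnx

m = onnx.load("./model.onnx")
# each entry pairs an operator domain with its opset version; the default domain should report version 11 here
for oi in m.opset_import:
    print(oi.domain or "ai.onnx", oi.version)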

1. Code analysis

1. Load parameters

pth_path="./model.pth"
net = Net()
net.load_state_dict(torch.load(pth_path, map_location='cpu'))

The map_location argument specifies how storage locations are remapped when the parameters are loaded; typical values are 'cpu' and 'cuda'.
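
For example, a checkpoint saved from a GPU machine can still be loaded on a CPU-only machine. A minimal sketch, reusing the model.pth path from above:

import torch

# remap tensors that were saved from CUDA memory onto the CPU
state_dict = torch.load("./model.pth", map_location='cpu')
# map_location='cuda:0' would instead place them on the first GPU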

net.eval()  # switch to inference mode: layers such as Dropout and BatchNorm behave differently in training, so call this before exporting

2. Export ONNX file

torch.onnx.export(
    net,
    test_arr,
    onnx_path,
    verbose=False,
    opset_version=11,
    input_names=input_names,
    output_names=output_names,
    # dynamic_axes={"input": {3: "width"}}  # makes the width (W) dimension dynamic; adapt for other dimensions, or leave commented out if dynamic inference is not needed
)

Here is the full signature of torch.onnx.export(), with notes on the key parameters:

torch.onnx.export(model,  # the model with its parameters loaded
				  args,   # a set of example inputs; the model is run once with them so the corresponding computation graph can be traced and saved in ONNX format
				  f, 
				  export_params=True,   # if False, the trained parameters are not exported and you get an untrained model
				  verbose=False,        # if True, prints information about the exported file
				  training=<TrainingMode.EVAL: 0>, 
				  input_names=None,     # names of the input nodes in the computation graph
				  output_names=None,    # names of the output nodes in the computation graph
				  operator_export_type=<OperatorExportTypes.ONNX: 0>, 
				  opset_version=None,   # ONNX operator set version; the default differs between PyTorch versions
				  do_constant_folding=True, 
			      dynamic_axes=None, 
				  keep_initializers_as_inputs=None, 
				  custom_opsets=None, 
				  export_modules_as_functions=False)

The dynamic_axes parameter deserves a special explanation. Before setting dynamic_axes, open the exported computation graph in Netron:
[Figure: Netron view of the exported graph with a fixed input shape]
You can see that the input is fixed at 1 (batch_size) x 1 (channel) x 28 (height) x 28 (width).
Now set dynamic_axes={"input": {3: "width"}} and export again. In the new computation graph, the width dimension of the input (index 3) is no longer 28 but a dynamic value:
[Figure: Netron view of the graph exported with dynamic_axes, where the width dimension is dynamic]
In short, if the size of your input changes dynamically at inference time, you need to set the dynamic_axes parameter here. For ease of explanation, we keep the input at a fixed size in this article.
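
For reference, here is a hedged sketch of an export call that makes both the batch and width dimensions dynamic; it reuses net and test_arr from the export script above, and the axis labels "batch" and "width" are arbitrary names chosen for this illustration:

torch.onnx.export(
    net,
    test_arr,
    "./model_dynamic.onnx",  # separate output path, assumed for this illustration
    opset_version=11,
    input_names=['input'],
    output_names=['output'],
    dynamic_axes={
        "input": {0: "batch", 3: "width"},  # let batch size and width vary at inference time
        "output": {0: "batch"},
    },
)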

2. Check whether the ONNX model is correct

The code is as follows:

import onnx

model_path = './model.onnx'
onnx_model = onnx.load(model_path)
try:
    onnx.checker.check_model(onnx_model)
except onnx.checker.ValidationError as e:
    print("The model is invalid: %s" % e)
else:
    print("The model is valid!")

Import the onnx package; if an ImportError is raised, install it with pip install onnx. The onnx.load function reads an ONNX model, and onnx.checker.check_model() checks whether the model's format is correct; if something is wrong, the function raises an error directly.
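
Beyond the format check, it can help to inspect the graph itself. A small optional sketch that uses onnx.helper to dump a human-readable version of the graph:

import onnx

onnx_model = onnx.load('./model.onnx')
# prints the graph as readable text: inputs, outputs, and each node with its operator type
print(onnx.helper.printable_graph(onnx_model.graph))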

3. ONNX Runtime

  ONNX Runtime is a cross-platform machine learning inference accelerator developed by Microsoft, also known as an "inference engine" (TensorRT is another such engine). ONNX Runtime connects to ONNX directly: it can read and run .onnx files without converting them into any other format. With a PyTorch -> ONNX -> ONNX Runtime deployment pipeline, as soon as we get the .onnx file onto the target device and run the model in ONNX Runtime, deployment is complete. This article, however, uses TensorRT for inference, so here we use ONNX Runtime only to verify that the output of the exported ONNX file is correct. This differs from the check above, which verified the model's format rather than its outputs.
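
As a taste of how direct this is, here is a minimal sketch of running the exported model.onnx on the CPU with a random input (it assumes the onnxruntime package from the next subsection is already installed):

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession('model.onnx', providers=['CPUExecutionProvider'])
dummy = np.random.randn(1, 1, 28, 28).astype(np.float32)  # same shape as the export-time input
outputs = sess.run(None, {'input': dummy})  # 'input' is the name we assigned at export time
print(outputs[0].shape)  # (1, 10): one score per digit class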

1. Install the ONNX Runtime package

We want to run inference on the GPU, so we install onnxruntime-gpu. If the CPU-only onnxruntime package was installed earlier, uninstall it first and then install the GPU build.

pip install onnxruntime-gpu
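
If the CPU-only build is already present, remove it first so the two packages do not conflict:

pip uninstall onnxruntime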

2. Verify that the model is correct

  Our idea here is as follows: use ONNX Runtime to read and run the .onnx file, feed it a test image from the MNIST dataset, and observe whether the output is the correct class; use PyTorch to load and run the .pth file with the same test image and observe its output; then compare whether the two methods predict the same class. The complete code is as follows:

import cv2
import numpy as np
import onnx
import onnxruntime as ort
import torch
from model import Net
from torchvision import transforms

pth_model_path = './model.pth'
onnx_model_path = './model.onnx'



def pytorch_out(input):
    input = input.cuda()
    model = Net()
    model.load_state_dict(torch.load(pth_model_path))
    model.cuda()
    output = model(input)

    output = output.detach().cpu().numpy()
    print(output)
    predict_cla = np.argmax(output)
    print("torch predicts class {}".format(predict_cla))

def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

def onnx_out(input):
    input = to_numpy(input)
    # GPU inference
    sess = ort.InferenceSession(onnx_model_path, providers=['CUDAExecutionProvider'])
    print("Inference device:", ort.get_device())
    io_binding = sess.io_binding()

    input_name = sess.get_inputs()[0].name
    output_name = sess.get_outputs()[0].name
    io_binding.bind_cpu_input(input_name, input)
    io_binding.bind_output(output_name)
    sess.run_with_iobinding(io_binding)
    output = io_binding.copy_outputs_to_cpu()[0]

    probe = np.squeeze(output[0])  # the output probabilities
    print(probe)
    predict_cla = np.argmax(probe)  # the class with the highest probability
    print("onnx predicts class {}".format(predict_cla))

# read the image (the flag 0 loads it as grayscale)
img = cv2.imread('/home/wjq/pytorch_mnist/mnist_data/mnist_test/7/mnist_test_17.png', 0)
# data preprocessing, identical to training
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.1307,), (0.3081,))])

input = transform(img)
input_data = torch.unsqueeze(input, dim=0)  # add the batch dimension -> 1x1x28x28

pytorch_out(input_data)  # the PyTorch path
onnx_out(input_data)     # the ONNX Runtime path

First, pay attention to the main lines of code in the file:

img = cv2.imread('/home/wjq/pytorch_mnist/mnist_data/mnist_test/7/mnist_test_17.png', 0)
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.1307,), (0.3081,))])

input = transform(img)
input_data = torch.unsqueeze(input, dim=0)  # add the batch dimension -> 1x1x28x28

pytorch_out(input_data)
onnx_out(input_data)

We first read the test image with cv2.imread, then apply exactly the same preprocessing as during training; because it is a single image, we also add the batch_size dimension with unsqueeze. The input data is then passed through both the ONNX Runtime path and the PyTorch path so we can observe the outputs.
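
Comparing only the predicted class can mask small numerical differences. A stricter optional check, sketched here under the assumption that the two functions are modified to return their raw numpy score arrays, compares the outputs element-wise:

import numpy as np

def outputs_match(torch_scores, ort_scores, rtol=1e-3, atol=1e-5):
    # element-wise comparison with tolerances is stricter than comparing np.argmax alone
    return np.allclose(torch_scores, ort_scores, rtol=rtol, atol=atol)

If outputs_match() returns True for a handful of test images, the export is numerically faithful, not just label-consistent.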
The output is shown below:
[Figure: console output, with PyTorch and ONNX Runtime printing the same scores and the same predicted class]
You can see that the PyTorch and ONNX Runtime outputs are the same, which confirms that the .onnx file converted from the .pth file is correct.

Summary

  In this section, we introduced how to convert a .pth file to an .onnx file and verified that both the format and the content of the .onnx file are correct. Along the way, we also introduced the ONNX Runtime inference engine. In the next section, we will introduce how to generate a TensorRT engine file from the .onnx file.

Origin: blog.csdn.net/qq_41596730/article/details/128388179