The official PyTorch website covers this topic in its tutorial "Exporting a Model from PyTorch to ONNX and Running It Using ONNX Runtime"; interested readers can refer to it.
1. Preparation
- Semantic segmentation model: model.py
- Trained weight file: model.pth / model.pt
- Package versions: onnx==1.12.0, onnxruntime==1.15.1
import torch.onnx
from models import PPLiteSeg
import onnxruntime as ort
from PIL import Image
import numpy as np
import torchvision.transforms as transforms
2. Create PyTorch model
First, create the PyTorch model and load the .pth weight file:
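# Create the model
torch_model = PPLiteSeg()
# Load the checkpoint
model_state_dict = torch.load("checkpoint/model.pth")
# If the model was trained with DDP, the keys of the state dict carry a "module." prefix that must be removed.
# Build a new dict without the prefix (this assumes the checkpoint stores the weights under the 'model' key)
new_state_dict = {k.replace('module.', ''): v for k, v in model_state_dict['model'].items()}
# Load the weights into the model
torch_model.load_state_dict(new_state_dict, strict=True)
print("\033[1;31mModel weights loaded...\033[0m")
"""
The raw output of the model has not been post-processed: its shape is [N, num_classes, H, W].
We therefore add post-processing to the model so that each output has shape [N, 1, H, W].
"""
# Add the post-processing step to the model
torch_model = WrappedModel(torch_model)
# Switch the model to inference mode (this step is required!)
torch_model.eval()
# Create an input Tensor
x = torch.randn(1, 3, 512, 512, requires_grad=True)
torch_out = torch_model(x)
print(torch_out[0].shape)  # torch.Size([1, 1, 512, 512])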
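The handling of the DDP "module." prefix mentioned in the comments above can be written as a small helper. This is only a sketch: the name strip_module_prefix is hypothetical, and plain Python lists stand in for weight tensors so that it runs without a checkpoint. Unlike k.replace('module.', ''), it removes the prefix only at the start of a key.

```python
def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict whose keys no longer start with prefix."""
    cleaned = {}
    for key, value in state_dict.items():
        # Strip only a leading prefix; keys without it are copied unchanged.
        if key.startswith(prefix):
            key = key[len(prefix):]
        cleaned[key] = value
    return cleaned


# Stand-in for a DDP checkpoint: plain lists instead of tensors.
ddp_state = {"module.conv.weight": [1.0, 2.0], "module.conv.bias": [0.0]}
plain_state = strip_module_prefix(ddp_state)
print(sorted(plain_state))  # ['conv.bias', 'conv.weight']
```

With a real checkpoint, the returned dict would be what is passed to load_state_dict.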
The WrappedModel code is:
import torch

class WrappedModel(torch.nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        outs = self.model(x)
        new_outs = []
        for out in outs:
            out = torch.nn.functional.softmax(out, dim=1)  # probabilities along the channel dimension
            label = torch.argmax(out, dim=1).to(dtype=torch.int32)  # index of the most likely class
            label = torch.unsqueeze(label, 1)
            # torch.max returns two tensors: the maximum values and their indices
            max_score = torch.max(out, dim=1)[0]  # highest probability
            max_score = torch.unsqueeze(max_score, 1)
            new_outs.append(label)
            new_outs.append(max_score)
        # The return value is a list with len == 2
        return new_outs
At this point, the PyTorch model has been created and the trained weights have been loaded correctly.
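To make the shapes produced by the post-processing concrete, here is a rough NumPy sketch of the same computation (softmax over the channel axis, then the index and the value of the maximum). It is an illustration, not the exported model: the function name postprocess and the random input are assumptions.

```python
import numpy as np


def postprocess(logits):
    """Map logits [N, C, H, W] to (label, score), each of shape [N, 1, H, W]."""
    # Numerically stable softmax over the channel axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    # Index of the most likely class and its probability, keeping a channel axis of size 1.
    label = probs.argmax(axis=1).astype(np.int32)[:, None]
    score = probs.max(axis=1, keepdims=True)
    return label, score


logits = np.random.default_rng(0).normal(size=(2, 4, 8, 8)).astype(np.float32)
label, score = postprocess(logits)
print(label.shape, score.shape)  # (2, 1, 8, 8) (2, 1, 8, 8)
```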
3. Convert to ONNX model and save
# Export the model
torch.onnx.export(
    torch_model,                       # model being run
    x,                                 # model input (or a tuple for multiple inputs)
    "model.onnx",                      # where to save the model (can be a file or file-like object)
    export_params=True,                # store the trained parameter weights inside the model file
    opset_version=11,                  # the ONNX opset version to export the model to
    do_constant_folding=True,          # whether to execute constant folding for optimization
    input_names=['input'],             # the model's input names
    output_names=['label', 'score'],   # the model's output names
    dynamic_axes={
        'input': {0: 'B'},             # variable length axes
        'output': {0: 'B'},            # NOTE: 'output' is not one of output_names
    },
)
print("\033[1;31mONNX model conversion finished.\033[0m")
The parameters of torch.onnx.export are described below:
- torch_model: the PyTorch model instance to be exported.
- x: the input data for the model. It can be a single input Tensor or a tuple of input Tensors, depending on the model's signature.
- "model.onnx": the path where the exported ONNX model is saved. The file name and path can be changed.
- export_params=True: a Boolean indicating whether to export the model's parameter weights. If True, the parameters are stored in the ONNX file together with the model for use at inference time. If False, only the model structure is exported.
- opset_version=11: the ONNX operator set (opset) version used for the export, here version 11. Different opset versions support different operators, so choose one that is compatible with both your model and your runtime.
- do_constant_folding=True: a Boolean indicating whether to apply constant folding. If True, the exporter pre-computes operations whose inputs are all constants and stores the results as constants, which simplifies the exported graph.
- input_names: a list of names for the model's input Tensors. In this example the single input is named "input".
- output_names: a list of names for the model's output Tensors. In this example the outputs are named "label" and "score".
- dynamic_axes: a dictionary specifying which axes are dynamic, that is, of variable length (usually the batch axis). Its keys are input or output names. In this example axis 0 of "input" is named "B", so the batch size can vary. Note that the second key, "output", matches neither "label" nor "score", so it does not name the batch axis of the outputs; this is presumably why the output shapes have to be adjusted afterwards in section 4.1.
By using these parameters, you can control how PyTorch models are exported to ONNX format and configure them according to your needs.
Notes:
- Because the output of our model is a list of length 2, output_names should contain two names. dynamic_axes indicates which axes are dynamic. Here we set the batch dimension to dynamic, that is, the batch size of the ONNX model's input is arbitrary rather than fixed.
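Since the keys of dynamic_axes have to be input or output names, one way to keep them in sync is to build the dictionary from the name lists themselves. A minimal sketch, assuming only the batch axis (axis 0) should be dynamic; the helper name batch_dynamic_axes is hypothetical:

```python
def batch_dynamic_axes(input_names, output_names, axis_name="B"):
    """Mark axis 0 of every named input and output as dynamic."""
    return {name: {0: axis_name} for name in [*input_names, *output_names]}


dynamic_axes = batch_dynamic_axes(["input"], ["label", "score"])
print(dynamic_axes)
# {'input': {0: 'B'}, 'label': {0: 'B'}, 'score': {0: 'B'}}
```

The resulting dictionary could be passed as the dynamic_axes argument in place of a hand-written one.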
The complete code is as follows:
import torch
import numpy as np
import torch.onnx
from models import PPLiteSeg

class WrappedModel(torch.nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        outs = self.model(x)
        new_outs = []
        for out in outs:
            out = torch.nn.functional.softmax(out, dim=1)  # probabilities along the channel dimension
            label = torch.argmax(out, dim=1).to(dtype=torch.int32)  # index of the most likely class
            label = torch.unsqueeze(label, 1)
            # torch.max returns two tensors: the maximum values and their indices
            max_score = torch.max(out, dim=1)[0]  # highest probability
            max_score = torch.unsqueeze(max_score, 1)
            new_outs.append(label)
            new_outs.append(max_score)
        # The return value is a list with len == 2
        return new_outs

if __name__ == "__main__":
    # Create the model
    torch_model = PPLiteSeg()
    # Load the checkpoint
    model_state_dict = torch.load("checkpoint/model.pth")
    # If the model was trained with DDP, the keys of the state dict carry a "module." prefix that must be removed.
    # Build a new dict without the prefix (this assumes the checkpoint stores the weights under the 'model' key)
    new_state_dict = {k.replace('module.', ''): v for k, v in model_state_dict['model'].items()}
    # Load the weights into the model
    torch_model.load_state_dict(new_state_dict, strict=True)
    print("\033[1;31mModel weights loaded...\033[0m")
    """
    The raw output of the model has not been post-processed: its shape is [N, num_classes, H, W].
    We therefore add post-processing to the model so that each output has shape [N, 1, H, W].
    """
    # Add the post-processing step to the model
    torch_model = WrappedModel(torch_model)
    # Switch the model to inference mode (this step is required!)
    torch_model.eval()
    # Create an input Tensor
    x = torch.randn(1, 3, 512, 512, requires_grad=True)
    torch_out = torch_model(x)
    # Export the model
    torch.onnx.export(
        torch_model,                       # model being run
        x,                                 # model input (or a tuple for multiple inputs)
        "model.onnx",                      # where to save the model (can be a file or file-like object)
        export_params=True,                # store the trained parameter weights inside the model file
        opset_version=11,                  # the ONNX opset version to export the model to
        do_constant_folding=True,          # whether to execute constant folding for optimization
        input_names=['input'],             # the model's input names
        output_names=['label', 'score'],   # the model's output names
        dynamic_axes={
            'input': {0: 'B'},             # variable length axes
            'output': {0: 'B'},            # NOTE: 'output' is not one of output_names
        },
    )
    print("\033[1;31mONNX model conversion finished.\033[0m")
4. Modify ONNX
4.1 Modify the input and output shape
After saving the ONNX model, we can open the .onnx file with a tool called Netron and inspect the graph.
We can see that ArgMax in the ONNX file corresponds to the output label and ReduceMax to the output score, indicating that our model conversion is correct. Looking at the right-hand side, the shape of input is [B, 3, 512, 512], which is what we want. The outputs should logically carry the same dynamic batch dimension B, but they do not. To make the later conversion to TRT (TensorRT) easier, we modify the outputs. The modified code is as follows:
import onnx

def show_inp_and_oup_info(model, modify=False):
    input_info = model.graph.input
    print("Model inputs:")
    for info in input_info:
        print(info.name, info.type)
    output_info = model.graph.output
    print("Model outputs:")
    for info in output_info:
        print(info.name, info.type)

if __name__ == "__main__":
    # Path of the input ONNX model
    model_path = "model.onnx"
    # Path of the output ONNX model
    output_path = "retype_model.onnx"
    # Load the ONNX model
    model = onnx.load(model_path)
    show_inp_and_oup_info(model, modify=False)
    # Find the input tensor and modify it
    # for input_info in model.graph.input:
    #     if input_info.name in ['x', 'input']:
    #         # Modify the shape of the input tensor
    #         input_info.type.tensor_type.shape.dim[0].dim_param = "B"
    # Modify the shape of the output tensors
    for output_info in model.graph.output:
        if output_info.name in ["label", "score"]:
            output_info.type.tensor_type.shape.dim[0].dim_param = "B"
            output_info.type.tensor_type.shape.dim[2].dim_value = 512
            output_info.type.tensor_type.shape.dim[3].dim_value = 512
    show_inp_and_oup_info(model, modify=True)
    # Save the modified model
    onnx.save(model, output_path)
4.2 Modify name
If we want to modify the input and output names, we can also use the following script:
import argparse
import sys
import onnx

def parse_arguments():
    parser = argparse.ArgumentParser()
    parser.add_argument('--model', required=True, help='Path of the input ONNX model.')
    parser.add_argument('--origin_names', required=True, nargs='+', help='The original names you want to modify.')
    parser.add_argument('--new_names', required=True, nargs='+',
                        help='The new names to change to; the number of new_names must match the number of origin_names.')
    parser.add_argument('--save_file', required=True, help='Path to save the new ONNX model.')
    return parser.parse_args()

if __name__ == '__main__':
    args = parse_arguments()
    model = onnx.load(args.model)
    # Collect every tensor name in the graph: graph inputs plus all node outputs
    output_tensor_names = set()
    for ipt in model.graph.input:
        output_tensor_names.add(ipt.name)
    for node in model.graph.node:
        for out in node.output:
            output_tensor_names.add(out)
    # Validate the arguments before touching the graph
    for origin_name in args.origin_names:
        if origin_name not in output_tensor_names:
            print("[ERROR] Cannot find tensor name '{}' in onnx model graph.".format(origin_name))
            sys.exit(-1)
    if len(set(args.origin_names)) < len(args.origin_names):
        print("[ERROR] There is a duplicate name in --origin_names, which is not allowed.")
        sys.exit(-1)
    if len(args.new_names) != len(args.origin_names):
        print("[ERROR] The number of --new_names must be the same as the number of --origin_names.")
        sys.exit(-1)
    if len(set(args.new_names)) < len(args.new_names):
        print("[ERROR] There is a duplicate name in --new_names, which is not allowed.")
        sys.exit(-1)
    for new_name in args.new_names:
        if new_name in output_tensor_names:
            print("[ERROR] The new name '{}' already exists in the onnx model, which is not allowed.".format(new_name))
            sys.exit(-1)
    # Rename graph inputs
    for i, ipt in enumerate(model.graph.input):
        if ipt.name in args.origin_names:
            idx = args.origin_names.index(ipt.name)
            model.graph.input[i].name = args.new_names[idx]
    # Rename the inputs and outputs of every node
    for i, node in enumerate(model.graph.node):
        for j, ipt in enumerate(node.input):
            if ipt in args.origin_names:
                idx = args.origin_names.index(ipt)
                model.graph.node[i].input[j] = args.new_names[idx]
        for j, out in enumerate(node.output):
            if out in args.origin_names:
                idx = args.origin_names.index(out)
                model.graph.node[i].output[j] = args.new_names[idx]
    # Rename graph outputs
    for i, out in enumerate(model.graph.output):
        if out.name in args.origin_names:
            idx = args.origin_names.index(out.name)
            model.graph.output[i].name = args.new_names[idx]
    onnx.checker.check_model(model)
    onnx.save(model, args.save_file)
    print("[Finished] The new model saved in {}.".format(args.save_file))
    print("[DEBUG INFO] The inputs of new model: {}".format([x.name for x in model.graph.input]))
    print("[DEBUG INFO] The outputs of new model: {}".format([x.name for x in model.graph.output]))
Use the command as follows:
python rename_onnx_model_name.py \
--model model.onnx \
--origin_names x y z \
--new_names x1 y1 z1 \
--save_file new_model.onnx
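The checks the script performs before renaming can be isolated in one function, which makes them easy to try without an ONNX file. The sketch below is an illustration under assumptions: build_rename_map is a hypothetical name, the set of existing tensor names is passed in directly, and errors are raised as ValueError instead of calling sys.exit.

```python
def build_rename_map(origin_names, new_names, existing_names):
    """Validate a rename request and return {old_name: new_name}."""
    if len(origin_names) != len(new_names):
        raise ValueError("origin_names and new_names must have the same length")
    if len(set(origin_names)) < len(origin_names):
        raise ValueError("duplicate entry in origin_names")
    if len(set(new_names)) < len(new_names):
        raise ValueError("duplicate entry in new_names")
    for name in origin_names:
        if name not in existing_names:
            raise ValueError(f"tensor '{name}' not found in the graph")
    for name in new_names:
        if name in existing_names:
            raise ValueError(f"tensor '{name}' already exists in the graph")
    return dict(zip(origin_names, new_names))


existing = {"input", "label", "score"}
print(build_rename_map(["label", "score"], ["mask", "confidence"], existing))
# {'label': 'mask', 'score': 'confidence'}
```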
5. Test the effect before and after conversion
There are two ways to check the results before and after conversion:
- Approach 1: compare the outputs of the two models numerically (a check by machine)
- Approach 2: convert the outputs of both models into images (a check by eye)
5.1 Compare the differences in the output of the two models
This is the method used in the PyTorch tutorial:
# compare ONNX Runtime and PyTorch results
np.testing.assert_allclose(torch_res[0].numpy(), onnx_res[0], rtol=1e-03, atol=1e-05)
np.testing.assert_allclose(torch_res[1].detach().numpy(), onnx_res[1], rtol=1e-03, atol=1e-05)
print("\033[1;44mExported model has been tested with ONNXRuntime, and the result looks good!\033[0m")
Since our model outputs both label and score, both need to be tested.
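For reference, np.testing.assert_allclose passes when |actual - desired| <= atol + rtol * |desired| holds for every element. The sketch below uses made-up arrays (not real model outputs) to show a comparison that passes, one that fails, and an exact comparison, which is an option for integer outputs such as label.

```python
import numpy as np

desired = np.array([0.9, 0.5, 0.1], dtype=np.float32)
close = desired + np.float32(1e-5)   # well inside the tolerance
far = desired + np.float32(1e-2)     # outside the tolerance

# Passes: every difference is below atol + rtol * |desired|.
np.testing.assert_allclose(close, desired, rtol=1e-03, atol=1e-05)

# Fails: a difference of 0.01 exceeds the tolerance, so AssertionError is raised.
try:
    np.testing.assert_allclose(far, desired, rtol=1e-03, atol=1e-05)
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True
print(mismatch_detected)  # True

# Integer class labels carry no rounding error, so they can be compared exactly.
label_a = np.array([[0, 1], [2, 1]], dtype=np.int32)
label_b = np.array([[0, 1], [2, 1]], dtype=np.int64)
np.testing.assert_array_equal(label_a, label_b)
```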
5.2 Directly convert the output of the two models into images
This part does not give a full demonstration; only the necessary functions are provided.
5.2.1 Load images and preprocess them
def load_test_img(image_path, target_size=(512, 512)):
    # Load the image and make sure it has 3 channels
    image = Image.open(image_path).convert("RGB")
    # Resize the image to the target size
    image = image.resize(target_size, Image.BILINEAR)
    # Use torchvision.transforms to convert the PIL image to a PyTorch tensor
    transform = transforms.Compose([
        transforms.ToTensor(),  # convert to a tensor
        transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),  # normalize
    ])
    # Apply the transform and add the batch dimension: [1, C, H, W]
    image_tensor = transform(image).unsqueeze(0)
    return image_tensor
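For readers who want to run the ONNX model without torchvision, the same preprocessing can be approximated in NumPy: ToTensor scales a uint8 image to [0, 1] and moves the channels first, and Normalize with mean 0.5 and std 0.5 maps that range to [-1, 1]. This is a sketch under those assumptions; the function name preprocess and the dummy image are illustrative.

```python
import numpy as np


def preprocess(image_hwc_uint8):
    """uint8 image [H, W, 3] -> float32 array [1, 3, H, W] scaled to [-1, 1]."""
    x = image_hwc_uint8.astype(np.float32) / 255.0   # like ToTensor: scale to [0, 1]
    x = (x - 0.5) / 0.5                              # like Normalize(mean=0.5, std=0.5)
    x = np.transpose(x, (2, 0, 1))                   # HWC -> CHW
    return x[None]                                   # add the batch dimension


dummy = np.zeros((4, 6, 3), dtype=np.uint8)
dummy[..., 0] = 255                                  # pure red image
batch = preprocess(dummy)
print(batch.shape, batch.min(), batch.max())  # (1, 3, 4, 6) -1.0 1.0
```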
5.2.2 Load ONNX model
def create_onnx_model(ckpt_path):
    import onnxruntime as ort
    # Create an ONNX Runtime inference session from the .onnx file
    ort_session = ort.InferenceSession(ckpt_path)
    print("\033[1;31mONNX model created...\033[0m")
    return ort_session
5.2.3 Running the ONNX model
onnx_res = onnx_model.run(None, {"input": test_img.numpy()})
A few points need explaining here:
- onnx_model.run: the method that runs the ONNX model. onnx_model is the ONNX Runtime inference session created above.
- None: this argument is the list of output names to fetch. None means that no names are specified, so ONNX Runtime returns all outputs.
- {"input": test_img.numpy()}: the dictionary of input data. The keys are input names and the values are the input arrays. Here the input name is "input", the name given at export time.
- test_img.numpy(): ONNX Runtime works with NumPy arrays rather than PyTorch Tensors, so the Tensor is converted first. The model was exported with the input shape [B, 3, 512, 512], so the batch dimension must be kept: the array passed in has shape [1, 3, 512, 512]. Writing [test_img.squeeze(0)] instead removes the batch dimension and then wraps the result in a list of length 1, which describes the same [1, 3, 512, 512] shape in a more roundabout way.
After this call, onnx_res holds the outputs of the ONNX model. The result is a list (remember, a list) in which each element corresponds to one model output; with the export above, onnx_res[0] is label and onnx_res[1] is score. Depending on the application, onnx_res may need further processing before it can be used.
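Input shapes are easy to get wrong at this step, so it can help to check an array against the exported signature before calling run. A small sketch, with NumPy arrays standing in for the image tensor; matches_signature and SIGNATURE are hypothetical names, and None marks the dynamic batch axis.

```python
import numpy as np


def matches_signature(shape, signature):
    """True if shape fits signature, where None marks a dynamic axis."""
    return len(shape) == len(signature) and all(
        want is None or want == got for got, want in zip(shape, signature))


SIGNATURE = (None, 3, 512, 512)          # [B, 3, 512, 512] with a dynamic batch axis

batched = np.zeros((1, 3, 512, 512), dtype=np.float32)
squeezed = batched.squeeze(0)            # (3, 512, 512): batch axis removed
rewrapped = np.asarray([squeezed])       # wrapping in a list adds the axis back

for name, arr in [("batched", batched), ("squeezed", squeezed), ("rewrapped", rewrapped)]:
    print(name, arr.shape, matches_signature(arr.shape, SIGNATURE))
```

Only the squeezed array fails the check; the batched and re-wrapped arrays have the same shape.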
5.2.4 Save model results as pictures
def save_torch_res(torch_res, suffix):
    # Convert the PyTorch tensor to a NumPy array
    torch_res_numpy = torch_res[0].squeeze(0).numpy()
    # Print the shape for inspection
    print(np.shape(torch_res_numpy))
    # If the shape is not [H, W], drop the remaining leading axis
    if torch_res_numpy.shape[0] == 1:
        torch_res_numpy = torch_res_numpy[0]
    # Create a grayscale image
    gray_image = Image.fromarray((torch_res_numpy * 255).astype('uint8'), mode='L')
    # Convert the grayscale image to a pseudo-color image (the color mapping can be changed as needed)
    pseudo_color_image = gray_image.convert('P', palette=Image.ADAPTIVE, colors=256)
    # Save the pseudo-color image
    pseudo_color_image.save("results/pytorch_output_pseudo_color_image.png")
    print("Pseudo-color image saved to 'results/pytorch_output_pseudo_color_image.png'")

def save_onnx_res(onnx_res, suffix):
    # Convert the ONNX result to a NumPy array
    onnx_res_numpy = np.array(onnx_res[0])
    # If the shape is not [H, W], drop the two leading axes
    if onnx_res_numpy.shape[0] == 1:
        onnx_res_numpy = np.squeeze(onnx_res_numpy, axis=0)
        onnx_res_numpy = np.squeeze(onnx_res_numpy, axis=0)
    # Create a grayscale image
    gray_image = Image.fromarray((onnx_res_numpy * 255).astype('uint8'), mode='L')
    # Convert the grayscale image to a pseudo-color image (the color mapping can be changed as needed)
    pseudo_color_image = gray_image.convert('P', palette=Image.ADAPTIVE, colors=256)
    # Save the pseudo-color image
    pseudo_color_image.save("results/onnx_output_pseudo_color_image.png")
    print("Pseudo-color image saved to 'results/onnx_output_pseudo_color_image.png'")
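Multiplying the label map by 255 suits a binary mask (classes 0 and 1); with more classes, values such as 2 * 255 no longer fit into uint8. One common alternative is to look each class index up in a color table. The sketch below assumes a made-up 3-class palette; PALETTE and colorize are illustrative names, not part of the original code.

```python
import numpy as np

# Hypothetical 3-class palette (RGB); replace it with the colors of your dataset.
PALETTE = np.array([[0, 0, 0], [128, 0, 0], [0, 128, 0]], dtype=np.uint8)


def colorize(label_map, palette=PALETTE):
    """Map an integer label map [H, W] to an RGB image array [H, W, 3]."""
    label_map = np.asarray(label_map)
    if label_map.min() < 0 or label_map.max() >= len(palette):
        raise ValueError("label outside the palette range")
    return palette[label_map]   # indexing looks up one color per pixel


labels = np.array([[0, 1], [2, 1]], dtype=np.int32)
rgb = colorize(labels)
print(rgb.shape, rgb.dtype)  # (2, 2, 3) uint8
```

The resulting uint8 array of shape [H, W, 3] can be turned into an RGB image with Image.fromarray(rgb) and saved as before.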