[Deep Learning Framework Format Conversion] [GPU] Converting a PyTorch Model to ONNX Format, Explained in Detail [Getting Started]

Tip: The blogger has collected many posts by experienced authors and personally verified that they work. These notes are shared here for everyone to study and discuss.


Preface

Neural network models are usually trained in a deep learning framework (PyTorch, TensorFlow, Caffe, etc.). These frameworks are large, carry many environment-specific dependencies, and are poorly suited to installation in production environments. ONNX is supported by most frameworks, so converting a model to ONNX makes it easier to integrate. Deep learning models also need a large amount of computing power to run in real time, so inference efficiency has to be optimized. Running the converted model with an ONNX inference engine can bring a stable speed-up.
An ONNX model can also be converted to TensorRT (GPU) or OpenVINO (CPU) format for inference, which can improve speed further.
Series learning directory:
[CPU] Detailed explanation of converting a PyTorch model to ONNX format
[GPU] Detailed explanation of converting a PyTorch model to ONNX format
[ONNX model] Rapid deployment
[ONNX model] Multi-threaded rapid deployment
[ONNX model] Calling ONNX from OpenCV


PyTorch model environment construction (GPU)

The blogger uses PFNet, a camouflaged object segmentation (COS) algorithm, as the worked example: [PFNet-pytorch code].
The goal is to run the PFNet camouflaged object segmentation model in PyTorch, then deploy it to the ONNX Runtime inference engine.
The blogger installed Anaconda on Windows 10 and built a PyTorch environment for running the PFNet model (GPU version tutorial).

# Create the virtual environment
conda create -n pytorch2onnx_gpu python=3.10 -y
# Activate the environment
conda activate pytorch2onnx_gpu
# Clone the GitHub source into a suitable folder, then cd into it (a proxy may be needed)
git clone https://github.com/Mhaiyang/CVPR2021_PFNet.git
# Install PyTorch (GPU build, CUDA 11.8)
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
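
Before going further, it can help to confirm that the GPU build of PyTorch was actually installed. The following is a minimal sketch and not part of the original tutorial: `describe_torch_env` is a hypothetical helper name, and it relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`. It returns a plain dict so that it also runs (reporting `installed: False`) in an environment without PyTorch.

```python
import importlib.util


def describe_torch_env():
    """Summarize the local PyTorch install (hypothetical helper, not from PFNet)."""
    info = {
        "installed": False,
        "version": None,
        "cuda_build": None,
        "cuda_available": False,
    }
    if importlib.util.find_spec("torch") is None:
        return info  # PyTorch is not installed in this environment
    import torch

    info["installed"] = True
    info["version"] = torch.__version__
    info["cuda_build"] = torch.version.cuda  # None for CPU-only wheels
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    print(describe_torch_env())
```

If `cuda_build` is `None` or `cuda_available` is `False`, the CPU-only wheel was probably installed or the GPU driver is not visible, and the later GPU steps are unlikely to work as described.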

The blogger does not explain the code itself in detail here and focuses only on how to use it, that is, on running the test (inference) script. The author of the source code provides pre-trained weights and test data, which the blogger has collected on [Baidu Cloud, extraction code: a660] for download.
Download resnet50-19c8e357.pth and place it in CVPR2021_PFNet\backbone\resnet.

Download PFNet.pth and place it in CVPR2021_PFNet.

Download the test datasets CAMO_TestingDataset.zip, CHAMELEON_TestingDataset.zip, and COD10K_TestingDataset.zip, then unzip and rename them under CVPR2021_PFNet\data\test.

To test with the pre-trained weights, modify infer.py and config.py as follows:

# 1. In infer.py, keep only the datasets that exist under data\test
to_test = OrderedDict([
                       ('CHAMELEON', chameleon_path),
                       ('CAMO', camo_path),
                       ('COD10K', cod10k_path),
                       # ('NC4K', nc4k_path)
                       ])

# 2. In infer.py, select a GPU that exists on this machine
device_ids = [1]
# The blogger uses device_ids = [0]

# 3. In config.py, change
#    datasets_root = '../data/NEW'
# to
#    datasets_root = './data'

The results can be viewed in CVPR2021_PFNet\results.

This completes the PyTorch model environment (GPU).


Install onnx and onnxruntime (GPU)

Install onnx and onnxruntime in the Anaconda virtual environment. The onnxruntime-gpu version must match the installed CUDA and cuDNN versions. For details, refer to the official compatibility instructions.

The blogger's setup is Windows 10 + CUDA 11.8 + cuDNN 8.7.0, which corresponds to onnxruntime-gpu==1.15.0.

import torch
# Query the CUDA version PyTorch was built with
print(torch.version.cuda)
# Query the cuDNN version
print(torch.backends.cudnn.version())

# Activate the environment
conda activate pytorch2onnx_gpu
# Install onnx
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple onnx
# Install the GPU build of ONNX Runtime
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple onnxruntime-gpu==1.15.0
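
After installation, a quick check of which execution providers ONNX Runtime reports can save debugging time later. This is a minimal sketch rather than part of the original post: `cuda_provider_ready` is a hypothetical helper name, and it uses only `onnxruntime.get_available_providers()`. Note the assumption: a provider appearing in this list indicates that the installed package supports it, but a session may still fall back to the CPU if the CUDA or cuDNN libraries cannot be loaded at run time.

```python
import importlib.util


def cuda_provider_ready():
    """Return (providers, has_cuda) for the local ONNX Runtime install.

    Hypothetical helper for illustration; not part of the PFNet code.
    """
    if importlib.util.find_spec("onnxruntime") is None:
        return [], False  # ONNX Runtime is not installed
    import onnxruntime as ort

    providers = ort.get_available_providers()
    return providers, "CUDAExecutionProvider" in providers


if __name__ == "__main__":
    providers, has_cuda = cuda_provider_ready()
    print("Available providers:", providers)
    print("CUDA provider listed:", has_cuda)
```

If `CUDAExecutionProvider` is not listed, the CPU-only `onnxruntime` package may have been installed instead of `onnxruntime-gpu`.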

pytorch2onnx

Create a new file named pytorch2onnx.py in the CVPR2021_PFNet directory with the following content, then run it:

import onnx
import torch
from PFNet import PFNet

backbone_path = './backbone/resnet/resnet50-19c8e357.pth'
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
example = torch.randn(1, 3, 416, 416).to(device)    # dummy input, shape [1, 3, 416, 416]
print(example.dtype)
model = PFNet(backbone_path)                        # build the PFNet network

# Load the trained weights; map_location keeps this working on CPU-only machines
model.load_state_dict(torch.load(r'PFNet.pth', map_location=device))
model = model.to(device)                            # move the model to CUDA or CPU
model.eval()

torch.onnx.export(model, example, r"PFNet.onnx")        # export the model
model_onnx = onnx.load(r"PFNet.onnx")                   # load the saved ONNX model
onnx.checker.check_model(model_onnx)                    # check that the model is well-formed
print(onnx.helper.printable_graph(model_onnx.graph))    # print the ONNX graph

The PyTorch model has now been converted to an ONNX model.
Next, set aside all PyTorch-related dependencies and run the test using only the ONNX model. The code below is adapted from infer.py in the source repository.
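
A structural check with `onnx.checker` does not confirm that the exported model produces the same numbers as the original. One common way to gain confidence is to feed the same input to both models and compare the outputs with a tolerance. The sketch below shows only the comparison step, using NumPy and synthetic arrays. `outputs_match` is a hypothetical helper, and the tolerances `rtol=1e-3` and `atol=1e-5` are assumptions, not values from the original post. In practice, `reference` would be the PyTorch outputs converted to NumPy arrays and `candidate` the list returned by the ONNX Runtime session.

```python
import numpy as np


def outputs_match(reference, candidate, rtol=1e-3, atol=1e-5):
    """Compare two lists of arrays, output by output (hypothetical helper)."""
    if len(reference) != len(candidate):
        return False
    return all(
        np.allclose(np.asarray(r), np.asarray(c), rtol=rtol, atol=atol)
        for r, c in zip(reference, candidate)
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Four synthetic maps standing in for the four model outputs
    ref = [rng.random((1, 1, 416, 416), dtype=np.float32) for _ in range(4)]
    close = [r + 1e-6 for r in ref]  # tiny numerical drift
    far = [r + 0.1 for r in ref]     # clearly different values
    print(outputs_match(ref, close))  # True
    print(outputs_match(ref, far))    # False
```

Small differences between frameworks are expected, which is why an exact equality test is usually too strict here.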

import os
import onnxruntime as ort
import numpy as np
from collections import OrderedDict
from config import *
from PIL import Image
from numpy import mean
import time
import datetime

def composed_transforms(image):
    mean = np.array([0.485, 0.456, 0.406])  # per-channel mean
    std = np.array([0.229, 0.224, 0.225])  # per-channel standard deviation
    # transforms.Resize uses bilinear interpolation, so do the same here
    resized_image = image.resize((args['scale'], args['scale']), resample=Image.BILINEAR)
    # The ONNX model input must be a NumPy array whose dtype matches what the model expects
    resized_image = np.array(resized_image)
    normalized_image = (resized_image/255.0 - mean) / std
    return np.round(normalized_image.astype(np.float32), 4)

def check_mkdir(dir_name):
    if not os.path.exists(dir_name):
        os.makedirs(dir_name)

to_test = OrderedDict([
                       # ('CHAMELEON', chameleon_path),
                       # ('CAMO', camo_path),
                       ('COD10K', cod10k_path),
                       ])
args = {
    'scale': 416,
    'save_results': True
}

def main():
    # Directory where the predictions are saved
    results_path = './results2'
    exp_name = 'PFNet'
    providers = ["CUDAExecutionProvider"]
    ort_session = ort.InferenceSession("PFNet.onnx", providers=providers)  # create an inference session
    input_name = ort_session.get_inputs()[0].name
    # The model has four outputs
    output_names = [output.name for output in ort_session.get_outputs()]
    start = time.time()
    for name, root in to_test.items():
        time_list = []
        image_path = os.path.join(root, 'image')
        if args['save_results']:
            check_mkdir(os.path.join(results_path, exp_name, name))
        img_list = [os.path.splitext(f)[0] for f in os.listdir(image_path) if f.endswith('jpg')]
        for idx, img_name in enumerate(img_list):
            img = Image.open(os.path.join(image_path, img_name + '.jpg')).convert('RGB')
            w, h = img.size
            # Resize and normalize the original image
            img_var = composed_transforms(img)
            # Reorder the array from [h, w, c] to [c, h, w]
            img_var = np.transpose(img_var, (2, 0, 1))
            # Add a batch dimension: [c, h, w] => [batch_size, c, h, w]
            img_var = np.expand_dims(img_var, axis=0)
            start_each = time.time()
            prediction = ort_session.run(output_names, {input_name: img_var})
            time_each = time.time() - start_each
            time_list.append(time_each)
            # Drop the extra batch dimension; converting NumPy back to PIL also needs a dtype change.
            # Multiplying by 255 replaces PyTorch's to_pil
            prediction = (np.squeeze(prediction[3])*255).astype(np.uint8)
            if args['save_results']:
                Image.fromarray(prediction).resize((w, h)).convert('L').save(
                    os.path.join(results_path, exp_name, name, img_name + '.png'))
        print(exp_name)
        print("{}'s average Time Is : {:.3f} s".format(name, mean(time_list)))
        print("{}'s average Time Is : {:.1f} fps".format(name, 1 / mean(time_list)))
    end = time.time()
    print("Total Testing Time: {}".format(str(datetime.timedelta(seconds=int(end - start)))))


if __name__ == '__main__':
    main()
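
The array handling around the inference call is easy to get wrong, so the sketch below isolates it using NumPy only, with no model or image files. It is an illustration under stated assumptions, not the author's code: `to_model_input` and `to_mask` are hypothetical names, the dummy arrays stand in for a resized 416x416 RGB image and for the model's final output, and the `np.clip` call is an added safeguard that the script above does not use.

```python
import numpy as np

MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)  # per-channel mean
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)   # per-channel std


def to_model_input(rgb_uint8):
    """[H, W, C] uint8 image -> [1, C, H, W] float32 batch."""
    x = rgb_uint8.astype(np.float32) / 255.0
    x = (x - MEAN) / STD              # broadcasts over the channel axis
    x = np.transpose(x, (2, 0, 1))    # [H, W, C] -> [C, H, W]
    return np.expand_dims(x, axis=0)  # [C, H, W] -> [1, C, H, W]


def to_mask(prediction):
    """[1, 1, H, W] float map in [0, 1] -> [H, W] uint8 mask."""
    p = np.clip(np.squeeze(prediction), 0.0, 1.0)  # clip is an added safeguard
    return (p * 255).astype(np.uint8)


if __name__ == "__main__":
    dummy_image = np.zeros((416, 416, 3), dtype=np.uint8)  # stand-in for a resized image
    batch = to_model_input(dummy_image)
    print(batch.shape, batch.dtype)  # (1, 3, 416, 416) float32

    dummy_pred = np.full((1, 1, 416, 416), 0.5, dtype=np.float32)
    mask = to_mask(dummy_pred)
    print(mask.shape, mask.dtype, mask[0, 0])  # (416, 416) uint8 127
```

Keeping the input as `float32` matters because ONNX Runtime rejects inputs whose dtype differs from the type declared in the model.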

The results can be viewed in CVPR2021_PFNet\results2.


Summary

This series covers converting PyTorch models to ONNX format in both CPU and GPU modes as simply and thoroughly as possible, and this article covers the GPU case. Later posts will explain the deployment of ONNX models further, based on the blogger's own study and needs.

Origin blog.csdn.net/yangyu0515/article/details/132900276