[Model Deployment] Getting Started Tutorial (3): PyTorch to ONNX Detailed Explanation


In the previous two tutorials, we walked through deploying a first model and worked around some common difficulties in model deployment. Starting with this tutorial, we introduce ONNX step by step, from the basics to the details. ONNX is currently one of the most important intermediate representations for model deployment; once you learn its technical details, you can avoid a large class of deployment problems.
Converting a PyTorch model to ONNX often takes just a single call to torch.onnx.export. The interface of this function looks simple, but its use comes with many "hidden rules". In this tutorial, we describe in detail the principles and caveats of converting a PyTorch model to an ONNX model. We also introduce the operator correspondence between PyTorch and ONNX, to show you how to handle the operator-support problems you may hit during conversion.

A preview: in the following articles, we will continue with how to support more ONNX operators in PyTorch, so that you can walk the PyTorch-to-ONNX deployment route end to end; we will also introduce ONNX itself, along with common approaches to modifying and debugging ONNX models, so that you can solve most ONNX-related deployment problems on your own. Stay tuned~

Breaking Down torch.onnx.export


In this section, we take a detailed look at the PyTorch-to-ONNX conversion function, torch.onnx.export. We hope this helps you use the conversion interface more flexibly and, by understanding how it works, handle its errors more effectively (due to compatibility constraints in deployment, this function often reports errors on complex models).


Computation Graph Export Methods

TorchScript is a format for serializing and optimizing PyTorch models. During this conversion, a torch.nn.Module model is turned into a TorchScript torch.jit.ScriptModule model. TorchScript is now also often used as an intermediate representation in its own right. We have a detailed introduction to TorchScript elsewhere (Interpretation of TorchScript (1): Getting to know TorchScript for the first time - Zhihu); here we introduce it only to explain how PyTorch models are converted to ONNX.
The model that torch.onnx.export actually requires is a torch.jit.ScriptModule. There are two ways to export the computation graph and turn an ordinary PyTorch model into such a TorchScript model: trace and script. If an ordinary PyTorch model (torch.nn.Module) is passed to torch.onnx.export, it is exported via tracing by default. This process is shown in the figure below:

Recall from our first tutorial: tracing can only export a static graph by actually running the model once, so it cannot capture control flow (such as loops) in the model; scripting records all control flow correctly. Let's use the following code to see the difference between these two export methods:

import torch 
 
class Model(torch.nn.Module): 
    def __init__(self, n): 
        super().__init__() 
        self.n = n 
        self.conv = torch.nn.Conv2d(3, 3, 3) 
 
    def forward(self, x): 
        for i in range(self.n): 
            x = self.conv(x) 
        return x 
 
 
models = [Model(2), Model(3)] 
model_names = ['model_2', 'model_3'] 
 
for model, model_name in zip(models, model_names): 
    dummy_input = torch.rand(1, 3, 10, 10) 
    dummy_output = model(dummy_input) 
    model_trace = torch.jit.trace(model, dummy_input) 
    model_script = torch.jit.script(model) 
 
    # Tracing is equivalent to calling torch.onnx.export(model, ...) directly
    torch.onnx.export(model_trace, dummy_input, f'{model_name}_trace.onnx', example_outputs=dummy_output) 
    # Scripting requires calling torch.jit.script first
    torch.onnx.export(model_script, dummy_input, f'{model_name}_script.onnx', example_outputs=dummy_output) 

In this code, we define a model with a loop; the parameter n controls how many times the input tensor is convolved. We then create two models, with n=2 and n=3, and export each of them with both tracing and scripting.
It is worth mentioning that since the two models here (model_trace, model_script) are already TorchScript models, the export function does not need to run them again. (When tracing, the model is run once during the torch.jit.trace call; when scripting, the model does not need to be actually run.) The dummy_input and dummy_output arguments are only used to obtain the types and shapes of the input and output tensors.
Running the above code, we visualize the four resulting ONNX files with Netron:

First look at the ONNX model structures obtained by tracing. As we can see, models with different n produce ONNX models with different structures.

With scripting, the final ONNX model uses a Loop node to represent the loop. This way, even for different n, the ONNX models share the same structure.

The PyTorch version used in this article is 1.8.2. According to feedback, other versions of PyTorch may get different results.

Since inference engines support static graphs better, we usually do not need to explicitly convert a PyTorch model into a TorchScript model when deploying; we can export the PyTorch model directly with torch.onnx.export, which traces it. Knowing this part of the pipeline mainly helps you determine, when a conversion error is reported, whether the problem lies in the PyTorch-to-TorchScript stage.

Parameter explanation

Now that we understand how the conversion function works, let's look at its main parameters in detail. We focus, from an application perspective, on how each parameter should be set in different deployment scenarios, rather than enumerating every possible setting. For the full API documentation of this function, refer to torch.onnx ‒ PyTorch 1.11.0 documentation.
torch.onnx.export is defined in torch/onnx/__init__.py as follows:

def export(model, args, f, export_params=True, verbose=False, training=TrainingMode.EVAL, 
           input_names=None, output_names=None, aten=False, export_raw_ir=False, 
           operator_export_type=None, opset_version=None, _retain_param_name=True, 
           do_constant_folding=True, example_outputs=None, strip_doc_string=True, 
           dynamic_axes=None, keep_initializers_as_inputs=None, custom_opsets=None, 
           enable_onnx_checker=True, use_external_data_format=False): 

The first three parameters are required: the model, the model input, and the name of the exported ONNX file. We are already familiar with these. Below we focus on some of the commonly used optional parameters.

export_params

Whether to store the model weights in the exported file. Generally, an intermediate representation contains two kinds of information: the model structure and the model weights. These can be stored in the same file or in separate files; ONNX uses a single file to record both structure and weights.
For deployment we generally leave this parameter at its default of True. If the ONNX file is used to transfer a model between frameworks (such as PyTorch to TensorFlow) rather than for deployment, it can be set to False.
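For example, here is a minimal sketch of a weight-free export (the file name is chosen here just for illustration):

import torch 
 
model = torch.nn.Conv2d(3, 3, 3) 
dummy_input = torch.rand(1, 3, 10, 10) 
 
# Structure only: the exported file stores no weight values, so it is much 
# smaller, but it cannot be run for inference as-is. 
torch.onnx.export(model, dummy_input, 'conv_no_params.onnx', export_params=False) 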

input_names, output_names

Set the names of the input and output tensors. If they are not set, simple auto-generated names (such as numbers) are assigned.
Every input and output tensor of an ONNX model has a name. Many inference engines, when running an ONNX file, take input data as "name: tensor value" pairs and return output data keyed by the names of the output tensors. Tensor-related settings (such as adding dynamic dimensions) also refer to tensors by name.
In an actual deployment pipeline, we should always set the names of the input and output tensors and make sure ONNX and the inference engine use the same set of names.
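As a quick sanity check (a sketch; the model and file name here are invented for illustration), the onnx package can read back the names actually recorded in the exported file:

import onnx 
import torch 
 
model = torch.nn.Conv2d(3, 3, 3) 
torch.onnx.export(model, torch.rand(1, 3, 10, 10), 'conv.onnx', 
                  input_names=['in'], output_names=['out']) 
 
onnx_model = onnx.load('conv.onnx') 
# Expect ['in'] and ['out']; some older PyTorch versions may also list 
# the weights among the graph inputs. 
print([i.name for i in onnx_model.graph.input]) 
print([o.name for o in onnx_model.graph.output]) 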

opset_version

Which version of the ONNX operator set to target during conversion; the default is 9. The operator correspondence between PyTorch and ONNX is covered in detail later in this article.
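For example, bicubic interpolation (used as an example later in this article) requires operator set 11, so a sketch of pinning the version looks like this:

import torch 
 
model = torch.nn.Upsample(scale_factor=2, mode='bicubic') 
dummy_input = torch.rand(1, 3, 10, 10) 
 
# With opset_version=9 this export should fail, because the Resize form 
# needed for cubic mode only exists from operator set 11 onwards. 
torch.onnx.export(model, dummy_input, 'bicubic.onnx', opset_version=11) 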

dynamic_axes

Specifies which dimensions of the input and output tensors are dynamic.
For efficiency, ONNX by default assumes that all tensors involved in computation are static (their shapes never change). In practice, however, we often want the model's input tensors to be dynamic, especially for fully convolutional models that impose no restrictions on input shape. We therefore need to explicitly declare which dimensions of the input and output tensors can vary in size.
Let's look at an example dynamic_axes setup:

import torch 
 
class Model(torch.nn.Module): 
    def __init__(self): 
        super().__init__() 
        self.conv = torch.nn.Conv2d(3, 3, 3) 
 
    def forward(self, x): 
        x = self.conv(x) 
        return x 
 
 
model = Model() 
dummy_input = torch.rand(1, 3, 10, 10) 
model_names = ['model_static.onnx', 
               'model_dynamic_0.onnx', 
               'model_dynamic_23.onnx'] 
 
dynamic_axes_0 = { 
    'in' : [0], 
    'out' : [0] 
} 
dynamic_axes_23 = { 
    'in' : [2, 3], 
    'out' : [2, 3] 
} 
 
torch.onnx.export(model, dummy_input, model_names[0], 
                  input_names=['in'], output_names=['out']) 
torch.onnx.export(model, dummy_input, model_names[1], 
                  input_names=['in'], output_names=['out'], dynamic_axes=dynamic_axes_0) 
torch.onnx.export(model, dummy_input, model_names[2], 
                  input_names=['in'], output_names=['out'], dynamic_axes=dynamic_axes_23) 

First, we export three ONNX models: one with no dynamic dimensions, one with dimension 0 dynamic, and one with dimensions 2 and 3 dynamic.
In this code, we specify the dynamic dimensions with a list, for example:

dynamic_axes_0 = { 
    'in' : [0], 
    'out' : [0] 
} 


Since ONNX requires every dynamic dimension to have a name, writing it this way triggers a UserWarning telling us that names will be assigned automatically when dynamic dimensions are specified with a list. One way to name the dynamic dimensions explicitly is:

dynamic_axes_0 = { 
    'in' : {0: 'batch'}, 
    'out' : {0: 'batch'} 
} 

Since we do not do anything further with the dynamic dimensions in this code, simply specifying them with a list is sufficient.
Next, let's use the following code to see what dynamic dimensions actually do:

import onnxruntime 
import numpy as np 
 
origin_tensor = np.random.rand(1, 3, 10, 10).astype(np.float32) 
mult_batch_tensor = np.random.rand(2, 3, 10, 10).astype(np.float32) 
big_tensor = np.random.rand(1, 3, 20, 20).astype(np.float32) 
 
inputs = [origin_tensor, mult_batch_tensor, big_tensor] 
exceptions = dict() 
 
for model_name in model_names: 
    for i, input in enumerate(inputs): 
        try: 
            ort_session = onnxruntime.InferenceSession(model_name) 
            ort_inputs = {'in': input} 
            ort_session.run(['out'], ort_inputs) 
        except Exception as e: 
            exceptions[(i, model_name)] = e 
            print(f'Input[{i}] on model {model_name} error.') 
        else: 
            print(f'Input[{i}] on model {model_name} succeed.') 

We exported the computation graphs with a tensor of shape (1, 3, 10, 10). Now let's try inputs of shapes (1, 3, 10, 10), (2, 3, 10, 10), and (1, 3, 20, 20), run the models with ONNX Runtime, see which combinations report errors, and save the corresponding error messages. The output should look like this:

Input[0] on model model_static.onnx succeed. 
Input[1] on model model_static.onnx error. 
Input[2] on model model_static.onnx error. 
Input[0] on model model_dynamic_0.onnx succeed. 
Input[1] on model model_dynamic_0.onnx succeed. 
Input[2] on model model_dynamic_0.onnx error. 
Input[0] on model model_dynamic_23.onnx succeed. 
Input[1] on model model_dynamic_23.onnx error. 
Input[2] on model model_dynamic_23.onnx succeed. 

As we can see, the input with the unchanged shape (1, 3, 10, 10) succeeds on every model. Inputs with a different batch size (dimension 0) or height and width (dimensions 2 and 3) fail unless the corresponding dynamic dimensions are set. The error messages tell us which dimensions mismatch; for example, the following code inspects the error for input[1] on model_static.onnx:

print(exceptions[(1, 'model_static.onnx')]) 
 
# output 
# [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: in for the following indices index: 0 Got: 2 Expected: 1 Please fix either the inputs or the model. 

The message tells us that dimension 0 of the input named 'in' does not match: this dimension should have length 1, but our input has length 2. When we hit similar errors in actual deployment, setting dynamic dimensions solves the problem.
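As a sketch continuing the two snippets above (the new file name is arbitrary, and the variables model, dummy_input, and mult_batch_tensor are assumed to still be in scope), re-exporting with a named dynamic batch dimension makes the failing input pass:

# Re-export the static model with a named dynamic batch dimension. 
torch.onnx.export(model, dummy_input, 'model_static_fixed.onnx', 
                  input_names=['in'], output_names=['out'], 
                  dynamic_axes={'in': {0: 'batch'}, 'out': {0: 'batch'}}) 
 
# The (2, 3, 10, 10) input that failed before now runs without error. 
ort_session = onnxruntime.InferenceSession('model_static_fixed.onnx') 
print(ort_session.run(['out'], {'in': mult_batch_tensor})[0].shape)  # (2, 3, 8, 8) 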

Usage Suggestions

With the previous sections, we have basically mastered part of the implementation principle of torch.onnx.export and how to set its parameters, which is enough to convert simple models. In practice, however, this function hides plenty of pitfalls. Here, our model deployment team shares some experience accumulated in real projects.

Making the model behave differently when exported to ONNX

Sometimes we want the model to behave differently when exported to ONNX: one code path when running inference directly in PyTorch, another in the exported ONNX model. For example, we can put some post-processing logic inside the model to simplify the code around it. To achieve this, we can use torch.onnx.is_in_onnx_export(), which returns True only while torch.onnx.export() is executing. Here is an example:

import torch 
 
class Model(torch.nn.Module): 
    def __init__(self): 
        super().__init__() 
        self.conv = torch.nn.Conv2d(3, 3, 3) 
 
    def forward(self, x): 
        x = self.conv(x) 
        if torch.onnx.is_in_onnx_export(): 
            x = torch.clip(x, 0, 1) 
        return x 


Here, we clamp the output tensor to [0, 1] only while the model is being exported. Using is_in_onnx_export does let us conveniently add deployment-related logic to the code. However, such code is unfriendly to developers and users who only care about model training, and the intrusive deployment logic reduces the overall readability of the code. Moreover, is_in_onnx_export can only be "patched in" at every single place that needs deployment logic, which makes unified management difficult. We will later introduce how to avoid these problems with MMDeploy's rewriting mechanism.

Operations that interrupt tensor tracing

Is trace-based export from PyTorch to ONNX a cure-all? Not quite: if the model performs certain "out-of-bounds" operations, tracing turns intermediate results that should depend on the input into constants, so the exported ONNX model differs from the original one. Here is an example of such a "trace break":

import torch 
 
class Model(torch.nn.Module): 
    def __init__(self): 
        super().__init__() 
 
    def forward(self, x): 
        x = x * x[0].item() 
        return x, torch.Tensor([i for i in x]) 
 
model = Model()       
dummy_input = torch.rand(10) 
torch.onnx.export(model, dummy_input, 'a.onnx') 

Exporting this model produces a pile of warnings telling us that the converted model may be incorrect. No wonder: in this model we used .item() to turn a torch tensor into a plain Python value, and we also tried to iterate over a torch tensor and build a new tensor from a list. Such logic, which converts between tensors and ordinary Python values, leaves the final ONNX model incorrect.
On the other hand, we can also exploit this property: provided correctness is guaranteed, we can deliberately turn intermediate results of the model into constants. This trick is often used to make a model static, that is, to turn all tensor shapes in the model into constants; see the sketch below. In future tutorials, we will detail these "advanced" operations in deployment examples.
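Here is a minimal sketch of this shape-freezing idea (the class name is invented here): casting a traced size to int deliberately breaks the trace at that point, so the shape becomes a constant in the exported graph.

import torch 
 
class StaticFlatten(torch.nn.Module): 
    def forward(self, x): 
        # int() turns the traced size into a plain Python number, so the 
        # exported Reshape gets a constant target shape; a TracerWarning 
        # reminds us the graph is now specialized to the export-time shape. 
        return x.reshape(int(x.shape[0]), -1) 
 
torch.onnx.export(StaticFlatten(), torch.rand(1, 3, 10, 10), 'flatten_static.onnx') 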

Use tensors as input (PyTorch version < 1.9.0)

As shown in our first tutorial, on older PyTorch (< 1.9.0), torch.onnx.export() throws an error when plain Python values are fed into the model. For compatibility, we still recommend using tensors as model inputs when converting a model.
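A minimal sketch of this advice (the model and names here are invented for illustration): wrap plain Python numbers as tensors before exporting.

import torch 
 
class Scale(torch.nn.Module): 
    def forward(self, x, s): 
        return x * s 
 
x = torch.rand(1, 3, 10, 10) 
# Passing the plain number 2.0 as the second input may raise an error on 
# PyTorch < 1.9.0; wrapping it in a tensor works across versions. 
torch.onnx.export(Scale(), (x, torch.tensor(2.0)), 'scale.onnx', 
                  input_names=['x', 's'], output_names=['out']) 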

PyTorch's operator support for ONNX

Once torch.onnx.export() is being called correctly, the most likely remaining problem in PyTorch-to-ONNX conversion is operator incompatibility. Here we introduce how to determine whether a PyTorch operator is supported in ONNX, to help you triage errors when they occur. The concrete methods for adding operator support are covered in a later article.
When converting an ordinary torch.nn.Module model, PyTorch runs forward inference using the tracing method and integrates the operators it encounters into a computation graph; along the way, it also translates each operator it meets into an ONNX operator. During this translation, one of the following situations may occur:

  • This operator can be translated into an ONNX operator one-to-one.
  • This operator has no direct corresponding operator in ONNX and will be translated into one or more ONNX operators.
  • The operator does not define rules for translation into ONNX, and an error is reported.

So how do we check the correspondence between PyTorch operators and ONNX operators? Since PyTorch operators are aligned to ONNX, let's first look at how ONNX operators are defined, and then at the operator mappings defined in PyTorch.


ONNX Operator Documentation

ONNX operator definitions can be found in the official operator documentation. This document is very important: we have to consult it whenever we run into any problem related to ONNX operators.

The most important part of this document is the operator change table at its beginning. The first column of the table is the operator name, and the second column lists the operator-set versions in which the operator changed, i.e. the same version numbers as the opset_version of torch.onnx.export mentioned earlier. From an operator's first change record, we can tell from which version it is supported; from its latest change record that is less than or equal to our opset_version, we can tell the operator's definition rules under the current operator set.

By clicking a link in the table, we can view an operator's input and output specifications and usage examples. For example, the figure above shows the definition of Relu in ONNX: Relu takes one input and produces one output, and the input and output have the same type, both tensors.
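Besides the web page, the onnx Python package can answer the same question programmatically. A sketch (assuming the onnx package is installed; output values to the best of our knowledge):

from onnx import defs 
 
# Which change record of Relu is in effect under operator set 9? 
# since_version should report 6, i.e. the opset-6 definition applies. 
schema = defs.get_schema('Relu', 9) 
print(schema.name, schema.since_version) 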

Mapping of PyTorch to ONNX operator

In PyTorch, all definitions related to ONNX live in the torch/onnx directory, as shown in the figure below:

Among these, symbolic_opset{n}.py (the symbolic files) contains what PyTorch newly added to support the n-th version of the ONNX operator set. As mentioned before, bicubic interpolation is supported from version 11. Let's take it as an example of how to find an operator's mapping.
First, search for "bicubic" in the torch/onnx folder; the interpolation shows up in the definition file for version 11:

From there, following the code's call chain, we drill down step by step to the underlying ONNX mapping function:

upsample_bicubic2d = _interpolate("upsample_bicubic2d", 4, "cubic") 
 
-> 
 
def _interpolate(name, dim, interpolate_mode): 
    return sym_help._interpolate_helper(name, dim, interpolate_mode) 
 
-> 
 
def _interpolate_helper(name, dim, interpolate_mode): 
    def symbolic_fn(g, input, output_size, *args): 
        ... 
 
    return symbolic_fn 

Finally, in symbolic_fn we can see how the interpolation operator is mapped to several ONNX operators. Each g.op call defines one ONNX operator. For example, the Resize operator is emitted like this:

return g.op("Resize", 
                input, 
                empty_roi, 
                empty_scales, 
                output_size, 
                coordinate_transformation_mode_s=coordinate_transformation_mode, 
                cubic_coeff_a_f=-0.75,  # only valid when mode="cubic" 
                mode_s=interpolate_mode,  # nearest, linear, or cubic 
                nearest_mode_s="floor")  # only valid when mode="nearest" 

By looking up the definition of the Resize operator in the ONNX operator documentation mentioned above, we can understand the meaning of each of these parameters. In the same way, we can look up the parameter meanings of other ONNX operators and learn, step by step, how parameters in PyTorch are passed into each ONNX operator.
Having mastered how to query the PyTorch-ONNX correspondence, in practice we can pin an opset_version when calling torch.onnx.export() and consult the corresponding PyTorch symbolic file whenever we hit a problem. If an operator does not exist there, or its mapping does not meet our needs, we may have to work around it with other operators, or write a custom operator.

Summary

In this tutorial, we systematically introduced the principles of converting PyTorch models to ONNX. We first focused on the most frequently used function, torch.onnx.export, and then gave a method for querying PyTorch's support for ONNX operators. We hope that with this article you can successfully convert most models that do not require new operators, and can effectively locate the cause when operator problems do occur. Specifically, after reading this article you should understand the following:

  • The difference between tracing and scripting when exporting a computation graph that contains control flow.
  • How to set input_names, output_names, and dynamic_axes in torch.onnx.export().
  • How to use torch.onnx.is_in_onnx_export() to make a model behave differently during ONNX export.
  • How to query ONNX operator documentation ( https://github.com/onnx/onnx/blob/main/docs/Operators.md ).
  • How to query PyTorch's support for new features of a certain ONNX version.
  • How to determine whether PyTorch supports a certain ONNX operator, and how that support is implemented.

The knowledge in this installment is fairly abstract; does it feel a bit "thin"? Don't worry: in the next tutorial, we will use concrete code examples to introduce several methods for adding PyTorch-to-ONNX operator support, clearing more obstacles from the PyTorch-to-ONNX road. Stay tuned!

Practice Exercises

  1. The Asinh operator appeared in the 9th ONNX operator set. How does PyTorch support it in the version-9 symbolic file?
  2. The BitShift operator appeared in the 11th ONNX operator set. How does PyTorch support it in the version-11 symbolic file?
  3. In the first tutorial, we said that (as of operator set 11) PyTorch does not support specifying a dynamic scale factor in interpolation. In symbolic_fn of torch.onnx.symbolic_helper._interpolate_helper, which parameter of the Resize operator does this factor correspond to? How should we modify this parameter?

The answers to the exercises will be revealed in the next tutorial~ Let's try it together~

Welcome to try it out in MMDeploy~

https://github.com/open-mmlab/mmdeploy​github.com/open-mmlab/mmdeploy

If our sharing brings you some help, welcome to like, collect and pay attention, love~

Series Portal

OpenMMLab: Interpretation of TorchScript (1): Getting to know TorchScript for the first time

OpenMMLab: Interpretation of TorchScript (2): Torch jit tracer implementation analysis

OpenMMLab: Interpretation of TorchScript (3): subgraph rewriter in jit

OpenMMLab: Interpretation of TorchScript (4): Alias Analysis in Torch jit

OpenMMLab: Introduction to Model Deployment (1): Introduction to Model Deployment

OpenMMLab: Introduction to Model Deployment Tutorial (2): Solving the Problems in Model Deployment

OpenMMLab: Introductory Tutorial for Model Deployment (3): PyTorch to ONNX Detailed Explanation

OpenMMLab: Model Deployment Tutorial (4): Support more ONNX operators in PyTorch

OpenMMLab: Introduction to Model Deployment Tutorial (5): Modification and Debugging of ONNX Model
