[Model Deployment] Getting Started Tutorial (5): Modification and Debugging of ONNX Models


Table of contents

The underlying implementation of ONNX

ONNX storage format

Structure definition of ONNX

Read and write ONNX models

Construct ONNX model

Read and modify ONNX models

Debug ONNX model

Submodel extraction

Outputting the values of ONNX intermediate nodes

Summary

Series Portal


The model deployment tutorial series continues. In the previous two tutorials, we learned how to convert PyTorch models to ONNX models, and how to add custom operators to PyTorch or ONNX when the expressive power of the native operators falls short. All along, however, we have exported ONNX models through PyTorch and have barely explored how to work with ONNX models on their own.
By now you may have some questions: in what format is an ONNX model stored under the hood? How can we construct an ONNX model with only the ONNX API, without relying on a deep learning framework? Given only an ONNX model and no source code, how do we debug it? Don't worry, this tutorial answers them one by one.
In this tutorial, we focus on ONNX itself, a standard for defining neural networks, and explore the construction, reading, submodel extraction, and debugging of ONNX models. First, we study the underlying representation of ONNX. After that, we use the ONNX API to construct and read models. Finally, we learn how to debug ONNX models with the submodel extraction capability that ONNX provides.

The underlying implementation of ONNX

ONNX storage format

Under the hood, ONNX is defined with Protobuf. Protobuf, short for Protocol Buffers, is a mechanism proposed by Google for representing and serializing data. To use Protobuf, a user first writes a data definition file, then stores data in a binary file according to that definition. You can think of the data definition file as a data class, and the binary file as an instance of that class.
Here is an example of a Protobuf data definition file:

message Person { 
  required string name = 1; 
  required int32 id = 2; 
  optional string email = 3; 
} 

This definition means that the Person data type must contain the name and id fields, and may optionally contain an email field. Based on this definition file, a user can pick a programming language, define a class with the member variables name, id, and email, and use Protobuf to store an instance of this class as a binary file; conversely, given the binary file and the corresponding data definition file, the user can read back an instance of the Person class. For ONNX, the Protobuf data definition files live in its open-source repository. These files define the data type specifications of the models, nodes, and tensors in a neural network, and the binary files are the familiar ".onnx" files: each .onnx file stores all the data of a neural network according to these definitions. Generating an ONNX model directly with Protobuf would be quite tedious. Fortunately, ONNX provides many convenient APIs, so we can construct and read ONNX models without knowing any Protobuf at all.
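To make the "definition file vs. binary file" analogy concrete, here is a minimal sketch showing that a ".onnx" file is nothing more than a serialized Protobuf message: we can parse it with the generated ModelProto class directly, without any helper API. (This sketch assumes an ONNX file already exists on disk, e.g. the linear_func.onnx we will construct later in this tutorial.)

import onnx

# ModelProto is the Protobuf "data class"; the file stores one serialized instance
model = onnx.ModelProto()
with open('linear_func.onnx', 'rb') as f:
    model.ParseFromString(f.read())  # standard Protobuf deserialization
print(model.ir_version)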


Structure definition of ONNX


Before operating on ONNX models with the API, we need to understand how ONNX structures are defined, i.e. how ONNX describes a neural network in its Protobuf definition files.
Recall that a neural network is essentially a computation graph: the nodes of the graph are operators, and the edges are the tensors involved in the computation. From visualizing ONNX models, we know that ONNX records the attribute information of every operator node and stores the tensor information in the input and output fields of the operator nodes. In fact, the structure of an ONNX model can be roughly summarized by the following class diagram:

As shown in the figure, an ONNX model is represented by the ModelProto class. A ModelProto contains metadata such as the version and creator, as well as the computation graph structure graph. The GraphProto class consists of input tensor information, output tensor information, and node information. The tensor information class ValueInfoProto records the tensor's name, basic data type, and shape. The node information class NodeProto records the operator name, the names of the operator's input tensors, and the names of its output tensors.
Let's look at a concrete example. Suppose we have an ONNX model model that describes output=a*x+b; then print(model) produces the following output:

ir_version: 8 
graph { 
  node { 
    input: "a" 
    input: "x" 
    output: "c" 
    op_type: "Mul" 
  } 
  node { 
    input: "c" 
    input: "b" 
    output: "output" 
    op_type: "Add" 
  } 
  name: "linear_func" 
  input { 
    name: "a" 
    type { 
      tensor_type { 
        elem_type: 1 
        shape { 
          dim {dim_value: 10} 
          dim {dim_value: 10} 
        } 
      } 
    } 
  } 
  input { 
    name: "x" 
    type { 
      tensor_type { 
        elem_type: 1 
        shape { 
          dim {dim_value: 10} 
          dim {dim_value: 10} 
        } 
      } 
    } 
  } 
  input { 
    name: "b" 
    type { 
      tensor_type { 
        elem_type: 1 
        shape { 
          dim {dim_value: 10} 
          dim {dim_value: 10} 
        } 
      } 
    } 
  } 
  output { 
    name: "output" 
    type { 
      tensor_type { 
        elem_type: 1 
        shape { 
          dim { dim_value: 10} 
          dim { dim_value: 10} 
        } 
      } 
    } 
  } 
} 
opset_import {version: 15} 

Matching the class diagram above, the information in this model consists of global information such as ir_version and opset_import, plus the graph information. The graph in turn contains a multiplication node, an addition node, three input tensors a, x, b, and one output tensor output. In the next section, we will construct this model with the API and reproduce this output.

Read and write ONNX models

Construct ONNX model


In the previous section, we saw that an ONNX model is organized in the following structure:

  • ModelProto
    • GraphProto
      • NodeProto
      • ValueInfoProto

Now, let's set PyTorch aside and try to construct an ONNX model describing the linear function output=a*x+b entirely with ONNX's Python API. We will build this model bottom-up following the structure above.
First, we can construct the tensor information objects ValueInfoProto with helper.make_tensor_value_info. As the class diagram showed, we need to pass in three pieces of information: the tensor's name, its basic data type, and its shape. In ONNX, input tensors and output tensors are represented in exactly the same way, so here we construct ValueInfoProto objects for the three inputs a, x, b and the one output output in the same fashion, as shown in the code below:

import onnx 
from onnx import helper 
from onnx import TensorProto 
 
a = helper.make_tensor_value_info('a', TensorProto.FLOAT, [10, 10]) 
x = helper.make_tensor_value_info('x', TensorProto.FLOAT, [10, 10]) 
b = helper.make_tensor_value_info('b', TensorProto.FLOAT, [10, 10]) 
output = helper.make_tensor_value_info('output', TensorProto.FLOAT, [10, 10]) 

Afterwards, we need to construct the operator node information NodeProto. This can be done with helper.make_node, which takes three pieces of information: the operator type, the names of the input tensors, and the names of the output tensors. Here we first construct the multiplication node describing c=a*x, then the addition node describing output=c+b, as shown in the code below:

mul = helper.make_node('Mul', ['a', 'x'], ['c']) 
add = helper.make_node('Add', ['c', 'b'], ['output']) 

In computer science, a graph is usually represented by a node set plus an edge set. ONNX, however, cleverly stores the edge information inside the node information, so no separate edge set is needed: if a node's input name matches the output name of a previous node, the two nodes are connected by default. In the example above, the Mul node defines the output c and the Add node defines the input c, so the Mul node and the Add node are connected.
It is precisely because of this implicit definition of edges that ONNX places a requirement on node inputs: every input of a node must be either an input of the whole model or an output of some previous node. If we removed one of the inputs a, x, b from the computation graph (an operation we will demonstrate in code later), or renamed the output of Mul from c to d, the resulting ONNX model would not meet the standard.

An ONNX model that does not meet the standard may not be recognized correctly by inference engines. ONNX provides the API onnx.checker.check_model to determine whether an ONNX model meets the standard.

Next, we use helper.make_graph to construct the computation graph GraphProto. helper.make_graph takes four arguments: the nodes, the graph name, the input tensor information, and the output tensor information. As shown in the code below, we pass in the previously constructed NodeProto objects and ValueInfoProto objects in order.

graph = helper.make_graph([mul, add], 'linear_func', [a, x, b], [output]) 

The node argument of make_graph comes with a requirement: the nodes of the computation graph must be given in topological order.

Topological order is a concept from graph theory concerning directed graphs. If all nodes are traversed in topological order, each node's inputs are guaranteed to be found among the outputs of previously visited nodes (for an ONNX model, we also count the input tensors of the computation graph as "previous outputs").

If you are not familiar with this concept, don't worry. Taking the computation graph we just constructed as an example, the two cases in the figure below give an intuitive picture of topological order.

Here we focus only on the Mul and Add nodes and the edge c between them. In case 1, the nodes are given in the order [Mul, Add]: when traversing Add, its input c can be found among the outputs of the earlier Mul. In case 2, however, the nodes are given in the order [Add, Mul]: Add cannot find its input edge, and the computation graph cannot be constructed successfully. [Mul, Add] is a topological order of this directed graph; [Add, Mul] is not.

Finally, we use helper.make_model to wrap the computation graph GraphProto into the model ModelProto, and the ONNX model is complete. make_model can also record information such as the model's producer and version; for simplicity, we add no extra information here. As shown in the code below:

model = helper.make_model(graph) 

After constructing the model, we use the following three lines of code to check its correctness, print it in text form, and save it to a ".onnx" file. Checking the model with onnx.checker.check_model is necessary here, because onnx.save will store a model regardless of whether it meets the standard, and we certainly don't want to produce a non-conforming model.

onnx.checker.check_model(model) 
print(model) 
onnx.save(model, 'linear_func.onnx') 

If this code executes successfully, the program prints the model's information in text form, which should match the output we showed in the previous section.
Putting it all together, the code for constructing the model with the ONNX Python API is as follows:

import onnx 
from onnx import helper 
from onnx import TensorProto 
 
# input and output 
a = helper.make_tensor_value_info('a', TensorProto.FLOAT, [10, 10]) 
x = helper.make_tensor_value_info('x', TensorProto.FLOAT, [10, 10]) 
b = helper.make_tensor_value_info('b', TensorProto.FLOAT, [10, 10]) 
output = helper.make_tensor_value_info('output', TensorProto.FLOAT, [10, 10]) 
 
# Mul 
mul = helper.make_node('Mul', ['a', 'x'], ['c']) 
 
# Add 
add = helper.make_node('Add', ['c', 'b'], ['output']) 
 
# graph and model 
graph = helper.make_graph([mul, add], 'linear_func', [a, x, b], [output]) 
model = helper.make_model(graph) 
 
# save model 
onnx.checker.check_model(model) 
print(model) 
onnx.save(model, 'linear_func.onnx') 

As usual, we can run the model with ONNX Runtime to see if the model is correct:

import onnxruntime 
import numpy as np 
 
sess = onnxruntime.InferenceSession('linear_func.onnx') 
a = np.random.rand(10, 10).astype(np.float32) 
b = np.random.rand(10, 10).astype(np.float32) 
x = np.random.rand(10, 10).astype(np.float32) 
 
output = sess.run(['output'], {'a': a, 'b': b, 'x': x})[0] 
 
assert np.allclose(output, a * x + b) 

If all goes well, this code raises no errors, which shows that our model is indeed equivalent to computing a * x + b.
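Conversely, we can also check that check_model rejects a model violating the implicit edge rule described earlier. Here is a minimal sketch of exactly that broken-edge case: the Mul output is renamed from c to d, so the Add input c dangles and the checker raises an error (the construction mirrors the code above):

import onnx
from onnx import helper
from onnx import TensorProto

a = helper.make_tensor_value_info('a', TensorProto.FLOAT, [10, 10])
x = helper.make_tensor_value_info('x', TensorProto.FLOAT, [10, 10])
b = helper.make_tensor_value_info('b', TensorProto.FLOAT, [10, 10])
output = helper.make_tensor_value_info('output', TensorProto.FLOAT, [10, 10])

mul = helper.make_node('Mul', ['a', 'x'], ['d'])  # output renamed: edge 'c' is now broken
add = helper.make_node('Add', ['c', 'b'], ['output'])
graph = helper.make_graph([mul, add], 'broken_func', [a, x, b], [output])
model = helper.make_model(graph)

try:
    onnx.checker.check_model(model)
except onnx.checker.ValidationError as e:
    print('Model does not meet the ONNX standard:', e)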


Read and modify ONNX models

Having constructed an ONNX model with the API, we now know exactly which modules an ONNX model consists of. Next, let's see how to read an existing ".onnx" file and extract the model information from it.
First, we can read an ONNX model with the following code:

import onnx 
model = onnx.load('linear_func.onnx') 
print(model) 

When we saved the model earlier, we passed a ModelProto object to onnx.save. Symmetrically, reading the ONNX model with onnx.load above also yields a ModelProto object. Printing this object should give exactly the same output as before.
Next, let's see how to read the graph GraphProto, the nodes NodeProto, and the tensor information ValueInfoProto:

graph = model.graph 
node = graph.node 
input = graph.input 
output = graph.output 
print(node) 
print(input) 
print(output) 

With the code above, we can access the model's graph, node, and tensor information respectively. You might wonder: how would we know about attribute names such as graph.node and graph.input in the first place? In fact, the attribute names are written right in the printed output of each object. Take the output of print(node) as an example:

[input: "a" 
input: "x" 
output: "c" 
op_type: "Mul" 
, input: "c" 
input: "b" 
output: "output" 
op_type: "Add" 
] 

From this output we can see that node is actually a list, and the objects in the list have the attributes input, output, and op_type (here input is itself a list, and both of its elements are shown). We can use the following code to access the attributes of the first node, Mul:

node_0 = node[0] 
node_0_inputs = node_0.input 
node_0_outputs = node_0.output 
input_0 = node_0_inputs[0] 
input_1 = node_0_inputs[1] 
output = node_0_outputs[0] 
op_type = node_0.op_type 
 
print(input_0) 
print(input_1) 
print(output) 
print(op_type) 
 
# Output 
""" 
a 
x 
c 
Mul 
""" 

When we want to know what attributes a particular data object of an ONNX model has, we don't need to dig through the ONNX documentation: we can simply print the object first and then read the attribute names off the output.
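Besides eyeballing printed output, there is a programmatic route worth knowing: every ONNX data object is a Protobuf message, so the standard Protobuf reflection method ListFields() enumerates the fields that are actually set. A small sketch, reusing linear_func.onnx:

import onnx

model = onnx.load('linear_func.onnx')
for field, value in model.graph.ListFields():  # standard Protobuf reflection
    print(field.name)
# for our model this should print: node, name, input, output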
Once we can read the information in an ONNX model, modifying it becomes easy. We can create new node and tensor information following the construction method of the previous section and combine them with the original model into a new model, or we can directly modify the attributes of a data object, as long as the ONNX specification is not violated.
Here we look at an example of directly modifying model properties:

import onnx 
model = onnx.load('linear_func.onnx') 
 
node = model.graph.node 
node[1].op_type = 'Sub' 
 
onnx.checker.check_model(model) 
onnx.save(model, 'linear_func_2.onnx') 

After loading the previous linear_func.onnx model, we directly modify the type of the second node via node[1].op_type, turning the addition into a subtraction. Our model now describes the linear function a * x - b. If you are interested, you can run the new model linear_func_2.onnx with ONNX Runtime to verify that it is indeed equivalent to a * x - b.
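For completeness, here is a sketch of that verification, mirroring the earlier ONNX Runtime check but comparing against a * x - b:

import onnxruntime
import numpy as np

sess = onnxruntime.InferenceSession('linear_func_2.onnx')
a = np.random.rand(10, 10).astype(np.float32)
b = np.random.rand(10, 10).astype(np.float32)
x = np.random.rand(10, 10).astype(np.float32)

output = sess.run(['output'], {'a': a, 'b': b, 'x': x})[0]
assert np.allclose(output, a * x - b)  # the model now computes subtraction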

Debug ONNX model

In actual deployment, when the ONNX model exported by a deep learning framework has a problem, we usually fix it by modifying the framework-side code rather than starting from the ONNX side, treating the ONNX model as an unmodifiable black box.
Now that we have studied the internals of ONNX, we can try debugging ONNX models themselves. In this section, let's see how to debug an ONNX model using the submodel extraction feature that ONNX provides.

Submodel extraction

ONNX officially provides developers with a submodel extraction (extract) feature. As the name implies, submodel extraction takes a given ONNX model and extracts a submodel whose node set and edge set are subsets of the corresponding sets of the original model. Let's export a more complex ONNX model with PyTorch and perform extraction on it:

import torch 
 
class Model(torch.nn.Module): 
 
    def __init__(self): 
        super().__init__() 
        self.convs1 = torch.nn.Sequential(torch.nn.Conv2d(3, 3, 3), 
                                          torch.nn.Conv2d(3, 3, 3), 
                                          torch.nn.Conv2d(3, 3, 3)) 
        self.convs2 = torch.nn.Sequential(torch.nn.Conv2d(3, 3, 3), 
                                          torch.nn.Conv2d(3, 3, 3)) 
        self.convs3 = torch.nn.Sequential(torch.nn.Conv2d(3, 3, 3), 
                                          torch.nn.Conv2d(3, 3, 3)) 
        self.convs4 = torch.nn.Sequential(torch.nn.Conv2d(3, 3, 3), 
                                          torch.nn.Conv2d(3, 3, 3), 
                                          torch.nn.Conv2d(3, 3, 3)) 
    def forward(self, x): 
        x = self.convs1(x) 
        x1 = self.convs2(x) 
        x2 = self.convs3(x) 
        x = x1 + x2 
        x = self.convs4(x) 
        return x 
 
model = Model() 
input = torch.randn(1, 3, 20, 20) 
 
torch.onnx.export(model, input, 'whole_model.onnx') 


The visualization of this model is shown in the figure below (extracting a submodel requires edge numbers as input, so for readability the figure annotates the edge numbers that we will use later):

In the previous sections, we learned that ONNX edges are represented by tensors of the same name. In other words, an edge number here is really the name of the output tensor of the node before it and of the input tensor of the node after it. Since this model was exported by PyTorch, these tensor names were generated automatically by PyTorch.
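If a visualizer is not at hand, a small sketch like the following also reveals the edge names by printing each node's input and output tensor names (the numbers may differ across PyTorch versions, so always check them for your own export):

import onnx

model = onnx.load('whole_model.onnx')
for node in model.graph.node:
    # every edge appears here as an input or output tensor name
    print(node.op_type, list(node.input), '->', list(node.output))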


Next, we can extract a submodel with the following code:

import onnx  
 
onnx.utils.extract_model('whole_model.onnx', 'partial_model.onnx', ['22'], ['28']) 

The visualization result of the sub-model is shown in the figure below:

Comparing the code with the output graph, it is not hard to guess what this code does: it extracts the subgraph from edge 22 to edge 28 of the original computation graph and turns it into a submodel. onnx.utils.extract_model is the function that performs the extraction; its arguments are the path of the original model, the path of the output model, the input edges (input tensors) of the submodel, and the output edges (output tensors) of the submodel.
Intuitively, submodel extraction pulls out all the nodes lying between the input edges and the output edges. So what are the restrictions on using this function? Based on whole_model.onnx, let's look at three examples of submodel extraction.

Add an extra output

This time, we specify an extra output tensor during extraction, as shown in the following code:

onnx.utils.extract_model('whole_model.onnx', 'submodel_1.onnx', ['22'], ['27', '31']) 

We can see that the submodel gains a new output edge, as shown in the figure below:

Add a redundant input

If we extract the submodel between edge 22 and edge 28 as before, but add one more input, input.1, the extracted submodel carries a redundant input input.1, as shown in the following code:

onnx.utils.extract_model('whole_model.onnx', 'submodel_2.onnx', ['22', 'input.1'], ['28']) 

As the figure below shows, no matter what value is passed to this input, it does not affect the submodel's output. In general, if the output of a submodel can be computed from only part of its inputs, then the remaining "upstream" extra inputs are redundant.
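We can verify this claim with a small sketch: run submodel_2.onnx twice with different values for the redundant input and confirm that the output of edge 28 is unchanged. The input shapes are read from the session itself rather than hard-coded (this assumes the model was exported with static shapes, as above):

import onnxruntime
import numpy as np

sess = onnxruntime.InferenceSession('submodel_2.onnx')
# build a random feed for every model input, with shapes taken from the model
feeds = {i.name: np.random.rand(*i.shape).astype(np.float32)
         for i in sess.get_inputs()}
out_1 = sess.run(['28'], feeds)[0]

# change only the redundant input and run again
feeds['input.1'] = np.random.rand(*feeds['input.1'].shape).astype(np.float32)
out_2 = sess.run(['28'], feeds)[0]

assert np.allclose(out_1, out_2)  # the redundant input has no effect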

Insufficient input information

This time, we try to extract a submodel whose input is edge 24 and whose output is edge 28, as shown in the code and figure below:

# Error 
onnx.utils.extract_model('whole_model.onnx', 'submodel_3.onnx', ['24'], ['28']) 

As the figure shows, computing the result of edge 28 from edge 24 also requires at least edge 26 (or an edge above it) as input. Edge 24 alone cannot determine the result of edge 28, so this extraction reports an error.

From these usage examples, we can piece together how submodel extraction is implemented: create a new model and fill in the given inputs and outputs; then reverse all the directed edges of the graph and traverse the nodes starting from the output edges, stopping whenever an input edge is reached; the nodes visited along the way become the nodes of the submodel.
If the extraction principle is not fully clear, that's all right: just try to make sure that, when filling in the submodel's inputs and outputs, the outputs can be exactly determined by the inputs.

Outputting the values of ONNX intermediate nodes

One of the most common needs when using ONNX models is to have the inference engine output the values of intermediate nodes. This mostly arises when aligning the precision of a deep learning framework model with its ONNX counterpart: once the values of intermediate nodes can be output, the operator where the precision deviates can be located. Let's see how to accomplish this with submodel extraction.
In the first extraction example above, we added an output edge that did not exist in the original model. By the same principle, we can keep the original inputs and outputs unchanged while adding some new outputs, extracting a "submodel" that exposes intermediate nodes. For example:

onnx.utils.extract_model('whole_model.onnx', 'more_output_model.onnx', ['input.1'], ['31', '23', '25', '27'])

While keeping the original input input.1 and output 31, this submodel adds several other edges as outputs, as shown below:

This way, when we run more_output_model.onnx with ONNX Runtime, we obtain more outputs.
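For example, the following sketch fetches the original output and the three intermediate edges in a single run (the edge names come from the extraction call above and may differ for your own export):

import onnxruntime
import numpy as np

sess = onnxruntime.InferenceSession('more_output_model.onnx')
inp = np.random.rand(1, 3, 20, 20).astype(np.float32)

# request the original output 31 plus the intermediate edges 23, 25, 27
out_31, out_23, out_25, out_27 = sess.run(
    ['31', '23', '25', '27'], {'input.1': inp})
print(out_23.shape, out_25.shape, out_27.shape)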
To make debugging easier, we can also split the original model into several non-overlapping submodels, so that each debugging session only needs to deal with some submodules of the original model. For example:

onnx.utils.extract_model('whole_model.onnx', 'debug_model_1.onnx', ['input.1'], ['23']) 
onnx.utils.extract_model('whole_model.onnx', 'debug_model_2.onnx', ['23'], ['25']) 
onnx.utils.extract_model('whole_model.onnx', 'debug_model_3.onnx', ['23'], ['27']) 
onnx.utils.extract_model('whole_model.onnx', 'debug_model_4.onnx', ['25', '27'], ['31']) 

In this example, we split the rather complex original model into four simpler submodels, as shown in the figure below. When debugging, we can first debug the topmost submodel and, once it is confirmed correct, use its outputs as inputs to the subsequent submodels.
For instance, for these submodels we can first debug the first one and store its output 23; then use tensor 23 as the input of the second and third submodels and debug them; finally, debug the fourth submodel the same way, as sketched in the code after this paragraph. With submodel extraction, even when facing a huge model, we can carve out the problematic submodule and carefully debug just that piece.
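A sketch of this chained debugging workflow, with edge names as in the extraction calls above:

import onnxruntime
import numpy as np

inp = np.random.rand(1, 3, 20, 20).astype(np.float32)

# debug the first submodel and keep its output (tensor 23)
sess_1 = onnxruntime.InferenceSession('debug_model_1.onnx')
out_23 = sess_1.run(['23'], {'input.1': inp})[0]

# feed tensor 23 into the second and third submodels
sess_2 = onnxruntime.InferenceSession('debug_model_2.onnx')
out_25 = sess_2.run(['25'], {'23': out_23})[0]
sess_3 = onnxruntime.InferenceSession('debug_model_3.onnx')
out_27 = sess_3.run(['27'], {'23': out_23})[0]

# finally, debug the fourth submodel with tensors 25 and 27
sess_4 = onnxruntime.InferenceSession('debug_model_4.onnx')
out_31 = sess_4.run(['31'], {'25': out_25, '27': out_27})[0]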

Submodel extraction is certainly a handy ONNX debugging tool. In practice, however, we usually export ONNX models from frameworks such as PyTorch, which raises two problems:

  1. Whenever the PyTorch model changes, the edge numbers of the ONNX model change with it, so every time we want to extract the same submodule, we have to look up the numbers in the ONNX model all over again. Such a cumbersome debugging workflow would not be used in practice.
  2. Even if we could keep the ONNX edge numbers fixed, it is hard to match the PyTorch code with the ONNX nodes: once the model structure becomes very complex, identifying the meaning of each ONNX node is impossible.

In MMDeploy, we added a model partitioning feature for PyTorch models. With it, we can export the original model as multiple non-overlapping sub-ONNX models by modifying only the PyTorch model's implementation code. We will cover this in a later tutorial.

https://github.com/open-mmlab/mmdeploy

Summary

In this tutorial, we set PyTorch aside and studied the ONNX model itself. As usual, let's summarize the key points of this tutorial:

  • ONNX uses Protobuf to define its model specification and to serialize models.
  • An ONNX model is mainly composed of objects of the data classes ModelProto, GraphProto, NodeProto, and ValueInfoProto.
  • With the onnx.helper.make_xxx functions, we can construct the data objects of an ONNX model.
  • onnx.save() saves a model, onnx.load() reads a model, and onnx.checker.check_model() checks whether a model conforms to the standard.
  • onnx.utils.extract_model() takes some nodes out of the original model and combines them with newly specified input and output edges to form a new submodel.
  • With the submodel extraction feature, we can output intermediate results of the original ONNX model and thus debug the ONNX model.

This brings our study of ONNX to a close for now. To review: we first learned how to export from PyTorch to ONNX with the API; then we learned how to solve the limited expressiveness of PyTorch and ONNX with custom operators; finally, we learned how to modify and debug ONNX models on their own. Having learned ONNX from the shallow end to the deep end, we can handle most ONNX-related problems in model deployment.

If you want to learn more about the ONNX API, you can read ONNX's official Python API documentation.
However, just reading through this material may not make us fluent with these PyTorch and ONNX APIs. In the next tutorial, we will use PyTorch and ONNX to write some practical tools for ONNX models, as a summary of the past few tutorials. Stay tuned!

If you are interested, welcome to try out MMDeploy~

https://github.com/open-mmlab/mmdeploy

Series Portal

OpenMMLab: Interpretation of TorchScript (1): Getting to know TorchScript for the first time

OpenMMLab: Interpretation of TorchScript (2): Torch jit tracer implementation analysis

OpenMMLab: Interpretation of TorchScript (3): subgraph rewriter in jit

OpenMMLab: Interpretation of TorchScript (4): Alias Analysis in Torch jit

OpenMMLab: Introduction to Model Deployment (1): Introduction to Model Deployment

OpenMMLab: Introduction to Model Deployment Tutorial (2): Solving the Problems in Model Deployment

OpenMMLab: Introductory Tutorial for Model Deployment (3): PyTorch to ONNX Detailed Explanation

OpenMMLab: Model Deployment Tutorial (4): Support more ONNX operators in PyTorch

OpenMMLab: Introduction to Model Deployment Tutorial (5): Modification and Debugging of ONNX Model
