TensorFlow pb model modification and optimization

After a TensorFlow model has been trained, it is usually frozen and saved as a final pb model. The pb model stores a GraphDef, which can be serialized either as a binary pb file or as a text pbtxt file. A GraphDef is essentially a directed acyclic graph (DAG). It mainly holds a list of operator nodes; each node has a name, an op type, attrs, and so on, and records its connections to other nodes in its input field. The inputs of the whole GraphDef are marked by Placeholder nodes, and model weights are usually stored in Const nodes.

Unlike ONNX, a GraphDef does not mark its outputs. The advantage is that the output of any node can be referenced as node_name:idx. The disadvantage is that you generally have to find the model outputs yourself, either by opening the model in netron or by using code to find the nodes whose outputs are not consumed by any other node and treating them as the model outputs. The rest of this article briefly introduces some common ways to process pb models.
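The code-analysis route for finding outputs can be sketched as follows. This is only a minimal illustration: find_output_nodes is a hypothetical helper name, and because the function relies on nothing but the node, name, op, and input fields, it is demonstrated on lightweight stand-in objects instead of a real model.

```python
from types import SimpleNamespace


def find_output_nodes(graph_def):
    """Return the names of nodes whose outputs no other node consumes."""
    consumed = set()
    for node in graph_def.node:
        for inp in node.input:
            # Strip the control-edge marker "^" and the ":idx" output suffix.
            consumed.add(inp.lstrip("^").split(":")[0])
    return [n.name for n in graph_def.node if n.name not in consumed]


# Stand-in for a GraphDef (assumption: a real GraphDef exposes the same fields).
demo_graph_def = SimpleNamespace(node=[
    SimpleNamespace(name="input1", op="Placeholder", input=[]),
    SimpleNamespace(name="kernel", op="Const", input=[]),
    SimpleNamespace(name="matmul", op="MatMul", input=["input1", "kernel:0"]),
])
output_names = find_output_nodes(demo_graph_def)
print(output_names)
```

With a real model, pass the parsed GraphDef instead of the stand-in. The result is a list of candidates only: unused Const nodes or leftover training ops also have no consumers, so it is still worth checking the list by hand.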

pb model save

# write pb model
with tf.io.gfile.GFile(model_path, "wb") as f:
    f.write(graph_def.SerializeToString())

# write pbtxt model
tf.io.write_graph(graph_def, os.path.dirname(model_path), os.path.basename(model_path))

Build the model and save

import tensorflow as tf
import numpy as np

tf.compat.v1.disable_eager_execution()
tf.compat.v1.reset_default_graph()

m = 200
k = 256
n = 128

a_shape = [m, k]
b_shape = [k, n]

np.random.seed(0)
input_np = np.random.uniform(low=0.0, high=1.0, size=a_shape).astype("float32")
kernel_np = np.random.uniform(low=0.0, high=1.0, size=b_shape).astype("float32")

pld1 = tf.compat.v1.placeholder(dtype="float32", shape=a_shape, name="input1")
kernel = tf.constant(kernel_np, dtype="float32")
feed_dict = {pld1: input_np}

result_tf = tf.raw_ops.MatMul(a=pld1, b=kernel, transpose_a=False, transpose_b=False)

with tf.compat.v1.Session() as sess:
    results = sess.run(result_tf, feed_dict=feed_dict)
    print("results:", results)

dump_model_name = "matmul_graph.pb"

graph = tf.compat.v1.get_default_graph()
graph_def = graph.as_graph_def()
with tf.io.gfile.GFile(dump_model_name, "wb") as f:
    f.write(graph_def.SerializeToString())

In practice, models are usually built with higher-level APIs rather than raw_ops; raw_ops is used here only to keep the example short.

pb model read

from google.protobuf import text_format

graph_def = tf.compat.v1.GraphDef()

# read pb model
with tf.io.gfile.GFile(model_path, "rb") as f:
    graph_def.ParseFromString(f.read())

# read pbtxt model
with open(model_path, "r") as pf:
    text_format.Parse(pf.read(), graph_def)

node information printing

general information:

node.name
node.op
node.input
node.device
# see https://www.tensorflow.org/api_docs/python/tf/compat/v1/AttrValue
node.attr[attr_name].f  # or .b, .i, .tensor, etc., depending on the attr type

# iterate over the nodes in graph_def:
for node in graph_def.node:
    print(node.name, node.op, list(node.input))

Each entry in node.input normally has the form node_name:idx (for example node_name:0), meaning that the input comes from the idx-th output of the node called node_name. If the :idx suffix is omitted, output 0 is assumed. A name prefixed with ^ denotes a control edge. node.input is a list of strings, and its order matches the order of the node's inputs.
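The naming rules for input strings can be captured in a small parser. This is a sketch only; parse_input is a hypothetical helper, not part of TensorFlow, and it returns None as the output index for control edges because they carry no data.

```python
def parse_input(input_str):
    """Split a NodeDef input string into (node_name, output_index, is_control)."""
    if input_str.startswith("^"):
        # Control edge: only an execution-order dependency, no tensor.
        return input_str[1:], None, True
    if ":" in input_str:
        name, idx = input_str.rsplit(":", 1)
        return name, int(idx), False
    # No ":idx" suffix means output 0 of the producer node.
    return input_str, 0, False


for s in ["conv1:1", "conv1", "^init"]:
    print(s, "->", parse_input(s))
```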

create node

from tensorflow.core.framework import attr_value_pb2
from tensorflow.core.framework import node_def_pb2
from tensorflow.python.framework import tensor_util

pld_node = node_def_pb2.NodeDef()
pld_node.name = name
pld_node.op = "Placeholder"

shape = tf.TensorShape([None, 3, 256, 256])
pld_node.attr["shape"].CopyFrom(attr_value_pb2.AttrValue(shape=shape.as_proto()))
dtype = tf.dtypes.as_dtype("float32")
pld_node.attr["dtype"].CopyFrom(attr_value_pb2.AttrValue(type=dtype.as_datatype_enum))

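# other commonly used settings, shown for an arbitrary NodeDef `node`
node.input.extend(in_node_names)  # list of input strings such as "node_name:0"
# store a numpy array as the value of a Const node (note: dtype, not type)
node.attr["value"].CopyFrom(
    attr_value_pb2.AttrValue(tensor=tensor_util.make_tensor_proto(
        np_array, np_array.dtype, np_array.shape)))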

Create GraphDef and add node

# copy_graph_def is just an illustrative name for the wrapper function
def copy_graph_def(graph_def_o):
    graph_def_n = tf.compat.v1.GraphDef()
    for node in graph_def_o.node:
        node_n = node_def_pb2.NodeDef()
        node_n.CopyFrom(node)
        graph_def_n.node.extend([node_n])

    # you probably also need to copy other fields, such as the version info, from the old graph
    graph_def_n.version = graph_def_o.version
    graph_def_n.library.CopyFrom(graph_def_o.library)
    graph_def_n.versions.CopyFrom(graph_def_o.versions)

    return graph_def_n

Unlike an ONNX model, a GraphDef does not require nodes to be added in topological order.

Set the shape of the placeholder

See the previous section on creating a node: set the shape attr of the Placeholder node.
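For a Placeholder that already exists in a loaded graph, the same attr can be overwritten in place. The following is a minimal sketch that assumes TensorFlow 2.x is installed; set_placeholder_shape is a hypothetical helper name, and the one-node GraphDef is built by hand purely for the demonstration.

```python
import tensorflow as tf
from tensorflow.core.framework import attr_value_pb2


def set_placeholder_shape(graph_def, name, shape):
    """Overwrite the "shape" attr of the Placeholder called `name`."""
    for node in graph_def.node:
        if node.op == "Placeholder" and node.name == name:
            shape_proto = tf.TensorShape(shape).as_proto()
            node.attr["shape"].CopyFrom(attr_value_pb2.AttrValue(shape=shape_proto))
            return True
    return False  # no Placeholder with that name


# Tiny GraphDef with a single Placeholder, built by hand for the demo.
demo_graph_def = tf.compat.v1.GraphDef()
pld = demo_graph_def.node.add()
pld.name = "input1"
pld.op = "Placeholder"
pld.attr["dtype"].type = tf.float32.as_datatype_enum

found = set_placeholder_shape(demo_graph_def, "input1", [None, 3, 256, 256])
# Unknown dimensions (None) are stored as -1 in the shape proto.
dims = [d.size for d in demo_graph_def.node[0].attr["shape"].shape.dim]
print(found, dims)
```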

Model shape derivation

First import the model into TF with tf.import_graph_def(graph_def, name=''). The Placeholder shapes must of course be set correctly beforehand.

Then fetch the output tensor of a node with graph.get_tensor_by_name(node_name + ":0").

Finally, read the shape and dtype from that tensor.
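Put together, the three steps might look like the sketch below. It assumes TensorFlow 2.x; infer_shapes is a hypothetical helper name, and the small MatMul graph exists only so that the example has something to import.

```python
import tensorflow as tf


def infer_shapes(graph_def, node_names):
    """Import graph_def into a fresh graph and read shape/dtype of output 0."""
    graph = tf.Graph()
    with graph.as_default():
        tf.import_graph_def(graph_def, name="")
    result = {}
    for node_name in node_names:
        tensor = graph.get_tensor_by_name(node_name + ":0")
        result[node_name] = (tensor.shape, tensor.dtype)
    return result


# Small demo graph: a [None, 256] placeholder multiplied by a [256, 128] constant.
demo_graph = tf.Graph()
with demo_graph.as_default():
    x = tf.compat.v1.placeholder(tf.float32, shape=[None, 256], name="input1")
    w = tf.constant(1.0, shape=[256, 128], dtype=tf.float32, name="kernel")
    tf.linalg.matmul(x, w, name="matmul")

shapes = infer_shapes(demo_graph.as_graph_def(), ["matmul"])
print(shapes["matmul"])
```

Importing into a fresh tf.Graph keeps the default graph untouched, which matters when the same script imports several modified versions of a model.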

pb model graph optimization

The overall approach is fairly simple:

1. Subgraph matching: match a connection pattern such as conv2d+bn+relu. Each node only stores its input connections, but traversing the graph with DFS/BFS generally needs both the inputs and the outputs of every node. So first read the connections of all nodes and, from the input information, build a map from each node to the nodes that consume its outputs.

2. Subgraph replacement: create the new operator, then replace the old operators with it. This means either creating a new node or modifying an original node in place. Old nodes that are no longer needed can be dropped while building a copy of the graph, and new nodes can simply be appended to graph_def with extend.

3. If the replacement is a built-in TF operator, its definition can be looked up in tensorflow raw_ops, although some attributes (such as the data type attr "T") are not listed there:

https://www.tensorflow.org/api_docs/python/tf/raw_ops

The subgraph can also be replaced with a custom operator, which you have to develop and register yourself:

https://www.tensorflow.org/guide/create_op
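The matching step (step 1) can be sketched as below. This is an illustration under simplifying assumptions, not a complete matcher: build_consumers and match_chain are hypothetical helper names, only linear chains are handled, and the graph is represented by stand-in objects that expose the same name, op, and input fields as real NodeDefs.

```python
from collections import defaultdict
from types import SimpleNamespace


def build_consumers(graph_def):
    """Map each node name to the list of nodes that read one of its outputs."""
    consumers = defaultdict(list)
    for node in graph_def.node:
        for inp in node.input:
            producer = inp.lstrip("^").split(":")[0]
            consumers[producer].append(node)
    return consumers


def match_chain(graph_def, op_types):
    """Find linear chains of nodes whose op types equal `op_types` in order.

    Every node except the last must have exactly one consumer, so that fusing
    the chain cannot remove a value that some other node still needs.
    """
    consumers = build_consumers(graph_def)
    matches = []
    for start in graph_def.node:
        if start.op != op_types[0]:
            continue
        chain = [start]
        for op_type in op_types[1:]:
            users = consumers[chain[-1].name]
            if len(users) != 1 or users[0].op != op_type:
                break
            chain.append(users[0])
        else:
            matches.append(chain)
    return matches


# Stand-in graph: x -> Conv2D -> FusedBatchNormV3 -> Relu, plus Const inputs.
consts = [SimpleNamespace(name=n, op="Const", input=[])
          for n in ("w", "scale", "offset", "mean", "var")]
demo_graph_def = SimpleNamespace(node=consts + [
    SimpleNamespace(name="x", op="Placeholder", input=[]),
    SimpleNamespace(name="conv", op="Conv2D", input=["x", "w"]),
    SimpleNamespace(name="bn", op="FusedBatchNormV3",
                    input=["conv", "scale", "offset", "mean", "var"]),
    SimpleNamespace(name="relu", op="Relu", input=["bn:0"]),
])
pattern = ["Conv2D", "FusedBatchNormV3", "Relu"]
matched = [[n.name for n in chain] for chain in match_chain(demo_graph_def, pattern)]
print(matched)
```

For step 2, the names in each matched chain tell you which old nodes to skip while copying the graph and which consumers of the last node must have their input strings redirected to the new node. The sketch does not check which input slot of a consumer is connected to the chain, so a real matcher should verify that as well.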

As shown above, TensorFlow pb models can be modified and optimized directly in Python, which greatly simplifies development. TensorFlow also lets you register Grappler and post-rewrite graph optimization passes to optimize graphs at the C++ level. These passes can be used for training as well as inference.

Mutual conversion between saved model and pb model

For reference, see: Tensorflow model export summary (Zhihu)

A saved model stores the entire training graph, and its parameters are not frozen. Inference serving alone does not need the complete training graph, and because the parameters are not frozen, the model cannot be converted for aggressive optimization with tools such as TensorRT. You can also go saved model -> frozen pb -> saved model to get the benefits of both.
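One common way to do the saved model -> frozen pb step in TensorFlow 2.x is sketched below. Treat it as an assumption-laden example rather than the recommended procedure: TinyModel is a hypothetical module created only for the demo, and convert_variables_to_constants_v2 is imported from a non-public module path that may change between TensorFlow versions.

```python
import tempfile

import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import (
    convert_variables_to_constants_v2,
)


class TinyModel(tf.Module):
    """Hypothetical one-layer model with a single variable."""

    def __init__(self):
        super().__init__()
        self.w = tf.Variable(tf.ones([4, 2]), name="w")

    @tf.function(input_signature=[tf.TensorSpec([None, 4], tf.float32)])
    def __call__(self, x):
        return tf.matmul(x, self.w)


# Export a saved model with an explicit serving signature.
export_dir = tempfile.mkdtemp()
model = TinyModel()
tf.saved_model.save(model, export_dir,
                    signatures=model.__call__.get_concrete_function())

# Load it back and turn the variables of the signature into constants.
loaded = tf.saved_model.load(export_dir)
concrete_func = loaded.signatures["serving_default"]
frozen_func = convert_variables_to_constants_v2(concrete_func)
frozen_graph_def = frozen_func.graph.as_graph_def()

# Write the frozen GraphDef as a binary pb file.
tf.io.write_graph(frozen_graph_def, export_dir, "frozen.pb", as_text=False)
print(sorted({node.op for node in frozen_graph_def.node}))
```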

pb to onnx

Use the tf2onnx library: GitHub - onnx/tensorflow-onnx: Convert TensorFlow, Keras, Tensorflow.js and Tflite models to ONNX

#!/bin/bash

graphdef=input_model.pb
inputs=Placeholder_1:0,Placeholder_2:0
outputs=output0:0,output1:0

output=${graphdef}.onnx

python -m tf2onnx.convert \
    --graphdef ${graphdef} \
    --output ${output} \
    --inputs ${inputs} \
    --outputs ${outputs} \
    --opset 12

Onnx model modification and optimization reference:

onnx model graph optimization/model modification_Luchang-Li's Blog-CSDN Blog

h5 model to pb

Tensorflow h5 to pb_there2belief's blog-CSDN blog_h5 to pb