Converting an ONNX model to fp16 (half-precision floating point) in a Python environment

background

The model I want to run is still a bit slow on the TX2 and NX. TensorRT 8.2 from JetPack 4.6.2 has problems supporting the 16 GB NX and cannot run at all (the 8 GB version is fine), while TensorRT 7, which does run, does not support the einsum operation my model uses. So I decided to convert the model to fp16 and see how it performs.
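
If you want to confirm which operator is the blocker, a quick check with the onnx package lists any Einsum nodes in the graph (a minimal sketch; model.onnx stands in for the actual model file):

import onnx

# List the Einsum nodes that TensorRT 7 cannot handle.
model = onnx.load('model.onnx')
print([n.name for n in model.graph.node if n.op_type == 'Einsum'])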

reference

https://blog.csdn.net/znsoft/article/details/114538684

process

  1. The reference code itself is quite simple, but setting up the Python environment is a bit bumpy. It is recommended to install into a fresh virtual environment; it seems some people have run into trouble installing it directly into their main environment.
  2. Create a new Python 3.7 virtual environment. I created a new conda environment based on Python 3.7 (see the commands after this list). Note that as of 2022-05-13, winmltools cannot be installed on Python 3.8: building the wheel reports an error and gets stuck, which is why I ended up on Python 3.7. Incidentally, it is outrageous how many different versions of scipy and other packages this broken thing has to pull in.
  3. Install it directly from the command line:
pip install winmltools
  4. After installation, the model can be converted along the lines of the following code:
from winmltools.utils import convert_float_to_float16
from winmltools.utils import load_model, save_model
onnx_model = load_model('model.onnx')
new_onnx_model = convert_float_to_float16(onnx_model)
save_model(new_onnx_model, 'model_fp16.onnx')
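
For reference, steps 2–3 condensed into concrete commands (the environment name fp16_convert is taken from the traceback paths below and is otherwise arbitrary):

conda create -n fp16_convert python=3.7
conda activate fp16_convert
pip install winmltools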

error

I ran into a small problem with this particular model; the conversion reported the following error:

(op_type:AveragePool, name:AveragePool_141): Inferred shape and existing shape differ in dimension 2: (8) vs (7)
Traceback (most recent call last):
  File "G:/jupyter/fp16_convert/fp16_convert.py", line 4, in <module>
    new_onnx_model = convert_float_to_float16(onnx_model)
  File "D:\ProgramData\Anaconda\envs\fp16_convert\lib\site-packages\onnxconverter_common\float16.py", line 139, in convert_float_to_float16
    model = func_infer_shape(model)
  File "D:\ProgramData\Anaconda\envs\fp16_convert\lib\site-packages\onnx\shape_inference.py", line 36, in infer_shapes
    inferred_model_str = C.infer_shapes(model_str)
RuntimeError: Inferred shape and existing shape differ in dimension 2: (8) vs (7)

Process finished with exit code 1
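
As the traceback shows, the failure comes from ONNX's own shape inference rather than from the fp16 conversion itself, so it can be reproduced directly on the model (a sketch, using the same model.onnx):

import onnx
from onnx import shape_inference

# Raises the same RuntimeError: the inferred and stored shapes disagree
# in dimension 2 of an AveragePool output (8 vs 7).
model = onnx.load('model.onnx')
shape_inference.infer_shapes(model)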

Since I had already verified the model's outputs, the mismatch is probably just a small bug picked up when the model was exported to ONNX, so the shape-inference step can simply be skipped. Following the error message, jump to shape_inference.py and make the following modification:


def infer_shapes(model):  # type: (ModelProto) -> ModelProto
    if not isinstance(model, ModelProto):
        raise ValueError('Shape inference only accepts ModelProto, '
                         'incorrect type: {}'.format(type(model)))
    model_str = model.SerializeToString()
    # Early return added here: hand the model back unchanged, so the
    # actual shape-inference call below is never reached.
    return onnx.load_from_string(model_str)
    inferred_model_str = C.infer_shapes(model_str)
    return onnx.load_from_string(inferred_model_str)
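
An alternative that avoids editing the installed package: newer versions of onnxconverter_common (the library winmltools delegates to, as the traceback shows) expose a disable_shape_infer flag on convert_float_to_float16. Whether the version pulled in alongside winmltools is new enough to have it is an assumption worth checking first:

import onnx
from onnxconverter_common import float16

# Skip shape inference via the converter's own flag instead of patching onnx.
# Assumption: the installed onnxconverter_common supports disable_shape_infer
# (older releases lack this parameter).
onnx_model = onnx.load('model.onnx')
new_onnx_model = float16.convert_float_to_float16(onnx_model, disable_shape_infer=True)
onnx.save(new_onnx_model, 'model_fp16.onnx')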

Re-running the script, the fp16 model is generated successfully. Running it on the NX development board, it is about 1.5 times faster than the float32 version.
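
Before deploying, it is also worth a quick numeric sanity check that the fp16 outputs stay close to the fp32 ones, e.g. with onnxruntime (a sketch; the 1x3x224x224 input shape is a placeholder for the real model's input):

import numpy as np
import onnxruntime as ort

# Placeholder input; substitute the real model's input shape.
x = np.random.rand(1, 3, 224, 224).astype(np.float32)

sess32 = ort.InferenceSession('model.onnx', providers=['CPUExecutionProvider'])
sess16 = ort.InferenceSession('model_fp16.onnx', providers=['CPUExecutionProvider'])

out32 = sess32.run(None, {sess32.get_inputs()[0].name: x})[0]
# The conversion also turns the graph inputs into fp16, so cast before feeding.
out16 = sess16.run(None, {sess16.get_inputs()[0].name: x.astype(np.float16)})[0]

print('max abs diff:', np.abs(out32 - out16.astype(np.float32)).max())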

Origin: https://blog.csdn.net/weixin_42492254/article/details/124757094