[python] Converting ONNX to a TensorRT engine with TensorRT 8

Background

Recently I ran into a bug where a .trt engine generated with trtexec.exe could not be run from Python: a conflict between the CUDA that ships with PyTorch and TensorRT in my environment. Trying to fix it broke the environment even further, and after that the engine generated by trtexec.exe itself went bad, producing all-NaN outputs. Fine, then I will build the engine from the Python environment instead. The trouble is that the onnx2tensorrt code found online is mostly written for TensorRT 7 or earlier; I had tried something like it before and it no longer runs. So today, following the official sample code, I am writing down how to convert ONNX to an engine with TensorRT 8 in Python.

References

GitHub code (official TensorRT sample)

Quick walkthrough

You don't actually need much of the official sample code; picking out a small part of it is enough.

import tensorrt as trt
import os

EXPLICIT_BATCH = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
TRT_LOGGER = trt.Logger()


def get_engine(onnx_file_path, engine_file_path=""):
    """Attempts to load a serialized engine if available, otherwise builds a new TensorRT engine and saves it."""

    def build_engine():
        """Takes an ONNX file and creates a TensorRT engine to run inference with"""
        with trt.Builder(TRT_LOGGER) as builder, builder.create_network(
            EXPLICIT_BATCH
        ) as network, builder.create_builder_config() as config, trt.OnnxParser(
            network, TRT_LOGGER
        ) as parser, trt.Runtime(
            TRT_LOGGER
        ) as runtime:
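            # NOTE: config.max_workspace_size and builder.max_batch_size below are deprecated
            # in TensorRT 8.x (the run log further down shows the DeprecationWarning for the
            # former); they are kept here as in the original sample.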
            config.max_workspace_size = 1 << 32  # 4GB
            builder.max_batch_size = 1
            # Parse model file
            if not os.path.exists(onnx_file_path):
                print("ONNX file {} not found, please check the path.".format(onnx_file_path))
                exit(0)
            print("Loading ONNX file from path {}...".format(onnx_file_path))
            with open(onnx_file_path, "rb") as model:
                print("Beginning ONNX file parsing")
                if not parser.parse(model.read()):
                    print("ERROR: Failed to parse the ONNX file.")
                    for error in range(parser.num_errors):
                        print(parser.get_error(error))
                    return None

            # # The actual yolov3.onnx is generated with batch size 64. Reshape input to batch size 1
            # network.get_input(0).shape = [1, 3, 608, 608]

            print("Completed parsing of ONNX file")
            print("Building an engine from file {}; this may take a while...".format(onnx_file_path))
            plan = builder.build_serialized_network(network, config)
            engine = runtime.deserialize_cuda_engine(plan)
            print("Completed creating Engine")
            with open(engine_file_path, "wb") as f:
                f.write(plan)
            return engine

    if os.path.exists(engine_file_path):
        # If a serialized engine exists, use it instead of building an engine.
        print("Reading engine from file {}".format(engine_file_path))
        with open(engine_file_path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
            return runtime.deserialize_cuda_engine(f.read())
    else:
        return build_engine()


def main():
    """Create a TensorRT engine for ONNX-based YOLOv3-608 and run inference."""

    # Paths to the ONNX model and to the serialized engine file:
    onnx_file_path = "model.onnx"
    engine_file_path = "model.trt"

    get_engine(onnx_file_path, engine_file_path)


if __name__ == "__main__":
    main()

After excerpting, I deleted a few things to match my own model; the sample's reference ONNX apparently uses a larger batch size, which is what the (commented-out) reshape-to-batch-1 snippet is about. Running this from PyCharm failed, apparently on memory allocation (I'm not sure whether the 1 << 32 workspace size is a bit reckless), but after switching from PyCharm to a plain terminal the code ran successfully, which is rather baffling. I don't need dynamic input shapes, so I haven't looked into how to add them.
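For reference, if dynamic input shapes were needed, the usual TensorRT 8 route is to attach an optimization profile to the builder config inside build_engine() before building. A minimal sketch, assuming an input tensor named "input" with a dynamic batch dimension (the name and shapes below are placeholders, not taken from my model):

# Hypothetical dynamic-shape setup; "input" and the shapes are for illustration only.
profile = builder.create_optimization_profile()
profile.set_shape("input", (1, 3, 240, 320), (1, 3, 240, 320), (4, 3, 240, 320))  # min, opt, max
config.add_optimization_profile(profile)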

(mypytorch) PS F:\DeepStereo\AppleShow2> python onnx2trt.py      
onnx2trt.py:20: DeprecationWarning: Use set_memory_pool_limit instead.
  config.max_workspace_size = 1 << 32  # 4GB
Loading ONNX file from path G:\jupyter\Model_Zoo\resources_iter10_modify\crestereo_combined_iter10_240x320.onnx...
Beginning ONNX file parsing
[06/16/2022-16:59:16] [TRT] [W] onnx2trt_utils.cpp:365: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
Completed parsing of ONNX file
Building an engine from file G:\jupyter\Model_Zoo\resources_iter10_modify\crestereo_combined_iter10_240x320.onnx; this may take a while...
[06/16/2022-17:03:52] [TRT] [W] TensorRT was linked against cuBLAS/cuBLAS LT 11.8.0 but loaded cuBLAS/cuBLAS LT 11.3.1
[06/16/2022-17:07:42] [TRT] [W] TensorRT was linked against cuBLAS/cuBLAS LT 11.8.0 but loaded cuBLAS/cuBLAS LT 11.3.1
[06/16/2022-17:07:43] [TRT] [W] TensorRT was linked against cuBLAS/cuBLAS LT 11.8.0 but loaded cuBLAS/cuBLAS LT 11.3.1
Completed creating Engine
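The cuBLAS version-mismatch lines are only warnings, and the DeprecationWarning points at the newer memory-pool API that replaces max_workspace_size from TensorRT 8.4 onward. A drop-in replacement for that line in the script above (assuming TensorRT >= 8.4) would be:

config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 32)  # same 4GB workspace limit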

Original post: blog.csdn.net/weixin_42492254/article/details/125319112