Debug: result = unpickler.load() ModuleNotFoundError: No module named 'models'

1. The problem: when converting a YOLOv5 model trained with torch to TensorRT, the following error occurs:

Using CUDA device0 _CudaDeviceProperties(name='NVIDIA GeForce RTX 3080', total_memory=10017MB)

Find Pytorch weight
Traceback (most recent call last):
  File "export.py", line 243, in <module>
    ckpt = torch.load(opt.weight, map_location=device)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 592, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 851, in _load
    result = unpickler.load()
ModuleNotFoundError: No module named 'models'

2. Solution:

Use the export.py that comes with yolov5 to convert the weights directly to an .onnx model, and then convert the .onnx model to a TensorRT engine. This resolves the problem.
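
For reference, a stripped-down sketch of what this ONNX export step boils down to is shown below. It runs on the training machine, where the yolov5 'models' package is still importable. The input size, opset, and tensor names are taken from the log that follows; everything else (file names, the simplified forward pass) is an assumption rather than the actual export.py code:

import torch

# load the checkpoint on a machine where the 'models' package is importable
ckpt = torch.load("best.pt", map_location="cpu")
model = (ckpt.get("ema") or ckpt["model"]).float().eval()

# dummy input matching the network description in the log: (1, 3, 640, 640)
im = torch.zeros(1, 3, 640, 640)

# export to ONNX; the resulting best.onnx no longer depends on the 'models'
# package, so it can be moved to the deployment machine for the TRT conversion
torch.onnx.export(
    model, im, "best.onnx",
    opset_version=11,
    input_names=["images"],
    output_names=["output"],
)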

Find ONNX weight

TensorRT: starting export with TensorRT 8.4.0.6...
[08/24/2023-18:57:25] [TRT] [I] [MemUsageChange] Init CUDA: CPU +359, GPU +0, now: CPU 426, GPU 401 (MiB)
[08/24/2023-18:57:26] [TRT] [I] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 444 MiB, GPU 401 MiB
[08/24/2023-18:57:27] [TRT] [I] [MemUsageSnapshot] End constructing builder kernel library: CPU 819 MiB, GPU 523 MiB
[08/24/2023-18:57:27] [TRT] [I] ----------------------------------------------------------------
[08/24/2023-18:57:27] [TRT] [I] Input filename:   ../best.onnx
[08/24/2023-18:57:27] [TRT] [I] ONNX IR version:  0.0.6
[08/24/2023-18:57:27] [TRT] [I] Opset version:    11
[08/24/2023-18:57:27] [TRT] [I] Producer name:    pytorch
[08/24/2023-18:57:27] [TRT] [I] Producer version: 1.9
[08/24/2023-18:57:27] [TRT] [I] Domain:           
[08/24/2023-18:57:27] [TRT] [I] Model version:    0
[08/24/2023-18:57:27] [TRT] [I] Doc string:       
[08/24/2023-18:57:27] [TRT] [I] ----------------------------------------------------------------
[08/24/2023-18:57:27] [TRT] [W] onnx2trt_utils.cpp:365: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
TensorRT: Network Description:
TensorRT:       input "images" with shape (1, 3, 640, 640) and dtype DataType.FLOAT
TensorRT:       output "output" with shape (1, 25200, 20) and dtype DataType.FLOAT
TensorRT: building FP16 engine in ../best.engine
[08/24/2023-18:57:29] [TRT] [W] TensorRT was linked against cuBLAS/cuBLAS LT 11.8.0 but loaded cuBLAS/cuBLAS LT 11.3.0
[08/24/2023-18:57:29] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +637, GPU +268, now: CPU 1545, GPU 791 (MiB)
[08/24/2023-18:57:29] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +356, GPU +258, now: CPU 1901, GPU 1049 (MiB)
[08/24/2023-18:57:29] [TRT] [W] TensorRT was linked against cuDNN 8.3.2 but loaded cuDNN 8.0.5
[08/24/2023-18:57:29] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
[08/24/2023-18:58:37] [TRT] [I] Some tactics do not have sufficient workspace memory to run. Increasing workspace size will enable more tactics, please check verbose output for requested sizes.
[08/24/2023-19:06:05] [TRT] [I] Detected 1 inputs and 4 output network tensors.
[08/24/2023-19:06:08] [TRT] [I] Total Host Persistent Memory: 218880
[08/24/2023-19:06:08] [TRT] [I] Total Device Persistent Memory: 1197056
[08/24/2023-19:06:08] [TRT] [I] Total Scratch Memory: 0
[08/24/2023-19:06:08] [TRT] [I] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 48 MiB, GPU 2470 MiB
[08/24/2023-19:06:08] [TRT] [I] [BlockAssignment] Algorithm ShiftNTopDown took 29.1457ms to assign 9 blocks to 142 nodes requiring 25804804 bytes.
[08/24/2023-19:06:08] [TRT] [I] Total Activation Memory: 25804804
[08/24/2023-19:06:08] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +40, GPU +42, now: CPU 40, GPU 42 (MiB)
export.py:172: CryptographyDeprecationWarning: Python 3.6 is no longer supported by the Python core team. Therefore, support for it is deprecated in cryptography and will be removed in a future release.
  from cryptography.fernet import Fernet
TensorRT: export success, saved as ../best.engine
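
For the second step (ONNX to engine), the log above comes from the script's own export_engine; as a point of reference, a minimal hand-rolled FP16 build with the TensorRT 8.x Python API looks roughly like the sketch below. The paths and the 1 GiB workspace size are assumptions, and this is not the post's export_engine implementation:

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)

def build_engine(onnx_path="../best.onnx", engine_path="../best.engine", fp16=True):
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)

    # parse the ONNX file; INT64 weights are cast down to INT32 with a warning,
    # as seen in the log above
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("failed to parse the ONNX model")

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30  # raise this if tactics report insufficient workspace
    if fp16 and builder.platform_has_fast_fp16:
        config.set_flag(trt.BuilderFlag.FP16)

    serialized_engine = builder.build_serialized_network(network, config)
    with open(engine_path, "wb") as f:
        f.write(serialized_engine)

build_engine()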

3. Causes and other solutions

After searching online, the main cause is that torch.save(model, path) was used to save the trained model and model = torch.load(path) was used to load it; unpickling a whole model object like this requires the original 'models' package to be importable on the loading machine. The checkpoint-loading code in export.py is as follows:

if pt:
    logger.info("Find Pytorch weight")
    ckpt = torch.load(opt.weight, map_location=device)
    if opt.noema:
        model = ckpt['model']
    else:
        model = ckpt['ema'] if ckpt.get('ema') else ckpt['model']

    meta = get_meta_data(ckpt, model, meta)

    if opt.int8:
        zero_scale_fix(model, device)
        if model.__name__ != "EfficentYolo":
            for sub_fusion_list in op_concat_fusion_list[model.__name__]:
                ops = [get_module(model, op_name) for op_name in sub_fusion_list]
                concat_quant_amax_fuse(ops)
            for sub_fusion_list in op_concat_fusion_list[model.type]:
                ops = [get_module(model, op_name) for op_name in sub_fusion_list]
                concat_quant_amax_fuse(ops)

    model.float()
    if not opt.int8:
        model.fuse()
    model.to(device)
    model.eval()
    if opt.int8:
        quant_nn.TensorQuantizer.use_fb_fake_quant = True
    im = torch.zeros(1, 3, *imgsz).to(device)

    # changes to the model's detect layer required to support ONNX export
    # model.detect.inplace = False
    if not (hasattr(model, 'type') and model.type in ['anchorfree', 'anchorbase']):
        model.type = 'anchorbase'
    model.detect.dynamic = dynamic
    model.detect.export = True  # reduce the number of outputs
    # verify that the torch model runs correctly
    for _ in range(2):
        y = model(im)  # dry runs

    # read the labels from the model and save them to labels.txt
    labels = str({i: l for i, l in enumerate(model.labels)})
    with open(file.parents[0] / 'labels.txt', 'w') as f:
        f.write(labels)
    logger.info("the torch model is very successful, it's no possible!")

    if 'onnx' in opt.include or 'trt' in opt.include:
        try:
            import tensorrt as trt
            if model.type == 'anchorfree':
                export_onnx(model, im, file, opt.opset, train=False, dynamic=False, simple=opt.simple)
            elif model.type == 'anchorbase':
                if int(trt.__version__[0]) == 7:  # TensorRT 7 handling https://github.com/ultralytics/yolov5/issues/6012
                    model.detect.inplace = False
                    grid = model.detect.anchor_grid
                    model.detect.anchor_grid = [a[..., :1, :1, :] for a in grid]
                    export_onnx(model, im, file, opt.opset, train=False, dynamic=False, simple=opt.simple)  # opset 12
                    model.detect.anchor_grid = grid
                else:  # TensorRT >= 8
                    export_onnx(model, im, file, opt.opset, train=False, dynamic=False, simple=opt.simple)  # opset 13
        except:
            logger.info("TRT ERROR, will custom onnx!")
            export_onnx(model, im, file, opt.opset, train=False, dynamic=False, simple=opt.simple)

        onnx_file = file.with_suffix('.onnx')
        add_meta_to_model(onnx_file, meta)
        if opt.int8:
            get_remove_qdq_onnx_and_cache(file.with_suffix('.onnx'))
            add_meta_to_model(str(onnx_file).replace('.onnx', '_wo_qdq.onnx'), meta)

    if 'trt' in opt.include:
        if opt.old:
            meta = False
        export_engine(onnx_file, None, meta=meta, half=opt.half, int8=opt.int8, workspace=opt.worker, encode=opt.encode, verbose=opt.verbose)
else:
    logger.info("Find ONNX weight")
    if not opt.old:
        meta = get_meta_data(file, None, meta)
        meta['half'] = opt.half
        meta['int8'] = opt.int8
        meta['encode'] = opt.encode
    if opt.old:
        meta = False

My guesses are:
(1) When the model is trained, some extra parameter information is saved along with it. That information may include paths tied to the training machine, so when the model is moved to another machine, for example the machine where the TensorRT conversion is done, those references cannot be resolved. In that situation, convert the weights to a generic ONNX model first and then convert the ONNX model to TensorRT.

(2) When this problem occurs, the checkpoint is usually not the final, stripped model. For example, a trained yolov5m is about 40 MB, while the best.pt and last.pt saved during training are about 160 MB; the extra ~120 MB of information may be tied to the current machine, so porting the model to another machine for use or conversion causes problems.
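
Both guesses come back to the torch.save(model, path) point from the start of this section: pickling the whole model object stores a reference to the 'models' package it was defined in, while a plain state_dict is just tensors. A minimal sketch of the difference follows; the repo path and file names are assumptions, and the checkpoint keys follow the export.py excerpt above:

import sys
import torch

# Loading a checkpoint saved with torch.save(model, path) only works if the
# package that defines the model class is importable; otherwise unpickling
# fails with "No module named 'models'".
sys.path.insert(0, "/path/to/yolov5")  # assumed repo root that contains models/
ckpt = torch.load("best.pt", map_location="cpu")
model = (ckpt.get("ema") or ckpt["model"]).float()

# Saving only the state_dict avoids that dependency: it contains tensors only
# and can be loaded on any machine that can rebuild the architecture in code.
torch.save(model.state_dict(), "best_state_dict.pt")
state = torch.load("best_state_dict.pt", map_location="cpu")
# rebuilt_model.load_state_dict(state)  # rebuilt_model built from the yolov5 Model class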


Origin blog.csdn.net/llsplsp/article/details/132490576