YOLOv5-7.0 instance segmentation + TensorRT deployment

1. Introduction

Combining YOLOv5 with a segmentation task and deploying it with TensorRT is a challenging and exciting undertaking. Segmentation requires the model not only to detect that a target exists, but also to capture its boundary and contour accurately and assign a category label to every pixel, giving the computer a deeper understanding and interpretation of the image. TensorRT, as a high-performance deep learning inference engine, can significantly accelerate model inference and provides strong support for real-time applications.

In this article, we will explore how to combine YOLOv5 with a segmentation task to perform object detection and pixel-level instance segmentation at the same time. We will describe the required techniques and steps in detail, and discuss how to use TensorRT to optimize the model for efficient deployment on embedded devices and in edge computing environments. By presenting experimental results and performance metrics, we will demonstrate the effectiveness and potential of this approach, giving readers a comprehensive understanding of combining YOLOv5, segmentation tasks, and TensorRT deployment.

2. Python

  1. Open PyCharm and run pip install labelme in the terminal.
  2. To make the subsequent labeling work easier, open the .labelmerc file under C:\Users\<your username>. Change auto_save in the first line to true so that annotations are saved automatically. To make drawing and editing annotation polygons more convenient, change the create_polygon shortcut to W and the edit_polygon shortcut to J. A small script that makes these edits is shown below.
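     If you prefer to script this change rather than edit the file by hand, here is a minimal sketch. It assumes labelme stores its configuration as YAML in .labelmerc under your user directory and that the hotkeys live in a shortcuts section (this matches recent labelme versions, but check your own file); PyYAML is installed together with labelme.

    import os
    import yaml  # PyYAML comes along as a labelme dependency

    rc_path = os.path.expanduser('~/.labelmerc')  # C:\Users\<username>\.labelmerc on Windows

    cfg = {}
    if os.path.exists(rc_path):
        with open(rc_path, 'r', encoding='utf-8') as f:
            cfg = yaml.safe_load(f) or {}

    cfg['auto_save'] = True                    # save the annotation JSON automatically
    shortcuts = cfg.setdefault('shortcuts', {})
    shortcuts['create_polygon'] = 'W'          # hotkey for drawing a new polygon
    shortcuts['edit_polygon'] = 'J'            # hotkey for editing an existing polygon

    with open(rc_path, 'w', encoding='utf-8') as f:
        yaml.safe_dump(cfg, f, default_flow_style=False)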
  3. After the installation finishes, run labelme in the PyCharm terminal, open your dataset folder, and label the images. No screenshots are shown here.
  4. After labeling, the JSON files need to be converted into txt files; the required code is given below.
    import json
    import os
    import argparse
    from tqdm import tqdm
    
    
    def convert_label_json(json_dir, save_dir, classes):
        json_paths = os.listdir(json_dir)
        classes = classes.split(',')
        os.makedirs(save_dir, exist_ok=True)  # make sure the txt output folder exists
    
        for json_path in tqdm(json_paths):
            # for json_path in json_paths:
            path = os.path.join(json_dir, json_path)
            with open(path, 'r') as load_f:
                json_dict = json.load(load_f)
            h, w = json_dict['imageHeight'], json_dict['imageWidth']
    
            # save txt path
            txt_path = os.path.join(save_dir, json_path.replace('json', 'txt'))
            txt_file = open(txt_path, 'w')
    
            for shape_dict in json_dict['shapes']:
                label = shape_dict['label']
                label_index = classes.index(label)
                points = shape_dict['points']
    
                points_nor_list = []
    
                for point in points:
                    points_nor_list.append(point[0] / w)
                    points_nor_list.append(point[1] / h)
    
                points_nor_list = list(map(lambda x: str(x), points_nor_list))
                points_nor_str = ' '.join(points_nor_list)
    
                label_str = str(label_index) + ' ' + points_nor_str + '\n'
                txt_file.write(label_str)

            txt_file.close()  # close the txt file before moving on to the next json
    
    
    if __name__ == "__main__":
        """
        python json2txt_nomalize.py --json-dir my_datasets/color_rings/jsons --save-dir my_datasets/color_ringsts --classes "cat,dogs"
        """
        parser = argparse.ArgumentParser(description='json convert to txt params')
        parser.add_argument('--json-dir', type=str, default=r'json', help='json path dir')
        parser.add_argument('--save-dir', type=str, default=r'txt',help='txt save dir')
        parser.add_argument('--classes', type=str,default="1", help='classes')
        args = parser.parse_args()
        json_dir = args.json_dir
        save_dir = args.save_dir
        classes = args.classes
        convert_label_json(json_dir, save_dir, classes)
  5. After the conversion to txt files, split the dataset and run the training (if you already know how to do this, that is all you need; otherwise, a minimal split sketch is given right below this step).

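     Here is a minimal split sketch, purely for illustration. The folder names (images, txt, my_datasets/seg) and the 8:2 ratio are placeholders made up for this example; adjust them to your own layout. It copies each image and its matching txt label into the images/train|val and labels/train|val structure that YOLOv5 expects.

    import os
    import random
    import shutil

    img_dir = 'images'           # folder with the original images (placeholder path)
    label_dir = 'txt'            # folder with the converted txt labels (placeholder path)
    out_dir = 'my_datasets/seg'  # dataset root that YOLOv5 will read (placeholder path)
    val_ratio = 0.2              # 20% of the images go to the val split

    random.seed(0)
    names = [f for f in os.listdir(img_dir) if f.lower().endswith(('.jpg', '.png'))]
    random.shuffle(names)
    val_set = set(names[:int(len(names) * val_ratio)])

    for img_name in names:
        split = 'val' if img_name in val_set else 'train'
        stem = os.path.splitext(img_name)[0]
        for src, dst_dir in (
            (os.path.join(img_dir, img_name), os.path.join(out_dir, 'images', split)),
            (os.path.join(label_dir, stem + '.txt'), os.path.join(out_dir, 'labels', split)),
        ):
            if os.path.exists(src):
                os.makedirs(dst_dir, exist_ok=True)
                shutil.copy(src, dst_dir)

     After the split, point a dataset yaml at these train/val folders and train with YOLOv5-7.0's segmentation script, for example: python segment/train.py --data your_data.yaml --weights yolov5s-seg.pt (your_data.yaml is a file you create yourself).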
  6. Convert the best.pt you trained into a .wts file with gen_wts.py. For convenience, put best.pt in the same directory as gen_wts.py and run in the terminal: python gen_wts.py -w best.pt

     The code of gen_wts.py is as follows

    import sys
    import argparse
    import os
    import struct
    import torch
    from utils.torch_utils import select_device
    
    
    def parse_args():
        parser = argparse.ArgumentParser(description='Convert .pt file to .wts')
        parser.add_argument('-w', '--weights', required=True,
                            help='Input weights (.pt) file path (required)')
        parser.add_argument(
            '-o', '--output', help='Output (.wts) file path (optional)')
        parser.add_argument(
            '-t', '--type', type=str, default='detect', choices=['detect', 'cls'],
            help='determines the model is detection/classification')
        args = parser.parse_args()
        if not os.path.isfile(args.weights):
            raise SystemExit('Invalid input file')
        if not args.output:
            args.output = os.path.splitext(args.weights)[0] + '.wts'
        elif os.path.isdir(args.output):
            args.output = os.path.join(
                args.output,
                os.path.splitext(os.path.basename(args.weights))[0] + '.wts')
        return args.weights, args.output, args.type
    
    
    pt_file, wts_file, m_type = parse_args()
    print(f'Generating .wts for {m_type} model')
    
    # Initialize
    device = select_device('cpu')
    # Load model
    print(f'Loading {pt_file}')
    model = torch.load(pt_file, map_location=device)  # load to FP32
    model = model['ema' if model.get('ema') else 'model'].float()
    
    if m_type == "detect":
        # update anchor_grid info
        anchor_grid = model.model[-1].anchors * model.model[-1].stride[..., None, None]
        # model.model[-1].anchor_grid = anchor_grid
        delattr(model.model[-1], 'anchor_grid')  # model.model[-1] is detect layer
        # The parameters are saved into the state_dict (an OrderedDict) via "register_buffer", so they end up in the exported weights.
        model.model[-1].register_buffer("anchor_grid", anchor_grid)
        model.model[-1].register_buffer("strides", model.model[-1].stride)
    
    model.to(device).eval()
    
    print(f'Writing into {wts_file}')
    with open(wts_file, 'w') as f:
        f.write('{}\n'.format(len(model.state_dict().keys())))
        for k, v in model.state_dict().items():
            vr = v.reshape(-1).cpu().numpy()
            f.write('{} {} '.format(k, len(vr)))
            for vv in vr:
                f.write(' ')
                f.write(struct.pack('>f', float(vv)).hex())
            f.write('\n')
    
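     Before moving on to TensorRT, it can be worth sanity-checking the exported file. The sketch below simply re-reads the .wts header using the same format that gen_wts.py writes above (a tensor count on the first line, then one line per tensor with its name, value count, and big-endian float hex values); best.wts is assumed to be the output from the previous step.

    import struct

    wts_path = 'best.wts'  # adjust to your own output path

    with open(wts_path, 'r') as f:
        num_tensors = int(f.readline())
        name, count, *hex_vals = f.readline().split()

    # each value was written with struct.pack('>f', v).hex(), so decode it the same way
    decoded = [struct.unpack('>f', bytes.fromhex(h))[0] for h in hex_vals[:5]]

    print(f'{num_tensors} tensors in {wts_path}')
    print(f'first tensor: {name}, {count} values, first few: {decoded}')
    assert int(count) == len(hex_vals), 'value count does not match the header'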

3. TensorRT

  1. Download the TensorRT segmentation code corresponding to YOLOv5-7.0 from wang-xinyu/tensorrtx: Implementation of popular deep learning networks with TensorRT network definition API (github.com).
  2. Unpack the code and build it with CMake; if you find that troublesome, you can also configure the project yourself.
  3. I covered the TensorRT environment configuration in a previous article. If you are not sure about it, see Windows YOLOv5-TensorRT deployment_tensorrt deployment in windows_Mr Dinosaur's blog-CSDN blog.
  4. Open config.h and modify the number of detection classes and the input image size to match your model.
  5. Open yolov5_seg.cpp, find the main function, and modify the file paths.
  6. If you have trouble running the source code as-is (I did), you can of course modify it yourself to generate the engine file. Once the engine file has been generated, you can run segmentation tests.
  7. The inference speed for segmentation is only average; it is slower than plain detection.

4. Summary

  • For deployment, these are basically all the steps. You can wrap the interface for easier calling later; I will post an update on that if necessary.
  • Segmentation is roughly 10 ms slower than detection, so think twice if you have strict speed requirements.
  • Segmentation is friendlier to large targets; if you mainly need to detect small targets, plain object detection is the better choice.

 I haven't updated for more than half a year. My apologies to my followers; I will post updates from time to time!

Origin blog.csdn.net/qq_58355216/article/details/132225268