YOLOv5 rotated-box (oriented bounding box) detection: TensorRT deployment in C++, from start to finish

This post walks you step by step through deploying yolov5_obb (a YOLOv5-based framework for oriented-box object detection) with TensorRT, from data annotation to final deployment, so you can become a qualified algorithm porter. yolov5_obb is a neural network for oriented-box detection; I won't cover the underlying theory here, since there are plenty of Chinese blog posts about this network, and its author has published a series of articles on how ultralytics/yolov5 was adapted for oriented-box detection.

This code is modified from version 6.0 of ultralytics/yolov5, so you can use the ultralytics/yolov5 6.0 pre-trained models.

In principle, this post assumes the reader has some deep learning experience: you should be comfortable installing and at least partially using torch, opencv, CUDA, cuDNN, and TensorRT.

Environment used in this article: GTX 1080 Ti, CUDA 10.2, cuDNN 8.2.4, TensorRT 8.0.1.6, OpenCV 4.5.4. The code for this article is: here

1. Model training and model conversion

1. Data annotation

The labeling tool is roLabelImg; see here for how to use it. I crawled tomato images directly from Baidu; we will detect tomatoes and their orientation. Label as shown.

2. Label conversion

Convert the annotated xml files to the DOTA format used by DOTA_devkit, as follows:

Each line of the txt file has the format: x1, y1, x2, y2, x3, y3, x4, y4, category, difficulty

275.0 463.0 411.0 587.0 312.0 600.0 222.0 532.0 tomato 0
341.0 376.0 487.0 487.0 434.0 556.0 287.0 444.0 tomato 0
428.0 6.0 519.0 66.0 492.0 108.0 405.0 50.0 tomato 0
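For reference, each such line can be split back into its parts with a few lines of Python (a small sketch for illustration, not part of the repo):

```python
def parse_dota_line(line):
    """Parse one DOTA label line into (corner points, category, difficulty)."""
    parts = line.split()
    coords = list(map(float, parts[:8]))
    points = list(zip(coords[0::2], coords[1::2]))  # [(x1,y1), ..., (x4,y4)]
    return points, parts[8], int(parts[9])

pts, cat, diff = parse_dota_line(
    "275.0 463.0 411.0 587.0 312.0 600.0 222.0 532.0 tomato 0")
```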

The conversion code is as follows:

# -*- coding: utf-8 -*-
# Convert roLabelImg rotated boxes (cx, cy, w, h, angle) into the DOTA
# four-point format: x1, y1, x2, y2, x3, y3, x4, y4, class, difficulty

import os
import math
import xml.etree.ElementTree as ET

label = ['tomato']

def edit_xml(xml_file):
    """
    Convert one roLabelImg xml annotation to a DOTA-format txt file.
    :param xml_file: path to the xml file
    """
    print(xml_file)
    tree = ET.parse(xml_file)
    f = open(xml_file.replace('.xml', '.txt').replace('anns', 'labelTxt'), 'w')
    objs = tree.findall('object')
    for obj in objs:
        type = obj.find('type').text

        if type == 'bndbox':  # axis-aligned box: take its corners directly
            obj_bnd = obj.find('bndbox')
            xmin = float(obj_bnd.find('xmin').text)
            ymin = float(obj_bnd.find('ymin').text)
            xmax = float(obj_bnd.find('xmax').text)
            ymax = float(obj_bnd.find('ymax').text)
            # corners in clockwise order starting from the top-left
            # (the original code emitted them in a crossing Z-order)
            x0, y0 = xmin, ymin
            x1, y1 = xmax, ymin
            x2, y2 = xmax, ymax
            x3, y3 = xmin, ymax
        elif type == 'robndbox':  # rotated box: rotate the corners about the center
            obj_bnd = obj.find('robndbox')
            cx = float(obj_bnd.find('cx').text)
            cy = float(obj_bnd.find('cy').text)
            w = float(obj_bnd.find('w').text)
            h = float(obj_bnd.find('h').text)
            angle = float(obj_bnd.find('angle').text)

            x0, y0 = rotatePoint(cx, cy, cx - w / 2, cy - h / 2, -angle)
            x1, y1 = rotatePoint(cx, cy, cx + w / 2, cy - h / 2, -angle)
            x2, y2 = rotatePoint(cx, cy, cx + w / 2, cy + h / 2, -angle)
            x3, y3 = rotatePoint(cx, cy, cx - w / 2, cy + h / 2, -angle)
        # roLabelImg stores the class name as a string, so map it to an index
        # (the original int(...) cast would fail on a name like 'tomato')
        classes = label.index(obj.find('name').text)
        axis = [str(x0), str(y0), str(x1), str(y1), str(x2), str(y2),
                str(x3), str(y3), label[classes], '0']
        f.write(" ".join(axis) + "\n")
    f.close()

# rotate point (xp, yp) around center (xc, yc) by theta radians
def rotatePoint(xc, yc, xp, yp, theta):
    xoff = xp - xc
    yoff = yp - yc
    cosTheta = math.cos(theta)
    sinTheta = math.sin(theta)
    pResx = cosTheta * xoff + sinTheta * yoff
    pResy = -sinTheta * xoff + cosTheta * yoff
    return int(xc + pResx), int(yc + pResy)

if __name__ == '__main__':
    for path in os.listdir('anns/'):
        edit_xml('anns/' + path)

After the conversion, create images and labelTxt folders and copy the images and labels into them; the final directory layout is:

data_dir/images/*.jpg
data_dir/labelTxt/*.txt

3. Model training

3.1 Compile and install

This assumes you have already downloaded the yolov5_obb code and a pre-trained model (for convenience I also put the training code in my repo, so you can use mine directly; click here). Two components need to be compiled:

1. Install nms_rotated

pip install -r requirements.txt
cd utils/nms_rotated
python setup.py develop  #or "pip install -v -e ."

2. Install DOTA_devkit

sudo apt-get install swig
cd DOTA_devkit
swig -c++ -python polyiou.i
python setup.py build_ext --inplace

3.2 Data Segmentation

yolov5_obb expects uniformly sized square inputs, so you need to use DOTA_devkit's ImgSplit_multi_process.py to split the images into 512x512 tiles (not limited to 512; any size divisible by 32 works). The resulting directory is as follows:

data_dir/split/images/*.jpg
data_dir/split/labelTxt/*.txt
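For intuition, the splitter slides a window of `subsize` pixels across the image with an overlap (`gap`) so that objects on tile borders appear whole in at least one tile. A minimal sketch of that tiling arithmetic (the parameter names `subsize`/`gap` mirror the script's, but check your checkout; this is not the repo's code):

```python
def tile_origins(length, subsize=512, gap=100):
    """Return the left/top coordinates of tiles covering one image side of
    `length` pixels, stepping by subsize - gap so adjacent tiles overlap."""
    stride = subsize - gap
    origins = []
    pos = 0
    while True:
        if pos + subsize >= length:
            origins.append(max(length - subsize, 0))  # clamp last tile to the edge
            break
        origins.append(pos)
        pos += stride
    return origins

# e.g. a 1200-pixel-wide image covered by 512-pixel tiles with 100 overlap
print(tile_origins(1200))  # -> [0, 412, 688]
```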

3.3 Divide training set and verification set

The goal is to generate train.txt and val.txt; the code is as follows:

# -*- coding: utf-8 -*-
# Split the obb dataset into train.txt / val.txt

import os
import random

annfilepath = r'split/labelTxt/'   # relative path (the original had a stray leading slash)
saveBasePath = r'split/'
train_percent = 0.95

total_file = os.listdir(annfilepath)
num = len(total_file)
tr = int(num * train_percent)
train = set(random.sample(range(num), tr))  # indices of the training samples

ftrain = open(os.path.join(saveBasePath, 'train.txt'), 'w')
fval = open(os.path.join(saveBasePath, 'val.txt'), 'w')
for i in range(num):
    name = total_file[i].split('.')[0] + '\n'
    if i in train:
        ftrain.write(name)
    else:
        fval.write(name)
ftrain.close()
fval.close()
print("train size", tr)
print("valid size", num - tr)

The final directory is as follows (my test.txt is just a copy of val.txt):

data_dir/split/images/*.jpg
data_dir/split/labelTxt/*.txt
data_dir/split/train.txt
data_dir/split/val.txt
data_dir/split/test.txt

3.4 Model Training---Taking the yolov5s model as an example

If you train on your own dataset, change the following:

1. In data/dotav15_poly.yaml, change the dataset paths, the number of classes, and the class names. I have one class, so the number of classes is 1 and the label is tomato.

2. In models/yolov5s.yaml, change the number of classes to your own; for me that is 1.

3. Start training

python train.py --weights yolov5s.pt --cfg models/yolov5s.yaml --data data/dotav15_poly.yaml --hyp data/hyps/obb/hyp.finetune_dota.yaml --imgsz 512

4. Generate wts model

As mentioned earlier, this code is based on yolov5 6.0, for which wang-xinyu has implemented C++ TensorRT inference in his tensorrtx repo. I modified his post-processing, adding rotation-angle decoding and rotated-box NMS, to obtain the final oriented-box inference pipeline.

wang-xinyu's pipeline differs from the usual one: most of the time we go pt → onnx → engine, while he goes pt → wts → engine. The difference is that the latter requires re-implementing the network structure in C++ and initializing it with the wts weights, which is quite difficult. I once wrote a simple deeplabv3+ this way and it took more than 500 lines; I certainly could not do it for a structure as complex as yolov5, so I reuse his network code and only revise the post-processing.
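The rotated NMS mentioned above boils down to computing IoU between oriented boxes, i.e. between convex quadrilaterals. For intuition only, here is a pure-Python sketch using Sutherland-Hodgman polygon clipping; the repo's actual implementation is in C++/CUDA, and this sketch assumes convex polygons given in counter-clockwise order:

```python
def polygon_area(poly):
    """Shoelace area of a polygon given as [(x, y), ...]."""
    s = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def clip(subject, a, b):
    """Clip polygon `subject` against the half-plane left of edge a -> b."""
    def inside(p):
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0
    def intersect(p, q):
        # intersection of segment pq with the infinite line through a, b
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / (dx1 * dy2 - dy1 * dx2)
        return (p[0] + t * dx1, p[1] + t * dy1)
    out = []
    for i in range(len(subject)):
        cur, nxt = subject[i], subject[(i + 1) % len(subject)]
        if inside(cur):
            out.append(cur)
            if not inside(nxt):
                out.append(intersect(cur, nxt))
        elif inside(nxt):
            out.append(intersect(cur, nxt))
    return out

def rotated_iou(p1, p2):
    """IoU of two convex counter-clockwise polygons."""
    inter = p1
    for i in range(len(p2)):
        if not inter:
            return 0.0
        inter = clip(inter, p2[i], p2[(i + 1) % len(p2)])
    ai = polygon_area(inter) if inter else 0.0
    return ai / (polygon_area(p1) + polygon_area(p2) - ai)
```

Rotated NMS then works exactly like axis-aligned NMS: sort boxes by score and suppress any box whose IoU with an already-kept box exceeds the threshold.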

This assumes you have downloaded my code [if you haven't, click here].

Step 1: copy gen_wts.py to the root directory of yolov5_obb.

Step 2: generate the wts file:

python gen_wts.py -w {path to your trained pt model under runs/} -o yolov5s.wts
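gen_wts.py dumps every tensor of the checkpoint's state_dict into tensorrtx's plain-text .wts format: the first line is the tensor count, then one line per tensor with its name, element count, and big-endian hex-encoded float values. A self-contained sketch of that format (the layer name and values below are illustrative, not from a real checkpoint):

```python
import struct

def write_wts(weights, path):
    """Write {name: list-of-floats} in the tensorrtx .wts text format:
    first line is the tensor count, then 'name count hex hex ...' per tensor,
    each float encoded as big-endian IEEE-754 hex."""
    with open(path, 'w') as f:
        f.write(f"{len(weights)}\n")
        for name, values in weights.items():
            hexvals = " ".join(struct.pack(">f", v).hex() for v in values)
            f.write(f"{name} {len(values)} {hexvals}\n")

# toy example with a made-up layer name
write_wts({"model.0.conv.weight": [1.0, -0.5]}, "demo.wts")
```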

2. TensorRT model conversion

First copy yolov5s.wts into my C++ project (under the Yolov5_obb_Tensorrt_Infer directory).

1. Compile and install

This assumes you have already installed CUDA 10.2, cuDNN 8.2, OpenCV 4.5, TensorRT 8.0, CMake 3.15 and the other prerequisites; compile as follows:

cd <the extracted directory>
mkdir build
cd build
cmake ..
make

2. Generate engine file 

After running make in step 1, the executable yolov5_gen (an exe on Windows) is generated in build; run it in a terminal:

sudo ./yolov5_gen  -s ../yolov5s.wts ../yolov5s.engine s

If all goes well, after running for some time the file yolov5s.engine is generated (preferably with software versions matching mine).

3. TensorRT inference

Inference mainly uses OpenCV and TensorRT's nvinfer library.

After running make in step 1, the executable yolov5_use (an exe on Windows) is also generated in build; run it in a terminal:

sudo ./yolov5_use ../yolov5s.engine ../images/test.jpg

If all goes well, the following result is displayed. You can see that inference with TensorRT takes about 4 ms (on my 1080 Ti).

3. Reference

1. hukaixuan's yolov5_obb

2. wang-xinyu's tensorrtx

3. The code for this article, once more: code address

Origin blog.csdn.net/qq_41043389/article/details/127777272