[Target tracking] Training Yolov5_DeepSort_Pytorch on your own data

For the basic setup, please see [Target Tracking] Reproducing Yolov5_DeepSort_Pytorch

Table of Contents

1. Environment

2. Data preparation for target detection

1) Data labeling

2) Separate training set and validation set

3) Modify JPEGImages to images

4) Convert xml to txt and generate train.txt and val.txt for final training

5) Modify the training configuration (two places)

3. Train the target detection model

4. Prepare classification/re-identification data (four places)

5. Train the classification/re-identification model

6. Test tracking (video)

Reference


OK, it seems that even after I wrote this up in great detail, people still have questions about the dataset. Roughly speaking: the target detection dataset can be labeled with just a single class.

You then crop out the corresponding regions and sort them into the categories you want; the classification data can also come from other datasets that contain the categories you want to track.

That is the overall process. Please think it through and follow the blog carefully.

1. Environment

ubuntu16.04
cuda10.1
cudnn7
python3.6
 
 
Cython
matplotlib>=3.2.2
numpy>=1.18.5
opencv-python>=4.1.2
Pillow
PyYAML>=5.3
scipy>=1.4.1
tensorboard>=2.2
torch>=1.7.0
torchvision>=0.8.1
tqdm>=4.41.0
seaborn>=0.11.0
easydict
thop
pycocotools
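
If the repository collects these packages in a requirements.txt (an assumption; otherwise install each package individually), one command covers them:

pip install -r requirements.txt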

2. Data preparation for target detection

1) Data labeling

Here you can annotate with cvat and then export the data in VOC format.

Put the exported voc directory in the Yolov5_DeepSort_Pytorch/yolov5/data directory.
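
A typical VOC export looks roughly like this (the exact contents may vary with the cvat version):

voc
  Annotations
  ImageSets
    Main
  JPEGImages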

2) Separate training set and validation set

In fact, I only use train.txt and val.txt (with trainval_percent = 1, the test.txt produced below ends up empty).

# -*- coding: UTF-8 -*-
'''
@author: gu
@contact: [email protected]
@time: 2021/3/4 11:52 AM
@file: generate_txt.py
@desc: reference https://blog.csdn.net/qqyouhappy/article/details/110451619
'''

import os
import random

trainval_percent = 1   # fraction of the data used for train+val (the rest goes to test)
train_percent = 0.9    # fraction of trainval used for train (the rest goes to val)
xmlfilepath = 'data/voc/Annotations'
txtsavepath = 'data/voc/ImageSets/Main'
total_xml = os.listdir(xmlfilepath)

if not os.path.exists(txtsavepath):
    os.makedirs(txtsavepath)

num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

ftrainval = open('data/voc/ImageSets/Main/trainval.txt', 'w')
ftest = open('data/voc/ImageSets/Main/test.txt', 'w')
ftrain = open('data/voc/ImageSets/Main/train.txt', 'w')
fval = open('data/voc/ImageSets/Main/val.txt', 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'  # strip the .xml extension
    print(name)
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()

The file names in train.txt look like this:

003_000855
004_000146
002_000830
002_000720
002_002105
001_000888

3) Rename JPEGImages to images

Rename the JPEGImages directory to images. This is needed because yolov5 by default derives each label path from the image path by substituting images with labels, so the image directory must be named images.
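
For example, assuming the layout above, this is a single rename:

mv data/voc/JPEGImages data/voc/images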

4) Convert xml to txt and generate train.txt and val.txt for final training

# voc_label.py

import xml.etree.ElementTree as ET
import os
from os import getcwd

sets = ['train', 'val']
classes = ["***"]  # your class


def convert(size, box):
    # Convert a VOC box (xmin, xmax, ymin, ymax) into a normalized
    # YOLO box (x_center, y_center, width, height).
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)


def convert_annotation(image_id):
    in_file = open('data/voc/Annotations/%s.xml' % (image_id))
    out_file = open('data/voc/labels/%s.txt' % (image_id), 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    for obj in root.iter('object'):
        # difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text),
             float(xmlbox.find('ymax').text))
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
    in_file.close()
    out_file.close()


wd = getcwd()
print(wd)
for image_set in sets:
    if not os.path.exists('data/voc/labels/'):
        os.makedirs('data/voc/labels/')
    image_ids = open('data/voc/ImageSets/Main/%s.txt' % (image_set)).read().strip().split()
    list_file = open('data/voc/%s.txt' % (image_set), 'w')
    for image_id in image_ids:
        list_file.write('data/voc/images/%s.jpg\n' % (image_id))
        convert_annotation(image_id)
        print(image_id)
    list_file.close()
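
Each generated file in data/voc/labels/ holds one line per object in YOLO format: the class id followed by the normalized box center and size. An illustrative line (values made up):

0 0.512 0.433 0.210 0.154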

The train.txt file for training looks like this (in the /data/voc/ directory):

data/voc/images/003_000855.jpg
data/voc/images/004_000146.jpg
data/voc/images/002_000830.jpg
data/voc/images/002_000720.jpg

Directory at this time:

data
  voc
    Annotations
    images
    ImageSets
    labels
    train.txt
    val.txt

5) Modify the training configuration (two places)

----Data: in the data directory, copy coco.yaml and modify it as follows:

# PASCAL VOC dataset http://host.robots.ox.ac.uk/pascal/VOC/
# Train command: python train.py --data voc.yaml
# Default dataset location is next to /yolov5:
#   /parent_folder
#     /VOC
#     /yolov5


# download command/URL (optional)
#download: bash data/scripts/get_voc.sh

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ./data/voc/train.txt  # 16551 images
val: ./data/voc/val.txt  # 4952 images

# number of classes
nc: 1

# class names
names: [ '***'] # your class

----Model, in the models directory, modify the yaml file corresponding to the yolo model you want to train. Here, take yolov5s as an example (nc is the total number of categories):

# parameters
nc: 1  # number of classes

You only need to modify the total number of categories (nc).

3. Train the target detection model

1) Run in the /Yolov5_DeepSort_Pytorch/yolov5 directory:

python train.py --data data/your_data_config_file.yaml --cfg models/yolov5s.yaml  --weights weights/yolov5s.pt --device 0

Put the pre-trained model in the weights directory referenced by the command, e.g. weights/yolov5s.pt.

2) To test the model, use the following command:

python ./yolov5/detect.py --weights ./yolov5/weights/yolov5s.pt --source ./your_video.mp4 --save-txt

3) Model accuracy verification:

cd yolov5
python ./detect.py --weights ./weights/yolov5s.pt --data ./data/your_data_yaml_file.yaml  --source ./your_video.mp4 --save-txt

4) To resume training after an interruption, use --resume.

5) To use tensorboard, comment out lines 282-286 of models/yolo.py.

6) For data augmentation, see data/hyp.scratch.yaml and data/hyp.finetune.yaml.

Items 5 and 6 are from: https://blog.csdn.net/weixin_41868104/article/details/114685071
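
For orientation, the augmentation strengths live in those files as plain yaml keys. A few representative entries, with values as in the stock yolov5 hyp.scratch.yaml of that era (check your own copy before relying on them):

hsv_h: 0.015    # image HSV-Hue augmentation (fraction)
hsv_s: 0.7      # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4      # image HSV-Value augmentation (fraction)
degrees: 0.0    # image rotation (+/- deg)
translate: 0.1  # image translation (+/- fraction)
scale: 0.5      # image scale (+/- gain)
fliplr: 0.5     # flip left-right (probability)
mosaic: 1.0     # mosaic (probability)
mixup: 0.0      # mixup (probability)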

4. Prepare classification/re-identification data (four places)

In practice you will have to write some data-processing scripts around your own data. I sketch the steps roughly here.

---Crop the data

You can crop out the regions from the annotation ground truth and then use them to train the model.

# -*- coding: UTF-8 -*-
'''
@author: gu
@contact: [email protected]
@time: 2021/3/4 8:01 PM
@file: crop_image.py
@desc: https://blog.csdn.net/qq_36249824/article/details/108428698
'''
import cv2
import xml.etree.ElementTree as ET
import os


def main():
    # Path to the JPG images
    img_path = 'data/voc/images/'
    # Path to the XML annotations
    anno_path = 'data/voc/Annotations/'
    # Folder for the cropped results
    cut_path = 'data/voc/crops/'
    if not os.path.exists(cut_path):
        os.makedirs(cut_path)
    # List the files in the image folder
    imagelist = os.listdir(img_path)
    for image in imagelist:
        image_pre, ext = os.path.splitext(image)
        img_file = img_path + image
        img = cv2.imread(img_file)
        xml_file = anno_path + image_pre + '.xml'
        tree = ET.parse(xml_file)
        root = tree.getroot()
        obj_i = 0
        for obj in root.iter('object'):
            obj_i += 1
            cls = obj.find('name').text
            xmlbox = obj.find('bndbox')
            b = [int(float(xmlbox.find('xmin').text)), int(float(xmlbox.find('ymin').text)),
                 int(float(xmlbox.find('xmax').text)), int(float(xmlbox.find('ymax').text))]
            # Crop the object region (rows = y range, cols = x range)
            img_cut = img[b[1]:b[3], b[0]:b[2], :]
            # Create the class directory if it does not exist
            path = os.path.join(cut_path, cls)
            if not os.path.exists(path):
                os.makedirs(path)
            save_file = os.path.join(path, '{}_{:0>2d}.jpg'.format(image_pre, obj_i))
            cv2.imwrite(save_file, img_cut)
            print(save_file)


if __name__ == '__main__':
    main()
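
The crops end up in data/voc/crops/<class name>/, one folder per class; these folders are the raw material for the train/test split below.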

---Alternatively, use a pre-trained model to pre-annotate, train on some of your own data, and keep iterating between the two.

--- Split the data into train and test. The subdirectories are named by category (1, 2, 3, 4, and so on), simply because that is what the default data-loading code expects and I did not bother to change it. A minimal split sketch follows the directory listing below.

The directory looks like this:

deep_sort_pytorch
  deep_sort
    deep
      data
        train
          1
            1_0001.jpg
            ...
            1_nnnn.jpg
          2
          ...
          n
        test
          1
            1_0001.jpg
            ...
            1_nnnn.jpg
          2
          ...
          n
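
As promised above, a minimal sketch of such a split, assuming the crops from crop_image.py live in data/voc/crops/<class>/ and mapping each class folder to a numeric name (the 10% test fraction and the paths are my assumptions, not from the original post):

import os
import random
import shutil

src_root = 'data/voc/crops'        # output of crop_image.py (assumed)
dst_root = 'deep_sort/deep/data'   # train/test layout expected by train.py
test_fraction = 0.1                # assumed split ratio

for idx, cls in enumerate(sorted(os.listdir(src_root)), start=1):
    files = os.listdir(os.path.join(src_root, cls))
    random.shuffle(files)
    n_test = int(len(files) * test_fraction)
    # the first n_test files go to test/, the rest to train/
    for split, names in (('test', files[:n_test]), ('train', files[n_test:])):
        out_dir = os.path.join(dst_root, split, str(idx))  # numeric class folder
        os.makedirs(out_dir, exist_ok=True)
        for name in names:
            shutil.copy(os.path.join(src_root, cls, name),
                        os.path.join(out_dir, name))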

 

---Modify the preprocessing of the train dataset in train.py as follows:

transform_train = torchvision.transforms.Compose([
    torchvision.transforms.Resize((128, 64)),
    torchvision.transforms.RandomCrop((128, 64), padding=4),
    torchvision.transforms.RandomHorizontalFlip(),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(
        [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
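
The (128, 64) size here matches the input resolution (height 128, width 64) of the stock DeepSort appearance network, so keep it unless you also change the network definition in model.py.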

 

5. Train the classification/re-identification model

Run in the Yolov5_DeepSort_Pytorch/deep_sort_pytorch/deep_sort/deep directory:

python train.py --data-dir data/

6. Test tracking (video)

Run in the /Yolov5_DeepSort_Pytorch directory:

python track.py --weights weights/yolov5s_our.pt --source your_video.mp4 --save-txt

Due to data privacy, the picture is not shown here.

If there are multiple categories to track, specify them with --classes. For example, with two categories, add:

--classes 0 1 

 

Reference

1. yolov5 trains its own VOC dataset

2. [Pytorch learning] Import and preprocessing of image data sets

 
