Training YOLOv7 on your own data set (PyCharm)

Reference article: YoloV6 actual combat: Teach you step by step how to use Yolov6 for object detection (with data set)_yolov6 detects with its own model_AI Hao's blog-CSDN blog

YOLO | Use YOLOv7 to train your own data set (super detailed version)_yolo training set_Xia Tian|여름이다's blog-CSDN blog

1. Environment configuration

1. Download yolov7

git clone https://github.com/WongKinYiu/yolov7

2. Add environment

Open PyCharm and go to File -> Settings -> Python Interpreter -> Add Interpreter.

Find venv/bin/python.exe under the yolov7 folder and click Add.

3. Install the required libraries

Follow PyCharm's prompt to install the requirements with one click, or run the following in the terminal:

cd yolov7
pip install -r requirements.txt
pip install opencv-python-headless

Download the pretrained YOLOv7 weights:

wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7.pt

Download the YOLOv7 training weights:

wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7_training.pt

Create a new weights folder and copy the files above into it:

    mkdir weights
    cp yolov7.pt weights/
    cp yolov7_training.pt weights/ 
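For readers without wget (for example on Windows), the download and copy steps above could be done from Python instead. This is only a sketch: target_path and fetch_weights are hypothetical helper names, and the only assumption is that the two release URLs quoted above are still reachable.

```python
import urllib.request
from pathlib import Path

# Release URLs quoted in the text above.
WEIGHT_URLS = [
    "https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7.pt",
    "https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7_training.pt",
]


def target_path(url, weights_dir="weights"):
    """Return the local path a URL would be saved to (hypothetical helper)."""
    return Path(weights_dir) / url.rsplit("/", 1)[-1]


def fetch_weights(urls=WEIGHT_URLS, weights_dir="weights"):
    """Download each file into weights_dir, skipping files that already exist."""
    Path(weights_dir).mkdir(parents=True, exist_ok=True)
    for url in urls:
        dst = target_path(url, weights_dir)
        if not dst.exists():
            urllib.request.urlretrieve(url, str(dst))
    return [target_path(u, weights_dir) for u in urls]
```

Calling fetch_weights() from inside the yolov7 folder would leave both files under weights/, the same layout the shell commands produce.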

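Parameter description (these are passed to detect.py when running inference):

    --weights weights/yolov7.pt   # path to the trained model, i.e. the file just downloaded
    --source inference/images   # images to run prediction on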

2. Data processing

1. Get the data set

Use a public data set for testing. This one was annotated with Labelme. Download address:

https://download.csdn.net/download/hhhhhhhhhhwwwwwwwwww/14003627

After downloading, extract it into the yolov7 folder.

2. Generate data labels

Idea:

The first step is to use the train_test_split method to split the files into training, validation and test sets.
The second step is to call the change_2_yolo5 method, which converts the Labelme json annotations into txt labels in YOLOv5 format and returns the image lists of the training, validation and test sets.
The third step is to create the data set folders and copy the images and txt files into the corresponding directories.
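The first step, a 10% test split followed by a 10% validation split of the remainder, can be sketched with the standard library alone. This is an illustration of the proportions, not a replacement for train_test_split: split_names and three_way_split are hypothetical names, and the shuffle order will differ from scikit-learn's even with the same seed.

```python
import math
import random


def split_names(names, test_size=0.1, seed=55):
    """Shuffle names and cut off a held-out part of ceil(test_size * n) items."""
    shuffled = list(names)
    random.Random(seed).shuffle(shuffled)
    n_held = math.ceil(len(shuffled) * test_size)
    return shuffled[n_held:], shuffled[:n_held]


def three_way_split(names, test_size=0.1, seed=55):
    """Split into train/val/test by applying split_names twice."""
    trainval, test = split_names(names, test_size, seed)
    train, val = split_names(trainval, test_size, seed)
    return train, val, test


train, val, test = three_way_split([f"img_{i:03d}" for i in range(100)])
print(len(train), len(val), len(test))  # 81 9 10
```

With 100 annotated images this gives 81 for training, 9 for validation and 10 for testing, which is roughly the balance to expect from the script below.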

Create a new script make_yolo_data.py and insert the code:

import os
import shutil
import numpy as np
import json
from glob import glob
import cv2
from sklearn.model_selection import train_test_split
from os import getcwd


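def convert(size, box):
    """Convert (xmin, xmax, ymin, ymax) in pixels to normalized YOLO (x, y, w, h)."""
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    # Labelme coordinates are 0-based, so no 1-pixel offset is subtracted here.
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)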


def change_2_yolo5(files, txt_Name):
    """Write one YOLO-format txt per Labelme json and return the image file names.

    txt_Name is kept for compatibility with the callers but is not used.
    """
    imag_name = []
    for json_file_ in files:
        json_filename = labelme_path + json_file_ + ".json"
        with open(json_filename, "r", encoding="utf-8") as f:
            json_file = json.load(f)
        imag_name.append(json_file['imagePath'])
        # The image size is needed to normalize the box coordinates.
        height, width, channels = cv2.imread(labelme_path + json_file_ + ".jpg").shape
        # The context manager closes the label file before it is copied later.
        with open('%s/%s.txt' % (labelme_path, json_file_), 'w') as out_file:
            for multi in json_file["shapes"]:
                points = np.array(multi["points"])
                # Bounding box of the shape, clamped to non-negative coordinates.
                xmin = max(min(points[:, 0]), 0)
                xmax = max(max(points[:, 0]), 0)
                ymin = max(min(points[:, 1]), 0)
                ymax = max(max(points[:, 1]), 0)
                label = multi["label"]
                # Skip degenerate boxes with zero width or height.
                if xmax <= xmin or ymax <= ymin:
                    continue
                cls_id = classes.index(label)
                b = (float(xmin), float(xmax), float(ymin), float(ymax))
                bb = convert((width, height), b)
                out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
    return imag_name


def image_txt_copy(files, scr_path, dst_img_path, dst_txt_path):
    """
    :param files: list of image file names
    :param scr_path: directory containing the images and their txt labels
    :param dst_img_path: directory the images are copied to
    :param dst_txt_path: directory the txt labels are copied to
    :return: None
    """
    for file in files:
        img_path = scr_path + file
        shutil.copy(img_path, dst_img_path + file)
        # splitext keeps file names that contain extra dots intact.
        txt_name = os.path.splitext(file)[0] + '.txt'
        shutil.copy(scr_path + txt_name, dst_txt_path + txt_name)


if __name__ == '__main__':
    classes = ["aircraft", "oiltank"]
    # 1. Path to the Labelme annotations
    labelme_path = "datasets/LabelmeData/"
    isUseTest = True  # whether to create a test set (not used below; a test set is always created)
    # 2. Collect the files to process
    files = glob(labelme_path + "*.json")
    files = [i.replace("\\", "/").split("/")[-1].split(".json")[0] for i in files]
    # 3. Hold out 10% as the test set, then 10% of the remainder as the validation set
    trainval_files, test_files = train_test_split(files, test_size=0.1, random_state=55)
    train_files, val_files = train_test_split(trainval_files, test_size=0.1, random_state=55)
    # 4. Convert the annotations to YOLO txt labels
    train_name_list = change_2_yolo5(train_files, "train")
    print(train_name_list)
    val_name_list = change_2_yolo5(val_files, "val")
    test_name_list = change_2_yolo5(test_files, "test")
    # 5. Create the data set folders
    file_List = ["train", "val", "test"]
    for file in file_List:
        os.makedirs('./datasets/images/%s' % file, exist_ok=True)
        os.makedirs('./datasets/labels/%s' % file, exist_ok=True)
    # 6. Copy the images and labels into the folders
    image_txt_copy(train_name_list, labelme_path, './datasets/images/train/', './datasets/labels/train/')
    image_txt_copy(val_name_list, labelme_path, './datasets/images/val/', './datasets/labels/val/')
    image_txt_copy(test_name_list, labelme_path, './datasets/images/test/', './datasets/labels/test/')

Remember to change the paths in the main block to match your setup, in particular the destination paths in the final image_txt_copy calls, which decide where the data set is saved.
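Each line of a generated txt file holds a class index followed by the box centre, width and height, all divided by the image size. A small worked example may make the format concrete. The helper names to_yolo_box and format_label and the sample numbers are made up for illustration; the six-decimal rounding is a display choice, not something the script does.

```python
def to_yolo_box(img_w, img_h, xmin, xmax, ymin, ymax):
    """Normalize a pixel box to YOLO (x_center, y_center, width, height)."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def format_label(cls_id, box):
    """Format one label line: class index followed by the four box values."""
    return f"{cls_id} " + " ".join(f"{v:.6f}" for v in box)


# A 200x100 pixel box for class 0 inside a 640x480 image.
line = format_label(0, to_yolo_box(640, 480, 100, 300, 50, 150))
print(line)  # 0 0.312500 0.208333 0.312500 0.208333
```

All four values must lie between 0 and 1; a value outside that range usually means the width and height were swapped or the annotation extends past the image.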

After the script runs, the data set is divided into train, val and test subsets under datasets/images and datasets/labels.
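A quick way to confirm the result is to count the images in each split and list any image without a matching txt label. The sketch below assumes the datasets/images/<split> and datasets/labels/<split> layout created by the script; summarize_dataset is a hypothetical helper name.

```python
from pathlib import Path


def summarize_dataset(root="datasets", splits=("train", "val", "test")):
    """Count images per split and list images that have no txt label."""
    summary = {}
    for split in splits:
        img_dir = Path(root) / "images" / split
        lbl_dir = Path(root) / "labels" / split
        images = []
        if img_dir.is_dir():
            images = sorted(p for p in img_dir.iterdir() if p.is_file())
        missing = [p.name for p in images
                   if not (lbl_dir / (p.stem + ".txt")).exists()]
        summary[split] = {"images": len(images), "missing_labels": missing}
    return summary


if __name__ == "__main__":
    for split, info in summarize_dataset().items():
        print(split, info)
```

An empty missing_labels list for every split suggests that the copy step completed for all files.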

 

3. Training

1. Make your own configuration file

Two configuration files are needed. The first is yolov7-mydataset.yaml, located under yolov7/cfg/training: copy yolov7.yaml, paste it as yolov7-mydataset.yaml, and change the number of classes to get your own model configuration file.

nc is the number of classes; change it to your own. For example, change the 38 below to the number of classes in your data set (2 for the data set used here).

# parameters
nc: 38 # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
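If the copy-and-edit step is repeated often, the nc line could be rewritten programmatically. The following is a minimal sketch that works on the file contents as text; set_num_classes is a hypothetical helper, and it assumes nc appears once at the start of a line, as in the fragment above.

```python
import re


def set_num_classes(yaml_text, num_classes):
    """Return yaml_text with the value of the top-level `nc:` entry replaced."""
    new_text, count = re.subn(r"(?m)^nc:\s*\d+", f"nc: {num_classes}",
                              yaml_text, count=1)
    if count == 0:
        raise ValueError("no top-level 'nc:' line found")
    return new_text


sample = "# parameters\nnc: 38 # number of classes\ndepth_multiple: 1.0\n"
print(set_num_classes(sample, 2))
```

Reading cfg/training/yolov7.yaml, passing its text through this function and writing the result to cfg/training/yolov7-mydataset.yaml would reproduce the manual edit; trailing comments and all other lines are left untouched.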

The second is mydataset.yaml, located under yolov7/data. Copy an existing yaml file there, rename the copy to mydataset.yaml, and change it as follows:

# Paths are relative to the directory train.py is run from (here the yolov7 folder)
train: ./datasets/images/train # train images
val: ./datasets/images/val # val images
test: ./datasets/images/test # test images (optional)

# whether it is coco dataset, only coco dataset should be set to True.
is_coco: False
# Classes
nc: 2  # number of classes
names: ['aircraft', 'oiltank']  # class names
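The class names here must appear in the same order as the classes list in make_yolo_data.py, because the label files store each class as its index in that list, and nc must equal the number of names. One way to keep the two in step is to generate the file from a single list. This is a sketch: render_data_yaml is a hypothetical helper, and it assumes class names without quote characters.

```python
def render_data_yaml(names, root="./datasets"):
    """Build the text of a data yaml; nc is derived from names so they cannot disagree."""
    lines = [
        f"train: {root}/images/train  # train images",
        f"val: {root}/images/val  # val images",
        f"test: {root}/images/test  # test images (optional)",
        f"nc: {len(names)}  # number of classes",
        f"names: {list(names)!r}  # class names",
    ]
    return "\n".join(lines) + "\n"


print(render_data_yaml(["aircraft", "oiltank"]))
```

The printed text can be saved as data/mydataset.yaml.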

2. Modify train.py

--weights: path to the pretrained weights; pass '' to train without pretrained weights
--cfg: model configuration path (the newly created yolov7-mydataset.yaml file in ./cfg/training)
--data: data set configuration path (the newly created mydataset.yaml file in ./data)
--epochs: number of training epochs
--batch-size: batch size
--device: training device: cpu trains on the CPU, 0 trains on one GPU, 0,1,2,3 trains on multiple GPUs
--workers: maximum number of dataloader workers
--name: results are saved to project/name

Find where these parameters are defined in train.py and change their default values.

parser.add_argument('--cfg', type=str, default='cfg/training/yolov7-mydataset.yaml', help='model.yaml path')
parser.add_argument('--data', type=str, default='data/mydataset.yaml', help='data.yaml path')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
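Editing the defaults is not the only option: the same parameters can be passed on the command line, which leaves train.py unmodified. The sketch below only builds the argument list. build_train_command is a hypothetical helper, and the epoch, batch size and worker values are placeholders rather than recommendations.

```python
import sys


def build_train_command(cfg, data, weights="", epochs=100, batch_size=8,
                        device="0", workers=4, name="exp"):
    """Build the argument list for launching train.py with the flags listed above."""
    return [
        sys.executable, "train.py",
        "--weights", weights,
        "--cfg", cfg,
        "--data", data,
        "--epochs", str(epochs),
        "--batch-size", str(batch_size),
        "--device", device,
        "--workers", str(workers),
        "--name", name,
    ]


cmd = build_train_command(
    cfg="cfg/training/yolov7-mydataset.yaml",
    data="data/mydataset.yaml",
    weights="weights/yolov7_training.pt",
)
print(" ".join(cmd))
```

The list can be handed to subprocess.run(cmd, check=True) from inside the yolov7 folder, or the printed line can be pasted into the PyCharm terminal.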

Run train.py and wait for training to complete.

The training and validation results are saved under runs/train/exp.

4. Test

1. Modify test.py configuration parameters

Open test.py and modify the parameters.

Note that runs/train/exp4/weights/best.pt is generated by training; change it to the path of your own weights.

    parser = argparse.ArgumentParser(prog='test.py')
    parser.add_argument('--weights', nargs='+', type=str, default='runs/train/exp4/weights/best.pt', help='model.pt path(s)')
    parser.add_argument('--data', type=str, default='data/mydataset.yaml', help='*.data path')

    parser.add_argument('--task', default='test', help='train, val, test, speed or study')
    parser.add_argument('--device', default='0', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')

Here, --weights points to the weights in the training output folder from the previous step, and --data is the same file used for training.

weights: path to the trained weights.
source: path to the test images. I put the test images under the tools/images folder.
img-size: image size, consistent with the training images.
conf-thres: minimum confidence for a detection to be kept.
iou-thres: IoU threshold.
max-det: maximum number of detections in a single image.
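Because the run folder name (exp, exp4, ...) changes from one training run to the next, the path to best.pt can be looked up instead of typed by hand. The following is a sketch that picks the most recently modified checkpoint; latest_best_weights is a hypothetical helper and it assumes the runs/train/<run name>/weights/best.pt layout shown above.

```python
from pathlib import Path


def latest_best_weights(runs_dir="runs/train"):
    """Return the most recently modified best.pt under runs_dir, or None if absent."""
    candidates = list(Path(runs_dir).glob("*/weights/best.pt"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    print(latest_best_weights())
```

The returned path can then be used as the --weights value.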

2. Result display

Pre-test pictures:

Generated results:

Experiment ends

5. Problem analysis

1. Path problem (resolved)

Some paths must be written as absolute paths, starting from /home; otherwise the program will not run.
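Rather than typing the full path by hand, a relative path can be turned into an absolute one before it is written into a config file. This is a small sketch using pathlib; to_absolute is a hypothetical helper and /home/user/yolov7 is a placeholder for the real project location.

```python
from pathlib import Path


def to_absolute(path, base=None):
    """Resolve a possibly relative path against base (default: current directory)."""
    p = Path(path).expanduser()
    if not p.is_absolute():
        p = Path(base or Path.cwd()) / p
    return p.resolve()


# "/home/user/yolov7" is a placeholder, not a path from the original setup.
print(to_absolute("datasets/images/train", base="/home/user/yolov7"))
```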

2. GPU training (to be solved)

CUDA and the related environment are configured, but the CUDA check still shows False, so the GPU cannot be used for training.

Alternative: train on the CPU. The disadvantage is that it is very slow, so the number of epochs has to be reduced.

Still needs to be resolved.
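A first diagnostic step could be to print what PyTorch itself reports. The sketch below is not a fix: cuda_report is a hypothetical helper that only collects facts, on the assumption that the False mentioned above comes from torch.cuda.is_available(). One common cause is a CPU-only PyTorch build, in which case torch.version.cuda is None.

```python
def cuda_report():
    """Collect facts that often explain why CUDA is reported as unavailable."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for a CPU-only build
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }


print(cuda_report())
```

If built_with_cuda is None, reinstalling a CUDA-enabled PyTorch build inside the same venv that PyCharm uses would be the next thing to try.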

Origin blog.csdn.net/zhuanzhuxuexi/article/details/132042186