Latest in 2023: Use YOLOv8 to train on your own dataset

1. Code download

First, download YOLOv8 from the official GitHub repository page, or clone it with git.

git clone https://github.com/ultralytics/ultralytics

Then open the project in PyCharm.

2. Environment configuration

2.1 Create a new environment

Open the conda prompt from the Start menu (for example, Anaconda Prompt) and create a new environment named yolov8.

conda create -n yolov8 python=3.8

Activate the new environment.

conda activate yolov8

2.2 Install pytorch

You can find the matching installation commands on the PyTorch official website. torch 1.12.0 or later is recommended. Installation commands for several CUDA versions are listed below (torch 2.0.0 for CUDA 11.8, torch 1.13.1 for CUDA 11.7, and torch 1.12.0 for the rest); choose the one that matches your machine.

CUDA 11.8

pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118

CUDA 11.7

pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117

CUDA 11.6

pip install torch==1.12.0+cu116 torchvision==0.13.0+cu116 --extra-index-url https://download.pytorch.org/whl/cu116

CUDA 11.3

pip install torch==1.12.0+cu113 torchvision==0.13.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113

CUDA 10.2

pip install torch==1.12.0+cu102 torchvision==0.13.0+cu102 --extra-index-url https://download.pytorch.org/whl/cu102

CPU only

pip install torch==1.12.0+cpu torchvision==0.13.0+cpu --extra-index-url https://download.pytorch.org/whl/cpu

2.3 Install third-party packages

The requirements.txt file in the project lists the third-party packages we need to install.
First change the terminal's working directory to the folder that contains requirements.txt, then enter the following command.

pip install -r requirements.txt -i https://mirrors.bfsu.edu.cn/pypi/web/simple/

2.4 Install ultralytics

The ultralytics package bundles the YOLO models and the tools that go with them.

pip install ultralytics

2.5 Bug resolution

If [WARNING: Ignore distutils configs in setup.cfg due to encoding errors] appears during installation, you can simply save setup.cfg as a txt file.

2.6 Manually download weights

YOLOv8 downloads the weights automatically, but the download server is overseas and the download often fails. So download whichever model weights you need manually first; they are available from the GitHub project.
Then paste the weight file into the detect folder.

2.7 Check availability

Run a prediction on the official sample images with the following command.

yolo task=detect mode=predict model=yolov8n.pt source=assets/  device=cpu save=True


3. Train your own data set

3.1 Processing data sets

Because I mainly work on object detection, I put all the datasets under detect.

Our dataset is in VOC format, so it needs to be converted to YOLO format. First create the folder layout the scripts expect: VOCdevkit/JPEGImages for the images and VOCdevkit/Annotations for the XML annotation files.
Run xml2txt.py. It converts the XML annotation files in Annotations into YOLO-format txt annotation files, written to VOCdevkit/txt.

import xml.etree.ElementTree as ET
import os
import cv2
import numpy as np

# Class names in the order they are first seen; the list index becomes the class id
classes = []

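def convert(size, box):
    # size is (image width, image height); box is (xmin, xmax, ymin, ymax) in pixels
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    # Box center, shifted by 1 pixel
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    # Box width and height
    w = box[1] - box[0]
    h = box[3] - box[2]
    # Normalize by the image size, as the YOLO label format requires
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)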


def convert_annotation(xmlpath, xmlname):
    with open(xmlpath, "r", encoding='utf-8') as in_file:
        txtname = xmlname[:-4] + '.txt'
        txtfile = os.path.join(txtpath, txtname)
        tree = ET.parse(in_file)
        root = tree.getroot()
        filename = root.find('filename')
        img = cv2.imdecode(np.fromfile('{}/{}.{}'.format(imgpath, xmlname[:-4], postfix), np.uint8), cv2.IMREAD_COLOR)
        h, w = img.shape[:2]
        res = []
        for obj in root.iter('object'):
            cls = obj.find('name').text
            if cls not in classes:
                classes.append(cls)
            cls_id = classes.index(cls)
            xmlbox = obj.find('bndbox')
            b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text),
                 float(xmlbox.find('ymax').text))
            bb = convert((w, h), b)
            res.append(str(cls_id) + " " + " ".join([str(a) for a in bb]))
        if len(res) != 0:
            with open(txtfile, 'w+') as f:
                f.write('\n'.join(res))


if __name__ == "__main__":
    postfix = 'jpg'                      # image file extension
    imgpath = 'VOCdevkit/JPEGImages'     # input images
    xmlpath = 'VOCdevkit/Annotations'    # input VOC XML annotations
    txtpath = 'VOCdevkit/txt'            # output YOLO txt labels

    os.makedirs(txtpath, exist_ok=True)

    file_list = os.listdir(xmlpath)
    error_file_list = []
    for name in file_list:
        try:
            path = os.path.join(xmlpath, name)
            if name.lower().endswith('.xml'):
                convert_annotation(path, name)
                print(f'file {name} convert success.')
            else:
                print(f'file {name} is not xml format.')
        except Exception as e:
            print(f'file {name} convert error.')
            print(f'error message:\n{e}')
            error_file_list.append(name)
    print(f'this file convert failure\n{error_file_list}')
    print(f'Dataset Classes:{classes}')

Save the class list printed on the last line (Dataset Classes); you will need it later when filling in the yaml file.
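To make the conversion concrete, here is a minimal sketch of the same arithmetic that convert() performs, applied to one box. The function name voc_to_yolo, the 640x480 image size, and the box coordinates are made up for illustration; they do not come from the dataset used in this article.

```python
def voc_to_yolo(img_w, img_h, xmin, xmax, ymin, ymax):
    """Convert a VOC pixel box to normalized YOLO (x_center, y_center, width, height)."""
    # Same arithmetic as convert() in xml2txt.py, including the 1-pixel shift of the center
    x_center = ((xmin + xmax) / 2.0 - 1) / img_w
    y_center = ((ymin + ymax) / 2.0 - 1) / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# Hypothetical 640x480 image with one box spanning x 100..300 and y 120..360
box = voc_to_yolo(640, 480, xmin=100, xmax=300, ymin=120, ymax=360)

# One label line: class id followed by the four normalized values
print("0 " + " ".join(f"{v:.6f}" for v in box))
```

All four values should fall between 0 and 1. If any of them does not, the XML box probably extends past the image border or the image size was read incorrectly.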
Run split_data.py. This file divides the data into training, validation, and test sets.

import os, shutil
from sklearn.model_selection import train_test_split

val_size = 0.1
test_size = 0.2
postfix = 'jpg'
imgpath = 'VOCdevkit/JPEGImages'
txtpath = 'VOCdevkit/txt'

os.makedirs('images/train', exist_ok=True)
os.makedirs('images/val', exist_ok=True)
os.makedirs('images/test', exist_ok=True)
os.makedirs('labels/train', exist_ok=True)
os.makedirs('labels/val', exist_ok=True)
os.makedirs('labels/test', exist_ok=True)

listdir = [i for i in os.listdir(txtpath) if 'txt' in i]
train, test = train_test_split(listdir, test_size=test_size, shuffle=True, random_state=0)
train, val = train_test_split(train, test_size=val_size, shuffle=True, random_state=0)
print(f'train set size:{len(train)} val set size:{len(val)} test set size:{len(test)}')

for i in train:
    shutil.copy('{}/{}.{}'.format(imgpath, i[:-4], postfix), 'images/train/{}.{}'.format(i[:-4], postfix))
    shutil.copy('{}/{}'.format(txtpath, i), 'labels/train/{}'.format(i))

for i in val:
    shutil.copy('{}/{}.{}'.format(imgpath, i[:-4], postfix), 'images/val/{}.{}'.format(i[:-4], postfix))
    shutil.copy('{}/{}'.format(txtpath, i), 'labels/val/{}'.format(i))

for i in test:
    shutil.copy('{}/{}.{}'.format(imgpath, i[:-4], postfix), 'images/test/{}.{}'.format(i[:-4], postfix))
    shutil.copy('{}/{}'.format(txtpath, i), 'labels/test/{}'.format(i))
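Note that the script splits in two stages: the test set is taken from the whole dataset, and the validation set is then taken from what remains. So val_size = 0.1 means 10% of the remaining 80%, not 10% of the total. The sketch below estimates the resulting counts. The function name split_sizes is made up, and it assumes each held-out share is rounded up, so treat the numbers as an estimate rather than a guarantee.

```python
import math


def split_sizes(n, test_size=0.2, val_size=0.1):
    """Estimate (train, val, test) counts for the two-stage split in split_data.py."""
    n_test = math.ceil(n * test_size)        # stage 1: test set from all samples
    remaining = n - n_test
    n_val = math.ceil(remaining * val_size)  # stage 2: val set from the remainder
    return remaining - n_val, n_val, n_test


print(split_sizes(100))  # (72, 8, 20)
```

With 100 labeled images, this works out to roughly a 72/8/20 split.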

Create a new data.yaml file containing the dataset paths and the class names saved earlier.
The paths must be written as absolute paths, otherwise an error will be reported.
The dataset is now ready.
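Since the original screenshot of data.yaml is not available, here is a minimal sketch that builds such a file with absolute paths. It assumes the common Ultralytics dataset keys train, val, test, nc, and names, and the images/train, images/val, images/test folders created by split_data.py. The function name build_data_yaml and the class names fish and crab are placeholders; use the Dataset Classes list printed by xml2txt.py, in the same order, because the class id is the index in that list.

```python
from pathlib import Path


def build_data_yaml(dataset_root, class_names):
    """Return the text of a data.yaml that uses absolute image-folder paths."""
    root = Path(dataset_root).resolve()  # make the paths absolute
    lines = [
        f"train: {(root / 'images' / 'train').as_posix()}",
        f"val: {(root / 'images' / 'val').as_posix()}",
        f"test: {(root / 'images' / 'test').as_posix()}",
        f"nc: {len(class_names)}",
        "names:",
    ]
    # Class ids must follow the order printed by xml2txt.py
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# Placeholder class names; replace them with your own Dataset Classes list
text = build_data_yaml(".", ["fish", "crab"])
print(text)
```

To save the result, write the returned text to a file, for example with Path("data.yaml").write_text(text).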

3.2 Training data

Enter the training command.

yolo task=detect mode=train model=yolov8s.yaml data=yolo/v8/detect/fish_datasets/data.yaml epochs=100 batch=4

3.3 Validation

Enter the validation command to evaluate the trained model.

yolo task=detect mode=val model=ultralytics/yolo/v8/detect/runs/detect/train5/weights/best.pt  data=ultralytics/yolo/v8/detect/fish_datasets/data.yaml device=cpu
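The train5 folder in the command above is specific to my machine: each training run gets its own numbered folder (train, train2, train3, and so on), so your path will likely differ. The sketch below looks up the most recently modified best.pt. The function name latest_best_weights and the default runs/detect location are assumptions; pass the runs folder your own project uses.

```python
from pathlib import Path


def latest_best_weights(runs_dir="runs/detect"):
    """Return the most recently modified train*/weights/best.pt under runs_dir, or None."""
    candidates = list(Path(runs_dir).glob("train*/weights/best.pt"))
    if not candidates:
        return None
    # Pick the file with the newest modification time
    return max(candidates, key=lambda p: p.stat().st_mtime)


# Prints None when no training run exists under the assumed default folder
print(latest_best_weights())
```

The returned path can then be passed as the model= argument of the validation, prediction, and export commands.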


3.4 Prediction

Enter the prediction command to run the trained model on the validation images.

yolo task=detect mode=predict model=ultralytics/yolo/v8/detect/runs/detect/train5/weights/best.pt source=ultralytics/yolo/v8/detect/fish_datasets/images/val  device=cpu


3.5 Model export

Use the following command to export the model

yolo task=detect mode=export model=ultralytics/yolo/v8/detect/runs/detect/train5/weights/best.pt 

This article draws on tutorials written by several other authors, which are also worth reading:
YOLOV8's strongest operation tutorial
YOLOv8 tutorial series: 1. Use a custom dataset to train the YOLOv8 model (detailed version of the tutorial, you only need to read one article -> Parameter adjustment guide), including environment construction/data preparation/model training/prediction/verification/export and more
YOLOv8 from environment construction to inference training


Origin blog.csdn.net/weixin_44752340/article/details/130846432