Step-by-step tutorial: training the YOLO series (YOLOv5/YOLOv7/YOLOv8) on your own dataset

This blog post shows you how to get the YOLO series of algorithms up and running quickly and easily. If you spot anything that needs correcting, please leave a comment and I will update the post.

1. Experimental environment

The official source code for the YOLO series is written with the PyTorch framework, so you need a working PyTorch environment on your computer. You then install the libraries listed in requirements.txt; the list differs between YOLO algorithms and between versions of the same algorithm. I covered how to set up a PyTorch environment in an earlier post, linked below.

[Learning experience sharing NO.22] PyTorch environment construction
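Before training, it can help to check which of the libraries in requirements.txt are still missing from your environment. The following is a minimal sketch using only the Python standard library. The function name `missing_requirements` is my own, and it assumes each line starts with the package name, as in the requirements.txt files shipped with the YOLO repositories.

```python
import re
from importlib import metadata
from pathlib import Path


def missing_requirements(lines):
    """Return the package names from requirements-style lines that are not installed."""
    missing = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blank lines and pip options
            continue
        # Keep only the package name, e.g. "torch>=1.7.0" -> "torch".
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match is None:
            continue
        name = match.group(0)
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Assumption: the script is run from the root of the downloaded YOLO project.
req_file = Path("requirements.txt")
if req_file.exists():
    print("Missing packages:", missing_requirements(req_file.read_text().splitlines()))
```

Anything it reports can then be installed with `pip install -r requirements.txt` inside the PyTorch environment.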

2. Data set preparation

This section uses NEU-DET, the public steel surface defect dataset from Northeastern University, as the example. First split the dataset as shown below into a training set, a validation set and a test set, in a ratio of 6:2:2 or 7:1:2.

Some people split off only a training set and a validation set, with no test set. Reporting the mAP printed at the end of train.py as the final evaluation of the model is then inaccurate, because that mAP is measured on the validation set, which has already been used to tune the model during training. An analogy: the training set is the material students study in class, the validation set is the homework and practice tests that can be attempted and corrected again and again, and the test set is the final exam. Whether a student has learned well is ultimately judged by the final exam. If you are interested, Andrew Ng's deep learning course videos explain this in more detail.

I have also covered the code for splitting the dataset and converting it to YOLO format in an earlier post:

Learning experience sharing five: YOLOv5 data set division and YOLO format conversion

2906698832c745a6a10635c220ccb5a8.png
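The 6:2:2 split described above can be sketched as follows. This is a minimal illustration, not the code from my earlier post: the function name `split_dataset` and the placeholder file names are assumptions, and in practice you would pass in the real image file names and then copy each image and its label file into the matching folder.

```python
import random


def split_dataset(items, ratios=(0.6, 0.2, 0.2), seed=0):
    """Shuffle items and split them into train/val/test lists by the given ratios."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    items = sorted(items)  # fixed starting order so the split is reproducible
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * ratios[0])
    n_val = round(len(items) * ratios[1])
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


# Hypothetical image names standing in for the real files.
names = [f"img_{i:04d}.jpg" for i in range(1800)]
train, val, test = split_dataset(names)
print(len(train), len(val), len(test))  # 1080 360 360
```

Passing `ratios=(0.7, 0.1, 0.2)` gives the 7:1:2 split instead. Fixing the seed keeps the test set the same between runs, so results from different experiments stay comparable.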

To create data.yaml, you need to specify the paths to the data, the number of classes nc in the dataset, and the class names.

e103b08694274112b68624af8dcf0ce9.png
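As a rough illustration of what goes into data.yaml, the sketch below builds the file contents for NEU-DET, which has six defect classes. The dataset root and the images/train, images/val, images/test folder layout are assumptions for this example, and the helper `build_data_yaml` is my own; replace the paths with wherever your split actually lives. The list-style `names` shown here is the form used by YOLOv5, and other versions may expect a slightly different layout.

```python
# The six NEU-DET defect categories.
CLASS_NAMES = ["crazing", "inclusion", "patches",
               "pitted_surface", "rolled-in_scale", "scratches"]


def build_data_yaml(root, names):
    """Return the text of a YOLOv5-style data.yaml for a dataset rooted at `root`."""
    lines = [
        f"train: {root}/images/train",  # hypothetical folder layout
        f"val: {root}/images/val",
        f"test: {root}/images/test",
        f"nc: {len(names)}",            # number of classes
        "names: [" + ", ".join(f"'{n}'" for n in names) + "]",
    ]
    return "\n".join(lines) + "\n"


yaml_text = build_data_yaml("../datasets/NEU-DET", CLASS_NAMES)
print(yaml_text)
# To save it: open("data.yaml", "w", encoding="utf-8").write(yaml_text)
```

Deriving nc from the length of the name list avoids the common mistake of the two values disagreeing.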

3. YOLO series algorithm project construction

First download the source code of the YOLO algorithm you want to use from its official site, open the project in the PyCharm IDE, and configure the interpreter environment. The configure button is shown below.

255c66132ff442eab693d9b8df812c76.png

The following is the YOLOv5 project. train.py trains the model on the dataset, val.py evaluates the trained model, and detect.py applies the trained model to run detection.

e59f38fd4ebb451da0a59e9317fae556.png

To train on your own dataset, modify the following parameters in train.py. Set data to the yaml file of your own dataset. Set project and name to something that identifies the training run, so that results are easy to keep track of. Set cfg to the network configuration file. Set epochs: the default is generally 300, and if you load a pre-trained model you can lower it to 100. Set resume to True if training was interrupted partway, which continues training from the last checkpoint. Set batch-size according to your graphics card; it is best to increase it gradually in multiples of 8. The default here is 32, but defaults differ between versions, so check your own train.py.

8e4b5c6dc96f49b1a8f6e7ad8a3490d0.png
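Instead of editing the defaults inside train.py, the same parameters can be passed on the command line. The sketch below only builds and prints the command for YOLOv5's train.py; the yaml paths, weights file and run name are hypothetical placeholders, and the helper `build_train_command` is my own. YOLOv7 and YOLOv8 use different entry points, so treat this as a YOLOv5 example only.

```python
import shlex
import sys


def build_train_command(data, cfg, weights, epochs=100, batch_size=16,
                        project="runs/train", name="neu_det_exp", resume=False):
    """Build the argument list for running YOLOv5's train.py."""
    cmd = [sys.executable, "train.py"]
    if resume:
        # With --resume alone, YOLOv5 continues from the most recent checkpoint.
        return cmd + ["--resume"]
    return cmd + [
        "--data", data,                  # your own dataset yaml
        "--cfg", cfg,                    # network configuration file
        "--weights", weights,            # pre-trained weights to start from
        "--epochs", str(epochs),
        "--batch-size", str(batch_size),
        "--project", project,
        "--name", name,
    ]


# Hypothetical file names; replace them with your own.
cmd = build_train_command("data/neu_det.yaml", "models/yolov5s.yaml", "yolov5s.pt")
print(shlex.join(cmd))
# To start training from the project root: subprocess.run(cmd, check=True)
```

Keeping the settings in a command rather than in the source file also makes it easier to record exactly what each run used.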

In val.py, evaluate the model you have trained. The parameters are set in the same way as in train.py; the one difference is that I recommend changing iou-thres to 0.5 when testing mAP@0.5. Note that in YOLOv5's val.py, iou-thres is the IoU threshold used for NMS, and mAP@0.5 is reported whichever value you choose. The hardware generally does not affect the mAP obtained from evaluation, so you do not need to worry that a model trained and validated on a server will give different results on your own computer.

43df71c4c2c2409e803597b873f93380.png
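The evaluation step can be run from the command line in the same way. The sketch below builds a YOLOv5 val.py command that evaluates on the test split rather than the validation split, which is what the final reported mAP should be based on. The weights path assumes a training run named neu_det_exp, which is a placeholder, and `build_val_command` is a helper of my own.

```python
import shlex
import sys


def build_val_command(data, weights, task="test", iou_thres=0.5):
    """Build the argument list for running YOLOv5's val.py."""
    return [
        sys.executable, "val.py",
        "--data", data,
        "--weights", weights,
        "--task", task,                 # "test" uses the test: entry in data.yaml
        "--iou-thres", str(iou_thres),
    ]


# Hypothetical paths; YOLOv5 saves best.pt under <project>/<name>/weights/.
cmd = build_val_command("data/neu_det.yaml",
                        "runs/train/neu_det_exp/weights/best.pt")
print(shlex.join(cmd))
# To run the evaluation from the project root: subprocess.run(cmd, check=True)
```

Using `--task test` requires a test entry in data.yaml, which is one more reason to split off a test set from the start.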

detect.py runs detection with the model you trained. The parameter you need to modify is weights: set it to the path of the weights file produced by your own training. Adjust the other parameters as needed.

c2d49987ec7e4435982d56f1c05e62c6.png
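For completeness, here is a sketch of the matching detect.py command for YOLOv5. The weights path and the image folder are placeholders, the confidence threshold of 0.25 is only an example value to tune, and `build_detect_command` is my own helper name.

```python
import shlex
import sys


def build_detect_command(weights, source, conf_thres=0.25):
    """Build the argument list for running YOLOv5's detect.py."""
    return [
        sys.executable, "detect.py",
        "--weights", weights,            # weights from your own training run
        "--source", source,              # image, folder or video to run on
        "--conf-thres", str(conf_thres),
    ]


# Hypothetical paths; replace them with your own.
cmd = build_detect_command("runs/train/neu_det_exp/weights/best.pt",
                           "../datasets/NEU-DET/images/test")
print(shlex.join(cmd))
# To run detection from the project root: subprocess.run(cmd, check=True)
```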

 


Origin blog.csdn.net/m0_70388905/article/details/130036429