Teach you to use YOLOv5 to train your own object detection model
Hello everyone, this is dejahu. I haven't updated for several months. When I checked my followers over the past two days, I found that more than 1k new friends had followed me, presumably readers of the coursework tutorial series. Since so many of you are following that series, and project completion and coursework submission deadlines are approaching, it is time for an update. Compared with the earlier fruit-and-vegetable classification and garbage identification posts, this one is a step up in both content and novelty: we will use YOLOv5 to train a mask detection model, which fits the current epidemic situation, and object detection involves many more knowledge points. Besides serving as coursework, this material can also be used as a graduation project. Without further ado, let's start today's content.
Bilibili explanation video: Teach you to use YOLOv5 to train your own object detection model
Code address: YOLOV5-mask-42: YOLOV5-based mask detection system - providing teaching video (gitee.com)
The processed data set and trained model: YOLOV5 mask detection data set + code + model 2000 labeled data + teaching video.zip-Deep Learning Document Resources-CSDN Library
More related datasets: List of target detection datasets - with YOLOV5 model training and use tutorial - Dejahu's Blog - CSDN Blog
Let's first look at the effect we want to achieve. We will train a mask detection model on our data and package it with PyQt5 to provide image mask detection, video mask detection, and real-time camera mask detection.
Download the code
The download address of the code is: [ YOLOV5-mask-42: Mask Detection System Based on YOLOV5 - Provide Teaching Video (gitee.com) ](https://github.com/ultralytics/yolov5)
Configure the environment
If you are not familiar with PyCharm or Anaconda, please read this CSDN blog first to learn their basic operations.
After Anaconda is installed, switch to a domestic (Chinese) mirror to improve download speed. The commands are as follows:
conda config --remove-key channels
conda config --add channels https://mirrors.ustc.edu.cn/anaconda/pkgs/main/
conda config --add channels https://mirrors.ustc.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.bfsu.edu.cn/anaconda/cloud/pytorch/
conda config --set show_channel_urls yes
pip config set global.index-url https://mirrors.ustc.edu.cn/pypi/web/simple
First create a Python 3.8 virtual environment by executing the following commands:
conda create -n yolo5 python==3.8.5
conda activate yolo5
PyTorch installation (GPU and CPU versions)
In practice, YOLOv5 runs under both CPU and GPU conditions, but training on a CPU is extremely slow, so install the GPU version of PyTorch if you have the hardware; if you don't, it is better to rent a server.
For the specific steps of GPU version installation, please refer to this article: Install GPU version of Tensorflow and Pytorch under Windows in 2021_dejahu's blog - CSDN Blog
The following points need to be noted:
- Before installing, be sure to update your graphics card driver: download the driver matching your card model from the official website
- 30-series graphics cards can only use CUDA 11 (or newer)
- Be sure to create a virtual environment so the various deep learning frameworks do not conflict with each other
Here I created a Python 3.8 environment and installed PyTorch 1.8.0 with the following commands:
conda install pytorch==1.8.0 torchvision torchaudio cudatoolkit=10.2 # note: this command pins both the PyTorch version and the CUDA version
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cpuonly # CPU-only users run this command instead
After the installation is complete, let's test whether the GPU is available.
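Assuming PyTorch was installed as above, you can verify the GPU build with a quick check in the Python interpreter:

```python
import torch

# Prints the installed PyTorch version, then True only if PyTorch was
# built with CUDA support and a usable GPU (with a matching driver) is visible.
print(torch.__version__)
print(torch.cuda.is_available())
```

If this prints `False` on a machine with an NVIDIA GPU, the usual culprits are an outdated driver or a CPU-only PyTorch build.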
Installation of pycocotools
Later I found a simpler installation method under Windows: you can install directly with the following command, without downloading the source first.
pip install pycocotools-windows
Installation of other packages
In addition, you need to install the program's other required packages, including opencv, matplotlib, and so on. These are simple to install via pip: cd into the yolov5 code directory and execute the following commands to complete the installation.
pip install -r requirements.txt
pip install pyqt5
pip install labelme
Run a test
Execute the following code in the yolov5 directory
python detect.py --source data/images/bus.jpg --weights pretrained/yolov5s.pt
After execution, the following information will be output
The results after detection can be found in the runs directory
According to the official instructions, this detection script is quite powerful and supports detecting various kinds of images and video streams. The specific usage is as follows:
python detect.py --source 0 # webcam
file.jpg # image
file.mp4 # video
path/ # directory
path/*.jpg # glob
'https://youtu.be/NUsoVlDFqZg' # YouTube video
'rtsp://example.com/media.mp4' # RTSP, RTMP, HTTP stream
data processing
Here the annotations are converted to YOLO's format; a dedicated post on data conversion for the earlier tutorials will be published later.
Data labeling: the recommended software here is labelimg, which can be installed via pip. Execute the following in your virtual environment:
pip install labelimg -i https://mirror.baidu.com/pypi/simple
Then run labelimg directly on the command line to start the data labeling software.
After the software is started, the interface is as follows:
Data annotation
Although we are training a YOLO model, we still choose to label in VOC format here: first, it makes the dataset easy to reuse in other code bases; second, I provide a format-conversion script.
The labeling process is:
1. Open the image directory
2. Set the directory where annotation files are saved, and enable auto-save
3. Start labeling: draw a box, assign the target's label, press Ctrl+S to save, then switch to the next image and repeat
The shortcut keys of labelimg are as follows. Learning the shortcut keys can help you improve the efficiency of data labeling.
After labeling is complete, you will get a series of txt files. These txt files are the object detection annotation files; each txt file corresponds one-to-one by name with an image file, as shown in the following figure:
Open a specific annotation file and you will see content like the following. Each line in the txt file represents one target; the space-separated fields are the target's class id, followed by the normalized center-point x coordinate, y coordinate, width w, and height h of the bounding box.
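To make the five-field format concrete, here is a small stdlib helper (the function names are my own, not part of YOLOv5):

```python
# Parse one line of a YOLO-format annotation file: five space-separated
# fields -- class id, then normalized center x, center y, width, height.
def parse_yolo_line(line):
    class_id, x_c, y_c, w, h = line.split()
    return int(class_id), float(x_c), float(y_c), float(w), float(h)

# Convert a normalized box back to pixel corner coordinates (x1, y1, x2, y2)
# for an image of the given width and height.
def to_pixel_box(x_c, y_c, w, h, img_w, img_h):
    return ((x_c - w / 2) * img_w, (y_c - h / 2) * img_h,
            (x_c + w / 2) * img_w, (y_c + h / 2) * img_h)

print(parse_yolo_line("0 0.5 0.5 0.2 0.4"))  # (0, 0.5, 0.5, 0.2, 0.4)
print(to_pixel_box(0.5, 0.5, 0.2, 0.4, 640, 480))
```

Note that because the coordinates are normalized, the same annotation file remains valid if the image is resized.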
4. Modify the dataset configuration file
The labeled data should be arranged in the following layout so the program can index it:
YOLO_Mask
└─ score
   ├─ images
   │  ├─ test  # test set images
   │  ├─ train # training set images
   │  └─ val   # validation set images
   └─ labels
      ├─ test  # test set labels
      ├─ train # training set labels
      └─ val   # validation set labels
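If your images and labels start out in flat folders, a minimal stdlib sketch for producing the layout above could look like this (the folder names and the 80/10/10 split are my assumptions, not fixed by YOLOv5):

```python
import random
import shutil
from pathlib import Path

def split_dataset(image_dir, label_dir, out_root, ratios=(0.8, 0.1, 0.1), seed=0):
    """Copy image/label pairs into the images/{train,val,test} and
    labels/{train,val,test} layout, split according to `ratios`."""
    images = sorted(Path(image_dir).glob('*.jpg'))
    random.Random(seed).shuffle(images)  # fixed seed for a reproducible split
    n_train = int(len(images) * ratios[0])
    n_val = int(len(images) * ratios[1])
    splits = {
        'train': images[:n_train],
        'val': images[n_train:n_train + n_val],
        'test': images[n_train + n_val:],
    }
    for split, files in splits.items():
        img_out = Path(out_root) / 'images' / split
        lbl_out = Path(out_root) / 'labels' / split
        img_out.mkdir(parents=True, exist_ok=True)
        lbl_out.mkdir(parents=True, exist_ok=True)
        for img in files:
            shutil.copy(img, img_out / img.name)
            lbl = Path(label_dir) / (img.stem + '.txt')
            if lbl.exists():  # skip images that have no annotation file
                shutil.copy(lbl, lbl_out / lbl.name)
```

You can of course also split the files by hand; the only thing that matters to the training script is the final directory layout.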
The configuration file here is for convenience during later training. We need to create a mask_data.yaml file in the data directory, as shown in the following figure:
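The exact contents depend on your dataset; a minimal sketch of what mask_data.yaml might contain (the paths and the class names 'mask' / 'no_mask' are assumptions, so adjust them to your own data):

```yaml
# mask_data.yaml -- dataset description used by train.py (paths are examples)
train: ../YOLO_Mask/score/images/train   # training set images
val: ../YOLO_Mask/score/images/val       # validation set images
test: ../YOLO_Mask/score/images/test     # test set images (optional)

nc: 2                                    # number of classes
names: ['mask', 'no_mask']               # class names, in class-id order
```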
At this point, the data set processing part is basically finished, and the following content will be model training!
model training
Basic training of the model
Create a model configuration file named mask_yolov5s.yaml under models, with the following content:
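The usual approach is to copy models/yolov5s.yaml and change only the class count. A sketch of the top of such a file (the depth/width multiples are the stock yolov5s values; the anchors, backbone, and head sections should be copied over unchanged):

```yaml
# mask_yolov5s.yaml -- a copy of models/yolov5s.yaml with nc changed
nc: 2                 # number of classes (was 80 in the stock file)
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
# anchors, backbone and head: keep exactly as in models/yolov5s.yaml
```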
Before model training, please make sure the following files are in the code directory
Execute the following code to run the program:
python train.py --data mask_data.yaml --cfg mask_yolov5s.yaml --weights pretrained/yolov5s.pt --epoch 100 --batch-size 4 --device cpu
After the training code starts successfully, the command line will output the following information; then just wait patiently for training to finish.
Depending on the dataset size and device performance, after a long wait the model finishes training, and the output looks like this:
The trained model and log files can be found in the runs/train/exp3 directory.
Of course, there are some handy tricks, such as resuming training from the point where it was interrupted; I'll leave those for you to explore on your own.
Model evaluation
In addition to the detection results shown at the beginning of this post, there are academic evaluation metrics that describe model performance. The most commonly used metric in object detection is mAP, a number between 0 and 1; the closer it is to 1, the better your model performs.
Generally, you will encounter two metrics: recall (r) and precision (p). Each judges model quality from one angle, and both lie between 0 and 1, where a value close to 1 means better performance and a value close to 0 means worse. To evaluate object detection comprehensively, the mean average precision (mAP) is generally used. By setting different confidence thresholds, we obtain the p and r values the model produces at each threshold. In general, p and r are negativelyively correlated, and plotting them gives the PR curve shown below; the area under this curve is called AP. Each class in an object detection model has its own AP value, and averaging the AP values over all classes gives the model's mAP. Taking this post as an example, we can compute AP values for the two classes, wearing a mask and not wearing a mask, then average them to obtain the mAP of the whole model; the closer that value is to 1, the better the model performs.
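To make "AP is the area under the PR curve" concrete, here is a tiny numerical sketch; the precision/recall values and the second class's AP are made up for illustration, not taken from the mask model:

```python
# Trapezoidal area under a curve given sample points (xs ascending)
def trapezoid_area(xs, ys):
    return sum((xs[i + 1] - xs[i]) * (ys[i + 1] + ys[i]) / 2
               for i in range(len(xs) - 1))

# Made-up (recall, precision) samples along one class's PR curve:
# as the confidence threshold is lowered, recall rises and precision falls.
recall = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
precision = [1.0, 0.95, 0.9, 0.8, 0.6, 0.4]

ap_mask = trapezoid_area(recall, precision)  # AP for one class
ap_no_mask = 0.75                            # the other class's AP (made up)
map_value = (ap_mask + ap_no_mask) / 2       # mAP = mean of per-class APs
print(round(ap_mask, 2), round(map_value, 2))
```

Real evaluation code (including YOLOv5's) interpolates the PR curve more carefully, but the idea is the same: one area per class, then an average over classes.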
For more rigorous definitions, see Zhihu or CSDN. Taking the model we trained this time as an example: after training ends, you will find three images representing the recall, precision, and mean average precision of the model on the validation set.
Taking the PR curve as an example, you can see that our model achieves a mean average precision of 0.832 on the validation set.
If these curves are not in your directory, it may be because training stopped halfway and the validation step never ran. You can generate the plots with the following command:
python val.py --data data/mask_data.yaml --weights runs/train/exp_yolov5s/weights/best.pt --img 640
Finally, here is a detailed list of evaluation metric definitions, which can be considered the most fundamental reference.
model usage
The model's usage is all integrated in the detect.py file; you can run detection on whatever content you want by following the examples below.
# detect from the webcam
python detect.py --weights runs/train/exp_yolov5s/weights/best.pt --source 0 # webcam
# detect an image file
python detect.py --weights runs/train/exp_yolov5s/weights/best.pt --source file.jpg # image
# detect a video file
python detect.py --weights runs/train/exp_yolov5s/weights/best.pt --source file.mp4 # video
# detect all files in a directory
python detect.py --weights runs/train/exp_yolov5s/weights/best.pt --source path/ # directory
# detect an online video
python detect.py --weights runs/train/exp_yolov5s/weights/best.pt --source 'https://youtu.be/NUsoVlDFqZg' # YouTube video
# detect a stream
python detect.py --weights runs/train/exp_yolov5s/weights/best.pt --source 'rtsp://example.com/media.mp4' # RTSP, RTMP, HTTP stream
For example, with our mask model, executing python detect.py --weights runs/train/exp_yolov5s/weights/best.pt --source data/images/fishman.jpg gives a detection result like this:
Build a visual interface
The visual interface lives in the window.py file, with the UI built in PyQt5. Before launching the interface, you need to replace the model with the one you trained; the place to change is around line 60 of window.py, where you can point it to your model's path. If you have a GPU, you can set device to 0, which means using GPU 0 and speeds up model inference.
After replacing the model, right-click and run window.py to launch the graphical interface. Go and test it yourself to see the effect.
Find me
You can find me through these channels:
Bilibili: Four Twelve-
CSDN: Four Twelve
Zhihu: Four Twelve
Weibo: Four Twelve-
Follow now and be an old friend!