learn better from others,
be the better one.
—— "Weika Zhixiang"
This article is about 4,238 words long; expected reading time: 9 minutes
Foreword
The previous article, "OpenCV--Self-study Notes", set up the yolov5 environment. For a target-detection application, the most important things are training on your own dataset and running inference, so this article is dedicated to using yolov5 to train your own data and then running inference with OpenCV's DNN module.
Achieved effect
What is the core of getting yolov5 to train on your own data and recognize it?
To train yolov5 on your own dataset, the core task is labeling the files. As the figure above shows, we are recognizing the digits of a Huarong Road sliding puzzle; the third-party tool labelimg is used to classify and label each number.
Because in the end we use the DNN module of C++ OpenCV for inference, we also need to install the onnx library so the model can be exported to ONNX format.
Install labelimg and onnx
## Activate the yolov5 virtual environment set up in the previous article
conda activate yolov5
## Install labelimg
pip install labelimg
I had already installed it, so running the command again simply reports that the requirement is already satisfied.
## Next, install onnx
pip install onnx
Download training images
As the effect above shows, we are again recognizing digital Huarong Road, so we must first collect our training pictures and create the training directory.
As the figures above show, we created a numpuzzles folder containing two subfolders, images (image path) and labels (label path), and under each of them a train (training set) and a val (validation set) folder. With the folders created, the next step is to download training pictures. I found 50 pictures of digital Huarong Road on the Internet and stored them in the images/train folder.
images/val contains two previously saved images. The number of validation images is a bit small, but it doesn't matter.
With that, the preparatory work is done; the next, core step is labeling the data.
Labeling the data
Run labelimg on the command line to open the labeling software
Click Open Dir and select the folder of pictures we downloaded.
Click the format button to switch to Yolo. The default may not be Yolo, so click until it switches to Yolo.
Then click Create RectBox. After drawing a rectangle with the mouse over the area to be labeled, a label prompt box pops up. Enter the label value manually the first time; afterwards you can type it in or select an existing value. After clicking OK, the current object is labeled successfully, and the right panel shows the class and the position of the drawn rectangle. The specific operation is shown in the GIF animation below.
When everything in the picture is labeled, click Save; a txt file is generated, which we save to the labels/train directory created earlier.
Corresponding to picture 001, a labeled 001.txt file is generated. Opening 001.txt, you can see rows of numbers. The first value in each line is the index of the label class; the following 4 values are the YOLO-format box: the normalized center x, center y, width, and height of the rectangle (not the four vertex coordinates).
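To make the label format concrete, here is a small sketch (with hypothetical values) of how one line of a YOLO-format txt file maps back to pixel coordinates; it assumes the normalized center/size layout described above.

```python
# A minimal sketch (hypothetical values) of reading one YOLO label line.
def yolo_line_to_box(line, img_w, img_h):
    """Convert 'class cx cy w h' (normalized) to (class_id, x, y, w, h) in pixels."""
    parts = line.split()
    cls = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:])
    x = (cx - w / 2) * img_w   # left edge in pixels
    y = (cy - h / 2) * img_h   # top edge in pixels
    return cls, x, y, w * img_w, h * img_h

# e.g. on a 640x480 image, a box centered in the middle covering a quarter of each axis
print(yolo_line_to_box("10 0.5 0.5 0.25 0.25", 640, 480))
# → (10, 240.0, 180.0, 160.0, 120.0)
```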
The generated classes.txt contains the label name for each index. Since the digits in my first picture were in scrambled order, the first class is 10; I labeled them in the order I encountered them instead of making the index match the digit value, which makes labels a bit troublesome to look up later on, but it doesn't matter.
After labeling all 50 pictures this way and saving, our training data set is complete. Of course, the two pictures in the validation set must also be labeled the same way. The next step is to start training.
Training the data
Create a .yaml file
The first step is to create our .yaml file. Through it we set the image and label locations for the training and validation sets, and also define the class names for the labels.
In the yolov5/data folder, we create a numpuzzles.yaml file
- path: the root directory of the dataset
- train: the path of the training set relative to path
- val: the path of the validation set relative to path
- nc: the number of classes; this dataset has 15 digit classes, so nc is 15
- names: the class names
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/numpuzzles # dataset root dir
train: images/train # train images (relative to 'path')
val: images/val # val images (relative to 'path')
test: # test images (optional)
# Classes
nc: 15 # number of classes
names: ['10','3','6','1','9','15','14','4','7','5','2','13','11','12','8'] # class names
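Since nc and names must stay consistent, a quick stdlib-only sanity check can catch mismatches before a long training run. This is just a sketch; it assumes the simple one-line "key: value" layout shown above rather than using a full YAML parser.

```python
import ast
import re

# Sanity-check the numpuzzles.yaml fields (sketch; simple one-line layout assumed)
text = """
nc: 15
names: ['10','3','6','1','9','15','14','4','7','5','2','13','11','12','8']
"""
nc = int(re.search(r"^nc:\s*(\d+)", text, re.M).group(1))
names = ast.literal_eval(re.search(r"^names:\s*(\[.*\])", text, re.M).group(1))
assert nc == len(names), "nc must equal the number of class names"
print(nc, names[0])
# → 15 10
```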
Modify train.py to start training
Open the yolov5 folder with VS Code, find the train.py file, and modify several parameters in parse_opt: one is the initial yolov5 model (weights), one is the numpuzzles.yaml file just created (data), and the other is the number of training epochs.
As the picture above shows, I use the yolov5.pt model for weights, data is set to the 'data/numpuzzles.yaml' we just created, and the number of training epochs is changed to 500. Here I modified the file directly and ran it in the debugger; of course, you can also train by passing the parameters on the command line.
What follows is a long wait for training. I only have 50 pictures here, and I first trained for 100 epochs, but the inference results were very poor, so I changed it to 500 epochs. The main issue is that my machine has no graphics card, so training ran on the CPU and took 6 hours to finish 500 epochs.
After training completes, we can see the results under the runs/train folder. There are two trained models, best.pt and last.pt, in the weights directory; copy best.pt to the yolov5 root directory.
Inference test
Open the detect.py file with VS Code and change the weights model in parse_opt to best.pt.
Then run the inference directly; you can see the results in the runs/detect folder.
As you can see, after 500 epochs of training the effect is quite good. The next step is inference with OpenCV DNN.
Inference with OpenCV
01
Export the onnx model
As mentioned at the beginning of the article, the model must be converted to ONNX, which is why we installed the onnx third-party library. The conversion uses export.py: open export.py with VS Code and, in parse_opt, change the weights model to best.pt and add onnx to include,
or run directly
python export.py --weights runs/train/exp3/weights/best.pt --include onnx
After conversion, the best.onnx model file is generated. If you want to inspect an onnx model file, Netron is recommended; the effect is as follows:
You can see that the input, output and type of the model file are displayed on it.
02
C++ OpenCV inference
Define the model file path and the class array. The order of classNames must be consistent with the order in the earlier numpuzzles.yaml. Here is the code:
#pragma once
#include<iostream>
#include<opencv2/opencv.hpp>
#include<opencv2/dnn/dnn.hpp>
using namespace cv;
using namespace std;
dnn::Net net;
void deal_pred(Mat& img, Mat& preds, vector<Rect>& boxes, vector<float>& confidences, vector<int>& classIds, vector<int>& indexs)
{
float w_ratio = img.cols / 640.0;
float h_ratio = img.rows / 640.0;
cout << "size1:" << preds.size[1] << endl;
cout << "size2:" << preds.size[2] << endl;
Mat data(preds.size[1], preds.size[2], CV_32F, preds.ptr<float>());
for (int i = 0; i < data.rows; i++)
{
float conf = data.at<float>(i, 4);
if (conf < 0.45)
{
continue;
}
// second parameter is 5 + number of classes; the number puzzle has 15 classes, so 5 + 15 = 20
Mat clsP = data.row(i).colRange(5, 20);
Point IndexId;
double score;
minMaxLoc(clsP, 0, &score, 0, &IndexId);
if (score > 0.25)
{
float x = data.at<float>(i, 0);
float y = data.at<float>(i, 1);
float w = data.at<float>(i, 2);
float h = data.at<float>(i, 3);
int nx = int((x - w / 2.0) * w_ratio);
int ny = int((y - h / 2.0) * h_ratio);
int nw = int(w * w_ratio);
int nh = int(h * h_ratio);
Rect box;
box.x = nx;
box.y = ny;
box.width = nw;
box.height = nh;
boxes.push_back(box);
classIds.push_back(IndexId.x);
confidences.push_back(score);
}
}
dnn::NMSBoxes(boxes, confidences, 0.25, 0.45, indexs);
}
int main(int argc, char** argv) {
// path of the onnx model file
string onnxfile = "D:/Business/DemoTEST/pyTorch/yolov5/best.onnx";
// test image files under D:/Business/DemoTEST/pyTorch/yolov5/data/images
string testfile = "D:/Business/DemoTEST/pyTorch/yolov5/data/images/h001.jpeg";
//string testfile = "D:/Business/DemoTEST/pyTorch/yolov5/data/images/bus.jpg";
// class names, in the same order as numpuzzles.yaml
string classNames[] = { "10","3","6","1","9","15","14","4","7","5","2","13","11","12","8" };
net = dnn::readNetFromONNX(onnxfile);
if (net.empty()) {
cout << "加载Onnx文件失败!" << endl;
return -1;
}
net.setPreferableBackend(dnn::DNN_BACKEND_OPENCV);
net.setPreferableTarget(dnn::DNN_TARGET_CPU);
// read the image
Mat src = imread(testfile);
Mat inputBlob = dnn::blobFromImage(src, 1.0 / 255, Size(640, 640), Scalar(), true, false);
// set the network input
net.setInput(inputBlob);
// forward pass to get predictions
Mat output = net.forward();
vector<Rect> boxes;
vector<float> confidences;
vector<int> classIds;
vector<int> indexs;
deal_pred(src, output, boxes, confidences, classIds, indexs);
for (int i = 0; i < indexs.size(); i++)
{
rectangle(src, boxes[indexs[i]], Scalar(0, 0, 255), 2);
rectangle(src, Point(boxes[indexs[i]].tl().x, boxes[indexs[i]].tl().y - 20),
Point(boxes[indexs[i]].br().x, boxes[indexs[i]].tl().y), Scalar(200, 200, 200), -1);
putText(src, classNames[classIds[indexs[i]]], Point(boxes[indexs[i]].tl().x + 5, boxes[indexs[i]].tl().y - 10), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 0));
ostringstream conf;
conf << confidences[indexs[i]];
putText(src, conf.str(), Point(boxes[indexs[i]].tl().x + 60, boxes[indexs[i]].tl().y - 10), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 0));
}
imshow("img", src);
waitKey();
return 0;
}
For class recognition, the Mat region sliced here uses 20 as the second parameter of colRange. It is set to 5 plus the number of classes: digital Huarong Road has 15 classes, and 15 + 5 = 20, so the parameter here is 20.
In blobFromImage, because we pass the color image in directly and yolov5 expects RGB by default while OpenCV's default channel order is BGR, the swapRB parameter is set to true to convert it.
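The decoding done in deal_pred can be sanity-checked outside C++ with a small sketch. The numbers below are hypothetical; the sketch assumes the 20-column row layout (cx, cy, w, h, objectness, 15 class scores) and the 640x640 network input described above.

```python
# Decode one YOLOv5 output row: [cx, cy, w, h, objectness, 15 class scores].
def decode_row(row, w_ratio, h_ratio, conf_thr=0.45, score_thr=0.25):
    if row[4] < conf_thr:            # objectness gate, as in deal_pred
        return None
    scores = row[5:20]               # 5 + 15 classes = 20 columns
    class_id = max(range(len(scores)), key=lambda i: scores[i])
    if scores[class_id] <= score_thr:
        return None
    cx, cy, w, h = row[0:4]
    box = (int((cx - w / 2) * w_ratio), int((cy - h / 2) * h_ratio),
           int(w * w_ratio), int(h * h_ratio))
    return class_id, scores[class_id], box

# Hypothetical row; ratios assume a 1280x960 source image and 640x640 input
row = [320.0, 320.0, 64.0, 64.0, 0.9] + [0.01] * 15
row[5 + 3] = 0.8                     # pretend class index 3 scored highest
print(decode_row(row, 1280 / 640.0, 960 / 640.0))
# → (3, 0.8, (576, 432, 128, 96))
```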
OpenCV inference effect
That is the result of C++ OpenCV inference. With this, a demo of training your own data with Yolov5 plus OpenCV DNN inference is complete.
over
Highlights from past articles
Installation of target detection yolov5
Getting started with pyTorch (5) - training your own data set