This is an original article; if you reproduce it, please credit the original source:
https://blog.csdn.net/qq21497936/article/details/109201809
Dear readers: knowledge is infinite but human effort is finite. Either change the requirements, find a professional, or study Red Fatty's (Red Imitation) blog posts yourself.
Development technology collection (including practical Qt, Raspberry Pi, 3D, OpenCV, OpenGL, ffmpeg, OSG, microcontrollers, hardware/software integration, etc.), continuously updated... (click the portal)
OpenCV development column (click the portal)
Previous: " OpenCV Development Notes (72): Red Fat Man takes you to use opencv+dnn+tensorFlow to identify objects in 8 minutes "
Next: Continued to add...
Preface
Cascade classifiers do not perform particularly well; their accuracy is low compared with deep learning. The previous chapter used a tensorflow model via the dnn module; this chapter uses the yolov3 model to recognize specific classes.
Demo
320x320, confidence threshold 0.6
608x608, confidence threshold 0.6 (608 is the size specified in the .cfg)
yolov3 model download
- coco.names: the model's class names.
  https://github.com/pjreddie/darknet/blob/master/data/coco.names
- yolov3.weights: the weight file.
  https://pjreddie.com/media/files/yolov3.weights
- yolov3.cfg: the configuration file.
  https://github.com/pjreddie/darknet/blob/master/cfg/yolov3.cfg
If the links above cannot be downloaded (GitHub can be very slow), alternative download addresses are provided:
CSDN: https://download.csdn.net/download/qq21497936/12995972
QQ group: 1047134658 (click "Files" and search for "yolov3"; the group is updated in sync with the blog posts)
Basic process of OpenCV deep-learning recognition
Supported models
OpenCV 3.4.x supports the following deep-learning models:
- Caffe: .caffemodel
  Official website: http://caffe.berkeleyvision.org
- TensorFlow: .pb
  Official website: https://www.tensorflow.org
- Torch: .t7 | .net
  Official website: http://torch.ch
- darknet: .weights
  Official website: https://pjreddie.com/darknet
- DLDT: .bin
  Official website: https://software.intel.com/openvino-toolkit
Operation steps: yolov3
Models produced by different deep-learning frameworks differ somewhat in how they are operated and in their output data. The steps below cover using a darknet-trained yolov3 model with OpenCV.
Step 1: Read the classification file
Each model file has a corresponding classification file. The classification file lists one class per line, and the line index (counting from 0) is the class id that recognition finally returns.
std::string classesFile = "E:/qtProject/openCVDemo/dnnData/" \
"yolov3/coco.names";
// Read the class names into a cache
std::ifstream ifs(classesFile);
std::vector<std::string> classes;
std::string classLine;
while(std::getline(ifs, classLine))
{
classes.push_back(classLine);
}
Step 2: Load the model and configuration files, and build the neural network.
Depending on the model, use the corresponding cv::dnn::readNetFromXXX function to read it (see the supported models listed above for what the OpenCV 3.4.x dnn module can load).
The yolov3 model is as follows:
std::string modelWeights = "E:/qtProject/openCVDemo/dnnData/" \
"yolov3/yolov3.weights";
std::string modelCfg = "E:/qtProject/openCVDemo/dnnData/" \
"yolov3/yolov3.cfg";
// Load the yolov3 model
cv::dnn::Net net = cv::dnn::readNetFromDarknet(modelCfg, modelWeights);
if(net.empty())
{
qDebug() << __FILE__ << __LINE__ << "net is empty!!!";
return;
}
Step 3: Feed the picture to be predicted into the neural network
To recognize a picture, it must be input into the network. With the yolov3 model, pay special attention: normalize first, then resize to the specified input size, as follows:
// Read the picture to recognize
mat = cv::imread("E:/testFile/15.jpg");
if(!mat.data)
{
qDebug() << __FILE__ << __LINE__ << "Failed to read image!!!";
return;
}
// cv::dnn::blobFromImage(mat, blob);
// These parameters must be set, otherwise inference goes off the rails
cv::dnn::blobFromImage(mat,
blob,
1.0f/255,
cv::Size(320, 320),
cv::Scalar(0, 0, 0),
true,
false);
net.setInput(blob);
Increasing the input width and height can improve detection accuracy; it is best to match the .cfg file. This demo uses 320x320, while the actual .cfg specifies 608x608, which testing showed gives the best recognition.
Step 4: Run the prediction and obtain the recognition result
After the input is set, recognition is performed as a forward pass (classification prediction), which produces the result. The yolov3 model has 3 output layers, so first obtain those 3 layers and then pass them explicitly when predicting; otherwise the prediction goes wrong.
// Get the output layers
std::vector<cv::String> layerNames = net.getLayerNames();
std::vector<cv::String> outPutNames;
std::vector<int> outLayers = net.getUnconnectedOutLayers();
for(int index = 0; index < outLayers.size(); index++)
{
outPutNames.push_back(layerNames[outLayers[index] - 1]);
qDebug() << __FILE__ << __LINE__
<< QString(layerNames[outLayers[index] - 1].c_str());
}
// Inference: the names of the output layers to predict can be passed in
std::vector<cv::Mat> probs;
net.forward(probs, outPutNames);
The prediction results are stored in probs of type std::vector<cv::Mat>; each element is a cv::Mat in which every row represents one detected candidate. For the 80-class COCO model each row has 85 columns: columns 0-3 are the normalized box (centerX, centerY, width, height), column 4 is the objectness score, and columns 5 onward are the per-class scores.
(Note: see Step 5 for how these values are used.)
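Each row of a yolov3 output Mat packs the box, the objectness score, and the per-class scores into consecutive floats. A plain-C++ sketch (hypothetical helper, no OpenCV dependency) that decodes one row the same way the Step 5 code does with cv::minMaxLoc:

```cpp
#include <algorithm>
#include <vector>

// Layout of one yolov3 output row (85 floats for the 80-class COCO model):
//   [0] centerX  [1] centerY  [2] width  [3] height  (all normalized 0..1)
//   [4] objectness score
//   [5..] per-class scores
struct Detection { int classId; float score; float cx, cy, w, h; };

// Hypothetical helper: picks the best class for one row, mirroring the
// colRange(5, cols) + minMaxLoc logic in Step 5.
static Detection decodeRow(const std::vector<float> &row)
{
    Detection d{};
    d.cx = row[0]; d.cy = row[1]; d.w = row[2]; d.h = row[3];
    // Best class score among columns 5..N-1.
    auto best = std::max_element(row.begin() + 5, row.end());
    d.classId = static_cast<int>(best - (row.begin() + 5));
    d.score = *best;
    return d;
}
```

Multiplying cx/cy/w/h by the image width and height recovers pixel coordinates, as the Step 5 code does.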
Step 5: Filter by confidence, then draw the class labels and boxes onto the output mat
The post-processing steps differ between models. For yolov3 they are as follows:
// Confidence threshold: box out with a rect the detections whose score exceeds it
for(int index = 0; index < probs.size(); index++)
{
for (int row = 0; row < probs[index].rows; row++)
{
// Take the highest-scoring class among all candidates in this row
cv::Mat scores = probs[index].row(row).colRange(5, probs[index].cols);
cv::Point classIdPoint;
double confidence;
// Get the value and location of the maximum score
cv::minMaxLoc(scores, 0, &confidence, 0, &classIdPoint);
if(confidence > 0.6)
{
qDebug() << __FILE__ << __LINE__ << confidence << classIdPoint.x;
int centerX = (int)(probs.at(index).at<float>(row, 0) * mat.cols);
int centerY = (int)(probs.at(index).at<float>(row, 1) * mat.rows);
int width = (int)(probs.at(index).at<float>(row, 2) * mat.cols);
int height = (int)(probs.at(index).at<float>(row, 3) * mat.rows);
int left = centerX - width / 2;
int top = centerY - height / 2;
cv::Rect objectRect(left, top, width, height);
cv::rectangle(mat, objectRect, cv::Scalar(255, 0, 0), 2);
cv::String label = cv::format("%s:%.4f",
classes[classIdPoint.x].data(),
confidence);
cv::putText(mat,
label,
cv::Point(left, top - 10),
cv::FONT_HERSHEY_SIMPLEX,
0.4,
cv::Scalar(0, 0, 255));
qDebug() << __FILE__ << __LINE__
<< centerX << centerY << width << height;
}
}
}
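The loop above draws every detection over the threshold, so the 3 output layers often produce overlapping duplicate boxes for the same object. OpenCV provides cv::dnn::NMSBoxes for this; below is a minimal OpenCV-free sketch of the same greedy non-maximum-suppression idea (names are illustrative, not the OpenCV API):

```cpp
#include <algorithm>
#include <vector>

struct Box { float left, top, width, height, score; };

// Intersection-over-union of two axis-aligned boxes.
static float iou(const Box &a, const Box &b)
{
    float x1 = std::max(a.left, b.left);
    float y1 = std::max(a.top, b.top);
    float x2 = std::min(a.left + a.width, b.left + b.width);
    float y2 = std::min(a.top + a.height, b.top + b.height);
    float inter = std::max(0.0f, x2 - x1) * std::max(0.0f, y2 - y1);
    float uni = a.width * a.height + b.width * b.height - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy NMS: keep the highest-scoring box, drop any box that overlaps
// it by more than iouThreshold, then repeat with the next best.
static std::vector<Box> nms(std::vector<Box> boxes, float iouThreshold)
{
    std::sort(boxes.begin(), boxes.end(),
              [](const Box &a, const Box &b) { return a.score > b.score; });
    std::vector<Box> kept;
    for (const Box &candidate : boxes)
    {
        bool suppressed = false;
        for (const Box &k : kept)
            if (iou(candidate, k) > iouThreshold) { suppressed = true; break; }
        if (!suppressed)
            kept.push_back(candidate);
    }
    return kept;
}
```

In the real demo, collecting the rects and confidences into vectors and calling cv::dnn::NMSBoxes before drawing would achieve the same effect.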
Function prototype
Read yolov3 model and configuration file function prototype
Net readNetFromDarknet(const String &cfgFile,
const String &darknetModel = String());
Reads a network stored in Darknet model files.
- Parameter 1: path of the .cfg file containing the text description of the network architecture;
- Parameter 2: path of the .weights file containing the learned network weights;
Function prototype for reading the picture (to be recognized) into a blob
void blobFromImage(InputArray image,
OutputArray blob,
double scalefactor=1.0,
const Size& size = Size(),
const Scalar& mean = Scalar(),
bool swapRB=false,
bool crop=false,
int ddepth=CV_32F);
Creates a 4-dimensional blob from an image, optionally resizing the image and cropping it from the center.
- Parameter 1: input image (1, 3, or 4 channels);
- Parameter 2: output blob;
- Parameter 3: multiplier (scale factor) applied to the image values;
- Parameter 4: spatial size of the output blob;
- Parameter 5: scalar of mean values subtracted from the channels; if the image is BGR and swapRB is true, the values are taken in (mean-R, mean-G, mean-B) order;
- Parameter 6: swapRB flag, swapping the first and last channels; necessary for three-channel BGR images;
- Parameter 7: crop flag, indicating whether to crop the image after resizing;
- Parameter 8: depth of the output blob, CV_32F or CV_8U;
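The mean/scalefactor/swapRB parameters interact in a fixed order: subtract the mean, then multiply by the scale factor, with an optional first/last channel swap. A simplified sketch (hypothetical function; the real blobFromImage also resizes/crops and packs the result in NCHW order) of what happens to a single BGR pixel:

```cpp
#include <array>

// Sketch of blobFromImage's per-pixel arithmetic (not OpenCV API):
// optionally swap B and R, subtract the mean, then apply the scale factor.
static std::array<float, 3> normalizePixel(unsigned char b, unsigned char g, unsigned char r,
                                           double scalefactor,
                                           const std::array<double, 3> &mean,
                                           bool swapRB)
{
    // swapRB turns BGR input into RGB channel order.
    double c0 = swapRB ? r : b;
    double c2 = swapRB ? b : r;
    // Mean subtraction happens before scaling.
    return { static_cast<float>((c0 - mean[0]) * scalefactor),
             static_cast<float>((g  - mean[1]) * scalefactor),
             static_cast<float>((c2 - mean[2]) * scalefactor) };
}
```

With scalefactor = 1/255 and zero mean, as in this chapter's demo, pixel values end up in the 0..1 range that yolov3 expects.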
Set the neural network input function prototype
void cv::dnn::Net::setInput(InputArray blob,
const String& name = "",
double scalefactor = 1.0,
const Scalar& mean = Scalar());
Sets the new input value for the network.
- Parameter 1: a new blob; should have CV_32F or CV_8U depth;
- Parameter 2: name of the input layer;
- Parameter 3: optional normalization scale;
- Parameter 4: optional mean-subtraction value;
Returns the names of all layers, in index order:
std::vector<String> getLayerNames() const;
Returns the indices of the layers with unconnected outputs:
std::vector<int> getUnconnectedOutLayers() const;
Deep detection and recognition (forward prediction) function prototype
Mat cv::dnn::Net::forward(const String& outputName = String());
Runs a forward pass and returns the blob of the specified output layer; by default this is the last layer. All layer names can be listed with cv::dnn::Net::getLayerNames(). To fetch several output layers at once, as this demo does, use the overload void forward(OutputArrayOfArrays outputBlobs, const std::vector<String>& outBlobNames);
- Parameter 1: outputName, the name of the output layer whose blob is returned
Demo
void OpenCVManager::testYoloV3()
{
std::string classesFile = "E:/qtProject/openCVDemo/dnnData/" \
"yolov3/coco.names";
std::string modelWeights = "E:/qtProject/openCVDemo/dnnData/" \
"yolov3/yolov3.weights";
std::string modelCfg = "E:/qtProject/openCVDemo/dnnData/" \
"yolov3/yolov3.cfg";
// Read the class names into a cache
std::ifstream ifs(classesFile);
std::vector<std::string> classes;
std::string classLine;
while(std::getline(ifs, classLine))
{
classes.push_back(classLine);
}
// Load the yolov3 model
cv::dnn::Net net = cv::dnn::readNetFromDarknet(modelCfg, modelWeights);
if(net.empty())
{
qDebug() << __FILE__ << __LINE__ << "net is empty!!!";
return;
}
cv::Mat mat;
cv::Mat blob;
// Get the names and indices of all layers
std::vector<cv::String> layerNames = net.getLayerNames();
int lastLayerId = net.getLayerId(layerNames[layerNames.size() - 1]);
cv::Ptr<cv::dnn::Layer> lastLayer = net.getLayer(cv::dnn::DictValue(lastLayerId));
qDebug() << __FILE__ << __LINE__
<< QString(lastLayer->type.c_str())
<< QString(lastLayer->getDefaultName().c_str())
<< QString(layerNames[layerNames.size()-1].c_str());
// Get the output layers
std::vector<cv::String> outPutNames;
std::vector<int> outLayers = net.getUnconnectedOutLayers();
for(int index = 0; index < outLayers.size(); index++)
{
outPutNames.push_back(layerNames[outLayers[index] - 1]);
qDebug() << __FILE__ << __LINE__
<< QString(layerNames[outLayers[index] - 1].c_str());
}
while(true)
{
// Read the picture to recognize
mat = cv::imread("E:/testFile/15.jpg");
if(!mat.data)
{
qDebug() << __FILE__ << __LINE__ << "Failed to read image!!!";
return;
}
// cv::dnn::blobFromImage(mat, blob);
// These parameters must be set, otherwise inference goes off the rails
cv::dnn::blobFromImage(mat,
blob,
1.0f/255,
cv::Size(320, 320),
cv::Scalar(0, 0, 0),
true,
false);
net.setInput(blob);
// Inference: the names of the output layers to predict can be passed in
std::vector<cv::Mat> probs;
net.forward(probs, outPutNames);
// Show the time spent on recognition
std::vector<double> layersTimes;
double freq = cv::getTickFrequency() / 1000;
double t = net.getPerfProfile(layersTimes) / freq;
std::string label = cv::format("Inference time: %.2f ms", t);
cv::putText(mat,
label,
cv::Point(0, 15),
cv::FONT_HERSHEY_SIMPLEX,
0.5,
cv::Scalar(255, 0, 0));
// Confidence threshold: box out with a rect the detections whose score exceeds it
for(int index = 0; index < probs.size(); index++)
{
for (int row = 0; row < probs[index].rows; row++)
{
// Take the highest-scoring class among all candidates in this row
cv::Mat scores = probs[index].row(row).colRange(5, probs[index].cols);
cv::Point classIdPoint;
double confidence;
// Get the value and location of the maximum score
cv::minMaxLoc(scores, 0, &confidence, 0, &classIdPoint);
if(confidence > 0.6)
{
qDebug() << __FILE__ << __LINE__ << confidence << classIdPoint.x;
int centerX = (int)(probs.at(index).at<float>(row, 0) * mat.cols);
int centerY = (int)(probs.at(index).at<float>(row, 1) * mat.rows);
int width = (int)(probs.at(index).at<float>(row, 2) * mat.cols);
int height = (int)(probs.at(index).at<float>(row, 3) * mat.rows);
int left = centerX - width / 2;
int top = centerY - height / 2;
cv::Rect objectRect(left, top, width, height);
cv::rectangle(mat, objectRect, cv::Scalar(255, 0, 0), 2);
cv::String label = cv::format("%s:%.4f",
classes[classIdPoint.x].data(),
confidence);
cv::putText(mat,
label,
cv::Point(left, top - 10),
cv::FONT_HERSHEY_SIMPLEX,
0.4,
cv::Scalar(0, 0, 255));
qDebug() << __FILE__ << __LINE__
<< centerX << centerY << width << height;
}
}
}
cv::imshow(_windowTitle.toStdString(), mat);
cv::waitKey(0);
}
}
Corresponding project template v1.65.0
openCVDemo_v1.65.0_Basic template_yolov3 classification detection.rar
Pitfalls
Pitfall 1: Error when loading the model
Error
Cause
The model file failed to load.
Solution
Check that the file exists, that the path is correct, and that the model file matches the configuration file.
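A quick pre-flight check catches a bad path before readNetFromDarknet fails with a less obvious error. A plain-stdlib sketch (not OpenCV API; the paths are whatever your project uses):

```cpp
#include <fstream>
#include <string>

// Minimal check: returns true if the file can be opened for reading.
// Call this on the .cfg, .weights, and .names paths before building the net.
static bool fileReadable(const std::string &path)
{
    std::ifstream f(path);
    return f.good();
}
```

Logging which of the three paths fails this check pinpoints the problem immediately.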
Pitfall 2: Error when inputting the blob
Error
Cause
No output-layer names were passed in when predicting; for yolov3 they must be passed (note: a tensorflow model works without them).
Solution
Pass the 3 output-layer names to net.forward(), as shown in Step 4.
Previous: " OpenCV Development Notes (72): Red Fat Man takes you to use opencv+dnn+tensorFlow to identify objects in 8 minutes "
Next: Continued to add...