OpenCV4 Study Notes (64) — Using the dnn Module to Call the ResNet-based SSD Face Detection Models

This note covers how to use OpenCV's dnn module to call the SSD face detection models built on a residual (ResNet) backbone. OpenCV ships two versions of this face detector, one trained with TensorFlow and one with Caffe. The TensorFlow model is quantized to 8-bit integers (uint8), so it is smaller and faster but slightly less accurate; the Caffe model stores its weights as FP16 floating point, which gives higher detection accuracy at a somewhat lower speed.

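The snippets below are fragments of a larger program. They assume the usual OpenCV headers and namespace directives, roughly like the following setup (my own sketch, not copied from the original project):

	#include <opencv2/opencv.hpp>
	#include <opencv2/dnn.hpp>
	#include <iostream>

	using namespace std;
	using namespace cv;
	using namespace cv::dnn;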
1. First, calling the TensorFlow face detection model opencv_face_detector_uint8.pb.
(1) Load the model and set the computation backend and target device.

	// Paths to the frozen TensorFlow graph (.pb) and its text graph description (.pbtxt)
	string model_path = "D:\\opencv_c++\\opencv_tutorial\\data\\models\\face_detector\\opencv_face_detector_uint8.pb";
	string config_path = "D:\\opencv_c++\\opencv_tutorial\\data\\models\\face_detector\\opencv_face_detector.pbtxt";
	Net face_detector = readNetFromTensorflow(model_path, config_path);
	// Prefer the OpenVINO Inference Engine backend and run inference on the CPU
	face_detector.setPreferableBackend(DNN_BACKEND_INFERENCE_ENGINE);
	face_detector.setPreferableTarget(DNN_TARGET_CPU);
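Note that DNN_BACKEND_INFERENCE_ENGINE only takes effect when OpenCV was built with Intel's OpenVINO (Inference Engine) support. If your build does not have it, a minimal alternative (my own addition, not from the original code) is to request OpenCV's built-in backend explicitly:

	// Use OpenCV's own dnn backend on the CPU instead of OpenVINO
	face_detector.setPreferableBackend(DNN_BACKEND_OPENCV);
	face_detector.setPreferableTarget(DNN_TARGET_CPU);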

(2) Open the camera.

	VideoCapture capture;
	capture.open(0);
	if (!capture.isOpened()) 
	{
		cout << "can't open the camera" << endl;
		exit(-1);
	}
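If the default capture resolution is larger than needed, you can optionally ask the camera for a smaller frame size before entering the loop. This is a small addition of my own, not part of the original code, and the capture backend may ignore or round the request:

	// Request a 640x480 capture size (the backend may pick the nearest supported size)
	capture.set(CAP_PROP_FRAME_WIDTH, 640);
	capture.set(CAP_PROP_FRAME_HEIGHT, 480);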

(3) Read frames in a loop, convert each frame into a 4-D blob and feed it to the TensorFlow model, run a forward pass to obtain the output blob prob, then decode prob into the prediction matrix detection.

	Mat frame;
	while (capture.read(frame))
	{
		int64 start = getTickCount();
		flip(frame, frame, 1);
		// Convert the frame to a 4-D blob: resize to 300x300 and subtract the per-channel mean
		Mat inputBlob = blobFromImage(frame, 1.0, Size(300, 300), Scalar(104.0, 177.0, 123.0), false);

		face_detector.setInput(inputBlob);
		Mat prob = face_detector.forward();

		// Reshape the [1, 1, N, 7] output blob into an N x 7 matrix, one detection per row
		Mat detection(prob.size[2], prob.size[3], CV_32F, prob.ptr<float>());
		float confidence_thresh = 0.5;
		for (int row = 0; row < detection.rows; row++)
		{
			float confidence = detection.at<float>(row, 2);
			if (confidence > confidence_thresh)
			{
				int classID = detection.at<float>(row, 1);
				int imageid = detection.at<float>(row, 0);
				// Columns 3-6 hold normalized coordinates, so scale them by the frame size
				int top_left_x = detection.at<float>(row, 3) * frame.cols;
				int top_left_y = detection.at<float>(row, 4) * frame.rows;
				int bottom_right_x = detection.at<float>(row, 5) * frame.cols;
				int bottom_right_y = detection.at<float>(row, 6) * frame.rows;
				int width = bottom_right_x - top_left_x;
				int height = bottom_right_y - top_left_y;
				Rect box(top_left_x, top_left_y, width, height);
				rectangle(frame, box, Scalar(0, 255, 0), 1, 8, 0);
				cout << classID << "," << imageid << "," << confidence << endl;
			}
		}
		int64 end = getTickCount();
		double run_time = (end - start) / getTickFrequency();
		float FPS = 1 / run_time;
		putText(frame, format("FPS: %0.2f", FPS), Point(20, 20), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 255), 1, 8);
		imshow("frame", frame);
		char ch = waitKey(1);
		if (ch == 27)
		{
			break;
		}
	}
	capture.release();

Each row of the decoded detection matrix is one detected object, and a single row contains seven values:
The first column is the index of the input image;
the second column is the class ID of the detected object;
the third column is the confidence of the predicted class;
the fourth through seventh columns are the top-left x, y and bottom-right x, y coordinates of the bounding box. Note that these coordinates are normalized ratios of the image width and height, so they must be multiplied by the frame's width and height before use.
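For example, the class ID and confidence can be drawn next to each box inside the detection loop. The fragment below is an illustrative addition of my own (it reuses the classID, confidence, top_left_x and top_left_y variables from the loop above), not part of the original code:

	// Hypothetical addition: label each detected face with its class ID and confidence
	string label = format("ID: %d  conf: %.2f", classID, confidence);
	putText(frame, label, Point(top_left_x, top_left_y - 5), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 255, 0), 1, 8);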

With the code above, real-time face detection already works. I won't post a demo here though, hahaha; it's the middle of summer and I dress rather casually at home...

2. Next, calling the Caffe face detection model res10_300x300_ssd_iter_140000_fp16.caffemodel.
The Caffe model handles its input and output in much the same way as the TensorFlow model above, and the decoding of the prediction matrix is nearly identical, so the full demo code is given directly below (a sketch that factors the shared decoding step into one helper follows after the code).

	string caffe_model_path = "D:\\opencv_c++\\opencv_tutorial\\data\\models\\face_detector\\res10_300x300_ssd_iter_140000_fp16.caffemodel";
	string caffe_config_path = "D:\\opencv_c++\\opencv_tutorial\\data\\models\\face_detector\\deploy.prototxt";
	Net caffe_face_detector = readNetFromCaffe(caffe_config_path, caffe_model_path);
	caffe_face_detector.setPreferableBackend(DNN_BACKEND_INFERENCE_ENGINE);
	caffe_face_detector.setPreferableTarget(DNN_TARGET_CPU);

	VideoCapture capture;
	capture.open(0, CAP_DSHOW);
	if (!capture.isOpened())
	{
		cout << "can't open camera" << endl;
		exit(-1);
	}
	Mat frame;
	while (capture.read(frame))
	{
		double start = getTickCount();
		flip(frame, frame, 1);
		Mat inputBlob = blobFromImage(frame, 1, Size(300, 300), Scalar(104, 117, 123), true, false);

		caffe_face_detector.setInput(inputBlob);
		Mat prob = caffe_face_detector.forward();

		Mat detection(prob.size[2], prob.size[3], CV_32F, prob.ptr<float>());
		float confidence_thresh = 0.5;
		for (int row = 0; row < detection.rows; row++)
		{
			float confidence = detection.at<float>(row, 2);
			if (confidence > confidence_thresh)
			{
				int classID = detection.at<float>(row, 1);
				int top_left_x = detection.at<float>(row, 3) * frame.cols;
				int top_left_y = detection.at<float>(row, 4) * frame.rows;
				int bottom_right_x = detection.at<float>(row, 5) * frame.cols;
				int bottom_right_y = detection.at<float>(row, 6) * frame.rows;
				int width = bottom_right_x - top_left_x;
				int height = bottom_right_y - top_left_y;
				Rect box(top_left_x, top_left_y, width, height);
				rectangle(frame, box, Scalar(0, 255, 0), 1, 8);
			}
		}
		double end = getTickCount();
		double run_time = (end - start) / getTickFrequency();
		double fps = 1 / run_time;
		cv::putText(frame, format("FPS:  %0.2f", fps), Point(20, 20), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 255), 1, 8);
		cv::imshow("frame", frame);
		char ch = waitKey(1);
		if (ch == 27)
		{
			break;
		}
	}
	capture.release();

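Since the two decoding loops are virtually identical, the duplication can be removed by moving the decoding and drawing into a small helper. The function below, decodeAndDraw, is a sketch of my own (the name and signature are made up, not from the original post), assuming the [1, 1, N, 7] output layout described above:

	// Hypothetical helper: decode a [1, 1, N, 7] SSD output blob and draw the detected boxes
	void decodeAndDraw(Mat prob, Mat& frame, float confidence_thresh = 0.5f)
	{
		// Reshape the 4-D output blob into an N x 7 matrix, one detection per row
		Mat detection(prob.size[2], prob.size[3], CV_32F, prob.ptr<float>());
		for (int row = 0; row < detection.rows; row++)
		{
			float confidence = detection.at<float>(row, 2);
			if (confidence < confidence_thresh)
				continue;
			// Scale the normalized coordinates back to pixel positions
			int x1 = static_cast<int>(detection.at<float>(row, 3) * frame.cols);
			int y1 = static_cast<int>(detection.at<float>(row, 4) * frame.rows);
			int x2 = static_cast<int>(detection.at<float>(row, 5) * frame.cols);
			int y2 = static_cast<int>(detection.at<float>(row, 6) * frame.rows);
			Rect box(Point(x1, y1), Point(x2, y2));
			// Clip the box so it never falls outside the frame
			box &= Rect(0, 0, frame.cols, frame.rows);
			rectangle(frame, box, Scalar(0, 255, 0), 1, 8);
		}
	}

With such a helper, each loop body would only need something like `net.setInput(inputBlob); decodeAndDraw(net.forward(), frame);`.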
The two parts above are how to call OpenCV's bundled face detection models through the dnn module. The main thing to get right is decoding the prediction matrix; the TensorFlow and Caffe models produce outputs with essentially the same structure, and once you understand that, real-time face detection with either model in OpenCV is no longer a problem. The detection quality is also quite good: faces near or far from the camera, partially occluded, or seen in profile are all detected and marked with a rectangle. You could say it grinds OpenCV's old face detection approach, the cascade classifier, into the dust (;´д`)ゞ

That's it for today's note. Thanks for reading!

PS: My notes are a bit of a mixed bag; they contain both my own takeaways and material excerpted while looking things up online. If anything looks familiar, it is purely me paying tribute to and learning from those who came before. If you feel that my notes infringe on your intellectual property, please contact me and I will remove the content in question. Thank you!


Reposted from blog.csdn.net/weixin_45224869/article/details/106007730