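Introduction to MTCNN
MTCNN (Multi-task Cascaded Convolutional Networks) trains a cascade of convolutional neural networks with multi-task learning, so that face detection and face alignment are handled together. MTCNN has three stages: in the first stage P-Net generates a large number of candidate face windows, in the second stage R-Net rejects a large number of non-face windows, and in the third stage O-Net produces the final face bounding boxes and the coordinates of five facial landmarks.
The workflow is shown in the figure below.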
Training
This post follows the repository: https://github.com/yeyupiaoling/Pytorch-MTCNN
Data and environment preparation
- Create a virtual environment with Anaconda and install Python 3.6 and PyTorch 1.7.0
- WIDER FACE dataset: WIDER FACE: A Face Detection Benchmark (shuoyang1213.me)
- CNN_FacePoint dataset: http://mmlab.ie.cuhk.edu.hk/archive/CNN_FacePoint.htm
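Training PNet
Following the README, place the datasets in the specified folders; you can then start generating the PNet training data.
Run generate_PNet_data.py and wait; this takes a long time.
Before running train_PNet.py, install tensorboard: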
pip install tensorboard
Insert the following code into train_PNet.py:
from torch.utils.tensorboard import SummaryWriter

# Set up tensorboard
writer = SummaryWriter("logs")
# Count the number of training steps
total_train_step = 0
Insert the following code inside the training loop:
total_train_step += 1
# Log the loss and accuracy every 100 steps
if total_train_step % 100 == 0:
    acc = accuracy(class_out, label)
    writer.add_scalar("train_loss", total_loss, total_train_step)
    writer.add_scalar("train_accuracy", acc, total_train_step)
Start train_PNet.py, then open the PyCharm terminal and enter:
tensorboard --logdir=logs
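Open the printed URL to watch the training curves in real time.
To get a figure out of tensorboard, taking a screenshot works, but it looks out of place in a paper.
So I download the csv file and smooth the curve manually afterwards: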
import matplotlib.pyplot as plt
import pandas as pd


def tensorboard_smoothing(x, smooth=0.85):
    # Work on a copy so the original column is left untouched
    x = x.copy()
    weight = smooth
    for i in range(1, len(x)):
        # Weighted average of the previous smoothed value and the current value
        x[i] = (x[i-1] * weight + x[i]) / (weight + 1)
        weight = (weight + 1) * smooth
    return x


fig, ax1 = plt.subplots(1, 1)
len_mean = pd.read_csv("train_loss.csv")
ax1.plot(len_mean['Step'], tensorboard_smoothing(len_mean['Value'], smooth=0.999), color="#3399FF")
ax1.set_xlabel("steps")
ax1.set_ylabel("loss", color="#000000")
ax1.set_title("training loss")
plt.show()
Result
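To see what the recurrence in tensorboard_smoothing computes, the sketch below re-implements it on a plain list and compares it with a bias-corrected exponential moving average (an EMA started at zero and divided by 1 - smooth**n), which is, as far as I know, the form behind TensorBoard's smoothing slider. The function names and the sample losses are made up for this illustration.

```python
def normalized_ema(values, smooth=0.85):
    """Same recurrence as tensorboard_smoothing, on a plain list."""
    out = list(values)
    weight = smooth
    for i in range(1, len(out)):
        out[i] = (out[i - 1] * weight + out[i]) / (weight + 1)
        weight = (weight + 1) * smooth
    return out


def debiased_ema(values, smooth=0.85):
    """EMA started at zero, divided by the bias-correction term 1 - smooth**n."""
    out = []
    last = 0.0
    for n, value in enumerate(values, start=1):
        last = last * smooth + (1 - smooth) * value
        out.append(last / (1 - smooth ** n))
    return out


losses = [1.0, 0.8, 0.9, 0.5, 0.6, 0.4]  # hypothetical loss values
a = normalized_ema(losses)
b = debiased_ema(losses)
# Largest difference between the two smoothed curves
max_gap = max(abs(p - q) for p, q in zip(a, b))
print(max_gap)
```

Both give the same curve up to floating-point error, so a larger smooth value simply puts more weight on older points.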
Training RNet
Run generate_RNet_data.py and wait; this again takes a long time.
Before running train_RNet.py, insert the same tensorboard code.
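Training ONet
Run generate_ONet_data.py and wait; this time it fails with an error.
The cause turned out to be insufficient GPU memory, so I changed the code:
# Switch to CPU for prediction
device = torch.device("cpu")
Re-run generate_ONet_data.py
Then run train_ONet.py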
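Testing
Training produces three models: PNet, RNet and ONet.
Running the MTCNN model on a single image
Run infer_path
Accuracy on the WIDER FACE validation set
- First run the MTCNN model on the validation set and save the predictions
Required output file format for the WIDER FACE validation set: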
< image name i >
< number of faces in this image = im >
< face i1 >
< face i2 >
...
< face im >
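As an illustration of this format, the sketch below builds the text for one image. It assumes each face line is `x y w h score` (left, top, width, height, confidence), which is the layout written by the test script in this post; the helper name format_wider_result and the sample values are made up.

```python
def format_wider_result(image_name, faces):
    """Return the result text for one image.

    faces: list of (x, y, w, h, score) tuples (assumed layout).
    """
    lines = [image_name, str(len(faces))]
    for x, y, w, h, score in faces:
        # Integer pixel values for the box, three decimals for the score
        lines.append('{:.0f} {:.0f} {:.0f} {:.0f} {:.3f}'.format(x, y, w, h, score))
    return '\n'.join(lines) + '\n'


text = format_wider_result('example_image', [(10.2, 20.7, 30.0, 40.4, 0.98765)])
empty = format_wider_result('example_image', [])
print(text)
```

An image with no detections still gets a file, containing the image name and a face count of 0.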
Write the test script widerface_test.py:
import argparse
import os

import cv2
import numpy as np
import torch

from utils.utils import generate_bbox, py_nms, convert_to_square
from utils.utils import pad, calibrate_box, processed_image

parser = argparse.ArgumentParser()
parser.add_argument('--model_path', type=str, default='infer_models', help='Folder containing the PNet, RNet and ONet model files')
# parser.add_argument('--image_path', type=str, default='dataset/test.jpg', help='Path of the image to predict')
args = parser.parse_args()

device = torch.device("cuda")

# Load the P model
pnet = torch.jit.load(os.path.join(args.model_path, 'PNet.pth'))
pnet.to(device)
softmax_p = torch.nn.Softmax(dim=0)
pnet.eval()

# Load the R model
rnet = torch.jit.load(os.path.join(args.model_path, 'RNet.pth'))
rnet.to(device)
softmax_r = torch.nn.Softmax(dim=-1)
rnet.eval()

# Load the O model
onet = torch.jit.load(os.path.join(args.model_path, 'ONet.pth'))
onet.to(device)
softmax_o = torch.nn.Softmax(dim=-1)
onet.eval()


# Predict with the PNet model
def predict_pnet(infer_data):
    # Convert the image to a tensor and add a batch dimension
    infer_data = torch.tensor(infer_data, dtype=torch.float32, device=device)
    infer_data = torch.unsqueeze(infer_data, dim=0)
    # Run the prediction
    cls_prob, bbox_pred, _ = pnet(infer_data)
    cls_prob = torch.squeeze(cls_prob)
    cls_prob = softmax_p(cls_prob)
    bbox_pred = torch.squeeze(bbox_pred)
    return cls_prob.detach().cpu().numpy(), bbox_pred.detach().cpu().numpy()


# Predict with the RNet model
def predict_rnet(infer_data):
    # Convert the batch of crops to a tensor
    infer_data = torch.tensor(infer_data, dtype=torch.float32, device=device)
    # Run the prediction
    cls_prob, bbox_pred, _ = rnet(infer_data)
    cls_prob = softmax_r(cls_prob)
    return cls_prob.detach().cpu().numpy(), bbox_pred.detach().cpu().numpy()


# Predict with the ONet model
def predict_onet(infer_data):
    # Convert the batch of crops to a tensor
    infer_data = torch.tensor(infer_data, dtype=torch.float32, device=device)
    # Run the prediction
    cls_prob, bbox_pred, landmark_pred = onet(infer_data)
    cls_prob = softmax_o(cls_prob)
    return cls_prob.detach().cpu().numpy(), bbox_pred.detach().cpu().numpy(), landmark_pred.detach().cpu().numpy()


# Get the PNet output
def detect_pnet(im, min_face_size, scale_factor, thresh):
    """Filter candidate boxes with PNet.

    Args:
        im: input image [h, w, 3]
    """
    net_size = 12
    # Ratio of the network input size to the minimum face size
    current_scale = float(net_size) / min_face_size
    im_resized = processed_image(im, current_scale)
    _, current_height, current_width = im_resized.shape
    all_boxes = list()
    # Image pyramid
    while min(current_height, current_width) > net_size:
        # Class probabilities and boxes
        cls_cls_map, reg = predict_pnet(im_resized)
        boxes = generate_bbox(cls_cls_map[1, :, :], reg, current_scale, thresh)
        current_scale *= scale_factor  # shrink the image further for the next pyramid level
        im_resized = processed_image(im, current_scale)
        _, current_height, current_width = im_resized.shape

        if boxes.size == 0:
            continue
        # Non-maximum suppression keeps the boxes with low overlap
        keep = py_nms(boxes[:, :5], 0.5, mode='Union')
        boxes = boxes[keep]
        all_boxes.append(boxes)
    if len(all_boxes) == 0:
        return None
    all_boxes = np.vstack(all_boxes)
    # Apply non-maximum suppression again to the boxes from all pyramid levels
    keep = py_nms(all_boxes[:, 0:5], 0.7, mode='Union')
    all_boxes = all_boxes[keep]
    # Box width and height
    bbw = all_boxes[:, 2] - all_boxes[:, 0] + 1
    bbh = all_boxes[:, 3] - all_boxes[:, 1] + 1
    # Box coordinates in the original image, plus the score
    boxes_c = np.vstack([all_boxes[:, 0] + all_boxes[:, 5] * bbw,
                         all_boxes[:, 1] + all_boxes[:, 6] * bbh,
                         all_boxes[:, 2] + all_boxes[:, 7] * bbw,
                         all_boxes[:, 3] + all_boxes[:, 8] * bbh,
                         all_boxes[:, 4]])
    boxes_c = boxes_c.T
    return boxes_c


# Get the RNet output
def detect_rnet(im, dets, thresh):
    """Select boxes with RNet.

    Args:
        im: input image
        dets: boxes selected by PNet, in absolute coordinates of the original image
    Returns:
        boxes in absolute coordinates
    """
    h, w, c = im.shape
    # Turn each PNet box into the square that contains it, to avoid losing information
    dets = convert_to_square(dets)
    dets[:, 0:4] = np.round(dets[:, 0:4])
    # Adjust boxes that extend beyond the image
    [dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph] = pad(dets, w, h)
    delete_size = np.ones_like(tmpw) * 20
    ones = np.ones_like(tmpw)
    zeros = np.zeros_like(tmpw)
    num_boxes = np.sum(np.where((np.minimum(tmpw, tmph) >= delete_size), ones, zeros))
    cropped_ims = np.zeros((num_boxes, 3, 24, 24), dtype=np.float32)
    for i in range(int(num_boxes)):
        # Crop each PNet box from the original image, padding out-of-image parts with 0
        if tmph[i] < 20 or tmpw[i] < 20:
            continue
        tmp = np.zeros((tmph[i], tmpw[i], 3), dtype=np.uint8)
        try:
            tmp[dy[i]:edy[i] + 1, dx[i]:edx[i] + 1, :] = im[y[i]:ey[i] + 1, x[i]:ex[i] + 1, :]
            img = cv2.resize(tmp, (24, 24), interpolation=cv2.INTER_LINEAR)
            img = img.transpose((2, 0, 1))
            img = (img - 127.5) / 128
            cropped_ims[i, :, :, :] = img
        except Exception:
            continue
    cls_scores, reg = predict_rnet(cropped_ims)
    cls_scores = cls_scores[:, 1]
    keep_inds = np.where(cls_scores > thresh)[0]
    if len(keep_inds) > 0:
        boxes = dets[keep_inds]
        boxes[:, 4] = cls_scores[keep_inds]
        reg = reg[keep_inds]
    else:
        return None

    keep = py_nms(boxes, 0.4, mode='Union')
    boxes = boxes[keep]
    # Calibrate the coordinates of the PNet crops to get RNet's face boxes
    # in absolute coordinates of the original image
    boxes_c = calibrate_box(boxes, reg[keep])
    return boxes_c


# Get the ONet output
def detect_onet(im, dets, thresh):
    """Filter the boxes further with ONet; much the same as RNet, but landmarks are returned as well."""
    h, w, c = im.shape
    dets = convert_to_square(dets)
    dets[:, 0:4] = np.round(dets[:, 0:4])
    [dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph] = pad(dets, w, h)
    num_boxes = dets.shape[0]
    cropped_ims = np.zeros((num_boxes, 3, 48, 48), dtype=np.float32)
    for i in range(num_boxes):
        tmp = np.zeros((tmph[i], tmpw[i], 3), dtype=np.uint8)
        tmp[dy[i]:edy[i] + 1, dx[i]:edx[i] + 1, :] = im[y[i]:ey[i] + 1, x[i]:ex[i] + 1, :]
        img = cv2.resize(tmp, (48, 48), interpolation=cv2.INTER_LINEAR)
        img = img.transpose((2, 0, 1))
        img = (img - 127.5) / 128
        cropped_ims[i, :, :, :] = img
    cls_scores, reg, landmark = predict_onet(cropped_ims)
    cls_scores = cls_scores[:, 1]
    keep_inds = np.where(cls_scores > thresh)[0]
    if len(keep_inds) > 0:
        boxes = dets[keep_inds]
        boxes[:, 4] = cls_scores[keep_inds]
        reg = reg[keep_inds]
        landmark = landmark[keep_inds]
    else:
        return None, None

    # Map the landmarks from box-relative to absolute image coordinates
    w = boxes[:, 2] - boxes[:, 0] + 1
    h = boxes[:, 3] - boxes[:, 1] + 1
    landmark[:, 0::2] = (np.tile(w, (5, 1)) * landmark[:, 0::2].T + np.tile(boxes[:, 0], (5, 1)) - 1).T
    landmark[:, 1::2] = (np.tile(h, (5, 1)) * landmark[:, 1::2].T + np.tile(boxes[:, 1], (5, 1)) - 1).T
    boxes_c = calibrate_box(boxes, reg)

    keep = py_nms(boxes_c, 0.6, mode='Minimum')
    boxes_c = boxes_c[keep]
    landmark = landmark[keep]
    return boxes_c, landmark


# Run the full cascade on one image
def infer_image(image_path):
    im = cv2.imread(image_path)
    # Run the first model
    boxes_c = detect_pnet(im, 20, 0.79, 0.9)
    if boxes_c is None:
        return None, None
    # Run the second model
    boxes_c = detect_rnet(im, boxes_c, 0.6)
    if boxes_c is None:
        return None, None
    # Run the third model
    boxes_c, landmark = detect_onet(im, boxes_c, 0.7)
    if boxes_c is None:
        return None, None
    return boxes_c, landmark


class wildtest():
    def __init__(self, image_dir, result_dir):
        self.image_dir = image_dir
        self.result_dir = result_dir

    def detect(self):
        event_list = os.listdir(self.image_dir)
        for event in event_list:
            event_dir = os.path.join(self.image_dir, event)
            res_dir = os.path.join(self.result_dir, event)
            if not os.path.exists(res_dir):
                os.makedirs(res_dir)
            images_list = os.listdir(event_dir)
            for images in images_list:
                images_path = os.path.join(event_dir, images)
                # img = cv2.imread(images_path)
                print(images_path)
                bboxs, landmarks = infer_image(images_path)
                # bboxs, landmarks = mtcnn_detector.detect_face(img)
                # print(bboxs)
                # images[:-4] drops the 4-character extension, e.g. ".jpg"
                fpath = os.path.join(res_dir, images[:-4] + '.txt')
                if bboxs is None:
                    # No face found: write the image name and a count of 0
                    with open(fpath, 'w') as f:
                        f.write(images[:-4] + '\n')
                        f.write(str(0) + '\n')
                    continue
                if bboxs.shape[0] != 0:
                    # Convert (x1, y1, x2, y2) to (x, y, w, h)
                    bboxs[:, 2] = bboxs[:, 2] - bboxs[:, 0]
                    bboxs[:, 3] = bboxs[:, 3] - bboxs[:, 1]
                    bboxs[:, :4] = np.round(bboxs[:, :4])
                # print(bboxs)
                # save_name = 'r_304.jpg'
                # vis_face(img, bboxs, landmarks, save_name)
                with open(fpath, 'w') as f:
                    f.write(images[:-4] + '\n')
                    f.write(str(bboxs.shape[0]) + '\n')
                    for i in range(bboxs.shape[0]):
                        f.write('{:.0f} {:.0f} {:.0f} {:.0f} {:.3f}\n'.format(bboxs[i, 0], bboxs[i, 1], bboxs[i, 2], bboxs[i, 3], bboxs[i, 4]))


if __name__ == '__main__':
    image_dir = './dataset/WIDER_val/images/'
    result_dir = './anno_store/wider_val/'
    # Use a separate name for the instance so the class is not shadowed
    tester = wildtest(image_dir, result_dir)
    tester.detect()
The predictions are written to the directory ./anno_store/wider_val/
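How long the script takes depends largely on the image pyramid in detect_pnet, since PNet runs once per pyramid level. The sketch below lists the scales that loop would visit for a given image size. It assumes that processed_image resizes to int(height * scale) by int(width * scale); the function name pyramid_scales is made up for this illustration.

```python
def pyramid_scales(height, width, min_face_size=20, scale_factor=0.79, net_size=12):
    """List the scales visited by the pyramid loop (resize rule is an assumption)."""
    scales = []
    scale = float(net_size) / min_face_size
    cur_h, cur_w = int(height * scale), int(width * scale)
    # Stop once the shorter side is no larger than the PNet input size
    while min(cur_h, cur_w) > net_size:
        scales.append(scale)
        scale *= scale_factor
        cur_h, cur_w = int(height * scale), int(width * scale)
    return scales


scales = pyramid_scales(480, 640)
print(len(scales), [round(s, 3) for s in scales])
```

A smaller min_face_size raises the starting scale and therefore adds pyramid levels, which finds smaller faces at the cost of more PNet passes.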
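- Compute the accuracy
The accuracy is computed with code from this project: https://github.com/bubbliiiing/retinaface-pytorch
Create a directory widerface_evaluate, put the ground_truth folder in it, and place the three files setup.py, evaluation and box_overlaps in it as well.
Run setup.py first, then evaluation.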
Easy Val AP: 0.6931683876966619
Medium Val AP: 0.6767176224620579
Hard Val AP: 0.4284666286589425
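These AP values depend on how each predicted box is matched to a ground-truth box. As a rough illustration, the sketch below computes the usual overlap measure, intersection over union (IoU), for boxes in the `x y w h` form written by the test script. It is not the code in box_overlaps, the function name is hypothetical, and it ignores any +1 pixel convention the real evaluation may use.

```python
def iou_xywh(box_a, box_b):
    """IoU of two boxes given as (x, y, w, h)."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    # Width and height of the intersection, clamped at zero for disjoint boxes
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


half_overlap = iou_xywh((0, 0, 10, 10), (5, 0, 10, 10))
print(half_overlap)
```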
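Testing on the FDDB dataset
FDDB dataset: http://vis-www.cs.umass.edu/fddb/
Reference posts: 1. 在windows平台上测试自己的人脸检测算法在FDDB数据集 (Testing your own face detection algorithm on the FDDB dataset on Windows)
2. https://github.com/hualitlc/MTCNN-on-FDDB-Dataset
Note: FDDB requires the predictions in the following output format: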
<image name i>
<number of faces in this image =im>
<face i1>
<face i2>
...
<face im>
- Read the FDDB-fold-**.txt files, run the MTCNN model on the FDDB dataset, and save the detections
import argparse
import os

import cv2
import numpy as np
import torch

from utils.utils import generate_bbox, py_nms, convert_to_square
from utils.utils import pad, calibrate_box, processed_image

parser = argparse.ArgumentParser()
parser.add_argument('--model_path', type=str, default='../infer_models', help='Folder containing the PNet, RNet and ONet model files')
# parser.add_argument('--image_path', type=str, default='dataset/test.jpg', help='Path of the image to predict')
args = parser.parse_args()

device = torch.device("cuda")

# Load the P model
pnet = torch.jit.load(os.path.join(args.model_path, 'PNet.pth'))
pnet.to(device)
softmax_p = torch.nn.Softmax(dim=0)
pnet.eval()

# Load the R model
rnet = torch.jit.load(os.path.join(args.model_path, 'RNet.pth'))
rnet.to(device)
softmax_r = torch.nn.Softmax(dim=-1)
rnet.eval()

# Load the O model
onet = torch.jit.load(os.path.join(args.model_path, 'ONet.pth'))
onet.to(device)
softmax_o = torch.nn.Softmax(dim=-1)
onet.eval()


# Predict with the PNet model
def predict_pnet(infer_data):
    # Convert the image to a tensor and add a batch dimension
    infer_data = torch.tensor(infer_data, dtype=torch.float32, device=device)
    infer_data = torch.unsqueeze(infer_data, dim=0)
    # Run the prediction
    cls_prob, bbox_pred, _ = pnet(infer_data)
    cls_prob = torch.squeeze(cls_prob)
    cls_prob = softmax_p(cls_prob)
    bbox_pred = torch.squeeze(bbox_pred)
    return cls_prob.detach().cpu().numpy(), bbox_pred.detach().cpu().numpy()


# Predict with the RNet model
def predict_rnet(infer_data):
    # Convert the batch of crops to a tensor
    infer_data = torch.tensor(infer_data, dtype=torch.float32, device=device)
    # Run the prediction
    cls_prob, bbox_pred, _ = rnet(infer_data)
    cls_prob = softmax_r(cls_prob)
    return cls_prob.detach().cpu().numpy(), bbox_pred.detach().cpu().numpy()


# Predict with the ONet model
def predict_onet(infer_data):
    # Convert the batch of crops to a tensor
    infer_data = torch.tensor(infer_data, dtype=torch.float32, device=device)
    # Run the prediction
    cls_prob, bbox_pred, landmark_pred = onet(infer_data)
    cls_prob = softmax_o(cls_prob)
    return cls_prob.detach().cpu().numpy(), bbox_pred.detach().cpu().numpy(), landmark_pred.detach().cpu().numpy()


# Get the PNet output
def detect_pnet(im, min_face_size, scale_factor, thresh):
    """Filter candidate boxes with PNet.

    Args:
        im: input image [h, w, 3]
    """
    net_size = 12
    # Ratio of the network input size to the minimum face size
    current_scale = float(net_size) / min_face_size
    im_resized = processed_image(im, current_scale)
    _, current_height, current_width = im_resized.shape
    all_boxes = list()
    # Image pyramid
    while min(current_height, current_width) > net_size:
        # Class probabilities and boxes
        cls_cls_map, reg = predict_pnet(im_resized)
        boxes = generate_bbox(cls_cls_map[1, :, :], reg, current_scale, thresh)
        current_scale *= scale_factor  # shrink the image further for the next pyramid level
        im_resized = processed_image(im, current_scale)
        _, current_height, current_width = im_resized.shape

        if boxes.size == 0:
            continue
        # Non-maximum suppression keeps the boxes with low overlap
        keep = py_nms(boxes[:, :5], 0.5, mode='Union')
        boxes = boxes[keep]
        all_boxes.append(boxes)
    if len(all_boxes) == 0:
        return None
    all_boxes = np.vstack(all_boxes)
    # Apply non-maximum suppression again to the boxes from all pyramid levels
    keep = py_nms(all_boxes[:, 0:5], 0.7, mode='Union')
    all_boxes = all_boxes[keep]
    # Box width and height
    bbw = all_boxes[:, 2] - all_boxes[:, 0] + 1
    bbh = all_boxes[:, 3] - all_boxes[:, 1] + 1
    # Box coordinates in the original image, plus the score
    boxes_c = np.vstack([all_boxes[:, 0] + all_boxes[:, 5] * bbw,
                         all_boxes[:, 1] + all_boxes[:, 6] * bbh,
                         all_boxes[:, 2] + all_boxes[:, 7] * bbw,
                         all_boxes[:, 3] + all_boxes[:, 8] * bbh,
                         all_boxes[:, 4]])
    boxes_c = boxes_c.T
    return boxes_c


# Get the RNet output
def detect_rnet(im, dets, thresh):
    """Select boxes with RNet.

    Args:
        im: input image
        dets: boxes selected by PNet, in absolute coordinates of the original image
    Returns:
        boxes in absolute coordinates
    """
    h, w, c = im.shape
    # Turn each PNet box into the square that contains it, to avoid losing information
    dets = convert_to_square(dets)
    dets[:, 0:4] = np.round(dets[:, 0:4])
    # Adjust boxes that extend beyond the image
    [dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph] = pad(dets, w, h)
    delete_size = np.ones_like(tmpw) * 20
    ones = np.ones_like(tmpw)
    zeros = np.zeros_like(tmpw)
    num_boxes = np.sum(np.where((np.minimum(tmpw, tmph) >= delete_size), ones, zeros))
    cropped_ims = np.zeros((num_boxes, 3, 24, 24), dtype=np.float32)
    for i in range(int(num_boxes)):
        # Crop each PNet box from the original image, padding out-of-image parts with 0
        if tmph[i] < 20 or tmpw[i] < 20:
            continue
        tmp = np.zeros((tmph[i], tmpw[i], 3), dtype=np.uint8)
        try:
            tmp[dy[i]:edy[i] + 1, dx[i]:edx[i] + 1, :] = im[y[i]:ey[i] + 1, x[i]:ex[i] + 1, :]
            img = cv2.resize(tmp, (24, 24), interpolation=cv2.INTER_LINEAR)
            img = img.transpose((2, 0, 1))
            img = (img - 127.5) / 128
            cropped_ims[i, :, :, :] = img
        except Exception:
            continue
    cls_scores, reg = predict_rnet(cropped_ims)
    cls_scores = cls_scores[:, 1]
    keep_inds = np.where(cls_scores > thresh)[0]
    if len(keep_inds) > 0:
        boxes = dets[keep_inds]
        boxes[:, 4] = cls_scores[keep_inds]
        reg = reg[keep_inds]
    else:
        return None

    keep = py_nms(boxes, 0.4, mode='Union')
    boxes = boxes[keep]
    # Calibrate the coordinates of the PNet crops to get RNet's face boxes
    # in absolute coordinates of the original image
    boxes_c = calibrate_box(boxes, reg[keep])
    return boxes_c


# Get the ONet output
def detect_onet(im, dets, thresh):
    """Filter the boxes further with ONet; much the same as RNet, but landmarks are returned as well."""
    h, w, c = im.shape
    dets = convert_to_square(dets)
    dets[:, 0:4] = np.round(dets[:, 0:4])
    [dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph] = pad(dets, w, h)
    num_boxes = dets.shape[0]
    cropped_ims = np.zeros((num_boxes, 3, 48, 48), dtype=np.float32)
    for i in range(num_boxes):
        tmp = np.zeros((tmph[i], tmpw[i], 3), dtype=np.uint8)
        tmp[dy[i]:edy[i] + 1, dx[i]:edx[i] + 1, :] = im[y[i]:ey[i] + 1, x[i]:ex[i] + 1, :]
        img = cv2.resize(tmp, (48, 48), interpolation=cv2.INTER_LINEAR)
        img = img.transpose((2, 0, 1))
        img = (img - 127.5) / 128
        cropped_ims[i, :, :, :] = img
    cls_scores, reg, landmark = predict_onet(cropped_ims)
    cls_scores = cls_scores[:, 1]
    keep_inds = np.where(cls_scores > thresh)[0]
    if len(keep_inds) > 0:
        boxes = dets[keep_inds]
        boxes[:, 4] = cls_scores[keep_inds]
        reg = reg[keep_inds]
        landmark = landmark[keep_inds]
    else:
        return None, None

    # Map the landmarks from box-relative to absolute image coordinates
    w = boxes[:, 2] - boxes[:, 0] + 1
    h = boxes[:, 3] - boxes[:, 1] + 1
    landmark[:, 0::2] = (np.tile(w, (5, 1)) * landmark[:, 0::2].T + np.tile(boxes[:, 0], (5, 1)) - 1).T
    landmark[:, 1::2] = (np.tile(h, (5, 1)) * landmark[:, 1::2].T + np.tile(boxes[:, 1], (5, 1)) - 1).T
    boxes_c = calibrate_box(boxes, reg)

    keep = py_nms(boxes_c, 0.6, mode='Minimum')
    boxes_c = boxes_c[keep]
    landmark = landmark[keep]
    return boxes_c, landmark


# Run the full cascade on one image
def infer_image(image_path):
    im = cv2.imread(image_path)
    # Run the first model
    boxes_c = detect_pnet(im, 20, 0.79, 0.9)
    if boxes_c is None:
        return None, None
    # Run the second model
    boxes_c = detect_rnet(im, boxes_c, 0.6)
    if boxes_c is None:
        return None, None
    # Run the third model
    boxes_c, landmark = detect_onet(im, boxes_c, 0.7)
    if boxes_c is None:
        return None, None
    return boxes_c, landmark


class fddbtest():
    def __init__(self, image_dir, result_dir):
        self.image_dir = image_dir
        self.result_dir = result_dir

    def detect(self):
        # event_list = os.listdir(self.image_dir)
        # print(event_list)
        for i in range(1, 11):
            fileFoldInputName = "./FDDB-fold-%02d.txt" % i
            fileInputName = './FDDB-folds/' + fileFoldInputName
            print(fileInputName)
            fileFoldOutName = "FDDB-fold-%02d-out.txt" % i
            fileOutputName = './FDDB-folds/' + fileFoldOutName
            fileTotalPredictName = './FDDB-folds/' + 'predict.txt'
            # 'a+' appends, so delete old output files before re-running
            fout = open(fileTotalPredictName, 'a+')  # predict.txt
            fsplitout = open(fileOutputName, 'a+')  # FDDB-fold-%02d-out.txt
            f = open(fileInputName, 'r')  # FDDB-fold-%02d.txt, read
            for imgpath in f.readlines():
                imgpath = imgpath.split('\n')[0]
                path = './originalPics/' + imgpath + '.jpg'
                # print(imgpath)
                img = cv2.imread(path)
                if img is None:
                    continue
                # Swap the B and R channels (img_matlab is not used further below)
                img_matlab = img.copy()
                tmp = img_matlab[:, :, 2].copy()
                img_matlab[:, :, 2] = img_matlab[:, :, 0]
                img_matlab[:, :, 0] = tmp
                # check rgb position
                # tic()
                bboxs, landmarks = infer_image(path)
                if bboxs is None:
                    text1 = str(imgpath) + '\n' + str(0) + '\n'
                    print(text1)
                    fout.write(text1)  # predict.txt
                    fsplitout.write(text1)  # FDDB-fold-%02d-out.txt
                else:
                    text1 = str(imgpath) + '\n' + str(len(bboxs)) + '\n'
                    print(text1)
                    fout.write(text1)  # FDDB-fold-%02d-out.txt or predict.txt
                    fsplitout.write(text1)
                    for coordinate in range(len(bboxs)):
                        # One line per face: left, top, width, height, score
                        text2 = str(int(bboxs[coordinate][0])) + ' ' + str(int(bboxs[coordinate][1])) + ' ' \
                                + str(abs(int(bboxs[coordinate][2] - bboxs[coordinate][0]))) + ' ' \
                                + str(abs(int(bboxs[coordinate][3] - bboxs[coordinate][1]))) + ' ' \
                                + str(bboxs[coordinate][4]) + '\n'
                        fout.write(text2)  # predict.txt
                        fsplitout.write(text2)  # FDDB-fold-%02d-out.txt
            # print error
            f.close()  # input the fold list, FDDB-fold-%02d.txt
            fout.close()  # output the result, predict.txt
            fsplitout.close()


if __name__ == '__main__':
    image_dir = './FDDB-folds'
    result_dir = './FDDB-folds'
    # Use a separate name for the instance so the class is not shadowed
    tester = fddbtest(image_dir, result_dir)
    tester.detect()
The predictions are saved to /fddb_evaluate/FDDB-folds/predict.txt
Note: if the txt file was produced on Windows, remember to convert it on Ubuntu 18.04 first.
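The post does not say which conversion is meant. Assuming it is the usual Windows (CRLF) to Unix (LF) line-ending conversion, a minimal sketch could look like this; the function name, file names and sample contents are made up.

```python
import os
import tempfile


def to_unix_newlines(src_path, dst_path):
    """Copy src_path to dst_path, replacing CRLF line endings with LF."""
    with open(src_path, 'rb') as f:
        data = f.read()
    with open(dst_path, 'wb') as f:
        f.write(data.replace(b'\r\n', b'\n'))


# Small self-check on a temporary file with made-up contents
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, 'predict_windows.txt')
dst = os.path.join(tmp_dir, 'predict.txt')
with open(src, 'wb') as f:
    f.write(b'example/img_1\r\n1\r\n10 20 30 40 0.99\r\n')
to_unix_newlines(src, dst)
with open(dst, 'rb') as f:
    converted = f.read()
print(converted)
```

Working in binary mode avoids any automatic newline translation by Python itself.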
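- Download the official evaluation code: evaluation code
Detailed steps: Ubuntu 下使用 FDDB 测试人脸检测模型并生成 ROC 曲线 (Testing a face detection model on FDDB under Ubuntu and generating ROC curves)
OpenCV versions differ; I chose to install OpenCV 3.4.5 on Ubuntu 18.04.
The Makefile is already written, so just run make; however, it reports an error: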
jn@nj:~/桌面/FDDB/evaluation$ make
g++ -O3 `pkg-config --cflags opencv` -c EllipsesSingleImage.cpp
EllipsesSingleImage.cpp: In member function ‘virtual void EllipsesSingleImage::show()’:
EllipsesSingleImage.cpp:77:52: error: ‘Scalar’ was not declared in this scope
mask = ((EllipseR *)(list->at(i)))->display(mask, Scalar(255,0,0), 3, NULL);
^~~~~~
EllipsesSingleImage.cpp:77:52: note: suggested alternative:
In file included from /usr/local/include/opencv2/core.hpp:58:0,
from /usr/local/include/opencv2/core/types_c.h:124,
from /usr/local/include/opencv2/core/core_c.h:48,
from /usr/local/include/opencv/highgui.h:45,
from RegionsSingleImage.hpp:10,
from EllipsesSingleImage.hpp:7,
from EllipsesSingleImage.cpp:5:
/usr/local/include/opencv2/core/types.hpp:657:25: note: ‘cv::Scalar’
typedef Scalar_<double> Scalar;
^~~~~~
Makefile:17: recipe for target 'EllipsesSingleImage.o' failed
make: *** [EllipsesSingleImage.o] Error 1
Add this header to the corresponding cpp file (EllipsesSingleImage.cpp in the error above):
#include <opencv2/imgproc.hpp>
Run make again to get the executable evaluate:
jn@nj:~/桌面/FDDB/evaluation$ make
g++ -O3 `pkg-config --cflags opencv` -c evaluate.cpp
g++ OpenCVUtils.o Region.o RegionsSingleImage.o EllipseR.o EllipsesSingleImage.o RectangleR.o RectanglesSingleImage.o Hungarian.o MatchPair.o Matching.o Results.o evaluate.o -o evaluate `pkg-config --libs opencv`
Directory layout afterwards:
jn@nj:~/桌面/FDDB$ ll
总用量 24
drwxrwxr-x 6 jn jn 4096 11月 30 13:29 ./
drwxr-xr-x 11 jn jn 4096 11月 29 20:09 ../
drwxrwxr-x 2 jn jn 4096 11月 30 13:47 evaluation/
drwxrwxr-x 2 jn jn 4096 11月 30 13:32 FDDB-folds/
drwxrwxr-x 4 jn jn 4096 11月 30 13:11 originalPics/
drwxrwxr-x 2 jn jn 4096 11月 30 13:32 out-folds/
Prepare the data in the ./FDDB-folds folder
imList.txt
cat FDDB-fold-01.txt FDDB-fold-02.txt FDDB-fold-03.txt FDDB-fold-04.txt FDDB-fold-05.txt FDDB-fold-06.txt FDDB-fold-07.txt FDDB-fold-08.txt FDDB-fold-09.txt FDDB-fold-10.txt > imList.txt
ellipseList.txt
cat FDDB-fold-01-ellipseList.txt FDDB-fold-02-ellipseList.txt FDDB-fold-03-ellipseList.txt FDDB-fold-04-ellipseList.txt FDDB-fold-05-ellipseList.txt FDDB-fold-06-ellipseList.txt FDDB-fold-07-ellipseList.txt FDDB-fold-08-ellipseList.txt FDDB-fold-09-ellipseList.txt FDDB-fold-10-ellipseList.txt > ellipseList.txt
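On a machine without cat, the same concatenation can be sketched in Python. The helper name concat_folds and its parameters are made up; the file naming follows the FDDB-fold-NN pattern used above, and the demo runs on two small temporary files rather than the real folds.

```python
import os
import tempfile


def concat_folds(fold_dir, suffix, out_name, num_folds=10):
    """Concatenate FDDB-fold-01..NN<suffix>.txt from fold_dir into out_name."""
    out_path = os.path.join(fold_dir, out_name)
    with open(out_path, 'w') as out:
        for i in range(1, num_folds + 1):
            fold_path = os.path.join(fold_dir, 'FDDB-fold-%02d%s.txt' % (i, suffix))
            with open(fold_path) as f:
                out.write(f.read())
    return out_path


# Demo on a temporary directory with two made-up folds
demo_dir = tempfile.mkdtemp()
for i, name in enumerate(['example/img_1', 'example/img_2'], start=1):
    with open(os.path.join(demo_dir, 'FDDB-fold-%02d.txt' % i), 'w') as f:
        f.write(name + '\n')
with open(concat_folds(demo_dir, '', 'imList.txt', num_folds=2)) as f:
    merged = f.read()
print(merged)
```

For the real data this would be called as concat_folds('./FDDB-folds', '', 'imList.txt') and concat_folds('./FDDB-folds', '-ellipseList', 'ellipseList.txt').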
- Compute the AUC
Put the model's prediction file in ./out-folds/ or in the current directory
Install gnuplot
sudo apt-get install gnuplot
Run evaluate:
./evaluate -a ../ellipseList.txt -d ../results.txt -i ../originalPics/ -l ../imList.txt
Two files are generated in the FDDB-folds folder: tempContROC.txt and tempDiscROC.txt
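Before plotting, the ROC files can be inspected directly. The sketch below assumes each line holds two whitespace-separated numbers, the true positive rate followed by the number of false positives; that column order is an assumption to check against your own files. The function names and the sample values are made up.

```python
import os
import tempfile


def read_roc(path):
    """Read (false_positives, true_positive_rate) pairs; column order is assumed."""
    points = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 2:
                continue
            tpr, fp = float(parts[0]), float(parts[1])
            points.append((fp, tpr))
    return points


def tpr_at_fp(points, max_fp):
    """Best true positive rate among points with at most max_fp false positives."""
    rates = [tpr for fp, tpr in points if fp <= max_fp]
    return max(rates) if rates else 0.0


# Demo on a temporary file with made-up numbers
roc_path = os.path.join(tempfile.mkdtemp(), 'tempDiscROC.txt')
with open(roc_path, 'w') as f:
    f.write('1.0 100\n0.9 50\n0.8 10\n0.5 0\n')
points = read_roc(roc_path)
rate = tpr_at_fp(points, 50)
print(rate)
```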
Download the official compareROC code and extract it:
tar -zxvf compareROC.tar.gz
Plot the curves:
gnuplot discROC.p
Edit runEvaluate.pl:
#### VARIABLES TO EDIT ####
# where gnuplot is
my $GNUPLOT = "/usr/bin/gnuplot";
# where the binary is
my $evaluateBin = "./evaluate";
# where the images are
my $imDir = "../originalPics/";
# where the folds are
my $fddbDir = "../FDDB-folds/";
# where the detections are
my $detDir = "../out-folds/";
###########################