Notes on Reproducing MTCNN with PyTorch

Introduction to MTCNN

MTCNN (Multi-task Cascaded Convolutional Networks) integrates face detection and face alignment by training a cascade of convolutional networks with multi-task learning. It runs in three stages: in the first stage, P-Net generates a large number of candidate face windows; in the second, R-Net rejects most non-face windows; in the third, O-Net produces the final face bounding boxes together with the coordinates of five facial landmarks.

The workflow is shown in the figure below.

[Figure: the three-stage MTCNN pipeline]

Training Process

Reference repository for this write-up: https://github.com/yeyupiaoling/Pytorch-MTCNN

Data and Environment Preparation

Training PNet

Place the datasets in the folders specified in the README, and you can start generating the PNet training data.

Run generate_PNet_data.py; this takes a long time…

Before running train_PNet.py, install TensorBoard:

pip install tensorboard

Insert the following code near the top of train_PNet.py:

from torch.utils.tensorboard import SummaryWriter

# set up TensorBoard logging
writer = SummaryWriter("logs")

# global counter of training steps
total_train_step = 0

And insert this inside the training loop:

total_train_step = total_train_step + 1
if total_train_step % 100 == 0:
    # log loss and classification accuracy every 100 steps
    acc = accuracy(class_out, label)
    writer.add_scalar("train_loss", total_loss, total_train_step)
    writer.add_scalar("train_accuracy", acc, total_train_step)

Start train_PNet.py, then open the PyCharm terminal and run:

tensorboard --logdir=logs

Open the printed URL (by default http://localhost:6006) to watch the training curves update live.

A screenshot taken straight from TensorBoard is usable, but it looks out of place in a paper. Instead, download the curve as a CSV file and smooth it manually afterwards:

import matplotlib.pyplot as plt
import pandas as pd


def tensorboard_smoothing(x, smooth=0.85):
    """Bias-corrected exponential moving average (approximates TensorBoard's smoothing)."""
    x = x.copy()
    weight = smooth  # running normalization term
    for i in range(1, len(x)):
        x[i] = (x[i - 1] * weight + x[i]) / (weight + 1)
        weight = (weight + 1) * smooth
    return x


fig, ax1 = plt.subplots(1, 1)

# CSV exported from TensorBoard, with 'Step' and 'Value' columns
train_loss = pd.read_csv("train_loss.csv")
ax1.plot(train_loss['Step'], tensorboard_smoothing(train_loss['Value'], smooth=0.999), color="#3399FF")

ax1.set_xlabel("steps")
ax1.set_ylabel("loss", color="#000000")
ax1.set_title("training loss")
plt.show()
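
For a publication-quality figure, saving a vector file directly beats a screenshot; with the same fig handle from above (standard Matplotlib API):

fig.savefig("train_loss.pdf", bbox_inches="tight")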

The result:

[Figures: smoothed training-loss curves]

Training RNet

Run generate_RNet_data.py; another long wait…

Before running train_RNet.py, insert the same TensorBoard logging code as for PNet.

Training ONet

Run generate_ONet_data.py; another long wait… but this time it crashed. The traceback showed the GPU running out of memory, so the script was modified to run inference on the CPU:

# run prediction on the CPU instead
device = torch.device("cpu")
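
If you would rather keep GPU inference, an alternative worth trying is to run the forward passes under torch.no_grad(), which avoids storing autograd buffers and often fixes this kind of out-of-memory error. A minimal sketch, where pnet and infer_data stand for the script's own model and input variables:

with torch.no_grad():
    # no gradient bookkeeping during pure inference
    cls_prob, bbox_pred, _ = pnet(infer_data)

Reducing the detection batch size is another option if memory is still tight.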

Rerun generate_ONet_data.py; this time it completes.

Then run train_ONet.py.

Testing

Training yields three models: PNet, RNet, and ONet.

Running MTCNN on a single image

Run infer_path.py to detect faces in a single test image.

[Figure: single-image detection result]

WIDER FACE validation set accuracy

  • First run the MTCNN model over the validation set and save the prediction results.

The WIDER FACE evaluation expects each output file in the following format:

<image name i>
<number of faces in this image = im>
<face i1>
<face i2>
...
<face im>
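
For instance, a saved result file for one image could look like this (hypothetical values; each face line is left top width height score):

0_Parade_marchingband_1_465
2
345 211 54 68 0.998
712 190 49 61 0.976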

Write the test script widerface_test.py:

import argparse
import os

import cv2
import numpy as np
import torch

from utils.utils import generate_bbox, py_nms, convert_to_square
from utils.utils import pad, calibrate_box, processed_image

parser = argparse.ArgumentParser()
parser.add_argument('--model_path', type=str, default='infer_models', help='folder containing the PNet, RNet and ONet model files')
# parser.add_argument('--image_path', type=str, default='dataset/test.jpg', help='path of the image to predict')
args = parser.parse_args()


device = torch.device("cuda")

# load the PNet model
pnet = torch.jit.load(os.path.join(args.model_path, 'PNet.pth'))
pnet.to(device)
softmax_p = torch.nn.Softmax(dim=0)
pnet.eval()

# load the RNet model
rnet = torch.jit.load(os.path.join(args.model_path, 'RNet.pth'))
rnet.to(device)
softmax_r = torch.nn.Softmax(dim=-1)
rnet.eval()

# load the ONet model
onet = torch.jit.load(os.path.join(args.model_path, 'ONet.pth'))
onet.to(device)
softmax_o = torch.nn.Softmax(dim=-1)
onet.eval()


# run PNet on one pyramid level
def predict_pnet(infer_data):
    # convert the image to a tensor and add a batch dimension
    infer_data = torch.tensor(infer_data, dtype=torch.float32, device=device)
    infer_data = torch.unsqueeze(infer_data, dim=0)
    # forward pass
    cls_prob, bbox_pred, _ = pnet(infer_data)
    cls_prob = torch.squeeze(cls_prob)
    cls_prob = softmax_p(cls_prob)
    bbox_pred = torch.squeeze(bbox_pred)
    return cls_prob.detach().cpu().numpy(), bbox_pred.detach().cpu().numpy()


# run RNet on a batch of 24x24 crops
def predict_rnet(infer_data):
    infer_data = torch.tensor(infer_data, dtype=torch.float32, device=device)
    # forward pass
    cls_prob, bbox_pred, _ = rnet(infer_data)
    cls_prob = softmax_r(cls_prob)
    return cls_prob.detach().cpu().numpy(), bbox_pred.detach().cpu().numpy()


# run ONet on a batch of 48x48 crops
def predict_onet(infer_data):
    infer_data = torch.tensor(infer_data, dtype=torch.float32, device=device)
    # forward pass
    cls_prob, bbox_pred, landmark_pred = onet(infer_data)
    cls_prob = softmax_o(cls_prob)
    return cls_prob.detach().cpu().numpy(), bbox_pred.detach().cpu().numpy(), landmark_pred.detach().cpu().numpy()


# collect candidate boxes from PNet over an image pyramid
def detect_pnet(im, min_face_size, scale_factor, thresh):
    """Filter boxes with PNet.
    Args:
      im: input image, shape [h, w, 3]
    """
    net_size = 12
    # ratio between the network input size and the minimum face size
    current_scale = float(net_size) / min_face_size
    im_resized = processed_image(im, current_scale)
    _, current_height, current_width = im_resized.shape
    all_boxes = list()
    # image pyramid
    while min(current_height, current_width) > net_size:
        # class map and box regression
        cls_cls_map, reg = predict_pnet(im_resized)
        boxes = generate_bbox(cls_cls_map[1, :, :], reg, current_scale, thresh)
        current_scale *= scale_factor  # keep shrinking the image for the pyramid
        im_resized = processed_image(im, current_scale)
        _, current_height, current_width = im_resized.shape

        if boxes.size == 0:
            continue
        # NMS within this pyramid level to drop heavily overlapping boxes
        keep = py_nms(boxes[:, :5], 0.5, mode='Union')
        boxes = boxes[keep]
        all_boxes.append(boxes)
    if len(all_boxes) == 0:
        return None
    all_boxes = np.vstack(all_boxes)
    # NMS again across all pyramid levels
    keep = py_nms(all_boxes[:, 0:5], 0.7, mode='Union')
    all_boxes = all_boxes[keep]
    # box width and height
    bbw = all_boxes[:, 2] - all_boxes[:, 0] + 1
    bbh = all_boxes[:, 3] - all_boxes[:, 1] + 1
    # apply the regression offsets to get boxes (and scores) in original-image coordinates
    boxes_c = np.vstack([all_boxes[:, 0] + all_boxes[:, 5] * bbw,
                         all_boxes[:, 1] + all_boxes[:, 6] * bbh,
                         all_boxes[:, 2] + all_boxes[:, 7] * bbw,
                         all_boxes[:, 3] + all_boxes[:, 8] * bbh,
                         all_boxes[:, 4]])
    boxes_c = boxes_c.T

    return boxes_c


# filter boxes with RNet
def detect_rnet(im, dets, thresh):
    """Refine boxes with RNet.
    Args:
      im: input image
      dets: boxes kept by PNet, in absolute coordinates of the original image
    Returns:
      boxes in absolute coordinates
    """
    h, w, c = im.shape
    # expand PNet boxes to enclosing squares to avoid losing information
    dets = convert_to_square(dets)
    dets[:, 0:4] = np.round(dets[:, 0:4])
    # clip boxes that extend beyond the image
    [dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph] = pad(dets, w, h)
    delete_size = np.ones_like(tmpw) * 20
    ones = np.ones_like(tmpw)
    zeros = np.zeros_like(tmpw)
    num_boxes = np.sum(np.where((np.minimum(tmpw, tmph) >= delete_size), ones, zeros))
    cropped_ims = np.zeros((num_boxes, 3, 24, 24), dtype=np.float32)
    for i in range(int(num_boxes)):
        # crop each PNet box out of the original image, zero-padding the overflow
        if tmph[i] < 20 or tmpw[i] < 20:
            continue
        tmp = np.zeros((tmph[i], tmpw[i], 3), dtype=np.uint8)
        try:
            tmp[dy[i]:edy[i] + 1, dx[i]:edx[i] + 1, :] = im[y[i]:ey[i] + 1, x[i]:ex[i] + 1, :]
            img = cv2.resize(tmp, (24, 24), interpolation=cv2.INTER_LINEAR)
            img = img.transpose((2, 0, 1))
            img = (img - 127.5) / 128
            cropped_ims[i, :, :, :] = img
        except Exception:
            continue
    cls_scores, reg = predict_rnet(cropped_ims)
    cls_scores = cls_scores[:, 1]
    keep_inds = np.where(cls_scores > thresh)[0]
    if len(keep_inds) > 0:
        boxes = dets[keep_inds]
        boxes[:, 4] = cls_scores[keep_inds]
        reg = reg[keep_inds]
    else:
        return None

    keep = py_nms(boxes, 0.4, mode='Union')
    boxes = boxes[keep]
    # calibrate the boxes with RNet's regression to get absolute coordinates in the original image
    boxes_c = calibrate_box(boxes, reg[keep])
    return boxes_c


# filter boxes with ONet
def detect_onet(im, dets, thresh):
    """Final refinement; essentially like RNet, but also returns landmarks."""
    h, w, c = im.shape
    dets = convert_to_square(dets)
    dets[:, 0:4] = np.round(dets[:, 0:4])
    [dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph] = pad(dets, w, h)
    num_boxes = dets.shape[0]
    cropped_ims = np.zeros((num_boxes, 3, 48, 48), dtype=np.float32)
    for i in range(num_boxes):
        tmp = np.zeros((tmph[i], tmpw[i], 3), dtype=np.uint8)
        tmp[dy[i]:edy[i] + 1, dx[i]:edx[i] + 1, :] = im[y[i]:ey[i] + 1, x[i]:ex[i] + 1, :]
        img = cv2.resize(tmp, (48, 48), interpolation=cv2.INTER_LINEAR)
        img = img.transpose((2, 0, 1))
        img = (img - 127.5) / 128
        cropped_ims[i, :, :, :] = img
    cls_scores, reg, landmark = predict_onet(cropped_ims)

    cls_scores = cls_scores[:, 1]
    keep_inds = np.where(cls_scores > thresh)[0]
    if len(keep_inds) > 0:
        boxes = dets[keep_inds]
        boxes[:, 4] = cls_scores[keep_inds]
        reg = reg[keep_inds]
        landmark = landmark[keep_inds]
    else:
        return None, None

    w = boxes[:, 2] - boxes[:, 0] + 1
    h = boxes[:, 3] - boxes[:, 1] + 1
    # map normalized landmark offsets back to absolute image coordinates
    landmark[:, 0::2] = (np.tile(w, (5, 1)) * landmark[:, 0::2].T + np.tile(boxes[:, 0], (5, 1)) - 1).T
    landmark[:, 1::2] = (np.tile(h, (5, 1)) * landmark[:, 1::2].T + np.tile(boxes[:, 1], (5, 1)) - 1).T
    boxes_c = calibrate_box(boxes, reg)

    keep = py_nms(boxes_c, 0.6, mode='Minimum')
    boxes_c = boxes_c[keep]
    landmark = landmark[keep]
    return boxes_c, landmark


# run the full cascade on one image
def infer_image(image_path):
    im = cv2.imread(image_path)
    # stage 1: PNet
    boxes_c = detect_pnet(im, 20, 0.79, 0.9)
    if boxes_c is None:
        return None, None
    # stage 2: RNet
    boxes_c = detect_rnet(im, boxes_c, 0.6)
    if boxes_c is None:
        return None, None
    # stage 3: ONet
    boxes_c, landmark = detect_onet(im, boxes_c, 0.7)
    if boxes_c is None:
        return None, None

    return boxes_c, landmark

class wildtest():
    def __init__(self, image_dir, result_dir):
        self.image_dir = image_dir
        self.result_dir = result_dir

    def detect(self):
        # WIDER FACE val images are grouped into event folders
        event_list = os.listdir(self.image_dir)
        for event in event_list:
            event_dir = os.path.join(self.image_dir, event)
            res_dir = os.path.join(self.result_dir, event)
            if not os.path.exists(res_dir):
                os.makedirs(res_dir)
            images_list = os.listdir(event_dir)
            for images in images_list:
                images_path = os.path.join(event_dir, images)
                print(images_path)

                bboxs, landmarks = infer_image(images_path)
                # no detections: write the image name and a face count of 0
                if bboxs is None:
                    fpath = os.path.join(res_dir, images[:-4] + '.txt')
                    f = open(fpath, 'w')
                    f.write(images[:-4] + '\n')
                    f.write(str(0) + '\n')
                    f.close()
                    continue
                if bboxs.shape[0] != 0:
                    # convert [x1, y1, x2, y2] to [left, top, width, height]
                    bboxs[:, 2] = bboxs[:, 2] - bboxs[:, 0]
                    bboxs[:, 3] = bboxs[:, 3] - bboxs[:, 1]
                    bboxs[:, :4] = np.round(bboxs[:, :4])
                    fpath = os.path.join(res_dir, images[:-4] + '.txt')
                    f = open(fpath, 'w')
                    f.write(images[:-4] + '\n')
                    f.write(str(bboxs.shape[0]) + '\n')
                    for i in range(bboxs.shape[0]):
                        f.write('{:.0f} {:.0f} {:.0f} {:.0f} {:.3f}\n'.format(bboxs[i, 0], bboxs[i, 1], bboxs[i, 2], bboxs[i, 3], bboxs[i, 4]))
                    f.close()


if __name__ == '__main__':
    image_dir = './dataset/WIDER_val/images/'
    result_dir = './anno_store/wider_val/'
    wildtest = wildtest(image_dir, result_dir)
    wildtest.detect()

The predictions are saved under ./anno_store/wider_val/.

  • Compute the AP

Use the evaluation code from https://github.com/bubbliiiing/retinaface-pytorch to compute the AP.

Create a directory widerface_evaluate, put the ground_truth folder inside it, and copy the three files setup.py, evaluation.py, and box_overlaps.pyx into the same directory.

Run setup.py first (it builds the Cython box-overlap extension), then run evaluation.py.
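
The commands are typically the following (the exact flags depend on the version of that evaluation code; -p is assumed to point at the prediction folder produced above and -g at the ground_truth folder, with paths relative to widerface_evaluate):

python setup.py build_ext --inplace
python evaluation.py -p ../anno_store/wider_val/ -g ./ground_truth/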

Easy   Val AP: 0.6931683876966619
Medium Val AP: 0.6767176224620579
Hard   Val AP: 0.4284666286589425

Testing on the FDDB dataset

FDDB dataset: http://vis-www.cs.umass.edu/fddb/

Reference material:
1. "Testing your own face detection algorithm on the FDDB dataset on Windows" (Chinese blog post)
2. https://github.com/hualitlc/MTCNN-on-FDDB-Dataset

Note: FDDB expects the prediction output in this format:

<image name i>
<number of faces in this image = im>
<face i1>
<face i2>
...
<face im>
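
For rectangular detections, each face line is left top width height detection-score; a fold output file could look like this (hypothetical values):

2002/08/11/big/img_591
1
78 39 82 115 0.996
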
  • For each FDDB-fold-**.txt image list, run the MTCNN model on FDDB and save the detections, using the script below:

import argparse
import os

import cv2
import numpy as np
import torch

from utils.utils import generate_bbox, py_nms, convert_to_square
from utils.utils import pad, calibrate_box, processed_image

parser = argparse.ArgumentParser()
parser.add_argument('--model_path', type=str, default='../infer_models', help='folder containing the PNet, RNet and ONet model files')
# parser.add_argument('--image_path', type=str, default='dataset/test.jpg', help='path of the image to predict')
args = parser.parse_args()

device = torch.device("cuda")

# load the PNet model
pnet = torch.jit.load(os.path.join(args.model_path, 'PNet.pth'))
pnet.to(device)
softmax_p = torch.nn.Softmax(dim=0)
pnet.eval()

# load the RNet model
rnet = torch.jit.load(os.path.join(args.model_path, 'RNet.pth'))
rnet.to(device)
softmax_r = torch.nn.Softmax(dim=-1)
rnet.eval()

# load the ONet model
onet = torch.jit.load(os.path.join(args.model_path, 'ONet.pth'))
onet.to(device)
softmax_o = torch.nn.Softmax(dim=-1)
onet.eval()


# 使用PNet模型预测
def predict_pnet(infer_data):
    # 添加待预测的图片
    infer_data = torch.tensor(infer_data, dtype=torch.float32, device=device)
    infer_data = torch.unsqueeze(infer_data, dim=0)
    # 执行预测
    cls_prob, bbox_pred, _ = pnet(infer_data)
    cls_prob = torch.squeeze(cls_prob)
    cls_prob = softmax_p(cls_prob)
    bbox_pred = torch.squeeze(bbox_pred)
    return cls_prob.detach().cpu().numpy(), bbox_pred.detach().cpu().numpy()


# 使用RNet模型预测
def predict_rnet(infer_data):
    # 添加待预测的图片
    infer_data = torch.tensor(infer_data, dtype=torch.float32, device=device)
    # 执行预测
    cls_prob, bbox_pred, _ = rnet(infer_data)
    cls_prob = softmax_r(cls_prob)
    return cls_prob.detach().cpu().numpy(), bbox_pred.detach().cpu().numpy()


# 使用ONet模型预测
def predict_onet(infer_data):
    # 添加待预测的图片
    infer_data = torch.tensor(infer_data, dtype=torch.float32, device=device)
    # 执行预测
    cls_prob, bbox_pred, landmark_pred = onet(infer_data)
    cls_prob = softmax_o(cls_prob)
    return cls_prob.detach().cpu().numpy(), bbox_pred.detach().cpu().numpy(), landmark_pred.detach().cpu().numpy()


# 获取PNet网络输出结果
def detect_pnet(im, min_face_size, scale_factor, thresh):
    """通过pnet筛选box和landmark
    参数:
      im:输入图像[h,2,3]
    """
    net_size = 12
    # 人脸和输入图像的比率
    current_scale = float(net_size) / min_face_size
    im_resized = processed_image(im, current_scale)
    _, current_height, current_width = im_resized.shape
    all_boxes = list()
    # 图像金字塔
    while min(current_height, current_width) > net_size:
        # 类别和box
        cls_cls_map, reg = predict_pnet(im_resized)
        boxes = generate_bbox(cls_cls_map[1, :, :], reg, current_scale, thresh)
        current_scale *= scale_factor  # 继续缩小图像做金字塔
        im_resized = processed_image(im, current_scale)
        _, current_height, current_width = im_resized.shape

        if boxes.size == 0:
            continue
        # 非极大值抑制留下重复低的box
        keep = py_nms(boxes[:, :5], 0.5, mode='Union')
        boxes = boxes[keep]
        all_boxes.append(boxes)
    if len(all_boxes) == 0:
        return None
    all_boxes = np.vstack(all_boxes)
    # 将金字塔之后的box也进行非极大值抑制
    keep = py_nms(all_boxes[:, 0:5], 0.7, mode='Union')
    all_boxes = all_boxes[keep]
    # box的长宽
    bbw = all_boxes[:, 2] - all_boxes[:, 0] + 1
    bbh = all_boxes[:, 3] - all_boxes[:, 1] + 1
    # 对应原图的box坐标和分数
    boxes_c = np.vstack([all_boxes[:, 0] + all_boxes[:, 5] * bbw,
                         all_boxes[:, 1] + all_boxes[:, 6] * bbh,
                         all_boxes[:, 2] + all_boxes[:, 7] * bbw,
                         all_boxes[:, 3] + all_boxes[:, 8] * bbh,
                         all_boxes[:, 4]])
    boxes_c = boxes_c.T

    return boxes_c


# 获取RNet网络输出结果
def detect_rnet(im, dets, thresh):
    """通过rent选择box
        参数:
          im:输入图像
          dets:pnet选择的box,是相对原图的绝对坐标
        返回值:
          box绝对坐标
    """
    h, w, c = im.shape
    # 将pnet的box变成包含它的正方形,可以避免信息损失
    dets = convert_to_square(dets)
    dets[:, 0:4] = np.round(dets[:, 0:4])
    # 调整超出图像的box
    [dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph] = pad(dets, w, h)
    delete_size = np.ones_like(tmpw) * 20
    ones = np.ones_like(tmpw)
    zeros = np.zeros_like(tmpw)
    num_boxes = np.sum(np.where((np.minimum(tmpw, tmph) >= delete_size), ones, zeros))
    cropped_ims = np.zeros((num_boxes, 3, 24, 24), dtype=np.float32)
    for i in range(int(num_boxes)):
        # 将pnet生成的box相对与原图进行裁剪,超出部分用0if tmph[i] < 20 or tmpw[i] < 20:
            continue
        tmp = np.zeros((tmph[i], tmpw[i], 3), dtype=np.uint8)
        try:
            tmp[dy[i]:edy[i] + 1, dx[i]:edx[i] + 1, :] = im[y[i]:ey[i] + 1, x[i]:ex[i] + 1, :]
            img = cv2.resize(tmp, (24, 24), interpolation=cv2.INTER_LINEAR)
            img = img.transpose((2, 0, 1))
            img = (img - 127.5) / 128
            cropped_ims[i, :, :, :] = img
        except:
            continue
    cls_scores, reg = predict_rnet(cropped_ims)
    cls_scores = cls_scores[:, 1]
    keep_inds = np.where(cls_scores > thresh)[0]
    if len(keep_inds) > 0:
        boxes = dets[keep_inds]
        boxes[:, 4] = cls_scores[keep_inds]
        reg = reg[keep_inds]
    else:
        return None

    keep = py_nms(boxes, 0.4, mode='Union')
    boxes = boxes[keep]
    # 对pnet截取的图像的坐标进行校准,生成rnet的人脸框对于原图的绝对坐标
    boxes_c = calibrate_box(boxes, reg[keep])
    return boxes_c


# 获取ONet模型预测结果
def detect_onet(im, dets, thresh):
    """将onet的选框继续筛选基本和rnet差不多但多返回了landmark"""
    h, w, c = im.shape
    dets = convert_to_square(dets)
    dets[:, 0:4] = np.round(dets[:, 0:4])
    [dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph] = pad(dets, w, h)
    num_boxes = dets.shape[0]
    cropped_ims = np.zeros((num_boxes, 3, 48, 48), dtype=np.float32)
    for i in range(num_boxes):
        tmp = np.zeros((tmph[i], tmpw[i], 3), dtype=np.uint8)
        tmp[dy[i]:edy[i] + 1, dx[i]:edx[i] + 1, :] = im[y[i]:ey[i] + 1, x[i]:ex[i] + 1, :]
        img = cv2.resize(tmp, (48, 48), interpolation=cv2.INTER_LINEAR)
        img = img.transpose((2, 0, 1))
        img = (img - 127.5) / 128
        cropped_ims[i, :, :, :] = img
    cls_scores, reg, landmark = predict_onet(cropped_ims)

    cls_scores = cls_scores[:, 1]
    keep_inds = np.where(cls_scores > thresh)[0]
    if len(keep_inds) > 0:
        boxes = dets[keep_inds]
        boxes[:, 4] = cls_scores[keep_inds]
        reg = reg[keep_inds]
        landmark = landmark[keep_inds]
    else:
        return None, None

    w = boxes[:, 2] - boxes[:, 0] + 1

    h = boxes[:, 3] - boxes[:, 1] + 1
    landmark[:, 0::2] = (np.tile(w, (5, 1)) * landmark[:, 0::2].T + np.tile(boxes[:, 0], (5, 1)) - 1).T
    landmark[:, 1::2] = (np.tile(h, (5, 1)) * landmark[:, 1::2].T + np.tile(boxes[:, 1], (5, 1)) - 1).T
    boxes_c = calibrate_box(boxes, reg)

    keep = py_nms(boxes_c, 0.6, mode='Minimum')
    boxes_c = boxes_c[keep]
    landmark = landmark[keep]
    return boxes_c, landmark


# 预测图片
def infer_image(image_path):
    im = cv2.imread(image_path)
    # 调用第一个模型预测
    boxes_c = detect_pnet(im, 20, 0.79, 0.9)
    if boxes_c is None:
        return None, None
    # 调用第二个模型预测
    boxes_c = detect_rnet(im, boxes_c, 0.6)
    if boxes_c is None:
        return None, None
    # 调用第三个模型预测
    boxes_c, landmark = detect_onet(im, boxes_c, 0.7)
    if boxes_c is None:
        return None, None

    return boxes_c, landmark


class fddbtest():

    def __init__(self, image_dir, result_dir):
        self.image_dir = image_dir
        self.result_dir = result_dir

    def detect(self):
        # FDDB is split into 10 folds
        for i in range(1, 11):
            fileFoldInputName = "FDDB-fold-%02d.txt" % i
            fileInputName = './FDDB-folds/' + fileFoldInputName
            print(fileInputName)

            fileFoldOutName = "FDDB-fold-%02d-out.txt" % i
            fileOutputName = './FDDB-folds/' + fileFoldOutName

            fileTotalPredictName = './FDDB-folds/' + 'predict.txt'

            fout = open(fileTotalPredictName, 'a+')    # all folds combined: predict.txt
            fsplitout = open(fileOutputName, 'a+')     # per-fold output: FDDB-fold-%02d-out.txt

            f = open(fileInputName, 'r')  # fold image list, e.g. FDDB-fold-01.txt
            for imgpath in f.readlines():
                imgpath = imgpath.split('\n')[0]
                path = './originalPics/' + imgpath + '.jpg'
                img = cv2.imread(path)
                # skip entries whose image file is missing
                if img is None:
                    continue

                bboxs, landmarks = infer_image(path)

                if bboxs is None:
                    # no detections: image name followed by a face count of 0
                    text1 = str(imgpath) + '\n' + str(0) + '\n'
                    print(text1)
                    fout.write(text1)
                    fsplitout.write(text1)
                else:
                    text1 = str(imgpath) + '\n' + str(len(bboxs)) + '\n'
                    print(text1)
                    fout.write(text1)
                    fsplitout.write(text1)

                    # each detection: left top width height score
                    for coordinate in range(len(bboxs)):
                        text2 = str(int(bboxs[coordinate][0])) + ' ' + str(int(bboxs[coordinate][1])) + ' ' \
                                + str(abs(int(bboxs[coordinate][2] - bboxs[coordinate][0]))) + ' ' \
                                + str(abs(int(bboxs[coordinate][3] - bboxs[coordinate][1]))) + ' ' \
                                + str(bboxs[coordinate][4]) + '\n'
                        fout.write(text2)
                        fsplitout.write(text2)

            f.close()
            fout.close()
            fsplitout.close()


if __name__ == '__main__':
    image_dir = './FDDB-folds'
    result_dir = './FDDB-folds'
    fddbtest = fddbtest(image_dir, result_dir)
    fddbtest.detect()

The combined predictions are saved at /fddb_evaluate/FDDB-folds/predict.txt.

Note: if the txt files were generated on Windows, remember to convert their line endings after moving them to Ubuntu 18.04.
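
dos2unix is the simplest way to do the conversion (assuming the result files sit under FDDB-folds/):

sudo apt-get install dos2unix
dos2unix FDDB-folds/predict.txt FDDB-folds/FDDB-fold-*-out.txt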

Detailed steps: "Testing a face detection model on FDDB and generating ROC curves under Ubuntu" (Chinese blog post)

OpenCV versions differ; OpenCV 3.4.5 was installed on Ubuntu 18.04 here.

A Makefile is already provided, so just run make, but it fails:

jn@nj:~/桌面/FDDB/evaluation$ make
g++ -O3  `pkg-config --cflags opencv`  -c EllipsesSingleImage.cpp 
EllipsesSingleImage.cpp: In member function ‘virtual void EllipsesSingleImage::show():
EllipsesSingleImage.cpp:77:52: error: ‘Scalar’ was not declared in this scope
  mask = ((EllipseR *)(list->at(i)))->display(mask, Scalar(255,0,0), 3, NULL);
                                                    ^~~~~~
EllipsesSingleImage.cpp:77:52: note: suggested alternative:
In file included from /usr/local/include/opencv2/core.hpp:58:0,
                 from /usr/local/include/opencv2/core/types_c.h:124,
                 from /usr/local/include/opencv2/core/core_c.h:48,
                 from /usr/local/include/opencv/highgui.h:45,
                 from RegionsSingleImage.hpp:10,
                 from EllipsesSingleImage.hpp:7,
                 from EllipsesSingleImage.cpp:5:
/usr/local/include/opencv2/core/types.hpp:657:25: note:   ‘cv::Scalar’
 typedef Scalar_<double> Scalar;
                         ^~~~~~
Makefile:17: recipe for target 'EllipsesSingleImage.o' failed
make: *** [EllipsesSingleImage.o] Error 1

The fix is to add the missing header in the corresponding .cpp file:

#include <opencv2/imgproc.hpp>

Running make again succeeds and produces the evaluate executable:

jn@nj:~/桌面/FDDB/evaluation$ make
g++ -O3  `pkg-config --cflags opencv`  -c evaluate.cpp 
g++ OpenCVUtils.o Region.o RegionsSingleImage.o EllipseR.o EllipsesSingleImage.o RectangleR.o RectanglesSingleImage.o Hungarian.o MatchPair.o Matching.o Results.o evaluate.o -o evaluate `pkg-config --libs opencv`

The directory layout at this point:

jn@nj:~/桌面/FDDB$ ll
total 24
drwxrwxr-x  6 jn jn 4096 Nov 30 13:29 ./
drwxr-xr-x 11 jn jn 4096 Nov 29 20:09 ../
drwxrwxr-x  2 jn jn 4096 Nov 30 13:47 evaluation/
drwxrwxr-x  2 jn jn 4096 Nov 30 13:32 FDDB-folds/
drwxrwxr-x  4 jn jn 4096 Nov 30 13:11 originalPics/
drwxrwxr-x  2 jn jn 4096 Nov 30 13:32 out-folds/

Prepare the data inside the ./FDDB-folds folder:

imList.txt

cat FDDB-fold-01.txt FDDB-fold-02.txt FDDB-fold-03.txt FDDB-fold-04.txt FDDB-fold-05.txt FDDB-fold-06.txt FDDB-fold-07.txt FDDB-fold-08.txt FDDB-fold-09.txt FDDB-fold-10.txt > imList.txt

ellipseList.txt

cat FDDB-fold-01-ellipseList.txt FDDB-fold-02-ellipseList.txt FDDB-fold-03-ellipseList.txt FDDB-fold-04-ellipseList.txt FDDB-fold-05-ellipseList.txt FDDB-fold-06-ellipseList.txt FDDB-fold-07-ellipseList.txt FDDB-fold-08-ellipseList.txt FDDB-fold-09-ellipseList.txt FDDB-fold-10-ellipseList.txt > ellipseList.txt
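
The same two lists can be built more compactly with bash brace expansion:

cat FDDB-fold-{01..10}.txt > imList.txt
cat FDDB-fold-{01..10}-ellipseList.txt > ellipseList.txt
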
  • Compute the AUC results

Put the model's prediction file in ./out-folds/ or in the current directory.

Install gnuplot:

sudo apt-get install gnuplot

Run evaluate, where -a is the ellipse annotation list, -d the detection file, -i the image root, and -l the image list:

./evaluate -a ../ellipseList.txt -d ../results.txt -i ../originalPics/ -l ../imList.txt

Two files are generated in the FDDB-folds folder: tempContROC.txt and tempDiscROC.txt.

Download the official compareROC package and extract it:

tar -zxvf compareROC.tar.gz

Plot the curves:

gnuplot discROC.p

Also edit the variables at the top of runEvaluate.pl so the paths match:

#### VARIABLES TO EDIT ####
# where gnuplot is
my $GNUPLOT = "/usr/bin/gnuplot";
# where the binary is
my $evaluateBin = "./evaluate";
# where the images are
my $imDir = "../originalPics/";
# where the folds are
my $fddbDir = "../FDDB-folds/";
# where the detections are
my $detDir = "../out-folds/";
###########################
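
With these variables set, the whole evaluation and plotting pipeline can then be driven from that script:

perl runEvaluate.pl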

Reposted from blog.csdn.net/Star_ID/article/details/127995622