AIZOO open-sources a face mask detection model

Reposting for a friend's startup team: they have open-sourced a small face mask detection model, and the results are quite good, so please support it! The links are given in the article below.

  • Over the past month, the novel coronavirus pneumonia epidemic has touched the hearts of people across the country, and front-line medical workers are standing at the forefront of the fight. At the same time, many practitioners in the technology industry and the artificial intelligence field have also contributed their strength. In recent days, technology companies such as Megvii, SenseTime, Hikvision, and Baidu have developed AI-based infrared temperature measurement and face and mask-wearing detection systems, and Alibaba has developed a deep learning algorithm that automatically diagnoses novel coronavirus pneumonia from medical images. It is fair to say that practitioners in all walks of life are contributing to overcoming this epidemic as soon as possible.
    As an AI-focused Internet+ entrepreneurial team, we decided to open-source our curated mask detection dataset and trained models (both Keras and Caffe versions), together with the code. In addition, we have converted the model to a format supported by TensorFlow.js and deployed it on our website aizoo.com, so it can be tried directly in the browser.
    At the end of the article there are links to all of our open-source resources; if you do not want to read the whole article, you can scroll straight to the bottom to view the links.
    First, let's look at the effect in a demo video. If the video cannot be shown here, interested readers can watch it at the original post.

  • In the video, faces without masks are framed in red and faces wearing masks are framed in green. You can see that if a hand covers most of the mouth, the model may decide the person is not wearing a mask (this is discussed further below).

Below, we introduce the project in three parts: the model structure, the data, and the TensorFlow.js deployment. Without further ado, let's go~

I. Model Introduction

  • Before the deep learning era, face detection commonly relied on traditional methods based on hand-crafted features, the most famous being the Viola-Jones algorithm; the face detection built into some mobile phones and digital cameras still uses it. However, with the rapid development of deep learning, deep-learning-based face detection algorithms have gradually replaced traditional computer vision algorithms.

  • Looking at WIDER Face, the most commonly used face detection benchmark, the precision and recall of deep learning models greatly exceed those of traditional algorithms. The green line in the figure below is the Precision-Recall curve of the Viola-Jones algorithm.

[Figure: Precision-Recall curves of face detection algorithms on WIDER Face; the green line is the Viola-Jones algorithm]

  • The figure above shows the Precision-Recall evaluation of many deep-learning-based face detection algorithms. You can see that deep-learning-based face detectors substantially outperform the VJ algorithm (the further a curve lies to the upper right, the better). Over the past two years, face detection algorithms have been able to reach a 95% recall rate with precision as high as 90% on the easy subset of WIDER Face; for comparison, the VJ algorithm reaches only about 75% precision at a 40% recall rate.


  • In fact, most deep-learning-based face detection algorithms are improved versions of general deep object detection models, or general object detection models adapted to the specific requirements of the face detection task. Among the many object detection models (Faster RCNN, SSD, YOLO), SSD is the most commonly used basis for face detection: well-known models such as SSH, S3FD, and RetinaFace are all inspired by SSD or are task-specific customizations of it, for example by attaching detection layers to earlier (higher-resolution) feature maps, adjusting anchor sizes, changing the anchor matching rules, or adding an FPN on top of the SSD backbone.
    In my personal opinion, SSD is the most elegant and simple object detection model, so the face mask detection model we implemented also follows the SSD idea. For reasons of space, this article does not explain the principles of SSD in detail and only gives a brief introduction to the model configuration.
    Later, Yuanfeng will also write articles on the details of the SSD algorithm and its implementation, open-source the minimalist object detection training framework I implemented, and publish a comparative analysis of SSD and YOLOv3 on our public account; you are welcome to follow the AIZOO public account.

  • In this project, we use a face detection algorithm with the SSD architecture. Compared with an ordinary face detection model, which has only a single class (face), face mask detection simply adds one more class, so there are two classes: faces with masks and faces without masks. The model we open-source is very small: the input size is 260x260, the backbone network has only 8 layers, and there are five localization and classification layers, for a total of only 28 convolutional layers. Each convolutional layer has 32, 64, or 128 channels, so the whole model has only about 1.015 million parameters. The figure below shows the network structure, and a minimal code sketch of a backbone in this spirit follows it.
    [Figure: Network structure diagram]
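To make the description above concrete, here is a minimal Keras sketch of a backbone in the same spirit: a 260x260 input, eight convolutional layers with 32/64/128 channels, BatchNorm after each convolution, and five feature maps whose grid sizes match the localization-layer table further below. The exact strides, channel layout, and detection-head wiring of the released model may differ; this is an illustration, not the open-source model itself.

```python
from tensorflow.keras import layers, models

def conv_bn(x, filters, stride=1):
    # 3x3 convolution + BatchNorm + ReLU
    x = layers.Conv2D(filters, 3, strides=stride, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(x)

def tiny_backbone(input_shape=(260, 260, 3)):
    # Illustrative 8-layer backbone; f1..f5 reproduce the 33/17/9/5/3
    # feature-map sizes of the five localization layers.
    inputs = layers.Input(shape=input_shape)
    x = conv_bn(inputs, 32, stride=2)   # 130x130
    x = conv_bn(x, 32)                  # 130x130
    x = conv_bn(x, 64, stride=2)        # 65x65
    f1 = conv_bn(x, 64, stride=2)       # 33x33 (detection map 1)
    f2 = conv_bn(f1, 128, stride=2)     # 17x17 (detection map 2)
    f3 = conv_bn(f2, 128, stride=2)     # 9x9   (detection map 3)
    f4 = conv_bn(f3, 128, stride=2)     # 5x5   (detection map 4)
    f5 = conv_bn(f4, 128, stride=2)     # 3x3   (detection map 5)
    return models.Model(inputs, [f1, f2, f3, f4, f5], name="tiny_ssd_backbone")

model = tiny_backbone()
model.summary()
```

In the full model, each of the five feature maps would additionally feed a localization convolution and a classification convolution, which together make up the remaining 20 layers mentioned below.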

  • In the figure, the top eight convolutional layers are the backbone, i.e. the feature extraction layers, and the lower 20 layers are the localization and classification layers (note that, for clarity, the BN layers are not drawn).
    When training an object detection model, the most important thing is to set the anchor sizes and aspect ratios reasonably. When working on a project, I usually compute statistics of the sizes and aspect ratios of the target objects in the dataset and set the anchors accordingly. For example, on our annotated masked-face dataset we read all of the face annotations, computed the height-to-width ratio of every face, and obtained the distribution histogram below (a small sketch of this computation follows the figure):
    [Figure: Histogram of face height-to-width ratios in the annotated dataset]
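A minimal sketch of the kind of statistic described above, assuming the annotations have been parsed into pixel boxes of the form [xmin, ymin, xmax, ymax] together with the size of each image (the variable names `all_boxes` and `all_sizes` are placeholders, not part of the released code):

```python
import numpy as np
import matplotlib.pyplot as plt

def normalized_aspect_ratios(boxes, image_sizes):
    """Height/width ratios of face boxes after normalizing by image size.

    boxes:       (N, 4) array of [xmin, ymin, xmax, ymax] in pixels
    image_sizes: (N, 2) array of [image_width, image_height] per box
    """
    boxes = np.asarray(boxes, dtype=np.float32)
    image_sizes = np.asarray(image_sizes, dtype=np.float32)
    w = (boxes[:, 2] - boxes[:, 0]) / image_sizes[:, 0]   # normalized width
    h = (boxes[:, 3] - boxes[:, 1]) / image_sizes[:, 1]   # normalized height
    return h / np.maximum(w, 1e-6)

# Example usage (all_boxes / all_sizes loaded from the annotation files):
# ratios = normalized_aspect_ratios(all_boxes, all_sizes)
# plt.hist(ratios, bins=50)
# plt.xlabel("normalized height / width")
# plt.ylabel("number of faces")
# plt.show()
```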

  • Faces are generally tall rectangles, while many images are wide, for example 16:9; after normalizing face width and height by the image width and height, in many images the normalized height is twice the normalized width or even more. The figure above also shows that the normalized height-to-width ratios are concentrated between 1 and 2.5. Therefore, based on the data distribution, we set the anchor width-to-height ratios of all five localization layers to 1, 0.62, and 0.42 (expressed as height-to-width ratios, roughly 1, 1.6:1, and 2.4:1).

  • The configuration of the five localization layers is shown in the table below (a small sketch of generating anchors from these settings follows the table):

Layer     Feature map size   Anchor sizes   Anchor aspect ratios (width/height)
Layer 1   33x33              0.04, 0.056    1, 0.62, 0.42
Layer 2   17x17              0.08, 0.11     1, 0.62, 0.42
Layer 3   9x9                0.16, 0.22     1, 0.62, 0.42
Layer 4   5x5                0.32, 0.45     1, 0.62, 0.42
Layer 5   3x3                0.64, 0.72     1, 0.62, 0.42
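The sketch below generates SSD-style anchors from the table above. Anchor sizes are interpreted as fractions of the 260x260 input, every listed size is paired with every listed width/height ratio, and anchors are centered on feature-map cells; the anchor layout conventions of the actual training framework may differ, so this is only an illustration.

```python
import itertools
import numpy as np

# (feature map size, anchor sizes, width/height ratios) for the five
# localization layers, taken from the table above.
ANCHOR_CONFIG = [
    (33, [0.04, 0.056], [1.0, 0.62, 0.42]),
    (17, [0.08, 0.11],  [1.0, 0.62, 0.42]),
    (9,  [0.16, 0.22],  [1.0, 0.62, 0.42]),
    (5,  [0.32, 0.45],  [1.0, 0.62, 0.42]),
    (3,  [0.64, 0.72],  [1.0, 0.62, 0.42]),
]

def generate_anchors(config=ANCHOR_CONFIG):
    """Return all anchors as normalized [cx, cy, w, h] rows."""
    anchors = []
    for fmap_size, sizes, ratios in config:
        for i, j in itertools.product(range(fmap_size), repeat=2):
            cx = (j + 0.5) / fmap_size   # anchor center, normalized to [0, 1]
            cy = (i + 0.5) / fmap_size
            for s, r in itertools.product(sizes, ratios):
                # keep the anchor area at s*s while setting width/height = r
                w = s * np.sqrt(r)
                h = s / np.sqrt(r)
                anchors.append([cx, cy, w, h])
    return np.asarray(anchors, dtype=np.float32)

anchors = generate_anchors()
print(anchors.shape)  # ((33² + 17² + 9² + 5² + 3²) * 6, 4) anchors in total
```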
  • I trained the face mask detection model with an object detection micro-framework implemented in Keras. To avoid the situation some netizens have mentioned, where covering the mouth with a hand can fool part of a mask detection system, we added images in which the mouth is covered by a hand to the dataset. In addition, during training we randomly pasted pictures of other objects onto the mouth region, so that the model does not simply learn that an exposed mouth means no mask and a covered mouth means a mask. With these two measures we solved this problem quite well; you can try the model at aizoo.com. The post-processing is mainly non-maximum suppression (NMS): we use single-class NMS, that is, the masked-face and unmasked-face classes are suppressed together in one pass, which improves speed (a minimal sketch follows).
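Here is a minimal NumPy sketch of that single-class NMS idea: all boxes are suppressed together regardless of predicted class, so NMS runs once instead of once per class. Boxes are assumed to be [xmin, ymin, xmax, ymax] rows with one confidence score each, and the IoU threshold is illustrative.

```python
import numpy as np

def single_class_nms(boxes, scores, iou_threshold=0.4):
    """Greedy NMS applied to all boxes together, regardless of class.

    boxes:  (N, 4) array of [xmin, ymin, xmax, ymax]
    scores: (N,) confidence scores
    Returns the indices of the kept boxes.
    """
    boxes = np.asarray(boxes, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]          # highest confidence first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # intersection of the top box with the remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # keep only boxes that overlap less than the threshold
        order = order[1:][iou <= iou_threshold]
    return keep
```

Because masked and unmasked faces are suppressed together, a masked-face box and an unmasked-face box on the same face cannot both survive; the class label of each kept box is simply whatever the classifier predicted for it.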

II. Data

  • In fact, using the detection framework I implemented myself, the training part of this project took only two hours. Preparing the data, however, took two whole days. Spending two days writing code does not feel painful, but spending two days annotating data is a rather painful experience.

  • At first, I trained a model on the roughly one thousand images made public by an undergraduate on Bilibili, but that dataset is rather homogeneous, and the model easily fell into the trap of treating a hand covering the mouth as wearing a mask. So I decided to curate a dataset myself.

  • There are many face detection datasets, the most commonly used being WIDER Face. We selected 3,894 images from it and verified them, mainly relabeling faces that are actually wearing masks as masked faces. For masked faces, we used the MAFA dataset open-sourced by Prof. Ge Shiming of the Institute of Information Engineering, Chinese Academy of Sciences. MAFA is originally an occluded-face dataset containing faces with various kinds of occlusion, most of them occluded by masks; we selected 4,064 masked-face images from it.

  • The face box definition in MAFA differs considerably from WIDER Face: MAFA boxes are square, end just above the eyebrows, and are not tight (there are gaps between the box and the face edge), whereas WIDER Face boxes extend above the forehead. Without correction, the model would draw boxes up to the forehead for unmasked faces but only up to the eyebrows for masked faces. We therefore re-annotated this portion of the MAFA data (annotating data really is an unpleasant process~). Finally, we randomly split the data into a training set and a validation set; the statistics are shown in the table below, and a toy example of such a random split follows it.

    Dataset                   From WIDER Face   From MAFA   Total
    Training set (images)     3114              3006        6120
    Validation set (images)   780               1059        1839
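For illustration only, a random split of annotated image ids into training and validation sets could look like the sketch below; the fraction and seed are placeholders, and the released split (6,120 training and 1,839 validation images) was produced by the authors separately.

```python
import random

def split_dataset(image_ids, val_fraction=0.2, seed=0):
    """Randomly split a list of annotated image ids into train / val."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)   # deterministic shuffle for reproducibility
    n_val = int(len(ids) * val_fraction)
    return ids[n_val:], ids[:n_val]    # (train_ids, val_ids)
```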

That is all for the data section. Friends who need the dataset can find the download link at the end of this article, or reply "口罩数据集" (mask dataset) in our public account to get the link.

III. TensorFlow.js Deployment

  • To make the model easy to deploy in the browser, we deliberately designed it to be very small. The drawback is that some accuracy is sacrificed, but the model is very fast and achieves real-time performance even on an ordinary CPU. The Precision-Recall curve of the final model on the validation set is shown below:

  • Because WIDER Face itself is a somewhat challenging dataset, and our model's input size and parameter count are small, the AP for faces is only 0.896. This can be improved by designing a larger model; if you need a higher-accuracy model, feel free to contact us.
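For reference, a Precision-Recall curve like the one mentioned above can be computed once every detection has been matched against the ground truth by IoU. The sketch below assumes that matching has already produced a binary true-positive flag per detection (the matching step itself is not shown), and that the total number of ground-truth boxes of the class is known; AP is then the area under this curve.

```python
import numpy as np

def pr_curve(scores, is_true_positive, num_ground_truth):
    """Precision and recall points for one class.

    scores:           confidence of every detection
    is_true_positive: 1 if the detection matched a previously unmatched
                      ground-truth box above the IoU threshold, else 0
    num_ground_truth: total number of ground-truth boxes of this class
    """
    scores = np.asarray(scores, dtype=np.float32)
    tp_flags = np.asarray(is_true_positive, dtype=np.float32)
    order = np.argsort(scores)[::-1]          # sort detections by confidence
    tp = np.cumsum(tp_flags[order])           # cumulative true positives
    fp = np.cumsum(1.0 - tp_flags[order])     # cumulative false positives
    recall = tp / max(num_ground_truth, 1)
    precision = tp / np.maximum(tp + fp, 1e-9)
    return precision, recall
```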

  • After training the model with Keras, we converted it to the TensorFlow.js format and deployed the face mask detection model with JavaScript. Sample detection results are shown in the figure below, and a minimal conversion sketch follows it:
    [Figure: Detection results of the model deployed with TensorFlow.js]
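A minimal sketch of converting a trained Keras model to the TensorFlow.js Layers format using the tensorflowjs Python package. The file names are placeholders, and the exact conversion options the authors used are not specified here.

```python
# pip install tensorflowjs
import tensorflowjs as tfjs
from tensorflow.keras.models import load_model

# Load the trained Keras model (placeholder file name).
model = load_model("face_mask_detection.h5")

# Write a TensorFlow.js Layers-format model directory (model.json plus
# binary weight shards) that the browser can load with
# tf.loadLayersModel("tfjs_model/model.json").
tfjs.converters.save_keras_model(model, "tfjs_model")
```

The same conversion can also be done from the command line with `tensorflowjs_converter --input_format keras face_mask_detection.h5 tfjs_model`.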

Obtaining the Open-Source Resources

Finally, here is how to obtain the resources we have open-sourced.

Dataset download link (extraction code: eyfz)

Open-source code and model link

Online demo link

======================= END ========================
I am Yuanfeng, an Internet+ entrepreneur in the AI field. You are welcome to scan the QR code below, or search "AIZOO" directly in WeChat to follow our public account AIZOO.

If you have algorithm needs, such as object detection, face recognition, defect detection, or pedestrian detection, please add our WeChat ID AIZOOTech to get in touch. Our team is a group of algorithm engineers turned entrepreneurs, and we will deliver efficient, stable, and cost-effective products to meet your needs.

If you are an algorithm or development engineer, you can also add our WeChat ID AIZOOTech; please include a note in the format school or company - research area - nickname, for example "Xidian University - Image Algorithms - Yuanfeng", and Yuanfeng will add you to our algorithm exchange group to discuss algorithms and development, share knowledge, and connect on projects.



Origin: blog.csdn.net/qq_36810544/article/details/104391955