Detect Faces Efficiently: A Survey and Evaluations Study Notes

Paper Address: 2021-IEEE-Shiqi Yu-Detect Faces Efficiently: A Survey and Evaluations
Code Address 1: https://github.com/ShiqiYu/libfacedetection
Code Address 2: https://github.com/shiqiyu/libfacedetection.train
Note: These notes cover only the parts of the paper I found most important, so they are selective rather than complete.

Abstract

  • What is face detection? Face detection searches an image for all regions that may contain a face and, where a face is present, localizes it;
  • Face detection algorithms: the paper introduces representative deep-learning-based face detection algorithms and analyzes them in depth in terms of accuracy and efficiency;
  • Datasets: the paper compares and discusses the popular and challenging datasets in current use, together with their evaluation metrics;
  • How is the efficiency of a face detector measured? With two metrics: FLOPs and latency;
  • Significance of the paper: by comparing different algorithms and their performance, it makes it easier to choose a face detector/algorithm that fits your needs.
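As a rough illustration of the latency metric mentioned above, here is a minimal timing sketch. It is not the paper's benchmarking protocol: the warm-up count, the number of runs, the use of the median, and `dummy_detector` (a hypothetical stand-in for a real detector's forward pass) are all assumptions made for the example.

```python
import statistics
import time


def measure_latency_ms(fn, warmup=3, runs=20):
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):  # warm-up calls are executed but not timed
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def dummy_detector():
    """Hypothetical stand-in for a detector forward pass (assumption)."""
    return sum(i * i for i in range(10_000))


latency = measure_latency_ms(dummy_detector)
print(f"median latency: {latency:.3f} ms")
```

The number printed depends on the machine it runs on, which is exactly why latency alone is hard to compare across papers.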

1 Introduction

The main contributions of this paper:
(1) Unlike earlier surveys of face detection, which focused on traditional algorithms, this paper mainly reviews face detection algorithms based on deep learning, and it gives a clear view of how deep-learning-based face detection has developed in recent years;
(2) The paper studies and analyzes both the accuracy and the efficiency of the algorithms, runs extensive experiments, and reports each algorithm's performance under different evaluation metrics. It also summarizes the tricks the algorithms use, which helps readers choose a model that suits them;
(3) Focusing on the efficiency of face detectors, comprehensive experiments evaluate both the accuracy and the efficiency of different face detectors. In addition to latency, the paper proposes a precise metric for the computational cost of CNN models: floating point operations (FLOPs) counted under certain rules. FLOPs are a more neutral measure than latency, which depends heavily on the hardware and on the structure of the deep network.
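To make the FLOPs metric in contribution (3) concrete, here is a minimal sketch that counts the FLOPs of a single convolution layer. The counting rule is an assumption for illustration (one multiply-accumulate counted as 2 FLOPs, bias and activation ignored); the paper defines its own rules, which these notes do not reproduce, and the function name and example layer are hypothetical.

```python
def conv2d_flops(c_in, c_out, kernel, h_out, w_out, groups=1, flops_per_mac=2):
    """FLOPs of one 2D convolution, ignoring bias and activation.

    Assumption: each multiply-accumulate (MAC) counts as `flops_per_mac`
    FLOPs. Conventions differ; some works count one MAC as 1 FLOP.
    """
    macs = (c_in // groups) * kernel * kernel * c_out * h_out * w_out
    return macs * flops_per_mac


# Hypothetical first layer: 3x3 conv, 3 -> 16 channels, 320x240 output map.
standard = conv2d_flops(3, 16, 3, h_out=240, w_out=320)
# Depthwise 3x3 conv on 16 channels (groups == channels), 10x10 output map.
depthwise = conv2d_flops(16, 16, 3, h_out=10, w_out=10, groups=16)
print(standard, depthwise)
```

Unlike latency, the result depends only on the layer shapes, not on the device that runs them.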
The main content of the paper is organized as follows:
(1) It summarizes the key challenges of face detection;
(2) It provides a roadmap of the development of deep-learning-based face detection, with a detailed review;
(3) It reviews several basic sub-problems in the face detection task, including backbones, context modeling, handling of face scale variation, and proposal generation;
(4) It introduces and summarizes the commonly used face detection datasets and the corresponding best reported performance;
(5) Extensive experiments on several open-source one-stage face detection algorithms reveal the relationship between computational cost and AP;
(6) It reviews speed-focused face detectors collected from GitHub;
(7) It discusses the future challenges of face detection.
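AP (average precision) is the accuracy measure that item (5) relates to computational cost. Below is a minimal single-image sketch of how AP can be computed from scored detections. It is a simplification under stated assumptions: boxes are `(x1, y1, x2, y2)`, a detection counts as correct at IoU >= 0.5, matching is greedy, and the area under the precision-recall curve is not interpolated. Benchmark toolkits pool detections over a whole dataset and may differ in these details.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_thr=0.5):
    """detections: list of (score, box); ground_truth: list of boxes."""
    matched = set()
    tp_flags = []
    # Visit detections from highest to lowest confidence.
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue  # each ground-truth face can be matched only once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_iou >= iou_thr:
            matched.add(best_idx)
            tp_flags.append(True)
        else:
            tp_flags.append(False)
    # Sum precision * recall increment at every true positive.
    ap, tp, prev_recall = 0.0, 0, 0.0
    for rank, is_tp in enumerate(tp_flags, start=1):
        if is_tp:
            tp += 1
            recall = tp / len(ground_truth)
            ap += (tp / rank) * (recall - prev_recall)
            prev_recall = recall
    return ap


gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 30))]
ap = average_precision(dets, gt_boxes)
print(f"AP = {ap:.4f}")
```

In this toy case the false positive ranked second lowers the precision at the second true positive to 2/3, so the AP falls below 1.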

2 Main Challenges

(1) Accuracy-related challenges come from variations in facial appearance and imaging conditions. Examples of face samples under difficult conditions are shown in Figure 1;
(2) Masked/occluded face detection is becoming more and more important because of COVID-19;
(3) The huge demand from edge devices brings efficiency-related challenges.

3 Face Detection Frameworks

Following the taxonomy used for object detection frameworks, the paper divides deep-learning-based face detectors into three main categories:
(1) Multi-stage face detection frameworks. These are inspired by the cascade classifiers used in face detection and were an early exploration of applying deep learning to the task;
(2) Two-stage face detection frameworks. The first stage generates region proposals, and the second stage refines and confirms them. Efficiency should be better than that of multi-stage frameworks. [As in general object detection, e.g. the R-CNN series.]
(3) Single-stage face detection frameworks. Feature extraction and proposal generation are performed in one unified network. These frameworks can be further divided into anchor-based methods (RetinaFace, YOLOv5face, YOLOFaceV2, SCRFD, etc.) and anchor-free methods (CenterFace).
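To illustrate what "anchor-based" means in category (3), here is a minimal sketch that tiles square anchor boxes over one feature map. The stride and anchor sizes are made-up values for illustration and are not taken from any of the detectors named above. An anchor-based head predicts offsets relative to boxes like these, while an anchor-free head predicts a box directly at each location.

```python
def generate_anchors(img_w, img_h, stride, sizes):
    """Tile square anchors (cx, cy, w, h) over one feature map.

    Assumption: the feature map has img_w // stride by img_h // stride cells
    and every cell gets one anchor per entry in `sizes`.
    """
    anchors = []
    for gy in range(img_h // stride):
        for gx in range(img_w // stride):
            cx = (gx + 0.5) * stride  # cell centre in image coordinates
            cy = (gy + 0.5) * stride
            for size in sizes:
                anchors.append((cx, cy, size, size))
    return anchors


# Hypothetical setting: 64x64 input, stride 16, two anchor sizes per cell.
anchors = generate_anchors(64, 64, stride=16, sizes=(16, 32))
print(len(anchors), anchors[0])
```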

3.1 Multi-Stage and Two-Stage Face Detectors

3.2 One-Stage Face Detectors

Single-stage face detection algorithms perform feature extraction, proposal generation, and face detection simultaneously in a single convolutional neural network, so their running time is independent of the number of faces.
In recent years, one-stage detectors have become popular for three main reasons:
(1) By design, the running time of a single-stage face detector is independent of the number of faces in the image, which makes its runtime efficiency more robust;
(2) Single-stage detectors achieve approximate scale invariance through context modeling and multi-scale feature sampling, which is computationally efficient and simple;
(3) Face detection is a relatively simpler task than general object detection, so innovations and advanced network designs from object detection can be adapted quickly to face detection by taking the particular patterns of faces into account.
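Reason (1) can be seen from a simple count: a dense one-stage head emits one prediction per anchor per feature-map cell, so the amount of network output is set by the input resolution and not by the image content. The sketch below assumes three output strides and two anchors per cell; both are hypothetical values, not the configuration of any specific detector.

```python
import math


def dense_prediction_count(img_w, img_h, strides=(8, 16, 32), anchors_per_cell=2):
    """Number of candidate boxes a dense one-stage head emits for one image.

    Assumption: one feature map per stride, each with ceil(w / s) by
    ceil(h / s) cells and `anchors_per_cell` predictions per cell.
    """
    total = 0
    for stride in strides:
        cells = math.ceil(img_w / stride) * math.ceil(img_h / stride)
        total += cells * anchors_per_cell
    return total


# The count is the same whether the image holds zero faces or a crowd.
print(dense_prediction_count(640, 480))  # 12600
```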

5 Datasets and Evaluation

5.1 Datasets


8 Conclusions and Discussions

  • The paper surveys a large number of recent face detection algorithms and related baseline algorithms, and compares their performance (mainly accuracy and latency).
  • Future face detection research could focus on the following: (1) ultra-fast face detection, i.e. processing 1080P images on low-compute edge devices, with FLOPs below 100M; (2) detecting faces in the long-tail distribution.
  • The ultimate goal of face detection is to detect faces with high accuracy and high efficiency, so that the algorithms can be deployed on a variety of edge devices and centralized servers and improve the perception ability of computers. There is still a considerable gap here: face detectors can already achieve good accuracy but still require a lot of computation, so the next step should be to improve efficiency.
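To get a feel for how tight the 100M FLOPs target for 1080P input is, the following back-of-the-envelope sketch converts it into a per-pixel budget. The example layer and the counting rule (2 FLOPs per multiply-accumulate) are assumptions for illustration and are not taken from the paper.

```python
def flops_per_input_pixel(budget_flops, width, height):
    """Average number of FLOPs the whole network may spend per input pixel."""
    return budget_flops / (width * height)


def conv_flops_per_output_pixel(c_in, c_out, kernel, flops_per_mac=2):
    """FLOPs one convolution spends per output location (assumed 2 per MAC)."""
    return c_in * kernel * kernel * c_out * flops_per_mac


budget = flops_per_input_pixel(100e6, 1920, 1080)
first_layer = conv_flops_per_output_pixel(3, 16, 3)  # hypothetical first layer
print(f"budget: {budget:.1f} FLOPs per input pixel")  # about 48.2
print(f"3x3 conv, 3->16 channels: {first_layer} FLOPs per output pixel")  # 864
```

Under these assumptions, a single ordinary 3x3 convolution applied at full resolution would already exceed the budget for the entire network, which suggests that such a detector would need to downsample very early or use very thin layers.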

Origin blog.csdn.net/weixin_41807182/article/details/127978547