PFLD: A Practical Facial Landmark Detector Study Notes

PFLD paper: https://arxiv.org/pdf/1902.10859v2.pdf
PFLD source code: https://github.com/polarisZhao/PFLD-pytorch

Abstract

Accuracy, efficiency, and compactness are the keys for a facial landmark detector to be practical. To address these three concerns simultaneously, this paper investigates a neat model that achieves promising detection accuracy in the wild (e.g., unconstrained pose, expression, lighting, and occlusion conditions) and super real-time speed on a mobile device. More specifically, the paper customizes an end-to-end single-stage network combined with acceleration techniques. During training, rotation information is estimated for each sample to geometrically regularize landmark localization; this estimation is not involved in the testing phase. In addition to geometric regularization, a new loss is designed to mitigate data imbalance by adjusting the weights of samples in different states, such as large pose, extreme lighting, and occlusion, in the training set. Extensive experiments demonstrate the effectiveness of the design and show that it outperforms state-of-the-art alternatives. The model can be as small as 2.1Mb and reach over 140 fps per face on a mobile phone (Qualcomm ARM 845 processor) with high precision, which makes it attractive for large-scale or real-time applications. The authors have released a practical system based on the PFLD 0.25X model at http://sites.google.com/view/xjguo/fld to encourage comparison and improvement from the community.

1. Introduction

The difficulties of accurate key point detection are summarized as the following four challenges:
(1) Local variation: expressions, local extreme lighting (such as highlights and shadows), and occlusion change or interfere with parts of the face image;
(2) Global variation: pose and imaging quality are the two main factors affecting the overall appearance of a face in an image. When the global structure of the face is misestimated, a large number of key points are localized poorly;
(3) Data imbalance: an unbalanced data distribution keeps the model/algorithm from learning the feature representation of the data properly, which hurts final performance;
(4) Model efficiency: the two major constraints on applicability are model size and computing requirements.

The main contributions of this paper:
(1) The main purpose of the paper is to show that a good design can save a lot of resources while still achieving state-of-the-art (SOTA) performance;
(2) The PFLD algorithm is proposed;
(3) To improve the robustness of the algorithm (through geometric constraints) and to address data imbalance, the paper proposes a new loss;
(4) An MS-FC (multi-scale fully connected) layer is used to enlarge the receptive field and better capture the global structure of the face;
(5) For processing speed and model compactness, the paper uses MobileNet blocks to build the PFLD backbone network;
(6) The effectiveness of the algorithm is demonstrated by extensive experiments.

2. Methodology

This section mainly covers the following aspects:
1. The design of the loss, which aims to address Challenges #1, #2, and #3 at the same time;
2. A detailed description of the network structure, which addresses Challenge #4;
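
As a rough illustration of the loss idea, the pure-Python sketch below follows the general form of the loss in the PFLD paper: each sample's squared landmark error is scaled by a geometric term (the sum over yaw, pitch, and roll of 1 - cos of the angle deviation) and by a weight that compensates for data imbalance. The function name, the argument layout, and the precomputed per-sample weight `attr_weight` are assumptions made for illustration; this is not the code from the linked repository.

```python
import math

def pfld_style_loss(pred, target, angle_pred, angle_gt, attr_weight):
    """Weighted landmark loss averaged over a batch (illustrative sketch).

    pred, target         : list of samples, each a list of (x, y) landmarks
    angle_pred, angle_gt : list of (yaw, pitch, roll) tuples in radians
    attr_weight          : per-sample imbalance weights (assumed precomputed)
    """
    total = 0.0
    for p, t, a_p, a_g, w in zip(pred, target, angle_pred, angle_gt, attr_weight):
        # Geometric term: a larger angle deviation gives a larger penalty.
        angle_term = sum(1.0 - math.cos(ap - ag) for ap, ag in zip(a_p, a_g))
        # Squared L2 distance summed over all landmarks of the sample.
        dist_term = sum((px - tx) ** 2 + (py - ty) ** 2
                        for (px, py), (tx, ty) in zip(p, t))
        total += w * angle_term * dist_term
    return total / len(pred)
```

In this sketch, samples from rare conditions (a larger `attr_weight`) or with a large pose deviation contribute more to the total, which is how a single loss can target Challenges #1 to #3 together.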

2.2 Backbone Network

The detailed configuration of the backbone is shown in Table 1:
[Image: Table 1, backbone configuration]

2.3 Auxiliary Network

A suitable auxiliary constraint benefits the stability and robustness of key point localization, and the auxiliary network proposed in the paper plays this role. Its input comes from the fourth block of the backbone (this appears to refer to the fourth bottleneck; see the code for details). The structure of the auxiliary network is as follows:
[Image: auxiliary network structure]

2.4 Implementation Details

  • All faces are first cropped according to the given bounding box and resized to 112 × 112;
  • The batch size is 256;
  • The optimizer is Adam, with a weight decay of 1e-6 and a momentum of 0.9;
  • The learning rate is fixed at 1e-4 throughout training;
  • GPU: 1080Ti;
  • For 300W, the training data is augmented by flipping and rotation (-30° to 30°, step 5°); in addition, 20% of the area of each sample is randomly occluded;
  • For AFLW, no augmentation is applied;
  • In the testing phase, only the backbone is included in the network;
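
The settings above can be collected in a small sketch like the following. It only uses the Python standard library; the dictionary keys, the helper name `random_occlusion_box`, the inclusive rotation endpoints, and the square shape of the occluder are all assumptions for illustration, since the notes only give the numeric values.

```python
import random

# Hyperparameters as listed in the notes (key names are hypothetical).
TRAIN_CONFIG = {
    "input_size": 112,
    "batch_size": 256,
    "optimizer": "Adam",
    "weight_decay": 1e-6,
    "momentum": 0.9,  # for Adam this presumably maps to beta1 (assumption)
    "lr": 1e-4,
}

# Rotation angles for 300W augmentation; inclusive endpoints are an assumption.
ROTATION_ANGLES = list(range(-30, 31, 5))

def random_occlusion_box(size=112, area_frac=0.2, rng=random):
    """Return (x0, y0, x1, y1) of a square covering about area_frac of the crop.

    A square occluder is an assumption; the notes only state the 20% figure.
    """
    side = int(size * area_frac ** 0.5)
    x0 = rng.randint(0, size - side)
    y0 = rng.randint(0, size - side)
    return x0, y0, x0 + side, y0 + side
```

Passing a seeded `random.Random` instance as `rng` makes the occlusion reproducible, which can help when debugging an augmentation pipeline.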

3. Experimental Evaluation

3.1 Experimental Settings

Datasets: 300W and AFLW;
Evaluation metrics: Normalized Mean Error (NME) and Cumulative Error Distribution (CED) curve (for details, see https://blog.csdn.net/john_bh/article/details/106096775).
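
For readers unfamiliar with these metrics, here is a minimal sketch of how they are commonly computed. NME averages the per-landmark Euclidean error and divides it by a normalizing distance (for 300W this is typically the inter-ocular or inter-pupil distance); a CED curve reports, for each error threshold, the fraction of test samples whose NME does not exceed it. The function names and argument layout are illustrative assumptions, not taken from the paper or repository.

```python
import math

def nme(pred, gt, norm_dist):
    """Mean per-landmark Euclidean error divided by a normalizing distance."""
    errors = [math.hypot(px - gx, py - gy)
              for (px, py), (gx, gy) in zip(pred, gt)]
    return sum(errors) / (len(errors) * norm_dist)

def ced(nme_values, thresholds):
    """Fraction of samples whose NME is at or below each threshold."""
    n = len(nme_values)
    return [sum(v <= t for v in nme_values) / n for t in thresholds]
```

Plotting the output of `ced` against its thresholds gives the CED curve; a curve that rises faster indicates a detector with more low-error samples.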

3.2 Experimental Results

Model Size
[Image: model size and processing speed tables]
Processing Speed
Processing speed is shown in Table 3 above (C means CPU, G means GPU, A means the Qualcomm ARM 845 processor).
On the Qualcomm ARM 845 processor, the PFLD 0.25X model processes a face in 7 ms (over 140 fps), and PFLD 1.0X processes a face in 26.4 ms (over 37 fps).
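
The frame rates quoted above follow directly from the per-face latency. A tiny sketch of the conversion (the helper name is hypothetical):

```python
def fps_from_latency_ms(latency_ms):
    """Convert per-face latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms

print(round(fps_from_latency_ms(7.0), 1))   # 142.9
print(round(fps_from_latency_ms(26.4), 1))  # 37.9
```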
Ablation Study
The ablation study mainly illustrates the effectiveness of the newly designed loss. For the IPN evaluation metric in the table below, see https://blog.csdn.net/john_bh/article/details/106096775.
[Image: ablation study results table]

4. Concluding Remarks

  • To handle large-scale and/or real-time tasks, a face key point detector needs to pay attention to three aspects: accuracy, efficiency, and model compactness;
  • The paper proposes a practical face key point detection algorithm, PFLD. It consists of two sub-networks: a backbone network (mainly composed of MobileNet blocks) and an auxiliary network;
  • The network also contains a multi-scale fully connected layer that enlarges the receptive field and improves the ability to capture the face structure;
  • To further regularize key point localization, the paper adds another branch, the auxiliary network, which effectively estimates rotation information;
  • A new loss is designed that takes both geometric regularization and data imbalance into account;
  • Extensive experimental results demonstrate the superiority of the proposed algorithm.
