[Face Detection] MTCNN+Arcsoftmax

1 Preface

This project implements face recognition using MTCNN for face detection and Arcsoftmax for feature extraction.

2 Face detection

2.1 Overall process

Overall idea: a cascade, in which the output of one network is used as the input of the next.
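The cascade can be sketched as a chain of stages, each one filtering the candidates produced by the previous one. This is only an illustration, not the project's detector: the stage names `pnet_stage`, `rnet_stage`, and `onet_stage`, the box format, and the thresholds are hypothetical stand-ins for real network inference.

```python
from typing import Callable, List, Tuple

# A candidate is (x1, y1, x2, y2, confidence); this format is an assumption.
Box = Tuple[float, float, float, float, float]
Stage = Callable[[List[Box]], List[Box]]


def make_stage(threshold: float) -> Stage:
    """Build a hypothetical stage that keeps boxes scoring above `threshold`.

    A real stage would crop each box, run the network on the crop, and
    update both the score and the coordinates.
    """
    def stage(boxes: List[Box]) -> List[Box]:
        return [b for b in boxes if b[4] > threshold]
    return stage


def run_cascade(candidates: List[Box], stages: List[Stage]) -> List[Box]:
    """Feed the output of each stage into the next one."""
    for stage in stages:
        candidates = stage(candidates)
        if not candidates:  # nothing left to refine
            break
    return candidates


# Hypothetical thresholds, loosest first.
pnet_stage = make_stage(0.6)
rnet_stage = make_stage(0.7)
onet_stage = make_stage(0.9)

proposals = [(0, 0, 12, 12, 0.65), (5, 5, 40, 40, 0.95), (8, 8, 30, 30, 0.75)]
faces = run_cascade(proposals, [pnet_stage, rnet_stage, onet_stage])
print(faces)  # [(5, 5, 40, 40, 0.95)]
```

Each later stage sees fewer candidates than the one before it, which is why the cheapest network runs first.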

2.2 Network structure


2.2.1 Pnet


Function : Determine whether a face is present, output a face box and key points, and provide suggestion boxes for Rnet

Input : 12x12x3 image

Output :
1. Face probability. I use a binary classification loss, so the output is 1x1x1, i.e. one channel (with a multi-class loss, 2 channels could be output instead).

There are two reasons for outputting a single channel: first, the loss function is a binary classification loss; second, a single channel is convenient when back-calculating coordinates.


2. Offsets of the face detection box coordinates (upper-left and lower-right corners). The output is 1x1x4, i.e. 4 channels.
3. Face key point coordinates. The output is 1x1x10 (not implemented in the code).
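The coordinate back-calculation mentioned for the confidence output can be sketched as follows. When Pnet is applied convolutionally to an image larger than 12x12, each cell of the output map corresponds to one 12x12 window of the input. The sketch assumes the layer layout of the original MTCNN Pnet (an overall stride of 2) and an image resized by a factor `scale` before inference; the function name and its parameters are illustrative and not taken from the project code.

```python
def cell_to_box(row, col, scale=1.0, stride=2, side=12):
    """Map a Pnet output cell back to a box in the original image.

    Assumptions: the network has an overall stride of `stride`, each cell
    sees a `side` x `side` window, and the image was resized by `scale`
    before being fed to the network.
    """
    x1 = col * stride / scale
    y1 = row * stride / scale
    x2 = (col * stride + side) / scale
    y2 = (row * stride + side) / scale
    return x1, y1, x2, y2


# Cell (row=3, col=5) on an image that was shrunk to half size.
box = cell_to_box(3, 5, scale=0.5)
print(box)  # (20.0, 12.0, 44.0, 36.0)
```

With a single confidence channel, the indices of the cells above the threshold can be passed straight into a function like this one.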

2.2.2 Rnet


Function : Further screen the suggestion boxes output by Pnet and further improve the accuracy of the face / no-face decision

Input : 24x24x3

Output : Fully connected
1. 1 value: face / no face
2. 4 values: suggestion box coordinate offsets
3. 10 values: key point coordinate offsets (not implemented in the code)
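To show how a coordinate offset refines a suggestion box, here is a minimal sketch. It assumes the common convention that each offset is the difference between the target coordinate and the box coordinate, divided by the box width or height; the project may define its offsets differently, and the function name is hypothetical.

```python
def apply_offsets(box, offsets):
    """Refine (x1, y1, x2, y2) with normalized offsets (dx1, dy1, dx2, dy2).

    Assumed convention: offset = (target - box) / side length, so the
    refined coordinate is box + offset * side length.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    dx1, dy1, dx2, dy2 = offsets
    return (x1 + dx1 * w, y1 + dy1 * h, x2 + dx2 * w, y2 + dy2 * h)


refined = apply_offsets((10, 10, 34, 34), (0.25, 0.0, -0.125, 0.5))
print(refined)  # (16.0, 10.0, 31.0, 46.0)
```

Normalizing by the side length keeps the regression targets in a similar range regardless of how large the box is.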

2.2.3 Onet


Function : Further refine the output of Rnet and further improve the accuracy of the face / no-face decision
Difference : Unlike the other networks, Onet has 6 layers in total

Input : 48x48x3
Output : Fully connected
1. 1 value: face / no face
2. 4 values: suggestion box coordinate offsets
3. 10 values: key point coordinate offsets (not implemented in the code)

2.3 Data processing

Dataset used: CelebA
Download link: https://pan.baidu.com/s/1_e8pSnfeMT0fCFEtvm8pSA
Extraction code: wpav

2.3.1 Data generation

Project folder composition
Idea:

2.3.2 dataset

Note: In my implementation, the dataset returns the image data, confidence, and offset as separate values (the confidence and offset could also be returned together).
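A minimal sketch of a dataset with this kind of return value is shown below. The label-line format (`path confidence dx1 dy1 dx2 dy2`) and the class name `FaceSamples` are hypothetical, and image loading is left out so the example has no dependencies; a real implementation would decode the image and convert it to a tensor.

```python
class FaceSamples:
    """Hypothetical dataset over label lines: 'path conf dx1 dy1 dx2 dy2'."""

    def __init__(self, label_lines):
        # Skip blank lines and split each label into its fields.
        self.lines = [ln.split() for ln in label_lines if ln.strip()]

    def __len__(self):
        return len(self.lines)

    def __getitem__(self, index):
        path, conf, *offs = self.lines[index]
        confidence = [float(conf)]          # one value
        offset = [float(v) for v in offs]   # four values
        # A real dataset would return the decoded image instead of the path.
        return path, confidence, offset


labels = [
    "positive/0.jpg 1 0.05 -0.10 0.02 0.08",
    "negative/0.jpg 0 0 0 0 0",
]
dataset = FaceSamples(labels)
image_path, confidence, offset = dataset[0]
print(image_path, confidence, offset)
```

Returning the confidence and the offset separately lets each one be passed to its own loss term without slicing.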

Question:
Training

2.4 Network training

2.4.1 Loss function
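As noted in section 2.2.1, the confidence output is trained with a binary classification loss. The sketch below computes binary cross-entropy for one confidence value and, as an assumption not confirmed by this article, mean squared error for the four offset values. Plain Python is used so that the arithmetic is visible; the function names are illustrative.

```python
import math


def bce(pred, target, eps=1e-7):
    """Binary cross-entropy for one predicted probability."""
    pred = min(max(pred, eps), 1 - eps)  # avoid log(0)
    return -(target * math.log(pred) + (1 - target) * math.log(1 - pred))


def mse(pred, target):
    """Mean squared error over the offset values (assumed offset loss)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


conf_loss = bce(0.9, 1.0)
offset_loss = mse([0.1, 0.0, -0.1, 0.2], [0.0, 0.0, 0.0, 0.0])
total_loss = conf_loss + offset_loss
print(round(conf_loss, 4), round(offset_loss, 4))
```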

2.4.2 Optimizer

2.4.3 Detector

2.5 Gadgets

2.5.1 IOU
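IOU (intersection over union) measures how much two boxes overlap: the area of their intersection divided by the area of their union. Below is a minimal sketch, assuming boxes are given as `(x1, y1, x2, y2)` with the upper-left corner first; the project code may use a different box layout or a variant of the formula.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


overlap = iou((0, 0, 10, 10), (5, 5, 15, 15))
print(overlap)  # 25 / 175
```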

2.5.2 NMS non-maximum suppression
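Non-maximum suppression keeps the highest-scoring box, discards every remaining box that overlaps it by more than a threshold, and repeats on what is left. The sketch below is a plain greedy version, assuming boxes are `(x1, y1, x2, y2, score)` tuples; the default threshold of 0.3 is an arbitrary illustrative value, and the project code may differ.

```python
def iou(a, b):
    """Intersection over union of two boxes (first four values are used)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, threshold=0.3):
    """Greedy NMS over (x1, y1, x2, y2, score) boxes; returns kept boxes."""
    remaining = sorted(boxes, key=lambda b: b[4], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)  # highest score still in the list
        kept.append(best)
        # Drop boxes that overlap the kept box too much.
        remaining = [b for b in remaining if iou(best, b) <= threshold]
    return kept


boxes = [(0, 0, 10, 10, 0.9), (1, 1, 11, 11, 0.8), (20, 20, 30, 30, 0.7)]
kept = nms(boxes)
print(kept)  # [(0, 0, 10, 10, 0.9), (20, 20, 30, 30, 0.7)]
```

The second box overlaps the first with an IOU of about 0.68, so it is suppressed; the third box does not overlap either and is kept.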

3 Feature extraction

3.1 Idea

4 Feature comparison


Origin blog.csdn.net/qq_43586192/article/details/113846119