Training a face recognition deep network (OpenFace) with the triplet loss function

Git:  http://cmusatyalab.github.io/openface/

FaceNet's innovation comes from four distinct factors: (a) the triplet loss, (b) its triplet selection procedure, (c) training with 100 million to 200 million labeled images, and (d) (not discussed here) large-scale experimentation to find a network architecture.

First, resize each face to 96×96.

Input images (FaceNet used 100M–200M labeled pictures).

Face detection (locating the faces), then preprocessing (scale normalization, grayscale correction, and an affine transformation for each face).

The aligned faces are input to the neural network for feature extraction, finally yielding the face representation.

The representations are then classified with scikit-learn's SVM (scikit-learn is a Python library).
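As a concrete illustration of the final classification step, here is a minimal sketch that trains scikit-learn's SVM on face embeddings. The 128-D vectors and the identity labels are synthetic stand-ins, not real OpenFace output.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in embeddings for 3 identities, 10 images each (128-D each).
embeddings = rng.normal(size=(30, 128))
labels = np.repeat(["alice", "bob", "carol"], 10)

# Train a linear SVM on the embeddings, as the pipeline above describes.
clf = SVC(kernel="linear")
clf.fit(embeddings, labels)

# Classify a new face embedding.
query = rng.normal(size=(1, 128))
print(clf.predict(query)[0])
```

In practice the embeddings would come from the trained network's last layer rather than a random generator.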

Figure 1 Model training structure

Triplet loss operates on a set of three images: one anchor (standard) image, one positive sample (the same person as the anchor), and one negative sample (a different person).

The entire network is adjusted via this loss. The formula is as follows, and the idea behind it is introduced at the end of the article:

Figure 2 Triplet loss formula

Preprocessing resizes each face to 96×96 and uses a simple 2D affine transformation to normalize (align) it; the neural network is then trained to produce a low-dimensional face representation (i.e., the network extracts the features).
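The 2D affine normalization can be sketched with plain NumPy: solve for the 2×3 affine matrix that maps three detected landmarks onto fixed template positions inside the 96×96 crop. The landmark and template coordinates here are made up for illustration.

```python
import numpy as np

# Hypothetical detected landmarks: left eye, right eye, nose tip.
src = np.array([[70., 80.], [130., 80.], [100., 120.]])
# Template positions for those landmarks inside the 96x96 aligned crop.
dst = np.array([[30., 35.], [66., 35.], [48., 60.]])

# Solve for the 2x3 affine matrix A with A @ [x, y, 1]^T = dst point.
S = np.hstack([src, np.ones((3, 1))])   # 3x3 homogeneous source points
A = np.linalg.solve(S, dst).T           # 2x3 affine matrix

# Applying A to the source landmarks reproduces the template exactly.
mapped = (A @ S.T).T
print(np.allclose(mapped, dst))
```

In a real pipeline the matrix would then be used to warp the face image (e.g., with an image library's affine-warp routine) into the 96×96 aligned crop.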

OpenFace is trained with 500k images obtained by combining the two largest labeled face recognition datasets available for research.

The network provides an embedding on the unit hypersphere, where Euclidean distance represents similarity.
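The unit-hypersphere property matters because it ties Euclidean distance directly to cosine similarity: for unit vectors, ||a − b||² = 2 − 2·cos(a, b). A small NumPy check with made-up 128-D vectors:

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = rng.normal(size=(2, 128))
a /= np.linalg.norm(a)   # L2-normalize onto the unit hypersphere
b /= np.linalg.norm(b)

dist_sq = np.sum((a - b) ** 2)
cos_sim = np.dot(a, b)

# For unit vectors: ||a - b||^2 = 2 - 2 * cos(a, b)
print(np.isclose(dist_sq, 2 - 2 * cos_sim))  # True
```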

Logic flow:

Figure 3 Model logic flow


Finally, the neural network extracts features to form the initial model's face representation, as shown below:

 

 
Figure 4 Torch combined with Python

Error function: triplet loss

Finally, let's discuss the error function, the triplet loss, which is based on metric learning. Its idea originates from the following constraint:

||f(x_a^i) − f(x_p^i)||² + threshold < ||f(x_a^i) − f(x_n^i)||²

where x_a^i is the reference (anchor) sample, x_p^i is a positive sample (same identity), x_n^i is a negative sample (different identity), f(·) is the embedding, and threshold is a fixed margin. This inequality can be rewritten in the following form:

||f(x_a^i) − f(x_p^i)||² − ||f(x_a^i) − f(x_n^i)||² + threshold < 0

This inequality defines the required distance relationship between same-identity and different-identity samples: the distance from an anchor to any same-identity sample, plus the margin threshold, must be smaller than its distance to any different-identity sample. When a triplet violates this inequality, we can adjust the entire network through back-propagation by minimizing the following error function:

L = Σ_i max(0, ||f(x_a^i) − f(x_p^i)||² − ||f(x_a^i) − f(x_n^i)||² + threshold)

The error is non-zero only when the expression inside the parentheses is greater than 0. From this formula, the gradient directions with respect to x_a^i, x_p^i, and x_n^i can each be computed, and the preceding network adjusted via the back-propagation algorithm.
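A NumPy sketch of this error function and its hand-derived gradients (an illustration of the math above, not OpenFace's actual training code):

```python
import numpy as np

def triplet_loss_and_grads(a, p, n, margin=0.2):
    """Hinge triplet loss max(0, d_pos - d_neg + margin) and its
    gradients w.r.t. the anchor, positive, and negative embeddings."""
    d_pos = np.sum((a - p) ** 2)
    d_neg = np.sum((a - n) ** 2)
    loss = max(0.0, d_pos - d_neg + margin)
    if loss == 0.0:  # constraint already satisfied: no error, no gradient
        z = np.zeros_like(a)
        return 0.0, z, z, z
    # d/da = 2(a-p) - 2(a-n) = 2(n-p);  d/dp = -2(a-p);  d/dn = 2(a-n)
    return loss, 2 * (n - p), -2 * (a - p), 2 * (a - n)

rng = np.random.default_rng(2)
a, p, n = rng.normal(size=(3, 128))
loss, ga, gp, gn = triplet_loss_and_grads(a, p, n)
print(loss >= 0.0)  # True
```

These gradients are what back-propagation pushes into the layers before the embedding, pulling same-identity pairs together and pushing different-identity pairs apart.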

In FaceNet, the authors combine this loss with the network structures proposed by Zeiler & Fergus and with GoogLeNet to realize face recognition with high accuracy.

To verify the effectiveness of the triplet loss, we trained another deep convolutional network with it on the WebFace database for face verification; the network structure is described in the WebFace paper. Unlike FaceNet, we did not adopt the authors' semi-hard sample selection strategy, but instead directly increased the number of samples per batch. Thanks to dual Titan X graphics cards, the batch size reaches 540; a sufficiently large batch ensures that the resulting gradient direction is similar to the one obtained with the semi-hard strategy.
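For reference, the semi-hard selection strategy mentioned above can be sketched as follows: for a given anchor–positive pair, choose a negative whose distance to the anchor exceeds the positive distance but still violates the margin. The function name and all data here are hypothetical.

```python
import numpy as np

def semi_hard_negative(anchor, positive, negatives, margin=0.2):
    """Return the index of a semi-hard negative: one with
    d_pos < d_neg < d_pos + margin. Falls back to the hardest
    (closest) negative when no semi-hard candidate exists."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_negs = np.sum((negatives - anchor) ** 2, axis=1)
    mask = (d_negs > d_pos) & (d_negs < d_pos + margin)
    if not mask.any():
        return int(np.argmin(d_negs))
    candidates = np.flatnonzero(mask)
    return int(candidates[np.argmin(d_negs[candidates])])

rng = np.random.default_rng(3)
anchor, positive = rng.normal(size=(2, 128))
negatives = rng.normal(size=(50, 128))
idx = semi_hard_negative(anchor, positive, negatives)
print(0 <= idx < 50)  # True
```

With a large enough batch, simply averaging the gradients of all valid triplets approximates the effect of this selection, which is the trade-off the experiment above relies on.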

After obtaining the triplet-loss-trained network, we apply the Joint-Bayesian method to the features extracted from the network's last layer to learn a similarity estimation model. The ROC curves comparing the final model with DeepID on the LFW test set are shown in the following figure:

OpenFace's nn4.small2 network, an improvement on FaceNet
