05-Face recognition-An intuitive understanding of FaceNet

Source link: https://github.com/davidsandberg/facenet

Paper link: https://arxiv.org/pdf/1503.03832.pdf

Video walkthrough of the paper by an expert on Bilibili: https://www.bilibili.com/video/av17281188

 

FaceNet is a network for face recognition. Face-related work is usually divided into 2 tasks:

1. Face detection (find the faces in the picture, locate their landmarks, and align them)

2. Face recognition (determine who the face belongs to)

 

In this series of essays, MTCNN is used for face detection and FaceNet is used for face recognition. Let's talk about FaceNet.

 

3 tasks of FaceNet

  • Face verification (is this the same person? yes or no)
  • Face recognition (who is this person?)
  • Face clustering (which faces are similar to this one?)

 

The general process of FaceNet

Face to be judged (already extracted, no alignment required) -> FaceNet network -> embedding (feature vector of the face)

--- Task 1 ---> Compute the L2 distance to the feature vector of a known face -> if the distance is below the threshold, it is the same face.

--- Task 2 ---> KNN (nearest-neighbor) classification to decide whose face it is (for nearest-neighbor retrieval, refer to other essays).

--- Task 3 ---> K-means clustering to find groups of similar faces.
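
The three tasks above can be sketched in a few lines once the embeddings exist. This is only a minimal illustration, not the code of the facenet repository: the 2-D vectors stand in for real embeddings, and the names `is_same_face`, `identify`, `kmeans`, and the threshold 0.5 are all assumptions made for the example.

```python
import math

def l2_distance(u, v):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def is_same_face(emb, known_emb, threshold):
    """Task 1: verification by thresholding the L2 distance."""
    return l2_distance(emb, known_emb) < threshold

def identify(emb, gallery, k=1):
    """Task 2: k-nearest-neighbor vote over (name, embedding) pairs."""
    nearest = sorted(gallery, key=lambda item: l2_distance(emb, item[1]))[:k]
    votes = {}
    for name, _ in nearest:
        votes[name] = votes.get(name, 0) + 1
    return max(votes, key=votes.get)

def kmeans(embs, k, iters=10):
    """Task 3: plain k-means; the first k embeddings seed the centroids."""
    centroids = [list(e) for e in embs[:k]]
    labels = [0] * len(embs)
    for _ in range(iters):
        # Assignment step: each embedding joins its nearest centroid.
        labels = [min(range(k), key=lambda c: l2_distance(e, centroids[c]))
                  for e in embs]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [e for e, lab in zip(embs, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy data (hypothetical): 2-D stand-ins for face embeddings.
gallery = [("alice", [0.0, 0.0]), ("alice", [0.1, 0.0]), ("bob", [1.0, 1.0])]
query = [0.05, 0.05]

print(is_same_face(query, gallery[0][1], threshold=0.5))
print(identify(query, gallery, k=1))
print(kmeans([emb for _, emb in gallery] + [[0.9, 1.0]], k=2))
```

In practice the threshold has to be tuned on validation pairs, and a library implementation of KNN and K-means would normally replace these hand-written loops.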

 

FaceNet Technology

As the flow above shows, the most important part is the design of the FaceNet network itself. All three tasks are implemented with traditional techniques applied to the face embedding that the network outputs.

The first technology of FaceNet:

  • Triplets

    Triplets are explained in my essay " 04-Face recognition-triplets loss explanation (reprint) "

http://www.cnblogs.com/alexYuin/p/8855972.html

    In use, one training sample consists of 3 images:

(anchor: the face to be judged; positive: a face of the same person as the anchor; negative: a face of a different person from the anchor)

    This is a triplet.
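
To make the structure of a triplet concrete, here is a minimal sketch that samples triplets at random from a labeled image set. The dictionary layout, the image ids, and the function name `sample_triplets` are hypothetical. Random sampling is used only for simplicity; the FaceNet paper instead selects triplets online within each mini-batch, so this is not the paper's selection strategy.

```python
import random

def sample_triplets(faces_by_person, n_triplets, seed=0):
    """Randomly sample (anchor, positive, negative) triplets.

    faces_by_person maps a person id to a list of image ids
    (a hypothetical layout chosen for this example).
    """
    rng = random.Random(seed)
    # Only people with at least two images can supply an anchor/positive pair.
    eligible = [p for p, imgs in faces_by_person.items() if len(imgs) >= 2]
    non_empty = [p for p, imgs in faces_by_person.items() if imgs]
    if not eligible or len(non_empty) < 2:
        raise ValueError("need a person with >= 2 images and a second person")
    triplets = []
    for _ in range(n_triplets):
        person = rng.choice(eligible)
        # Anchor and positive: two different images of the same person.
        anchor, positive = rng.sample(faces_by_person[person], 2)
        # Negative: any image of a different person.
        other = rng.choice([p for p in non_empty if p != person])
        negative = rng.choice(faces_by_person[other])
        triplets.append((anchor, positive, negative))
    return triplets

# Hypothetical image ids grouped by identity.
faces = {"alice": ["a1", "a2", "a3"], "bob": ["b1", "b2"], "carol": ["c1"]}
for triplet in sample_triplets(faces, n_triplets=3):
    print(triplet)
```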

 

The second technology of FaceNet:

  • Triplet loss function

    See my essay " 04-Face recognition-triplets loss explanation (reprint) "

http://www.cnblogs.com/alexYuin/p/8855972.html

    Here, the core of the triplet is the anchor.

    As explained in essay 04, the loss L equals 0 when (distance from anchor to positive) + alpha <= (distance from anchor to negative). This is the result we want, so it is the direction of optimization.
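
The condition above can be checked numerically with a small sketch of the loss for a single triplet. It assumes the hinge form max(d(a, p) - d(a, n) + alpha, 0) with the squared L2 distance, as in the FaceNet paper; the function names, the 2-D toy embeddings, and the default alpha=0.2 are illustrative choices for this example.

```python
def squared_l2(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, alpha=0.2):
    """Loss for one triplet: max(d(a, p) - d(a, n) + alpha, 0)."""
    d_ap = squared_l2(anchor, positive)
    d_an = squared_l2(anchor, negative)
    return max(d_ap - d_an + alpha, 0.0)

anchor = [0.0, 0.0]
positive = [0.1, 0.0]

# Negative far away: d(a, p) + alpha <= d(a, n), so the loss is zero.
print(triplet_loss(anchor, positive, [1.0, 0.0]))

# Negative too close: the margin is violated, so the loss is positive.
print(triplet_loss(anchor, positive, [0.3, 0.0]))
```

A triplet with zero loss contributes no gradient, which is why training benefits from triplets that still violate the margin.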

 

What a FaceNet network could look like

 

The training process of FaceNet

 
