My understanding of the principle here comes from reading the code, so it is not necessarily exactly right.
What is discussed here is face recognition, not face detection. I already understand the detection principle: even without dlib it can be done with various R-CNN variants. Here I want to get to the bottom of recognition.
dlib first detects the face, then feeds it to a ResNet that produces a 128-dimensional vector. ResNet comes in several different depths (image from https://raw.githubusercontent.com/raghakot/keras-resnet/master/images/architecture.png ).
dlib uses the 34-layer network; see http://dlib.net/dnn_imagenet_train_ex.cpp.html , which refers to it as resnet34.
And the code below also looks like the same 34-layer network (frankly, my C++ is not good, so I cannot read it fluently):
template <typename SUBNET> using level1 = res<512,res<512,res_down<512,SUBNET>>>;
template <typename SUBNET> using level2 = res<256,res<256,res<256,res<256,res<256,res_down<256,SUBNET>>>>>>;
template <typename SUBNET> using level3 = res<128,res<128,res<128,res_down<128,SUBNET>>>>;
template <typename SUBNET> using level4 = res<64,res<64,res<64,SUBNET>>>;
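As a sanity check, the residual block counts declared in the templates above (3 blocks at 512, 6 at 256, 4 at 128, 3 at 64) do add up to 34 layers: each residual block contains two convolutions, plus the initial stem convolution and the final fully connected layer. A quick arithmetic check:

```python
# Residual blocks per level, as declared in the dlib templates above:
# level1: 3 blocks (512), level2: 6 blocks (256),
# level3: 4 blocks (128), level4: 3 blocks (64)
blocks = [3, 6, 4, 3]

convs_in_blocks = 2 * sum(blocks)  # each residual block holds 2 conv layers
initial_conv = 1                   # the stem convolution at the input
final_fc = 1                       # the fully connected output layer

total_layers = initial_conv + convs_in_blocks + final_fc
print(total_layers)  # 34
```

This matches the (3, 4, 6, 3) block layout of the standard ResNet-34.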
In more detail, it should look like the figure below (picture from the web).
The last layer of resnet34 is fc 1000, i.e. 1000 neurons.
So how does the ResNet produce a 128-dimensional vector?
Very simple: add a Dense(128) layer after fc1000.
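A Dense(128) layer is nothing more than a learned linear projection (plus bias) from the 1000-dimensional fc output down to 128 dimensions. A minimal numpy sketch, with random placeholder weights standing in for trained values:

```python
import numpy as np

rng = np.random.default_rng(0)

fc1000_out = rng.standard_normal(1000)  # stand-in for the fc1000 layer output
W = rng.standard_normal((128, 1000))    # Dense(128) weights (untrained placeholder)
b = np.zeros(128)                       # Dense(128) bias

embedding = W @ fc1000_out + b          # the 128-dimensional face vector
print(embedding.shape)  # (128,)
```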
Then compute the distance between the two generated vectors; that distance measures how similar the two faces are.
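The usual choice is the Euclidean distance between the two 128-d vectors; dlib's face recognition example treats a distance below roughly 0.6 as "same person" (the exact threshold depends on the model). A sketch with made-up embeddings:

```python
import numpy as np

def face_distance(v1, v2):
    """Euclidean distance between two 128-d face embeddings."""
    return np.linalg.norm(np.asarray(v1) - np.asarray(v2))

# Toy embeddings standing in for real model output.
a = np.zeros(128)
b = np.full(128, 0.04)

d = face_distance(a, b)
same_person = d < 0.6  # threshold used in dlib's face recognition example
print(d, same_person)
```

The smaller the distance, the more likely the two images show the same person.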
So how would we build a dlib-like face recognition network from scratch? First build a resnet34, follow it with a Dense(128), and follow that with a classification layer. After training is finished, discard the classification part behind the Dense(128) and keep only the front part with its trained parameters, so that feeding in one image yields one 128-dimensional vector.
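That workflow, training with a classification head and then throwing the head away, can be sketched with plain numpy. The training step itself is omitted, and all names, shapes, and the identity count are illustrative placeholders, not dlib's actual values:

```python
import numpy as np

rng = np.random.default_rng(1)
NUM_IDENTITIES = 500  # hypothetical number of people in the training set

# Parameters that would be learned during training (random placeholders here):
W_embed = rng.standard_normal((128, 1000))          # Dense(128) after fc1000
W_cls = rng.standard_normal((NUM_IDENTITIES, 128))  # classification head

def embed(fc1000_out):
    """Kept after training: maps the backbone's fc output to the 128-d vector."""
    return W_embed @ fc1000_out

def classify(fc1000_out):
    """Used only during training to compute the identity logits; discarded afterwards."""
    return W_cls @ embed(fc1000_out)

# At inference time we call embed() only:
x = rng.standard_normal(1000)  # stand-in for the resnet34 fc output of one image
vector = embed(x)
print(vector.shape)  # (128,)
```

Dropping `classify` and keeping `embed` is exactly the "discard the classification part, keep the front" step described above.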
Done.