How to build a face recognition system with DL4J

I. Overview

Face recognition is essentially a similarity problem. A model maps each face image to a point in a feature space, and images of the same face land close to each other. Closeness can be measured with cosine distance, Euclidean distance, or another metric. Consider the three avatars below.

A B C

Clearly A and C show the same face, while A and B show different faces. How do we describe this mathematically? Suppose there is a distance function d(x1,x2); then d(A,B) > d(A,C). In a real face recognition application, how small must d(x1,x2) be for two images to be recognized as the same face? This threshold depends on the parameters used when training the model. Note that if d is cosine similarity, the relationship is reversed: the larger the value, the more similar the faces. A general face recognition model therefore contains two units: feature extraction (that is, feature mapping) and distance calculation.
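To make the two measures concrete, here is a minimal plain-Java sketch that does not depend on DL4J. The class name FaceDistance, the method names, and the threshold parameter are illustrative assumptions and not part of the original project; in practice the vectors would be the features produced by the network built later in this post.

```java
// Minimal sketch (hypothetical helper, not from the original project):
// two common similarity measures over feature vectors.
final class FaceDistance {

    /** Cosine similarity in [-1, 1]; a larger value means more similar. Returns NaN for a zero vector. */
    static double cosineSimilarity(double[] x1, double[] x2) {
        checkSameLength(x1, x2);
        double dot = 0, norm1 = 0, norm2 = 0;
        for (int i = 0; i < x1.length; i++) {
            dot += x1[i] * x2[i];
            norm1 += x1[i] * x1[i];
            norm2 += x2[i] * x2[i];
        }
        return dot / (Math.sqrt(norm1) * Math.sqrt(norm2));
    }

    /** Euclidean distance; a smaller value means more similar. */
    static double euclideanDistance(double[] x1, double[] x2) {
        checkSameLength(x1, x2);
        double sum = 0;
        for (int i = 0; i < x1.length; i++) {
            double diff = x1[i] - x2[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Assumed decision rule: treat two faces as the same when the similarity reaches the threshold. */
    static boolean isSameFace(double[] x1, double[] x2, double threshold) {
        return cosineSimilarity(x1, x2) >= threshold;
    }

    private static void checkSameLength(double[] x1, double[] x2) {
        if (x1.length != x2.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
    }
}
```

Note the opposite directions: with Euclidean distance the "same face" test is distance <= threshold, while with cosine similarity it is similarity >= threshold.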

II. Building the model

So how do we perform feature mapping? For image processing, convolutional neural networks are the dominant approach at present. DeepLearning4J ships with a pretrained VGGFace model, which is based on VGG16. The download address of VGGFace is https://dl4jdata.blob.core.windows.net/models/vgg16_dl4j_vggface_inference.v1.zip. How do you find this address? Follow the VGG16 source code: the pretrainedUrl method calls DL4JResources.getURLString with the relative path of each model. The download addresses of other pretrained models, such as VGG19 and ResNet50, can be found in the same way. The source code is as follows:

public class VGG16 extends ZooModel {

    @Builder.Default private long seed = 1234;
    @Builder.Default private int[] inputShape = new int[] {3, 224, 224};
    @Builder.Default private int numClasses = 0;
    @Builder.Default private IUpdater updater = new Nesterovs();
    @Builder.Default private CacheMode cacheMode = CacheMode.NONE;
    @Builder.Default private WorkspaceMode workspaceMode = WorkspaceMode.ENABLED;
    @Builder.Default private ConvolutionLayer.AlgoMode cudnnAlgoMode = ConvolutionLayer.AlgoMode.PREFER_FASTEST;

    private VGG16() {}

    @Override
    public String pretrainedUrl(PretrainedType pretrainedType) {
        if (pretrainedType == PretrainedType.IMAGENET)
            return DL4JResources.getURLString("models/vgg16_dl4j_inference.zip");
        else if (pretrainedType == PretrainedType.CIFAR10)
            return DL4JResources.getURLString("models/vgg16_dl4j_cifar10_inference.v1.zip");
        else if (pretrainedType == PretrainedType.VGGFACE)
            return DL4JResources.getURLString("models/vgg16_dl4j_vggface_inference.v1.zip");
        else
            return null;
    }

    // ... rest of the class omitted
}

The structure of the VGG16 model is as follows:

====================================================================================================
VertexName (VertexType)        nIn,nOut     TotalParams   ParamsShape                  Vertex Inputs
====================================================================================================
input_1 (InputVertex)          -,-          -             -                            -            
conv1_1 (ConvolutionLayer)     3,64         1,792         W:{64,3,3,3}, b:{1,64}       [input_1]    
conv1_2 (ConvolutionLayer)     64,64        36,928        W:{64,64,3,3}, b:{1,64}      [conv1_1]    
pool1 (SubsamplingLayer)       -,-          0             -                            [conv1_2]    
conv2_1 (ConvolutionLayer)     64,128       73,856        W:{128,64,3,3}, b:{1,128}    [pool1]      
conv2_2 (ConvolutionLayer)     128,128      147,584       W:{128,128,3,3}, b:{1,128}   [conv2_1]    
pool2 (SubsamplingLayer)       -,-          0             -                            [conv2_2]    
conv3_1 (ConvolutionLayer)     128,256      295,168       W:{256,128,3,3}, b:{1,256}   [pool2]      
conv3_2 (ConvolutionLayer)     256,256      590,080       W:{256,256,3,3}, b:{1,256}   [conv3_1]    
conv3_3 (ConvolutionLayer)     256,256      590,080       W:{256,256,3,3}, b:{1,256}   [conv3_2]    
pool3 (SubsamplingLayer)       -,-          0             -                            [conv3_3]    
conv4_1 (ConvolutionLayer)     256,512      1,180,160     W:{512,256,3,3}, b:{1,512}   [pool3]      
conv4_2 (ConvolutionLayer)     512,512      2,359,808     W:{512,512,3,3}, b:{1,512}   [conv4_1]    
conv4_3 (ConvolutionLayer)     512,512      2,359,808     W:{512,512,3,3}, b:{1,512}   [conv4_2]    
pool4 (SubsamplingLayer)       -,-          0             -                            [conv4_3]    
conv5_1 (ConvolutionLayer)     512,512      2,359,808     W:{512,512,3,3}, b:{1,512}   [pool4]      
conv5_2 (ConvolutionLayer)     512,512      2,359,808     W:{512,512,3,3}, b:{1,512}   [conv5_1]    
conv5_3 (ConvolutionLayer)     512,512      2,359,808     W:{512,512,3,3}, b:{1,512}   [conv5_2]    
pool5 (SubsamplingLayer)       -,-          0             -                            [conv5_3]    
flatten (PreprocessorVertex)   -,-          -             -                            [pool5]      
fc6 (DenseLayer)               25088,4096   102,764,544   W:{25088,4096}, b:{1,4096}   [flatten]    
fc7 (DenseLayer)               4096,4096    16,781,312    W:{4096,4096}, b:{1,4096}    [fc6]        
fc8 (DenseLayer)               4096,2622    10,742,334    W:{4096,2622}, b:{1,2622}    [fc7]        
----------------------------------------------------------------------------------------------------
            Total Parameters:  145,002,878
        Trainable Parameters:  145,002,878
           Frozen Parameters:  0

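For VGGFace, we only need the convolutional and pooling layers to extract features; the fully connected layers (fc6, fc7, fc8) can be discarded. Our model then takes two input images, passes both through the shared convolutional stack, and computes the distance between the two resulting feature maps.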

Explanation: StackVertex and UnstackVertex are used because, by default, DL4J merges the tensors of a multi-input vertex into a single input, which does not let several inputs share the same weights. So we first use StackVertex to stack the input tensors along dimension 0, run the shared convolution and pooling layers over the stacked tensor to extract features, and then use UnstackVertex to split the result back into separate tensors for the distance calculation.
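The stacking idea can be illustrated without DL4J. The sketch below uses plain two-dimensional arrays in place of the real four-dimensional image tensors, and the class and method names are hypothetical; it only mirrors the semantics of stacking along dimension 0 and taking slice "from" out of "stackSize" equal slices, which is how UnstackVertex(from, stackSize) is used later in buildModel.

```java
import java.util.Arrays;

// Illustrative sketch only: rows stand in for examples in a mini-batch.
final class StackUnstackSketch {

    /** Concatenates two batches along dimension 0: [n, f] and [n, f] become [2n, f]. */
    static double[][] stack(double[][] batch1, double[][] batch2) {
        if (batch1.length != batch2.length) {
            throw new IllegalArgumentException("Both inputs must have the same batch size");
        }
        double[][] stacked = new double[batch1.length + batch2.length][];
        // Rows are copied by reference; that is enough to show the layout
        System.arraycopy(batch1, 0, stacked, 0, batch1.length);
        System.arraycopy(batch2, 0, stacked, batch1.length, batch2.length);
        return stacked;
    }

    /** Returns slice number 'from' out of 'stackSize' equal slices along dimension 0. */
    static double[][] unstack(double[][] stacked, int from, int stackSize) {
        int step = stacked.length / stackSize;
        return Arrays.copyOfRange(stacked, from * step, (from + 1) * step);
    }
}
```

Because both images travel through the same layers as one stacked batch, there is only one set of convolution weights, which is exactly the weight sharing the explanation above is after.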

The next problem is that the transfer learning API in DL4J can only append new structure to the end of a model, whereas in our scenario the pretrained layers sit in the middle of the graph. What should we do? Let's look at the source code of the transfer learning API and see how DL4J wraps it. The clue is in the build method of org.deeplearning4j.nn.transferlearning.TransferLearning:

public ComputationGraph build() {
            initBuilderIfReq();

            ComputationGraphConfiguration newConfig = editedConfigBuilder
                    .validateOutputLayerConfig(validateOutputLayerConfig == null ? true : validateOutputLayerConfig).build();
            if (this.workspaceMode != null)
                newConfig.setTrainingWorkspaceMode(workspaceMode);
            ComputationGraph newGraph = new ComputationGraph(newConfig);
            newGraph.init();

            int[] topologicalOrder = newGraph.topologicalSortOrder();
            org.deeplearning4j.nn.graph.vertex.GraphVertex[] vertices = newGraph.getVertices();
            if (!editedVertices.isEmpty()) {
                //set params from orig graph as necessary to new graph
                for (int i = 0; i < topologicalOrder.length; i++) {

                    if (!vertices[topologicalOrder[i]].hasLayer())
                        continue;

                    org.deeplearning4j.nn.api.Layer layer = vertices[topologicalOrder[i]].getLayer();
                    String layerName = vertices[topologicalOrder[i]].getVertexName();
                    long range = layer.numParams();
                    if (range <= 0)
                        continue; //some layers have no params
                    if (editedVertices.contains(layerName))
                        continue; //keep the changed params
                    INDArray origParams = origGraph.getLayer(layerName).params();
                    layer.setParams(origParams.dup()); //copy over origGraph params
                }
            } else {
                newGraph.setParams(origGraph.params());
            }
            // ... remainder of build() omitted

It turns out that build simply calls layer.setParams to set the parameters of each layer. That gives us an idea: construct a model with the same layers as VGG16 and copy the VGG16 parameters into it. In essence, once a deep learning model has been trained, only its parameters matter; with them we can reuse the model however we like. Let's go straight to the code and build our target model:

private static ComputationGraph buildModel() {
        ComputationGraphConfiguration conf = new NeuralNetConfiguration.Builder().seed(123)
                .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT).activation(Activation.RELU)
                .graphBuilder().addInputs("input1", "input2").addVertex("stack", new StackVertex(), "input1", "input2")
                .layer("conv1_1",
                        new ConvolutionLayer.Builder().kernelSize(3, 3).stride(1, 1).padding(1, 1).nIn(3).nOut(64)
                                .build(),
                        "stack")
                .layer("conv1_2",
                        new ConvolutionLayer.Builder().kernelSize(3, 3).stride(1, 1).padding(1, 1).nOut(64).build(),
                        "conv1_1")
                .layer("pool1",
                        new SubsamplingLayer.Builder().poolingType(SubsamplingLayer.PoolingType.MAX).kernelSize(2, 2)
                                .stride(2, 2).build(),
                        "conv1_2")
                // block 2
                .layer("conv2_1",
                        new ConvolutionLayer.Builder().kernelSize(3, 3).stride(1, 1).padding(1, 1).nOut(128).build(),
                        "pool1")
                .layer("conv2_2",
                        new ConvolutionLayer.Builder().kernelSize(3, 3).stride(1, 1).padding(1, 1).nOut(128).build(),
                        "conv2_1")
                .layer("pool2",
                        new SubsamplingLayer.Builder().poolingType(SubsamplingLayer.PoolingType.MAX).kernelSize(2, 2)
                                .stride(2, 2).build(),
                        "conv2_2")
                // block 3
                .layer("conv3_1",
                        new ConvolutionLayer.Builder().kernelSize(3, 3).stride(1, 1).padding(1, 1).nOut(256).build(),
                        "pool2")
                .layer("conv3_2",
                        new ConvolutionLayer.Builder().kernelSize(3, 3).stride(1, 1).padding(1, 1).nOut(256).build(),
                        "conv3_1")
                .layer("conv3_3",
                        new ConvolutionLayer.Builder().kernelSize(3, 3).stride(1, 1).padding(1, 1).nOut(256).build(),
                        "conv3_2")
                .layer("pool3",
                        new SubsamplingLayer.Builder().poolingType(SubsamplingLayer.PoolingType.MAX).kernelSize(2, 2)
                                .stride(2, 2).build(),
                        "conv3_3")
                // block 4
                .layer("conv4_1",
                        new ConvolutionLayer.Builder().kernelSize(3, 3).stride(1, 1).padding(1, 1).nOut(512).build(),
                        "pool3")
                .layer("conv4_2",
                        new ConvolutionLayer.Builder().kernelSize(3, 3).stride(1, 1).padding(1, 1).nOut(512).build(),
                        "conv4_1")
                .layer("conv4_3",
                        new ConvolutionLayer.Builder().kernelSize(3, 3).stride(1, 1).padding(1, 1).nOut(512).build(),
                        "conv4_2")
                .layer("pool4",
                        new SubsamplingLayer.Builder().poolingType(SubsamplingLayer.PoolingType.MAX).kernelSize(2, 2)
                                .stride(2, 2).build(),
                        "conv4_3")
                // block 5
                .layer("conv5_1",
                        new ConvolutionLayer.Builder().kernelSize(3, 3).stride(1, 1).padding(1, 1).nOut(512).build(),
                        "pool4")
                .layer("conv5_2",
                        new ConvolutionLayer.Builder().kernelSize(3, 3).stride(1, 1).padding(1, 1).nOut(512).build(),
                        "conv5_1")
                .layer("conv5_3",
                        new ConvolutionLayer.Builder().kernelSize(3, 3).stride(1, 1).padding(1, 1).nOut(512).build(),
                        "conv5_2")
                .layer("pool5",
                        new SubsamplingLayer.Builder().poolingType(SubsamplingLayer.PoolingType.MAX).kernelSize(2, 2)
                                .stride(2, 2).build(),
                        "conv5_3")
                .addVertex("unStack1", new UnstackVertex(0, 2), "pool5")
                .addVertex("unStack2", new UnstackVertex(1, 2), "pool5")
                .addVertex("cosine", new CosineLambdaVertex(), "unStack1", "unStack2")
                .addLayer("out", new LossLayer.Builder().build(), "cosine").setOutputs("out")
                .setInputTypes(InputType.convolutionalFlat(224, 224, 3), InputType.convolutionalFlat(224, 224, 3))
                .build();
        ComputationGraph network = new ComputationGraph(conf);
        network.init();
        return network;
    }

Next, read the parameters of VGG16 and set them on our new model. To keep the code simple, the layer names are the same as in VGG16:

// Names of the convolutional layers whose weights are copied from VGGFace
String vggLayerNames = "conv1_1,conv1_2,conv2_1,conv2_2,conv3_1,conv3_2,conv3_3,conv4_1,conv4_2,conv4_3,conv5_1,conv5_2,conv5_3";
File vggfile = new File("F:/vgg16_dl4j_vggface_inference.v1.zip");
ComputationGraph vggFace = ModelSerializer.restoreComputationGraph(vggfile);
ComputationGraph model = buildModel();
for (String name : vggLayerNames.split(",")) {
    // dup() copies the array so the two graphs do not share parameter memory
    model.getLayer(name).setParams(vggFace.getLayer(name).params().dup());
}

The feature extraction layers are now in place. After extracting the features, we need to calculate the distance, which requires a custom layer in DL4J. The automatic differentiation provided by DL4J makes custom layers convenient to implement. Here we choose SameDiffLambdaVertex because this vertex needs no parameters; it only computes the cosine similarity. The code is as follows:

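import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.inputs.InvalidInputTypeException;
import org.deeplearning4j.nn.conf.layers.samediff.SameDiffLambdaVertex;
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;

public class CosineLambdaVertex extends SameDiffLambdaVertex {

	@Override
	public SDVariable defineVertex(SameDiff sameDiff, VertexInputs inputs) {
		SDVariable input1 = inputs.getInput(0);
		SDVariable input2 = inputs.getInput(1);
		// Reduce over channels, height and width (dimensions 1, 2, 3) to get one value per example,
		// then add a trailing dimension so the output has shape [batch, 1]
		return sameDiff.expandDims(sameDiff.math.cosineSimilarity(input1, input2, 1, 2, 3), 1);
	}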

	@Override
	public InputType getOutputType(int layerIndex, InputType... vertexInputs) throws InvalidInputTypeException {
		return InputType.feedForward(1);
	}
}

Note: after calculating the cosine similarity, expandDims expands the one-dimensional result of shape [batch] into a two-dimensional tensor of shape [batch, 1]. This matches the shape of the labels used to verify the accuracy of the model on the LFW dataset.

DL4J also provides other ways to implement custom layers and custom vertices:

Layers: standard single input, single output layers defined using SameDiff. To implement, extend org.deeplearning4j.nn.conf.layers.samediff.SameDiffLayer
Lambda layers: as above, but without any parameters. You only need to implement a single method for these! To implement, extend org.deeplearning4j.nn.conf.layers.samediff.SameDiffLambdaLayer
Graph vertices: multiple inputs, single output layers usable only in ComputationGraph. To implement: extend org.deeplearning4j.nn.conf.layers.samediff.SameDiffVertex
Lambda vertices: as above, but without any parameters. Again, you only need to implement a single method for these! To implement, extend org.deeplearning4j.nn.conf.layers.samediff.SameDiffLambdaVertex
Output layers: An output layer, for calculating scores/losses. Used as the final layer in a network. To implement, extend org.deeplearning4j.nn.conf.layers.samediff.SameDiffOutputLayer

Examples: https://github.com/eclipse/deeplearning4j-examples/tree/master/samediff-examples

Documentation: https://github.com/eclipse/deeplearning4j-examples/blob/master/samediff-examples/src/main/java/org/nd4j/examples/samediff/customizingdl4j/README.md

One last question remains: how do we define the output layer? The output layer needs no parameters and no computation; it only has to output the cosine result. The LossLayer provided in DL4J fits naturally: it has no parameters, and its activation function is the identity function IDENTITY. Model construction is now complete, and the final structure is as follows:


=========================================================================================================
VertexName (VertexType)        nIn,nOut   TotalParams   ParamsShape                  Vertex Inputs       
=========================================================================================================
input1 (InputVertex)           -,-        -             -                            -                   
input2 (InputVertex)           -,-        -             -                            -                   
stack (StackVertex)            -,-        -             -                            [input1, input2]    
conv1_1 (ConvolutionLayer)     3,64       1,792         W:{64,3,3,3}, b:{1,64}       [stack]             
conv1_2 (ConvolutionLayer)     64,64      36,928        W:{64,64,3,3}, b:{1,64}      [conv1_1]           
pool1 (SubsamplingLayer)       -,-        0             -                            [conv1_2]           
conv2_1 (ConvolutionLayer)     64,128     73,856        W:{128,64,3,3}, b:{1,128}    [pool1]             
conv2_2 (ConvolutionLayer)     128,128    147,584       W:{128,128,3,3}, b:{1,128}   [conv2_1]           
pool2 (SubsamplingLayer)       -,-        0             -                            [conv2_2]           
conv3_1 (ConvolutionLayer)     128,256    295,168       W:{256,128,3,3}, b:{1,256}   [pool2]             
conv3_2 (ConvolutionLayer)     256,256    590,080       W:{256,256,3,3}, b:{1,256}   [conv3_1]           
conv3_3 (ConvolutionLayer)     256,256    590,080       W:{256,256,3,3}, b:{1,256}   [conv3_2]           
pool3 (SubsamplingLayer)       -,-        0             -                            [conv3_3]           
conv4_1 (ConvolutionLayer)     256,512    1,180,160     W:{512,256,3,3}, b:{1,512}   [pool3]             
conv4_2 (ConvolutionLayer)     512,512    2,359,808     W:{512,512,3,3}, b:{1,512}   [conv4_1]           
conv4_3 (ConvolutionLayer)     512,512    2,359,808     W:{512,512,3,3}, b:{1,512}   [conv4_2]           
pool4 (SubsamplingLayer)       -,-        0             -                            [conv4_3]           
conv5_1 (ConvolutionLayer)     512,512    2,359,808     W:{512,512,3,3}, b:{1,512}   [pool4]             
conv5_2 (ConvolutionLayer)     512,512    2,359,808     W:{512,512,3,3}, b:{1,512}   [conv5_1]           
conv5_3 (ConvolutionLayer)     512,512    2,359,808     W:{512,512,3,3}, b:{1,512}   [conv5_2]           
pool5 (SubsamplingLayer)       -,-        0             -                            [conv5_3]           
unStack1 (UnstackVertex)       -,-        -             -                            [pool5]             
unStack2 (UnstackVertex)       -,-        -             -                            [pool5]             
cosine (SameDiffGraphVertex)   -,-        -             -                            [unStack1, unStack2]
out (LossLayer)                -,-        0             -                            [cosine]            
---------------------------------------------------------------------------------------------------------
            Total Parameters:  14,714,688
        Trainable Parameters:  14,714,688
           Frozen Parameters:  0
=========================================================================================================

III. Verifying model accuracy on LFW

LFW data download address: http://vis-www.cs.umass.edu/lfw/. After downloading it, I put it in the F:\facerecognition directory.

Next, construct the test set with positive and negative examples: pairs of images of the same face go in one directory, and pairs of images of different faces go in another. The code is as follows:

import org.apache.commons.io.FileUtils;

import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class DataTools {
    private static final String PARENT_PATH = "F:/facerecognition";

    public static void main(String[] args) throws IOException {
        File file = new File(PARENT_PATH + "/lfw");
        File[] personDirs = file.listFiles();
        if (personDirs == null) {
            throw new IOException("LFW directory not found: " + file);
        }
        List<File> list = Arrays.asList(personDirs);
        for (int i = 0; i < list.size(); i++) {
            String name = list.get(i).getName();
            File[] faceFileArray = list.get(i).listFiles();
            if (null == faceFileArray || faceFileArray.length == 0) {
                continue;
            }
            // Build a positive example: two images of the same person
            if (faceFileArray.length > 1) {
                String positiveFilePath = PARENT_PATH + "/pairs/1/" + name;
                // File.delete() cannot remove a non-empty directory, so delete recursively;
                // FileUtils.copyFile creates missing parent directories on its own
                FileUtils.deleteDirectory(new File(positiveFilePath));
                FileUtils.copyFile(faceFileArray[0], new File(positiveFilePath + "/" + faceFileArray[0].getName()));
                FileUtils.copyFile(faceFileArray[1], new File(positiveFilePath + "/" + faceFileArray[1].getName()));
            }
            // Build a negative example: one image of this person plus one image of a random other person
            File[] differentFaceArray = list.get(randomInt(list.size(), i)).listFiles();
            if (null == differentFaceArray || differentFaceArray.length == 0) {
                continue; // the random entry was not a directory or was empty
            }
            String negativeFilePath = PARENT_PATH + "/pairs/0/" + name;
            FileUtils.deleteDirectory(new File(negativeFilePath));
            FileUtils.copyFile(faceFileArray[0], new File(negativeFilePath + "/" + faceFileArray[0].getName()));
            int differentFaceIndex = randomInt(differentFaceArray.length, -1);
            FileUtils.copyFile(differentFaceArray[differentFaceIndex], new File(negativeFilePath + "/" + differentFaceArray[differentFaceIndex].getName()));
        }
    }

    public static int randomInt(int max, int target) {
        Random random = new Random();
        while (true) {
            int result = random.nextInt(max);
            if (result != target) {
                return result;
            }
        }
    }
}

After the test set is constructed, we build an iterator that uses NativeImageLoader to read the images. A related introduction can be found in "How to use datavec in deeplearning4j to process images".

public class DataSetForEvaluation implements MultiDataSetIterator {
	private List<FacePair> facePairList;
	private int batchSize;
	private int totalBatches;
	private NativeImageLoader imageLoader;
	private int currentBatch = 0;

	public DataSetForEvaluation(List<FacePair> facePairList, int batchSize) {
		this.facePairList = facePairList;
		this.batchSize = batchSize;
		this.totalBatches = (int) Math.ceil((double) facePairList.size() / batchSize);
		this.imageLoader = new NativeImageLoader(224, 224, 3, new ResizeImageTransform(224, 224));
	}

	@Override
	public boolean hasNext() {
		return currentBatch < totalBatches;
	}

	@Override
	public MultiDataSet next() {
		return next(batchSize);
	}

	@Override
	public MultiDataSet next(int num) {
		int i = currentBatch * batchSize;
		int currentBatchSize = Math.min(batchSize, facePairList.size() - i);
		INDArray input1 = Nd4j.zeros(currentBatchSize, 3,224,224);
		INDArray input2 =  Nd4j.zeros(currentBatchSize, 3,224,224);
		INDArray label = Nd4j.zeros(currentBatchSize, 1);
		for (int j = 0; j < currentBatchSize; j++) {
			try {
				input1.put(new INDArrayIndex[]{NDArrayIndex.point(j),NDArrayIndex.all(),NDArrayIndex.all(),NDArrayIndex.all()}, imageLoader.asMatrix(facePairList.get(i).getList().get(0)).div(255));
				input2.put(new INDArrayIndex[]{NDArrayIndex.point(j),NDArrayIndex.all(),NDArrayIndex.all(),NDArrayIndex.all()},imageLoader.asMatrix(facePairList.get(i).getList().get(1)).div(255));
			} catch (Exception e) {
				e.printStackTrace();
			}
			label.putScalar((long) j, 0, facePairList.get(i).getLabel());
			++i;
		}
		System.out.println(currentBatch);
		++currentBatch;
		return new org.nd4j.linalg.dataset.MultiDataSet(new INDArray[] { input1, input2},
				new INDArray[] { label });
	}

	@Override
	public void setPreProcessor(MultiDataSetPreProcessor preProcessor) {

	}

	@Override
	public MultiDataSetPreProcessor getPreProcessor() {
		return null;
	}

	@Override
	public boolean resetSupported() {
		return true;
	}

	@Override
	public boolean asyncSupported() {
		return false;
	}

	@Override
	public void reset() {
		currentBatch = 0;
	}

}
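The iterator above relies on a FacePair class that is not listed in this post. Judging only from how it is called (getList().get(0), getList().get(1), and getLabel()), it could look roughly like the sketch below. The field types, the constructor, and the use of File are assumptions; the author's actual class in the linked repository may differ.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical reconstruction of the pair holder used by DataSetForEvaluation.
final class FacePair {
    private final List<File> list; // exactly two image files
    private final int label;       // 1 = same person, 0 = different people

    FacePair(File first, File second, int label) {
        this.list = Collections.unmodifiableList(Arrays.asList(first, second));
        this.label = label;
    }

    /** The two images of the pair, in a fixed order. */
    public List<File> getList() {
        return list;
    }

    /** The ground-truth label that is compared against the thresholded cosine output. */
    public int getLabel() {
        return label;
    }
}
```

With the directory layout produced by DataTools, each subdirectory of pairs/1 would become a FacePair with label 1 and each subdirectory of pairs/0 a FacePair with label 0.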

Next, we can evaluate the performance of the model. Accuracy and precision are acceptable, but recall is low (about 0.60), which pulls the F1 score down to about 0.73.

========================Evaluation Metrics========================
 # of classes:    2
 Accuracy:        0.8973
 Precision:       0.9119
 Recall:          0.6042
 F1 Score:        0.7268
Precision, recall & F1: reported for positive class (class 1 - "1") only


=========================Confusion Matrix=========================
    0    1
-----------
 5651   98 | 0 = 0
  665 1015 | 1 = 1

Confusion matrix format: Actual (rowClass) predicted as (columnClass) N times
==================================================================
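As a sanity check, the four metrics can be recomputed from the confusion matrix by hand. The small class below is a plain-Java sketch with hypothetical names; it uses the standard definitions of accuracy, precision, recall, and F1 with class 1 as the positive class, and the counts in main are the ones from the matrix above.

```java
// Sketch: binary classification metrics from confusion-matrix counts.
final class BinaryMetrics {
    final long tn, fp, fn, tp;

    BinaryMetrics(long tn, long fp, long fn, long tp) {
        this.tn = tn;
        this.fp = fp;
        this.fn = fn;
        this.tp = tp;
    }

    /** Fraction of all pairs that were classified correctly. */
    double accuracy() {
        return (double) (tp + tn) / (tp + tn + fp + fn);
    }

    /** Of the pairs predicted "same face", the fraction that really are. */
    double precision() {
        return (double) tp / (tp + fp);
    }

    /** Of the pairs that really are the same face, the fraction that were found. */
    double recall() {
        return (double) tp / (tp + fn);
    }

    /** Harmonic mean of precision and recall. */
    double f1() {
        return 2.0 * tp / (2.0 * tp + fp + fn);
    }

    public static void main(String[] args) {
        // Row "0": 5651 true negatives, 98 false positives; row "1": 665 false negatives, 1015 true positives
        BinaryMetrics m = new BinaryMetrics(5651, 98, 665, 1015);
        System.out.printf("accuracy=%.4f precision=%.4f recall=%.4f f1=%.4f%n",
                m.accuracy(), m.precision(), m.recall(), m.f1());
    }
}
```

The 665 false negatives are what limit recall: many genuine pairs score below the decision threshold, so lowering the threshold would trade some precision for recall.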

IV. Wrapping the model as a service with Spring Boot

A saved model is just a pile of static parameters. How do we turn it into an online service? Face recognition services come in two types: 1:1 and 1:N.

1. 1:1 application

Typical 1:1 applications, such as face unlock on mobile phones and face recognition attendance in DingTalk, are relatively simple: they only need to verify that a claimed person (say, Zhang San) really is that person. The amount of computation is very small, so they are easy to implement.

2. 1:N application

Typical 1:N applications, such as face search by public security agencies, find out who a target face belongs to from a massive face database without knowing the identity in advance. When the face database is huge, computation becomes a major problem.

If the results are not required in real time, you can compute them offline with Hadoop MapReduce or Spark. All we need to do is wrap the model in a Hive UDF, a MapReduce jar, or a Spark RDD program.

However, when results are required in real time, this problem cannot be transformed into an indexing problem, so we need to design a computing framework that solves the global max or global top-N problem in a distributed manner. The general structure is as follows:

The blue arrows indicate the request being forwarded, and the green arrows indicate the calculation results being returned. The figure depicts a client request hitting node Node3, which forwards the request to the other nodes for parallel computation. If each node has enough memory, the tensors of the entire face library can be preloaded and kept resident in memory to speed up the calculation.
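The fan-out described above boils down to two steps: every node computes a local top-N over its own shard of the face library, and the node that received the request merges the partial lists into the global top-N. Below is a single-process sketch of those two steps in plain Java. All names (TopKSearch, Match, localTopK, mergeTopK) are hypothetical, each shard is represented by in-memory lists, and the network communication between nodes is left out entirely.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Sketch of distributed top-N search: local top-N per node, then a merge.
final class TopKSearch {

    static final class Match {
        final String id;
        final double similarity;

        Match(String id, double similarity) {
            this.id = id;
            this.similarity = similarity;
        }
    }

    private static final Comparator<Match> BY_SIMILARITY =
            Comparator.comparingDouble((Match m) -> m.similarity);

    /** Local step, run on each node: the k entries of this shard most similar to the query. */
    static List<Match> localTopK(double[] query, List<String> ids, List<double[]> embeddings, int k) {
        // Min-heap on similarity, so the weakest of the current candidates is evicted first
        PriorityQueue<Match> heap = new PriorityQueue<>(BY_SIMILARITY);
        for (int i = 0; i < ids.size(); i++) {
            heap.add(new Match(ids.get(i), cosine(query, embeddings.get(i))));
            if (heap.size() > k) {
                heap.poll();
            }
        }
        List<Match> result = new ArrayList<>(heap);
        result.sort(BY_SIMILARITY.reversed());
        return result;
    }

    /** Merge step, run on the node that received the request: global top k from the partial lists. */
    static List<Match> mergeTopK(List<List<Match>> partials, int k) {
        List<Match> all = new ArrayList<>();
        for (List<Match> partial : partials) {
            all.addAll(partial);
        }
        all.sort(BY_SIMILARITY.reversed());
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }

    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

The merge is correct because any entry in the global top N must also be in the local top N of its own shard, so each node only has to send back N candidates regardless of how large its shard is.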

The parallel computing framework is not implemented in this blog; only the model is packaged as a service with Spring Boot. Run FaceRecognitionApplication and visit http://localhost:8080/index. The service looks like this:

All code for this blog: https://gitee.com/lxkm/dl4j-demo/tree/master/face-recognition

V. Summary

The main purpose of this blog is to show how to use DL4J in practice, including obtaining pretrained model parameters, implementing custom layers, implementing custom iterators, and wrapping the model as a service with Spring Boot.

Of course, image embedding and tensor distance alone are not enough for a face recognition system. It should also include face alignment, defenses against adversarial attacks (a later blog will introduce how to use DL4J for an FGSM attack), feature extraction for key parts of the face, and a lot of other detailed work. There is also much to be done before face recognition can be offered as a general SaaS service.

Training a good face recognition model requires several loss functions working together. For example, you can train with softmax classification first, and then fine-tune with Center Loss and Triplet Loss. A follow-up blog will introduce how to implement Triplet Loss with DL4J to train a face recognition model.

Happiness comes from sharing.

This blog is the author's original work; please credit the source when reprinting.
