How to use DL4J to attack a face recognition model

1. Introduction

    The previous blog, "How to Build a Face Recognition System with DL4J", introduced how to use DL4J to build a face recognition service. At the end of that article, attacks on machine learning models were mentioned. This blog introduces how to use DeepLearning4J to carry out an FGSM attack on a face recognition model. It contains two parts:

    1. The basic principles of attacks on machine learning models.

    2. A walkthrough of the attack process, using the Tianchi face recognition adversarial competition as an example.

2. Attacks on Machine Learning Models

    1. The process of attacking an ML model

    An attack on a machine learning model can be described in one sentence: add some tiny noise to the input features so that the model misclassifies. Take image recognition as an example.

    

    The picture above shows that after adding noise to the original picture, the cat is recognized as a dog; that is the attack process.

    2. The principle of the attack

    Recall that most of the time a machine learning model seeks the minimum of a loss function. To attack the model, we turn this around: fix the model parameters and instead solve for an input X that makes the loss function as large as possible.

    Recall the training process. Given an image x, model parameters p, and label c1 (say c1 is the "cat" class), we define a loss function L = Loss(x, p, c1), where x and c1 are fixed, and the minimum of L over the training set is found by adjusting the parameters p.
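    Written side by side with the same notation (y denotes the perturbed input, defined precisely in the remark below): training solves argmin Loss(x, p, c1) over the parameters p, while the attack solves argmax Loss(y, p, c1) over the input y with p held fixed.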

   (1) Untargeted attack

    An untargeted attack has no specific target; it only requires the model to classify incorrectly. The formula is: argmax Loss(y, p, c1), i.e., with p and c1 fixed, find a y that makes the loss function as large as possible.

    For example, input a picture of a cat and hope the model predicts wrong; it does not matter what the model predicts, as long as it is not a cat.

    

  (2) Targeted attack

    A targeted attack makes the model recognize the input as an object we choose. Say we input an image of a cat and want the model to predict the "dog" class. Let c2 be the dog class; the targeted attack is then described by the formula:

    argmin (-Loss(y,p,c1)+Loss(y,p,c2))

    With p, c1, and c2 fixed, solve for a y that minimizes the above function. In plain terms: input a picture of a cat and hope the model predicts a dog.

    

    Remark: y denotes the original image x with noise added, c1 denotes the cat class, and c2 denotes the dog class.

    Finally, there is one more problem. If the noise is too strong, the original image will be visibly distorted. We want the added noise to be small and easy to overlook, so we add a constraint d(x, y) < t, where the function d measures the distance between the two images x and y, and t is a tiny threshold. This limits the amount of noise that can be added.
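    Putting the pieces together, the constrained untargeted attack can be written as: argmax Loss(y, p, c1) subject to d(x, y) < t. Common choices for d are the L2 distance or the maximum per-pixel change (the L-infinity distance); the Tianchi competition below scores with an L2-style distance, while FGSM naturally bounds the per-pixel change.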

    We can attack directly when we know the structure of the model; this is a white-box attack. Most of the time, however, we have no way of knowing the model's structure, and we perform a black-box attack instead: attack a proxy (substitute) model and transfer the result. For example, an adversarial example crafted against VGGNet often works against ResNet as well; published transfer results (recognition accuracy under attack) show that black-box attacks are effective in practice.

3. FGSM attack

    Paper address: https://arxiv.org/abs/1412.6572

    FGSM (Fast Gradient Sign Method) performs the attack in a single step: solve for y = x + ε · sign(∂Loss(x, p, c1)/∂x). That is, take the gradient of the loss with respect to the input, keep only its sign, scale it by a small step ε, and add it to the original image. Because only the sign of the gradient is used, every pixel changes by at most ε, which makes the perturbation easy to bound.
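    Below is a minimal sketch of one FGSM-style step with ND4J. It assumes the gradient of the loss with respect to the input is already available (obtaining that gradient from DL4J is exactly what section 5 solves); the class name and parameters are made up for illustration and are not the competition code.

import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.indexing.BooleanIndexing;
import org.nd4j.linalg.indexing.conditions.Conditions;
import org.nd4j.linalg.ops.transforms.Transforms;

public class FgsmStep {

	/**
	 * @param image        original input, pixel values scaled to [0, 1]
	 * @param gradWrtInput dLoss/dInput, same shape as image
	 * @param step         step size epsilon
	 * @param maxDelta     largest allowed per-pixel change
	 */
	public static INDArray apply(INDArray image, INDArray gradWrtInput, double step, double maxDelta) {
		// gradient ascent on the loss: move every pixel by +step in the direction of sign(gradient)
		INDArray perturbed = image.add(Transforms.sign(gradWrtInput).mul(step));

		// clip the accumulated perturbation to [-maxDelta, +maxDelta]
		INDArray delta = perturbed.sub(image);
		BooleanIndexing.replaceWhere(delta, maxDelta, Conditions.greaterThan(maxDelta));
		BooleanIndexing.replaceWhere(delta, -maxDelta, Conditions.lessThan(-maxDelta));
		perturbed = image.add(delta);

		// keep pixel values in the valid [0, 1] range
		BooleanIndexing.replaceWhere(perturbed, 0.0, Conditions.lessThan(0.0));
		BooleanIndexing.replaceWhere(perturbed, 1.0, Conditions.greaterThan(1.0));
		return perturbed;
	}
}

    The attack in section 5 repeats a step like this a couple of times (an iterative variant) and drives it with a momentum updater instead of a fixed step.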

4. Tianchi face recognition adversarial competition

    1. Competition address: https://tianchi.aliyun.com/competition/entrance/231745/information

    2. Competition scoring rules:

        To ensure the visual quality of the perturbed faces, the competition limits the perturbation of any single pixel to the range [-25.5, 25.5]. For submissions whose perturbation exceeds this range, the backend forcibly truncates the image perturbation to the [-25.5, 25.5] interval (using the numpy.clip function). So please keep the per-pixel difference between the submitted adversarial sample and the original image within that range.

    For each generated adversarial sample, the backend model predicts the sample and computes the corresponding perturbation amount from the recognition result. The calculation is as follows:

    

     where M denotes the prediction of the backend model and y the true label of sample I. If the defense model identifies the sample correctly, the attack is unsuccessful and the perturbation amount is set directly to the upper bound of 44.1673; this bound follows from the maximum allowed perturbation of 25.5. If the attack succeeds, the L2 distance between the adversarial sample I^a and the original sample I is used as the score; the smaller the score, the better.
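    A quick check on that upper bound: assuming the perturbation of a pixel is measured as the L2 norm over its three color channels, the worst case is sqrt(3 * 25.5^2) = 25.5 * sqrt(3) ≈ 44.1673, which matches the cap above.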

    In one sentence: the smaller the change and the more reliably the attack succeeds, the better the score.

5. Performing an FGSM attack with DeepLearning4j

    1. Obtaining the gradient of the loss with respect to the input in DL4J

    The proxy model we attack is again VGGFace (why always VGGFace? DL4J's model zoo only ships VGGFace, so there is really no other choice). In ComputationGraph, the gradient arrays are recycled once back-propagation finishes, a design that saves memory. With a MultiLayerNetwork, the partial derivative of the loss with respect to the input can be obtained through org.deeplearning4j.nn.multilayer.MultiLayerNetwork#calculateGradients, but how do we get it from a ComputationGraph? Don't worry, the answer is in the source code. Let's take a closer look at what org.deeplearning4j.nn.graph.ComputationGraph#calcBackpropGradients does during back-propagation:

 try (MemoryWorkspace wsWorkingMem = workspaceMgr.notifyScopeEntered(ArrayType.BP_WORKING_MEM)) {
                    pair = current.doBackward(truncatedBPTT, workspaceMgr);
                    epsilons = pair.getSecond();

                    //Validate workspace location for the activation gradients:
                    //validateArrayWorkspaces(LayerWorkspaceMgr mgr, INDArray array, ArrayType arrayType, String vertexName, boolean isInputVertex, String op){
                    for (INDArray epsilon : epsilons) {
                        if (epsilon != null) {
                            //May be null for EmbeddingLayer, etc
                            validateArrayWorkspaces(workspaceMgr, epsilon, ArrayType.ACTIVATION_GRAD, vertexName, false, "Backprop");
                        }
                    }
                }

    Follow up into the GraphVertex#doBackward method (the implementation used for ordinary layers is org.deeplearning4j.nn.graph.vertex.impl.LayerVertex#doBackward):

public Pair<Gradient, INDArray[]> doBackward(boolean tbptt, LayerWorkspaceMgr workspaceMgr) {
        if (!canDoBackward()) {
            if(inputs == null || inputs[0] == null){
                throw new IllegalStateException("Cannot do backward pass: inputs not set. Layer: \"" + vertexName
                        + "\" (idx " + vertexIndex + "), numInputs: " + getNumInputArrays());
            } else {
                throw new IllegalStateException("Cannot do backward pass: all epsilons not set. Layer \"" + vertexName
                        + "\" (idx " + vertexIndex + "), numInputs :" + getNumInputArrays() + "; numOutputs: "
                        + getNumOutputConnections());
            }
        }

        //Edge case: output layer - never did forward pass hence layer.setInput was never called...
        if(!setLayerInput){
            applyPreprocessorAndSetInput(workspaceMgr);
        }

        Pair<Gradient, INDArray> pair;
        if (tbptt && layer instanceof RecurrentLayer) {
            //Truncated BPTT for recurrent layers
            pair = ((RecurrentLayer) layer).tbpttBackpropGradient(epsilon,
                            graph.getConfiguration().getTbpttBackLength(), workspaceMgr);
        } else {
            //Normal backprop
            pair = layer.backpropGradient(epsilon, workspaceMgr); //epsTotal may be null for OutputLayers
        }

        if (layerPreProcessor != null) {
            INDArray eps = pair.getSecond();
            eps = layerPreProcessor.backprop(eps, graph.batchSize(), workspaceMgr);
            pair.setSecond(eps);
        }

        //Layers always have single activations input -> always have single epsilon output during backprop
        return new Pair<>(pair.getFirst(), new INDArray[] {pair.getSecond()});
    }

    Notice the call to org.deeplearning4j.nn.conf.InputPreProcessor#backprop, which hands the back-propagated gradient to the InputPreProcessor for processing. That gives us an idea: we only need to set an InputPreProcessor on the first convolutional layer to capture the gradient flowing back into the input. Note, however, that the layerPreProcessor field of LayerVertex is declared final, so how do we set it? That cannot stop a Javaer: reflection.

public class LayerVertex extends BaseGraphVertex {

    private Layer layer;
    private final InputPreProcessor layerPreProcessor;
    private boolean setLayerInput;

    Next, implement an InputPreProcessor that stashes the back-propagated gradient in a static variable:

public class Preprocessor implements InputPreProcessor {

	private static final long serialVersionUID = 1L;
	public static INDArray epsilon;

	@Override
	public INDArray preProcess(INDArray input, int miniBatchSize, LayerWorkspaceMgr workspaceMgr) {
		return workspaceMgr.dup(ArrayType.ACTIVATIONS, input);
	}

	@Override
	public InputType getOutputType(InputType inputType) {
		return inputType;
	}

	@Override
	public Pair<INDArray, MaskState> feedForwardMaskArray(INDArray maskArray, MaskState currentMaskState,
			int minibatchSize) {
		return null;
	}

	@Override
	public INDArray backprop(INDArray output, int miniBatchSize, LayerWorkspaceMgr workspaceMgr) {
		// "output" is the gradient flowing back into the input layer;
		// detach it from the workspace so it remains valid after backprop finishes
		epsilon = output.detach();
		return workspaceMgr.dup(ArrayType.ACTIVATION_GRAD, output);
	}

	@Override
	public InputPreProcessor clone() {
		// not needed for this attack
		return null;
	}

}

        Next, load the VGGFace model with the DL4J transfer learning API, remove the fully connected layers, and add a CnnLossLayer as the output. The loss function here is COSINE_PROXIMITY (after trying several options I found the cosine distance works best). Then use reflection to attach the InputPreProcessor to the first convolutional layer: call setAccessible(true) on the Field to open up access to the private attribute (a last-resort trick, of course). See the code below.

ComputationGraph pretrained = (ComputationGraph) VGG16.builder().build().initPretrained(PretrainedType.VGGFACE);
		System.out.println(pretrained.summary());
		FineTuneConfiguration fineTuneConf = new FineTuneConfiguration.Builder().updater(new Sgd(0)).seed(123).build();
		ComputationGraph vgg16Transfer = new TransferLearning.GraphBuilder(pretrained)
				.fineTuneConfiguration(fineTuneConf).removeVertexAndConnections("flatten")
				.removeVertexAndConnections("fc6").removeVertexAndConnections("fc7").removeVertexAndConnections("fc8")
				.addLayer("out", new CnnLossLayer.Builder(LossFunctions.LossFunction.COSINE_PROXIMITY)
						.activation(Activation.IDENTITY).build(), "pool5")
				.setOutputs("out").build();
		LayerVertex conv1_1 = (LayerVertex) vgg16Transfer.getVertex("conv1_1");

		Class<?> clz = conv1_1.getClass();
		Field nameField = clz.getDeclaredField("layerPreProcessor");
		nameField.setAccessible(true);
		nameField.set(conv1_1, new Preprocessor());

		System.out.println(vgg16Transfer.summary());

    From now on, the partial derivative of the loss with respect to the input can be read from Preprocessor.epsilon after each backward pass. With this problem solved, the attack can proceed. The structure of the final proxy model is as follows:

================================================================================================
VertexName (VertexType)      nIn,nOut   TotalParams   ParamsShape                  Vertex Inputs
================================================================================================
input_1 (InputVertex)        -,-        -             -                            -            
conv1_1 (ConvolutionLayer)   3,64       1,792         W:{64,3,3,3}, b:{1,64}       [input_1]    
conv1_2 (ConvolutionLayer)   64,64      36,928        W:{64,64,3,3}, b:{1,64}      [conv1_1]    
pool1 (SubsamplingLayer)     -,-        0             -                            [conv1_2]    
conv2_1 (ConvolutionLayer)   64,128     73,856        W:{128,64,3,3}, b:{1,128}    [pool1]      
conv2_2 (ConvolutionLayer)   128,128    147,584       W:{128,128,3,3}, b:{1,128}   [conv2_1]    
pool2 (SubsamplingLayer)     -,-        0             -                            [conv2_2]    
conv3_1 (ConvolutionLayer)   128,256    295,168       W:{256,128,3,3}, b:{1,256}   [pool2]      
conv3_2 (ConvolutionLayer)   256,256    590,080       W:{256,256,3,3}, b:{1,256}   [conv3_1]    
conv3_3 (ConvolutionLayer)   256,256    590,080       W:{256,256,3,3}, b:{1,256}   [conv3_2]    
pool3 (SubsamplingLayer)     -,-        0             -                            [conv3_3]    
conv4_1 (ConvolutionLayer)   256,512    1,180,160     W:{512,256,3,3}, b:{1,512}   [pool3]      
conv4_2 (ConvolutionLayer)   512,512    2,359,808     W:{512,512,3,3}, b:{1,512}   [conv4_1]    
conv4_3 (ConvolutionLayer)   512,512    2,359,808     W:{512,512,3,3}, b:{1,512}   [conv4_2]    
pool4 (SubsamplingLayer)     -,-        0             -                            [conv4_3]    
conv5_1 (ConvolutionLayer)   512,512    2,359,808     W:{512,512,3,3}, b:{1,512}   [pool4]      
conv5_2 (ConvolutionLayer)   512,512    2,359,808     W:{512,512,3,3}, b:{1,512}   [conv5_1]    
conv5_3 (ConvolutionLayer)   512,512    2,359,808     W:{512,512,3,3}, b:{1,512}   [conv5_2]    
pool5 (SubsamplingLayer)     -,-        0             -                            [conv5_3]    
out (CnnLossLayer)           -,-        0             -                            [pool5]      
------------------------------------------------------------------------------------------------
            Total Parameters:  14,714,688
        Trainable Parameters:  14,714,688
           Frozen Parameters:  0
================================================================================================

    2. Generate the label tensors

    Next, download the target images that need to be attacked; I put them on the D drive. The target images are shown below.

    

    Next, use VGGFace to read all the pictures, convert them into tensors, and keep only the output of the last pooling layer (pool5) as the label. See the code below:

NativeImageLoader loader = new NativeImageLoader(224, 224, 3, new ResizeImageTransform(224, 224));
		File file = new File("D:/securityAI_round1_images/images");
		ImageLoader imageLoader = new ImageLoader(112, 112, 3);
		List<File> list = new ArrayList<>();
		for (File f : file.listFiles()) {
			list.add(f);
		}

		Map<Integer, INDArray> labelMap = new HashMap<>();
		for (int i = 0; i < list.size(); i++) {
			vgg16Transfer.clear();
			INDArray image = loader.asMatrix(list.get(i)).div(255);
			Map<String, INDArray> map = vgg16Transfer.feedForward(image, false);
			labelMap.put(i, map.get("pool5"));
		}
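    A note on shapes: for a 224*224 input, the pool5 output of VGG16 has shape {1, 512, 7, 7}. That is why the random label noise added in the attack loop below uses Nd4j.rand(new long[] { 1, 512, 7, 7 }, ...).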

    3. Untargeted attack

    Use the tensors obtained in step 2 as labels and run gradient ascent to maximize COSINE_PROXIMITY. DL4J's implementation of COSINE_PROXIMITY puts a negative sign in front of the cosine, so maximizing this loss means minimizing the cosine similarity. In other words: find an image that changes as little as possible yet looks as little like itself as possible in feature space. The code is as follows:

for (int i = 0; i < list.size(); i++) {
			vgg16Transfer.clear();
			INDArray oldImage = loader.asMatrix(list.get(i)).div(255);
			INDArray newImage = oldImage;
			NesterovsUpdater nesterovsUpdater = new NesterovsUpdater(0.9, 0.01, new long[] { 1, 3, 224, 224 }); // momentum-based updater helper, defined in the full source (see the gitee link at the end)
			for (int m = 0; m < 2; m++) {
				vgg16Transfer.setInputs(newImage);
				vgg16Transfer.setLabels(
						labelMap.get(i).add(Nd4j.rand(new long[] { 1, 512, 7, 7 }, new NormalDistribution(0, 0.07))));
				vgg16Transfer.computeGradientAndScore();
				INDArray epsilon = Preprocessor.epsilon;

				// zero the gradient outside the central facial features (forehead, sides of the face, chin, ...)
				// so the perturbation concentrates on the eyes, nose and mouth -- see note (5) below
				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(0, 30), NDArrayIndex.all())// top 30 rows
						.assign(0);

				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(0, 75),
						NDArrayIndex.interval(0, 60))// forehead
						.assign(0);
				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(0, 75),
						NDArrayIndex.interval(164, 224))// forehead
						.assign(0);
				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(30, 45),
						NDArrayIndex.interval(0, 60))// forehead
						.assign(0);
				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(30, 45),
						NDArrayIndex.interval(164, 224))// forehead
						.assign(0);
				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(0, 75),
						NDArrayIndex.interval(72, 80))// forehead
						.assign(0);
				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(0, 75),
						NDArrayIndex.interval(135, 152))// forehead
						.assign(0);
				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(75, 115),
						NDArrayIndex.interval(0, 40))// eyes
						.assign(0);
				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(75, 115),
						NDArrayIndex.interval(184, 224))// eyes
						.assign(0);
				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(115, 165),
						NDArrayIndex.interval(0, 40))// face
						.assign(0);
				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(115, 165),
						NDArrayIndex.interval(179, 224))// face
						.assign(0);

				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(165, 195),
						NDArrayIndex.interval(0, 50))// mouth
						.assign(0);
				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(165, 195),
						NDArrayIndex.interval(174, 224))// mouth
						.assign(0);
				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(195, 224),
						NDArrayIndex.interval(0, 70))// chin
						.assign(0);
				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(195, 224),
						NDArrayIndex.interval(154, 224))// chin
						.assign(0);
				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(195, 224),
						NDArrayIndex.interval(75, 97))// chin
						.assign(0);
				epsilon.get(NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.interval(195, 224),
						NDArrayIndex.interval(127, 149))// chin
						.assign(0);

				epsilon = Transforms.sign(epsilon);
				nesterovsUpdater.applyUpdater(epsilon);
				INDArray preUpdate = newImage.add(epsilon);
				INDArray delta = oldImage.sub(preUpdate);
				INDArray tooLarge = delta.dup();// entries where we subtracted too much ("max" is the per-pixel change limit, defined in the full source)
				BooleanIndexing.replaceWhere(tooLarge, 0, Conditions.absLessThanOrEqual(max));
				BooleanIndexing.replaceWhere(tooLarge, 0, Conditions.lessThan(0));
				tooLarge.subi(max);
				BooleanIndexing.replaceWhere(tooLarge, 0, Conditions.lessThan(0));

				INDArray tooSmall = delta.dup();// entries where we added too much
				BooleanIndexing.replaceWhere(tooSmall, 0, Conditions.absLessThanOrEqual(max));
				BooleanIndexing.replaceWhere(tooSmall, 0, Conditions.greaterThan(0));
				tooSmall.addi(max);
				BooleanIndexing.replaceWhere(tooSmall, 0, Conditions.greaterThan(0));
				INDArray bias = tooLarge.add(tooSmall);
				newImage = preUpdate.add(bias);
				vgg16Transfer.clear();
				System.out.println(vgg16Transfer.score());

			}
			System.out.println("平均偏差:" + (max - Transforms.abs(oldImage.sub(newImage)).meanNumber().doubleValue()));
			System.out.println("===========================");
			newImage = newImage.mul(255);
			BooleanIndexing.replaceWhere(newImage, 0, Conditions.lessThan(0.0));
			BooleanIndexing.replaceWhere(newImage, 255, Conditions.greaterThan(255));

			BufferedImage bufferedImage = new BufferedImage(224, 224, BufferedImage.TYPE_INT_RGB);
			imageLoader.toBufferedImageRGB(newImage.get(new INDArrayIndex[] { NDArrayIndex.point(0), NDArrayIndex.all(),
					NDArrayIndex.all(), NDArrayIndex.all() }), bufferedImage);
			ImageIO.write(bufferedImage, "jpg", new File("D:/preImage/" + list.get(i).getName()));

			INDArray oldSmallLoader = originalLoad.asMatrix(list.get(i)).div(255); // originalLoad and smallLoader are 112*112 loaders defined in the full source
			INDArray smallImage = smallLoader.asMatrix(new File("D:/preImage/" + list.get(i).getName()));
			System.out.println("原始平均偏差:"
					+ (max - Transforms.abs(oldSmallLoader.sub(smallImage.div(255))).meanNumber().doubleValue()));
			System.out.println("===========================");
			BufferedImage smallBufferedImage = new BufferedImage(112, 112, BufferedImage.TYPE_INT_RGB);
			imageLoader.toBufferedImageRGB(smallImage.get(new INDArrayIndex[] { NDArrayIndex.point(0),
					NDArrayIndex.all(), NDArrayIndex.all(), NDArrayIndex.all() }), smallBufferedImage);
			ImageIO.write(smallBufferedImage, "jpg", new File("D:/images/" + list.get(i).getName()));
		}

    Notes:

    (1) The target pictures are 112*112. After some experimenting, upscaling them to 224*224 before attacking improves the result (presumably because the DL4J VGGFace model was trained on 224*224 faces, which is why I tried this).

    (2) Adding tiny random noise to the label improves the result; that is what labelMap.get(i).add(Nd4j.rand(new long[] { 1, 512, 7, 7 }, new NormalDistribution(0, 0.07))) does.

    (3) During gradient ascent, using momentum in the update works better; the NesterovsUpdater class in the code implements this.

    (4) The BooleanIndexing.replaceWhere calls limit the image change to a fixed range. It is really just a clip operation, but it takes a lot of code here; not nearly as convenient as Python (see the more compact sketch after this list).

    (5) To improve the score, concentrate the changes on the mouth, eyes, and nose when updating the picture; that is what the epsilon masking code above does.

    (6) The loss function is COSINE_PROXIMITY; MSE and MAE were tried first, with poor results.
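    As an aside on note (4): the same clipping could be written a bit more compactly by clipping the perturbation itself and adding it back onto the original image. This is just a sketch using the same variable names as the loop above, not the code that was actually submitted:

INDArray clipDelta = preUpdate.sub(oldImage);
// clamp the per-pixel change to [-max, +max]
BooleanIndexing.replaceWhere(clipDelta, max, Conditions.greaterThan(max));
BooleanIndexing.replaceWhere(clipDelta, -max, Conditions.lessThan(-max));
newImage = oldImage.add(clipDelta);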

    4. The final generated adversarial samples

6. Summary

    In the official competition my score was 19.67, ranked somewhere in the twenties; an average result. https://tianchi.aliyun.com/competition/entrance/231745/rankingList/0

    At the time, DL4J really had too few pretrained models; a multi-model (ensemble) fusion attack should improve the result. Nowadays many Keras face recognition models can be imported into DL4J, so readers can try it for themselves and check the results. The competition has since been converted into a long-term challenge.

    The purpose of this blog is to introduce how to use DeepLearning4j to generate adversarial samples with the FGSM method. DL4J is friendly to Java developers, its source code is clear, and it offers many extension points, so every Javaer can give it a try and build some AI applications.

    All source code for this blog has been submitted: https://gitee.com/lxkm/dl4j-demo/tree/master/adversarial-example

    Note: one of the pictures in this blog is taken from Professor Li Hongyi's slides.

 

Happiness comes from sharing.

   This blog post is original content by the author; please credit the source when reprinting.

 

 
