In this tutorial, I will show how to use gradient ascent to work out how an input must change to be misclassified.
Using gradient ascent to figure out how to change an input's classification
Neural networks are often treated as black boxes. Understanding their decisions takes some creativity, but they are not as opaque as they seem.
In this tutorial, I will show you how to use backpropagation to change an input so that it is classified the way you want.
Humans as black boxes
Let us first take humans as an example. If I show you the following input:
Chances are you can't tell whether this is a 5 or a 6. In fact, I believe I could even convince you that it might be an 8.
Now, if you ask a person what they would need to do to turn this into a 5, they might visually do something like this:
And if I asked you to turn it into an 8, you might do this:
Now, it is not easy to explain the answer to this question with a few if statements or by looking at a few coefficients. For certain types of input (images, sound, video, etc.), interpretability undoubtedly becomes more difficult, but not impossible.
How a neural network handles this
How does a neural network answer the same question I asked above? To answer this question, we can use gradient ascent.
This is how the neural network thinks we need to modify the input to move it closer to another class.
This produces two interesting results. First, the black areas are where the network thinks we need to reduce the pixel intensity. Second, the yellow areas are where it thinks we need to increase the pixel intensity.
We can take a step in this gradient direction and add the gradient to the original image. Of course, we can repeat this process over and over again, finally changing the input into the prediction we want.
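The repeated "add the gradient to the image" step can be sketched as a short loop. This is only an illustration under stated assumptions: `model` is a hypothetical single linear layer with random weights standing in for a trained classifier, `image` is random noise standing in for the ambiguous digit, and `ascend_toward`, `step_size` and `steps` are names and values chosen here, not taken from the original code.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for a trained classifier: one linear layer over 28x28 pixels.
model = torch.nn.Linear(28 * 28, 10)
model.requires_grad_(False)  # we only need gradients for the input, not the weights


def ascend_toward(model, image, target, step_size=0.1, steps=20):
    """Repeatedly add the gradient of the target logit to the image."""
    x = image.clone().requires_grad_(True)
    for _ in range(steps):
        logit = model(x.view(1, -1))[0, target]
        (grad,) = torch.autograd.grad(logit, x)
        with torch.no_grad():
            x += step_size * grad  # step up the gradient: ascent, not descent
    return x.detach()


image = torch.rand(1, 28, 28)  # placeholder for an ambiguous digit
before = model(image.view(1, -1))[0, 8].item()
modified = ascend_toward(model, image, target=8)
after = model(modified.view(1, -1))[0, 8].item()
print(f"logit for '8' before: {before:.3f}, after: {after:.3f}")
```

With real images you would probably also clamp the result back into the valid pixel range after each step; that is left out here to keep the sketch short.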
You can see that the black spot in the lower left corner of the picture closely matches what a human would suggest.
What about making the input look more like an 8? This is how the network thinks the input must change.
Notice the dark blob in the lower left corner and the bright blob in the middle. If we add this to the input, we get the following result:
In this case, I am not fully convinced that we have turned this 5 into an 8. However, we have reduced the probability of 5, and it would certainly be easier to convince you that the digit is an 8 using the picture on the right than the one on the left.
Gradients
In regression analysis, we look at the coefficients to understand what the model has learned. In a random forest, we can inspect the decision nodes.
In neural networks, it comes down to how creatively we use gradients. To classify this digit, the network produces a distribution over the possible predictions.
This is what we call forward propagation.
During the forward pass, we compute a probability distribution over the outputs.
The code looks like this:
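As a minimal sketch of that forward pass, assuming a hypothetical, untrained single-layer `model` and a random placeholder image instead of the author's network and data:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical, untrained classifier used only for illustration.
model = torch.nn.Linear(28 * 28, 10)

x = torch.rand(1, 1, 28, 28)            # one placeholder grayscale image
logits = model(x.view(x.size(0), -1))   # forward propagation, shape (1, 10)
probs = F.softmax(logits, dim=1)        # one probability per digit 0..9
prediction = probs.argmax(dim=1)        # index of the most likely label
```

The softmax turns the ten raw scores (logits) into values that are non-negative and sum to 1, which is the distribution the text refers to.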
Now suppose we want to trick the network into predicting "5" for an input x. The way to achieve this is to give it an image (x), compute the prediction for that image, and then maximize the probability of the label "5".
For this, we can use gradient ascent: we compute the gradient of the prediction at index 5 (the sixth entry, i.e. label = 5) with respect to the input x.
To do this in code, we wrap the input x as a parameter of the neural network and select the sixth prediction. Because the labels are 0, 1, 2, 3, 4, 5, ..., the sixth entry (index 5) corresponds to the label "5".
Visually this looks like:
The code is shown below:
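A minimal sketch of this step, assuming a hypothetical single-layer `model` with random weights and a random placeholder input (neither comes from the original article):

```python
import torch
from torch import nn

torch.manual_seed(0)

model = nn.Linear(28 * 28, 10)  # hypothetical stand-in classifier

# Wrapping the input in nn.Parameter makes autograd track gradients for it.
x = nn.Parameter(torch.rand(1, 1, 28, 28))

logits = model(x.view(x.size(0), -1))
logits[0, 5].backward()  # index 5 is the sixth entry, i.e. the label "5"

grad = x.grad  # d(logit_5)/dx, same shape as x
```

For a purely linear model like this stand-in, the gradient is simply the weight row for class 5; with a deeper network it depends on the input as well.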
When we call .backward(), what happens is exactly what the previous animation shows.
Now that we have calculated the gradients, we can visualize and plot them:
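One possible way to plot such a gradient with matplotlib is sketched below. It assumes a hypothetical, untrained single-layer `model` and a random input; the `Agg` backend is selected only so that the snippet also runs without a display and can be dropped in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; not needed in a notebook
import matplotlib.pyplot as plt
import torch
from torch import nn

torch.manual_seed(0)

model = nn.Linear(28 * 28, 10)  # hypothetical, untrained classifier
x = nn.Parameter(torch.rand(1, 1, 28, 28))
model(x.view(1, -1))[0, 5].backward()

grad_img = x.grad.view(28, 28).numpy()  # gradient reshaped back into an image
fig, ax = plt.subplots(figsize=(4, 4))
im = ax.imshow(grad_img)  # default colormap: dark = lowest values, yellow = highest
fig.colorbar(im, ax=ax)
ax.set_title("d(logit 5)/dx")
```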
Since the network has not been trained, the above gradient looks like random noise... However, once we train the network, the gradient information will be richer:
Automation through callbacks
This is a very useful tool for understanding what happens while your network trains. Here, we want to automate the process so that these plots are generated during training without manual work.
For this, we will use PyTorch Lightning to implement our neural network:
import torch
import torch.nn.functional as F
import pytorch_lightning as pl


class LitClassifier(pl.LightningModule):

    def __init__(self):
        super().__init__()
        self.l1 = torch.nn.Linear(28 * 28, 10)

    def forward(self, x):
        return torch.relu(self.l1(x.view(x.size(0), -1)))

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = F.cross_entropy(y_hat, y)
        result = pl.TrainResult(loss)

        # enable the auto confused logit callback
        self.last_batch = batch
        self.last_logits = y_hat.detach()

        result.log('train_loss', loss, on_epoch=True)
        return result

    def validation_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = F.cross_entropy(y_hat, y)
        result = pl.EvalResult(checkpoint_on=loss)
        result.log('val_loss', loss)
        return result

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=0.005)
The fairly involved code that automatically draws the figures described above can be abstracted into a Callback in Lightning. A callback is a small, self-contained program that Lightning can call at various points during training.
In this example, after each training batch is processed, we want to generate these images whenever the network is confused about some of the inputs.
import torch
from pytorch_lightning import Callback
from torch import nn


class ConfusedLogitCallback(Callback):

    def __init__(
        self,
        top_k,
        projection_factor=3,
        min_logit_value=5.0,
        logging_batch_interval=20,
        max_logit_difference=0.1
    ):
        super().__init__()
        self.top_k = top_k
        self.projection_factor = projection_factor
        self.max_logit_difference = max_logit_difference
        self.logging_batch_interval = logging_batch_interval
        self.min_logit_value = min_logit_value

    def on_train_batch_end(self, trainer, pl_module, batch, batch_idx, dataloader_idx):
        # show images only every 20 batches
        if (trainer.batch_idx + 1) % self.logging_batch_interval != 0:
            return

        # pick the last batch and logits
        x, y = batch
        try:
            logits = pl_module.last_logits
        except AttributeError as e:
            m = """please track the last_logits in the training_step like so:
                def training_step(...):
                    self.last_logits = your_logits
            """
            raise AttributeError(m)

        # only check when it has opinions (ie: the logit > 5)
        if logits.max() > self.min_logit_value:
            # pick the top two confused probs
            (values, idxs) = torch.topk(logits, k=2, dim=1)

            # care about only the ones that are at most eps close to each other
            eps = self.max_logit_difference
            mask = (values[:, 0] - values[:, 1]).abs() < eps

            if mask.sum() > 0:
                # pull out the ones we care about
                confusing_x = x[mask, ...]
                confusing_y = y[mask]
                mask_idxs = idxs[mask]

                pl_module.eval()
                self._plot(confusing_x, confusing_y, trainer, pl_module, mask_idxs)
                pl_module.train()

    def _plot(self, confusing_x, confusing_y, trainer, model, mask_idxs):
        from matplotlib import pyplot as plt

        confusing_x = confusing_x[:self.top_k]
        confusing_y = confusing_y[:self.top_k]

        x_param_a = nn.Parameter(confusing_x)
        x_param_b = nn.Parameter(confusing_x)

        batch_size, c, w, h = confusing_x.size()
        for logit_i, x_param in enumerate((x_param_a, x_param_b)):
            x_param = x_param.to(model.device)
            logits = model(x_param.view(batch_size, -1))
            logits[:, mask_idxs[:, logit_i]].sum().backward()

        # reshape grads
        grad_a = x_param_a.grad.view(batch_size, w, h)
        grad_b = x_param_b.grad.view(batch_size, w, h)

        for img_i in range(len(confusing_x)):
            x = confusing_x[img_i].squeeze(0).cpu()
            y = confusing_y[img_i].cpu()
            ga = grad_a[img_i].cpu()
            gb = grad_b[img_i].cpu()

            mask_idx = mask_idxs[img_i].cpu()

            fig, axarr = plt.subplots(nrows=2, ncols=3, figsize=(15, 10))
            self.__draw_sample(fig, axarr, 0, 0, x, f'True: {y}')
            self.__draw_sample(fig, axarr, 0, 1, ga, f'd{mask_idx[0]}-logit/dx')
            self.__draw_sample(fig, axarr, 0, 2, gb, f'd{mask_idx[1]}-logit/dx')
            self.__draw_sample(fig, axarr, 1, 1, ga * 2 + x, f'd{mask_idx[0]}-logit/dx')
            self.__draw_sample(fig, axarr, 1, 2, gb * 2 + x, f'd{mask_idx[1]}-logit/dx')

            trainer.logger.experiment.add_figure('confusing_imgs', fig, global_step=trainer.global_step)

    @staticmethod
    def __draw_sample(fig, axarr, row_idx, col_idx, img, title):
        im = axarr[row_idx, col_idx].imshow(img)
        fig.colorbar(im, ax=axarr[row_idx, col_idx])
        axarr[row_idx, col_idx].set_title(title, fontsize=20)
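The selection logic inside on_train_batch_end can be tried in isolation. The sketch below mirrors the `torch.topk` and masking steps of the callback on a few made-up logits; the tensor values and the helper names `close`, `batch_is_opinionated`, `confused_rows` and `confused_pairs` are invented here for illustration.

```python
import torch

# Toy logits for three inputs and four classes (values invented for illustration).
logits = torch.tensor([
    [6.00, 5.95, 0.1, 0.2],  # high logits, torn between classes 0 and 1
    [7.00, 1.00, 0.3, 0.1],  # high logit, clear winner
    [0.20, 0.10, 0.0, 0.9],  # low logits, clear winner
])

min_logit_value = 5.0
max_logit_difference = 0.1

# Two largest logits per row, with the class indices they belong to.
values, idxs = torch.topk(logits, k=2, dim=1)

# A row is "confused" when its two best logits are closer than the threshold.
close = (values[:, 0] - values[:, 1]).abs() < max_logit_difference

# Same gate as the callback: it looks at the largest logit of the whole batch.
batch_is_opinionated = bool(logits.max() > min_logit_value)
mask = close if batch_is_opinionated else torch.zeros_like(close)

confused_rows = mask.nonzero(as_tuple=True)[0].tolist()
confused_pairs = idxs[mask].tolist()
print(confused_rows, confused_pairs)
```

Only the first row is selected, together with the pair of classes (0 and 1) whose logits are nearly tied; these are the two classes the callback then computes input gradients for.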
However, we have made this easier: the callback is available once you install pytorch-lightning-bolts.
!pip install pytorch-lightning-bolts
from pytorch_lightning import Trainer
from pl_bolts.callbacks.vision import ConfusedLogitCallback
trainer = Trainer(callbacks=[ConfusedLogitCallback(1)])
Putting it all together
Finally, we can train our model and automatically generate these images whenever the logits are confused.
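import os
from pytorch_lightning import Trainer
from torch.utils.data import DataLoader, random_split
from torchvision import transforms
from torchvision.datasets import MNIST

# data
dataset = MNIST(os.getcwd(), download=True, transform=transforms.ToTensor())
train, val = random_split(dataset, [55000, 5000])

# model
model = LitClassifier()

# attach callback
trainer = Trainer(callbacks=[ConfusedLogitCallback(1)])

# train!
trainer.fit(model, DataLoader(train, batch_size=64), DataLoader(val, batch_size=64))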
TensorBoard will then automatically show figures like the following:
See whether your results look different.
Author: William Falcon
Full code: https://colab.research.google.com/drive/16HVAJHdCkyj7W43Q3ZChnxZ7DOwx6K5i?usp=sharing
deephub translation team