%matplotlib inline
Training a classifier
In the previous lecture you saw how to define a neural network, compute the loss, and update the network's weights.
You may now be wondering about the next step.
What about the data?
Generally, when you work with image, text, audio, or video data, you can use standard Python packages to load the data into a NumPy array.
You can then convert this array into a torch.*Tensor.
- For images, packages such as Pillow and OpenCV are useful
- For audio, packages such as scipy and librosa are useful
- For text, either raw Python or Cython-based loading works, or you can use NLTK and SpaCy
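As a minimal sketch of the array-to-tensor step described above, the snippet below uses a random array as a stand-in for an image loaded with Pillow or OpenCV. The array contents and variable names are illustrative assumptions, not part of the tutorial's dataset.

```python
import numpy as np
import torch

# Stand-in for a loaded image: height x width x channels, 8-bit values.
image_hwc = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)

# torch.from_numpy wraps the array without copying its data.
tensor_hwc = torch.from_numpy(image_hwc)

# Reorder to channels-first (C x H x W) and scale to floats in [0, 1].
tensor_chw = tensor_hwc.permute(2, 0, 1).float() / 255.0

print(tensor_chw.shape, tensor_chw.dtype)
```

The channels-first layout is the one the convolution layers later in this tutorial expect.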
Specifically for vision tasks, we have created a package called torchvision. It provides data loaders for common datasets such as ImageNet, CIFAR10, and MNIST, as well as data transformers for images, namely torchvision.datasets and torch.utils.data.DataLoader.
The torchvision package is a great convenience and also avoids writing boilerplate code.
In this tutorial we use the CIFAR10 dataset, which has the following 10 classes: 'airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck'. The images in CIFAR-10 are of size 3x32x32, i.e. 3-channel color images of 32x32 pixels.
Training an image classifier
We will do the following steps in order:
1. Load and normalize the CIFAR10 training and test sets using torchvision
2. Define a convolutional neural network
3. Define a loss function
4. Train the network on the training set
5. Test the network on the test set
1. Load and normalize CIFAR10
Using torchvision, it is easy to load CIFAR10.
import torch
import torchvision
import torchvision.transforms as transforms
The torchvision datasets output PILImage images in the range [0, 1]. We transform them into tensors normalized to the range [-1, 1].
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
trainset = torchvision.datasets.CIFAR10(root='./data',
                                        train=True,
                                        download=True,
                                        transform=transform)
trainloader = torch.utils.data.DataLoader(trainset,
                                          batch_size=4,
                                          shuffle=True,
                                          num_workers=2)
testset = torchvision.datasets.CIFAR10(root='./data',
                                       train=False,
                                       download=True,
                                       transform=transform)
testloader = torch.utils.data.DataLoader(testset,
                                         batch_size=4,
                                         shuffle=False,
                                         num_workers=2)
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data\cifar-10-python.tar.gz
Files already downloaded and verified
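To make the normalization arithmetic concrete, here is a small sketch that applies the same per-channel formula, (x - mean) / std with mean 0.5 and std 0.5, to a toy tensor using plain torch operations. The toy tensor is an assumption chosen so the results are easy to read.

```python
import torch

# Toy 3-channel "image": channel 0 is all 0.0, channel 1 all 0.5, channel 2 all 1.0.
x = torch.stack([torch.zeros(2, 2), torch.full((2, 2), 0.5), torch.ones(2, 2)])

# Per-channel mean and std, shaped to broadcast over height and width.
mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)

# Same formula as the normalization step: maps [0, 1] onto [-1, 1].
y = (x - mean) / std

# Inverse mapping back to [0, 1], useful before displaying an image.
x_restored = y / 2 + 0.5

print(y[:, 0, 0])
```

With these settings an input of 0 maps to -1, 0.5 maps to 0, and 1 maps to 1.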
We show some of the training images.
import matplotlib.pyplot as plt
import numpy as np
# Function to show an image
def imshow(img):
    img = img / 2 + 0.5  # unnormalize
    npimg = img.numpy()
    # Reorder from (channels, height, width) to (height, width, channels)
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
# Get a random batch of training images
dataiter = iter(trainloader)
images, labels = next(dataiter)
# Show the images
imshow(torchvision.utils.make_grid(images))
# Print the image labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
ship dog dog plane
2. Define a convolutional neural network
Copy the neural network from the earlier neural network tutorial and modify it to take 3-channel images as input.
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
net = Net()
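The input size 16 * 5 * 5 of the first fully connected layer follows from the layer shapes. The sketch below traces a dummy 3x32x32 input through standalone copies of the same convolution and pooling layers and records the shape after each step. The dummy input and the shapes list are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Standalone copies of the layers used in Net, with the same arguments.
conv1 = nn.Conv2d(3, 6, 5)
pool = nn.MaxPool2d(2, 2)
conv2 = nn.Conv2d(6, 16, 5)

shapes = []
x = torch.randn(1, 3, 32, 32)   # one dummy CIFAR10-sized image
x = conv1(x)                    # 5x5 kernel, no padding: 32 -> 28
shapes.append(tuple(x.shape))
x = pool(F.relu(x))             # 2x2 pooling: 28 -> 14
shapes.append(tuple(x.shape))
x = conv2(x)                    # 5x5 kernel, no padding: 14 -> 10
shapes.append(tuple(x.shape))
x = pool(F.relu(x))             # 2x2 pooling: 10 -> 5
shapes.append(tuple(x.shape))
flat = x.view(-1, 16 * 5 * 5)   # 16 channels of 5x5 -> 400 features
shapes.append(tuple(flat.shape))

for s in shapes:
    print(s)
```

If you change the kernel sizes or the number of channels, the flattened size changes with them, and the first nn.Linear layer has to be updated to match.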
3. Define a loss function and optimizer
We use the cross-entropy loss for classification and stochastic gradient descent (SGD) with momentum.
import torch.optim as optim
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
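As a small illustration of what the criterion computes, the sketch below compares nn.CrossEntropyLoss against a manual log-softmax followed by the mean negative log-likelihood of the target classes. The logits and targets are made-up values for a 3-class toy problem.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Made-up raw scores (logits) for 2 samples and 3 classes, with their true classes.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

# Built-in loss, averaged over the batch by default.
loss_builtin = nn.CrossEntropyLoss()(logits, targets)

# Manual version: log-softmax, then pick the log-probability of each target class.
log_probs = F.log_softmax(logits, dim=1)
loss_manual = -log_probs[torch.arange(len(targets)), targets].mean()

print(loss_builtin.item(), loss_manual.item())
```

Because the loss applies log-softmax itself, the network's last layer outputs raw scores and does not need a softmax of its own.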
4. Train the network
This is where things start to get interesting.
We simply loop over the data iterator, feed the inputs to the network, and optimize.
for epoch in range(2):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # Get the inputs
        inputs, labels = data
        # Zero the parameter gradients
        optimizer.zero_grad()
        # Forward pass, backward pass, optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        # Print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:  # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0
print('Finished Training')
[1, 2000] loss: 2.168
[1, 4000] loss: 1.848
[1, 6000] loss: 1.663
[1, 8000] loss: 1.573
[1, 10000] loss: 1.529
[1, 12000] loss: 1.458
[2, 2000] loss: 1.412
[2, 4000] loss: 1.390
[2, 6000] loss: 1.352
[2, 8000] loss: 1.317
[2, 10000] loss: 1.306
[2, 12000] loss: 1.299
Finished Training
5. Test the network on the test set
We have trained the network for two passes over the training set, but we need to check whether it has learned anything useful.
We check this by comparing the class label that the network predicts against the ground-truth label.
If the prediction is correct, we add the sample to the list of correct predictions.
As a first step, let us display some images from the test set to get familiar with them.
dataiter = iter(testloader)
images, labels = next(dataiter)
# Show the images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))
GroundTruth: cat ship ship plane
Now let us see what the neural network thinks the images above are:
outputs = net(images)
The outputs are energies for the 10 classes.
The higher the energy for a class, the more the network thinks the image belongs to that class. So let us get the index of the highest energy:
_, predicted = torch.max(outputs, 1)
print('Predicted: ', ' '.join('%5s' % classes[predicted[j]]
for j in range(4)))
Predicted: cat ship ship ship
The result looks fairly good: three of the four predictions match the ground truth.
Let us see how the network performs on the whole test set.
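For reference, torch.max with a dimension argument returns both the maximum values and their indices, which is why the code above discards the first result. A minimal sketch with made-up scores for 2 samples and 3 classes:

```python
import torch

# Made-up class scores: one row per sample, one column per class.
scores = torch.tensor([[0.1, 2.3, -0.5],
                       [1.7, 0.2, 0.9]])

# Reduce over dimension 1 (the class dimension).
values, indices = torch.max(scores, 1)

print(values)   # highest score in each row
print(indices)  # column (class index) where that score occurs
```

The indices are the predicted class labels. Here the first sample is assigned class 1 and the second class 0.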
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))
Accuracy of the network on the 10000 test images: 53 %
That is considerably better than chance, which is 10% accuracy (randomly picking one class out of 10).
It seems the network has learned something.
Which classes does it recognize well, and which does it recognize poorly?
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels)
        # Loop over the actual batch size rather than a hard-coded 4
        for i in range(labels.size(0)):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1
for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))
Accuracy of plane : 46 %
Accuracy of car : 61 %
Accuracy of bird : 33 %
Accuracy of cat : 39 %
Accuracy of deer : 43 %
Accuracy of dog : 54 %
Accuracy of frog : 76 %
Accuracy of horse : 47 %
Accuracy of ship : 75 %
Accuracy of truck : 60 %
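The per-class counting can also be written without a Python loop over samples. The sketch below is one possible variant using torch.bincount. The helper name per_class_accuracy and the toy labels are assumptions for illustration, not part of the tutorial code.

```python
import torch

def per_class_accuracy(predicted, labels, num_classes):
    """Return a tensor holding the fraction of correct predictions per class."""
    correct_mask = predicted == labels
    # Number of samples of each class.
    class_total = torch.bincount(labels, minlength=num_classes)
    # Number of correctly predicted samples of each class.
    class_correct = torch.bincount(labels[correct_mask], minlength=num_classes)
    # clamp avoids division by zero for classes with no samples.
    return class_correct.float() / class_total.clamp(min=1).float()

# Toy example with 3 classes.
labels = torch.tensor([0, 0, 1, 1, 2, 2, 2, 2])
predicted = torch.tensor([0, 1, 1, 1, 2, 0, 2, 2])
acc = per_class_accuracy(predicted, labels, 3)
print(acc)
```

In the toy example class 0 has 1 of 2 correct, class 1 has 2 of 2, and class 2 has 3 of 4.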
So what next?
How do we run these neural networks on the GPU?
Training on the GPU
Moving a neural network onto the GPU is as easy as moving a Tensor onto the GPU. The operation recursively traverses all modules and converts their parameters and buffers to CUDA tensors.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Check whether this machine supports CUDA, then print the selected device:
print(device)
The rest of this section assumes that device is a CUDA device.
The following call then recursively goes over all modules and converts their parameters and buffers to CUDA tensors:
net.to(device)
Remember that the inputs and targets must also be sent to the device at every step:
inputs, labels = inputs.to(device), labels.to(device)
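Putting the pieces together, a single device-aware training step might look like the sketch below. It uses a tiny stand-in model and a random batch instead of Net and the CIFAR10 loader, so those parts are assumptions. The pattern of moving the model once and each batch at every step is the point.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Falls back to the CPU when CUDA is not available.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Tiny stand-in model (not the tutorial's Net), moved to the device once.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Random batch standing in for one batch from the data loader.
inputs = torch.randn(4, 3, 32, 32)
labels = torch.randint(0, 10, (4,))

# Move every batch to the same device as the model before the forward pass.
inputs, labels = inputs.to(device), labels.to(device)

optimizer.zero_grad()
outputs = model(inputs)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()

print(outputs.device, loss.item())
```

If the model and a batch end up on different devices, the forward pass raises an error, so the per-batch .to(device) call belongs inside the training loop.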
Why don't we notice a large speedup compared with the CPU? Because the network is very small.
Exercise:
Try increasing the width of your network (the second argument of the first nn.Conv2d and the first argument of the second nn.Conv2d, which need to be the same number) and see what kind of speedup you get.
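One way to set up this exercise is to make the shared channel count a constructor argument. The class name WideNet and the width parameter below are hypothetical names introduced for this sketch. The layers are otherwise the same as in Net.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WideNet(nn.Module):
    """Variant of Net with a configurable number of channels after conv1."""

    def __init__(self, width=32):
        super().__init__()
        # width is both the output channels of conv1 and the input channels of conv2.
        self.conv1 = nn.Conv2d(3, width, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(width, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)

wide_net = WideNet(width=32)
out = wide_net(torch.randn(2, 3, 32, 32))
print(out.shape)
```

Because conv2 still outputs 16 channels, the fully connected layers are unchanged. With width=6 the model matches the original Net.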
Goals achieved:
- A deeper understanding of PyTorch's tensor library and neural networks
- Training a small neural network to classify images
Translator's note: in a later tutorial we train a real network that reaches a recognition rate of 90%.
Multi-GPU training
If you want an even greater speedup by using all of your GPUs, please see the data parallel processing tutorial.
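As a rough sketch only (the data parallel tutorial is the proper reference), wrapping a model in nn.DataParallel splits each input batch across the visible GPUs. The stand-in model and random batch below are assumptions, and the wrapper is applied only when more than one GPU is present.

```python
import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Tiny stand-in model (not the tutorial's Net).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

# Wrap the model only when several GPUs are available.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
model.to(device)

# Random batch standing in for real data, moved to the same device.
batch = torch.randn(8, 3, 32, 32).to(device)
outputs = model(batch)
print(outputs.shape)
```

On a machine with one GPU or none, the code runs the plain model, so the same script works everywhere.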
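Where to go next?
- Train neural networks to play video games (intermediate/reinforcement_q_learning)
- Train a state-of-the-art ResNet on ImageNet
- Train a face generator using generative adversarial networks
- Train a character-level language model using LSTM networks
- More examples
- More tutorials
- Discuss PyTorch on the forums
- Chat with other users on Slack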