PyTorch practice: implementing CIFAR-10 color image classification

Table of contents

Preface

1. The CIFAR-10 dataset

class torch.utils.data.Dataset

 torch.utils.data.DataLoader

2. Define the Neural Network

Ordinary neural network:

Define the loss function and optimizer

Training the network: Net

CPU training

Model accuracy

GPU training

Training the network: LeNet

Model accuracy


Preface

PyTorch is arguably the most beginner-friendly of the three mainstream frameworks; its simplicity and ease of use make it the first choice for newcomers. One point worth emphasizing: a framework is like a programming language, merely a tool for realizing a project, the wheels we use to build the car. What we should focus on is how to use Torch to implement functionality, not on how the wheels are made, which would cost far too much study time. A later series of articles will cover deep learning frameworks in detail, but only once we are more familiar with the theory and practice of deep learning. At this stage, what matters most is learning how to use these tools.

Deep learning is not easy to master: it involves a great deal of mathematical theory and formula derivation, and without hands-on practice it is hard to see what role the code we write actually plays inside a neural network. I will do my best to simplify the material into terms we already know, keep the reasoning smooth, and avoid excessive mathematical formulas and specialized theory, so that you can understand and implement the algorithm in a single article and become proficient in the most efficient way.


The author has focused on data modeling for four years, has taken part in dozens of mathematical modeling contests large and small, and understands the principles of the common models, their modeling workflows, and the usual methods of problem analysis. The goal of this column is to get you using mathematical models, machine learning, deep learning, and code quickly from scratch; every article includes a hands-on project and runnable code. For each mathematical modeling contest, the latest ideas, detailed reasoning, and complete code are written up in this column. Readers who need it should not miss the column: Quick Learning in One Article - Commonly Used Models in Mathematical Modeling.


1. The CIFAR-10 dataset

CIFAR-10 is one of the benchmark datasets most widely used for testing and validating image classification algorithms. Its relatively small size and rich diversity have made it popular with researchers, and many studies and papers use CIFAR-10 to evaluate model performance. The dataset contains ten categories:

  1. airplane
  2. automobile
  3. bird
  4. cat
  5. deer
  6. dog
  7. frog
  8. horse
  9. ship
  10. truck

The dataset is split into a training set of 50,000 images and a test set of 10,000 images. Each image is a 3-channel color image with a resolution of 32*32, i.e. a 3*32*32 tensor. There is also a CIFAR-100 dataset; since it differs from CIFAR-10 mainly in the number of categories, only the smaller CIFAR-10 is introduced here, to illustrate the general approach to image classification with PyTorch.

Official download website: CIFAR-10 and CIFAR-100 datasets

Use torch.utils.data to load data:

import numpy as np
import torch
import torchvision.transforms as transforms
import os
from torch.utils.data import DataLoader
from torchvision.transforms import ToPILImage
show = ToPILImage()  # converts a Tensor back to a PIL Image for easy visualization
import torchvision.datasets as dsets
batch_size = 100
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"  # workaround for a duplicate-OpenMP-runtime error on some systems

# Define the preprocessing applied to each image
transform = transforms.Compose([
        transforms.ToTensor(),  # convert to a Tensor, scaling pixel values to (0, 1)
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),  # apply (x - mean) / std per channel, mapping each element to (-1, 1)
                             ])

# CIFAR-10 dataset
train_dataset = dsets.CIFAR10(root='/ml/pycifar',
                              train = True,
                              download = True,
                              transform=transform
                             )
test_dataset = dsets.CIFAR10(root='/ml/pycifar',
                             train = False,
                             download = True,
                             transform=transform
                            )
# Load the data in batches
train_loader=torch.utils.data.DataLoader(dataset=train_dataset,
                                         batch_size = batch_size,
                                         shuffle = True
                                        )
test_loader = torch.utils.data.DataLoader(dataset = test_dataset,
                                          batch_size=batch_size,
                                          shuffle=True
                                         )

Data Display:

import matplotlib.pyplot as plt

fig = plt.figure()
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
for i in range(12):
    plt.subplot(3, 4, i + 1)
    plt.tight_layout()
    (_, label) = train_dataset[i]
    plt.imshow(train_loader.dataset.data[i])  # show the raw uint8 HxWxC image, before normalization
    plt.title("Labels: {}".format(classes[label]))
    plt.xticks([])
    plt.yticks([])
plt.show()

The dataset ships as five training batches and one test batch, each with 10,000 images. The test batch contains exactly 1,000 randomly selected images from each category; the training batches contain the remaining images in random order, so an individual training batch may hold more images of one class than another. Taken together, the training batches contain exactly 5,000 images of each category.
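This is easy to verify: in recent torchvision versions the CIFAR10 object keeps its labels in the targets list, so a quick, optional check of the split sizes and per-class counts (using the train_dataset and test_dataset created above) looks like:

import numpy as np

print(len(train_dataset), len(test_dataset))  # 50000 10000
print(np.bincount(train_dataset.targets))     # [5000 5000 ... 5000], one entry per class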

We have now loaded the dataset successfully; note that the preprocessing above already normalized the data. Before going further, let us look at the two basic data-loading classes we need to know, Dataset and DataLoader:

class torch.utils.data.Dataset

PyTorch's Dataset is an abstract class used to represent datasets. It lets you customize how data is loaded for use during training and testing.

  1. Data loading and preprocessing: Dataset lets you customize how data is loaded, from files, databases, the network, or other sources. Preprocessing and conversion can be implemented in the __getitem__ method to meet the model's input requirements.

  2. Index support: by implementing __getitem__, samples can be fetched from a Dataset by index (e.g. dataset[i]). This lets you read data on demand, which suits large datasets and avoids loading everything at once.

  3. Total sample count: len(dataset) returns the total number of samples in the dataset, which makes it easy to set an appropriate number of iterations during training.

  4. Iterable: a Dataset can be iterated like a Python list, so you can use for loops over the dataset.

  5. Used in conjunction with DataLoader: a Dataset is usually combined with PyTorch's DataLoader, which loads the data into the model in batches.

  6. Custom datasets: you can inherit from the Dataset class and create custom datasets according to your needs.

A minimal custom dataset can be written as:

from torch.utils.data import Dataset

class CustomDataset(Dataset):
    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        sample = self.data[idx]
        # preprocessing or conversion could be done here
        return sample

# Usage example
data = [1, 2, 3, 4, 5]
custom_dataset = CustomDataset(data)
print(len(custom_dataset))  # output: 5
print(custom_dataset[2])  # output: 3
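To see point 5 in action, the custom dataset can be wrapped directly in a DataLoader; the default collate function stacks the integer samples into tensors (a minimal sketch using the CustomDataset above):

from torch.utils.data import DataLoader

loader = DataLoader(custom_dataset, batch_size=2, shuffle=False)
for batch in loader:
    print(batch)  # tensor([1, 2]), then tensor([3, 4]), then tensor([5])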

torch.utils.data.DataLoader

torch.utils.data.DataLoader is a little more involved: torch exposes many parameters, and the commonly used ones are worth mastering:

  1. dataset (Dataset): the dataset to load, typically a custom dataset object inheriting from torch.utils.data.Dataset.

  2. batch_size (int, optional): Number of samples in each batch. The default value is 1.

  3. shuffle (bool, optional): Whether to randomly shuffle the data at the beginning of each epoch. The default value is False.

  4. sampler (Sampler, optional): defines the strategy for sampling samples from the dataset. If this parameter is specified, the shuffle parameter is ignored.

  5. batch_sampler (Sampler, optional): Similar to sampler, but returns a batch index list.

  6. num_workers (int, optional): Number of child processes used for data loading. The default value is 0, which means all data will be loaded in the main process. Setting a value greater than 0 will start a corresponding number of child processes to load data, which can speed up data loading.

  7. collate_fn (callable, optional): Function used to pack samples into a batch. Typically used when the input data has different sizes.

  8. pin_memory (bool, optional): If True, the data loader will store the data in CUDA pinned memory, which can accelerate data transfer to the GPU. The default value is False.

  9. drop_last (bool, optional): if True, the last batch is dropped when its size is less than batch_size. The default value is False.

  10. timeout (numeric, optional): how long (in seconds) to wait for a batch from the workers before raising an error. The default value is 0, which means wait indefinitely until the data is ready.

  11. worker_init_fn (callable, optional): Each worker will call this function before starting to load data. Can be used to initialize some specific settings in the worker.

These parameters can be adjusted to suit the dataset and the training setup; for example, batch_size and num_workers can be chosen according to the dataset size and model structure to obtain the best training performance.
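As an illustration, a typical configuration combining these parameters might look like the following (the values are assumptions chosen for demonstration, not recommendations):

import torch
from torch.utils.data import DataLoader

loader = DataLoader(
    dataset=train_dataset,                 # the CIFAR-10 training set from above
    batch_size=100,                        # 100 samples per batch
    shuffle=True,                          # reshuffle at the start of each epoch
    num_workers=2,                         # two worker subprocesses for loading
    pin_memory=torch.cuda.is_available(),  # pinned memory speeds up host-to-GPU copies
    drop_last=False,                       # keep the final, possibly smaller batch
)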

2. Define the Neural Network

We can use two networks and compare their classification ability: a plain fully connected Net, and the LeNet convolutional network:

Ordinary neural network:

import torch.nn as nn
import torch

input_size = 3072     # 3*32*32
hidden_size1 = 500    # units in the first hidden layer
hidden_size2 = 200    # units in the second hidden layer
num_classes = 10      # number of classes
num_epochs = 5        # number of epochs
batch_size = 100      # batch size
learning_rate = 1e-3

# Define a fully connected network with two hidden layers
class Net(nn.Module):
    def __init__(self, input_size, hidden_size1, hidden_size2, num_classes):
        super(Net, self).__init__()
        self.layer1 = nn.Linear(input_size, hidden_size1)    # input layer
        self.layer2 = nn.Linear(hidden_size1, hidden_size2)  # hidden layers
        self.layer3 = nn.Linear(hidden_size2, num_classes)   # output layer

    def forward(self, x):
        out = torch.relu(self.layer1(x))    # hidden layer 1
        out = torch.relu(self.layer2(out))  # hidden layer 2
        out = self.layer3(out)
        return out

net = Net(input_size, hidden_size1, hidden_size2, num_classes)


Net(
  (layer1): Linear(in_features=3072, out_features=500, bias=True)
  (layer2): Linear(in_features=500, out_features=200, bias=True)
  (layer3): Linear(in_features=200, out_features=10, bias=True)
)
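As a quick sanity check (a sketch, using the net instance above), the number of trainable parameters can be counted directly; for this architecture it works out to 3072*500 + 500 + 500*200 + 200 + 200*10 + 10 = 1,638,710:

num_params = sum(p.numel() for p in net.parameters() if p.requires_grad)
print(num_params)  # 1638710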

Define loss function and optimizer

from torch import optim
criterion = nn.CrossEntropyLoss()  # cross-entropy loss
optimizer = optim.SGD(net.parameters(), lr=learning_rate)
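Note that nn.CrossEntropyLoss expects raw logits of shape (N, num_classes) and integer class indices of shape (N,); it applies log-softmax internally, which is why the network above has no softmax layer. A minimal demonstration with made-up values:

logits = torch.randn(4, 10)           # a batch of 4 samples, 10 classes
targets = torch.tensor([3, 0, 9, 1])  # class indices, not one-hot vectors
print(criterion(logits, targets))     # a scalar loss tensor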

Training the network: Net

The training process of a neural network almost always follows the same pattern:

  • Input data
  • Forward propagation + back propagation
  • Update parameters

CPU training

PyTorch runs on the CPU by default:

batch_size = 1000  # note: this has no effect here; train_loader was built with batch_size = 100
for epoch in range(num_epochs):
    print('current epoch + %d' % epoch)
    running_loss = 0.0
    for i, (images, labels) in enumerate(train_loader, 0):
        images = images.view(images.size(0), -1)  # flatten each 3x32x32 image into a 3072-vector
        labels = labels.long()
        # zero the gradients
        optimizer.zero_grad()
        outputs = net(images)  # forward pass through the network
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if i % 1000 == 0:  # print the training status every 1000 batches
            print('[%d, %5d] loss: %.3f' \
                  % (epoch + 1, i + 1, running_loss))
            running_loss = 0.0
print('Finished Training')

current epoch + 0
[1,     0] loss: 2.149
current epoch + 1
[2,     0] loss: 2.025
current epoch + 2
[3,     0] loss: 1.987
current epoch + 3
[4,     0] loss: 2.020
current epoch + 4
[5,     0] loss: 1.970
Finished Training

Model accuracy

# prediction on the test set
total = 0
correct = 0
acc_list_test = []
with torch.no_grad():  # gradients are not needed for evaluation
    for images, labels in test_loader:
        images = images.view(images.size(0), -1)
        outputs = net(images)  # forward pass

        _, predicts = torch.max(outputs.data, 1)  # predicted class = index of the largest logit
        total += labels.size(0)
        correct += (predicts == labels).sum().item()
        acc_list_test.append(100 * correct / total)

print('Accuracy = %.2f' % (100 * correct / total))
plt.plot(acc_list_test)
plt.xlabel('Batch')  # accuracy is accumulated batch by batch over the test set
plt.ylabel('Accuracy On TestSet')
plt.show()

The accuracy is only 33.06%. For comparison, guessing a class at random would be right about 10% of the time, so the network has learned something, but there is clearly room for improvement. Let's try a convolutional network.

GPU training

To train on the GPU, you only need to switch the device to the GPU and explicitly move the tensors and the model there:

batch_size = 1000  # note: this has no effect here; train_loader was built with batch_size = 100
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net_gpu = Net(input_size, hidden_size1, hidden_size2, num_classes)
net_gpu.to(device)
# the optimizer must be rebuilt so it references net_gpu's parameters,
# not those of the CPU model trained above
optimizer = optim.SGD(net_gpu.parameters(), lr=learning_rate)

for epoch in range(num_epochs):
    print('current epoch + %d' % epoch)
    running_loss = 0.0
    for i, (images, labels) in enumerate(train_loader, 0):
        # move each batch to the same device as the model
        images = images.view(images.size(0), -1).to(device)
        labels = labels.long().to(device)
        # zero the gradients
        optimizer.zero_grad()
        outputs = net_gpu(images)  # forward pass
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if i % 1000 == 0:  # print the training status every 1000 batches
            print('[%d, %5d] loss: %.3f' \
                  % (epoch + 1, i, running_loss))
            running_loss = 0.0
print('Finished Training')

The result is essentially the same as the CPU run, so the output is not repeated here.
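To confirm where the model and a batch actually live, you can inspect the .device attribute (a quick optional check, using the net_gpu and device defined above):

x = torch.randn(2, input_size).to(device)
print(next(net_gpu.parameters()).device)  # cuda:0, or cpu if no GPU is available
print(x.device)                           # same device as the model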

Training the network: LeNet

Let's now train a convolutional network and compare the results. The treatment of convolutions here may feel a bit abrupt; a later article will cover convolutional neural networks in detail. For now it is enough to know that different architectures perform differently.

import torch.nn as nn
import torch.nn.functional as F

input_size = 3072     # 3*32*32
hidden_size1 = 500    # units in the first hidden layer
hidden_size2 = 200    # units in the second hidden layer
num_classes = 10      # number of classes
num_epochs = 5        # number of epochs
batch_size = 100      # batch size
learning_rate = 1e-3

class LeNet(nn.Module):
    def __init__(self, input_size, hidden_size1, hidden_size2, num_classes):
        super(LeNet, self).__init__()
        # convolution layer: 3 input channels (RGB), 6 output channels, 5*5 kernel
        self.conv1 = nn.Conv2d(3, 6, 5)
        # convolution layer: 6 input channels, 16 output channels, 5*5 kernel
        self.conv2 = nn.Conv2d(6, 16, 5)
        # affine/fully connected layers, y = Wx + b;
        # after the two conv/pool stages a 3x32x32 input is reduced to
        # 16*5*5 = 400 features, so fc1 must take 400 inputs (not input_size = 3072)
        self.fc1 = nn.Linear(16 * 5 * 5, hidden_size1)
        self.fc2 = nn.Linear(hidden_size1, hidden_size2)
        self.fc3 = nn.Linear(hidden_size2, num_classes)

    def forward(self, x):
        # convolution -> activation -> pooling
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        # reshape; '-1' infers the flattened size automatically
        x = x.view(x.size()[0], -1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = LeNet(input_size, hidden_size1, hidden_size2, num_classes)

LeNet(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=500, bias=True)
  (fc2): Linear(in_features=500, out_features=200, bias=True)
  (fc3): Linear(in_features=200, out_features=10, bias=True)
)

The above is the structure of the whole convolutional network. The short shape trace below shows where the 400 in fc1 comes from; after that, we train directly and look at the effect:
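Each 5*5 convolution trims 4 pixels from each spatial dimension and each 2*2 max-pool halves the resolution, so for a 3x32x32 input (a quick sketch using the net instance above):

x = torch.randn(1, 3, 32, 32)
x = F.max_pool2d(F.relu(net.conv1(x)), 2)  # conv1 -> 6x28x28, pool -> 6x14x14
x = F.max_pool2d(F.relu(net.conv2(x)), 2)  # conv2 -> 16x10x10, pool -> 16x5x5
print(x.view(1, -1).shape)                 # torch.Size([1, 400])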

batch_size = 1000  # note: this has no effect here; train_loader was built with batch_size = 100
# rebuild the optimizer for the new network: the optimizer created earlier still
# references the old Net's parameters, so reusing it would leave LeNet untrained
optimizer = optim.SGD(net.parameters(), lr=learning_rate)

for epoch in range(num_epochs):
    print('current epoch + %d' % epoch)
    running_loss = 0.0
    for i, (images, labels) in enumerate(train_loader, 0):
        # no flattening here: the conv layers take the 3x32x32 images directly
        labels = labels.long()
        # zero the gradients
        optimizer.zero_grad()
        outputs = net(images)  # forward pass
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if i % 1000 == 0:  # print the training status every 1000 batches
            print('[%d, %5d] loss: %.3f' \
                  % (epoch + 1, i, running_loss))
            running_loss = 0.0
print('Finished Training')

current epoch + 0
[1,     0] loss: 2.310
current epoch + 1
[2,     0] loss: 2.301
current epoch + 2
[3,     0] loss: 2.305
current epoch + 3
[4,     0] loss: 2.303
current epoch + 4
[5,     0] loss: 2.304
Finished Training

Model accuracy

# prediction on the test set
total = 0
correct = 0
acc_list_test = []
with torch.no_grad():  # gradients are not needed for evaluation
    for images, labels in test_loader:
        # no flattening: the images go straight into the conv layers
        outputs = net(images)  # forward pass

        _, predicts = torch.max(outputs, 1)  # predicted class = index of the largest logit
        total += labels.size(0)
        correct += (predicts == labels).sum().item()
        acc_list_test.append(100 * correct / total)

print('Accuracy = %.2f' % (100 * correct / total))
plt.plot(acc_list_test)
plt.xlabel('Batch')  # accuracy is accumulated batch by batch over the test set
plt.ylabel('Accuracy On TestSet')
plt.show()

 

The convolutional network does much better than the ordinary fully connected one.

Please follow so you don't lose track of this series. If there are any mistakes, please leave a message and point them out. Thank you very much.

That's all for this issue. My name is fanstuck; if you have any questions, feel free to leave a message for discussion. See you in the next issue.


Origin: blog.csdn.net/master_hunter/article/details/133064410