[PyTorch] Getting Started

Introductory knowledge

This document is based on the official website's 60-minute introductory tutorial
"What is PyTorch"
"PyTorch Deep Learning: 60 Minutes Quick Start (Official Website Translation)" by Huang Haiguang.
Part of the content comes from "PyTorch Machine Learning: From Beginner to Practice", Machinery Industry Press.

Commonly used NumPy numerical operations and characteristics

Many operations in the PyTorch library can be compared with, and borrow from, their counterparts in the NumPy library. NumPy is itself a very important and powerful numerical-processing library in machine learning, so readers are encouraged to get familiar with its common operations first.

P.S. Here are some mind maps of the NumPy library that readers can use to check for gaps in their knowledge. If you have not used NumPy before, the link below is a quick way to get started:
"NumPy tutorial | Rookie Tutorial"

[Mind map: common NumPy operations]

Also, because one of the most important data structures in the NumPy library is the multi-dimensional array ndarray, many of its features and operations can in turn be compared with Python's built-in list type, so some list concepts are also summarized below.
P.S. This summary is very brief and only meant as a rough overview.

[Mind map: Python list concepts]
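
Since the mind maps may not display in this copy, here is a quick, non-exhaustive sketch of a few NumPy operations that the rest of this post leans on (array creation, shape manipulation, broadcast arithmetic, reductions, and slicing):

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])    # build an ndarray from a nested list
print(a.shape, a.dtype)                  # (2, 3) and the element type
print(a.reshape(3, 2))                   # change the shape without copying the data
print(a * 2 + 1)                         # element-wise (broadcast) arithmetic
print(a.sum(axis=0), a.mean())           # reduction along an axis / over all elements
print(a[0, 1:])                          # slicing works like (and beyond) Python lists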


What is PyTorch

PyTorch is a Python-based scientific computing package aimed at two groups of users:

  • Those who want a replacement for NumPy in order to use the computing power of GPUs
  • Researchers who want a deep learning platform that provides flexibility and speed



Autograd: automatic derivation

  1. Tensor and ndarray

The relationship between tensor and PyTorch is analogous to the relationship between ndarray and NumPy: the two are very similar in their types, declarations, definitions, and operations.

[Similar operations] (example)

x = torch.empty(m,n)                    # construct an uninitialized m x n matrix
x = torch.rand(m,n)                     # construct a randomly initialized m x n matrix
x = torch.zeros(m,n,dtype = torch.long) # construct an m x n matrix filled with zeros, with long elements
x = torch.tensor([1,2,3])               # construct a tensor directly from existing data

[Slightly different operations] (example)

x = torch.rand(m,n)
y = x.view(s,t)   # reshape the m x n matrix x into an s x t matrix; view returns a new tensor that shares the data

x = torch.rand(1)
x.item()          # .item() returns the value of a single-element tensor as a plain Python number
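
Not covered by the snippets above but useful when comparing the two libraries: a tensor and an ndarray can be converted into each other, and on the CPU the converted pair shares the same underlying memory. A minimal sketch:

import numpy as np
import torch

a = np.ones((2,3))
t = torch.from_numpy(a)   # ndarray -> tensor (shares memory with a)
b = t.numpy()             # tensor -> ndarray (shares memory with t)

a += 1                    # an in-place change on one side is visible on the other
print(t)                  # all elements are now 2.0
print(b)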
  2. Tensor and gradient
    [Figure: tensor and gradient overview (the pink box marks the key points)]

The pink box in the figure above marks the points that need extra attention; some explanations are given below.

[1]: The .grad_fn attribute records, for each tensor, the operation (Function) that produced it.

  • A tensor created directly by the user has no grad_fn, because it was not produced by a computation
  • grad_fn differs depending on the type of operation that produced the tensor
  • If requires_grad is not set to True, the resulting tensors will not get a grad_fn either, because the computation history is not being tracked
import torch
x = torch.ones(2,2)
y = x + 2
z = x * 3
print(x.grad_fn)
print(y.grad_fn)
print(z.grad_fn)
#None
#None
#None

x = torch.ones(2,2,requires_grad = True)
y = x + 2
z = x * 3
print(x.grad_fn)
print(y.grad_fn)
print(z.grad_fn)
#None
#<AddBackward0 object at 0x000002392660F240>
#<MulBackward0 object at 0x000002392660F2B0>

[2]: Two ways to keep a tensor from tracking its computation history (and from using memory for it):

  • Do not explicitly set the .requires_grad attribute to True when creating the tensor
  • Wrap the code block with torch.no_grad()
import torch
x = torch.ones(2,2,requires_grad = True)
print(x.requires_grad)
print((x*2).requires_grad)

with torch.no_grad():
  print((x*2).requires_grad)

#True
#True
#False
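
As a small supplement to the figure above (not reproduced here): once a computation graph has been recorded, calling .backward() on a scalar result fills in the .grad attribute of every leaf tensor that has requires_grad=True. A minimal sketch:

import torch

x = torch.ones(2,2,requires_grad = True)
y = x + 2
z = (y * y * 3).mean()   # a scalar, so backward() needs no extra arguments

z.backward()             # back-propagate through the recorded graph
print(x.grad)            # dz/dx = 6 * (x + 2) / 4 = 4.5 for every element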

Neural Networks

Neural networks differ in their depth and structure. For a simple feedforward network with an input layer, a hidden layer, and an output layer, it is enough to understand forward computation, back-propagation, the loss function, and the activation function.

For details, please refer to the blog post "[PyTorch] Deep Learning Foundation: Neural Network"

1. PyTorch implementation of a single-layer neural network

Building on the basic knowledge from the previous blog post, we will build and train a neural network with only one linear hidden layer.

The steps of neural network training:

  • Data preparation

Data preparation means converting the data we want to use into a format that PyTorch can handle: encapsulating the data set and splitting the structured data into batches.
① torch.utils.data.Dataset provides the data encapsulation.
This class is the parent class of every data set that needs to be loaded. When defining a subclass you need to override the __len__ and __getitem__ functions: the former returns the size of the data set, and the latter implements indexing into the data set (a minimal sketch is given after this list).

② torch.utils.data.DataLoader handles data loading, sampling, and iterator generation.

class torch.utils.data.DataLoader(dataset,batch_size = 1,shuffle = False, sampler = None,batch_sampler = None,num_workers = 0,collate_fn = <function default_collate>, pin_memory = False,drop_last = False)

  • dataset: a Dataset instance; the data set to be loaded
  • batch_size: how many samples to load per batch; the default value is 1
  • shuffle: whether the data should be reshuffled at every epoch
  • sampler: the strategy for drawing samples from the data set
  • batch_sampler: like sampler, except that it returns a batch of indices at a time
  • num_workers: the number of subprocesses used to load the data; the default value 0 means the data is loaded in the main process
  • collate_fn: merges a list of samples into a mini-batch
  • pin_memory: if set to True, the data loader copies the tensors into CUDA pinned memory before returning them
  • drop_last: if set to True, the last incomplete batch is discarded
  • Construct the network structure and initialize the weights
  • Choose the activation function
  • Perform the forward computation
  • Choose the loss function and compute the loss value
  • Back-propagate and update the parameters
    Repeat the forward computation and back-propagation steps above until convergence or until the termination condition is reached.
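
The Iris example below loads the whole data set into memory at once, so it does not use a DataLoader; as a hedged, stand-alone sketch of the Dataset/DataLoader mechanics described above (the class name and the random data here are made up purely for illustration):

import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):                    # hypothetical example data set
    def __init__(self, n = 100):
        self.x = torch.randn(n, 4)            # n samples with 4 features each
        self.y = torch.randint(0, 3, (n,))    # 3 classes

    def __len__(self):                        # size of the data set
        return len(self.x)

    def __getitem__(self, idx):               # index into the data set
        return self.x[idx], self.y[idx]

loader = DataLoader(ToyDataset(), batch_size = 16, shuffle = True)
for batch_x, batch_y in loader:               # iterate over mini-batches
    print(batch_x.shape, batch_y.shape)       # torch.Size([16, 4]) torch.Size([16])
    break

The full single-layer network example on the Iris data set now follows.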
import torch
import matplotlib.pyplot as plt
import torch.nn.functional as F
from sklearn.datasets import load_iris
from torch.autograd import Variable
from torch.optim import SGD

# Check dynamically whether a GPU is available, so the code can move between processors
use_cuda = torch.cuda.is_available()
print("use_cuda: ", use_cuda)

# Load the data set; since it ships with the module, loading it is all that is needed
iris = load_iris()
print(iris.keys())

# Preprocessing: separate the inputs and targets from the data set and wrap them as Variables
x = iris['data']
y = iris['target']
x = torch.FloatTensor(x)
y = torch.LongTensor(y)

x, y = Variable(x), Variable(y)

# Define the neural network model
'''
Custom models in PyTorch need to inherit from Module and override the forward method
to implement the forward computation
'''
class Net(torch.nn.Module):
    # The initializer takes the input feature dimension, the hidden layer dimension
    # and the output layer dimension
    def __init__(self, n_feature, n_hidden, n_output):
        super(Net, self).__init__()
        self.hidden = torch.nn.Linear(n_feature, n_hidden)    # linear hidden layer
        self.predict = torch.nn.Linear(n_hidden, n_output)    # linear output layer

    # Override the forward pass
    def forward(self, x):
        x = F.sigmoid(self.hidden(x))
        x = self.predict(x)
        out = F.log_softmax(x, dim = 1)
        return out

# Instantiate the network and print it to inspect the structure
net = Net(n_feature = 4, n_hidden = 5, n_output = 4)
print(net)
# For the iris data set the input features must be 4-dimensional; the hidden layer size
# can be chosen freely (iris has 3 classes, so the output size must be at least 3)

# If a GPU is available, move the data and the model onto it
if use_cuda:
    x = x.cuda()
    y = y.cuda()
    net = net.cuda()

# Define the optimizer used to train the network, with a learning rate of 0.5
optimizer = SGD(net.parameters(), lr = 0.5)

# Training
px, py = [], []    # used to record the data to plot
for i in range(1000):
    # Feed the data through the network (forward computation)
    prediction = net(x)

    # Compute the loss
    loss = F.nll_loss(prediction, y)

    # Clear the accumulated gradients
    optimizer.zero_grad()

    # Back-propagate the loss
    loss.backward()

    # Update the parameters
    optimizer.step()

    # Print the loss at every iteration during training
    print(i, " loss: ", loss.data.item())
    px.append(i)
    py.append(loss.data.item())

    # Plot the training curve every 10 iterations
    if i % 10 == 0:
        plt.cla()
        plt.plot(px, py, 'r-', lw = 1)
        plt.text(0, 0, 'Loss = %.4f' % loss.data.item(),
                 fontdict = {'size': 20, 'color': 'red'})
        plt.pause(0.1)

2. The call mechanism of neural networks in PyTorch

[Figure: the call mechanism of an nn.Module]
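
The figure is not reproduced here; the essential point it illustrates is that a model is used by calling the instance itself, e.g. net(x). nn.Module.__call__ then dispatches to the forward method you overrode (running any registered hooks along the way), which is why forward is the only method a custom module has to implement. A minimal sketch:

import torch
import torch.nn as nn

class Tiny(nn.Module):                 # hypothetical toy module for illustration
    def __init__(self):
        super(Tiny, self).__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):              # the overridden forward pass
        return self.fc(x)

net = Tiny()
x = torch.randn(3, 4)
print(torch.equal(net(x), net.forward(x)))   # True: net(x) goes through __call__ -> forward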
3. PyTorch builds a neural network classifier

For details about the loss function, objective function and optimizer, please refer to the following blog post
"[PyTorch] Deep Neural Network and Training"

Use PyTorch to build a neural network consisting of an input layer, a fully connected hidden layer, and an output layer to classify the MNIST data set.

Readers can use the following code to familiarize themselves with the characteristics and parameter configuration of the network.

(1) Configuration parameters

'''
Libraries and configuration parameters
'''
import torch
import torch.nn as nn
import torchvision.datasets as dsets
import torchvision.transforms as transforms
from torch.autograd import Variable

# Hyperparameters
torch.manual_seed(1)    # set the random seed so that the results are reproducible
input_size = 784
hidden_size = 500
num_classes = 10
num_epochs = 5          # number of training epochs
batch_size = 100        # batch size
learning_rate = 0.001

(2) Load the data set

'''
Load the MNIST data set
'''
train_dataset = dsets.MNIST(root = './data',train = True,transform = transforms.ToTensor(),download = True)
# Download the data set (download) and use it as the training set (train); root specifies where the data
# is kept, and the transform rescales the values from [0,255] to [0,1.0]
test_dataset = dsets.MNIST(root = './data',train = False,transform = transforms.ToTensor())
# Keep a separate test set, which needs the same value transformation

(3) Batch processing settings

'''
Batch settings for the data
'''
# DataLoader (input pipeline)
# For the training set, shuffle is set to True so that the sample order is random
train_loader = torch.utils.data.DataLoader(dataset = train_dataset,batch_size = batch_size,shuffle = True)
test_loader = torch.utils.data.DataLoader(dataset = test_dataset,batch_size = batch_size,shuffle = False)

(4) Build DNN model

'''
Build the DNN model
'''
class Net(nn.Module):
    def __init__(self,input_size,hidden_size,num_classes):
        super(Net,self).__init__()
        self.fc1 = nn.Linear(input_size,hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size,num_classes)

    def forward(self,x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out

net = Net(input_size,hidden_size,num_classes)
# Print the model to show the network structure
print(net)

[Output]
Net(
(fc1): Linear(in_features=784, out_features=500, bias=True)
(relu): ReLU()
(fc2): Linear(in_features=500, out_features=10, bias=True)
)

(5) Model training

'''
Model training: wrap both the images and the labels in Variable, then feed them through the model
'''
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(),lr = learning_rate)

# Start training
for epoch in range(num_epochs):
    for i,(images,labels) in enumerate(train_loader):    # iterate over the batches
        # Convert the tensors into Variables
        images = Variable(images.view(-1,28*28))
        labels = Variable(labels)

        # Forward computation + back-propagation + optimizer step
        optimizer.zero_grad()    # clear the gradients

        outputs = net(images)
        loss = criterion(outputs,labels)
        loss.backward()          # back-propagate and compute the gradients
        optimizer.step()         # update the parameters

        if (i+1) % 100 == 0:
            print('Epoch [%d/%d],Step[%d/%d],Loss: %.4f'%(epoch+1,num_epochs,i+1,len(train_dataset)//batch_size,loss.data.item()))

[Output]
Epoch [1/5],Step[100/600],Loss: 0.1175
Epoch [1/5],Step[200/600],Loss: 0.1390
Epoch [1/5],Step[300/600],Loss: 0.2715
Epoch [1/5],Step[400/600],Loss: 0.1825
Epoch [1/5],Step[500/600],Loss: 0.1173
Epoch [1/5],Step[600/600],Loss: 0.1280
Epoch [2/5],Step[100/600],Loss: 0.1060
Epoch [2/5],Step[200/600],Loss: 0.0543
Epoch [2/5],Step[300/600],Loss: 0.1447
Epoch [2/5],Step[400/600],Loss: 0.1715
Epoch [2/5],Step[500/600],Loss: 0.0646
Epoch [2/5],Step[600/600],Loss: 0.0643
Epoch [3/5],Step[100/600],Loss: 0.1027
Epoch [3/5],Step[200/600],Loss: 0.0191
Epoch [3/5],Step[300/600],Loss: 0.0442
Epoch [3/5],Step[400/600],Loss: 0.0599
Epoch [3/5],Step[500/600],Loss: 0.0470
Epoch [3/5],Step[600/600],Loss: 0.0422
Epoch [4/5],Step[100/600],Loss: 0.0448
Epoch [4/5],Step[200/600],Loss: 0.1024
Epoch [4/5],Step[300/600],Loss: 0.0436
Epoch [4/5],Step[400/600],Loss: 0.0686
Epoch [4/5],Step[500/600],Loss: 0.0393
Epoch [4/5],Step[600/600],Loss: 0.0179
Epoch [5/5],Step[100/600],Loss: 0.0292
Epoch [5/5],Step[200/600],Loss: 0.0473
Epoch [5/5],Step[300/600],Loss: 0.0563
Epoch [5/5],Step[400/600],Loss: 0.0552
Epoch [5/5],Step[500/600],Loss: 0.0352
Epoch [5/5],Step[600/600],Loss: 0.0244

(6) Verification on the test set

'''
Measure the recognition accuracy on the test set
'''
correct = 0
total = 0
for images,labels in test_loader:
    images = Variable(images.view(-1,28*28))
    outputs = net(images)
    _,predicted = torch.max(outputs.data,1)    # take the predicted class
    total += labels.size(0)                    # total number of test samples
    correct += (predicted == labels).sum()     # running count of correct predictions
print('Accuracy of the network on the 10000 test images: %d %%' % (100*correct // total))

[Output]
Accuracy of the network on the 10000 test images: 97%
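
One small remark that ties back to the Autograd section: the test loop above still records a computation graph for every forward pass. Wrapping the evaluation in torch.no_grad() avoids that, and calling net.eval() would matter as soon as layers such as dropout or batch normalization are involved. A hedged variant of the same loop:

net.eval()                                   # switch the layers to evaluation mode
correct, total = 0, 0
with torch.no_grad():                        # no graph or gradients are needed for testing
    for images, labels in test_loader:
        outputs = net(images.view(-1, 28*28))
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print('Accuracy of the network on the 10000 test images: %d %%' % (100 * correct // total))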

Origin blog.csdn.net/kodoshinichi/article/details/109276291