The neural network package nn and the optimizer optim

import torch
import torch.nn as nn
torch.__version__
'1.7.0+cu101'

torch.nn is a modular interface designed specifically for neural networks. nn is built on top of autograd and can be used to define and run neural networks.

# Convention: for convenience, torch.nn is aliased as nn

In addition to the nn alias, we also import nn.functional. This package contains functions that are commonly used in neural networks; their defining characteristic is that they have no learnable parameters (for example ReLU, pooling, dropout). These functions may be called in the constructor or not, but here we recommend not placing them there.

# By convention, nn.functional is imported under the uppercase alias F, which keeps calls short
import torch.nn.functional as F
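
As a minimal illustration (my own sketch, not from the original text), the module form nn.ReLU and the function form F.relu compute exactly the same activation; the function simply carries no learnable parameters and no module state:

x = torch.randn(2, 3)
relu_module = nn.ReLU()  # module form: can be registered as a layer inside a network
print(torch.equal(relu_module(x), F.relu(x)))  # True: both apply the same element-wise ReLU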

Define a network

PyTorch already provides the scaffolding for building a network: as long as our class inherits from nn.Module and implements its forward method, PyTorch will automatically implement the backward function through autograd. Any function supported by tensors can be used inside forward, together with ordinary Python syntax such as if statements, for loops, print, logging and so on; the code is written just like standard Python.

class Net(nn.Module):
    def __init__(self):
        # A subclass of nn.Module must call the parent class constructor in its own constructor
        super(Net, self).__init__()
        
        # Convolution layer: '1' means the input image has a single channel,
        # '6' is the number of output channels, '3' means a 3x3 convolution kernel
        self.conv1 = nn.Conv2d(1, 6, 3) 
        # Linear layer: 1350 input features, 10 output features
        self.fc1   = nn.Linear(1350, 10)  # Where does 1350 come from? See the forward function below
    # Forward propagation
    def forward(self, x): 
        print(x.size()) # Result: [1, 1, 32, 32]
        # Convolution -> activation -> pooling
        x = self.conv1(x) # By the convolution output-size formula, (32 - 3) + 1 = 30; the formula is covered in detail in Chapter 2, Section 4, Convolutional Neural Networks
        x = F.relu(x)
        print(x.size()) # Result: [1, 6, 30, 30]
        x = F.max_pool2d(x, (2, 2)) # Apply a pooling layer; the result is 15
        x = F.relu(x)
        print(x.size()) # Result: [1, 6, 15, 15]
        # reshape; '-1' means the size is inferred automatically
        # This flattens the tensor: [1, 6, 15, 15] becomes [1, 1350] (6 * 15 * 15 = 1350)
        x = x.view(x.size()[0], -1) 
        print(x.size()) # This 1350 is exactly the input size of the fc1 layer
        x = self.fc1(x)        
        return x

net = Net()
print(net)



Net(
  (conv1): Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=1350, out_features=10, bias=True)
)
# The learnable parameters of the network are returned by net.parameters()
for parameters in net.parameters():
    print(parameters)
Parameter containing:
tensor([[[[-0.2004, -0.1097,  0.3272],
          [-0.0745,  0.1422,  0.2163],
          [-0.1378, -0.1274, -0.2120]]],


        [[[ 0.2788, -0.1147,  0.2957],
          [-0.0039,  0.2634, -0.3018],
          [ 0.1026, -0.1229, -0.1568]]],


        [[[-0.3330, -0.2717,  0.0299],
          [-0.1473, -0.2343,  0.3114],
          [-0.2097, -0.1937, -0.0008]]],


        [[[-0.1744, -0.2907, -0.0143],
          [-0.2882,  0.0036,  0.0654],
          [-0.0616, -0.2758,  0.2272]]],


        [[[ 0.2356,  0.0542,  0.1573],
          [-0.1292,  0.1380, -0.2210],
          [ 0.1239,  0.1169, -0.0805]]],


        [[[-0.1619,  0.2956, -0.0403],
          [ 0.1436, -0.2060,  0.1852],
          [ 0.0376, -0.2721,  0.2517]]]], requires_grad=True)
Parameter containing:
tensor([-0.2228, -0.3219, -0.2805,  0.1447, -0.2673, -0.2547],
       requires_grad=True)
Parameter containing:
tensor([[-0.0209, -0.0118,  0.0105,  ..., -0.0103, -0.0008,  0.0186],
        [-0.0201,  0.0236,  0.0136,  ...,  0.0226, -0.0057, -0.0047],
        [ 0.0076,  0.0020,  0.0195,  ..., -0.0191, -0.0084,  0.0065],
        ...,
        [-0.0065,  0.0095,  0.0240,  ...,  0.0222, -0.0079, -0.0203],
        [-0.0104,  0.0153,  0.0270,  ..., -0.0258, -0.0101, -0.0155],
        [ 0.0252,  0.0208,  0.0225,  ..., -0.0181, -0.0138, -0.0248]],
       requires_grad=True)
Parameter containing:
tensor([-0.0067, -0.0272, -0.0241, -0.0239, -0.0007,  0.0089,  0.0011, -0.0042,
         0.0206, -0.0120], requires_grad=True)
# net.named_parameters returns both the learnable parameters and their names
for name,parameters in net.named_parameters():
    print(name,':',parameters.size())
conv1.weight : torch.Size([6, 1, 3, 3])
conv1.bias : torch.Size([6])
fc1.weight : torch.Size([10, 1350])
fc1.bias : torch.Size([10])
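As a quick sanity check (this snippet is my addition, not part of the original text), the total number of learnable parameters can be summed from net.parameters():

total = sum(p.numel() for p in net.parameters())
print(total)  # 6*1*3*3 + 6 + 10*1350 + 10 = 13570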
input = torch.randn(1, 1, 32, 32) # the 32 here matches the input size assumed in forward above
out = net(input)
out.size()
torch.Size([1, 1, 32, 32])
torch.Size([1, 6, 30, 30])
torch.Size([1, 6, 15, 15])
torch.Size([1, 1350])
torch.Size([1, 10])
input.size()
torch.Size([1, 1, 32, 32])
# Before backpropagation, the gradients of all parameters must first be zeroed
net.zero_grad()
out.backward(torch.ones(1,10)) # Backpropagation is implemented automatically by PyTorch; we only need to call this function
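
If desired, the gradients can be inspected after the backward call; this check is an illustration I added, not part of the original:

print(net.conv1.bias.grad.size())  # torch.Size([6]): one gradient per bias term
print(net.fc1.weight.grad.size())  # torch.Size([10, 1350]): same shape as the weight itself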

Note: torch.nn only supports mini-batches; it does not support feeding in a single sample at a time. A whole batch must always be passed in.

In other words, even if we only have one sample, it still has to be packed into a batch, so every input gains one extra dimension. Comparing with the input we just created: a single-channel image is conceptually 3-dimensional (channels, height, width), but when we built the tensor manually we added one more dimension, making it 4-dimensional, where the leading 1 is the batch size, as in the sketch below.
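
A small sketch of this point (the shapes are assumptions matching the example above, not code from the original): a single 1x32x32 image gains the batch dimension with unsqueeze(0) before being passed to the network.

single = torch.randn(1, 32, 32)  # one sample: channels x height x width, no batch dimension
batched = single.unsqueeze(0)    # add the batch dimension: shape becomes [1, 1, 32, 32]
print(batched.size())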

Loss function

PyTorch's nn package also provides ready-made implementations of the commonly used loss functions. Below we use MSELoss to compute the mean squared error.

y = torch.arange(0,10).view(1,10).float()
y
tensor([[0., 1., 2., 3., 4., 5., 6., 7., 8., 9.]])
# Note: view is similar to reshape in numpy
criterion = nn.MSELoss()
loss = criterion(out,y)
# loss is a scalar, so we can obtain its Python numeric value directly with item()
print(loss.item())
28.474292755126953
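
As a cross-check (my addition, not in the original text), MSELoss is simply the mean of the squared element-wise differences, so the same value can be recomputed by hand:

manual = ((out - y) ** 2).mean()
print(manual.item())  # matches loss.item() above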

Optimizer

After backpropagation has computed the gradients of all parameters, an optimization method is still needed to update the weights of the network. For example, the update rule of stochastic gradient descent (SGD) is:

weight = weight - learning_rate * gradient
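
A minimal sketch of this rule applied by hand (illustrative only; the torch.optim version used below is the recommended way):

learning_rate = 0.01
with torch.no_grad():  # update the weights without tracking gradients
    for param in net.parameters():
        if param.grad is not None:
            param -= learning_rate * param.grad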

Most optimization methods, such as RMSProp, Adam and SGD, are implemented in torch.optim. Below we use SGD as a simple example.

import torch.optim
out = net(input)
criterion = nn.MSELoss()
loss = criterion(out, y)

# Create an optimizer; SGD only needs the parameters to be updated and a learning rate
optimizer = torch.optim.SGD(net.parameters(),lr=0.01)

# First zero the gradients (same effect as net.zero_grad())
optimizer.zero_grad()
loss.backward()

# Update the parameters
optimizer.step()

torch.Size([1, 1, 32, 32])
torch.Size([1, 6, 30, 30])
torch.Size([1, 6, 15, 15])
torch.Size([1, 1350])
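
Putting the pieces together, a small training loop looks like the following sketch (the number of steps and the reuse of the random input and target from above are illustrative assumptions, not part of the original):

for step in range(5):
    optimizer.zero_grad()      # clear the gradients from the previous step
    out = net(input)           # forward pass (this also triggers the print calls in forward)
    loss = criterion(out, y)   # mean squared error against the target
    loss.backward()            # backpropagate to fill in the gradients
    optimizer.step()           # apply the SGD update
    print(step, loss.item())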

With this, a complete pass of data through a neural network, forward and backward, has been implemented with PyTorch. The next chapter will introduce the data loading and processing tools provided by PyTorch, which make it convenient to prepare the data we need.

Origin blog.csdn.net/qq_49821869/article/details/113727845