PyTorch Study Notes (1)

Note: these notes were written after I had already studied TensorFlow.

1.        Functions whose names end with an underscore (_) modify the Tensor in place. For example, x.add_(y) and x.t_() change x, while x.add(y) and x.t() return a new Tensor and leave x unchanged. Example:

import torch
a = torch.ones(5)
print(a)

tensor([1., 1., 1., 1., 1.])

a.add_(1)  # functions ending with `_` modify the tensor in place
print(a)

tensor([2., 2., 2., 2., 2.])
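
Continuing from the snippet above, a quick sketch (my own addition, not from the original post) of the out-of-place version, which returns a new Tensor and leaves the original unchanged:

b = a.add(1)  # out-of-place: returns a new tensor
print(b)      # tensor([3., 3., 3., 3., 3.])
print(a)      # tensor([2., 2., 2., 2., 2.]) -- a itself is unchanged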

2.        Autograd: Automatic Differentiation

Deep learning algorithms essentially compute derivatives through backpropagation, and PyTorch's Autograd module implements exactly this. Autograd automatically provides differentiation for all operations on Tensors, which avoids the tedious process of computing derivatives by hand.
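
A minimal sketch of Autograd at work (my own example, not from the original post): marking a tensor with requires_grad=True makes backward() fill in its .grad attribute automatically.

import torch

x = torch.ones(2, 2, requires_grad=True)  # track all operations on x
y = (x * 3).sum()                         # y = 3 * sum(x)
y.backward()                              # backpropagate through the graph
print(x.grad)                             # dy/dx = 3 for every element of x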

3.        Building a Convolutional Neural Network with PyTorch:

The convolutional neural network has the following structure: two convolution layers, each followed by a max-pooling layer, and then three fully connected layers.

torch.nn is a modular interface designed specifically for neural networks. nn is built on top of Autograd and can be used to define and run neural networks. nn.Module is the most important class in nn: think of it as a wrapper around a network that contains the definitions of every layer together with a forward method; calling forward(input) returns the result of the forward pass.

Code:

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        # A subclass of nn.Module must call the parent class constructor in its own constructor
        # The line below is equivalent to nn.Module.__init__(self)
        super(Net, self).__init__()
        
        # Convolution layer: 1 input channel (single-channel image), 6 output channels, 5x5 kernel
        self.conv1 = nn.Conv2d(1, 6, 5) 
        # Convolution layer: 6 input channels, 16 output channels, 5x5 kernel
        self.conv2 = nn.Conv2d(6, 16, 5) 
        # Affine / fully connected layer, y = Wx + b
        self.fc1   = nn.Linear(16*5*5, 120) 
        self.fc2   = nn.Linear(120, 84)
        self.fc3   = nn.Linear(84, 10)

    def forward(self, x): 
        # convolution -> activation -> pooling
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        x = F.max_pool2d(F.relu(self.conv2(x)), 2) 
        # reshape; '-1' means the size is inferred automatically
        x = x.view(x.size()[0], -1) 
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)        
        return x

net = Net()
print(net)

As anyone who has learned Python knows, the __init__() method is the class initializer: it runs automatically when the class is instantiated. Here it only sets up an incomplete skeleton of the convolutional network, because nothing in __init__ covers the transition from layer 2 to layer 3 or from layer 4 to layer 5. Those two steps are the pooling layers, and they appear in the forward() function instead, as the sketch below illustrates. (As an aside, it may help to learn Google's TensorFlow framework first; a PDF of the book is on my blog, and deep learning frameworks are all broadly similar.)
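
Because max pooling has no learnable parameters, it could just as well be registered as a module in __init__; a small sketch (my own, not part of the original post) showing that the functional and module forms give identical results:

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 6, 28, 28)       # e.g. the shape conv1 produces for a 32x32 input
pool = nn.MaxPool2d(2, 2)           # module form, could be defined in __init__

# Pooling has no weights, so the module and functional forms behave identically,
# which is why the pooling layers can live in forward() instead of __init__.
print(torch.equal(pool(x), F.max_pool2d(x, (2, 2))))  # True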

import torch
from torch.autograd import Variable

net = Net()
input = Variable(torch.randn(1, 1, 32, 32))
print(input)
out = net(input)
print(out)

Note that this calling convention differs a bit from TensorFlow. The input is passed into the network through the forward() function, and with each new computation the previous x is overwritten. In other words, the input argument in out = net(input) reaches the network via the formal parameter of forward().

4.        Commonly used functions (a short usage sketch follows this list)

    .clamp()

        clamp means to clip values to a range: torch.clamp(input, min, max, out=None) -> Tensor

        Clamps each element of input to the range [min, max] and returns a Tensor.

        Used this way it acts like an activation function in the network. Example: h_relu = h.clamp(min=0)

    .mm()

        Matrix multiplication. Example: torch.mm(tensor1, tensor2)

    value, index = torch.topk(input, k, dim)

        Returns the k largest elements of input along the dimension dim, together with their indices.

    torch.cat(tensors, dim)

        Concatenates a sequence of tensors along the given dimension.

    .clone()

        Returns a copy of a tensor. y = x.clone()

    torch.no_grad()

        A context manager that disables gradient tracking inside its block.

    nn.MSELoss(reduction='sum')

        Mean Squared Error (MSE) loss; with reduction='sum' the squared errors are summed instead of averaged.
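
A short usage sketch (my own) putting the functions above together:

import torch
import torch.nn as nn

h = torch.randn(3, 4)
w = torch.randn(4, 2)

h_relu = h.clamp(min=0)                        # clamp used as a ReLU-style activation
out = torch.mm(h_relu, w)                      # matrix multiplication

value, index = torch.topk(out, 1, dim=1)       # largest value (and its index) per row
stacked = torch.cat([out, out], dim=0)         # concatenate along dimension 0

copy = out.clone()                             # independent copy of the tensor

with torch.no_grad():                          # no gradient tracking inside this block
    out2 = torch.mm(h_relu, w)

loss_fn = nn.MSELoss(reduction='sum')          # sum of squared errors
print(loss_fn(out, torch.zeros_like(out)).item())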

5.        A complete neural network program (the comments that come with the code are important):

# -*- coding: utf-8 -*-
import torch

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. Each Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(reduction='sum')

learning_rate = 1e-4
for t in range(500):
    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.
    y_pred = model(x)

    # Compute and print loss. We pass Tensors containing the predicted and true
    # values of y, and the loss function returns a Tensor containing the
    # loss.
    loss = loss_fn(y_pred, y)
    print(t, loss.item())

    # Zero the gradients before running the backward pass.
    model.zero_grad()

    # Backward pass: compute gradient of the loss with respect to all the learnable
    # parameters of the model. Internally, the parameters of each Module are stored
    # in Tensors with requires_grad=True, so this call will compute gradients for
    # all learnable parameters in the model.
    loss.backward()

    # Update the weights using gradient descent. Each parameter is a Tensor, so
    # we can access its gradients like we did before.
    print(model.parameters())
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad
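
As a side note (my own addition, not part of the original program), the manual weight update above is exactly what torch.optim packages up; the same training step could be written with an SGD optimizer, reusing model, x, y, loss_fn, and learning_rate from the script above:

optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

for t in range(500):
    y_pred = model(x)
    loss = loss_fn(y_pred, y)

    optimizer.zero_grad()   # replaces model.zero_grad()
    loss.backward()
    optimizer.step()        # replaces the manual param -= learning_rate * param.grad loop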

Reposted from blog.csdn.net/qisheng_com/article/details/82085428