PyTorch Tutorials 3 Neural Networks

%matplotlib inline

Neural Networks

Neural networks can be constructed using the torch.nn package.

The previous section already introduced autograd; the nn package depends on autograd to define models and differentiate them.
An nn.Module contains layers and a forward(input) method that returns the output.

For example:

It is a simple feed-forward network: it takes an input, passes it through the layers one after another, and finally produces an output.

A typical neural network training process is as follows:

  1. Define a neural network that has some learnable parameters (also called weights);
  2. Iterate over a dataset of inputs;
  3. Process the input through the network;
  4. Compute the loss (how far the output is from the correct value);
  5. Propagate the gradients back into the network's parameters;
  6. Update the weights of the network, typically using the simple rule:
    weight = weight - learning_rate * gradient

Define the network

Let's define this network:

import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 5x5 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)  # 6 input channels, 16 output channels, 5x5 kernel
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 5 * 5, 120)  # 16 channels of 5x5 feature maps after the second pooling
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square, you can specify it with a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features


net = Net()
print(net)
Net(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)

You only have to define the forward function; the backward function
(where gradients are computed) is automatically created for you by autograd.
You can use any Tensor operation in the forward function.

net.parameters() returns the model's learnable parameters (weights) as a list.

params = list(net.parameters())
print(len(params))
print(params[0].size())  # conv1's .weight
10
torch.Size([6, 1, 5, 5])
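The count of 10 corresponds to a weight and a bias tensor for each of the five layers. A quick way to list them all (a small addition, not in the original tutorial):

# Print every learnable parameter together with its name and shape.
for name, p in net.named_parameters():
    print(name, p.size())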

Let's try a random 32 × 32 input.
NOTE: the expected input size of this network (LeNet) is 32 × 32. To use the MNIST dataset with this network, resize the images to 32 × 32.

input = torch.randn(1, 1, 32, 32)
out = net(input)
print(out)
tensor([[ 0.1470, -0.0240,  0.0103,  0.0705,  0.0650, -0.0010, -0.0083,  0.0556,
         -0.0686, -0.0675]], grad_fn=<AddmmBackward>)
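Regarding the note above: one way to get 32 × 32 MNIST images is to resize them with torchvision transforms. This is only a sketch, assuming torchvision is installed; it is not needed for the rest of this tutorial.

import torchvision
import torchvision.transforms as transforms

# Resize MNIST digits from 28x28 to the 32x32 input size expected by LeNet.
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
])
mnist = torchvision.datasets.MNIST(root='./data', train=True,
                                   download=True, transform=transform)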

Zero the gradient buffers of all parameters, then backpropagate with random gradients:

net.zero_grad()
out.backward(torch.randn(1, 10))

Note

``torch.nn`` only supports mini-batches. The entire ``torch.nn`` package only supports inputs that are a mini-batch of samples, not a single sample. For example, ``nn.Conv2d`` expects a 4-dimensional tensor of shape ``nSamples x nChannels x Height x Width``. If you have a single sample, just use ``input.unsqueeze(0)`` to add a fake batch dimension.
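For example, a single 1 x 32 x 32 image can be given a fake batch dimension like this (a short illustration of the note above):

single = torch.randn(1, 32, 32)  # one sample: channels x height x width
batch = single.unsqueeze(0)      # shape becomes (1, 1, 32, 32)
print(batch.size())
out = net(batch)                 # the network now accepts it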

Before proceeding further, let's recap all the classes we have seen so far.

Recap:

  • torch.Tensor: a multi-dimensional array that supports autograd operations such as backward(),
    and that also holds the gradient w.r.t. the tensor.
  • nn.Module: a neural network module. A convenient way of encapsulating parameters, with helpers for moving them to the GPU, exporting, loading, and so on.
  • nn.Parameter: a kind of Tensor that is automatically registered as a parameter when assigned as an attribute of a Module (see the small sketch after this list).
  • autograd.Function: implements the forward and backward definitions of an autograd operation. Every Tensor operation creates at least one Function node, which connects to the functions that created the Tensor and encodes its history.
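To illustrate the nn.Parameter point, here is a minimal sketch (not part of the original tutorial): assigning a Parameter as an attribute of a Module registers it automatically, so it shows up in parameters().

class Scale(nn.Module):
    def __init__(self):
        super(Scale, self).__init__()
        # Assigning an nn.Parameter automatically registers it as a learnable parameter.
        self.factor = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return x * self.factor

print(list(Scale().parameters()))  # contains the 'factor' tensor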

So far, we have covered:

  • Defining a neural network
  • Processing inputs and calling backward

Still left:

  • Computing the loss
  • Updating the weights of the network

Loss function

A loss function takes an (output, target) pair as input and computes a value that estimates how far the output is from the target.

Translator's note: output is the output of the network, and target is the ground-truth value.

The nn package provides several different loss functions.
A simple one is nn.MSELoss, which computes the mean squared error between the output and the target.
For example:

output = net(input)
target = torch.randn(10)  # a dummy target, for example
target = target.view(1, -1)  # make target the same shape as output
criterion = nn.MSELoss()

loss = criterion(output, target)
print(loss)
tensor(0.7241, grad_fn=<MseLossBackward>)
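As a sanity check (an addition, not in the original tutorial), the same value can be computed by hand, since nn.MSELoss with its default settings averages the squared differences:

manual_mse = ((output - target) ** 2).mean()
print(manual_mse)  # matches criterion(output, target)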

Now, if you follow loss in the backward direction using its .grad_fn attribute, you will see a graph of computations that looks like this:

::

input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d
      -> view -> linear -> relu -> linear -> relu -> linear
      -> MSELoss
      -> loss

So, when we call loss.backward(), the whole graph is differentiated
w.r.t. the loss, and all Tensors in the graph that have requires_grad=True
will have their .grad Tensor accumulated with the gradient.
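Parameters created inside an nn.Module have requires_grad=True by default, while the random input we created does not; a quick check (added for illustration):

print(net.conv1.weight.requires_grad)  # True: gradients will be accumulated in .grad
print(input.requires_grad)             # False: torch.randn defaults to requires_grad=False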

For illustration, let us follow a few steps backward:

print(loss.grad_fn)  # MSELoss
print(loss.grad_fn.next_functions[0][0])  # Linear
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])  # ReLU
<MseLossBackward object at 0x0000001FCC3CEEB8>
<AddmmBackward object at 0x0000001FCC3CEBE0>
<AccumulateGrad object at 0x0000001FCC3CEEB8>

Back Propagation

To backpropagate the error, all we have to do is call loss.backward().

But before calling it we need to clear the existing gradients, otherwise the new gradients will be accumulated onto the existing ones.

Now we will call loss.backward() and look at conv1's bias gradients before and after the backward pass.

net.zero_grad()     # zero the gradient buffers of all parameters

print('conv1.bias.grad before backward')
print(net.conv1.bias.grad)

loss.backward()

print('conv1.bias.grad after backward')
print(net.conv1.bias.grad)
conv1.bias.grad before backward
tensor([0., 0., 0., 0., 0., 0.])
conv1.bias.grad after backward
tensor([-0.0024,  0.0044, -0.0027,  0.0066, -0.0034,  0.0067])
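To see the accumulation mentioned above, here is a short experiment (added here, not part of the original tutorial): a second forward/backward pass without zeroing simply adds to the existing gradients, doubling them in this case.

net.zero_grad()
criterion(net(input), target).backward()
first = net.conv1.bias.grad.clone()

# Second backward pass without zero_grad(): new gradients are added to the old ones.
criterion(net(input), target).backward()
print(torch.allclose(net.conv1.bias.grad, 2 * first))  # True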

Now we have seen how to use loss functions.

Read later:

The nn package contains various modules and loss functions that form the building blocks of deep neural networks. The complete list can be found in the torch.nn documentation.

The only thing left to learn is:

  • Updating the weights of the network

Update weights

In practice, the simplest weight update rule is stochastic gradient descent (SGD):

 ``weight = weight - learning_rate * gradient``

We can implement this rule with simple Python code:


    learning_rate = 0.01
    for f in net.parameters():
        f.data.sub_(f.grad.data * learning_rate)
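
An equivalent way to write this update without touching .data, often preferred in newer PyTorch code, is to perform the subtraction under torch.no_grad() (a small variant, added for illustration):

    learning_rate = 0.01
    with torch.no_grad():
        for f in net.parameters():
            # In-place update; no_grad() keeps autograd from recording it.
            f -= f.grad * learning_rate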

However, when training neural networks you often want to use different update rules such as SGD, Nesterov-SGD, Adam, RMSProp, and so on. To enable this, PyTorch provides the torch.optim package, which implements all of these methods.
Using it is very simple:

import torch.optim as optim

# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)

# in your training loop:
optimizer.zero_grad()   # zero the gradient buffers
output = net(input)
loss = criterion(output, target)
loss.backward()
optimizer.step()    # Does the update

.. Note::

  Observe how gradient buffers had to be manually set to zero using
  ``optimizer.zero_grad()``. This is because gradients are accumulated
  as explained in `Backprop`_ section.
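
Putting the pieces together, a full training loop repeats these steps over the dataset. The sketch below is an illustration, not part of the original tutorial: it assumes a hypothetical trainloader (e.g. a torch.utils.data.DataLoader yielding batches of 1 x 32 x 32 images with integer class labels) and uses nn.CrossEntropyLoss, the usual choice for classification.

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.01)

for epoch in range(2):                      # loop over the dataset a couple of times
    for images, labels in trainloader:      # trainloader is a hypothetical DataLoader
        optimizer.zero_grad()               # clear the gradient buffers
        outputs = net(images)               # forward pass
        loss = criterion(outputs, labels)   # compute the loss
        loss.backward()                     # backpropagate the error
        optimizer.step()                    # update the weights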
