Deep Learning Framework PyTorch, Getting Started and Practice: Chapter 4, Learning the Neural Network Toolbox nn

Study notes, based on this article: https://blog.csdn.net/u011436316/article/details/101930269

autograd implements the automatic differentiation system, but it is too low-level for building deep learning models directly. This chapter introduces the nn module, a neural network module built on top of autograd. Besides nn, we will also introduce other commonly used neural network tools, such as the optimizer module optim, the initialization module init, and so on.

4.1 nn.Module

Chapter 3 mentioned that autograd can be used to implement deep learning models, but its level of abstraction is low: implementing a model with it alone requires writing a large amount of code. For this reason torch.nn was created; it is a module designed specifically for deep learning. The core data structure of torch.nn is Module, an abstract concept that can represent either a single layer of a neural network or a network containing many layers. In practice, the most common approach is to inherit from nn.Module and write your own network or layer. Let's look at how to implement a fully connected layer with nn.Module. A fully connected layer, also called an affine layer, produces an output y from an input x such that y = Wx + b, where W and b are learnable parameters.

import torch as t
from torch import nn
from torch.autograd import Variable as V

# Define a linear model: y = x * w + b
class Linear(nn.Module):
    # Inherit from nn.Module; the constructor must call nn.Module's constructor,
    # i.e. super(Linear, self).__init__() or nn.Module.__init__(self)
    def __init__(self, in_features, out_features):
        # The constructor __init__ defines the learnable parameters
        # and wraps them as Parameter, e.g. wrapping w and b as Parameter.
        # Parameter is a special kind of Variable that requires gradients
        # by default (requires_grad=True).
        super(Linear, self).__init__()    # equivalent to nn.Module.__init__(self)
        self.w = nn.Parameter(t.randn(in_features, out_features))  # wrap w
        self.b = nn.Parameter(t.randn(out_features))               # wrap b

    def forward(self, x):
        # Implements forward propagation. The input can be one or more Variables,
        # and any operation on x must be an operation supported by Variables.
        xw = x.mm(self.w)
        y = xw + self.b.expand_as(xw)
        # tensor.expand_as() expands a tensor to the same shape as the tensor
        # given in parentheses; its usage is similar to expand().
        return y

net = Linear(4, 3)
x = V(t.randn(2, 4))
y = net(x)   # treat net as a function
print(y)

Output is as follows:

tensor([[-4.6985, -2.3410, -1.2974],
        [-1.7272,  0.0638,  2.1338]], grad_fn=<AddBackward0>)

Input:

layer = net
for name, parameter in layer.named_parameters():
    print(name, parameter)  # w and b

Output is as follows:

w Parameter containing:
tensor([[-1.1190, -0.4002,  0.8744],
        [-0.7999, -0.5281, -1.0659],
        [ 0.5722, -0.7538, -0.9995],
        [-0.2019,  0.2152,  0.6807]], requires_grad=True)
b Parameter containing:
tensor([0.6568, 0.5761, 1.7080], requires_grad=True)

As you can see, implementing a fully connected layer is very simple: it takes no more than 10 lines of code. However, note the following points:

  • The custom layer Linear must inherit from nn.Module, and its constructor must call nn.Module's constructor, i.e. super(Linear, self).__init__() or nn.Module.__init__(self).
  • Learnable parameters must be defined in the constructor __init__ and wrapped as Parameter; in this example we wrap w and b as Parameter. Parameter is a special kind of Variable, but it requires gradients by default (requires_grad=True). Interested readers can inspect the source code of the Parameter class with nn.Parameter??.
  • The forward function implements forward propagation. Its input can be one or more Variables, and any operation on x must be an operation supported by Variables. There is no need to write a backward function, because forward propagation consists of operations on Variables, so nn.Module can use autograd to perform backward propagation automatically, which is much more convenient than Function.
  • In use, net can be intuitively regarded as a mathematical function: calling net(x) yields the result for input x. This is equivalent to net.__call__(x); inside __call__, the main step is calling net.forward(x), plus some processing of hooks (see the sketch after this list). In practice you should use net(x) rather than net.forward(x); more on this API is discussed below.
  • The learnable parameters of a Module can be returned as an iterator via named_parameters() or parameters(); the former attaches a name to each parameter to make it more recognizable.
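The equivalence between net(x) and __call__ can be illustrated with a hook. The sketch below is not from the original article; it only assumes the net and x defined above and uses register_forward_hook, a standard nn.Module method, to show that net(x) runs hooks in addition to forward:

def print_shape_hook(module, input, output):
    # forward hook: called after module.forward() has run
    print('output shape:', output.shape)

handle = net.register_forward_hook(print_shape_hook)
y = net(x)         # goes through __call__: runs forward and then the hook
handle.remove()    # calling net.forward(x) directly would skip the hook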

As you can see, implementing a fully connected layer with Module is simpler than implementing it with Function, because there is no need to write the backward propagation function.

A Module can automatically detect its Parameters and register them as learnable parameters. Besides parameters, a Module can also contain sub-Modules, and the parent Module will recursively look up the parameters in its sub-Modules. Let's look at a slightly more complex network: a multi-layer perceptron (MLP).

The structure of the MLP is shown in the figure below. It consists of two fully connected layers, with a sigmoid function as the activation function (not shown in the figure).

[Figure: MLP network structure]

class Perceptron(nn.Module):
    def __init__(self, in_features, hidden_features, out_features):
        nn.Module.__init__(self)
        self.layer1 = Linear(in_features, hidden_features)   # the custom Linear defined above
        self.layer2 = Linear(hidden_features, out_features)

    def forward(self, x):
        x = self.layer1(x)
        x = t.sigmoid(x)
        # The sigmoid is an S-shaped function, known in biology as the S-shaped growth curve.
        # Because it is monotonically increasing (as is its inverse), it is often used as an
        # activation function in neural networks, mapping values into the interval (0, 1).
        x = self.layer2(x)
        return x

perceptron = Perceptron(3, 4, 1)
for name, param in perceptron.named_parameters():
    # The learnable parameters of a Module can be returned as an iterator
    # via named_parameters() or parameters(); the former attaches a name
    # to each parameter to make it more recognizable.
    print(name, param.size())  # shapes: 3x4, 4, 4x1, 1

Output is as follows:

layer1.w torch.Size([3, 4])
layer1.b torch.Size([4])
layer2.w torch.Size([4, 1])
layer2.b torch.Size([1])

As you can see, even a slightly more complex multi-layer perceptron is still very simple to implement. Two points are worth noting:

  • In the constructor __init__, a previously defined layer such as Linear (itself a Module) can be used as a sub-Module of the current Module object, and its learnable parameters also become learnable parameters of the current Module.
  • In the forward function, we deliberately name the output variables x so that Python can reclaim the outputs of some intermediate layers and save memory. However, not all intermediate results are reclaimed: although some variable names are overwritten, the values are still needed during backward propagation, so Python's memory management, which checks reference counts, does not reclaim that memory.

The global naming convention for parameters in a Module is as follows:

  • A Parameter is named directly after its attribute. For example, self.param_name = nn.Parameter(t.randn(3, 4)) is named param_name.
  • A parameter inside a sub-Module has the sub-Module's name prepended to its own. For example, if self.sub_module = SubModule() and the sub-Module has a parameter also named param_name, then the concatenated parameter name is sub_module.param_name (see the sketch after this list).
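The naming rule can be checked with a short sketch. SubModule and Net below are hypothetical classes written only for this illustration; they are not part of the original example:

class SubModule(nn.Module):
    def __init__(self):
        super(SubModule, self).__init__()
        self.param_name = nn.Parameter(t.randn(3, 4))

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.param_name = nn.Parameter(t.randn(3, 4))  # named "param_name"
        self.sub_module = SubModule()                   # contributes "sub_module.param_name"

net2 = Net()
for name, p in net2.named_parameters():
    print(name, p.size())
# param_name torch.Size([3, 4])
# sub_module.param_name torch.Size([3, 4])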

For the convenience of users, PyTorch implements the vast majority of layers used in neural networks. These layers all inherit from nn.Module, encapsulate their learnable parameters as Parameter, implement a forward function, and are specifically optimized for GPU computation with cuDNN, so their speed and performance are excellent. This book does not describe every layer in nn.Module in detail; readers can refer to the official documentation for specifics or inspect a layer in IPython/Jupyter with nn.layer?. When reading the documentation, pay attention to the following points.

  • The constructor parameters, such as nn.Linear(in_features, out_features, bias); pay attention to what these three parameters do.
  • The attributes, learnable parameters, and sub-Modules. For example, nn.Linear has two learnable parameters, weight and bias, and no sub-Modules (see the sketch after this list).
  • The input and output shapes, such as the input shape of nn.Linear being (N, input_features) and the output shape being (N, output_features), where N is the batch_size.
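As a quick illustration of these points (a minimal sketch, not part of the original text), the learnable parameters and sub-Modules of nn.Linear can be inspected directly:

linear = nn.Linear(in_features=4, out_features=3, bias=True)
print(linear.weight.size())     # torch.Size([3, 4]), stored as (out_features, in_features)
print(linear.bias.size())       # torch.Size([3])
print(list(linear.children()))  # [], nn.Linear has no sub-Module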

All of these layers make an assumption about the input shape: the input is not a single sample, but a batch. If you want to feed a single sample, you must call unsqueeze(0) to disguise the data as a batch with batch_size = 1, as shown below.
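A minimal sketch of feeding a single sample to nn.Linear (the shapes in the comments assume the same 4-in, 3-out layer as above):

single = t.randn(4)            # a single sample with 4 features, shape (4,)
batch = single.unsqueeze(0)    # shape becomes (1, 4), i.e. a batch with batch_size = 1
out = nn.Linear(4, 3)(batch)   # output shape (1, 3)
print(out.size())              # torch.Size([1, 3])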

The following sections give a brief introduction to some commonly used layers from an application perspective; for more detailed usage, please refer to the official documentation.
