Introduction to the PyTorch containers nn.Sequential, nn.ModuleList, and nn.ModuleDict

Foreword

  When creating a deep learning model, we often run into three things: nn.Sequential, nn.ModuleList, and nn.ModuleDict, especially during transfer learning training. What are they, how are they used, and what should we watch out for when using them? Read on.

1. nn.Module

  Before introducing these three containers, we need to know what Module is. When we create models, almost all of them inherit from this class. It is the base class of all networks and is used to manage a network's properties. Two related modules are nn.Parameter and nn.functional, both of which come from torch.nn. We briefly introduce them below.

1.1. nn.Parameter

  First, nn.Parameter. In PyTorch, nn.Parameter is a special class for creating model parameters. A model often has many parameters, and manually managing all of them is no easy task. PyTorch generally uses nn.Parameter to represent parameters and uses nn.Module to manage all parameters under its structure.

import torch
import torch.nn as nn

## nn.Parameter has the attribute requires_grad = True
w = nn.Parameter(torch.randn(2, 2))
print(w)   # tensor([[ 0.3544, -1.1643],[ 1.2302,  1.3952]], requires_grad=True)
print(w.requires_grad)   # True

## nn.ParameterList can combine multiple nn.Parameter objects into a list
params_list = nn.ParameterList([nn.Parameter(torch.rand(8, i)) for i in range(1, 3)])
print(params_list)
print(params_list[0].requires_grad)

## nn.ParameterDict can combine multiple nn.Parameter objects into a dictionary
params_dict = nn.ParameterDict({
    "a": nn.Parameter(torch.rand(2, 2)),
    "b": nn.Parameter(torch.zeros(2))})
print(params_dict)
print(params_dict["a"].requires_grad)

The parameters defined above can be managed through Module:

# module.parameters() returns a generator over all parameters under the module's structure

module = nn.Module()
module.w = w
module.params_list = params_list
module.params_dict = params_dict

num_param = 0
for param in module.parameters():
    print(param,"\n")
    num_param = num_param + 1
print("number of Parameters =",num_param)

  In actual use, a model is generally built as a class that inherits from nn.Module, and all parts containing parameters that need to be learned are placed in the constructor.

# The following example is a simplified version of the source code of nn.Linear in PyTorch.
# Note how the learnable parameters are placed in the __init__ constructor, while forward calls F.linear to implement the computation logic.

import torch
from torch import nn
import torch.nn.functional as F

class Linear(nn.Module):
    __constants__ = ['in_features', 'out_features']

    def __init__(self, in_features, out_features, bias=True):
        super(Linear, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.weight = nn.Parameter(torch.Tensor(out_features, in_features))
        if bias:
            self.bias = nn.Parameter(torch.Tensor(out_features))
        else:
            self.register_parameter('bias', None)

    def forward(self, input):
        return F.linear(input, self.weight, self.bias)
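
A quick, hedged check of the simplified class above (note that torch.Tensor(out_features, in_features) allocates uninitialized storage; the real nn.Linear additionally calls reset_parameters() to initialize it). We can at least verify that both tensors are registered as parameters:

layer = Linear(4, 2)
print([name for name, p in layer.named_parameters()])   # ['weight', 'bias']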

1.2. nn.functional

  nn.functional (generally imported and renamed as F) contains functional implementations of the various components. For example:

  • Activation functions (F.relu, F.sigmoid, F.tanh, F.softmax)
  • Model layers (F.linear, F.conv2d, F.max_pool2d, F.dropout2d, F.embedding)
  • Loss functions (F.binary_cross_entropy, F.mse_loss, F.cross_entropy)

  To facilitate parameter management, these are generally wrapped into class implementations by inheriting from nn.Module and are encapsulated directly under the nn module:

  • Activation functions become (nn.ReLU, nn.Sigmoid, nn.Tanh, nn.Softmax)
  • Model layers become (nn.Linear, nn.Conv2d, nn.MaxPool2d, nn.Embedding)
  • Loss functions become (nn.BCELoss, nn.MSELoss, nn.CrossEntropyLoss)

  So the activation functions, layers, and loss functions we build through nn are all implemented by nn.functional behind the scenes. If you keep reading, you will see that nn.Module is indeed very powerful: besides managing the parameters it references, it can also manage the submodules it references.
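
For instance, a minimal sketch showing that for a stateless operation like ReLU, the class form and the functional form compute the same result:

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(3)
print(torch.equal(nn.ReLU()(x), F.relu(x)))   # True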

1.3. nn.Module

  Our focus is the nn.Module module itself. nn.Module maintains a number of important dictionary attributes:

        self.training = True
        self._parameters: Dict[str, Optional[Parameter]] = OrderedDict()
        self._buffers: Dict[str, Optional[Tensor]] = OrderedDict()
        self._non_persistent_buffers_set: Set[str] = set()
        self._backward_hooks: Dict[int, Callable] = OrderedDict()
        self._is_full_backward_hook = None
        self._forward_hooks: Dict[int, Callable] = OrderedDict()
        self._forward_pre_hooks: Dict[int, Callable] = OrderedDict()
        self._state_dict_hooks: Dict[int, Callable] = OrderedDict()
        self._load_state_dict_pre_hooks: Dict[int, Callable] = OrderedDict()
        self._modules: Dict[str, Optional['Module']] = OrderedDict()

We only need to focus on two of them: _parameters and _modules.

  • _parameters: stores and manages attributes of the nn.Parameter class, such as weights and biases
  • _modules: stores and manages nn.Module objects; for example, in the classic network LeNet, submodules such as convolutional layers and pooling layers are stored in _modules

Here is a question: what is the difference between nn.Parameter and _parameters in nn.Module?

  • nn.Parameter: a subclass of torch.Tensor used to mark a tensor as a learnable parameter of the model. When defining a model, we usually use nn.Parameter to create learnable parameters as attributes of the model. The advantage is that an nn.Parameter object is automatically registered as a model parameter and participates in gradient computation and parameter updates.
  • _parameters: an attribute of the nn.Module class, a dictionary used to store the model's learnable parameters. The dictionary keys are the parameter names and the values are the corresponding parameter tensors (of type nn.Parameter). The _parameters dictionary is populated automatically from the learnable parameters assigned as model attributes.

  _parameters can be thought of as a container that stores a model's learnable parameters, while nn.Parameter is a special class for creating and labeling those parameters. When you create parameters with nn.Parameter and assign them as model attributes, they are automatically added to the _parameters dictionary, which makes it convenient to manage and access them in a unified way.
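
To make this concrete, here is a minimal sketch (the class name Toy is just for illustration): a plain tensor attribute is not registered, while an nn.Parameter attribute is:

import torch
from torch import nn

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.randn(3, 3))  # registered into _parameters
        self.t = torch.randn(3, 3)                # plain tensor, not registered

toy = Toy()
print(list(toy._parameters.keys()))   # ['w']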

What does the process of building a network with nn.Module look like? Take the following network as an example:

import torch
from torch import nn


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 1, 1, 1)
        self.bn = nn.BatchNorm2d(1)
        self.relu = nn.ReLU()
        self.conv2 = nn.Conv2d(1, 1, 1, 1)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn(x)
        x = self.relu(x)
        x = self.conv2(x)
        return x


if __name__ == "__main__":
    dat = torch.rand(1, 1, 10, 10)
    net = Net().cuda()
    out = net(dat.cuda())   # run a forward pass (requires a CUDA device)

The build process is as follows:

  We first have a large Module (the Net created above) that inherits from the nn.Module base class. Net can contain many submodules, and those submodules also inherit from nn.Module. In each module's __init__ method, the parent class's initialization method is called first to initialize the parent's attributes. Then each submodule is built in two steps: it is first initialized, and then the __setattr__ method inspects the type of the assigned value, saves it into the corresponding attribute dictionary, and assigns it to the corresponding member. The submodules are constructed one by one until the whole Net is complete. You can step through the process with a debugger to see the details.
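
As a quick check (reusing the Net class defined above), the submodules indeed end up in _modules:

net = Net()
print(list(net._modules.keys()))   # ['conv1', 'bn', 'relu', 'conv2']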

Summary:

  • One module can contain multiple sub-modules (Net contains convolutional layers, BN layers, and activation functions)
  • One module is equivalent to one operation and must implement the forward() function (the forward of some modules has to be written by yourself, as you will see below)
  • Each module has many dictionaries to manage its attributes (the most commonly used are _parameters and _modules)

  Knowing how a network is constructed, we can analyze models created by others and extract parts of them. For an introduction to that, see this blog post: Pytorch extracts neural network layer structure, layer parameters and custom initialization.

2. nn.Sequential

  Having introduced nn.Module above, let's get to the containers. First, nn.Sequential: it is a module container in PyTorch used to combine multiple modules in sequence, connecting a series of modules in order to form a serial model structure. Let's look at how it is implemented in PyTorch; here we only show the constructor and the forward propagation part, omitting the rest of the code:

class Sequential(Module):
    ...

    def __init__(self, *args):
        super(Sequential, self).__init__()
        if len(args) == 1 and isinstance(args[0], OrderedDict):
            for key, module in args[0].items():
                self.add_module(key, module)
        else:
            for idx, module in enumerate(args):
                self.add_module(str(idx), module)

    ...

    def forward(self, input):
        for module in self:
            input = module(input)
        return input

  As the code above shows, nn.Sequential inherits from Module, meaning Sequential is itself a Module, so it also has those dictionary attributes. You can also see that nn.Sequential implements the forward method. The most commonly used methods of nn.Sequential are as follows:

  • forward(input): defines the forward propagation of the model. In nn.Sequential, this method calls each module's forward method in the order of the modules, passing the output of the previous module as input to the next to compute the final output.
  • add_module(name, module): adds a submodule to the nn.Sequential. name is the submodule's name and module is the submodule object to add. Modules are forward-propagated in the order they were added.
  • parameters(): returns an iterator over all learnable parameters in the nn.Sequential. The model's learnable parameters can be accessed and manipulated through this iterator.
  • zero_grad(): sets the parameter gradients of all modules in the nn.Sequential to zero. This method is usually called before each gradient update (both methods are shown in the sketch after this list)
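
A minimal sketch of parameters() and zero_grad() on a small Sequential:

import torch.nn as nn

seq = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
print(sum(p.numel() for p in seq.parameters()))   # 4*8 + 8 + 8*2 + 2 = 58
seq.zero_grad()   # clears the .grad of every parameter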

  Of the methods listed above, add_module(name, module) is the one used most often to add modules. Now let's see how nn.Sequential is used:

class Net(nn.Module):
    def __init__(self, classes):
        super(Net, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, 5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(6, 16, 5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),)

        self.classifier = nn.Sequential(
            nn.Linear(16*5*5, 120),
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, classes),)

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size()[0], -1)
        x = self.classifier(x)
        return x

It can also be created using the add_module method:

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, classes):
        super(Net, self).__init__()

        self.features = nn.Sequential()
        self.features.add_module('conv1', nn.Conv2d(3, 6, 5))
        self.features.add_module('relu1', nn.ReLU())
        self.features.add_module('pool1', nn.MaxPool2d(kernel_size=2, stride=2))
        self.features.add_module('conv2', nn.Conv2d(6, 16, 5))
        self.features.add_module('relu2', nn.ReLU())
        self.features.add_module('pool2', nn.MaxPool2d(kernel_size=2, stride=2))

        self.classifier = nn.Sequential()
        self.classifier.add_module('fc1', nn.Linear(16*5*5, 120))
        self.classifier.add_module('relu3', nn.ReLU())
        self.classifier.add_module('fc2', nn.Linear(120, 84))
        self.classifier.add_module('relu4', nn.ReLU())
        self.classifier.add_module('fc3', nn.Linear(84, classes))

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size()[0], -1)
        x = self.classifier(x)
        return x

  From the network construction above, you can see that in the forward function a single line, self.features(x), completes the execution of six layers. What makes this possible is the forward function inside nn.Sequential: during execution, the input is handed to nn.Sequential, which passes it through each submodule in turn. You can debug the program to observe the exact process.
Summary:
nn.Sequential is a container of nn.Module, used to wrap a group of network layers in order, and it has the following two characteristics:

  • Sequentiality: the network layers are assembled strictly in order, so we must pay attention to whether the output of each layer matches the input expected by the next
  • Self-contained forward(): in the built-in forward, forward propagation is performed layer by layer through a for loop
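
One more detail visible in the constructor shown earlier: nn.Sequential also accepts an OrderedDict, which lets you name the layers; submodules can then be reached by index or by attribute name. A minimal sketch:

from collections import OrderedDict
import torch.nn as nn

block = nn.Sequential(OrderedDict([
    ('conv', nn.Conv2d(3, 6, 5)),
    ('relu', nn.ReLU()),
]))
print(block[0])     # the Conv2d, accessed by position
print(block.conv)   # the same module, accessed by name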

3. nn.ModuleList

  nn.ModuleList is also a container of nn.Module. It wraps a group of network layers so that they can be called iteratively. The commonly used methods are as follows and are very similar to those of a Python list (see the sketch after this list):

  • append(): add a network layer at the end of the ModuleList
  • extend(): splice two ModuleLists together
  • insert(): insert a network layer at a specified position in the ModuleList
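
A minimal sketch of these three list-like methods:

import torch.nn as nn

layers = nn.ModuleList([nn.Linear(10, 10)])
layers.append(nn.ReLU())                # add a layer at the end
layers.extend([nn.Linear(10, 5)])       # splice in another group of layers
layers.insert(1, nn.BatchNorm1d(10))    # insert at position 1
print(layers)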

Now let's look at an example of how to use nn.ModuleList to build a network:

import torch
from torch import nn

class ModuleListNet(nn.Module):
    def __init__(self):
        super(ModuleListNet, self).__init__()
        self.linears = nn.ModuleList([nn.Linear(10, 10) for i in range(10)])

    def forward(self, x):
        for i, linear in enumerate(self.linears):
            x = linear(x)
        return x

  The example above creates 10 nn.Linear modules with a list comprehension. Overall it is still very simple to use; readers interested in the implementation details can step through the code with a debugger. Note that unlike nn.Sequential, nn.ModuleList does not implement a forward pass for you: you write the iteration over its layers in your own forward, as above.
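
One caveat worth noting: if you keep layers in a plain Python list instead of an nn.ModuleList, they are not registered as submodules, so their parameters are invisible to parameters() and hence to the optimizer. A minimal sketch (the class names Bad and Good are just for illustration):

import torch
from torch import nn

class Bad(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = [nn.Linear(10, 10) for _ in range(3)]   # plain list: not registered

class Good(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(10, 10) for _ in range(3)])

print(len(list(Bad().parameters())))    # 0 -- an optimizer would see nothing
print(len(list(Good().parameters())))   # 6 -- 3 weights + 3 biases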

4. nn.ModuleDict

  Finally, let's look at nn.ModuleDict. It is also a container of nn.Module. It wraps a group of network layers and lets you call a layer by its key. The commonly used methods are as follows and resemble dictionary operations (see the sketch after this list):

  • clear(): empty the ModuleDict
  • items(): return an iterable of key-value pairs
  • keys(): return the keys of the dictionary
  • values(): return the values of the dictionary
  • pop(key): remove the given key from the dictionary and return its module
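
A minimal sketch of the dict-like interface, including the common pattern of selecting a layer by key:

import torch.nn as nn

acts = nn.ModuleDict({'relu': nn.ReLU(), 'tanh': nn.Tanh()})
print(list(acts.keys()))     # ['relu', 'tanh']
act = acts['relu']           # pick a registered module by key
removed = acts.pop('tanh')   # returns the module and removes the entry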

Look at an example:

import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()

        self.module_dict = nn.ModuleDict({
            'conv1': nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
            'relu1': nn.ReLU(),
            'conv2': nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
            'relu2': nn.ReLU(),
            'flatten': nn.Flatten(),
            'linear': nn.Linear(128 * 32 * 32, 10)
        })

    def forward(self, x):
        for module in self.module_dict.values():
            x = module(x)
        return x

# Create a model instance
model = MyModel()

# Randomly generate an input
x = torch.randn(1, 3, 32, 32)

# Run forward propagation
output = model(x)
print(output.shape)

  Building a network with nn.ModuleDict, as above, is still very simple overall and feels like ordinary dictionary operations.
  That wraps up the basic usage of nn.Sequential, nn.ModuleList, and nn.ModuleDict. If there is any mistake, please point it out!
