Face segmentation/analysis: face-parsing.PyTorch source code analysis of resnet.py

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.utils.model_zoo as modelzoo

# from modules.bn import InPlaceABNSync as BatchNorm2d

resnet18_url = 'https://download.pytorch.org/models/resnet18-5c106cde.pth'


def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=1, bias=False)


class BasicBlock(nn.Module):
    def __init__(self, in_chan, out_chan, stride=1):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(in_chan, out_chan, stride)
        self.bn1 = nn.BatchNorm2d(out_chan)
        self.conv2 = conv3x3(out_chan, out_chan)
        self.bn2 = nn.BatchNorm2d(out_chan)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = None
        if in_chan != out_chan or stride != 1:
            self.downsample = nn.Sequential(
                nn.Conv2d(in_chan, out_chan,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_chan),
                )

    def forward(self, x):
        residual = self.conv1(x)
        residual = F.relu(self.bn1(residual))
        residual = self.conv2(residual)
        residual = self.bn2(residual)

        shortcut = x
        if self.downsample is not None:
            shortcut = self.downsample(x)

        out = shortcut + residual
        out = self.relu(out)
        return out


def create_layer_basic(in_chan, out_chan, bnum, stride=1):
    layers = [BasicBlock(in_chan, out_chan, stride=stride)]
    for i in range(bnum-1):
        layers.append(BasicBlock(out_chan, out_chan, stride=1))
    return nn.Sequential(*layers)


class Resnet18(nn.Module):
    def __init__(self):
        super(Resnet18, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = create_layer_basic(64, 64, bnum=2, stride=1)
        self.layer2 = create_layer_basic(64, 128, bnum=2, stride=2)
        self.layer3 = create_layer_basic(128, 256, bnum=2, stride=2)
        self.layer4 = create_layer_basic(256, 512, bnum=2, stride=2)
        self.init_weight()

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(self.bn1(x))
        x = self.maxpool(x)

        x = self.layer1(x)
        feat8 = self.layer2(x) # 1/8
        feat16 = self.layer3(feat8) # 1/16
        feat32 = self.layer4(feat16) # 1/32
        return feat8, feat16, feat32

    def init_weight(self):
        state_dict = modelzoo.load_url(resnet18_url)
        self_state_dict = self.state_dict()
        for k, v in state_dict.items():
            # Skip the classifier head: this backbone has no fc layer
            if 'fc' in k:
                continue
            self_state_dict.update({k: v})
        self.load_state_dict(self_state_dict)

    def get_params(self):
        wd_params, nowd_params = [], []
        for name, module in self.named_modules():
            if isinstance(module, (nn.Linear, nn.Conv2d)):
                wd_params.append(module.weight)
                if module.bias is not None:
                    nowd_params.append(module.bias)
            elif isinstance(module,  nn.BatchNorm2d):
                nowd_params += list(module.parameters())
        return wd_params, nowd_params


if __name__ == "__main__":
    net = Resnet18()
    x = torch.randn(16, 3, 224, 224)
    out = net(x)
    print(out[0].size())
    print(out[1].size())
    print(out[2].size())
    net.get_params()

This code defines a deep learning model based on the ResNet18 network architecture. Here is an explanation of each part of the code:

Imports: The torch.nn.functional module and the torch.utils.model_zoo module are imported. The former contains many neural network functions, and the latter can load pre-trained models. The code also uses torch and torch.nn (as nn), so both must be imported for the script to run.

resnet18_url: A string variable holding the URL of the pre-trained ResNet18 weights.

conv3x3: A function that returns a 2D convolution layer with a 3x3 kernel, a padding of 1, and no bias. The stride can be customized.

BasicBlock: A class that inherits from PyTorch's nn.Module base class. It represents the basic block in ResNet and contains two 3x3 convolutional layers. If the input and output channel counts differ, or the stride is not 1, a 1x1 convolution followed by batch normalization adjusts the shortcut so that it can be added to the main path.

create_layer_basic: A function that stacks multiple BasicBlocks into one stage.

Resnet18.__init__: The Resnet18 class also inherits from nn.Module and represents the ResNet18 network structure, without the final pooling and fully connected layers.

Resnet18.forward: Defines the forward propagation process of the network and returns three feature maps.

Resnet18.init_weight: Loads weights from the pre-trained model and updates the weights of the current model.

Resnet18.get_params: Separates the parameters that should receive weight decay from those that should not.

Main block: This part of the code is under if __name__ == "__main__", which means that it is only executed when the script is run directly. It instantiates the Resnet18 class, generates a batch of random data, passes it through the model, and prints the sizes of the three returned feature maps.

So the main goal of this script is to define a ResNet18 backbone, load the pre-trained weights into it, and run forward propagation on input data to produce feature maps at three scales.

Here is the code explained line by line:

 
 

import torch.nn.functional as F

This line imports PyTorch's functional module, which provides stateless versions of many neural network operations, such as activation functions and loss functions.

 
 

import torch.utils.model_zoo as modelzoo

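This line imports PyTorch's model_zoo module. Its load_url function downloads a checkpoint from a URL, caches it locally, and returns the loaded state dict. In current PyTorch releases it is the same function as torch.hub.load_state_dict_from_url.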

 
 

resnet18_url = 'https://download.pytorch.org/models/resnet18-5c106cde.pth'

This line sets the download URL of the pre-trained ResNet18 weights.

 
 

def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=1, bias=False)

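Defines a function called conv3x3 that returns a 3x3 convolutional layer with a padding of 1. Its parameters are the number of input channels, the number of output channels, and the stride. The bias is disabled because every convolution in this file is followed by a batch normalization layer, which has its own learnable shift.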
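To see what the padding of 1 buys, the standard convolution output-size formula, floor((n + 2p - k) / s) + 1, can be checked by hand. The following is a minimal sketch; conv_out_size is a hypothetical helper written for this illustration and is not part of resnet.py.

```python
def conv_out_size(n, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution (dilation 1), per the standard formula."""
    return (n + 2 * padding - kernel_size) // stride + 1


# conv3x3 always uses kernel_size=3 and padding=1
same = conv_out_size(56, kernel_size=3, stride=1, padding=1)    # stride 1 keeps the size
halved = conv_out_size(56, kernel_size=3, stride=2, padding=1)  # stride 2 halves it
print(same, halved)
```

With stride 1 the spatial size is preserved, and with stride 2 it is halved, which is why the stride alone controls the resolution of each stage.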

 
 

class BasicBlock(nn.Module):
    def __init__(self, in_chan, out_chan, stride=1):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(in_chan, out_chan, stride)
        self.bn1 = nn.BatchNorm2d(out_chan)
        self.conv2 = conv3x3(out_chan, out_chan)
        self.bn2 = nn.BatchNorm2d(out_chan)
        self.relu = nn.ReLU(inplace=True)

A class called BasicBlock is defined, which is the basic building block of ResNet. It consists of two 3x3 convolutional layers, each followed by a batch normalization layer, and a ReLU activation function.

 
 

        self.downsample = None
        if in_chan != out_chan or stride != 1:
            self.downsample = nn.Sequential(
                nn.Conv2d(in_chan, out_chan,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_chan),
                )

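This part of the code handles the case where the number of input channels differs from the number of output channels, or the stride is not 1. The input then no longer has the same shape as the output of the main path, so the shortcut passes through a 1x1 convolution with the same stride, followed by batch normalization, to make the dimensions match.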
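The shape mismatch is easy to reproduce in isolation. The sketch below assumes a block that goes from 64 to 128 channels with stride 2 (the configuration of the first block in layer2) and uses a single convolution as a stand-in for the main path; the tensor sizes are illustrative only.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 56, 56)

# Stand-in for the main path: doubles the channels and halves the resolution
main_path = nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1, bias=False)
residual = main_path(x)

# Adding the raw input fails because the shapes no longer agree
identity_add_failed = False
try:
    _ = x + residual
except RuntimeError:
    identity_add_failed = True

# Projection shortcut, built the same way as BasicBlock.downsample
project = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=1, stride=2, bias=False),
    nn.BatchNorm2d(128),
)
shortcut = project(x)
out = shortcut + residual

print(identity_add_failed, tuple(shortcut.shape), tuple(out.shape))
```

The 1x1 convolution changes only the channel count and, through its stride, the resolution, so it is the cheapest layer that makes the addition valid.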

 
 

    def forward(self, x):
        residual = self.conv1(x)
        residual = F.relu(self.bn1(residual))
        residual = self.conv2(residual)
        residual = self.bn2(residual)

In the forward function, the input x first goes through a convolution, batch normalization, and ReLU activation, and then through a second convolution and batch normalization without an activation. The result is the residual.

 
 

        shortcut = x
        if self.downsample is not None:
            shortcut = self.downsample(x)

        out = shortcut + residual
        out = self.relu(out)
        return out

Next, the code checks whether the input x needs to go through the downsample branch, and applies it if so. The shortcut (projected or not) is then added to the residual, and the sum is passed through the ReLU activation function to produce the output.
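Because the residual comes straight out of a batch normalization layer, it can be negative, and the ReLU is applied only after the addition. A tiny sketch with hand-picked values (not taken from the model) shows the effect:

```python
import torch
import torch.nn.functional as F

shortcut = torch.tensor([1.0, 2.0, 3.0])
residual = torch.tensor([-2.0, -0.5, 1.0])  # may be negative: no ReLU after bn2

# Same order of operations as in BasicBlock.forward: add first, then activate
out = F.relu(shortcut + residual)
print(out.tolist())
```

The residual can therefore both increase and decrease the shortcut signal before the activation clips the sum at zero.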

 
 

def create_layer_basic(in_chan, out_chan, bnum, stride=1):
    layers = [BasicBlock(in_chan, out_chan, stride=stride)]
    for i in range(bnum-1):
        layers.append(BasicBlock(out_chan, out_chan, stride=1))
    return nn.Sequential(*layers)

This function creates a stage of BasicBlocks; the bnum parameter specifies how many. The first BasicBlock maps in_chan to out_chan and applies the given stride, while every subsequent BasicBlock keeps out_chan channels and uses a stride of 1.
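The block layout that create_layer_basic produces can be listed without building any modules. In this sketch, layer_block_configs is a hypothetical helper that mirrors the loop above and returns one (in_chan, out_chan, stride) tuple per block.

```python
def layer_block_configs(in_chan, out_chan, bnum, stride=1):
    """(in_chan, out_chan, stride) for each BasicBlock in one stage."""
    configs = [(in_chan, out_chan, stride)]             # first block: may change shape
    configs += [(out_chan, out_chan, 1)] * (bnum - 1)   # remaining blocks: shape-preserving
    return configs


configs = layer_block_configs(64, 128, bnum=2, stride=2)
print(configs)
```

Only the first tuple in each stage can differ in channels or stride, so at most the first block of a stage needs the projection shortcut.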

 
 

class Resnet18(nn.Module):
    def __init__(self):
        super(Resnet18, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

The Resnet18 class is defined. Its stem contains a 7x7 convolutional layer, a batch normalization layer, and a max pooling layer.

 
 

        self.layer1 = create_layer_basic(64, 64, bnum=2, stride=1)
        self.layer2 = create_layer_basic(64, 128, bnum=2, stride=2)
        self.layer3 = create_layer_basic(128, 256, bnum=2, stride=2)
        self.layer4 = create_layer_basic(256, 512, bnum=2, stride=2)
        self.init_weight()

Four stages are defined, each composed of two BasicBlocks. The init_weight() function is then called to load the pre-trained weights into the network.
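The key filtering that init_weight performs can be sketched with plain dictionaries. The keys below follow the naming of the torchvision ResNet18 checkpoint, but the values are placeholder strings rather than tensors, so nothing is downloaded; own_state is a stand-in for self.state_dict().

```python
# Stand-in for the downloaded checkpoint (real values would be tensors)
pretrained = {
    "conv1.weight": "w0",
    "bn1.weight": "w1",
    "layer2.0.downsample.0.weight": "w2",
    "fc.weight": "w3",
    "fc.bias": "w4",
}

# Stand-in for the backbone's own state dict, which has no fc entries
own_state = {
    "conv1.weight": None,
    "bn1.weight": None,
    "layer2.0.downsample.0.weight": None,
}

for key, value in pretrained.items():
    if "fc" in key:  # skip the classifier head
        continue
    own_state.update({key: value})

print(sorted(own_state))
```

The filter matters because load_state_dict is strict by default and raises an error on unexpected keys such as fc.weight. The copy works only because the attribute names in this file (conv1, bn1, layer1 to layer4, downsample) match the names used in the checkpoint.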

 
 
    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(self.bn1(x))
        x = self.maxpool(x)

        x = self.layer1(x)
        feat8 = self.layer2(x) # 1/8
        feat16 = self.layer3(feat8) # 1/16
        feat32 = self.layer4(feat16) # 1/32
        return feat8, feat16, feat32

Defines the forward pass. First, the input goes through a convolution, a batch normalization, a ReLU activation, and a max pooling. It then passes through the four stages in turn. The outputs of layer2, layer3, and layer4 are returned as feat8, feat16, and feat32, at 1/8, 1/16, and 1/32 of the input resolution respectively.
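The 1/8, 1/16, and 1/32 factors follow from chaining the strided layers. The sketch below traces one spatial dimension through the network using the standard output-size formula; out_size and backbone_sizes are hypothetical helpers, and only the layers that change the resolution are listed.

```python
def out_size(n, k, s, p):
    """Output size of a convolution or max pooling along one dimension."""
    return (n + 2 * p - k) // s + 1


def backbone_sizes(n):
    """Side lengths of feat8, feat16 and feat32 for an input of side n."""
    n = out_size(n, k=7, s=2, p=3)            # conv1: 1/2
    n = out_size(n, k=3, s=2, p=1)            # maxpool: 1/4 (layer1 keeps this size)
    feat8 = out_size(n, k=3, s=2, p=1)        # first block of layer2
    feat16 = out_size(feat8, k=3, s=2, p=1)   # first block of layer3
    feat32 = out_size(feat16, k=3, s=2, p=1)  # first block of layer4
    return feat8, feat16, feat32


print(backbone_sizes(224))
print(backbone_sizes(512))
```

For a 224x224 input this gives 28, 14, and 7, so the three returned tensors have 128, 256, and 512 channels at those resolutions.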

 
 
Weight initialization is handled by the init_weight method shown in the full listing at the top. It first loads the state dict of the pre-trained ResNet18 model through model_zoo, then copies the entries into the current model's state dict and loads the result. Entries whose key contains 'fc' are skipped, because this backbone has no fully connected layer.

    def get_params(self):
        wd_params, nowd_params = [], []
        for name, module in self.named_modules():
            if isinstance(module, (nn.Linear, nn.Conv2d)):
                wd_params.append(module.weight)
                if module.bias is not None:
                    nowd_params.append(module.bias)
            elif isinstance(module, nn.BatchNorm2d):
                nowd_params += list(module.parameters())
        return wd_params, nowd_params

This function splits the model's parameters into two lists. The weights of the convolutional and linear layers are added to the weight decay list wd_params. If these layers have bias terms, the biases are added to the no-weight-decay list nowd_params. For batch normalization layers, all parameters are added to nowd_params.
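A typical use of such a split is to pass the two lists to an optimizer as separate parameter groups. The sketch below applies the same grouping logic to a small stand-in model; the learning rate, momentum, and weight decay values are assumptions chosen for illustration and do not come from the face-parsing.PyTorch training code.

```python
import torch
import torch.nn as nn

# Small stand-in model so the sketch runs without pre-trained weights
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False),
    nn.BatchNorm2d(8),
    nn.ReLU(inplace=True),
)

# Same grouping rule as get_params
wd_params, nowd_params = [], []
for module in model.modules():
    if isinstance(module, (nn.Linear, nn.Conv2d)):
        wd_params.append(module.weight)
        if module.bias is not None:
            nowd_params.append(module.bias)
    elif isinstance(module, nn.BatchNorm2d):
        nowd_params += list(module.parameters())

# One parameter group per list; hyperparameters are illustrative assumptions
optimizer = torch.optim.SGD(
    [
        {"params": wd_params, "weight_decay": 5e-4},
        {"params": nowd_params, "weight_decay": 0.0},
    ],
    lr=0.01,
    momentum=0.9,
)
print(len(wd_params), len(nowd_params))
```

Keeping biases and batch normalization parameters out of weight decay is a common convention, since shrinking them toward zero rarely helps regularization.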

 
 

if __name__ == "__main__":
    net = Resnet18()
    fms = net(torch.randn(1, 3, 224, 224))
    for fm in fms:
        print(fm.size())

This code is executed only when the script is run directly. It first creates an instance of Resnet18, then creates a batch of random image data, passes it through the network, and prints the size of each returned feature map.

The above is the detailed explanation of this code. This code defines a ResNet18 model and provides a method to load pre-trained weights. Also included is an example of how to use this model for forward propagation.

Origin blog.csdn.net/sinat_37574187/article/details/131617005