
1. net.py file

  1. The purpose of the `#coding:utf8` declaration on the first line is to tell the Python interpreter that this source file is encoded in UTF-8.
```python
#coding:utf8

# Copyright 2019 longpeng2008. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License");
# If you find any problem,please contact us
#
#     [email protected] 
#
# or create issues
# =============================================================================
import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
from torchsummary import summary

## Definition of the 3-layer convolutional neural network simpleconv3
## It contains 3 conv layers, 3 BN layers, 3 ReLU activations and 3 fully connected layers

class simpleconv3(nn.Module):
    ## Initialization
    def __init__(self, nclass):
        super(simpleconv3, self).__init__()
        self.conv1 = nn.Conv2d(3, 12, 3, 2) # input image 3*48*48, output feature map 12*23*23, 3*3 kernel, stride 2
        self.bn1 = nn.BatchNorm2d(12)
        self.conv2 = nn.Conv2d(12, 24, 3, 2) # input 12*23*23, output 24*11*11, 3*3 kernel, stride 2
        self.bn2 = nn.BatchNorm2d(24)
        self.conv3 = nn.Conv2d(24, 48, 3, 2) # input 24*11*11, output 48*5*5, 3*3 kernel, stride 2
        self.bn3 = nn.BatchNorm2d(48)
        self.fc1 = nn.Linear(48 * 5 * 5, 1200) # input vector length 48*5*5=1200, output length 1200
        self.fc2 = nn.Linear(1200, 128) # input length 1200, output length 128
        self.fc3 = nn.Linear(128, nclass) # input length 128, output length nclass (the number of classes)

    ## Forward pass
    def forward(self, x):
        ## F.relu is a plain function, so it is called directly without instantiation
        ## conv and fc layers are nn.Module instances created in __init__
        x = F.relu(self.bn1(self.conv1(x)))
        x = F.relu(self.bn2(self.conv2(x)))
        x = F.relu(self.bn3(self.conv3(x)))
        x = x.view(-1, 48 * 5 * 5) # flatten: [N, 48, 5, 5] -> [N, 1200]
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

if __name__ == '__main__':
    x = torch.randn(1,3,48,48)
    model = simpleconv3(4)
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    model.to(device)
    x = x.to(device)
    y = model(x)
    print(model)

    summary(model, (3, 48, 48), device=device.type) # torchsummary expects "cuda" or "cpu"
```
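
As a quick sanity check of the feature-map sizes noted in the comments: for a convolution with no padding, the standard output-size formula is floor((h - k) / s) + 1. A minimal sketch (the helper `conv_out` is just for illustration, not part of net.py):

```python
# With no padding, a conv layer maps spatial size h to floor((h - k) / s) + 1.
def conv_out(h, k=3, s=2):
    return (h - k) // s + 1

h = 48
for name in ("conv1", "conv2", "conv3"):
    h = conv_out(h)
    print(name, "->", h)                 # 23, 11, 5
print("flattened length:", 48 * 5 * 5)   # 1200, matches fc1's input size
```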

2. net.py file analysis

* nn.BatchNorm2d(num_features)
nn.BatchNorm2d applies 2D batch normalization, an operation used to speed up the training of deep neural networks and improve their generalization. It normalizes the output of a convolutional layer so that the input distribution seen by the next layer stays stable, reducing the internal covariate shift problem and thereby improving both training speed and generalization performance.

In this code, nn.BatchNorm2d(24) batch-normalizes a 2D feature map with 24 channels. The operation normalizes each channel to zero mean and unit standard deviation (followed by a learnable per-channel scale and shift), which makes training more stable and improves the generalization ability of the model.
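
A minimal standalone sketch of this behavior (the batch shape 8*24*11*11 is just an example): in training mode, each of the 24 channels is normalized over the batch and spatial dimensions.

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(24)
x = 5 * torch.randn(8, 24, 11, 11) + 3  # input with arbitrary scale and shift
y = bn(x)                               # training mode: normalize with batch statistics
print(y.mean(dim=(0, 2, 3)))            # per-channel means, all close to 0
print(y.std(dim=(0, 2, 3)))             # per-channel stds, all close to 1
```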

* Why three fully connected layers?
In deep learning, a convolutional neural network (CNN) is usually composed of convolutional layers, activation layers, pooling layers and fully connected layers. The convolutional layers extract features from the image, the activation layers apply a nonlinear mapping to the convolutional outputs, the pooling layers reduce the spatial dimensions of those outputs, and the fully connected layers combine the extracted features into a final feature vector that is fed to a softmax classifier for classification.
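
Note that forward above returns raw logits rather than probabilities. A hedged sketch of the final classification step (during training one would normally pass the logits straight to nn.CrossEntropyLoss, which applies log-softmax internally):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(1, 4)        # stand-in for y = model(x) with nclass = 4
probs = F.softmax(logits, dim=1)  # class probabilities, summing to 1
pred = probs.argmax(dim=1)        # index of the predicted class
print(probs, pred)
```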

In this code, the simpleconv3 class includes 3 convolutional layers, 3 BN layers, 3 ReLU activation layers and 3 fully connected layers. The 3 fully connected layers (fc1, fc2, fc3) combine the features extracted by the convolutional stack (this network downsamples with stride-2 convolutions rather than pooling) and output the classification result. The first fully connected layer maps an input vector of length 48*5*5 = 1200 to an output of length 1200, the second maps 1200 to 128, and the third maps 128 to nclass, the number of categories in the classification task. The role of these 3 fully connected layers is therefore to map the high-dimensional feature vector down to a low-dimensional vector used for classification.
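
A quick way to verify this mapping, reusing the simpleconv3 class defined above (nclass = 4 is just an example):

```python
import torch

model = simpleconv3(4)                   # nclass = 4 for illustration
feat = torch.randn(1, 48 * 5 * 5)        # stand-in for the flattened conv features
h1 = model.fc1(feat); print(h1.shape)    # torch.Size([1, 1200])
h2 = model.fc2(h1);   print(h2.shape)    # torch.Size([1, 128])
out = model.fc3(h2);  print(out.shape)   # torch.Size([1, 4]) -> one logit per class
```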
