[Deep Learning Experiment] Convolutional Neural Network (2): Customize a simple two-dimensional convolutional neural network

Table of contents

1. Experiment introduction

2. Experimental environment

1. Configure the virtual environment

2. Library version introduction

3. Experimental content

0. Import necessary toolkits

1. Two-dimensional cross-correlation operation (corr2d)

2. Two-dimensional convolution layer class (Conv2D)

a. __init__ (initialization)

b. forward (forward propagation function)

3. Model testing


1. Experiment introduction

        This experiment implements the building blocks of a simple two-dimensional convolutional neural network: a two-dimensional cross-correlation function and a custom two-dimensional convolution layer class. It then applies the layer to a randomly generated two-dimensional tensor.

2. Experimental environment

    This series of experiments uses the PyTorch deep learning framework. The relevant operations are as follows:

1. Configure the virtual environment

conda create -n DL python=3.7
conda activate DL
pip install torch==1.8.1+cu102 torchvision==0.9.1+cu102 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
conda install matplotlib
conda install scikit-learn

2. Library version introduction

Package        This experiment    Latest version at the time of writing
matplotlib     3.5.3              3.8.0
numpy          1.21.6             1.26.0
python         3.7.16
scikit-learn   0.22.1             1.3.0
torch          1.8.1+cu102        2.0.1
torchaudio     0.8.1              2.0.2
torchvision    0.9.1+cu102        0.15.2

3. Experimental content

ChatGPT:

        A Convolutional Neural Network (CNN) is a deep learning model that is widely used in image recognition, computer vision, pattern recognition and other fields. Its design is inspired by how the biological visual cortex works.

        A convolutional neural network consists of multiple convolutional layers, pooling layers and fully connected layers.

  • The convolutional layer extracts local features of the image. By applying convolution operations followed by activation functions, it learns a feature representation of the image.
  • The pooling layer downsamples the feature map, which reduces its size and the number of parameters in later layers while retaining the main feature information.
  • The fully connected layer maps the extracted features to class probabilities or other outputs for classification or regression tasks.

        Convolutional neural networks are particularly well suited to image processing. They automatically learn hierarchical feature representations and are, to some degree, invariant to image transformations such as translation, scaling, and rotation. These properties make them the model of choice for tasks such as image classification, object detection, and semantic segmentation. Beyond image processing, they can also be applied to other fields, such as natural language processing and time series analysis: once text or time series data is converted into a two-dimensional form, a convolutional neural network can process it.
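        As a rough illustration of how the three layer types described above fit together, here is a minimal sketch using PyTorch's built-in layers. The channel count, kernel size, the 28x28 input size and the 10 output classes are arbitrary assumptions chosen only for this example; they do not come from the experiment itself.

```python
import torch
from torch import nn

# A tiny CNN: convolution -> activation -> pooling -> fully connected.
# All sizes below are illustrative assumptions, not values from this experiment.
model = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=4, kernel_size=3),  # (1, 28, 28) -> (4, 26, 26)
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),                              # (4, 26, 26) -> (4, 13, 13)
    nn.Flatten(),                                             # (4, 13, 13) -> 676
    nn.Linear(4 * 13 * 13, 10),                               # 676 features -> 10 class scores
)

x = torch.randn(1, 1, 28, 28)  # one single-channel 28x28 "image"
logits = model(x)
print(logits.shape)  # torch.Size([1, 10])
```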

0. Import necessary toolkits

import torch
from torch import nn
import torch.nn.functional as F
  • torch.nn: PyTorch's neural network module, which provides various neural network layers and related building blocks.
  • torch.nn.functional: the functional counterparts of those layers in PyTorch, such as activation functions and loss functions.

1. Two-dimensional cross-correlation operation (corr2d)

Previous article: [Deep Learning Experiment] Convolutional Neural Network (1): Convolution operation and its PyTorch implementation (one-dimensional convolution: narrow convolution, wide convolution, equal-width convolution; two-dimensional convolution) https://blog.csdn.net/m0_63834988/article/details/133278425?spm=1001.2014.3001.5501

        As shown in the previous article, computing a convolution requires flipping the convolution kernel. In practice, implementations generally use the cross-correlation operation instead of convolution, which avoids this unnecessary operation and its overhead.

  • Flipping means reversing the order in both dimensions (top to bottom and left to right), that is, rotating the kernel by 180 degrees.
  • The only difference between cross-correlation and convolution is whether the convolution kernel is flipped. Cross-correlation can therefore also be called non-flipped convolution.

        Convolution is used in neural networks for feature extraction, and whether the convolution kernel is flipped has no effect on its feature extraction capability. In particular, when the convolution kernel is a learnable parameter, convolution and cross-correlation are equivalent in capability. For convenience of implementation (and description), we therefore use cross-correlation instead of convolution. In fact, the convolution operations in many deep learning tools are actually cross-correlation operations.
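        The relationship between the two operations can be checked numerically. The following is a minimal sketch, not part of the original experiment: `corr2d_sketch` and `conv2d_flipped` are hypothetical helper names introduced here, and `torch.flip` over both dimensions stands in for the 180-degree rotation of the kernel.

```python
import torch

def corr2d_sketch(X, K):
    """Plain 2D cross-correlation (no kernel flip, no padding)."""
    h, w = K.shape
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y

def conv2d_flipped(X, K):
    """True 2D convolution: rotate the kernel by 180 degrees, then cross-correlate."""
    return corr2d_sketch(X, torch.flip(K, dims=(0, 1)))

X = torch.arange(16, dtype=torch.float32).reshape(4, 4)

# Asymmetric kernel: flipping changes the result.
K = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
corr_out = corr2d_sketch(X, K)
conv_out = conv2d_flipped(X, K)
print(corr_out[0, 0].item(), conv_out[0, 0].item())  # 34.0 16.0

# Kernel that is unchanged by a 180-degree rotation: both operations agree.
K_sym = torch.ones((2, 2))
same = torch.equal(corr2d_sketch(X, K_sym), conv2d_flipped(X, K_sym))
print(same)  # True
```

        Since a learned kernel can simply converge to the flipped version of itself, a layer built on cross-correlation can represent exactly the same set of functions as one built on true convolution.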

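def corr2d(X, K):
    """Two-dimensional cross-correlation of input X with kernel K (no padding, stride 1)."""
    h, w = K.shape
    # The output shrinks by (kernel size - 1) in each dimension
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            # Multiply the current window element-wise by the kernel and sum
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y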

  • Input: the input tensor X and the convolution kernel tensor K, both two-dimensional.
  • Output: the cross-correlation result tensor Y, with shape (X.shape[0] - K.shape[0] + 1, X.shape[1] - K.shape[1] + 1).
  • Two nested loops traverse every position of the output tensor Y. At each position, the corresponding window of X is multiplied element-wise by K and the products are summed.
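        A small worked example may make the window arithmetic and the output shape concrete. This is only an illustrative sketch: the input values and kernels below are arbitrary choices, and corr2d is repeated so that the snippet runs on its own.

```python
import torch

def corr2d(X, K):
    h, w = K.shape
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y

# 3x3 input [[0, 1, 2], [3, 4, 5], [6, 7, 8]] and a 2x2 kernel
X = torch.arange(9, dtype=torch.float32).reshape(3, 3)
K = torch.tensor([[0.0, 1.0], [2.0, 3.0]])
Y = corr2d(X, K)

# Top-left element: 0*0 + 1*1 + 3*2 + 4*3 = 19
print(Y.tolist())  # [[19.0, 25.0], [37.0, 43.0]]

# Non-square case: a 6x8 input with a 3x2 kernel gives a (6-3+1, 8-2+1) output
Y2 = corr2d(torch.ones((6, 8)), torch.ones((3, 2)))
print(Y2.shape)  # torch.Size([4, 7])
```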

2. Two-dimensional convolution layer class (Conv2D)

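class Conv2D(nn.Module):
    def __init__(self, kernel_size, weight=None):
        super().__init__()
        if weight is not None:
            # Use the kernel supplied by the caller
            self.weight = weight
        else:
            # Random kernel of shape kernel_size, registered as a trainable parameter
            self.weight = nn.Parameter(torch.rand(kernel_size))
        # Scalar bias, also trainable
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        # Cross-correlate the input with the kernel, then add the bias
        return corr2d(x, self.weight) + self.bias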

a. __init__ (initialization)

  • Accepts a kernel_size argument giving the size of the convolution kernel, and an optional weight argument giving the weights of the convolution kernel.
  • If no weight argument is provided, a weight tensor of shape kernel_size is randomly generated and set as a trainable parameter (nn.Parameter).
  • A bias term, bias, is defined and set as a trainable parameter.

b. forward (forward propagation function)

        Calls the corr2d function defined earlier to compute the cross-correlation between the input x and the convolution kernel self.weight, then adds the bias term self.bias to the result and returns it as the output of forward propagation.
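        The two initialization paths behave slightly differently, which the following sketch tries to show. It repeats corr2d and Conv2D so that it runs on its own; the horizontal-difference kernel `edge_kernel` and the step-shaped input are assumptions made up for this example.

```python
import torch
from torch import nn

def corr2d(X, K):
    h, w = K.shape
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y

class Conv2D(nn.Module):
    def __init__(self, kernel_size, weight=None):
        super().__init__()
        if weight is not None:
            self.weight = weight
        else:
            self.weight = nn.Parameter(torch.rand(kernel_size))
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        return corr2d(x, self.weight) + self.bias

# Path 1: no weight given, so both weight and bias are registered as parameters.
conv_random = Conv2D(kernel_size=(2, 2))
random_names = [name for name, _ in conv_random.named_parameters()]
print(random_names)  # ['weight', 'bias']

# Path 2: a plain tensor is passed as weight. It is stored as an ordinary
# attribute, so only the bias shows up as a trainable parameter.
edge_kernel = torch.tensor([[1.0, -1.0]])  # hypothetical example kernel
conv_fixed = Conv2D(kernel_size=edge_kernel.shape, weight=edge_kernel)
fixed_names = [name for name, _ in conv_fixed.named_parameters()]
print(fixed_names)  # ['bias']

# The fixed kernel responds only where neighbouring columns differ.
X = torch.ones((2, 4))
X[:, 2:] = 0  # rows look like [1, 1, 0, 0]
out = conv_fixed(X)
print(out.detach().tolist())  # [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
```

        If a caller wants a supplied kernel to be trainable as well, one option is to wrap it in nn.Parameter before passing it in.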

3. Model testing

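# The convolution layer does not support multiple channels yet, so the image is single-channel as well
fake_image = torch.randn((5,5))
# Instantiate the convolution operator
conv = Conv2D(kernel_size=(3,3))
output = conv(fake_image)
# Print the shape of the result
print(output.shape)  # torch.Size([3, 3])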

        A random input image fake_image of size (5, 5) is created, then the Conv2D class is instantiated with a convolution kernel size of (3, 3). Calling the conv object invokes its forward method, which performs the convolution operation on fake_image; the result is saved in the variable output. Finally, the shape of output can be checked: for a (5, 5) input and a (3, 3) kernel it is (3, 3).
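        One way to sanity-check the custom layer is to compare it against PyTorch's built-in nn.Conv2d with the same kernel and bias. The sketch below is an addition for illustration, not part of the original experiment; the reference layer `ref`, the fixed seed and the tolerance are assumptions. Note that nn.Conv2d expects a four-dimensional input of shape (batch, channels, height, width), so the image and kernel are reshaped accordingly.

```python
import torch
from torch import nn

def corr2d(X, K):
    h, w = K.shape
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y

class Conv2D(nn.Module):
    def __init__(self, kernel_size, weight=None):
        super().__init__()
        if weight is not None:
            self.weight = weight
        else:
            self.weight = nn.Parameter(torch.rand(kernel_size))
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        return corr2d(x, self.weight) + self.bias

torch.manual_seed(0)  # fixed seed only so that repeated runs match
fake_image = torch.randn((5, 5))
conv = Conv2D(kernel_size=(3, 3))
output = conv(fake_image)

# Built-in single-channel layer, loaded with the same kernel and bias.
ref = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3)
with torch.no_grad():
    ref.weight.copy_(conv.weight.reshape(1, 1, 3, 3))
    ref.bias.copy_(conv.bias)

ref_output = ref(fake_image.reshape(1, 1, 5, 5)).reshape(3, 3)
matches = torch.allclose(output, ref_output, atol=1e-5)
print(output.shape, matches)
```

        The agreement also illustrates the earlier point that the built-in "convolution" layer actually computes a cross-correlation.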

Note: This experiment implements only a basic two-dimensional convolution layer. It supports only single-channel convolution and does not include training or optimization, which are covered in the next article in this series.

Origin blog.csdn.net/m0_63834988/article/details/133278280