[Graph Convolutional Network] 01-Convolutional Neural Network: From Euclidean Space to Non-Euclidean Space

Artificial Neural Network Development Wave

The third wave - Convolutional Neural Network
Geoffrey Hinton, professor at the University of Toronto, Canada, and a leading figure in machine learning, together with his student published a paper in Science (Hinton, G. E. Reducing the Dimensionality of Data with Neural Networks. Science, 2006, 313(5786): 504-507) that opened a new wave of deep learning in academia and industry.
It was the 2012 ImageNet competition that brought convolutional neural networks truly widespread attention.

Convolution calculation and convolutional neural network structure

Convolution definition

  • Convolution is an important operation in mathematical analysis
  • Let f(x) and g(x) be two integrable functions on R
    • The continuous form of convolution is defined as
      (f ∗ g)(t) = ∫ f(τ) g(t − τ) dτ
    • The discrete form of convolution is defined as
      (f ∗ g)(n) = Σ_m f(m) g(n − m)
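The discrete definition can be checked directly against NumPy's built-in routine; the signal values below are illustrative:

```python
import numpy as np

# Discrete convolution: (f * g)(n) = sum_m f(m) * g(n - m)
def conv1d(f, g):
    n_out = len(f) + len(g) - 1
    out = np.zeros(n_out)
    for n in range(n_out):
        for m in range(len(f)):
            if 0 <= n - m < len(g):
                out[n] += f[m] * g[n - m]
    return out

f = np.array([1.0, 2.0, 3.0])
g = np.array([0.0, 1.0, 0.5])
print(conv1d(f, g))  # [0.  1.  2.5 4.  1.5], matches np.convolve(f, g)
```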

Basic concepts in convolution

  • Kernel size: the receptive field of the convolution operation. In two-dimensional convolution, the kernel size is usually odd so that the kernel's center aligns with the center of the computed result; a common choice is 3, i.e. a 3×3 kernel.
  • Stride: the step size of the kernel as it traverses the image; the default is usually 1.
  • Padding: how the sample boundary is handled, usually chosen so that the output has the same size as the input.
  • Input and output channels: constructing a convolutional layer requires defining the number of input channels I and output channels O. The layer's weight count is I×O×K, where K is the number of parameters in one kernel (e.g. K = 9 for a 3×3 kernel).
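These hyperparameters determine the output size and the parameter count. A minimal sketch using the standard output-size formula (biases counted per output channel):

```python
# Output size: out = floor((in + 2*padding - kernel) / stride) + 1
def conv_output_size(in_size, kernel, stride=1, padding=0):
    return (in_size + 2 * padding - kernel) // stride + 1

# Weight count I x O x K, plus one bias per output channel.
def conv_param_count(in_ch, out_ch, kernel, bias=True):
    return in_ch * out_ch * kernel * kernel + (out_ch if bias else 0)

print(conv_output_size(32, 3, stride=1, padding=1))  # 32: "same" padding
print(conv_param_count(3, 16, 3))                    # 3*16*9 + 16 = 448
```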

The characteristics of convolution

  1. Convolution calculations are local calculations
  2. Convolutions are feature detectors

More convolution animations

Convolution animations

N.B.: Blue maps are inputs, and cyan maps are outputs.

  • No padding, no strides
  • Arbitrary padding, no strides
  • Half padding, no strides
  • Full padding, no strides
  • No padding, strides
  • Padding, strides
  • Padding, strides (odd)

Transposed convolution animations


  • No padding, no strides, transposed
  • Arbitrary padding, no strides, transposed
  • Half padding, no strides, transposed
  • Full padding, no strides, transposed
  • No padding, strides, transposed
  • Padding, strides, transposed
  • Padding, strides, transposed (odd)

Dilated convolution animations


  • No padding, no stride, dilation

Basic concepts - pooling and fully connected layers

  • Pooling layer

    • A special form of convolution.
    • Reduces dimensionality and computation, mitigates overfitting, and provides feature invariance to small translations and scale changes.
  • Fully connected layer

    • The output layer of the model
    • Used for classification or regression
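A minimal sketch of 2×2 max pooling with stride 2 (assumes even input dimensions; the values are illustrative):

```python
import numpy as np

# 2x2 max pooling, stride 2: keep the strongest response in each window,
# halving both spatial dimensions.
def max_pool_2x2(x):
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.array([[1, 3, 2, 0],
              [4, 2, 1, 5],
              [0, 1, 7, 2],
              [3, 6, 4, 8]], dtype=float)
print(max_pool_2x2(x))  # [[4. 5.] [6. 8.]]
```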

Multilayer Convolutional Neural Network Example

The example uses three configurations:

  • (a) Kernel size 5×5, stride 1, no padding, 3 input channels, 2 output channels
  • (b) As in (a), but with 6 output channels
  • (c) Two convolutional layers: the first with 3 output channels, the second with 6
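The tensor shapes in such an example can be traced with a small helper; the 32×32 input size below is an illustrative assumption:

```python
# Shape after a 5x5 convolution, stride 1, no padding:
# spatial size shrinks by k - 1 = 4 in each dimension.
def conv_shape(shape, out_ch, k=5):
    c, h, w = shape
    return (out_ch, h - k + 1, w - k + 1)

shape = (3, 32, 32)           # input: 3 channels, 32x32
shape = conv_shape(shape, 3)  # layer 1 -> (3, 28, 28)
shape = conv_shape(shape, 6)  # layer 2 -> (6, 24, 24)
print(shape)                  # (6, 24, 24)
```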

The development history of convolutional neural network

Why convolutional neural networks developed so rapidly:

  • Data explosion: image data, text data, voice data, social network data, scientific computing, etc.
  • Computing performance has been greatly improved

The basics of modern convolutional neural network structure - LeNet

  • LeNet was proposed by Yann LeCun in 1994 for handwritten character recognition and classification
  • 6-layer network structure: two convolutional layers, two downsampling layers, and two fully connected layers
  • Each convolutional layer consists of two parts: the convolution computation and a sigmoid nonlinear activation function

The focus of research turns to convolutional neural networks - AlexNet, VGGNet

  • Deeper networks: AlexNet has 8 layers in total; VGGNet has 16 or 19
  • Data augmentation: to improve the model's generalization ability, 224×224 images are randomly cropped from the 256×256 originals and fed into the network for training
  • ReLU nonlinear activation: cheaper to compute, alleviates vanishing gradients, and helps reduce overfitting; ReLU has since become the most common activation function in neural networks
  • Dropout: neurons in the fully connected layers are deactivated with a certain probability and do not participate in that training step; introducing dropout effectively alleviates overfitting
  • Pre-training: first train a smaller sub-network until it is stable, then gradually deepen the network on that basis
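The random-crop augmentation and ReLU activation described above can be sketched as follows; the image content and RNG seed are illustrative:

```python
import numpy as np

# AlexNet-style augmentation sketch: randomly crop a 224x224 patch
# from a 256x256 image.
rng = np.random.default_rng(0)
img = rng.standard_normal((256, 256, 3))

top = rng.integers(0, 256 - 224 + 1)
left = rng.integers(0, 256 - 224 + 1)
crop = img[top:top + 224, left:left + 224]
print(crop.shape)  # (224, 224, 3)

# ReLU: max(0, x), applied element-wise.
relu = np.maximum(crop, 0.0)
print((relu >= 0).all())  # True
```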

Convolutional Neural Network Depth and Width Expansion - GoogLeNet

  • Deeper network: GoogLeNet has 22 layers in total
  • Multi-resolution structure: the Inception module replaces the traditional convolution + activation
  • Reduced computation: 1×1 convolution kernels reduce the channel dimension of the data

Figures: Inception structure; GoogLeNet structure
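The 1×1 dimensionality reduction amounts to a per-pixel matrix multiply across channels; the channel counts below are illustrative:

```python
import numpy as np

# 1x1 convolution as channel mixing: each output pixel is a linear
# combination of the input channels at the same spatial location.
rng = np.random.default_rng(1)
x = rng.standard_normal((256, 28, 28))  # 256 input channels
w = rng.standard_normal((64, 256))      # 64 output channels, 1x1 kernel

y = np.einsum('oc,chw->ohw', w, x)      # per-pixel matrix multiply
print(y.shape)                           # (64, 28, 28)
```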

Convolutional neural network depth, width and expansion - ResNet, DenseNet

  • Deeper networks: ResNet reaches more than one hundred layers (ResNet-101)
  • Residual connection: features are passed along two routes, the regular route and a shortcut
  • Skip connection: low-level features are fused with high-level features

Figures: ResNet residual connection; DenseNet structure

Convolutional neural network generalization extension

  • Depthwise separable convolution
    • A 5×5 convolution applied to each channel separately (depthwise)
    • A 1×1 convolution that fuses the features across channels (pointwise)
  • Atrous convolution (dilated convolution)
    • The number of sampled input points is unchanged
    • The receptive field becomes larger

Figures: dilated convolution receptive field; atrous convolution calculation process
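The parameter savings of a depthwise separable convolution can be counted directly; channel counts are illustrative and biases are omitted:

```python
# Weights of a standard 5x5 conv vs. depthwise 5x5 + pointwise 1x1.
c_in, c_out, k = 32, 64, 5

standard = c_in * c_out * k * k    # every output channel sees every input channel
depthwise = c_in * k * k           # one 5x5 filter per input channel
pointwise = c_in * c_out           # 1x1 fusion across channels
separable = depthwise + pointwise

print(standard, separable)         # 51200 2848
```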

Convolutional Neural Network Computational Paradigm

  • Multi-dimensional Euclidean space
  • Local spatial response
  • Convolution parameter sharing


Convolutional Neural Networks Extended to Non-Euclidean Spaces

Euclidean space irregular connection - active convolution

  • Active Convolution (CVPR 2017)
    • Bilinear interpolation: pixel values at continuous (fractional) positions are computed by interpolating between the surrounding discrete grid values
    • Learnable offset parameters Δαk, Δβk
    • The kernel shape is learnable, but once learned it is fixed and shared across all spatial positions
  • The shape of the convolution kernel can thus be changed
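Bilinear interpolation at a fractional position can be sketched as follows (a minimal implementation; boundary handling is omitted):

```python
import numpy as np

# Value at continuous position (y, x) from the four surrounding grid
# pixels, weighted by the fractional distances; this is how active and
# deformable convolutions sample at non-integer offsets.
def bilinear(img, y, x):
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    dy, dx = y - y0, x - x0
    return ((1 - dy) * (1 - dx) * img[y0, x0]
            + (1 - dy) * dx * img[y0, x0 + 1]
            + dy * (1 - dx) * img[y0 + 1, x0]
            + dy * dx * img[y0 + 1, x0 + 1])

img = np.array([[0.0, 1.0],
                [2.0, 3.0]])
print(bilinear(img, 0.5, 0.5))  # 1.5: the average of the four pixels
```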
Euclidean space irregular connection - deformable convolution

  • Deformable Convolution (ICCV 2017)

    • 3×3 deformable convolution (N = 9 sampling points)
    • Each sampling position has its own offset
    • The offsets are learned by an additional convolution layer
    • Each offset is a 2D vector
  • The kernel sampling positions are parameterized

  • Continuous positions are sampled via bilinear interpolation

  • Trained with the traditional backpropagation algorithm

Euclidean-Space Convolutional Neural Networks

  • Process input data of fixed dimension; the local input must be ordered
  • Speech, images, and video (regular structures) satisfy both requirements

Non-Euclidean structured data

  • The local input dimension varies (e.g. nodes have different numbers of neighbors)
  • The local input is unordered

Non-Euclidean Spatial Convolutional Neural Networks


Contents of this series

  1. Introduction, Convolutional Neural Networks: From Euclidean Space to Non-Euclidean Space (this article)
  2. Introduction to spectral graph convolution
  3. Introduction to spatial graph convolution (1)
  4. Introduction to spatial graph convolution (2)
  5. Practical applications of graph convolution
  6. Graph convolution code implementation based on PyTorch

Follow-up notes will be updated in the column "Graph Convolutional Neural Networks".


Origin blog.csdn.net/happy488127311/article/details/128769291