Chapter 7: Convolutional Neural Networks

Convolutional neural networks

Overall architecture

  • Fully connected layer: every neuron is connected to all neurons in the adjacent layers
  • Structure of a CNN (a layer-ordering sketch follows this list)
    • Convolution layer
    • ReLU layer
    • Pooling layer
    • Layers near the output use the "Affine-ReLU" combination
    • The final output layer uses the "Affine-Softmax" combination
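A minimal sketch of this layer ordering (the layer names below are purely illustrative, not code from the book):

# Illustrative ordering of layers in a small CNN (names are hypothetical)
layers = ["Conv", "ReLU", "Pooling",   # feature-extraction block, repeated as needed
          "Conv", "ReLU", "Pooling",
          "Affine", "ReLU",            # near the output: Affine-ReLU
          "Affine", "Softmax"]         # final output layer: Affine-Softmax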

Convolution layer

Problems with the fully connected layer

  • The shape of the data is ignored: when an image is input to a fully connected layer, the multi-dimensional data is flattened into one-dimensional data
  • A convolution layer can keep the shape of the data unchanged
  • The input and output data of a convolution layer are called feature maps

Convolution operation

  • Corresponds to a filter operation in image processing
  • At each position, the elements of the filter are multiplied by the corresponding elements of the input and the products are summed (a multiply-accumulate operation); see the sketch below
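A minimal NumPy sketch of this multiply-accumulate operation on a single-channel input (the array values are made up purely for illustration):

import numpy as np

x = np.array([[1, 2, 3, 0],
              [0, 1, 2, 3],
              [3, 0, 1, 2],
              [2, 3, 0, 1]])          # 4x4 input
w = np.array([[2, 0, 1],
              [0, 1, 2],
              [1, 0, 2]])             # 3x3 filter

out = np.zeros((2, 2))                # output size: (4 - 3)/1 + 1 = 2
for i in range(2):
    for j in range(2):
        # multiply corresponding elements and sum them up
        out[i, j] = np.sum(x[i:i+3, j:j+3] * w)
print(out)                            # [[15. 16.]
                                      #  [ 6. 15.]]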

Padding

  • Before the convolution operation, the area surrounding the input data is sometimes filled with fixed values (such as 0)
  • Padding is used mainly to adjust the output size, for example to keep the spatial size of the output the same as the input

Stride

  • The interval between positions at which the filter is applied is called the stride

Calculating the size of the output

  • Assume the input size is \((H, W)\), the filter size is \((FH, FW)\), the output size is \((OH, OW)\), the padding is \(P\), and the stride is \(S\). Then the output size is
    \[OH = \frac{H + 2P - FH}{S} + 1 \\ OW = \frac{W + 2P - FW}{S} + 1\]
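A small helper function, written here only for illustration, evaluates this formula:

def conv_output_size(input_size, filter_size, pad=0, stride=1):
    # OH = (H + 2P - FH) / S + 1
    return (input_size + 2 * pad - filter_size) // stride + 1

# 7x7 input, 5x5 filter, padding 0, stride 1 -> 3x3 output
print(conv_output_size(7, 5, pad=0, stride=1))   # 3
# 7x7 input, 3x3 filter, padding 1, stride 1 -> 7x7 output ("same" size)
print(conv_output_size(7, 3, pad=1, stride=1))   # 7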

Convolution operation on three-dimensional data

  • When the input has multiple channels, the convolution operation is performed per channel between the input data and the filter, and the results are summed to produce one output feature map
  • The filter weights are written as 4-dimensional data in the order (output_channel, input_channel, height, width)

Batch processing

  • Data passed between layers is stored as 4-dimensional data, in the order (batch_num, channel, height, width); see the shape sketch below
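A minimal sketch of these shape conventions (random data, with shapes chosen only for illustration):

import numpy as np

x = np.random.rand(10, 3, 28, 28)   # input: (batch_num, channel, height, width)
W = np.random.rand(16, 3, 5, 5)     # filters: (output_channel, input_channel, height, width)

# Convolving x with W (stride 1, pad 0) would give an output of shape
# (batch_num, output_channel, OH, OW) = (10, 16, 24, 24),
# since OH = OW = (28 + 2*0 - 5)/1 + 1 = 24.
print(x.shape, W.shape)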

Pooling layer

  • Pooling is an operation that reduces the size of the data in the height and width (spatial) directions
  • The pooling window size and the stride are usually set to the same value (a minimal example follows this list)
  • Features
    • Has no parameters to learn
    • Does not change the number of channels
    • Robust to small positional shifts (when the input data shifts slightly, pooling returns the same result)
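A minimal NumPy sketch of 2x2 max pooling with a stride of 2 (values are made up for illustration):

import numpy as np

x = np.array([[1, 2, 1, 0],
              [0, 1, 2, 3],
              [3, 0, 1, 2],
              [2, 4, 0, 1]])           # 4x4 input, single channel

out = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        # take the maximum within each 2x2 window
        out[i, j] = x[2*i:2*i+2, 2*j:2*j+2].max()
print(out)                             # [[2. 3.]
                                       #  [4. 2.]]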

Implementation of the convolution layer and the pooling layer

im2col function

  • im2col expands the input data to fit the filter
  • After applying im2col, the 3-dimensional input data becomes a 2-dimensional matrix
  • Input data: im2col unrolls each region to which the filter is applied into one row of the 2-dimensional matrix
  • Filter: each filter is expanded vertically into one column
  • After multiplying the two 2-dimensional matrices, the result must be reshaped back to the appropriate output size
import sys, os
sys.path.append(os.path.join(os.getcwd(), "sourcecode"))
from common.util import im2col
import numpy as np

x1 = np.random.rand(1, 3, 7, 7)            # batch size 1, 3 channels, 7x7 data
col1 = im2col(x1, 5, 5, stride=1, pad=0)   # filter of size 5x5 (3 channels)
print(col1.shape)

x2 = np.random.rand(10, 3, 7, 7)           # batch size 10
col2 = im2col(x2, 5, 5, stride=1, pad=0)
print(col2.shape)
(9, 75)
(90, 75)
class Convolution:
    def __init__(self, W, b, stride=1, pad=0):
        self.W = W                     # filter weights, shape (FN, C, FH, FW)
        self.b = b                     # biases, shape (FN,)
        self.stride = stride
        self.pad = pad

    def forward(self, x):
        FN, C, FH, FW = self.W.shape
        N, C, H, W = x.shape
        out_h = int(1 + (H + 2*self.pad - FH) / self.stride)
        out_w = int(1 + (W + 2*self.pad - FW) / self.stride)

        # expand the input into a 2D matrix: one row per filter application
        col = im2col(x, FH, FW, self.stride, self.pad)
        # expand each filter into one column
        col_W = self.W.reshape(FN, -1).T
        out = np.dot(col, col_W) + self.b

        # reshape to (N, out_h, out_w, FN), then reorder to (N, FN, out_h, out_w)
        out = out.reshape(N, out_h, out_w, -1).transpose(0, 3, 1, 2)
        return out
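A minimal usage sketch of the Convolution class above (random weights; the shapes are chosen only for illustration):

W = np.random.rand(16, 3, 5, 5)        # 16 filters of shape (3, 5, 5)
b = np.zeros(16)                       # one bias per filter
conv = Convolution(W, b, stride=1, pad=0)

x = np.random.rand(10, 3, 28, 28)      # batch of 10 three-channel 28x28 inputs
out = conv.forward(x)
print(out.shape)                       # (10, 16, 24, 24)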
class Pooling:
    def __init__(self, pool_h, pool_w, stride=1, pad=0):
        self.pool_h = pool_h
        self.pool_w = pool_w
        self.stride = stride
        self.pad = pad

    def forward(self, x):
        N, C, H, W = x.shape
        out_h = int(1 + (H - self.pool_h) / self.stride)
        out_w = int(1 + (W - self.pool_w) / self.stride)

        # expand the input, then separate each pooling window by channel
        col = im2col(x, self.pool_h, self.pool_w, self.stride, self.pad)
        col = col.reshape(-1, self.pool_h*self.pool_w)

        # take the maximum of each pooling window
        out = np.max(col, axis=1)
        # reshape to (N, out_h, out_w, C), then reorder to (N, C, out_h, out_w)
        out = out.reshape(N, out_h, out_w, C).transpose(0, 3, 1, 2)

        return out
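And a minimal usage sketch of the Pooling class (2x2 max pooling with stride 2, applied to the convolution output shape from above):

pool = Pooling(pool_h=2, pool_w=2, stride=2)

x = np.random.rand(10, 16, 24, 24)
out = pool.forward(x)
print(out.shape)                       # (10, 16, 12, 12)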

Representative CNNs

LeNet

  • Convolution layers and pooling layers are stacked in sequence, and the result is finally output through fully connected layers
  • Uses the sigmoid function as the activation function
  • The original LeNet uses subsampling to shrink the intermediate data, rather than max pooling

AlexNet

  • Uses ReLU as the activation function
  • Uses LRN (Local Response Normalization) layers for local normalization
  • Uses Dropout
