Convolutional neural network
Overall structure
- Fully connected layer: every neuron is connected to all neurons in the adjacent layers
- Structure of a CNN
- Convolution layer
- ReLU layer
- Pooling layer
- Layers close to the output use the "Affine-ReLU" combination
- The final output layer uses the "Affine-Softmax" combination
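The layer ordering listed above can be written out as a small sketch; the number of blocks and the exact depth here are assumptions chosen only for illustration.

# A hypothetical ordering of layers in a small CNN (illustrative only)
cnn_layers = [
    "Conv", "ReLU", "Pooling",    # repeated feature-extraction blocks
    "Conv", "ReLU", "Pooling",
    "Affine", "ReLU",             # "Affine-ReLU" near the output
    "Affine", "Softmax",          # final "Affine-Softmax" output layer
]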
Convolution layer
Problems with fully connected layers
- The shape of the data is ignored: when an image is fed into a fully connected layer, the multi-dimensional data is flattened into one-dimensional data
- A convolution layer can keep the shape of the data unchanged
- The input and output data of a convolution layer are called feature maps
Convolution operation
- Corresponds to a filter operation in image processing
- At each location, the elements of the filter are multiplied by the corresponding elements of the input and then summed (a multiply-accumulate operation)
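A minimal sketch of this multiply-accumulate operation for a single output position; the array values are only an example.

import numpy as np

x = np.array([[1, 2, 3, 0],
              [0, 1, 2, 3],
              [3, 0, 1, 2],
              [2, 3, 0, 1]])          # 4x4 input
w = np.array([[2, 0, 1],
              [0, 1, 2],
              [1, 0, 2]])             # 3x3 filter

# Multiply the filter with the top-left 3x3 region of the input and sum
out_00 = np.sum(x[0:3, 0:3] * w)
print(out_00)  # 15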
Padding
- Before the convolution operation, fixed values (such as 0) may be filled in around the input data
- Padding is mainly used to adjust the output size, e.g., to keep the spatial size of the output the same as the input
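Zero padding of width 1 can be applied with np.pad, for example (a sketch; a padding value other than 0 could also be used):

import numpy as np

x = np.arange(9).reshape(3, 3)
# Pad one row/column of zeros on every side: a 3x3 input becomes 5x5
x_pad = np.pad(x, pad_width=1, mode="constant", constant_values=0)
print(x_pad.shape)  # (5, 5)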
Stride
- The interval at which the filter is applied is called the stride
Calculating the size of the output
- Assuming the input size is \((H, W)\), the filter size is \((FH, FW)\), the output size is \((OH, OW)\), the padding is \(P\), and the stride is \(S\), the output size is
\[OH = \frac{H + 2P - FH}{S} + 1 \\ OW = \frac{W + 2P - FW}{S} + 1\]
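A small helper that evaluates this formula (a hypothetical function, written here only to illustrate the calculation):

def conv_output_size(input_size, filter_size, stride=1, pad=0):
    # OH = (H + 2P - FH) / S + 1 (the same formula applies to the width)
    return (input_size + 2 * pad - filter_size) // stride + 1

# Example: a 7x7 input with a 5x5 filter, stride 1, no padding -> 3x3 output
print(conv_output_size(7, 5, stride=1, pad=0))  # 3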
Convolution on three-dimensional data
- When there are multiple feature maps along the channel direction, the convolution between the input data and the filter is computed channel by channel, and the results are summed into a single output
- The filter weights are written in the order (output_channel, input_channel, height, width)
Batch
- The data passed between layers is stored as 4-dimensional data, in the order (batch_num, channel, height, width)
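For example, a batch of 10 images with 3 channels of 28x28 pixels, and a set of 30 filters that take 3 input channels, would be stored like this (the shapes are illustrative):

import numpy as np

x = np.random.rand(10, 3, 28, 28)   # (batch_num, channel, height, width)
W = np.random.rand(30, 3, 5, 5)     # (output_channel, input_channel, height, width)
print(x.shape, W.shape)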
Pooling layer
- Pooling reduces the spatial size of the data in the height and width directions
- The pooling window size and the stride are usually set to the same value
- Features:
- There are no parameters to learn
- The number of channels does not change
- Robust to small positional changes (when the input data shifts slightly, pooling often returns the same result)
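A minimal sketch of 2x2 max pooling with stride 2 on a single 4x4 channel (the values are made up for illustration):

import numpy as np

x = np.array([[1, 2, 1, 0],
              [0, 1, 2, 3],
              [3, 0, 1, 2],
              [2, 4, 0, 1]])

# Take the maximum of each non-overlapping 2x2 window
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[2 3]
               #  [4 2]]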
Implementation of the convolution layer and the pooling layer
im2col function
- Expands the input data to suit the filter
- After im2col is applied, the 3-dimensional input data becomes a two-dimensional matrix
- Input data: im2col unrolls each region to which the filter is applied into a row of the two-dimensional matrix
- Filter: each filter is expanded lengthwise into a column
- Since the product of the expanded input data and the expanded filter is two-dimensional data, it must be reshaped back to the appropriate output size
import sys, os
path = os.path.join(os.getcwd(), "sourcecode")
sys.path.append(path)
from common.util import im2col
import numpy as np

x1 = np.random.rand(1, 3, 7, 7)            # batch size 1, 3 channels, 7x7 data
col1 = im2col(x1, 5, 5, stride=1, pad=0)   # filter: 3 channels, size 5x5
print(col1.shape)                          # 9 = 3x3 output positions, 75 = 3*5*5 elements per region
x2 = np.random.rand(10, 3, 7, 7)           # batch size 10
col2 = im2col(x2, 5, 5, stride=1, pad=0)
print(col2.shape)
(9, 75)
(90, 75)
class Convolution:
    def __init__(self, W, b, stride=1, pad=0):
        self.W = W                  # filter weights, shape (FN, C, FH, FW)
        self.b = b                  # biases, shape (FN,)
        self.stride = stride
        self.pad = pad

    def forward(self, x):
        FN, C, FH, FW = self.W.shape
        N, C, H, W = x.shape
        out_h = int(1 + (H + 2*self.pad - FH) / self.stride)
        out_w = int(1 + (W + 2*self.pad - FW) / self.stride)

        # Expand the input into a 2D matrix and the filters into columns
        col = im2col(x, FH, FW, self.stride, self.pad)
        col_W = self.W.reshape(FN, -1).T
        out = np.dot(col, col_W) + self.b

        # Reshape back to (N, FN, out_h, out_w)
        out = out.reshape(N, out_h, out_w, -1).transpose(0, 3, 1, 2)
        return out
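A possible usage sketch of the Convolution layer above (the shapes are chosen only for illustration):

# 30 filters of size 5x5 applied to 3-channel 28x28 images
W = np.random.rand(30, 3, 5, 5)
b = np.random.rand(30)
conv = Convolution(W, b, stride=1, pad=0)

x = np.random.rand(10, 3, 28, 28)
out = conv.forward(x)
print(out.shape)  # (10, 30, 24, 24)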
class Pooling:
    def __init__(self, pool_h, pool_w, stride=1, pad=0):
        self.pool_h = pool_h
        self.pool_w = pool_w
        self.stride = stride
        self.pad = pad

    def forward(self, x):
        N, C, H, W = x.shape
        out_h = int(1 + (H - self.pool_h) / self.stride)
        out_w = int(1 + (W - self.pool_w) / self.stride)

        # Expand the input, then reshape so each row is one pooling window per channel
        col = im2col(x, self.pool_h, self.pool_w, self.stride, self.pad)
        col = col.reshape(-1, self.pool_h * self.pool_w)

        # Take the maximum of each window, then reshape back to (N, C, out_h, out_w)
        out = np.max(col, axis=1)
        out = out.reshape(N, out_h, out_w, C).transpose(0, 3, 1, 2)
        return out
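A possible usage sketch of the Pooling layer above (shapes chosen only for illustration):

# 2x2 max pooling with stride 2 halves the height and width
pool = Pooling(pool_h=2, pool_w=2, stride=2)

x = np.random.rand(10, 3, 28, 28)
out = pool.forward(x)
print(out.shape)  # (10, 3, 14, 14)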
Representative CNNs
LeNet
- Convolution layers and pooling layers are stacked repeatedly, and the result is finally output through fully connected layers
- Uses the sigmoid function as the activation function
- The original LeNet reduces the size of intermediate data with subsampling rather than max pooling
AlexNet
- Uses ReLU as the activation function
- Uses LRN (Local Response Normalization) layers for local normalization
- Uses Dropout