2. Getting started with CNN
2.1 CNN network structure
- Basic CNN structure: input layer, convolution layer, activation layer, pooling layer, fully connected layer
2.1.1 Input layer
2.1.2 Convolution layer
- Intuition for convolution: convolution computation = feature extraction
- Single convolution kernel on a grayscale image: extracts a single feature (see the numpy sketch below)
  - Terminology: feature map, activation map, convolved feature, receptive field
- Single convolution kernel on an RGB image: still extracts a single feature
  - Depth of the convolution kernel = depth (number of channels) of the input data
- Multiple convolution kernels on an RGB image: extract multiple different features
  - One convolution kernel extracts one kind of local pattern; multiple kernels extract multiple different local patterns
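To make the intuition concrete, here is a minimal numpy sketch of one convolution kernel sliding over a grayscale image; the image values and the vertical-edge kernel are made-up examples (and, as in deep learning frameworks, it actually computes cross-correlation):

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' convolution (no padding, stride 1) of a 2-D image with a 2-D kernel."""
    H, W = image.shape
    F = kernel.shape[0]
    out = np.zeros((H - F + 1, W - F + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Elementwise product over the receptive field, then sum
            out[i, j] = np.sum(image[i:i + F, j:j + F] * kernel)
    return out

image = np.random.rand(8, 8)             # toy grayscale image
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]])          # responds to vertical edges
feature_map = conv2d(image, kernel)
print(feature_map.shape)                 # (6, 6): one kernel -> one feature map
```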
- Stacking convolution layers (see the Paddle sketch below)
  - Number of convolution kernels in a layer = depth of the data fed to the next layer = depth of the next layer's convolution kernels
  - Number of convolution kernels = number of extracted features; it is a tunable hyperparameter
- Stacked convolution layers combine features
  - With multi-layer convolution, a single layer's features are only local; the deeper the stack, the more global the learned features
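These depth relations can be checked directly on the layer weights; the sketch below assumes the PaddlePaddle 2.x API (paddle.nn.Conv2D) and uses made-up layer sizes:

```python
import paddle
import paddle.nn as nn

conv1 = nn.Conv2D(in_channels=3, out_channels=16, kernel_size=3)   # 16 kernels, each of depth 3
conv2 = nn.Conv2D(in_channels=16, out_channels=32, kernel_size=3)  # kernel depth = conv1's kernel count

print(conv1.weight.shape)   # [16, 3, 3, 3]:  K x D1 x F x F
print(conv2.weight.shape)   # [32, 16, 3, 3]: conv1's 16 output channels become conv2's kernel depth

x = paddle.randn([1, 3, 32, 32])   # NCHW: one 32 x 32 RGB image
y = conv2(conv1(x))
print(y.shape)                     # [1, 32, 28, 28]: output depth = number of conv2 kernels
```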
- Parameter: stride
  - The sliding step of the kernel; there are separate strides for width and height
  - stride > 1 is equivalent to downsampling the stride = 1 convolution result (see the numpy check below)
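A minimal numpy check of this equivalence, with random toy data:

```python
import numpy as np

def conv2d_stride(image, kernel, stride=1):
    """'Valid' convolution of a 2-D image with a square kernel and a given stride."""
    H, W = image.shape
    F = kernel.shape[0]
    out = np.zeros(((H - F) // stride + 1, (W - F) // stride + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            r, c = i * stride, j * stride
            out[i, j] = np.sum(image[r:r + F, c:c + F] * kernel)
    return out

image = np.random.rand(9, 9)
kernel = np.random.rand(3, 3)
full = conv2d_stride(image, kernel, stride=1)      # 7 x 7 result
strided = conv2d_stride(image, kernel, stride=2)   # 4 x 4 result
print(np.allclose(strided, full[::2, ::2]))        # True: stride 2 = downsampled stride 1
```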
- Parameter: padding
  - padding = valid: no zero-padding is performed; with s = 1, each convolution shrinks the width and height by F - 1, where F is the kernel size
  - padding = same: pad the border of the input with zeros (or copies of the edge values); width and height are unchanged by the convolution (see the numpy sketch below)
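A short numpy sketch of the 'same' case: padding each border with (F - 1) / 2 zeros keeps the width and height unchanged for s = 1 (F = 3 is an assumed kernel size):

```python
import numpy as np

image = np.random.rand(6, 6)
F = 3                                          # kernel size
pad = (F - 1) // 2                             # one row/column of zeros per border
padded = np.pad(image, pad, mode="constant")   # zero-padding
print(image.shape, padded.shape)               # (6, 6) (8, 8); a 'valid' conv then gives 8 - F + 1 = 6
```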
- Summary (see the helper function below)
  - Input: W1 * H1 * D1
  - Hyperparameters: ① number of filters: K ② filter size: F ③ stride: S ④ padding: P
  - Output: W2 * H2 * D2, where W2 = (W1 + 2P - F) / S + 1, H2 = (H1 + 2P - F) / S + 1, D2 = K
  - Parameters: (F * F * D1 + 1) * K
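The summary formulas fit in a small helper; the sizes in the example are made-up values:

```python
def conv_output(W1, H1, D1, K, F, S, P):
    """Output shape and parameter count of a convolution layer (formulas above)."""
    W2 = (W1 + 2 * P - F) // S + 1
    H2 = (H1 + 2 * P - F) // S + 1
    D2 = K
    params = (F * F * D1 + 1) * K      # the +1 is the bias of each kernel
    return (W2, H2, D2), params

shape, params = conv_output(W1=32, H1=32, D1=3, K=16, F=3, S=1, P=1)
print(shape)    # (32, 32, 16): P = 1 with F = 3 behaves like 'same' padding
print(params)   # 448 = (3 * 3 * 3 + 1) * 16
```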
2.1.3 Activation layer
- Activation functions: sigmoid(x), tanh(x), relu(x) (see the numpy one-liners below)
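All three are one-liners in numpy:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # squashes to (0, 1)

def tanh(x):
    return np.tanh(x)                 # squashes to (-1, 1)

def relu(x):
    return np.maximum(0.0, x)         # zeroes out negative activations

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x), tanh(x), relu(x))
```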
2.1.4 Pooling layer
- Downsamples along the width and height dimensions without changing the depth dimension
- Reduces the amount of computation severalfold
- Compared with using stride alone, a pooling layer lets you choose how to downsample (see the numpy sketch below)
  - Max pooling (max-pooling): take the maximum value over each neighborhood as the output feature
  - Mean pooling (mean-pooling): take the average value over each neighborhood as the output feature
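A numpy sketch of both pooling modes with a 2 x 2 window and stride 2 on toy data:

```python
import numpy as np

def pool2d(x, F=2, S=2, mode="max"):
    """Pool a 2-D array with an F x F window and stride S."""
    H, W = x.shape
    out = np.zeros(((H - F) // S + 1, (W - F) // S + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            window = x[i * S:i * S + F, j * S:j * S + F]
            out[i, j] = window.max() if mode == "max" else window.mean()
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
print(pool2d(x, mode="max"))    # [[ 5.  7.] [13. 15.]]
print(pool2d(x, mode="mean"))   # [[ 2.5  4.5] [10.5 12.5]]
```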
- Summary
  - Input: W1 * H1 * D1
  - Hyperparameters: ① filter size: F ② stride: S
  - Output: W2 * H2 * D2, where W2 = (W1 - F) / S + 1, H2 = (H1 - F) / S + 1, D2 = D1
  - Parameters: neither max-pooling nor mean-pooling has learnable parameters (see the check below)
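The no-parameters claim is easy to verify, assuming the PaddlePaddle 2.x API:

```python
import paddle.nn as nn

pool = nn.MaxPool2D(kernel_size=2, stride=2)
conv = nn.Conv2D(in_channels=3, out_channels=16, kernel_size=3)
print(len(pool.parameters()))   # 0: pooling has nothing to learn
print(len(conv.parameters()))   # 2: a weight tensor and a bias vector
```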
2.1.5 Fully connected layer
- Flatten the multiple feature maps of the last layer into a single one-dimensional vector
- Connect this vector to the output layer through a fully connected layer (see the sketch below)
- Each unit of the output layer corresponds to a score for one category
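A sketch of this output stage, assuming the PaddlePaddle 2.x API; the feature-map shape and the 10 classes are made-up values:

```python
import paddle
import paddle.nn as nn

feature_maps = paddle.randn([1, 32, 7, 7])   # N x D x H x W from the convolution stack
flatten = nn.Flatten()                       # -> [1, 32 * 7 * 7] = [1, 1568]
fc = nn.Linear(in_features=32 * 7 * 7, out_features=10)
scores = fc(flatten(feature_maps))
print(scores.shape)                          # [1, 10]: one score per category
```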
2.1.6 Summary of the network structure
- The general structure of a convolutional neural network (see the Paddle sketch below):
  - Multiple repetitions of the CONV + ReLU + POOL combination: extract features
  - Multiple FC layers, or a special structure, as the CNN's output layer: act as a classifier / detector / segmenter
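Putting the pieces together: a minimal sketch of such a network, assuming the PaddlePaddle 2.x API, with made-up sizes (3 x 32 x 32 inputs, 10 classes):

```python
import paddle
import paddle.nn as nn

model = nn.Sequential(
    # feature extraction: two CONV + ReLU + POOL blocks
    nn.Conv2D(3, 16, kernel_size=3, padding=1),  nn.ReLU(), nn.MaxPool2D(2, 2),  # -> 16 x 16 x 16
    nn.Conv2D(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2D(2, 2),  # -> 32 x 8 x 8
    # classifier: flatten + fully connected output layer
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),   # one score per category
)

x = paddle.randn([4, 3, 32, 32])   # a batch of four RGB images
print(model(x).shape)              # [4, 10]
```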
2.2 CNN network training
2.3 How to implement a CNN with Paddle