Convolutional Neural Network Theory

1. Concept Introduction

A convolutional neural network (Convolutional Neural Network, CNN) is a class of feed-forward neural networks (Feedforward Neural Networks) that contain convolution operations and have a deep structure; it is one of the representative algorithms of deep learning.

Convolutional neural networks have representation-learning ability: their hierarchical structure lets them classify input information in a shift-invariant way, and they can be trained with either supervised or unsupervised learning. Thanks to the sparse connectivity and parameter sharing of the convolution kernels in the hidden layers, a CNN can learn features with a grid-like topology, such as pixels and audio, with a small amount of computation, yields stable results, and imposes no additional feature-engineering requirements on the data. CNNs are therefore widely used in computer vision and natural language processing.

Convolutional neural networks can be used for classification, retrieval, recognition, segmentation, feature extraction, keypoint localization (e.g. gesture recognition), and similar tasks.

2. Comparison with Fully Connected Neural Networks
(figure)
On the left is a fully connected neural network; on the right is a convolutional neural network. A convolutional neural network consists of: an input layer, convolutional layers, activation functions, pooling layers, and fully connected layers (INPUT - CONV - RELU - POOL - FC).
(figure)

3. The Convolutional Layer

Suppose the input image has size (32, 32, 3) and the convolution kernel has size (5, 5, 3). The depth of the convolution kernel must match the depth of the input image (here, 3).

A convolution (feature extraction) takes each region of a fixed size (the same size as the kernel) and extracts a single value that represents that region; traversing all regions of the image yields a complete feature map. The number of convolution kernels (all of the same size) can be specified, and each kernel produces its own feature map.

(figure)

The target of a convolution can be not only the input image but also a feature map:
(figure)

Actual convolution result (the classification or regression task is then completed based on the extracted features):
(figure)
Convolution calculation example:
The input image has size (7, 7, 3) and the kernel size is specified as (3, 3, 3). The kernel window moves (left to right, top to bottom) to traverse all regions of the image; at each position the inner product is computed (corresponding elements multiplied and then summed), the results over the depth dimension are summed, and the bias term b is added to give the final value. The parameters in the kernel are adjusted through the forward and backward propagation of the convolutional neural network.
(figure)
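The window traversal described above can be sketched in NumPy. This is a minimal illustration, not the blog's own code; the function name `conv2d` and the random demo data are assumptions:

```python
import numpy as np

def conv2d(image, kernel, bias=0.0, stride=1):
    """Slide the kernel window over the image (left to right, top to
    bottom), take the inner product at each position (corresponding
    elements multiplied, summed over all depths), and add the bias."""
    H, W, D = image.shape
    k, _, kernel_depth = kernel.shape
    assert kernel_depth == D, "kernel depth must match image depth"
    out_h = (H - k) // stride + 1
    out_w = (W - k) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            region = image[i * stride:i * stride + k,
                           j * stride:j * stride + k, :]
            out[i, j] = np.sum(region * kernel) + bias
    return out

# A (7, 7, 3) input and a (3, 3, 3) kernel with stride 1 give a (5, 5) map.
rng = np.random.default_rng(0)
feature_map = conv2d(rng.standard_normal((7, 7, 3)),
                     rng.standard_normal((3, 3, 3)), bias=1.0)
print(feature_map.shape)  # (5, 5)
```

Each additional kernel would produce one more such feature map.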
The sliding stride:
(figure)
The smaller the stride, the finer the extracted feature map, and generally the better the feature values obtained, so the stride is usually kept small; but considering the computational cost, it should not be too small.

Points at different positions may contribute different amounts of information to the convolution result. To make better use of the information in the edge pixels, one or more padding layers are added around the border, filled with zeros, as shown below:

(figure)
Calculating the output size:
Input size: W1 × H1 × D1. Hyperparameters to specify: number of filters K, filter size F, stride S, zero padding P.
Output size formula:
W2 = (W1 - F + 2P) / S + 1
H2 = (H1 - F + 2P) / S + 1
D2 = K
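As a quick check of the formula, a small helper can be written (the function name is mine, not from the post):

```python
def conv_output_size(W1, H1, D1, K, F, S, P):
    """Output size of a convolutional layer from the formula above."""
    W2 = (W1 - F + 2 * P) // S + 1
    H2 = (H1 - F + 2 * P) // S + 1
    D2 = K  # one output channel per filter
    return W2, H2, D2

# A 32x32x3 input with 10 filters of size 5, stride 1, padding 2
# keeps the spatial size: 32x32x10.
print(conv_output_size(32, 32, 3, K=10, F=5, S=1, P=2))  # (32, 32, 10)
```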
The convolution parameter (weight) sharing principle

Each feature map shares a single set of weights w across all positions and has its own bias term b, so the number of parameters involved in producing the feature maps is calculated as follows:
number of parameters = (F × F × D1 + 1) × K (each of the K kernels shares F × F × D1 weights across its feature map, plus one bias term)
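Under this sharing scheme the count can be computed directly (the helper name is illustrative):

```python
def conv_param_count(F, D1, K):
    """Each of the K kernels shares F*F*D1 weights plus one bias term."""
    return (F * F * D1 + 1) * K

# 10 kernels of size 5x5 over a depth-3 input: (5*5*3 + 1) * 10 = 760
print(conv_param_count(F=5, D1=3, K=10))  # 760
```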
4. The Pooling Layer

A pooling layer compresses feature maps (downsampling): each region is represented by its average or maximum value, and no weight parameters are involved.
(figure)
Example pooling result:
(figure)
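Max pooling over 2×2 windows with stride 2 can be sketched as follows (illustrative code, not from the original post):

```python
import numpy as np

def max_pool(fmap, size=2, stride=2):
    """Represent each size x size window by its maximum value
    (no weight parameters are involved)."""
    H, W = fmap.shape
    out_h = (H - size) // stride + 1
    out_w = (W - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = fmap[i * stride:i * stride + size,
                             j * stride:j * stride + size].max()
    return out

x = np.array([[1., 1., 2., 4.],
              [5., 6., 7., 8.],
              [3., 2., 1., 0.],
              [1., 2., 3., 4.]])
print(max_pool(x))
# [[6. 8.]
#  [3. 4.]]
```

Replacing `.max()` with `.mean()` gives average (mean) pooling.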
5. Forward and Backward Propagation

5.1 Forward propagation in the convolutional layer:
(figure)
Here x is the input image data, with four dimensions: the batch size N (in practice the input is processed batch by batch; here one batch of pictures is taken as the example), the number of image channels C (e.g. RGB), the image height H, and the image width W. In the figure, 3 convolution kernels are selected; a kernel also has four dimensions, representing the number of kernels, the number of channels (matching the input), and the kernel height and width.

5.2 Backpropagation in the convolutional layer:
(figure)
The goal of backpropagation through the convolutional layer is to update the weight parameter w. The figure shows the backpropagation process for the first convolution, whose purpose is to update w, so dJ/dw must be computed first. Assume the gradient passed down from the layer above is dJ/dout; by the chain rule, dJ/dw = dJ/dout * dout/dw = dJ/dout * x. For convenient variable naming in code, dJ/dout is written as dout and dJ/dw as dw, as in the figure; this notation is also used below.

First, be clear that dw has the same size as w. Multiplying one point by a region yields a region. Backpropagation is then equivalent to: multiply one element of dout by the corresponding input window to obtain one dw matrix; slide the window to compute the next one, and so on; finally sum all the dw matrices and apply w = w - dw (scaled by a learning rate in practice) to complete the backward pass.
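The accumulation of dw described above can be sketched for a single-channel input (a sketch under this section's notation; the function name is mine):

```python
import numpy as np

def conv_backward_dw(x, dout, k, stride=1):
    """dJ/dw for a (k, k) kernel: each element of dout scales the input
    window it was computed from, and the products are summed into dw."""
    dw = np.zeros((k, k))
    out_h, out_w = dout.shape
    for i in range(out_h):
        for j in range(out_w):
            region = x[i * stride:i * stride + k,
                       j * stride:j * stride + k]
            dw += dout[i, j] * region  # one point times one region
    return dw

x = np.arange(16, dtype=float).reshape(4, 4)
dout = np.ones((2, 2))  # gradient passed down from the layer above
print(conv_backward_dw(x, dout, k=3))
# [[10. 14. 18.]
#  [26. 30. 34.]
#  [42. 46. 50.]]
```

The update w = w - dw (with a learning-rate factor in practice) then completes the step.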

5.3 Forward and backward propagation in the pooling layer
(figure)
The figure shows the forward and backward propagation results for the two pooling modes, max and mean. For mean pooling, backpropagation distributes the gradient evenly over the corresponding region. For max pooling, backpropagation passes the gradient only to the position that held the maximum value in the forward pass (the same position before and after); all other positions in the region receive 0.
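Both backward rules can be sketched as follows (illustrative code; the names are mine, and note that ties in the max mask send the gradient to every tied position):

```python
import numpy as np

def max_pool_backward(x, dout, size=2, stride=2):
    """Route each upstream gradient to the position that held the window
    maximum in the forward pass; other positions receive 0."""
    dx = np.zeros_like(x)
    out_h, out_w = dout.shape
    for i in range(out_h):
        for j in range(out_w):
            win = x[i * stride:i * stride + size,
                    j * stride:j * stride + size]
            mask = (win == win.max())
            dx[i * stride:i * stride + size,
               j * stride:j * stride + size] += mask * dout[i, j]
    return dx

def mean_pool_backward(dout, size=2):
    """Distribute each upstream gradient evenly over its
    (non-overlapping) size x size window."""
    return np.kron(dout, np.ones((size, size))) / (size * size)

x = np.array([[1., 1., 2., 4.],
              [5., 6., 7., 8.],
              [3., 2., 1., 0.],
              [1., 2., 3., 4.]])
dout = np.array([[1., 2.],
                 [3., 4.]])
print(max_pool_backward(x, dout))
# [[0. 0. 0. 0.]
#  [0. 1. 0. 2.]
#  [3. 0. 0. 0.]
#  [0. 0. 0. 4.]]
```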


Origin: blog.csdn.net/qq_43660987/article/details/91799355