Lead
In machine learning, when our data is a picture, that is, a matrix, how do we process it? A picture, whatever it looks like, is by its very nature one or more matrices. If we flatten the matrix row by row (or column by column) into one long vector, then a DNN can be used to process it, as shown below:
As shown on the left of the figure above, this is an image of a certain height and width in pixels, and a color photograph has three channels (R, G, B), so the picture contains height × width × 3 values. If the first hidden layer has 100 units, the number of weights in that layer alone grows to height × width × 3 × 100. At this order of magnitude, the resource consumption is enormous.
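To make the cost concrete, here is a quick back-of-the-envelope calculation. The article does not give the exact image size, so the 1000 × 1000 photo below is an assumed example:

```python
# Rough cost of feeding a flattened image into a fully connected layer.
# The 1000x1000 image size is a hypothetical example, not from the article.
height, width, channels = 1000, 1000, 3  # assumed RGB photo
hidden_units = 100                       # first hidden layer, as in the text

input_values = height * width * channels     # values after flattening
weights = input_values * hidden_units        # weights in the first layer alone

print(input_values)  # 3000000
print(weights)       # 300000000
```

Three hundred million weights in a single layer is exactly the blow-up the next paragraph sets out to avoid.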
To reduce this resource consumption, we need to reduce the amount of data. How? At the image level, we can apply geometric transformations to the image and extract only the meaningful features. For example, we can blur the image, sharpen it, or extract its vertical edges.
Take image blurring as an example. How is the blurring in the figure done? First of all, we need a few concepts:
- Convolution kernel (kernel): a custom matrix used to compute over the image matrix;
- Kernel size (ksize): the side length of the convolution kernel, which can be 1, 2, 3, 4, ...;
- Stride: the step by which the convolution kernel moves across the original image, which can be 1, 2, 3, 4, ...;
- Padding: zeros filled in around the input matrix so that the convolution kernel's output keeps the desired size.
NOTE: With ksize, stride, and padding fixed, the shape of the output matrix is given by the following formula:

output_size = (input_size − ksize + 2 × padding) / stride + 1
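The formula is easy to check with a small helper (the function name below is my own, for illustration):

```python
def conv_output_size(input_size, ksize, stride=1, padding=0):
    """Output side length of a convolution, per the formula above."""
    return (input_size - ksize + 2 * padding) // stride + 1

# A 5x5 image with a 3x3 kernel, stride 1, no padding -> 3x3 output:
print(conv_output_size(5, ksize=3))             # 3
# The same image with padding 1 keeps its size:
print(conv_output_size(5, ksize=3, padding=1))  # 5
```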
In the figure above, we first define a convolution kernel, a small matrix. We slide it along the image matrix; at each position, the overlapping entries are multiplied element-wise and summed, producing one entry of a new output matrix. This is exactly the process of blurring a picture.
Convolution layer
From the example above, we can see that one convolution kernel yields one output. If we use several convolution kernels, we get several outputs. In a DNN, we generally add hidden layers, each with several output nodes. If we imagine the result of each convolution kernel as one node of a hidden layer, then this hidden layer can be called a convolutional hidden layer. The following example makes this concrete:
The figure above shows how a convolution layer is formed. An input RGB matrix with 3 channels passes through 6 convolution kernels and yields 6 output feature maps. The figure is interpreted as follows:
- The input image has 3 RGB channels, i.e. a depth of 3. For a convolution kernel to slide over the picture, we therefore also give each kernel a depth of 3.
- By the output-shape formula above, one picture convolved with one kernel gives one output feature map.
- There are six convolution kernels in total, so the final output has a depth of 6.
If we treat this as the output of one hidden layer, it looks as shown below:
The activation function used in the figure is the ReLU activation function.
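A minimal sketch of such a convolution layer, written in plain numpy: six depth-3 kernels slide over a 3-channel input, and ReLU is applied to the result. The 8 × 8 input and 3 × 3 kernel sizes are assumed toy values, not taken from the figure:

```python
import numpy as np

def conv_layer(image, kernels):
    """image: (H, W, C); kernels: (n, k, k, C). Each kernel has the same
    depth as the input and produces one 2-D feature map."""
    n, k, _, c = kernels.shape
    assert c == image.shape[2], "kernel depth must match input depth"
    oh = image.shape[0] - k + 1
    ow = image.shape[1] - k + 1
    out = np.zeros((oh, ow, n))
    for f in range(n):                      # one feature map per kernel
        for i in range(oh):
            for j in range(ow):
                out[i, j, f] = np.sum(image[i:i+k, j:j+k, :] * kernels[f])
    return np.maximum(out, 0.0)             # ReLU activation

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8, 3))       # toy 3-channel "RGB" input
kernels = rng.standard_normal((6, 3, 3, 3))  # six 3x3 kernels of depth 3
maps = conv_layer(image, kernels)
print(maps.shape)  # (6, 6, 6)
```

The output depth equals the number of kernels (6), just as the bullet list describes.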
Pooling layer
Above, we used a convolution layer as a hidden layer, but the amount of computation is quite large. There is a faster way to build a hidden layer than convolution: the pooling layer. By down-sampling, it rapidly reduces the size of the input image without losing important information. Pooling layers generally come in two forms:
- max pooling (max pooling layer)
- average pooling (average pooling layer)
Next, we take max pooling as an example:
As shown above, we have an image matrix, and we build a pooling window: a kernel matrix with no weights of its own. We slide this window along the image matrix and, at each position, output the maximum value inside the window. Similarly, average pooling outputs the mean of each window.
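The procedure above can be sketched in a few lines of numpy; the 4 × 4 matrix is a made-up example with a 2 × 2 window and stride 2:

```python
import numpy as np

def max_pool(image, size=2, stride=2):
    """Slide a size x size window (no weights) and keep only the max."""
    oh = (image.shape[0] - size) // stride + 1
    ow = (image.shape[1] - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = image[i*stride:i*stride+size,
                              j*stride:j*stride+size].max()
    return out

x = np.array([[1., 3., 2., 4.],
              [5., 6., 7., 8.],
              [3., 2., 1., 0.],
              [1., 2., 3., 4.]])
print(max_pool(x))  # [[6. 8.]
                    #  [3. 4.]]
```

Swapping `.max()` for `.mean()` turns this into average pooling; either way, the 4 × 4 input shrinks to 2 × 2, which is the down-sampling the text describes.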
Convolution neural network
CNN is the abbreviation of Convolutional Neural Network. With the concepts of convolution layers and pooling layers explained above, a CNN is simply a neural network structured as convolution layer + pooling layer + convolution layer + pooling layer + ..., repeated in a cycle.
The most classic CNN structure is the convolutional neural network proposed by Yann LeCun for handwritten digit recognition, shown below:
The network in the figure is called LeNet-5. It consists of two convolution layers, two pooling layers, and three fully connected layers. What is the difference between fully connected layers and convolution or pooling layers? Convolution and pooling are local operations over a sliding window, whereas in a fully connected layer all of the data takes part in the computation.
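Using the output-size formula from earlier, we can trace how shapes shrink through LeNet-5. The layer sizes below (32 × 32 input, 5 × 5 kernels, 2 × 2 pooling) follow LeCun's original design, though the figure itself is not reproduced here:

```python
def conv_out(n, k, s=1, p=0):
    """Output side length: (n - k + 2p) / s + 1."""
    return (n - k + 2 * p) // s + 1

size, depth = 32, 1                     # input: 32x32 grayscale digit
size = conv_out(size, 5); depth = 6     # C1: six 5x5 kernels -> 28x28x6
size = conv_out(size, 2, s=2)           # S2: 2x2 pooling     -> 14x14x6
size = conv_out(size, 5); depth = 16    # C3: sixteen kernels -> 10x10x16
size = conv_out(size, 2, s=2)           # S4: 2x2 pooling     -> 5x5x16
flat = size * size * depth              # flatten before the FC layers
print(flat)  # 400
```

Those 400 values then feed the three fully connected layers (120, 84, and 10 output units), a tiny vector compared with the flattened raw image we started the article with.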
This concludes the introduction to the most basic concepts of convolutional neural networks.
Source: Deep Learning and Practice