CNN: Getting Acquainted with Convolutional Neural Networks (1)

Introduction

In machine learning, how do we process data that is a picture, or more generally a matrix? A picture as we see it is, by its very nature, one or more matrices. If we unroll each matrix row by row or column by column into one long vector, it can be processed with an ordinary DNN, as shown below:
Cat
As shown on the left of the figure above, this is a $1000\times1000$ picture. Because it is a color photograph, it has three channels (R, G, B), so the picture contains $3\times10^6$ values. If the first hidden layer has 100 units, the number of weights in that layer alone becomes $3\times10^8$. At this order of magnitude, the consumption of resources is enormous.
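The arithmetic behind this estimate can be checked in a few lines. This is only a sketch: the variable names are illustrative, and it counts one weight per pair of input value and hidden unit, ignoring biases.

```python
# Size of a flattened 1000 x 1000 RGB picture and of the first dense layer.
height, width, channels = 1000, 1000, 3
hidden_units = 100  # size of the first hidden layer, as in the text

input_values = height * width * channels  # length of the flattened vector
weights = input_values * hidden_units     # one weight per (input, hidden unit) pair

print(input_values)  # 3000000   (3 x 10^6)
print(weights)       # 300000000 (3 x 10^8)
```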
To reduce the consumption of resources, we need to reduce the amount of data. How? At the level of the picture itself, we can transform the image and extract meaningful features from it. For example, we can blur the image, sharpen it, or extract its vertical edges.
Take image blurring as an example:
cat_blur
How is the blurring in the picture above done? First we need to know a few concepts:
Convolution kernel (kernel): a custom matrix that is combined with the picture matrix in the calculation;
Kernel size (ksize): the size of the convolution kernel, which can be 1, 2, 3, 4, ...
Stride (stride): the step by which the convolution kernel moves across the original picture, which can be 1, 2, 3, 4, ...
Padding (padding): the number of rings of zeros filled in around the border of the matrix before the kernel is applied.
NOTE: Once ksize, stride, and padding are fixed, the shape of the output matrix can be obtained from the following formula, where $W$ is the input width, $K$ the kernel size, $P$ the padding, and $S$ the stride:
$W_{out} = (W - K + 2P)/S + 1$
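The output-size formula can be turned into a small helper. This is a minimal sketch: the function name `conv_output_size` is made up for illustration, and the integer division assumes that positions where the kernel no longer fits are simply dropped.

```python
def conv_output_size(w: int, k: int, p: int = 0, s: int = 1) -> int:
    """Output width for input width w, kernel size k, padding p, stride s."""
    # Integer division: leftover positions where the kernel does not fit are dropped.
    return (w - k + 2 * p) // s + 1

print(conv_output_size(32, 5))       # 28
print(conv_output_size(32, 5, p=2))  # 32
print(conv_output_size(4, 2, s=2))   # 2
```

With a stride of 1 and no padding, a kernel of size K always shrinks the input by K - 1 in each dimension.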
In the figure, we first define a convolution kernel, here a $3\times3$ matrix. We slide it along the image matrix. At each position, the kernel is multiplied element by element with the part of the image it covers and the products are summed, and the results together form a new output matrix. This is the process used to blur a picture.
image_metric
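The sliding multiply-and-sum can be sketched in plain Python. The averaging kernel (every weight 1/9) and the 5 × 5 test image are assumptions chosen for illustration; the source does not state which kernel values its blur uses.

```python
def convolve2d(image, kernel, stride=1):
    """Slide kernel over image (no padding); multiply element-wise and sum."""
    k = len(kernel)
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for di in range(k):
                for dj in range(k):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

# Hypothetical 3x3 averaging (box blur) kernel: every weight is 1/9.
blur_kernel = [[1 / 9] * 3 for _ in range(3)]

# Hypothetical 5x5 grayscale image with a single bright pixel in the middle.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 9

blurred = convolve2d(image, blur_kernel)
print(len(blurred), len(blurred[0]))  # 3 3
print([[round(v, 6) for v in row] for row in blurred])
```

Every 3 × 3 window of this image contains the bright pixel, so its value is spread evenly over the whole output, which is the smoothing effect that blurring relies on.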

Convolutional layer

From the example above, we can see that one convolution kernel yields one output. If we use more than one convolution kernel, we get multiple outputs. In a DNN we generally add several hidden layers, each with a number of output nodes. Suppose we treat the result of each convolution kernel as one node of a hidden layer. Then this hidden layer can be called a convolutional hidden layer. The following example makes this clearer:
Converlution
The figure above shows how a convolutional layer is formed. A $32\times32\times3$ RGB input passes through six $5\times5$ convolution kernels to produce a $28\times28\times6$ output. The figure can be read as follows:

  • The input is a 3-channel RGB image, so its depth is 3. For the convolution kernel to slide over the picture, its depth must also be 3, so each kernel has size $5\times5\times3$.
  • By the output-size formula above, with a stride of 1 and no padding, a $32\times32\times3$ picture convolved with one $5\times5\times3$ kernel gives a $28\times28$ output, since $(32 - 5)/1 + 1 = 28$.
  • There are six convolution kernels in total, so the final output is $6\times28\times28$.
    If we treat this as the output of one hidden layer, it looks like this:
    Convenlution
    In the figure, ReLU can be used as the activation function.
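The shape bookkeeping of this layer can be reproduced with a plain-Python sketch. The random image and kernel values are placeholders, the function name `conv_layer` is made up for illustration, and the sketch assumes a stride of 1, no padding, and no bias term.

```python
import random

def conv_layer(image, kernels):
    """image: H x W x C nested lists; kernels: list of K x K x C nested lists.
    Stride 1, no padding, followed by ReLU. Returns H_out x W_out x num_kernels."""
    h, w, c = len(image), len(image[0]), len(image[0][0])
    k = len(kernels[0])
    out_h, out_w = h - k + 1, w - k + 1
    output = [[[0.0] * len(kernels) for _ in range(out_w)] for _ in range(out_h)]
    for n, kernel in enumerate(kernels):
        for i in range(out_h):
            for j in range(out_w):
                total = 0.0
                # Sum over the kernel window AND over all input channels.
                for di in range(k):
                    for dj in range(k):
                        for ch in range(c):
                            total += image[i + di][j + dj][ch] * kernel[di][dj][ch]
                output[i][j][n] = max(0.0, total)  # ReLU activation
    return output

random.seed(0)
# Placeholder 32 x 32 x 3 input and six 5 x 5 x 3 kernels with random values.
image = [[[random.random() for _ in range(3)] for _ in range(32)] for _ in range(32)]
kernels = [
    [[[random.uniform(-1, 1) for _ in range(3)] for _ in range(5)] for _ in range(5)]
    for _ in range(6)
]

feature_maps = conv_layer(image, kernels)
print(len(feature_maps), len(feature_maps[0]), len(feature_maps[0][0]))  # 28 28 6
```

Each kernel collapses the three input channels into a single $28\times28$ map, which is why the output depth equals the number of kernels rather than the input depth.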

Pooling layer

Above we used a convolutional layer as a hidden layer, but the amount of calculation it needs is quite large. There is a faster way than convolution to build a hidden layer: the pooling layer. It uses down-sampling to quickly reduce the size of the input image while retaining the most important information. There are generally two kinds of pooling layer:

  • max pooling (maximum pooling layer)
  • average pooling (average pooling layer)
    Next, we take max pooling as an example:
    max_pooling
    As shown above, we have a $4\times4$ picture matrix. We build a $2\times2$ kernel window that holds no data of its own, then slide it along the image matrix. At each position, the maximum value inside the window becomes one entry of the output matrix. Similarly, average pooling takes the average value of each window as the output.
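Both pooling variants can be sketched with one small function. The $4\times4$ values below are made up, and the stride of 2 (non-overlapping windows) is an assumption, since the text does not state the stride.

```python
def pool2d(matrix, size=2, stride=2, mode="max"):
    """Slide a size x size window over matrix and keep its max or average."""
    out = []
    for i in range(0, len(matrix) - size + 1, stride):
        row = []
        for j in range(0, len(matrix[0]) - size + 1, stride):
            window = [matrix[i + di][j + dj] for di in range(size) for dj in range(size)]
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        out.append(row)
    return out

# Hypothetical 4 x 4 picture matrix.
picture = [
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]

print(pool2d(picture, mode="max"))      # [[6, 8], [3, 4]]
print(pool2d(picture, mode="average"))  # [[3.75, 5.25], [2.0, 2.0]]
```

Unlike a convolution kernel, the pooling window has no weights to learn, which is part of why this layer is cheap to compute.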

Convolutional neural network

CNN is the abbreviation of Convolutional Neural Network. With the concepts of the convolutional layer and the pooling layer explained above, a CNN is a neural network structure that keeps repeating the pattern convolutional layer + pooling layer + convolutional layer + pooling layer ...
The most classic CNN structure is the convolutional neural network for handwritten digit recognition proposed by Yann LeCun, shown below:
mnist_cnn
The neural network structure in the figure is called LeNet-5. It consists of 2 convolutional layers, 2 pooling layers, and 3 fully connected layers. So what is the difference between a fully connected layer and a convolutional or pooling layer? Convolution and pooling are both local operations, in which each output depends only on a small window of the input, whereas in a fully connected layer all of the data takes part in computing every output.
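To make the contrast concrete, here is a minimal sketch of a fully connected layer, in which every output node is a weighted sum over all inputs. The function name and the numbers are hypothetical, and no activation function is applied.

```python
def fully_connected(inputs, weights, biases):
    """Each output is a weighted sum over ALL inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

inputs = [1.0, 2.0, 3.0, 4.0]     # hypothetical flattened feature values
weights = [[1.0, 0.0, 0.0, 0.0],  # one row of weights per output node
           [0.5, 0.5, 0.5, 0.5]]
biases = [0.0, 1.0]

print(fully_connected(inputs, weights, biases))  # [1.0, 6.0]
```

Each output node here needs one weight per input value, whereas a convolution kernel reuses the same few weights at every position of the image.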
That concludes this first introduction to the most basic concepts of convolutional neural networks.
Source: Deep Learning and Practice

Origin blog.csdn.net/weixin_42374329/article/details/105211122