Convolutional neural network concepts

Convolutional neural network

  A convolutional neural network (CNN) is a representative deep learning algorithm. It has feature-learning ability and can classify input information in a translation-invariant way according to its hierarchical structure, so it is also known as a "shift-invariant artificial neural network". With advances in deep learning theory and improvements in computing hardware, convolutional neural networks have developed rapidly and have been applied to computer vision, natural language processing, and other fields.

  Convolution is a mathematical operator that produces a third function from two functions f and g; it characterizes the area of the overlapping part of f and a flipped, shifted copy of g. Its mathematical definition is:

  (f * g)(t) = ∫ f(τ) g(t − τ) dτ

  In practice, convolutional networks use discrete convolution, i.e. the discontinuous form (f * g)(n) = Σₘ f(m) g(n − m). The operation is a weighted sum of the input values at the positions covered by the convolution kernel; combined with the concept of the convolution kernel introduced below, it is easy to understand.

 The two most important things to know about convolutional neural networks are the convolution kernel and the network architecture:

  • Convolution kernel
    • Definition of the convolution kernel
    • Convolution operation
    • Depth
    • Stride
    • Zero padding
  • Convolutional neural network structure
    • Input layer (INPUT)
    • Convolution layer (CONV)
    • Activation function layer (RELU)
    • Pooling layer (POOL)
    • Fully connected layer (FC)

 

Convolution kernel

The convolution kernel is the core of the network: CNN training is the process of continually updating the kernel parameters until they are optimal.

  Definition of the convolution kernel: for a local region of the input image, compute a weighted average; the weights in this process are defined by a function, and that function is the convolution kernel.

An RGB color image has three color channels, for red, green, and blue respectively. The pixels within each channel can be represented as a two-dimensional array, as shown in the figure on the right, with pixel values between 0 and 255. A 900 * 600 color image can therefore be represented by the computer as a (900 * 600 * 3) array.
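As a small sketch (using NumPy, which the original post does not mention), such an image can be represented like this; note that the (width, height, channels) ordering here simply follows the 900 * 600 * 3 shape in the text, and libraries differ on this convention:

```python
import numpy as np

# A 900 x 600 RGB image: 3 color channels, each pixel value in 0-255.
img = np.zeros((900, 600, 3), dtype=np.uint8)

red_channel = img[:, :, 0]   # each channel is a 2-D array
print(img.shape)             # (900, 600, 3)
print(red_channel.shape)     # (900, 600)
```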

Convolution process

  The convolution process is based on a small matrix, the convolution kernel, which sweeps step by step over the pixel matrix. At each position, each kernel value is multiplied by the pixel at the corresponding position and the products are summed, so each step of the sweep produces one value; when the whole sweep is finished, the values form a new matrix. As shown below.
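The sweeping process can be sketched in a few lines of Python (a minimal NumPy version for illustration; real frameworks implement this far more efficiently):

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image; at each position, multiply
    element-wise with the covered pixels and sum to get one output value."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1   # output height
    ow = image.shape[1] - kw + 1   # output width
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 5x5 input swept by a 3x3 kernel yields a 3x3 output matrix.
result = conv2d(np.ones((5, 5)), np.ones((3, 3)))
print(result.shape)   # (3, 3)
```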

  How are the convolution kernels chosen? When building a convolutional network we set the size and number of kernels. Convolution layers generally use small matrices such as (3, 3). Each value in a kernel is a neuron parameter (weight) that we need to find through training: it starts from a random initial value, and as the network trains, these parameter values are continually updated through backpropagation until the best values are found. "Best" is assessed by the loss function.

  The convolution operation amounts to feature extraction: each kernel acts as a filter that extracts the features we need.

In the convolution below, the kernel sweeps from top left to bottom right, finally producing the feature map shown on the right.

 

  Illustration: a filter (red outline) moves over the input image (the convolution operation) to produce a feature map. On the same image, another filter (green outline) produces a different feature map. It is important to note that the convolution operation captures the local dependencies in the original image. Note also how these two different filters produce different feature maps from the same original image. Remember that the image and both filters are just numerical matrices.
  In fact, a convolutional neural network learns the values of these filters by itself during training (although we still need to specify parameters such as the number of filters, their size, and the network architecture before training). The more filters there are, the more image features are extracted, and the better the network becomes at recognizing new images.
 
The size of the feature map (the convolved feature) is controlled by three parameters, which we decide before the convolution step is performed:
  • Depth: depth corresponds to the number of filters we use for the convolution operation. In the network shown in Figure 7, we initially convolve the ship image with three different filters, producing three different feature maps. These feature maps can be seen as stacked two-dimensional matrices, so the "depth" of the feature map is 3.
  • Stride: the stride is the number of pixels the filter matrix moves over the input matrix at each step. When the stride is 1, we move the filter one pixel at a time; when the stride is 2, the filter jumps two pixels at a time. The larger the stride, the smaller the resulting feature map. There are separate strides for the horizontal and vertical directions.
  • Zero padding: in some cases it is convenient to pad the border of the input matrix with zeros, so that the filter can also be applied to the border elements of the input image. A nice property of zero padding is that it lets us control the size of the feature map. Adding zero padding is also called wide convolution; not using zero padding is called narrow convolution.
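These three parameters combine into the standard output-size formula O = (W − K + 2P) / S + 1, where W is the input size, K the kernel size, P the zero padding, and S the stride. A small sketch:

```python
def output_size(w, k, p=0, s=1):
    """Spatial size of a convolution output: (W - K + 2P) // S + 1."""
    return (w - k + 2 * p) // s + 1

print(output_size(5, 3))         # 3: a 3x3 kernel shrinks a 5x5 input
print(output_size(5, 3, p=1))    # 5: one ring of zeros preserves the size
print(output_size(7, 3, s=2))    # 3: a larger stride shrinks the map further
```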

Zero padding (padding)

  After convolution the dimensions shrink: the resulting matrix is smaller than the original, which is inconvenient for computation. Padding is therefore needed: before each convolution, the original matrix is wrapped in a layer of zeros — padding only in the horizontal direction, only in the vertical direction, or with 0s all the way around — so that the convolved image matches the input image in size.

For example: if a 5 * 5 original matrix is convolved with a 3 * 3 kernel, the resulting matrix is 3 * 3 — smaller.

Padding adds a ring of 0s before the convolution: for example, a 300 * 300 matrix with an outer ring of "0" becomes a 302 * 302 matrix; after convolution with a 3 * 3 kernel the result is again 300 * 300, the same size as the original.
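The padding step itself can be sketched with NumPy's `np.pad`, here applied to the 5 * 5 example from above:

```python
import numpy as np

m = np.arange(25).reshape(5, 5)
padded = np.pad(m, 1)        # one ring of zeros on every side -> 7x7
print(padded.shape)          # (7, 7)
# Convolving the 7x7 padded matrix with a 3x3 kernel gives a 5x5
# output, the same size as the original input.
```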

 

Convolutional neural network architecture

  A CNN is generally divided into five parts: an input layer, convolution layers, activation function layers, pooling layers, and fully connected layers.

  Note, however, that in most convolutional networks the middle layers are used alternately, i.e. in the pattern convolution layer – activation function layer – pooling layer – convolution layer – activation function layer – pooling layer … Of course, some newer convolutional networks even omit the pooling layer; the five-part structure above is only the general pattern.

 

The input layer

  The input to the whole network is typically the image's matrix of pixels. In the figure above the input has a three-dimensional structure, because an image generally has a notion of depth: an RGB color image has the form a * b * c, where the first two dimensions specify the length and width of the image and the third dimension is the depth, i.e. the three RGB colors; a monochrome image has depth 1.

 

Convolution layer

  The main job of this layer is to perform convolution. The concept of the convolution kernel was already introduced; the convolution layer is simply where the kernel's computation is carried out. At this layer we may encounter a few keywords:

  • Filter: the neuron that implements the convolution defined earlier.
  • Stride
  • Padding
  • Depth: here depth does not refer to the image, but to the number of neurons (filters) in one layer. Different filters focus on different features and produce the different feature maps we want, so we provide several filters. Each filter produces one feature map, and multiple filters produce multiple feature maps; stacking these feature maps together forms the output. The number of filters therefore equals the number of feature maps — this is the depth.
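A sketch of how multiple filters produce a stack of feature maps (minimal NumPy; the sizes here are illustrative, not from the original post):

```python
import numpy as np

def conv2d(img, k):
    """Naive valid-mode 2-D convolution sweep."""
    kh, kw = k.shape
    return np.array([[np.sum(img[i:i + kh, j:j + kw] * k)
                      for j in range(img.shape[1] - kw + 1)]
                     for i in range(img.shape[0] - kh + 1)])

img = np.random.randn(5, 5)
filters = np.random.randn(4, 3, 3)                       # 4 filters, each 3x3
feature_maps = np.stack([conv2d(img, f) for f in filters])
print(feature_maps.shape)   # (4, 3, 3): depth equals the number of filters
```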

 

Activation function layer

  In practice, the activation function layer is often bound to the convolution layer that precedes it. The role of the activation function is to introduce non-linearity into the network. In convolutional networks the ReLU function is generally used; note that in most circumstances the Sigmoid function is not used.
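ReLU simply zeroes out negative values, ReLU(x) = max(0, x); a one-line sketch:

```python
import numpy as np

def relu(x):
    """ReLU keeps positive values and replaces negatives with zero."""
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.0, 3.0])))   # [0. 0. 0. 1. 3.]
```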

 

Pooling layer

  The pooling layer acts on the feature map, further condensing it — reducing the size of the input matrix and extracting features further. After the convolution operation we have extracted a lot of feature information; neighboring regions carry similar information and can substitute for each other, so keeping all of it would create redundancy and increase computational cost. Pooling is therefore equivalent to a dimensionality-reduction operation: within a small region of the matrix, the maximum value or the average value replaces the whole region. The size of this small matrix can be set when building the network, and it sweeps from top left to bottom right. As shown below.
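Max pooling with a 2 * 2 window and stride 2 can be sketched as follows (sizes illustrative):

```python
import numpy as np

def max_pool(x, size=2, stride=2):
    """Replace each size x size region with its maximum value."""
    oh = (x.shape[0] - size) // stride + 1
    ow = (x.shape[1] - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = x[i * stride:i * stride + size,
                          j * stride:j * stride + size].max()
    return out

fmap = np.arange(16).reshape(4, 4)
print(max_pool(fmap))   # [[ 5.  7.]
                        #  [13. 15.]]
```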

 

The pooling layer has the following functions:

  1. It performs another round of feature extraction on the Feature Map, which also reduces the amount of data.
  2. It extracts more abstract features, helping to prevent overfitting and improving generalization.
  3. It makes the network more tolerant of small changes in the input; that is, if the data contains some noise, this feature-extraction process reduces the influence of the noise to some extent.

A final point to note: the pooling layer is not required in a convolutional network. Some newer CNN designs do not use a pooling layer at all.

Fully connected layer

  As can be seen in the structure diagram at the beginning, a CNN has another characteristic: there may be multiple rounds of convolution and pooling layers. By that point, the model has processed the image input into features with higher information content; to carry out the classification task (usually classification, though other tasks are possible), fully connected layers are used to complete it.

  Between layer n−1 and layer n, every node in layer n−1 is connected to every node in layer n. That is, when each node in layer n computes its value, the input to its activation function is a weighted sum of all the nodes in layer n−1. The middle layers in the figure below are fully connected in this way.
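The weighted-sum computation of a fully connected layer, as a sketch (the sizes are illustrative, not from the original post):

```python
import numpy as np

def fully_connected(x, W, b):
    """Each output node is a weighted sum over ALL input nodes plus a bias;
    an activation function would normally be applied to the result."""
    return W @ x + b

features = np.random.randn(4, 3, 3).reshape(-1)   # flatten stacked feature maps
W = np.random.randn(10, features.size)            # e.g. 10 output classes
b = np.zeros(10)
scores = fully_connected(features, W, b)
print(scores.shape)   # (10,)
```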

Origin www.cnblogs.com/zhuminghui/p/11531878.html