Andrew Ng "convolution neural network" in the first week notes

Disclaimer: the author is limited, blog inevitably a lot of flaws and even serious mistakes, I hope you correct. While writing the biggest goal is also to exchange learning, not paying attention and spread. Long way to go, and you encourage each other. https://blog.csdn.net/yexiaohhjk/article/details/87716730

The first week convolution neural network

1. The computer vision (Computer vision)

  • Deep learning in visual computing research can inspire many areas, including voice recognition, etc.

  • Visual computing tasks:

    • Photo Gallery (Image classification)
    • Target Detection (Object detection)
    • Style Migration (Neural style transfer)
  • Computer vision input data is facing big challenges posed two problems, one neural network complexity, multi-parameter, prone to over-fitting. The second is computationally intensive require large memory. It makes the convolutional neural network (CNN) came into being.

Example 2. Detection edge (Edge detection example)

  • Convolution operation is part of the basic neural network convolution

  • Convolution: The actual value for each picture area is multiplied by the summation filter (filters), to achieve the original retention edge feature dimension reduction. Vertical edge detection for example, the size of the original picture 6x6, the filter size filter 3x3, the convolved image size 4x4, to obtain the following results :( noted here *represents a convolution operation, with nothing to do multiplication python, the convolution conv_forward()represented; tensorflow, the convolution tf.nn.conv2d()represented; kerasin convolution Conv2D()shown).

  • Why convolution can detect the edge?

    As a simple example, the following figure is the leftmost picture, it can be seen there is a significant vertical intermediate, obtained via the convolution of the intermediate image to the original image represents an edge. Here to see some rough edges, because our image is too small to apply the normal image will find its edge detection is quite good.

  • The following figure shows the difference between the two modes, a first called the positive side, there are images showing light to dark, she called the second negative-side, showing images from dark to light. In practice, these two ways gradient edge detection result does not affect, if not mind this distinction, it can operate on the absolute value of the output image.

    The horizontal edge detection filter operator is as follows:

  • In addition to the simple mention of Vertical, Horizontal above filter outside, common filter researchers have proposed Soble, Schor and other characteristics of the two filters are the right to increase the central area of the picture heavy.

    But in the depth of learning, the complexity of the image edge detection can choose to filter values ​​of the parameters as also neural network training. Learning may be superior to the filter obtained in any of a handwriting filter, as a vertical edge relative to the horizontal, which can detect any inclination angle of the edge.

4.Padding (filling)

  • If there is original nxn, filter (filter size fxf, then the size of a convolution operation of FIG (n-f+1)*(n-f+1).

  • But this convolution approach will make a lot of pictures reduced in size, while the image edge features are retained very little. So added paddingthe operation to fill the edge of pa 0, it is the last original size (n+2*p-f+1)*(n+2*p-f+1).

  • If the filling is not convolution is called Vaildconvolution of the image size is unchanged before and after convolution called Sameconvolution satisfied p = (f-1)/2to.

  • While generally selected filter (filters) is an odd size dimension.

The convolution steps (Stride)

  • Step decision to move from a filter:

  • If the original image size nxn, padding size p, filter for the fxfstep size for the stride s, the convolution of the image size (n+2p-f)/s+1, if the result is rounded down to an integer.

    This principle is only when your filter where all is within images or image is filled, will perform convolution operation.

  • Convolution operation on the matrix in the book of the paper and signal processing, the operation will first have flipped along the diagonal of the matrix. But for deep learning, this step is not important.

6.Convolutions Over Volume dimensional convolution

  • This section on the RGB image convolution on an example on the three-dimensional convolution standing, three-dimensional data on a two-dimensional data of the channel increases, so that the third dimension is the number of image channels, corresponding to the number of channels to be input filter as the number of channels of the image, such as the input picture is 6x6x3, respectively, the picture height (height), width (weight) and the passage (#channel), the filter must also be channel 3, it may be a 3x3x3 5x5x3, i.e., the size It can be customized as needed, and the channel must be consistent.

  • A two-dimensional convolution calculation process and the like, i.e. filter cover picture, multiplied by the corresponding elements of the last addition, then in steps moving the filter on the image, the filter may be designed as a different channel may be different, for example, want to detect vertical red channel, R-channel design to detect a vertical edge, G and B channels and filters are set to 0.

  • At the same time convolution filter selection is to choose according to your needs. There are many features to be detected, the number of filter can be constructed, the final result of the synthesis of a plurality of multilayer matrix. Below, the RGB image through a convolution of two filter generates two 4x4images.

7.One Layer of a Convolutional Network single convolutional network

Convolutional neural network layer structure is as follows :( two filters)
Here Insert Picture Description
weighting filter W is, for example, the filter element 27, then the value 27 and W have to be updated, plus B, the parameters is 28, if there is one filter 10 then this parameter is the total layer 280, regardless of how much input image size, the parameter is 280, i.e., the number of parameters and filters only the relevant, which is one of the characteristics of CNN, parameter is less effective to prevent overfitting.

Here Insert Picture Descriptionm is the number of samples.

8.Simple Convolutional Network Example simple convolution neural network

This section describes the implementation process is simple convolutional neural network through an example:

Here Insert Picture DescriptionThe layers of the CNN structure model as shown in FIG. Convolution until third hidden layer, i.e. its dimensions are 7 x 7 x 40, the final step is to expand all features into a convolution of the obtained vector dimension of 1960 x 1, which is input as the final layer. And filled into Logistic (binary) or SoftMax (multiple classification). Finally get the predicted output.

CNN in the hidden layer of three types:

  • Convolution (CONV) convolutional layer
  • Pooling (POOL) cell layer
  • Fully Connected (FC) full connection layer

9. Pooling Layers cell layer

Pooling layer effect is to reduce the size of the model, to improve the computing speed and increase the robustness of feature extraction. So pooling in general do not have padding to fill. It is also similar to the filter by moving a complete results, pooled weighted but not node layer and, instead of using a simpler maximum value (max pooling) or averaging operation (average pooling), the former than the latter wider use.

(. 1) max Pooling
Pooling convolution with a similar, but much simpler, max pooling is worth taking the maximum filter with a filtered image.
Here Insert Picture Description2) Pooling Average
Average Pooling is to use a filtered image taken are worth filter, which is better to use a wide max pooling.

Notably: hyperparametric pooled process and a filter with a filter size stepper length f s, there is no need to learn the parameters, i.e. when the counter-propagating no parameters need to be updated. It is a static property.
The most common pool of hyperparametric has two sets of filter: f = 2 s = 2 and f = 3 s = 2
Here Insert Picture Description

10. CNN Example

Here Insert Picture Description
FIG, CON layer immediately behind a POOL layer, together constituting the first layer (since there is no need to learn parameters POOL), CONV2 and a second layer constituting POOL2.
Of particular note is FC3 and FC4 is fully connected layer FC, which is consistent with the standard neural network architecture. The final output layer (SoftMax) consists of 10 neurons.

About hyperparametric selection advice, try not to set their own hyper-parameters, parameter values over others but with reference to literature inside.
The CNN architecture model is used behind one or more CONV with a POOL, and then continues with one or more of a behind the POOL ...... CONV several fully connected with the last layer, softmax output.

FIG network parameters are as follows:

Here Insert Picture Description

11. Why Convolutions?

Relative to the standard neural networks, image processing, then the network parameters convolution greatly reduced, so that we can train the model with a smaller training set, but also effectively prevent over-fitting
Why use convolution can reduce the parameters?
There are two reasons: sparse parameter sharing and connection.

  • Shared parameters (parameter sharing): a feature detector (such as a vertical edge detection) useful for an image block area, but also may be useful for other image areas, i.e. without adding feature detector can detect an entire image a plurality of similar features or characteristics.
  • Sparse connection (sparsity of connection): size limitations because the sub-filter operator, each output depends on only a few characteristics of the filter cover.

Also good at capturing CNN translation invariant feature. That is CNN object detection, it does not affect the position of the object is located by the picture.

Here Insert Picture Description

Guess you like

Origin blog.csdn.net/yexiaohhjk/article/details/87716730