Neural network notes (TensorFlow)

1. NN: neural network

2. CNN: convolutional neural network

A CNN network consists of five types of layers in total:

    • The input layer
    • Convolution layer
    • Activation layer
    • Pooling layer
    • FC (fully connected) layer

 

First, the input layer

As with conventional neural networks / machine learning, the model requires preprocessing of its input.

Common preprocessing methods in the input layer are:

  • Mean subtraction
  • Normalization
  • PCA / SVD dimensionality reduction, etc.

1. Mean subtraction: subtract the corresponding mean from each dimension, so that every dimension of the input data is centered at 0. The reason for doing this is that, if the mean is not removed, the model overfits easily.
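A minimal NumPy sketch of mean subtraction (the array X, its shape, and its values are illustrative assumptions, not from the original notes):

    import numpy as np

    # Illustrative batch of inputs: 100 samples, 3 features.
    X = np.random.rand(100, 3) * 10

    # Subtract the per-dimension mean so every feature is centered at 0.
    X_centered = X - X.mean(axis=0)

    print(X_centered.mean(axis=0))  # ~[0, 0, 0] up to floating-point error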

2. Normalization:

One approach is min-max normalization, e.g., normalizing the maximum to 1 and the minimum to -1, or the maximum to 1 and the minimum to 0. This suits data already distributed within a bounded range. The other is mean-variance normalization, which usually normalizes the mean to 0 and the variance to 1. This suits distributions without obvious boundaries.
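Both variants take only a few lines of NumPy; the data X below is again an illustrative assumption:

    import numpy as np

    X = np.random.rand(100, 3) * 10  # illustrative data

    # Min-max normalization: map the minimum to 0 and the maximum to 1.
    X_minmax = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

    # Mean-variance normalization: mean 0, variance 1 in each dimension.
    X_standard = (X - X.mean(axis=0)) / X.std(axis=0)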

3. PCA / whitening

(1) PCA reduces the dimensionality of the data by discarding the dimensions that carry less information and retaining the main feature information. The idea is to replace a large number of correlated features with a few representative, uncorrelated ones, thereby speeding up machine learning. (Dimensionality reduction techniques may be covered separately.)

PCA can be used for feature extraction, data compression, de-noising, and dimensionality reduction.
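As a rough sketch of PCA via SVD (the helper name pca, the data, and the choice of k are assumptions added for illustration):

    import numpy as np

    def pca(X, k):
        """Reduce X of shape (n_samples, n_features) to k dimensions."""
        X = X - X.mean(axis=0)               # PCA assumes zero-centered data
        U, S, Vt = np.linalg.svd(X, full_matrices=False)
        return X @ Vt[:k].T                  # project onto top-k principal axes

    X_reduced = pca(np.random.rand(100, 10), k=2)  # keep the 2 main directions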

(2) The purpose of whitening is to decorrelate the data and to equalize the variances across dimensions. Since adjacent pixels in an image are strongly correlated, much of the raw training input is redundant. This correlation can be removed with a whitening operation, so that:

       1. The correlation between features is reduced.

       2. All features have the same variance (the covariance matrix becomes the identity).

(3) As an example of whitening: suppose the left figure shows the distribution of two correlated features, where a linear relationship among the feature points can be seen; after whitening (projecting onto the eigenvectors), the data can be turned into the uncorrelated form shown on the right.

And because whitening equalizes the variances, it can also speed up training.
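A possible NumPy sketch of PCA whitening along these lines (the helper name pca_whiten and the small constant eps are assumptions added for numerical safety):

    import numpy as np

    def pca_whiten(X, eps=1e-5):
        """Decorrelate the features of X and scale each to unit variance."""
        X = X - X.mean(axis=0)
        cov = X.T @ X / X.shape[0]              # feature covariance matrix
        eigvals, eigvecs = np.linalg.eigh(cov)  # eigendecomposition (symmetric matrix)
        X_rot = X @ eigvecs                     # rotate onto eigenvectors: decorrelates
        return X_rot / np.sqrt(eigvals + eps)   # equalize variances (eps avoids /0)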

Second, the convolution layer

Third, the activation layer

The so-called activation is actually a non-linear mapping applied to the output of the convolution layer.

 

If there were no activation function (which is in fact equivalent to the activation function f(x) = x), the output of each layer would be a linear function of the previous layer's input. It is easy to see that no matter how many layers the neural network had, its output would still be a linear combination of the input, which is no different from having no hidden layers at all; this is the most primitive perceptron.
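A tiny NumPy check of this point: stacking two layers without an activation collapses into a single linear layer (the weight shapes are arbitrary illustrations):

    import numpy as np

    W1 = np.random.rand(4, 3)    # "layer 1" weights
    W2 = np.random.rand(2, 4)    # "layer 2" weights
    x = np.random.rand(3)

    y_two_layers = W2 @ (W1 @ x)
    y_one_layer = (W2 @ W1) @ x  # one equivalent linear layer

    print(np.allclose(y_two_layers, y_one_layer))  # True: the extra depth added nothing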
Commonly used activation functions are:

  • Sigmoid function
  • Tanh function
  • ReLU
  • Leaky ReLU
  • ELU
  • Maxout
Suggested activations for intermediate layers: try ReLU first, because it converges quickly, although it can fail (units may stop activating). If ReLU fails, consider Leaky ReLU or Maxout, which resolves the problem in most cases. Tanh tends to give better results in text and audio processing. The activation of the last layer of a classification network (i.e., the prediction function) is typically softmax, possibly combined with other functions.
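Since these are TensorFlow notes, here is a minimal sketch of applying the listed activations in TensorFlow 2.x (the input values are arbitrary; Maxout is omitted because it is not a built-in op):

    import tensorflow as tf

    x = tf.constant([-2.0, -0.5, 0.0, 1.0, 3.0])

    print(tf.sigmoid(x))                   # squashes values into (0, 1)
    print(tf.tanh(x))                      # squashes values into (-1, 1)
    print(tf.nn.relu(x))                   # max(0, x); fast, but units can "die"
    print(tf.nn.leaky_relu(x, alpha=0.2))  # small slope for x < 0
    print(tf.nn.elu(x))                    # smooth alternative for x < 0
    print(tf.nn.softmax(x))                # typical last-layer classifier output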

Fourth, the pooling layer

Pooling: also known as subsampling or downsampling. It is mainly used to reduce feature dimensionality, compressing the amount of data and the number of parameters to reduce overfitting, while also improving the fault tolerance of the model. Common types are:

  • Max Pooling: take the maximum value in each pooling window
  • Average Pooling: take the average value in each pooling window

A blog post on pooling: https://blog.csdn.net/qq_41661809/article/details/96500250

Although people cannot easily recognize the features by eye after pooling, it does not matter; the machine can still identify them.
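A small TensorFlow sketch of both pooling types (the 4x4 single-channel input and the 2x2 window are illustrative assumptions):

    import tensorflow as tf

    # One 4x4 single-channel "image" in NHWC layout.
    x = tf.reshape(tf.range(16, dtype=tf.float32), (1, 4, 4, 1))

    # Max pooling: keep the largest value in each 2x2 window.
    max_pooled = tf.nn.max_pool2d(x, ksize=2, strides=2, padding="VALID")

    # Average pooling: keep the mean of each 2x2 window.
    avg_pooled = tf.nn.avg_pool2d(x, ksize=2, strides=2, padding="VALID")

    print(max_pooled.shape)  # (1, 2, 2, 1): the feature map is halved in each dimension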

Fifth, the output layer (fully connected layer)

 
