TensorFlow 2.0 study notes 2.3: Activation functions

Activation function

This figure shows the neuron model we used in the previous lecture to classify iris flowers, together with its forward-propagation formula. As the formula makes clear, even if many layers of such neurons are connected end to end to form a deep network, the result is still just a linear combination of the inputs, so the model's expressive power is limited.
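As a minimal sketch of this point (plain NumPy, with made-up layer sizes), the snippet below folds two layers that have no activation function into a single weight matrix and bias, and checks that the stacked "network" computes exactly the same linear map:

```python
import numpy as np

# Two layers with no activation function collapse into a single linear map,
# so adding depth alone does not add expressive power.
rng = np.random.default_rng(0)
x = rng.normal(size=4)                       # one input with 4 features (e.g. iris)
W1, b1 = rng.normal(size=(5, 4)), rng.normal(size=5)
W2, b2 = rng.normal(size=(3, 5)), rng.normal(size=3)

two_layers = W2 @ (W1 @ x + b1) + b2         # y = W2 (W1 x + b1) + b2
W, b = W2 @ W1, W2 @ b1 + b2                 # fold the two layers into one
one_layer = W @ x + b

print(np.allclose(two_layers, one_layer))    # True
```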
(Figure: MP neuron model, 1943)
This figure shows the MP model proposed in 1943. Compared with the simplified model above, it adds one nonlinear function, called the activation function. With activation functions, a neural network's expressive power can grow as the number of layers increases.


Commonly used activation functions

Sigmoid() function
The sigmoid function, also called the logistic function, is commonly used for hidden-layer neuron output. Its range is (0, 1): it maps any real number into the interval (0, 1), so it can also be used for binary classification. It tends to work well when the differences between features are complex or not particularly large.

As an activation function, sigmoid has the following advantages and disadvantages:
Advantages: it is smooth and easy to differentiate.
Disadvantages: it is relatively expensive to compute, and backpropagating the error gradient involves division; during backpropagation the gradient vanishes easily, which can make deep networks impossible to train.
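A short TensorFlow 2 sketch (input values chosen only for illustration) that evaluates tf.nn.sigmoid and its gradient with tf.GradientTape; the gradient never exceeds 0.25 and is close to zero for large |x|, which is the vanishing-gradient problem mentioned above:

```python
import tensorflow as tf

# Sigmoid: f(x) = 1 / (1 + exp(-x)); output lies in (0, 1).
x = tf.constant([-10.0, -1.0, 0.0, 1.0, 10.0])
with tf.GradientTape() as tape:
    tape.watch(x)                  # x is a constant, so watch it explicitly
    y = tf.nn.sigmoid(x)
grad = tape.gradient(y, x)

print(y.numpy())                   # values squashed into (0, 1)
print(grad.numpy())                # never above 0.25, nearly 0 for large |x|
# Used as a layer activation: tf.keras.layers.Dense(1, activation='sigmoid')
```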
(Figure: sigmoid function formula and graph)

Tanh() function
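A minimal TensorFlow 2 sketch of tanh (input values chosen only for illustration): its output lies in (-1, 1) and is zero-centered, though it still saturates for large |x| like the sigmoid.

```python
import tensorflow as tf

# Tanh: output lies in (-1, 1) and is zero-centered,
# but it still saturates (gradient -> 0) for large |x|, like the sigmoid.
x = tf.constant([-5.0, -1.0, 0.0, 1.0, 5.0])
print(tf.nn.tanh(x).numpy())
# Used as a layer activation: tf.keras.layers.Dense(128, activation='tanh')
```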
(Figure: tanh function formula and graph)

Relu() function
(Figure: ReLU function formula and graph)
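A minimal TensorFlow 2 sketch of ReLU (input values chosen only for illustration): f(x) = max(0, x) is cheap to compute and its gradient is 1 for positive inputs, which helps avoid vanishing gradients, but negative inputs are zeroed out.

```python
import tensorflow as tf

# ReLU: f(x) = max(0, x). Cheap to compute, gradient is 1 for x > 0
# (no saturation on the positive side), but negative inputs are zeroed out.
x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])
print(tf.nn.relu(x).numpy())       # [0.  0.  0.  0.5 2. ]
# Used as a layer activation: tf.keras.layers.Dense(128, activation='relu')
```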

Leaky Relu() function

(Figure: Leaky ReLU function formula and graph)
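A minimal TensorFlow 2 sketch of Leaky ReLU (the slope alpha=0.2 is just an illustrative choice): negative inputs are scaled by a small factor instead of being set to zero, so their gradient does not die.

```python
import tensorflow as tf

# Leaky ReLU: f(x) = x for x > 0, alpha * x otherwise, so negative inputs
# keep a small gradient instead of "dying". alpha=0.2 is an illustrative choice.
x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])
print(tf.nn.leaky_relu(x, alpha=0.2).numpy())   # [-0.4 -0.1  0.   0.5  2. ]
# As a standalone layer: tf.keras.layers.LeakyReLU(alpha=0.2)
```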


Original post: blog.csdn.net/weixin_44145452/article/details/112978353