Summary of Deep Learning System (Getting Started)


Learning recommendations:
1. Andrew Ng's (Wu Enda) deep learning course: https://www.deeplearning.ai
2. Hung-yi Lee's (Li Hongyi) deep learning course: search for "Li Hongyi" directly on Bilibili
3. Deep Learning, commonly known in Chinese as the "flower book" (Huashu), full of vivid examples: https://github.com/exacity/deepinglearningbook-chinese

How the concepts relate:
artificial intelligence → machine learning → artificial neural network → deep learning (each field contains the next)

The framework of this post:
1. Artificial neuron
2. Multilayer perceptron
3. Activation function
4. Backpropagation algorithm
5. Loss function
6. Weight initialization
7. Regularization

1. Artificial neurons

[Figure: schematic diagram of a biological (human) neuron and an artificial neuron]
Artificial neural network: a machine learning model built by connecting a large number of artificial neurons in a particular way.
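
A minimal sketch of a single artificial neuron (the function and variable names below are my own): it computes a weighted sum of its inputs plus a bias, then applies an activation function (sigmoid here).

```python
import numpy as np

def neuron(x, w, b):
    """One artificial neuron: weighted input sum plus bias, through a sigmoid."""
    z = np.dot(w, x) + b             # weighted sum of inputs plus bias
    return 1.0 / (1.0 + np.exp(-z))  # sigmoid activation

# Example: a neuron with 3 inputs
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.4, -0.2])
print(neuron(x, w, b=0.3))
```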

2. Multilayer perceptron

[Figures: a single-layer perceptron, and the same network with hidden layers added]
As the figures show, adding one or more hidden layers between the input and output layers turns the perceptron into a multi-layer perceptron (MLP).
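
A minimal forward-pass sketch of an MLP (layer sizes and names are my own choices): each layer applies an affine transform followed by an activation on the hidden layers.

```python
import numpy as np

def mlp_forward(x, layers):
    """Forward pass of an MLP: each layer is a (W, b) pair;
    tanh on hidden layers, identity on the output layer."""
    h = x
    for i, (W, b) in enumerate(layers):
        z = W @ h + b                                 # affine transform
        h = np.tanh(z) if i < len(layers) - 1 else z  # hidden activation
    return h

rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(4, 3)), np.zeros(4)),  # input (3) -> hidden (4)
    (rng.normal(size=(2, 4)), np.zeros(2)),  # hidden (4) -> output (2)
]
print(mlp_forward(np.array([1.0, 2.0, 3.0]), layers))
```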

3. Activation function

The role of the activation function:
(1) It makes the multi-layer perceptron genuinely multi-layer; without it, stacked linear layers are equivalent to a single layer.
(2) It introduces nonlinearity, so that the network can approximate any nonlinear function (universal approximation theorem).

Properties an activation function should have:
1) Continuous and differentiable (non-differentiability at a few points is acceptable), so that numerical optimization methods can be used to learn the network parameters.
2) The activation function and its derivative should be as simple as possible, which helps improve the computational efficiency of the network.
3) The range of the activation function's derivative should lie within a suitable interval; otherwise the efficiency and stability of training suffer.

Common activation functions:
[Figure: plots of common activation functions]
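
A sketch of three widely used activation functions and their derivatives (my own selection, since the original figure is not recoverable). Point 3 above is visible here: the sigmoid's derivative never exceeds 0.25, which can slow training in deep networks.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)          # peaks at 0.25 when z = 0

def tanh_grad(z):
    return 1.0 - np.tanh(z) ** 2  # peaks at 1.0 when z = 0

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    return (z > 0).astype(float)  # 0 for z <= 0, 1 for z > 0

z = np.linspace(-5, 5, 11)
print(sigmoid(z).round(3))
print(relu(z))
```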

4. Backpropagation

Forward propagation: starting from the input layer, data is passed forward, layer by layer, to the output layer.
Backpropagation: starting from the loss function, gradients are passed backward, layer by layer, to the first layer.

Purpose of backpropagation: to update the weights so that the network output gets closer to the labels.
Loss function: measures the difference between the model output and the true label.
Principle of backpropagation: the chain rule from calculus.
[Figure: backpropagating gradients through a computational graph via the chain rule]

Gradient descent: update the weights along the negative direction of the gradient, so as to reduce the value of the loss function.
Gradient: a vector whose direction is the direction in which the directional derivative attains its maximum value.
Learning rate: controls the update step size.
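
A minimal sketch tying these pieces together (the setup is my own): a model y = w·x + b fitted to a single point with MSE, where the chain rule gives the gradients by hand and gradient descent applies the update.

```python
# Toy example: fit y = w*x + b to one data point with MSE.
x, y_true = 2.0, 7.0
w, b = 0.0, 0.0
lr = 0.1                          # learning rate: update step size

for step in range(50):
    y = w * x + b                 # forward propagation
    loss = (y - y_true) ** 2      # MSE loss on this single point
    # Backpropagation via the chain rule:
    #   dL/dy = 2 * (y - y_true), dy/dw = x, dy/db = 1
    grad_y = 2.0 * (y - y_true)
    grad_w = grad_y * x
    grad_b = grad_y
    # Gradient descent: step along the negative gradient direction
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b, loss)                 # loss shrinks toward 0
```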

5. Loss function

Loss function: Measures the distance between the model output and the true label.
Two common loss functions: MSE (mean squared error) and CE (cross entropy).
Information entropy: describes the uncertainty of information.
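
Minimal sketches of the two losses (function names and test values are my own):

```python
import numpy as np

def mse(y_pred, y_true):
    """Mean squared error: average squared difference."""
    return np.mean((y_pred - y_true) ** 2)

def cross_entropy(probs, label):
    """Cross entropy for one sample: minus the log-probability
    assigned to the true class (probs must sum to 1)."""
    return -np.log(probs[label])

print(mse(np.array([0.9, 2.1]), np.array([1.0, 2.0])))   # 0.01
print(cross_entropy(np.array([0.7, 0.2, 0.1]), label=0)) # ~0.357
```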

6. Weight initialization

Weight initialization: assign values to the weight parameters before training begins; a good weight initialization helps the model train.
1. Xavier initialization
2. Kaiming initialization
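
A sketch of both schemes in their usual normal-distribution variants (Xavier/Glorot draws with variance 2 / (fan_in + fan_out); Kaiming/He draws with variance 2 / fan_in, suited to ReLU):

```python
import numpy as np

rng = np.random.default_rng(0)
fan_in, fan_out = 256, 128

# Xavier (Glorot): variance 2 / (fan_in + fan_out),
# commonly paired with sigmoid/tanh activations.
W_xavier = rng.normal(0.0, np.sqrt(2.0 / (fan_in + fan_out)),
                      size=(fan_out, fan_in))

# Kaiming (He): variance 2 / fan_in,
# designed for ReLU activations.
W_kaiming = rng.normal(0.0, np.sqrt(2.0 / fan_in),
                       size=(fan_out, fan_in))

print(W_xavier.std(), W_kaiming.std())
```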

7. Regularization

Regularization: a strategy for reducing variance, popularly understood as a strategy for alleviating overfitting.
Loss function: Loss
Cost function: Cost
Objective function: Objective
Obj = Cost + Regularization
The regularization term acts as a constraint on the model parameters.
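
A minimal sketch of Obj = Cost + Regularization using an L2 penalty (lam is an assumed hyperparameter; names are my own):

```python
import numpy as np

def objective(y_pred, y_true, weights, lam=0.01):
    """Obj = Cost + Regularization: MSE cost plus an L2 penalty
    that constrains the size of the weights."""
    cost = np.mean((y_pred - y_true) ** 2)
    reg = lam * sum(np.sum(W ** 2) for W in weights)  # L2 regularization term
    return cost + reg

weights = [np.array([[0.5, -0.3], [0.2, 0.1]])]
print(objective(np.array([1.1, 1.9]), np.array([1.0, 2.0]), weights))
```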

Origin: blog.csdn.net/weixin_51610638/article/details/120786305