[Deep learning] The differentiation principle of backpropagation

Table of Contents

 

1. The definition of backpropagation

2. An example of backpropagation in a simple network


1. The definition of backpropagation

       The previous article, "What is gradient learning", introduced the principle of gradient learning. This article introduces the core algorithm behind a neural network's "automatic learning": back propagation, also known as the BP algorithm. The input-output relationship learned by the BP algorithm is essentially a mapping fitted by gradient descent.

       The learning process of the BP algorithm consists of a forward propagation phase and a back propagation phase. In forward propagation, the input passes through the input layer and the hidden layers, is processed layer by layer, and is transmitted to the output layer. If the output layer does not produce the expected output, the sum of squared errors between the actual and expected outputs is taken as the objective function, and the algorithm switches to back propagation: the partial derivatives of the objective function with respect to each neuron's weights are computed layer by layer, forming the gradient of the objective function with respect to the weight vector, which serves as the basis for modifying the weights. The network learns through this process of weight modification, and learning ends when the error falls to the expected value. The BP algorithm is essentially a supervised learning algorithm.

Stages of the BP algorithm:

     Forward propagation stage: feed a training input into the network to obtain the activation response (the network's output).

     Back propagation stage: the difference between that activation response and the target output for the training input is propagated backwards to obtain the response error of the output layer and the hidden layers, which drives the weight updates (a minimal code sketch of both phases follows below).
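
As a rough illustration of how these two stages fit together in code (the single-weight model, the numbers, and the learning rate here are illustrative assumptions, not taken from this article), a minimal sketch:

```python
# Minimal sketch of the two BP phases on a single linear neuron y_hat = w * x,
# fitting one training pair (x=3, y=6). All values are illustrative.
x, y = 3.0, 6.0
w = 0.0                          # initial weight
lr = 0.05                        # learning rate (assumed)

for _ in range(50):
    # Forward propagation phase: compute the activation response.
    y_hat = w * x
    loss = (y_hat - y) ** 2      # squared-error objective

    # Back propagation phase: error -> gradient via the chain rule.
    grad_w = 2 * (y_hat - y) * x

    # Weight modification (gradient descent).
    w -= lr * grad_w

print(w, loss)                   # w approaches 2.0 and the loss approaches 0
```

Each pass through the loop is one forward propagation followed by one back propagation and a weight update.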

2. An example of backpropagation in a simple network

As an example:

 

The example is a 3-layer neural network in which each layer has only one neuron: the first layer represents the data input and the last layer represents the data output. The variables are denoted as follows:

     a^(l): the output of the neuron in layer l (a^(L-1) is the network input)

     w^(l), b^(l): the weight and bias of the neuron in layer l

     y: the expected output, compared against the network output a^(L+1)

     C: the value of the loss function

Input and expected output values:

     Input value a^(L-1)     Expected output y
     5                       2

The loss function uses the mean squared error:
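
For a single scalar output, a common form of this loss, and the one assumed in the worked numbers below, is:

     C = (a^(L+1) - y)²

(Some texts put a factor of 1/2 in front; the calculations below omit it.)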

1. The process of forward propagation:

1) Randomly initialize the weight and bias parameters of each neuron

     Layer     Weight w     Bias b
     L         0.1          1
     L+1       0.5          2

2) Calculate the output value of each layer
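
Assuming plain linear neurons (no activation function, in line with the closing remark of this article), the layer outputs work out to:

     a^(L)   = w^(L) · a^(L-1) + b^(L)   = 0.1 × 5 + 1   = 1.5
     a^(L+1) = w^(L+1) · a^(L) + b^(L+1) = 0.5 × 1.5 + 2 = 2.75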

3) Calculate the target loss
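
Under the same assumptions, the loss evaluates to:

     C = (a^(L+1) - y)² = (2.75 - 2)² = 0.5625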

The resulting loss is still large, so the back propagation phase begins.

2. The process of back propagation

1) Differentiation

According to the chain rule:
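
Under the assumptions above (linear neurons, C = (a^(L+1) - y)²), the partial derivatives work out layer by layer to:

     ∂C/∂a^(L+1) = 2 · (a^(L+1) - y)       = 2 × 0.75  = 1.5
     ∂C/∂w^(L+1) = ∂C/∂a^(L+1) · a^(L)     = 1.5 × 1.5 = 2.25
     ∂C/∂b^(L+1) = ∂C/∂a^(L+1)             = 1.5
     ∂C/∂a^(L)   = ∂C/∂a^(L+1) · w^(L+1)   = 1.5 × 0.5 = 0.75
     ∂C/∂w^(L)   = ∂C/∂a^(L) · a^(L-1)     = 0.75 × 5  = 3.75
     ∂C/∂b^(L)   = ∂C/∂a^(L)               = 0.75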

2) Update the parameter values
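
With a learning rate η (a value the example does not specify), the standard gradient-descent updates are:

     w^(L+1) ← w^(L+1) - η · ∂C/∂w^(L+1)
     b^(L+1) ← b^(L+1) - η · ∂C/∂b^(L+1)
     w^(L)   ← w^(L)   - η · ∂C/∂w^(L)
     b^(L)   ← b^(L)   - η · ∂C/∂b^(L)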

After updating all the parameters, repeat the forward propagation and back propagation process until the loss function approaches zero.
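
To tie the example together, here is a minimal runnable Python sketch of the whole loop under the same assumptions (linear neurons, squared-error loss); the learning rate of 0.01 is an assumed value, not one taken from the example:

```python
# Toy 3-layer example from above: one neuron per layer, no activation.
a_in, y = 5.0, 2.0            # input a^(L-1) and expected output y
w1, b1 = 0.1, 1.0             # layer L weight and bias
w2, b2 = 0.5, 2.0             # layer L+1 weight and bias
lr = 0.01                     # learning rate (assumed)

for step in range(1000):
    # Forward propagation.
    a1 = w1 * a_in + b1       # a^(L)   = 1.5  on the first pass
    a2 = w2 * a1 + b2         # a^(L+1) = 2.75 on the first pass
    loss = (a2 - y) ** 2      # 0.5625 on the first pass
    if loss < 1e-10:
        break

    # Back propagation via the chain rule.
    d_a2 = 2 * (a2 - y)
    d_w2, d_b2 = d_a2 * a1, d_a2
    d_a1 = d_a2 * w2
    d_w1, d_b1 = d_a1 * a_in, d_a1

    # Gradient-descent parameter updates.
    w2 -= lr * d_w2
    b2 -= lr * d_b2
    w1 -= lr * d_w1
    b1 -= lr * d_b1

print(step, loss)             # loss falls toward zero after a few dozen steps
```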

        In a complex neural network, the number of layers and the number of neurons per layer are much larger, and the neuron outputs may also pass through activation functions, but parameter updates still follow the same back propagation principle.
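
As a rough sketch of how an activation function enters the chain rule (the network shape, initial values, and learning rate here are all illustrative assumptions), consider a tiny 1-2-1 numpy network with a sigmoid hidden layer; the only change to back propagation is the extra derivative factor a · (1 - a) contributed by the sigmoid:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny 1-2-1 network: sigmoid hidden layer, linear output (illustrative values).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 1)), np.zeros((2, 1))   # hidden layer
W2, b2 = rng.normal(size=(1, 2)), np.zeros((1, 1))   # output layer
x, y = np.array([[5.0]]), np.array([[2.0]])
lr = 0.05

for _ in range(500):
    # Forward propagation.
    z1 = W1 @ x + b1
    a1 = sigmoid(z1)
    y_hat = W2 @ a1 + b2
    loss = float((y_hat - y) ** 2)

    # Back propagation: the sigmoid adds the factor a1 * (1 - a1).
    d_yhat = 2 * (y_hat - y)
    dW2, db2 = d_yhat @ a1.T, d_yhat
    d_a1 = W2.T @ d_yhat
    d_z1 = d_a1 * a1 * (1 - a1)
    dW1, db1 = d_z1 @ x.T, d_z1

    # Gradient-descent updates.
    W2 -= lr * dW2
    b2 -= lr * db2
    W1 -= lr * dW1
    b1 -= lr * db1

print(round(loss, 6))   # close to 0 after training
```

Larger networks only repeat this pattern across more layers and more neurons per layer; deep learning frameworks automate exactly these chain-rule products.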


Original article: blog.csdn.net/henku449141932/article/details/107768781