In the last chapter we learned about derivatives and the chain rule, which I will assume you are now comfortable with. This chapter begins our study of forward propagation and back propagation in deep learning.
I explained logistic regression before, so this chapter uses logistic regression as the example for explaining forward propagation and back propagation.
Let us recall the logistic regression formulas we learned earlier:

$$\hat{y} = a = \sigma(z) = \frac{1}{1 + e^{-z}}, \qquad z = w^T x + b$$

with the cross-entropy loss for a single sample:

$$L(a, y) = -\big(y \log a + (1 - y) \log(1 - a)\big)$$
Forward Propagation
Forward propagation is relatively easy to understand: the process of computing the output from the input, through the hidden-layer calculations, is the forward pass, also known as forward propagation.
We assume that the sample has two features, $x_1$ and $x_2$. Forward propagation is then computed as follows:

$$z = w_1 x_1 + w_2 x_2 + b$$

$$a = \sigma(z) = \frac{1}{1 + e^{-z}}$$

$$L(a, y) = -\big(y \log a + (1 - y) \log(1 - a)\big)$$

This is what forward propagation means: starting from the input, we compute in the direction of the arrows until we finally obtain the value of the loss function.
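The forward pass above can be sketched in a few lines of NumPy. All the concrete numbers here (features, initial weights, label) are made-up illustrations, not values from the article:

```python
import numpy as np

def sigmoid(z):
    """The logistic function sigma(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical single sample with two features and its label.
x1, x2 = 1.0, 2.0
y = 1.0

# Hypothetical initial parameters.
w1, w2, b = 0.1, -0.2, 0.05

# Forward propagation: input -> z -> a -> loss
z = w1 * x1 + w2 * x2 + b                       # linear combination
a = sigmoid(z)                                  # predicted probability
loss = -(y * np.log(a) + (1 - y) * np.log(1 - a))  # cross-entropy loss
```

Each line mirrors one arrow of the forward pass: the linear step, the sigmoid activation, and finally the loss.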
Backward Propagation
Backward propagation runs in the opposite direction to forward propagation. Flowing backward through the network from the loss function, it computes the gradient (partial derivative) of the parameters in each layer in order to update them, as shown in the following figure:

The ultimate goal of backpropagation is to minimize the value of the loss function by updating the parameters. Concretely, we compute the gradients of the parameters step by step, following the orange arrows, and then update the parameters.
Since backpropagation is very important, I will derive it step by step below. I hope everyone can gain something after reading this article.
1. Starting from the loss function $L$, calculate the gradient with respect to $a$:

$$\frac{\partial L}{\partial a} = -\frac{y}{a} + \frac{1 - y}{1 - a}$$
2. Next, calculate the gradient with respect to $z$:

$$\frac{\partial L}{\partial z} = \frac{\partial L}{\partial a} \cdot \frac{\partial a}{\partial z}$$

We already computed $\frac{\partial L}{\partial a}$ in the previous step, so we only need to calculate $\frac{\partial a}{\partial z}$.

Since $a = \sigma(z) = \frac{1}{1 + e^{-z}}$, we have $\frac{\partial a}{\partial z} = a(1 - a)$.

Therefore

$$\frac{\partial L}{\partial z} = \left(-\frac{y}{a} + \frac{1 - y}{1 - a}\right) \cdot a(1 - a) = a - y$$
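The compact result $\frac{\partial L}{\partial z} = a - y$ is easy to verify numerically with a central finite difference. The point $z$ and label $y$ below are arbitrary values chosen for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_of_z(z, y):
    """Cross-entropy loss as a function of z (with y fixed)."""
    a = sigmoid(z)
    return -(y * np.log(a) + (1 - y) * np.log(1 - a))

z, y = 0.7, 1.0                 # arbitrary test point and label
a = sigmoid(z)

analytic = a - y                # the gradient we just derived
eps = 1e-6
numeric = (loss_of_z(z + eps, y) - loss_of_z(z - eps, y)) / (2 * eps)

# The analytic and numeric gradients should agree to high precision.
print(abs(analytic - numeric))
```

A gradient check like this is a handy habit whenever you derive backpropagation formulas by hand.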
3. Finally, calculate the gradients with respect to the parameters $w_1$, $w_2$, and $b$:

$$\frac{\partial L}{\partial w_1} = \frac{\partial L}{\partial z} \cdot \frac{\partial z}{\partial w_1}$$

We have already found $\frac{\partial L}{\partial z} = a - y$, and since $z = w_1 x_1 + w_2 x_2 + b$, we have $\frac{\partial z}{\partial w_1} = x_1$, so

$$\frac{\partial L}{\partial w_1} = (a - y)\, x_1$$

In the same way,

$$\frac{\partial L}{\partial w_2} = (a - y)\, x_2$$

$$\frac{\partial L}{\partial b} = a - y$$

With the gradients in hand, each parameter only needs the corresponding update:

$$w_1 := w_1 - \alpha \frac{\partial L}{\partial w_1}, \qquad w_2 := w_2 - \alpha \frac{\partial L}{\partial w_2}, \qquad b := b - \alpha \frac{\partial L}{\partial b}$$
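Putting the whole derivation together, one can sketch repeated gradient-descent steps for this two-feature logistic regression. The sample values, initial parameters, and learning rate below are all made-up illustrations:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical sample and initial parameters.
x1, x2, y = 1.0, 2.0, 1.0
w1, w2, b = 0.1, -0.2, 0.05
alpha = 0.1  # learning rate (a hyperparameter)

for step in range(100):
    # Forward propagation
    z = w1 * x1 + w2 * x2 + b
    a = sigmoid(z)
    loss = -(y * np.log(a) + (1 - y) * np.log(1 - a))

    # Backward propagation, using the gradients derived above
    dz = a - y
    dw1 = dz * x1
    dw2 = dz * x2
    db = dz

    # Parameter update (gradient descent)
    w1 -= alpha * dw1
    w2 -= alpha * dw2
    b -= alpha * db
```

On this single sample the loss shrinks toward zero over the iterations, because the updates push $a$ toward the label $y$.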
Here $\alpha$ is the learning rate, a hyperparameter, which means it has to be tuned by hand. It was discussed in a previous article, so I won't repeat it here.
I think I've covered this in enough detail, but if anything is still unclear, leave your questions in the comments and I will answer them one by one.

If you learned something from this article, please move your cute little hand and hit follow.
That is all for this article. To get the deep learning materials and courses, scan the official account below and reply with the word "data". Happy learning!