To prove that the negative gradient direction is the direction in which a function decreases fastest,
it is equivalent to prove that the positive gradient direction is the direction in which the function increases fastest.
Proof:
Let x be a vector and f(x) a differentiable function of x whose value we want to make as small as possible.
Take an arbitrary direction l, a vector with the same dimension as x.
Moving from x along l, the function may rise or fall (we do not know in advance which).
The value of the function after the move is f(x+l).
Perform a first-order Taylor expansion of f(x+l) to obtain the following formula:
$$f(x+l) = f(x) + \nabla f(x)^{T} l + o(\|l\|)$$
Then f(x+l)-f(x) is the change in the value of the function when moving along the direction l.
That is, if f(x+l)-f(x) > 0, the function rises along l; if f(x+l)-f(x) < 0, it falls along l.
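The sign test above can be tried on a concrete function. The following is a minimal sketch, assuming a hypothetical quadratic f and hand-picked directions l_up and l_down; none of these appear in the original argument and they are chosen only for illustration.

```python
import numpy as np

def f(x):
    # Hypothetical example function (an assumption, not from the text):
    # a quadratic bowl with its minimum at the origin.
    return x[0] ** 2 + 3.0 * x[1] ** 2

x = np.array([1.0, -2.0])

def change_along(l):
    # Change in the function value when moving from x to x + l.
    return f(x + l) - f(x)

l_up = np.array([0.0, -0.1])   # moves away from the minimum
l_down = np.array([0.0, 0.1])  # moves toward the minimum

rising = change_along(l_up) > 0     # positive change: f rises along l_up
falling = change_along(l_down) < 0  # negative change: f falls along l_down
print(change_along(l_up), rising)
print(change_along(l_down), falling)
```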
Back to our ultimate question: Why is the positive direction of the gradient the fastest rising direction of the function?
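Moving f(x) to the left side of the expansion, the equation becomes:
$$f(x+l) - f(x) = \nabla f(x)^{T} l + o(\|l\|)$$
Consider the case where the change l in the independent variable is very small.
Then the remainder term $o(\|l\|)$ shrinks faster than $\|l\|$ and is negligible.
The remaining formula is:
$$f(x+l) - f(x) \approx \nabla f(x)^{T} l$$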
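To see why the leftover part can be dropped, one can compare the exact change f(x+l)-f(x) with the dot product of the gradient and l for shrinking step lengths. This is a rough numerical sketch, assuming a hypothetical quadratic f with a hand-written gradient grad_f and an arbitrary unit direction; these names are illustrative and not part of the original proof.

```python
import numpy as np

def f(x):
    # Hypothetical example function (an assumption for illustration).
    return x[0] ** 2 + 3.0 * x[1] ** 2

def grad_f(x):
    # Analytic gradient of the example function above.
    return np.array([2.0 * x[0], 6.0 * x[1]])

x = np.array([1.0, -2.0])
direction = np.array([0.6, 0.8])  # an arbitrary unit vector

ratios = []
for step in (1e-1, 1e-2, 1e-3):
    l = step * direction
    exact = f(x + l) - f(x)        # true change in the function value
    linear = grad_f(x) @ l         # first-order (dot product) term
    # Size of the neglected part relative to the step length.
    ratios.append(abs(exact - linear) / step)
    print(step, exact, linear, ratios[-1])
```

The ratio of the neglected part to the step length keeps shrinking as the step shrinks, which is what "negligible" means here.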
So our ultimate question (why is the positive gradient direction the direction in which the function rises fastest?) becomes: for a step l of fixed small length, which direction maximizes the dot product $\nabla f(x)^{T} l$?
Write the dot product as $\nabla f(x)^{T} l = \|\nabla f(x)\| \, \|l\| \cos\theta$, where $\theta$ is the angle between the gradient and l. Assuming the gradient is nonzero, the product is largest when $\cos\theta = 1$, that is, when l points in the same direction as the gradient, so this is the direction in which the function rises fastest. Conversely, when l points opposite to the gradient ($\cos\theta = -1$), the product is smallest (most negative), so this is the direction in which the function falls fastest.
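The conclusion can also be checked numerically by sampling many directions of the same small length and seeing which ones change the function most. This is only an illustrative sketch: the function f, its gradient grad_f, the step length and the number of samples are all assumptions made for the example.

```python
import numpy as np

def f(x):
    # Hypothetical example function (an assumption for illustration).
    return x[0] ** 2 + 3.0 * x[1] ** 2

def grad_f(x):
    # Analytic gradient of the example function above.
    return np.array([2.0 * x[0], 6.0 * x[1]])

rng = np.random.default_rng(0)
x = np.array([1.0, -2.0])
step = 1e-3  # fixed small step length shared by every direction

g = grad_f(x)
g_unit = g / np.linalg.norm(g)  # unit vector along the gradient

# Sample random unit directions in the plane.
directions = rng.normal(size=(1000, 2))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

# Change in f for a step of the same length along each direction.
changes = np.array([f(x + step * d) - f(x) for d in directions])
change_up = f(x + step * g_unit) - f(x)    # step along the gradient
change_down = f(x - step * g_unit) - f(x)  # step against the gradient

best = directions[np.argmax(changes)]   # sampled direction of largest rise
worst = directions[np.argmin(changes)]  # sampled direction of largest fall
print(best @ g_unit, worst @ g_unit)    # cosines with the gradient direction
print(change_up, changes.max(), change_down, changes.min())
```

In this sketch the best sampled direction is nearly parallel to the gradient and the worst is nearly antiparallel, up to sampling resolution and the small second-order term.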