Gradient-related concepts

1. Basic concepts
    Directional derivative: a number; it gives the rate of change of f(x, y) at the point P0 in the direction v.

    Partial derivatives: several numbers (one per variable); each is the directional derivative of a multivariable function along a coordinate axis, so a function of two variables has two partial derivatives.

    Partial derivative function: a function; it maps each point to the value of the partial derivative at that point.

    Gradient: a vector; each component is the partial derivative of the function with respect to one variable; it has both a magnitude (equal to the maximum directional derivative) and a direction.

2. The directional derivative
    It gives the rate of change of f(x, y) at the point P0 in the direction v.

    Examples are as follows:

    

2.0 Directional derivative formula
   

2.1 Partial derivatives
     

2.2 Geometric meaning of the partial derivatives of a function of two variables
     

     

2.3 Partial derivative functions

    Relationship between the partial derivative and the partial derivative function:

    A partial derivative is the value of the partial derivative function at a given point. To compute it, first find the partial derivative function, then substitute the point into it; the result is the partial derivative at that point.
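As a small illustration of "find the partial derivative function first, then substitute the point", here is a minimal sketch. The function f(x, y) = x^2 * y + y^3 and the point (1, 2) are hypothetical examples chosen for this note; they are not from the original post.

```python
# Hypothetical example function (an assumption for illustration):
# f(x, y) = x^2 * y + y^3

def f(x, y):
    return x**2 * y + y**3

def f_x(x, y):
    """Partial derivative function with respect to x, derived by hand: 2xy."""
    return 2 * x * y

def f_y(x, y):
    """Partial derivative function with respect to y, derived by hand: x^2 + 3y^2."""
    return x**2 + 3 * y**2

# Substituting a point into the partial derivative functions yields numbers:
# the partial derivatives at that point.
x0, y0 = 1.0, 2.0
print(f_x(x0, y0))  # 4.0
print(f_y(x0, y0))  # 13.0
```

The two functions f_x and f_y are the partial derivative functions; the two printed numbers are the partial derivatives at (1, 2).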

     

3. The total differential
   

4. Gradient
      The gradient is a vector; it has both a magnitude and a direction.
    

 

 

 

4.1 Geometric meaning

    At the point P0, the rate of change of the function z = f(x, y) (i.e., the directional derivative) is largest in the direction of the gradient.

    The direction of the gradient is the direction in which f(x, y) increases fastest at that point, and the modulus of the gradient is the maximum value of the directional derivative.
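The claim can be checked numerically. The sketch below samples many directions around a point, estimates the rate of change in each with a central difference, and compares the best direction with the gradient. The surface f(x, y) = x^2 + 3xy, the point (1, 1), and all helper names are assumptions made for this illustration.

```python
import math

def f(x, y):
    # Hypothetical example surface, chosen only for illustration.
    return x**2 + 3 * x * y

def grad_f(x, y):
    # Hand-derived partial derivatives of the example: (2x + 3y, 3x).
    return (2 * x + 3 * y, 3 * x)

def rate_of_change(x, y, theta, h=1e-5):
    # Central-difference estimate of the slope of f at (x, y)
    # along the unit vector (cos(theta), sin(theta)).
    dx, dy = h * math.cos(theta), h * math.sin(theta)
    return (f(x + dx, y + dy) - f(x - dx, y - dy)) / (2 * h)

x0, y0 = 1.0, 1.0
n = 3600  # one sample every 0.1 degree
thetas = [2 * math.pi * k / n for k in range(n)]

# Direction with the largest estimated rate of change.
best_theta = max(thetas, key=lambda t: rate_of_change(x0, y0, t))
best_value = rate_of_change(x0, y0, best_theta)

gx, gy = grad_f(x0, y0)
grad_angle = math.atan2(gy, gx)   # direction of the gradient
grad_norm = math.hypot(gx, gy)    # modulus of the gradient

print(math.degrees(best_theta), math.degrees(grad_angle))
print(best_value, grad_norm)
```

Up to the 0.1 degree sampling resolution, the best sampled direction coincides with the gradient direction, and the largest rate of change matches the modulus of the gradient.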

    

 

 

I first met the concept of gradient descent while studying machine learning algorithms, many of which are trained with gradient descent. Textbooks and teachers all say that moving in the direction opposite to the gradient makes the function value decrease fastest, but when asked for the reason, many people cannot explain it clearly. So I have put together my own understanding and prove this conclusion starting from the directional derivative, so that we know not only that it is so but also why ~

I will not mention the concept of the gradient at first. Below I work through things according to my own understanding and introduce the origin of the gradient step by step:

  • Derivative

 

Many people are probably familiar with the geometric meaning of the derivative: when both the domain and the range of a function are real numbers, the derivative is the slope of the tangent line to the function's curve. Besides the tangent slope, the derivative also gives the rate of change of the function at that point.

 

$$f'(x_0) = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x}$$

The equation above is illustrated by the following picture:

 

(From Wikipedia)

Put plainly, the derivative is the ratio of the change in the function value to the change in the independent variable, as the change in the independent variable becomes infinitesimally small. Geometrically, it is the slope of the tangent at the point; physically, it is the (instantaneous) rate of change at that moment ...

Note that a function of one variable has only one independent variable, so there is only one direction along which a rate of change can be taken. This is why functions of one variable have no partial derivatives.
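To make the "ratio of changes" concrete, here is a minimal sketch that evaluates the difference quotient for shrinking steps. The function f(x) = x^2 and the point x0 = 3 are hypothetical choices for illustration; the exact derivative there is 6.

```python
def f(x):
    # Hypothetical example function: f(x) = x^2, whose derivative is 2x.
    return x * x

def difference_quotient(func, x0, dx):
    # Ratio of the change in the function value to the change in x.
    return (func(x0 + dx) - func(x0)) / dx

x0 = 3.0
for dx in (1.0, 0.1, 0.01, 0.001):
    # The printed ratios approach the derivative 2 * x0 = 6 as dx shrinks.
    print(dx, difference_quotient(f, x0, dx))
```

The smaller the step, the closer the ratio gets to the tangent slope at x0.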

  • Partial derivative

 

Partial derivatives involve at least two independent variables; take two as an example, z = f(x, y). Going from derivatives to partial derivatives means going from a curve to a surface. At a point on a curve there is only one tangent line, but at a point on a surface there are infinitely many.

The partial derivatives we are talking about are the rates of change of a multivariable function along the coordinate axes.

$f_x(x, y)$ is the rate of change of the function value along the x-axis direction while y is held constant;

$f_y(x, y)$ is the rate of change of the function value along the y-axis direction while x is held constant.

The corresponding picture is as follows:

 

So what is the geometric meaning of the partial derivatives?

  • The partial derivative $f_x(x_0, y_0)$ is the slope, with respect to the x-axis, of the tangent line at the point $(x_0, y_0, f(x_0, y_0))$ to the curve obtained by cutting the surface with the plane $y = y_0$
  • The partial derivative $f_y(x_0, y_0)$ is the slope, with respect to the y-axis, of the tangent line at the point $(x_0, y_0, f(x_0, y_0))$ to the curve obtained by cutting the surface with the plane $x = x_0$
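The "cut the surface with a plane, then take the tangent slope of the resulting curve" picture can be sketched in code. The surface f(x, y) = x^2 * y + y^3, the point (1, 2), and the helper names are assumptions made for this illustration.

```python
def f(x, y):
    # Hypothetical example surface (not from the original post).
    return x**2 * y + y**3

def slice_along_x(y0):
    # Curve obtained by cutting the surface with the plane y = y0.
    return lambda x: f(x, y0)

def slice_along_y(x0):
    # Curve obtained by cutting the surface with the plane x = x0.
    return lambda y: f(x0, y)

def slope(curve, t0, h=1e-6):
    # Central-difference estimate of the tangent slope of a 1-D curve at t0.
    return (curve(t0 + h) - curve(t0 - h)) / (2 * h)

x0, y0 = 1.0, 2.0
slope_x = slope(slice_along_x(y0), x0)  # estimates f_x(x0, y0) = 2 * x0 * y0
slope_y = slope(slice_along_y(x0), y0)  # estimates f_y(x0, y0) = x0**2 + 3 * y0**2
print(slope_x, slope_y)
```

Each slice is an ordinary one-variable curve, so its tangent slope is an ordinary derivative; that slope is the partial derivative of the surface at the point.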

 

By now the reader may have noticed the limitation of partial derivatives: they only give the rate of change of a multivariable function along the coordinate axes, whereas we often need the rate of change in an arbitrary direction. This leads to the directional derivative.

 

 

  • Directional derivative

 

We finally arrive at the main event, the directional derivative. Let us approach it step by step.

Suppose you are standing on a hillside and want to know its slope (steepness).

The hillside is shown in the following figure:

 

Suppose the hillside is described by $z = f(x, y)$. You should already know how to find the slope in the two main directions.

The slope in the y-direction is obtained by taking the partial derivative with respect to y.

Similarly, the slope in the x-direction is obtained by taking the partial derivative with respect to x.

We can then use these two partial derivatives to obtain the slope in any direction (much as every vector in a plane can be expressed in terms of two basis vectors).

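Now suppose we want the slope in the direction $u$. What do we do? Let $z = f(x, y)$ be a surface, $p(x_0, y_0)$ a point in the domain of $f$, and $u = \cos\theta\, i + \sin\theta\, j$ a unit vector, where $\theta$ is the angle between this vector and the positive $x$-axis. The unit vector $u$ can represent the direction of any directional derivative, as shown in the figure below: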

 

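So let us consider how to find the slope in the direction $u$. By analogy with the definition of the derivative above, we obtain the following:

Let $f(x, y)$ be a function of two variables and $u = \cos\theta\, i + \sin\theta\, j$ a unit vector. If the following limit exists,

$$\lim_{t \to 0} \frac{f(x_0 + t\cos\theta,\ y_0 + t\sin\theta) - f(x_0, y_0)}{t}$$

and we denote it by $D_u f$,

then this limit is called the directional derivative of $f$ along the direction $u$. As $\theta$ varies, we can obtain the directional derivative in any direction. This also shows what the directional derivative is for: it lets us consider the rate of change of a function in any direction.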

 

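When computing a directional derivative, besides using the definition above, we can also use partial derivatives to simplify the calculation.

The expression is $D_u f(x, y) = f_x(x, y)\cos\theta + f_y(x, y)\sin\theta$ (why this holds is covered in many references and is not the focus here).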
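Without proving the expression, we can at least check it numerically against the limit definition. The sketch below does this for a hypothetical function f(x, y) = sin(x) * y^2 at the point (0.5, 1.5); the function, the point, and the helper names are assumptions made for this illustration, and the limit is approximated with a small but finite t.

```python
import math

def f(x, y):
    # Hypothetical example function.
    return math.sin(x) * y**2

def f_x(x, y):
    # Hand-derived partial derivative with respect to x.
    return math.cos(x) * y**2

def f_y(x, y):
    # Hand-derived partial derivative with respect to y.
    return 2 * math.sin(x) * y

def by_definition(x0, y0, theta, t=1e-6):
    # Difference quotient from the limit definition, with a small finite t.
    moved = f(x0 + t * math.cos(theta), y0 + t * math.sin(theta))
    return (moved - f(x0, y0)) / t

def by_formula(x0, y0, theta):
    # Partial-derivative expression: f_x * cos(theta) + f_y * sin(theta).
    return f_x(x0, y0) * math.cos(theta) + f_y(x0, y0) * math.sin(theta)

x0, y0 = 0.5, 1.5
for degrees in (0, 30, 90, 200):
    theta = math.radians(degrees)
    print(degrees, by_definition(x0, y0, theta), by_formula(x0, y0, theta))
```

For theta = 0 the formula reduces to f_x and for theta = 90 degrees to f_y, which matches the idea that partial derivatives are directional derivatives along the axes.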

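There are infinitely many directions in the plane, so along which direction does the function change fastest?

For now I set the gradient aside and simply write out the expression:

$$D_u f(x, y) = f_x(x, y)\cos\theta + f_y(x, y)\sin\theta$$

Let $A = (f_x(x, y), f_y(x, y))$ and $I = (\cos\theta, \sin\theta)$.

Then we obtain:

$D_u f(x, y) = A \cdot I = |A|\,|I|\cos\alpha$ (where $\alpha$ is the angle between the vector $A$ and the vector $I$)

For $D_u f(x, y)$ to reach its maximum, $\alpha$ must be 0 degrees. That is, the vector $I$ (whose direction keeps varying as we search for the direction of fastest change) must point in the same direction as the vector $A$ (which is fixed once the point is fixed). The directional derivative is then largest, equal to $|A|$ because $|I| = 1$. In other words, for a unit step, the function value changes fastest in this direction.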
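The dot-product form can be illustrated with a few concrete angles. In this sketch the function f(x, y) = x^2 + y^2 and the point (3, 4) are hypothetical choices, so that the vector of partial derivatives is (6, 8) with length 10; the helper names are also assumptions.

```python
import math

def partials(x, y):
    # Vector of partial derivatives (f_x, f_y) for the hypothetical
    # example f(x, y) = x^2 + y^2.
    return (2 * x, 2 * y)

def directional_derivative(x, y, theta):
    # Dot product of (f_x, f_y) with the unit vector (cos(theta), sin(theta)).
    ax, ay = partials(x, y)
    return ax * math.cos(theta) + ay * math.sin(theta)

x0, y0 = 3.0, 4.0
ax, ay = partials(x0, y0)
length = math.hypot(ax, ay)    # length of (f_x, f_y), here 10
base_angle = math.atan2(ay, ax)  # direction of (f_x, f_y)

for alpha_deg in (0, 45, 90, 135, 180):
    alpha = math.radians(alpha_deg)
    # Rotate the unit vector by alpha away from the direction of (f_x, f_y).
    value = directional_derivative(x0, y0, base_angle + alpha)
    print(alpha_deg, value, length * math.cos(alpha))
```

The two printed columns agree for every angle: the value is largest at 0 degrees, zero at 90 degrees, and most negative at 180 degrees.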

Well, now we have found the direction in which the function value changes fastest: the direction of the vector $A = (f_x(x, y), f_y(x, y))$. I name this vector $A$ the gradient (once the point is fixed, the direction of the gradient is fixed too). This explains why the gradient direction is the direction of the maximum rate of change of the function!!! (The gradient is simply the name given to the direction in which the function changes fastest.) Along $A$ the function value increases fastest. In the opposite direction the angle is 180 degrees and the directional derivative takes its most negative value, $-|A|$, which is why moving against the gradient makes the function value decrease fastest.
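To connect this back to gradient descent, here is a minimal sketch of the update rule that repeatedly steps against the gradient. The objective function, the starting point, the learning rate, and the number of steps are all hypothetical values chosen for illustration; this is not a description of any particular library's optimizer.

```python
def f(x, y):
    # Hypothetical objective: a bowl with its minimum at (1, -2).
    return (x - 1) ** 2 + (y + 2) ** 2

def gradient(x, y):
    # Hand-derived partial derivatives of the objective.
    return (2 * (x - 1), 2 * (y + 2))

def gradient_descent(x, y, learning_rate=0.1, steps=100):
    for _ in range(steps):
        gx, gy = gradient(x, y)
        # Step against the gradient, the direction of fastest decrease.
        x -= learning_rate * gx
        y -= learning_rate * gy
    return x, y

x_min, y_min = gradient_descent(5.0, 5.0)
print(x_min, y_min, f(x_min, y_min))
```

With these assumed settings the iterates move toward the minimum at (1, -2), and the function value drops well below its value at the starting point.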


Origin www.cnblogs.com/psztswcbyy/p/11592741.html