Mathematical Foundations of Deep Learning: The Chain Rule

Last time we covered the basics of derivatives and partial derivatives. Those alone are not enough, so today we look at the chain rule, which backpropagation relies on to differentiate complicated functions.

1 Composite functions

Given a function y = f(u), when u is itself expressed as u = g(x), y can be written as a function of x in the nested form y = f(g(x)). A function with this nested structure is called the composite function of f(u) and g(x).
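As a small illustration of this nesting, the sketch below composes two Python functions. The particular choices g(x) = 3x + 1 and f(u) = u² are assumptions made for this example and do not come from the original post.

```python
def g(x):
    # Inner function u = g(x); an arbitrary choice for illustration.
    return 3 * x + 1


def f(u):
    # Outer function y = f(u); also an arbitrary choice.
    return u ** 2


def composite(x):
    # y = f(g(x)): the output of g is fed into f.
    return f(g(x))


print(composite(2))  # g(2) = 7, then f(7) = 49
```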

2 The chain rule

2.1 The chain rule for single-variable functions

Given a single-variable function y = f(u), where u is itself a single-variable function u = g(x), the derivative of the composite function f(g(x)) can be found simply as follows.

$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx}$$

The formula above is called the differentiation formula for composite functions of one variable, also known as the chain rule.
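The single-variable chain rule can be checked numerically. The sketch below is only an illustration: the functions g(x) = x² and f(u) = sin(u), the evaluation point, and the step size h are all assumptions chosen for this example.

```python
import math


def g(x):
    # Inner function u = g(x), assumed for illustration.
    return x ** 2


def f(u):
    # Outer function y = f(u), assumed for illustration.
    return math.sin(u)


def chain_rule_derivative(x):
    # dy/dx = (dy/du) * (du/dx) = cos(u) * 2x, with u = x**2.
    u = g(x)
    return math.cos(u) * 2 * x


def numerical_derivative(func, x, h=1e-6):
    # Central-difference approximation of func'(x).
    return (func(x + h) - func(x - h)) / (2 * h)


x0 = 1.3  # arbitrary evaluation point
analytic = chain_rule_derivative(x0)
numeric = numerical_derivative(lambda x: f(g(x)), x0)
print(analytic, numeric)  # the two values should agree closely
```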

Look at the right-hand side of the formula. If dx, dy, and du are each treated as a single symbol, the left-hand side can be seen as the result of simply cancelling the fractions on the right-hand side. This view always holds. Writing derivatives in terms of dx, dy, and so on gives an easy way to remember the chain rule: the derivative of a composite function can be cancelled like a fraction. However, this cancellation rule does not apply to expressions involving squares of dx, dy, and the like.

Let's try differentiating the composite of the sigmoid function with wx + b.

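With u = wx + b and y = 1 / (1 + e^{-u}):

$$\frac{dy}{du} = y(1 - y), \qquad \frac{du}{dx} = w$$

$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx} = w\,y(1 - y)$$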
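For the sigmoid composed with u = wx + b, the chain rule gives dy/dx = w · y(1 − y), where y is the sigmoid output. The sketch below compares that expression with a finite-difference estimate. The values of w, b, and x0 are hypothetical and were picked only to make the check easy to follow.

```python
import math


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def dy_dx(x, w, b):
    # Chain rule: dy/dx = (dy/du) * (du/dx) = y * (1 - y) * w, with u = w*x + b.
    y = sigmoid(w * x + b)
    return y * (1.0 - y) * w


# Hypothetical parameters: here u = 2.0 * 0.5 - 1.0 = 0, so y = 0.5.
w, b, x0 = 2.0, -1.0, 0.5
h = 1e-6

analytic = dy_dx(x0, w, b)
# Central-difference estimate of the same derivative.
numeric = (sigmoid(w * (x0 + h) + b) - sigmoid(w * (x0 - h) + b)) / (2 * h)
print(analytic, numeric)  # analytic is 0.5 * 0.5 * 2.0 = 0.5
```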

2.2 The chain rule for multivariable functions

The idea behind the chain rule also applies to multivariable functions. In essence, the derivative expressions can be manipulated as if they were fractions. Things are not quite that simple, though, because the chain rule has to be applied to every relevant variable.

Let's look at the two-variable case. When z is a function of the variables u and v, and u and v are each functions of x and y, then z is a function of x and y, and the following multivariable chain rule holds.

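$$\frac{\partial z}{\partial x} = \frac{\partial z}{\partial u}\,\frac{\partial u}{\partial x} + \frac{\partial z}{\partial v}\,\frac{\partial v}{\partial x}$$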

The variable z is a function of u and v, and u and v are each functions of x and y. To differentiate z with respect to x, first differentiate with respect to u and v, multiply each result by the corresponding derivative with respect to x, and finally add the products.

The same applies when differentiating z with respect to y: the analogous formula holds.

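$$\frac{\partial z}{\partial y} = \frac{\partial z}{\partial u}\,\frac{\partial u}{\partial y} + \frac{\partial z}{\partial v}\,\frac{\partial v}{\partial y}$$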
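The "multiply, then add" pattern of the two-variable chain rule can also be checked numerically. In the sketch below, the functions u = xy, v = x + y, and z = u² + sin(v) are assumptions chosen for illustration, as are the evaluation point and the step size.

```python
import math


# Hypothetical example functions: u = x*y, v = x + y, z = u**2 + sin(v).
def u(x, y):
    return x * y


def v(x, y):
    return x + y


def z(x, y):
    return u(x, y) ** 2 + math.sin(v(x, y))


def dz_dx(x, y):
    # dz/dx = (dz/du)(du/dx) + (dz/dv)(dv/dx) = 2u * y + cos(v) * 1
    return 2 * u(x, y) * y + math.cos(v(x, y)) * 1


def dz_dy(x, y):
    # dz/dy = (dz/du)(du/dy) + (dz/dv)(dv/dy) = 2u * x + cos(v) * 1
    return 2 * u(x, y) * x + math.cos(v(x, y)) * 1


x0, y0 = 0.7, -1.2  # arbitrary evaluation point
h = 1e-6

# Central-difference estimates of the two partial derivatives.
num_dx = (z(x0 + h, y0) - z(x0 - h, y0)) / (2 * h)
num_dy = (z(x0, y0 + h) - z(x0, y0 - h)) / (2 * h)
print(dz_dx(x0, y0), num_dx)
print(dz_dy(x0, y0), num_dy)
```

Each partial derivative sums one product per intermediate variable, which is why the chain rule has to be applied to every relevant variable.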

Origin blog.csdn.net/wulishinian/article/details/104856488