[Mathematics in Artificial Intelligence] Differential Calculus of Functions of One Variable

Series Article Directory

[Artificial Intelligence Study Notes] Mathematics in Artificial Intelligence - Overview
[Mathematics in Artificial Intelligence] Differential Calculus of Functions of One Variable
[Mathematics in Artificial Intelligence] Basic Linear Algebra
[Mathematics in Artificial Intelligence] Differential Calculus of Multivariate Functions



Foreword

Compared with general software development, artificial intelligence requires considerably more mathematics, mainly calculus, linear algebra, probability theory, and optimization.
This article introduces the differential calculus of functions of one variable.
It serves as my study notes on artificial intelligence, written mainly so that I can review the material later; organizing it here counts as studying it a second time. I would be honored if it also helps you. If I have made mistakes, corrections are welcome. If anything here infringes on your rights, please contact me and I will remove it.


1. Definition of derivative

Derivative: this is the core concept of differential calculus. If the following limit exists as the increment $\Delta x$ of the independent variable approaches 0, the function is differentiable at $x_0$, and the limit is its derivative there:
$f'(x_0) = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x}$
The limit is often regarded as the dividing line between elementary and advanced mathematics.
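The limit in the definition can be explored numerically. The sketch below is only an illustration: the function name `difference_quotient` and the example function f(x) = x^2 are assumptions chosen for the demonstration, not part of the original notes.

```python
def difference_quotient(f, x0, dx):
    """Slope of the secant line through (x0, f(x0)) and (x0 + dx, f(x0 + dx))."""
    return (f(x0 + dx) - f(x0)) / dx


def square(x):
    return x * x


# As dx shrinks, the quotient for f(x) = x^2 at x0 = 3 approaches f'(3) = 6.
for dx in (1.0, 0.1, 0.01, 0.001):
    print(dx, difference_quotient(square, 3.0, dx))
```

For this particular function the quotient works out to 6 + dx, so the printed values move toward 6 as dx gets smaller.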

2. Left and right derivatives, differentiable functions

The increment can approach 0 from two directions: approaching from the left gives the left derivative, and approaching from the right gives the right derivative. A function is differentiable at a point only when both one-sided derivatives exist and are equal.
For the absolute value function
f(x)=|x|
the left derivative at 0 is -1 and the right derivative is +1, so the function is not differentiable at x = 0.
The ReLU function
max(0,x)
behaves the same way at 0: its left derivative there is 0 and its right derivative is 1.
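The one-sided derivatives of both functions at 0 can be approximated with a small step taken to the left or to the right. This is a rough sketch; the helper name `one_sided_derivative` and the step size are assumptions made for illustration.

```python
def one_sided_derivative(f, x0, side, h=1e-6):
    """Approximate the left (side=-1) or right (side=+1) derivative of f at x0."""
    dx = side * h
    return (f(x0 + dx) - f(x0)) / dx


def relu(x):
    return max(0.0, x)


left_abs = one_sided_derivative(abs, 0.0, -1)     # slope of |x| just left of 0
right_abs = one_sided_derivative(abs, 0.0, +1)    # slope of |x| just right of 0
left_relu = one_sided_derivative(relu, 0.0, -1)   # slope of ReLU just left of 0
right_relu = one_sided_derivative(relu, 0.0, +1)  # slope of ReLU just right of 0

print(left_abs, right_abs, left_relu, right_relu)
```

Because the left and right values disagree for each function, neither has a derivative at 0 in the strict sense.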

3. The geometric and physical meanings of derivatives

Geometric meaning: the derivative at a point is the slope of the tangent line to the curve at that point.
Physical meaning: the derivative of position with respect to time is the instantaneous speed.

4. Differentiation formulas

1. Basic functions
2. Four arithmetic operations
3. Composite functions

By combining these three groups of rules, the derivative of any elementary function can be computed.

1. Basic functions

Power function
$(x^a)' = a x^{a-1}$
Exponential function
$(e^x)' = e^x$
Exponential function with base a
$(a^x)' = a^x \ln a$
Logarithmic function
$(\ln x)' = \frac{1}{x}$
Logarithmic function with any base
$(\log_a x)' = \frac{1}{x \ln a}$
These formulas can all be derived from the limit definition of the derivative. There is no need to worry about the derivatives of the trigonometric functions, since we will rarely use them.
Trigonometric functions are also awkward to work with because they are periodic. In machine learning we often need a function to be monotonic, either increasing or decreasing, so periodic functions are best avoided.
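As a sanity check, the standard derivative formulas for the function families listed in this subsection (power, exponential, logarithm) can be compared against a finite-difference estimate. The sketch below assumes the textbook formulas written in the comments; the evaluation point x = 2 and the value a = 3 are arbitrary illustrative choices.

```python
import math


def central_difference(f, x, h=1e-6):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x = 2.0
a = 3.0  # illustrative exponent and base, chosen arbitrarily

# name -> (function, value of the textbook derivative formula at x)
checks = {
    "x^a": (lambda t: t ** a, a * x ** (a - 1)),                   # (x^a)' = a x^(a-1)
    "e^x": (math.exp, math.exp(x)),                                # (e^x)' = e^x
    "a^x": (lambda t: a ** t, a ** x * math.log(a)),               # (a^x)' = a^x ln a
    "ln x": (math.log, 1 / x),                                     # (ln x)' = 1/x
    "log_a x": (lambda t: math.log(t, a), 1 / (x * math.log(a))),  # (log_a x)' = 1/(x ln a)
}

for name, (f, formula_value) in checks.items():
    print(name, central_difference(f, x), formula_value)
```

Each row should show two nearly identical numbers, one from the numerical estimate and one from the formula.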

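2. Four arithmetic operations

Differentiation rules for addition, subtraction, multiplication, and division:
$(u \pm v)' = u' \pm v'$
$(uv)' = u'v + uv'$
$\left(\frac{u}{v}\right)' = \frac{u'v - uv'}{v^2}, \quad v \neq 0$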
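The product and quotient rules can be checked numerically. In this sketch the functions u(x) = x^2 and v(x) = e^x are assumptions picked only because their derivatives are easy to write down; the rules themselves are the standard ones, (uv)' = u'v + uv' and (u/v)' = (u'v - uv')/v^2.

```python
import math


def central_difference(f, x, h=1e-6):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# u(x) = x^2 and v(x) = e^x, together with their known derivatives.
u, du = (lambda x: x ** 2), (lambda x: 2 * x)
v, dv = math.exp, math.exp

x = 1.5

# Values predicted by the product and quotient rules.
product_rule = du(x) * v(x) + u(x) * dv(x)
quotient_rule = (du(x) * v(x) - u(x) * dv(x)) / v(x) ** 2

# Direct numerical derivatives of u*v and u/v for comparison.
product_numeric = central_difference(lambda t: u(t) * v(t), x)
quotient_numeric = central_difference(lambda t: u(t) / v(t), x)

print(product_rule, product_numeric)
print(quotient_rule, quotient_numeric)
```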

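3. Composite functions (chain rule)

The derivative of a composite function is the derivative of the outer function, evaluated at the inner function, multiplied by the derivative of the inner function:
$\left(f(g(x))\right)' = f'(g(x)) \, g'(x)$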
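A common composite function in neural networks is the sigmoid, which is used here purely as an illustrative assumption since the original notes do not name a specific function. Writing it as f(g(x)) with outer function f(u) = 1/u and inner function g(x) = 1 + e^(-x), the rule for composite functions gives its derivative directly.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative_chain_rule(x):
    """d/dx f(g(x)) = f'(g(x)) * g'(x) with f(u) = 1/u and g(x) = 1 + e^(-x)."""
    g = 1.0 + math.exp(-x)
    df_dg = -1.0 / g ** 2   # derivative of the outer function 1/u, evaluated at g
    dg_dx = -math.exp(-x)   # derivative of the inner function 1 + e^(-x)
    return df_dg * dg_dx


def central_difference(f, x, h=1e-6):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x = 0.7
print(sigmoid_derivative_chain_rule(x))
print(central_difference(sigmoid, x))
print(sigmoid(x) * (1.0 - sigmoid(x)))  # the well-known closed form s(x)(1 - s(x))
```

All three printed values should agree to several decimal places.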

5. The use of derivatives

  1. Finding extreme values: the derivative is usually set to 0 and the resulting equation solved, which requires the explicit form of the derivative function.
  2. Neural networks: training needs the derivatives of the activation functions. The goal is still to find where a derivative is 0, only now the function being differentiated is a composite function.

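To illustrate the first use, the sketch below finds the minimum of a function by solving f'(x) = 0. The function f(x) = x^2 - 4x + 5, the bisection helper `bisect_root`, and the search interval are all assumptions made for the demonstration and do not come from the original notes.

```python
def f(x):
    return x ** 2 - 4 * x + 5


def df(x):
    # Derivative of f, obtained from the power rule.
    return 2 * x - 4


def bisect_root(g, lo, hi, tol=1e-10):
    """Find a root of g in [lo, hi], assuming g(lo) and g(hi) have opposite signs."""
    if g(lo) * g(hi) > 0:
        raise ValueError("g must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid  # the sign change is in the left half
        else:
            lo = mid  # the sign change is in the right half
    return (lo + hi) / 2


# Solve f'(x) = 0 on [0, 5]; for this f the solution is x = 2, where f(2) = 1.
x_star = bisect_root(df, 0.0, 5.0)
print(x_star, f(x_star))
```

For a simple quadratic the equation 2x - 4 = 0 can of course be solved by hand; the numerical solver only stands in for cases where the derivative is harder to invert.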

6. Higher order derivatives

What we have covered so far is the first-order derivative. Differentiating the derivative again gives the second-order derivative, $f''(x) = (f'(x))'$, and derivatives of second order and above are collectively called higher-order derivatives.
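A second derivative can also be estimated numerically. The following sketch uses the standard central-difference formula; the helper name `second_derivative`, the step size, and the test function f(x) = x^3 are illustrative assumptions.

```python
def second_derivative(f, x, h=1e-4):
    """Central-difference approximation of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def cube(x):
    return x ** 3


# For f(x) = x^3: f'(x) = 3x^2 and f''(x) = 6x, so f''(2) = 12.
print(second_derivative(cube, 2.0))
```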

6. The relationship between derivative and function monotonicity

If the derivative of a function is greater than 0 on an interval, the function is monotonically increasing there.
If the derivative is less than 0 on an interval, the function is monotonically decreasing there.
For example, the derivative of $f(x) = x^2$ is $2x$, so the function decreases monotonically when x < 0 and increases monotonically when x > 0.
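The link between the sign of the derivative and monotonicity can be seen on a few sample points. This is a small sketch for f(x) = x^2 with derivative 2x; the sample points and helper names are assumptions chosen for illustration.

```python
def f(x):
    return x ** 2


def df(x):
    return 2 * x


def is_strictly_increasing(values):
    return all(a < b for a, b in zip(values, values[1:]))


def is_strictly_decreasing(values):
    return all(a > b for a, b in zip(values, values[1:]))


negative_xs = [-3.0, -2.0, -1.0, -0.5]  # here the derivative 2x is negative
positive_xs = [0.5, 1.0, 2.0, 3.0]      # here the derivative 2x is positive

print(is_strictly_decreasing([f(x) for x in negative_xs]))
print(is_strictly_increasing([f(x) for x in positive_xs]))
```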

7. Extreme value theorem

The derivative gives us a way to find extrema: for a differentiable function, the derivative must equal 0 at an extremum.
$f'(x_0) = 0$
This is a necessary condition but not a sufficient one: the derivative at an extremum must equal 0, but a point where the derivative equals 0 is not necessarily an extremum.
For example, $f(x) = x^3$ has $f'(0) = 0$, yet the function is increasing on both sides of 0, so x = 0 is not an extremum.
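The cube function makes the gap between "necessary" and "sufficient" concrete. In this short sketch the step `eps` is an arbitrary small number chosen for illustration.

```python
def f(x):
    return x ** 3


def df(x):
    return 3 * x ** 2


eps = 1e-3  # small illustrative step on either side of 0

# The derivative vanishes at 0 ...
print(df(0.0))
# ... but f is lower just to the left and higher just to the right,
# so 0 is neither a local minimum nor a local maximum.
print(f(-eps), f(0.0), f(eps))
```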

8. The relationship between the derivative and the concave-convexity of the function

The second derivative of a function determines its concavity and convexity. How are these defined?
Here is a brief review; the details come later, in the material on optimization methods. For now, remember that a convex function bulges downward and a concave function is the opposite. Whether a function is convex can be determined from its second derivative: if the second derivative is greater than 0, the function is convex.
Taking $f(x) = x^2$ as an example, its second derivative is 2, which is greater than 0, so it is a convex function.
A point where $f'(x) = 0$ is called a stationary point. A stationary point is typically where the function switches between increasing and decreasing: it increases on one side and decreases on the other.
A point where $f''(x) = 0$ and the concavity changes is called an inflection point: the function is concave on one side and convex on the other.
Taking $f(x) = x^3$ as an example, the first derivative is $3x^2$ and the second derivative is $6x$, so the function is concave when x < 0 and convex when x > 0.
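The second-derivative test described here can be written as a tiny classifier. The sketch follows the convention used in these notes (second derivative greater than 0 means convex); the function name `classify_curvature` and the returned labels are assumptions made for illustration.

```python
def classify_curvature(d2f, x):
    """Classify the local shape of a function from its second derivative d2f at x."""
    value = d2f(x)
    if value > 0:
        return "convex"
    if value < 0:
        return "concave"
    return "second derivative is 0 (possible inflection point)"


def d2_square(x):
    return 2.0    # second derivative of x^2


def d2_cube(x):
    return 6 * x  # second derivative of x^3


print(classify_curvature(d2_square, 1.0))
print(classify_curvature(d2_cube, -1.0))
print(classify_curvature(d2_cube, 1.0))
print(classify_curvature(d2_cube, 0.0))
```

A zero second derivative alone does not prove an inflection point; the concavity also has to change sign across the point, as it does for x^3 at 0.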

9. Taylor expansion of functions of one variable

$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n + R_n(x)$
The Taylor expansion ties together the derivatives discussed above. The Taylor expansion of multivariate functions will be covered later.
A Taylor expansion approximates a function f(x) by a polynomial. If f(x) is n times differentiable, it can be expanded at x = x0 as the constant term, plus the first-order term, plus the second-order term, and so on up to the n-th order term, whose coefficient is the n-th derivative at x0 divided by n factorial.
Taylor expansion was already very useful in advanced mathematics, where it is used to study properties of functions. In machine learning it is used to find the extreme values of functions. The function f(x) is often very complicated, so we approximate it with a Taylor expansion. Gradient descent keeps only the terms up to first order. Newton's method keeps the terms up to second order, ignores everything above, and so uses a quadratic function to approximate f(x).
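The sketch below illustrates both ideas under stated assumptions. The first part builds Taylor polynomials of e^x, chosen because every derivative of e^x is again e^x. The second part takes one gradient descent step and one Newton step on the hypothetical function f(x) = (x - 2)^2 + 1; the learning rate 0.1 is an arbitrary hyperparameter, not something given in the notes.

```python
import math


def taylor_exp(x, x0, order):
    """Taylor polynomial of e^x about x0, truncated after the given order.

    Every derivative of e^x is e^x, so the k-th coefficient is e^x0 / k!.
    """
    return sum(math.exp(x0) / math.factorial(k) * (x - x0) ** k
               for k in range(order + 1))


# Higher orders give a closer approximation of e^0.5.
for order in (1, 2, 4, 8):
    approx = taylor_exp(0.5, 0.0, order)
    print(order, approx, abs(approx - math.exp(0.5)))


# Illustrative objective with its first and second derivatives.
def f(x):
    return (x - 2) ** 2 + 1


def df(x):
    return 2 * (x - 2)


def d2f(x):
    return 2.0


def gradient_descent_step(x, learning_rate=0.1):
    """First-order model: move a small distance against the derivative."""
    return x - learning_rate * df(x)


def newton_step(x):
    """Second-order model: jump to the minimum of the local quadratic approximation."""
    return x - df(x) / d2f(x)


print(gradient_descent_step(0.0))
print(newton_step(0.0))
```

Because this f is itself quadratic, its second-order Taylor expansion is exact and a single Newton step lands on the minimizer x = 2, while gradient descent only moves part of the way there.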


Summary

That is all for today. This article briefly introduced the differential calculus of functions of one variable, which is among the more basic topics in advanced mathematics. A later article will cover the differential calculus of multivariate functions.

Origin blog.csdn.net/guigenyi/article/details/129762923