Series Article Directory
[Artificial Intelligence Study Notes] Mathematics in Artificial Intelligence - Overview
[Mathematics in Artificial Intelligence] Differential Calculus of One Variable Function
[Mathematics in Artificial Intelligence] Basic Linear Algebra
[Mathematics in Artificial Intelligence] Multivariate Function Differential
Article Directory
- Series Article Directory
- Foreword
- 1. Definition of derivative
- 2. Left and right derivatives, differentiable functions
- 3. The geometric and physical meanings of derivatives
- 4. Differentiation formulas
- 5. Uses of derivatives
- 6. Higher-order derivatives
- 6. The relationship between derivative and function monotonicity
- 7. Extreme value theorem
- 8. The relationship between the derivative and the convexity of the function
- 9. Taylor expansion of functions of one variable
- Summary
Foreword
Compared with general software development, artificial intelligence requires considerably more mathematics, mainly calculus, linear algebra, probability theory, and optimization.
This article introduces the differential calculus of functions of one variable.
It is part of my notes on learning artificial intelligence. I wrote it mainly so that I can review the material later, and organizing it here counts as a second pass through it. I would be glad if it also helps you. If I have made mistakes, corrections are welcome. If anything here infringes on your rights, please contact the author to have it removed.
1. Definition of derivative
Derivative: this is the core concept of calculus. For a function f(x), let the increment Δx of the independent variable approach 0. If the limit
f'(x0) = lim_{Δx→0} [f(x0 + Δx) - f(x0)] / Δx
exists, then the function is differentiable at x0, and the value of the limit is its derivative there.
The concept of a limit is often regarded as the dividing line between elementary and advanced mathematics.
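To make the limit concrete, here is a minimal sketch that evaluates the difference quotient for smaller and smaller Δx. The function `square` and the point x0 = 3 are arbitrary choices made for illustration; they are not taken from the original notes.

```python
def difference_quotient(f, x0, dx):
    """Return [f(x0 + dx) - f(x0)] / dx, the quantity whose limit defines f'(x0)."""
    return (f(x0 + dx) - f(x0)) / dx


def square(x):
    # Illustrative function f(x) = x^2, whose exact derivative is 2x.
    return x * x


# As dx shrinks, the quotient moves toward the exact derivative 2 * 3 = 6.
for dx in (1e-1, 1e-3, 1e-5):
    print(dx, difference_quotient(square, 3.0, dx))
```

The printed values only approximate the derivative; the derivative itself is the limit, not any single quotient.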
2. Left and right derivatives, differentiable functions
Δx can approach 0 from two directions. The limit taken as Δx approaches 0 from the left is the left derivative, and the limit taken from the right is the right derivative. A function is differentiable at a point only if both exist and are equal.
For the absolute value function below, the left and right derivatives at x = 0 differ (-1 and +1), so the function is not differentiable at 0.
f(x)=|x|
The ReLU function is likewise not differentiable at 0 (left derivative 0, right derivative 1):
f(x)=max(0,x)
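The one-sided behaviour at x = 0 can be checked numerically. The sketch below uses one-sided difference quotients with a small step `h`; the helper names and the step size are assumptions made for this illustration.

```python
def left_derivative(f, x0, h=1e-6):
    """One-sided difference quotient approaching x0 from the left."""
    return (f(x0) - f(x0 - h)) / h


def right_derivative(f, x0, h=1e-6):
    """One-sided difference quotient approaching x0 from the right."""
    return (f(x0 + h) - f(x0)) / h


def relu(x):
    return max(0.0, x)


# The two one-sided values disagree at 0, so neither function is differentiable there.
print("abs :", left_derivative(abs, 0.0), right_derivative(abs, 0.0))
print("relu:", left_derivative(relu, 0.0), right_derivative(relu, 0.0))
```

Away from 0 the two quotients agree for both functions, which is why the problem is confined to that single point.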
3. The geometric and physical meanings of derivatives
Geometric meaning: the slope of the tangent line.
Physical meaning: instantaneous velocity.
4. Differentiation formulas
1. Basic functions
2. Four arithmetic operations
3. Composite functions
By combining these three kinds of rules, the derivative of any expression built from basic functions can be obtained.
1. Basic functions
Power function: (x^a)' = a * x^(a-1)
Exponential function: (e^x)' = e^x
Exponential function with base a: (a^x)' = a^x * ln a
Logarithmic function: (ln x)' = 1 / x
Logarithmic function with any base: (log_a x)' = 1 / (x * ln a)
Each of these formulas can be derived from the limit definition of the derivative in section 1. Don't worry about the derivatives of trigonometric functions; we will rarely use them.
Trigonometric functions are also awkward to work with because they are periodic. In machine learning we often want a monotonic function, whether increasing or decreasing, so periodic functions are best avoided.
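As a sanity check on the five basic function types listed above, the following sketch compares the standard closed-form derivatives with a central-difference approximation. The constant `A` (used as both exponent and base) and the evaluation point `X` are arbitrary values chosen for illustration.

```python
import math


def central_difference(f, x, h=1e-6):
    """Numerical derivative estimate: [f(x + h) - f(x - h)] / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


A = 3.0  # illustrative exponent / base (an assumption, not from the notes)
X = 2.0  # illustrative evaluation point

# Each entry pairs a function with its textbook derivative.
checks = {
    "power x^a": (lambda x: x ** A, lambda x: A * x ** (A - 1)),
    "exp e^x": (math.exp, math.exp),
    "exp a^x": (lambda x: A ** x, lambda x: A ** x * math.log(A)),
    "log ln x": (math.log, lambda x: 1 / x),
    "log base a": (lambda x: math.log(x, A), lambda x: 1 / (x * math.log(A))),
}

for name, (f, df) in checks.items():
    print(name, central_difference(f, X), df(X))
```

Each row should show two nearly identical numbers; small differences come from the finite step size and floating-point rounding.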
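2. Four arithmetic operations
Sum and difference: (u ± v)' = u' ± v'
Product: (u * v)' = u' * v + u * v'
Quotient: (u / v)' = (u' * v - u * v') / v^2, where v ≠ 0
3. Composite function differentiation rule (chain rule)
[f(g(x))]' = f'(g(x)) * g'(x)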
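The product, quotient, and chain rules can be verified the same way. This sketch assumes two illustrative functions, u(x) = x^2 and v(x) = e^x, and compares each rule with a central-difference estimate at a single point.

```python
import math


def central_difference(f, x, h=1e-6):
    """Numerical derivative estimate: [f(x + h) - f(x - h)] / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def u(x):
    return x ** 2


def du(x):
    return 2 * x


def v(x):
    return math.exp(x)


def dv(x):
    return math.exp(x)


x = 1.5  # illustrative evaluation point

# Derivatives predicted by the rules.
product_rule = du(x) * v(x) + u(x) * dv(x)
quotient_rule = (du(x) * v(x) - u(x) * dv(x)) / v(x) ** 2
chain_rule = dv(u(x)) * du(x)  # derivative of v(u(x)) = exp(x^2)

# Direct numerical estimates of the same derivatives.
numeric_product = central_difference(lambda t: u(t) * v(t), x)
numeric_quotient = central_difference(lambda t: u(t) / v(t), x)
numeric_chain = central_difference(lambda t: v(u(t)), x)

print(product_rule, numeric_product)
print(quotient_rule, numeric_quotient)
print(chain_rule, numeric_chain)
```

The chain rule is the one that matters most for neural networks, where a layer's output is fed into the next function.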
5. Uses of derivatives
- Finding extrema: the derivative is usually set to 0, which requires the closed-form derivative of the function.
- Neural networks: the derivatives of activation functions are needed. The task still comes down to finding where a derivative is 0; the function being differentiated is simply a composite function.
Example
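As one possible illustration of the second point, consider the sigmoid activation, sigmoid(x) = 1 / (1 + e^(-x)). Choosing sigmoid is an assumption made here; the notes do not say which activation function the example used. Applying the chain rule gives sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)), which the sketch compares with a numerical estimate.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Closed form obtained with the chain rule: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def central_difference(f, x, h=1e-6):
    """Numerical derivative estimate: [f(x + h) - f(x - h)] / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


for x in (-2.0, 0.0, 2.0):
    print(x, sigmoid_derivative(x), central_difference(sigmoid, x))
```

Having the derivative in closed form is what lets a training procedure evaluate it cheaply at every step.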
6. Higher-order derivatives
So far we have looked at the first derivative. Differentiating the derivative again gives the second derivative, and derivatives of order two and above are collectively called higher-order derivatives.
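Repeated differentiation is easy to see on polynomials. The sketch below stores a polynomial as a list of coefficients (an assumed representation, where `coeffs[i]` multiplies x^i) and applies the power rule repeatedly.

```python
def differentiate(coeffs):
    """Differentiate a polynomial; coeffs[i] is the coefficient of x**i."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def nth_derivative(coeffs, n):
    """Apply differentiate n times."""
    for _ in range(n):
        coeffs = differentiate(coeffs)
    return coeffs


cubic = [0, 0, 0, 1]  # x^3

print(nth_derivative(cubic, 1))  # [0, 0, 3], i.e. 3x^2
print(nth_derivative(cubic, 2))  # [0, 6], i.e. 6x
print(nth_derivative(cubic, 3))  # [6], a constant
```

Each pass lowers the degree by one, so the fourth derivative of a cubic is the empty list here, standing for the zero polynomial.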
6. The relationship between derivative and function monotonicity
If the derivative of a function is greater than 0 on an interval, the function is monotonically increasing there.
If the derivative is less than 0 on an interval, the function is monotonically decreasing there.
For example, the derivative of f(x) = x^2 is 2x, so the function decreases monotonically when x < 0 and increases monotonically when x > 0.
7. Extreme value theorem
Derivatives give us a way to find extrema: for a differentiable function, the derivative must equal 0 at an extremum.
A zero derivative is a necessary condition for an extremum, but not a sufficient one. The derivative at an extremum must be 0, but a point where the derivative is 0 is not necessarily an extremum.
For example, take f(x) = x^3: its derivative 3x^2 is 0 at x = 0, yet the function is increasing on both sides, so x = 0 is not an extremum.
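The difference between "derivative is 0" and "is an extremum" can be sketched by looking at the sign of the derivative just to the left and right of the point. The function name `classify_stationary_point` and the offset `eps` are assumptions made for this illustration, and the check is only a heuristic for well-behaved functions.

```python
def classify_stationary_point(df, x0, eps=1e-3):
    """Classify x0 (where df(x0) == 0) by the sign of df on each side."""
    left, right = df(x0 - eps), df(x0 + eps)
    if left < 0 < right:
        return "minimum"  # decreasing, then increasing
    if left > 0 > right:
        return "maximum"  # increasing, then decreasing
    return "not an extremum"  # no sign change


print(classify_stationary_point(lambda x: 2 * x, 0.0))       # f(x) = x^2
print(classify_stationary_point(lambda x: -2 * x, 0.0))      # f(x) = -x^2
print(classify_stationary_point(lambda x: 3 * x ** 2, 0.0))  # f(x) = x^3
```

For x^2 the derivative changes sign at 0, so the point is a minimum; for x^3 the derivative is positive on both sides, so the zero derivative does not produce an extremum.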
8. The relationship between the derivative and the convexity of the function
The second derivative of a function is related to its convexity. How is convexity defined?
Here is a brief review; the part on optimization methods will cover it in more detail. For now, remember that in these notes a convex function is one that bulges downward (like a bowl), and a concave function is the opposite. Whether a function is convex can be determined from the second derivative: if the second derivative is greater than 0, the function is convex.
Taking f(x) = x^2 as an example, its second derivative is 2, which is greater than 0, so it is a convex function.
A point where f'(x) = 0 is called a stationary point. A stationary point is typically where the function switches between increasing and decreasing (one side increases and the other decreases), although f(x) = x^3 at x = 0 shows that this is not guaranteed.
A point where f''(x) = 0 and the convexity changes is called an inflection point: the function is concave on one side and convex on the other.
Take f(x) = x^3 as an example: the first derivative is 3x^2 and the second derivative is 6x, so the function is concave when x < 0 and convex when x > 0, and x = 0 is the inflection point.
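The second-derivative test can be sketched numerically with a central second difference. The helper names, step size `h`, and tolerance `tol` are assumptions for this illustration, and "convex" follows the convention used in these notes (second derivative greater than 0).

```python
def second_derivative(f, x, h=1e-4):
    """Numerical estimate of f''(x): [f(x + h) - 2 f(x) + f(x - h)] / h^2."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def shape_at(f, x, tol=1e-3):
    """Label the local shape of f at x from the sign of the estimated f''(x)."""
    d2 = second_derivative(f, x)
    if d2 > tol:
        return "convex"
    if d2 < -tol:
        return "concave"
    return "undetermined"  # second derivative is (numerically) zero


def cube(x):
    return x ** 3


for x in (-1.0, 0.0, 1.0):
    print(x, second_derivative(cube, x), shape_at(cube, x))
```

For x^3 the label flips from concave to convex as x crosses 0, which is exactly the behaviour of an inflection point.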
9. Taylor expansion of functions of one variable
The Taylor expansion of a function of one variable ties together the derivatives discussed above. The Taylor expansion of multivariate functions will be covered later.
A Taylor expansion approximates a differentiable function f(x) by a polynomial. If f(x) is n times differentiable, its Taylor expansion at x = x0 is:
f(x) ≈ f(x0) + f'(x0)(x - x0) + f''(x0)/2! * (x - x0)^2 + ... + f^(n)(x0)/n! * (x - x0)^n
That is, a constant term plus a first-order term plus a second-order term, and so on up to the n-th order term, whose coefficient is 1/n! multiplied by the n-th derivative.
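A short sketch shows how the approximation improves as more terms are kept. It assumes the example function e^x expanded at x0 = 0, chosen because every derivative of e^x at 0 equals 1; the helper name `taylor_polynomial` is hypothetical.

```python
import math


def taylor_polynomial(derivatives_at_x0, x0, x):
    """Evaluate sum_k f^(k)(x0) / k! * (x - x0)^k.

    derivatives_at_x0[k] is the k-th derivative of f at x0 (k = 0 is f(x0)).
    """
    return sum(
        d / math.factorial(k) * (x - x0) ** k
        for k, d in enumerate(derivatives_at_x0)
    )


x = 0.5
for order in (1, 2, 4, 8):
    derivatives = [1.0] * (order + 1)  # all derivatives of e^x at 0 equal 1
    approx = taylor_polynomial(derivatives, 0.0, x)
    print(order, approx, abs(approx - math.exp(x)))
```

The error column shrinks quickly as the order grows, because x is close to the expansion point x0.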
Taylor expansions were very useful when we studied advanced mathematics, where they help in analyzing the properties of functions. In machine learning they are used to find the extrema of functions. The function f(x) is often very complicated, so we approximate it with a Taylor expansion. Gradient descent keeps only the first-order term of the expansion. Newton's method keeps the terms up to second order, ignores the higher-order terms, and so approximates f(x) by a quadratic function.
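The two update rules can be compared on a small example. This is a minimal one-dimensional sketch, assuming the illustrative objective f(x) = e^x - 2x (minimum at x = ln 2), a fixed learning rate, and fixed step counts; none of these choices come from the original notes.

```python
import math


def df(x):
    """First derivative of the illustrative objective f(x) = e^x - 2x."""
    return math.exp(x) - 2.0


def d2f(x):
    """Second derivative of the same objective."""
    return math.exp(x)


def gradient_descent(x, learning_rate=0.1, steps=200):
    """First-order update: move against the derivative by a fixed step size."""
    for _ in range(steps):
        x = x - learning_rate * df(x)
    return x


def newton(x, steps=10):
    """Second-order update: jump to the minimum of the local quadratic model."""
    for _ in range(steps):
        x = x - df(x) / d2f(x)
    return x


print("gradient descent:", gradient_descent(0.0))
print("newton          :", newton(0.0))
print("exact minimum   :", math.log(2.0))
```

On this objective Newton's method needs far fewer iterations, at the price of also evaluating the second derivative at every step.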
Summary
That is all for today. This article only briefly introduced the differential calculus of functions of one variable, which is fairly basic material in advanced mathematics. Next, I will cover the differential calculus of multivariate functions.