1. Mathematics basics before machine learning

you say spring is too short

haven't had time to see myself

Will be smashed into a feasting summer

then bloom

fuck

1. Sum and product

1.1 Summing

Suppose we want to write out, on paper, the sum of the integers from 1 to 100:

1 + 2 + 3 + 4 + 5 + ... + 99 + 100

This can be shortened with summation notation, using the symbol Σ (pronounced "sigma"):

When the number of terms to add is not fixed in advance:

Summation notation can also be used over the elements of a set:
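In LaTeX form, the three cases above could be written as follows. The index name i, the upper bound n, and the set name G are assumptions chosen for illustration.

```latex
% Sum of the integers from 1 to 100
\sum_{i=1}^{100} i = 1 + 2 + \cdots + 100

% Upper bound not fixed in advance: sum up to n
\sum_{i=1}^{n} i = 1 + 2 + \cdots + n

% Sum over every element of a set G (G is a hypothetical name)
\sum_{g \in G} g
```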

1.2 Product

Suppose we want to write out, on paper, the product of the integers from 1 to 100:

1 * 2 * 3 * 4 * 5 * ... * 99 * 100

This can be shortened with product notation, using the symbol Π (pronounced "pi"):

When the number of factors to multiply is not fixed in advance:
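In code, product notation corresponds to a loop that multiplies every term from the lower bound to the upper bound. The following is a minimal Python sketch; the function name product_1_to_n is hypothetical and chosen only for illustration.

```python
import math

def product_1_to_n(n):
    """Multiply the integers 1, 2, ..., n together, mirroring product notation."""
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result

print(product_1_to_n(5))  # 120

# The standard library offers the same operation as math.prod (Python 3.8+).
print(product_1_to_n(100) == math.prod(range(1, 101)))  # True
```

The product of 1 to n is the factorial of n, so the result can also be checked against math.factorial.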

2. Differentiation

2.1 Introduction to Differentiation

In machine learning, many methods are used to solve optimization problems, and one of them is differentiation.

Differentiation gives the slope of a function at a given point, that is, how the function is changing at that instant.

Example: a car driving along a street, stopping and starting

Looking at the whole figure, the car travels about 120 m in 40 s, so its speed over this period is:

120m / 40s = 3m/s

3 m/s is the average speed of the car. The figure shows that the car is slow just after it starts and that its speed drops to 0 at the red light. In other words, the instantaneous speed of the car differs from moment to moment.

In order to find the "instantaneous speed" of the vehicle, let's gradually reduce the time interval.

The speed of the car between 10 s and 20 s is:

60m / 10s = 6m/s

In the same way, keep shrinking the time interval: find the slope between 10 s and 11 s, then between 10.0 s and 10.1 s, and so on. In the limit, this gives the slope at the instant t = 10 s, which is the speed at that moment. Finding a slope by narrowing the interval like this is differentiation.
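The shrinking-interval idea can be tried numerically. The sketch below assumes a hypothetical position function, position(t) = 0.5 * t**2, which is not the car data from the example; it only shows how the average speed settles toward one value as the interval narrows.

```python
def position(t):
    """Hypothetical distance travelled (in metres) after t seconds."""
    return 0.5 * t ** 2

def average_speed(t_start, t_end):
    """Slope of the position curve between two points in time."""
    return (position(t_end) - position(t_start)) / (t_end - t_start)

# Shrink the interval that starts at t = 10 s: widths of 10 s, 1 s, 0.1 s, 0.01 s.
for width in (10.0, 1.0, 0.1, 0.01):
    print(width, average_speed(10.0, 10.0 + width))
```

For this assumed function the printed speeds approach 10 m/s, which is the instantaneous speed at t = 10 s.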

To express this instantaneous rate of change, let the function be f(x) and let h be a very small number. The slope of f(x) at the point x can then be expressed as:
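Written out in LaTeX, using the same f(x) and h, this is the standard limit definition of the derivative:

```latex
\frac{d}{dx} f(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}
```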

3. Partial differentiation

In the preceding section, the function f(x) was a univariate function with only one variable, x. In practice, however, the functions we deal with are usually multivariate functions with many variables.

How to deal with multivariate functions?

Core idea: pay attention only to the variable being differentiated, and treat all other variables as constants. This way of differentiating is called partial differentiation.

Example:

Partial differentiation of function h with respect to x1:

Partial differentiation of function h with respect to x2:

By focusing only on the variable being differentiated and treating all other variables as constants, you obtain the slope with respect to that variable. The method applies no matter how many variables there are.
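A numerical check makes the "treat other variables as constants" rule concrete. The sketch below assumes a hypothetical function h(x1, x2) = x1**2 + x2**3, which is not necessarily the function used in the example above, and estimates each partial derivative by nudging one variable while holding the other fixed.

```python
def h(x1, x2):
    """Hypothetical two-variable function used only for illustration."""
    return x1 ** 2 + x2 ** 3

def partial_x1(func, x1, x2, eps=1e-6):
    """Central-difference estimate of the partial derivative with respect to x1 (x2 held fixed)."""
    return (func(x1 + eps, x2) - func(x1 - eps, x2)) / (2 * eps)

def partial_x2(func, x1, x2, eps=1e-6):
    """Central-difference estimate of the partial derivative with respect to x2 (x1 held fixed)."""
    return (func(x1, x2 + eps) - func(x1, x2 - eps)) / (2 * eps)

print(partial_x1(h, 3.0, 2.0))  # close to 2 * x1 = 6
print(partial_x2(h, 3.0, 2.0))  # close to 3 * x2**2 = 12
```

For this assumed function, differentiating by hand gives 2·x1 with respect to x1 (the x2 term is a constant and vanishes) and 3·x2² with respect to x2, which matches the numerical estimates.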

4. Composite functions

There are two functions:

By substituting any value in x, you can get the output value of the function:

Not only can you substitute constants into functions, but you can also substitute functions for calculations:

A function built from several functions like this is called a composite function.

Example: Differentiate the composite function f(g(x)) with respect to x

1. Temporarily replace the function with a variable:

2. Differentiate in steps:

In other words, multiply the derivative of y with respect to u by the derivative of u with respect to x.

3. Carry out the actual differentiation:

Summary: To differentiate a complicated function, treat it as a composite of several simple functions and differentiate each part. The key step is deciding how to split the function into simple functions.
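The steps above follow the chain rule, dy/dx = (dy/du) · (du/dx). The sketch below assumes two hypothetical functions, f(u) = 10 + u**2 and g(x) = 3 + x, which may differ from the ones used in the example, and compares the chain-rule result with a numerical estimate.

```python
def g(x):
    return 3 + x        # inner function, u = g(x)

def f(u):
    return 10 + u ** 2  # outer function, y = f(u)

def dy_dx_chain(x):
    """Chain rule: dy/dx = (dy/du) * (du/dx)."""
    u = g(x)
    dy_du = 2 * u   # derivative of 10 + u**2 with respect to u
    du_dx = 1       # derivative of 3 + x with respect to x
    return dy_du * du_dx

def dy_dx_numeric(x, eps=1e-6):
    """Central-difference estimate of the derivative of f(g(x))."""
    return (f(g(x + eps)) - f(g(x - eps))) / (2 * eps)

print(dy_dx_chain(2.0))    # 10.0
print(dy_dx_numeric(2.0))  # approximately 10
```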

5. Vectors and matrices

A vector is a data structure that arranges numbers vertically.

A matrix is a data structure that arranges numbers both vertically and horizontally.

By convention, vectors are written with lowercase letters and matrices with uppercase letters, both in bold.

Matrices support sum, difference, and product operations. Suppose we have the following two matrices A and B, and want to calculate their sum, difference, and product.

Compute the sum and difference:

Compute the product: multiply the elements of each row of the left matrix by the corresponding elements of each column of the right matrix, in order, and then add the results together.

Final Results:
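As a sketch of these three operations in plain Python, assume two hypothetical 2×2 matrices A and B (the values below are chosen only for illustration). The helper names mat_add, mat_sub, and mat_mul are also hypothetical.

```python
A = [[6, 3],
     [8, 10]]
B = [[2, 1],
     [5, -3]]

def mat_add(X, Y):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_sub(X, Y):
    """Element-wise difference of two matrices of the same shape."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_mul(X, Y):
    """Entry (i, j) is row i of X times column j of Y, multiplied element by element and summed."""
    cols = list(zip(*Y))  # columns of the right-hand matrix
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]

print(mat_add(A, B))  # [[8, 4], [13, 7]]
print(mat_sub(A, B))  # [[4, 2], [3, 13]]
print(mat_mul(A, B))  # [[27, -3], [66, -22]]
```

Note that the matrix product depends on order: mat_mul(B, A) generally gives a different result from mat_mul(A, B).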

Finally, consider transposition: the operation of exchanging rows and columns is called transposition.

When calculating the product of two vectors, one of the vectors is usually transposed first:
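A minimal Python sketch of transposition and of a vector product computed after transposing, assuming two hypothetical column vectors a and b stored as 2×1 lists of rows:

```python
def transpose(M):
    """Swap the rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

# Column vectors written as 2x1 matrices (hypothetical values).
a = [[2], [5]]
b = [[1], [-3]]

a_T = transpose(a)  # a 1x2 row vector
# A (1x2) row vector times a (2x1) column vector gives a single number.
product = sum(x * row[0] for x, row in zip(a_T[0], b))

print(a_T)      # [[2, 5]]
print(product)  # -13
```

Transposing the first vector makes the shapes compatible with the row-times-column rule used for matrix products.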

6. Geometric vectors

Vectors have magnitude and direction:

Vector addition and subtraction: 

Algebraically, addition and subtraction simply add and subtract the corresponding elements of the vectors:
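Element-wise addition and subtraction can be sketched in a few lines of Python. The vectors a and b and the helper names vec_add and vec_sub are hypothetical, chosen only for illustration.

```python
def vec_add(a, b):
    """Add two vectors of the same length element by element."""
    return [x + y for x, y in zip(a, b)]

def vec_sub(a, b):
    """Subtract vector b from vector a element by element."""
    return [x - y for x, y in zip(a, b)]

a = [6, 3]
b = [2, 4]

print(vec_add(a, b))  # [8, 7]
print(vec_sub(a, b))  # [4, -1]
```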

Product between vectors:

As this shows, the result of a vector inner product is no longer a vector but an ordinary number (a magnitude only). Such a number has a slightly less familiar name: scalar. This is why the inner product is also called the scalar product. In addition, because the inner product is written with a dot "·" rather than the multiplication sign "×", it is sometimes called the dot product.

Assuming that the angle between vectors a and b is θ, then the inner product can also be expressed as follows:
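In its standard form this relation is a · b = |a| |b| cos θ. The Python sketch below computes the inner product element by element and then recovers θ from that relation; the vectors and the helper names dot, norm, and angle_between are assumptions for illustration.

```python
import math

def dot(a, b):
    """Inner product: multiply corresponding elements and add the results."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Length (magnitude) of a vector."""
    return math.sqrt(dot(a, a))

def angle_between(a, b):
    """Angle theta in radians, from a . b = |a| |b| cos(theta)."""
    return math.acos(dot(a, b) / (norm(a) * norm(b)))

a = [1.0, 0.0]
b = [1.0, 1.0]

print(dot(a, b))                          # 1.0
print(math.degrees(angle_between(a, b)))  # approximately 45
```

The sign of the inner product follows the sign of cos θ: positive when the angle is below 90°, zero when the vectors are perpendicular, and negative when the angle exceeds 90°.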

With θ on the horizontal axis and cos θ on the vertical axis, the graph of the cosine function looks as follows:

A normal vector is a vector perpendicular to a line:
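Perpendicularity can be checked with the inner product, which is zero for perpendicular vectors. The sketch below assumes a line written in the form a*x + b*y + c = 0, a representation not taken from the text above; for such a line, (a, b) is a normal vector and (b, -a) points along the line.

```python
def normal_vector(a, b):
    """A normal vector of the line a*x + b*y + c = 0."""
    return (a, b)

def direction_vector(a, b):
    """A vector pointing along the line a*x + b*y + c = 0."""
    return (b, -a)

def dot(u, v):
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(u, v))

# Hypothetical coefficients: the line 2x + 3y + c = 0.
n = normal_vector(2, 3)
d = direction_vector(2, 3)

print(dot(n, d))  # 0
```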

7. Exponentials and logarithms

Laws of exponents

Exponential function

Laws of logarithms

Logarithmic function

Natural logarithm

Differentiation of logarithms
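The topics listed above correspond to standard identities, summarized below in LaTeX. The symbols a, m, n, p, x, and y are assumptions chosen for illustration, with a > 0 throughout and a ≠ 1 wherever a is the base of a logarithm.

```latex
% Laws of exponents (a > 0)
a^{m} \cdot a^{n} = a^{m+n}, \qquad
\frac{a^{m}}{a^{n}} = a^{m-n}, \qquad
(a^{m})^{n} = a^{mn}

% Laws of logarithms (a > 0, a \neq 1, x > 0, y > 0)
\log_a (xy) = \log_a x + \log_a y, \qquad
\log_a \frac{x}{y} = \log_a x - \log_a y, \qquad
\log_a x^{p} = p \log_a x

% Natural logarithm: the logarithm with base e
\ln x = \log_e x

% Differentiation of logarithms
\frac{d}{dx} \log_a x = \frac{1}{x \ln a}, \qquad
\frac{d}{dx} \ln x = \frac{1}{x}
```

The exponential function y = a^x and the logarithmic function y = log_a x are inverses of each other, which is why each law of logarithms mirrors a law of exponents.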

Origin blog.csdn.net/2301_76354366/article/details/131706765