Week 2: Mathematical foundations of machine learning


1. Video learning content this week: https://www.bilibili.com/video/BV1Tb411H7uC?p=2

1) P2: Probability theory and Bayesian priors

2) P3: Matrices and linear algebra

2. Study requirements: Machine learning is an interdisciplinary subject that draws on a good deal of mathematics. The material in this lesson has been covered before; this time it is reorganized around the key points, so pay close attention. Watching the videos should deepen everyone's grasp of the basic mathematics behind the course.

It is recommended that you take notes while watching, recording the key points and their timestamps so you can look back when necessary. The study notes are also part of the assignment.

 

3. Assignment requirements:

1) Paste your video study notes. They must be your own work; do not plagiarize. Handwritten notes may be photographed.

2) Summarize "gradient", "gradient descent", and "Bayes' theorem" in your own words. A Word document, a mind map, or a photographed handwritten page are all acceptable; keep it concise and neatly laid out.

1) Video study notes:

Benford's law (the first-digit law; a numerical check appears after this list):

$P(d) = \log_{10}\!\left(1 + \frac{1}{d}\right), \quad d = 1, 2, \ldots, 9$

Probability formulas (addition and conditional probability):

$P(A + B) = P(A) + P(B) - P(AB), \qquad P(A \mid B) = \frac{P(AB)}{P(B)}$

Bayes' formula:

$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$

Two-point (Bernoulli) distribution:

$P(X = 1) = p, \quad P(X = 0) = 1 - p; \qquad E[X] = p, \quad \mathrm{Var}(X) = p(1 - p)$

Taylor series:

$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!}\,(x - x_0)^n$

Poisson distribution:

$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k = 0, 1, 2, \ldots$

Uniform distribution:

$f(x) = \frac{1}{b - a}$ for $a \le x \le b$, and $0$ elsewhere

Exponential distribution:

$f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$, and $0$ for $x < 0$

Normal distribution:

$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,\exp\!\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$

Beta distribution:

$f(x; \alpha, \beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)}$ for $0 \le x \le 1$, where $B(\alpha, \beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha + \beta)}$

Independence of events:

$A$ and $B$ are independent if and only if $P(AB) = P(A)\,P(B)$

Variance, covariance:

$\mathrm{Var}(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2$

$\mathrm{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]\,E[Y]$

Pearson correlation coefficient, Chebyshev's inequality:

$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}, \qquad P(|X - \mu| \ge \varepsilon) \le \frac{\sigma^2}{\varepsilon^2}$

Law of large numbers (a simulation appears after this list):

for i.i.d. $X_i$ with mean $\mu$: $\frac{1}{n}\sum_{i=1}^{n} X_i \xrightarrow{P} \mu$ as $n \to \infty$

Important corollary: the frequency of an event converges to its probability as the number of independent trials grows (Bernoulli's law of large numbers).
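Below is the promised numerical check of Benford's law, a minimal sketch of my own (the powers of 2 are an arbitrary but classic test sequence, not something from the video):

```python
import math
from collections import Counter

# Leading-digit frequencies of 2^n for n = 1..10000, compared with
# Benford's prediction P(d) = log10(1 + 1/d).
counts = Counter(int(str(2 ** n)[0]) for n in range(1, 10001))
for d in range(1, 10):
    observed = counts[d] / 10000
    predicted = math.log10(1 + 1 / d)
    print(d, round(observed, 4), round(predicted, 4))
```

The observed frequencies closely match the predicted ones for every digit.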
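And the promised simulation of the law of large numbers (again my own sketch; the exponential distribution and the sample sizes are arbitrary choices):

```python
import numpy as np

# Sample means of i.i.d. Exponential(lambda = 2) draws, whose true mean
# is 1/lambda = 0.5, drift toward 0.5 as the sample size n grows.
rng = np.random.default_rng(42)
for n in (10, 1_000, 100_000):
    sample = rng.exponential(scale=0.5, size=n)  # scale = 1/lambda
    print(n, round(sample.mean(), 4))
```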

2) Summarize "gradient", "gradient descent", and "Bayes' theorem" in your own words.

 

① Gradient:

The gradient answers one question: at a given point in the variable space, along which direction does the function have the largest rate of change, and how large is that rate?

Definition: the gradient of a function at a point is a vector whose direction coincides with the direction in which the directional derivative attains its maximum, and whose modulus equals that maximum directional derivative.

Three points to note: 1) the gradient is a vector, so it has both direction and magnitude; 2) its direction is the direction along which the directional derivative is largest; 3) its magnitude is the value of that largest directional derivative.
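To make the definition concrete, here is a minimal sketch of my own (the toy function $f(x, y) = x^2 + 3y^2$ is an arbitrary choice): the gradient is approximated by central differences and matches the vector of analytic partial derivatives $(2x, 6y)$.

```python
import numpy as np

def f(v):
    x, y = v
    return x ** 2 + 3 * y ** 2      # toy function; analytic gradient is (2x, 6y)

def numerical_gradient(f, v, h=1e-6):
    """Central-difference approximation of the gradient of f at point v."""
    grad = np.zeros_like(v)
    for i in range(len(v)):
        e = np.zeros_like(v)
        e[i] = h
        grad[i] = (f(v + e) - f(v - e)) / (2 * h)
    return grad

p = np.array([1.0, 2.0])
print(numerical_gradient(f, p))     # ~[ 2. 12.], i.e. (2x, 6y) at (1, 2)
```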

② Gradient descent: Gradient descent is an iterative method that can be used to solve least-squares problems, both linear and nonlinear. When solving for the model parameters of a machine learning algorithm, i.e., an unconstrained optimization problem, gradient descent is one of the most commonly used methods; the other common one is the least-squares method. To find the minimum of a loss function, gradient descent iterates step by step toward the minimum, yielding the minimized loss and the corresponding parameter values; conversely, to find a maximum one iterates with gradient ascent instead. Building on the basic method, two widely used variants in machine learning are stochastic gradient descent and batch gradient descent.
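A minimal sketch of batch gradient descent on a linear least-squares loss (NumPy only; the synthetic data, learning rate, and iteration count are my own illustrative choices):

```python
import numpy as np

# Synthetic least-squares problem: find w that minimizes mean((X @ w - y)**2).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
w_true = np.array([2.0, -3.0])
y = X @ w_true + 0.1 * rng.normal(size=100)

w = np.zeros(2)                              # initial parameter guess
lr = 0.1                                     # learning rate (step size)
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(y)    # gradient of the mean-squared loss
    w -= lr * grad                           # step opposite the gradient direction
print(w)                                     # close to [ 2. -3.]
```

Batch gradient descent, as here, uses the full data set for every gradient evaluation; stochastic gradient descent applies the same update with the gradient estimated from a single sample (or a mini-batch), trading noisier steps for much cheaper iterations.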

 

 

 

③Bayes' theorem:

The conditional probability formula from probability theory:

$P(B \mid A) = \dfrac{P(AB)}{P(A)} \qquad (1)$

Here $P(B \mid A)$ denotes the probability that event B occurs given that event A has already occurred, called the conditional probability of B given A, and $P(AB)$ denotes the probability that A and B occur together. Likewise, the probability that event A occurs given event B is:

$P(A \mid B) = \dfrac{P(AB)}{P(B)} = \dfrac{P(B \mid A)\,P(A)}{P(B)} \qquad (2)$

The probability $P(B)$ that event B occurs splits into two parts: the case where B occurs together with A, and the case where B occurs without A. By the law of total probability:

$P(B) = P(B \mid A)\,P(A) + P(B \mid \bar{A})\,P(\bar{A}) \qquad (3)$

Substituting formula (3) into formula (2) yields the complete Bayes formula:

$P(A \mid B) = \dfrac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid \bar{A})\,P(\bar{A})}$

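Finally, a worked numeric check of the complete formula (a sketch with hypothetical disease-testing numbers, not figures from the course):

```python
# A = "patient has the disease", B = "test is positive"; all numbers hypothetical.
p_a = 0.01                # P(A): prior probability of the disease
p_b_given_a = 0.99        # P(B|A): sensitivity of the test
p_b_given_not_a = 0.05    # P(B|~A): false-positive rate

# Formula (3): total probability of a positive test.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Complete Bayes formula: posterior probability of the disease given a positive test.
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 4))   # 0.1667: the 1% prior rises to about 17%
```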