Mathematical foundations of deep learning

Introduction to Machine Learning:

Feature vector

The objective function

Categories of machine learning:

Supervised learning: classification problems (such as face recognition, character recognition, speech recognition) and regression problems

Unsupervised learning: clustering, dimensionality reduction

Reinforcement learning: predicts the action to take based on the current state so as to maximize the return; the return may be delayed. Examples: autonomous driving, playing chess.

Mathematics for deep learning: calculus, linear algebra, probability theory, optimization methods

Single-variable calculus:

Taylor expansion of a univariate function: approximates the function by a polynomial.

The Taylor expansion must be taken in the neighborhood of some point.
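As a small illustration (not part of the original notes; the choice of e^x and the expansion point 0 are my own), a Taylor polynomial can stand in for the function near the expansion point:

```python
import math

def taylor_exp(x, n):
    """n-term Taylor polynomial of e^x expanded at 0: sum of x^k / k!"""
    return sum(x**k / math.factorial(k) for k in range(n))

x = 0.5
approx = taylor_exp(x, 6)   # polynomial approximation near the expansion point 0
exact = math.exp(x)
print(abs(exact - approx))  # small; the error shrinks as n grows or x moves toward 0
```

Note that the approximation is only good near the chosen point, which is why the expansion must be taken in a neighborhood of that point.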

Single-variable differential calculus: derivatives, Taylor expansion, tests for extreme values.

Multi-variable calculus:

Partial derivative: treat the other variables as constants and differentiate with respect to one variable.

Higher-order partial derivatives: in general (when they are continuous), the mixed second-order partial derivatives do not depend on the order of differentiation.

Gradient: the vector formed by the first-order partial derivatives of a multivariate function with respect to each variable.

Taylor expansion of a multivariate function
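A minimal sketch of the gradient idea (my own example function, not from the notes): approximate each partial derivative numerically by holding the other variables fixed, then collect them into a vector.

```python
def grad(f, point, h=1e-6):
    """Numerical gradient: central-difference partial derivative in each variable,
    holding the other variables constant."""
    g = []
    for i in range(len(point)):
        p_plus = list(point);  p_plus[i] += h
        p_minus = list(point); p_minus[i] -= h
        g.append((f(p_plus) - f(p_minus)) / (2 * h))
    return g

f = lambda p: p[0]**2 + 3 * p[0] * p[1]  # f(x, y) = x^2 + 3xy
g = grad(f, [1.0, 2.0])                  # analytic gradient: (2x + 3y, 3x) = (8, 3)
print(g)
```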

Linear Algebra:

Vector: a point in n-dimensional space. In mathematics usually a column vector; in programming often a row vector (row-major storage).

Vector operations: addition, subtraction, scalar multiplication, inner product, transposition, and the norm of a vector (a map from a vector to a nonnegative real number).

Norm of a vector: the Lp norm is the p-th root of the sum of the p-th powers of the absolute values of the components. L1 norm: the sum of the absolute values of the components. L2 norm: the length (magnitude) of the vector.
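A short sketch of these definitions (the sample vector is my own choice):

```python
def lp_norm(v, p):
    """Lp norm: the p-th root of the sum of the p-th powers of |v_i|."""
    return sum(abs(x)**p for x in v) ** (1.0 / p)

v = [3.0, -4.0]
l1 = lp_norm(v, 1)  # sum of absolute values: |3| + |-4| = 7
l2 = lp_norm(v, 2)  # Euclidean length: sqrt(9 + 16) = 5
print(l1, l2)
```

As required of a norm, both results are nonnegative real numbers.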

Matrix: a two-dimensional array. Inverse matrix; eigenvalues; quadratic forms.

Tensor: the equivalent of a multidimensional array in a programming language; an n-th-order tensor has n indices.

A matrix is a 2nd-order tensor and a vector is a 1st-order tensor. For example, an RGB color image is a 3rd-order tensor.
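A quick sketch of the correspondence, assuming NumPy (the image dimensions are arbitrary examples):

```python
import numpy as np

vector = np.zeros(3)                  # 1st-order tensor, shape (3,)
matrix = np.zeros((2, 3))             # 2nd-order tensor, shape (2, 3)
rgb_image = np.zeros((480, 640, 3))   # 3rd-order tensor: height x width x color channels

print(vector.ndim, matrix.ndim, rgb_image.ndim)  # number of indices: 1 2 3
```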

Jacobian matrix: the matrix of all first-order partial derivatives of the dependent variables with respect to all the independent variables; each row is the gradient of one of the component functions.

Hessian matrix: the matrix formed by the second-order partial derivatives of a multivariate function; it is a symmetric matrix and plays the role of the second derivative of a function of one variable.

 

Test for extrema of a multivariate function: the Hessian matrix plays the role of f''(x)

If the Hessian matrix is positive definite at a critical point, the function has a local minimum there; if the Hessian is negative definite, the function has a local maximum there; if the Hessian is indefinite, the point is a saddle point, not an extremum.

Definition of a positive definite matrix: x^T A x > 0 for all nonzero vectors x.

Criteria for a positive definite matrix: all eigenvalues of the matrix are greater than 0; all leading principal minors are greater than 0; the matrix is congruent to the identity matrix.
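The second-derivative test above can be sketched via the eigenvalue criterion, assuming NumPy (the two example Hessians correspond to f = x^2 + y^2 and f = x^2 - y^2 at the origin, my own choices):

```python
import numpy as np

def classify_critical_point(H):
    """Second-derivative test at a critical point, using the Hessian's eigenvalues."""
    eig = np.linalg.eigvalsh(H)  # H is symmetric, so eigvalsh applies
    if np.all(eig > 0):
        return "local minimum"   # positive definite: all eigenvalues > 0
    if np.all(eig < 0):
        return "local maximum"   # negative definite: all eigenvalues < 0
    if np.any(eig > 0) and np.any(eig < 0):
        return "saddle point"    # indefinite: eigenvalues of both signs
    return "inconclusive"        # semidefinite: the test gives no answer

r1 = classify_critical_point(np.array([[2.0, 0.0], [0.0, 2.0]]))   # f = x^2 + y^2 at 0
r2 = classify_critical_point(np.array([[2.0, 0.0], [0.0, -2.0]]))  # f = x^2 - y^2 at 0
print(r1, r2)
```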

Derivatives with respect to matrices and vectors:

Gradient of w^T x with respect to x: ∇x(w^T x) = w

Gradient of x^T A x with respect to x: ∇x(x^T A x) = (A + A^T) x

Hessian of x^T A x with respect to x: A + A^T

 

Probability:

Random events and the probability of a random event

Conditional Probability:

p(a,b) = p(b|a)·p(a),  p(a,b) = p(a|b)·p(b)  ⟹  p(b|a)·p(a) = p(a|b)·p(b)

Bayesian formula:

Dividing both sides of the formula above by p(b) gives: p(a|b) = p(a)·p(b|a) / p(b). If a is viewed as the cause and b as the effect, then p(a) is called the prior probability and p(a|b) the posterior probability; Bayes' formula expresses the relationship between the prior and the posterior.
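A worked sketch of the formula (all the numbers below are hypothetical, chosen only to illustrate the computation):

```python
# Hypothetical probabilities: a is the cause, b the observed effect.
p_a = 0.01            # prior p(a)
p_b_given_a = 0.9     # p(b|a): probability of observing b when a holds
p_b_given_not_a = 0.05

# Total probability: p(b) = p(b|a)p(a) + p(b|not a)p(not a)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' formula: posterior p(a|b) = p(a) * p(b|a) / p(b)
p_a_given_b = p_a * p_b_given_a / p_b
print(p_a_given_b)  # the posterior is much larger than the 0.01 prior
```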

Independent random events p (a, b, c) = p (a) p (b) p (c)

Random Variables:

A random variable quantifies the outcomes of random events as a variable; each value the variable can take is associated with a probability.

Discrete random variables:

Takes only finitely many values, or countably infinitely many values (for example, the integers from 0 to infinity are countably infinite, while the real numbers between 0 and 1 are uncountable). The probability distribution of a discrete random variable is described by: p(x = xi) ≥ 0, Σ p(x = xi) = 1.
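A tiny sketch of a valid discrete distribution (a fair six-sided die, my own example):

```python
from fractions import Fraction

# p(x = i) for a fair six-sided die: a discrete probability distribution
pmf = {i: Fraction(1, 6) for i in range(1, 7)}

assert all(p >= 0 for p in pmf.values())  # p(x = xi) >= 0
assert sum(pmf.values()) == 1             # sum of p(x = xi) = 1

mean = sum(x * p for x, p in pmf.items())
print(mean)  # expected value 7/2
```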

Continuous random variables:

Takes uncountably many values, i.e., the real numbers in some range. A continuous random variable is described by its probability density function and its distribution function. The density function satisfies: f(x) ≥ 0, ∫ f(x) dx = 1; the distribution function is defined as F(y) = p(x ≤ y) = ∫ from -∞ to y of f(x) dx.
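These two conditions can be sketched numerically (the density f(x) = 2x on [0, 1] is my own example; the integral is approximated with the trapezoidal rule):

```python
def density(x):
    """A simple density on [0, 1]: f(x) = 2x (nonnegative, integrates to 1)."""
    return 2 * x if 0 <= x <= 1 else 0.0

def cdf(y, n=10_000):
    """Distribution function F(y) = p(x <= y), via the trapezoidal rule on [0, y]."""
    if y <= 0:
        return 0.0
    step = y / n
    xs = [i * step for i in range(n + 1)]
    return step * (sum(density(x) for x in xs)
                   - 0.5 * (density(xs[0]) + density(xs[-1])))

total = cdf(1.0)  # total probability: should be ~1
half = cdf(0.5)   # F(0.5) = 0.5^2 = 0.25 for this density
print(total, half)
```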

 


Origin www.cnblogs.com/wisir/p/11616029.html