Understanding the EM Algorithm in Machine Learning: Principle and Intuition

Copyright notice: please credit the source ("AI algorithms Wuhan study") when reproducing this article: https://blog.csdn.net/qq_36931982/article/details/90893730

 

The EM algorithm alternates between two steps. E-step: compute the expected value of the log-likelihood using the current parameter estimates. M-step: find the parameter values that maximize the expected log-likelihood produced in the E-step. The new parameter values are then fed back into the E-step, and the two steps repeat until convergence.

The idea behind EM is easy to grasp, but the formal derivation is more involved, so we start from a simple example and gradually build up a full picture of the algorithm.

Coin-toss experiment: there are two coins, A and B. In each round one coin is picked at random and tossed 10 times, and the experiment is repeated for 5 rounds. We want to estimate the probability that each coin lands heads.

The figure below shows the case where we know whether coin A or coin B was chosen in each round. By simply counting the heads and tails observed for each coin, we obtain a heads probability of 0.8 for coin A and 0.45 for coin B.
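To make the counting concrete, here is a minimal sketch in Python. The per-round data (which coin was picked and how many of the 10 tosses came up heads) is assumed from the figure; with these counts it reproduces the 0.8 and 0.45 quoted above.

```python
# Complete-data case: the coin used in each round is known, so maximum
# likelihood reduces to counting heads for each coin separately.
# Per-round data assumed from the figure: (coin, heads out of 10 tosses).
rounds = [("B", 5), ("A", 9), ("A", 8), ("B", 4), ("A", 7)]

heads_A = sum(h for c, h in rounds if c == "A")        # 24 heads
heads_B = sum(h for c, h in rounds if c == "B")        # 9 heads
tosses_A = 10 * sum(1 for c, _ in rounds if c == "A")  # 30 tosses with coin A
tosses_B = 10 * sum(1 for c, _ in rounds if c == "B")  # 20 tosses with coin B

print(heads_A / tosses_A, heads_B / tosses_B)          # 0.8 and 0.45
```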

As shown in figure (b) below, now it is not clear which coin was chosen in each round of the experiment. We observe only the toss results and must estimate the unknown parameters θA and θB, and the sample involves a hidden variable Z (whether coin A or coin B was selected in each round).

The EM algorithm starts with an initial guess for the parameters, here θA = 0.6 and θB = 0.5.

E-step: using the initial values (or the parameters from the previous iteration), compute the posterior probability of the hidden variable, i.e., its expectation, and treat it as the current estimate of the hidden variable:

M-step: maximize the expected likelihood obtained in the E-step to get new parameter values θA and θB.
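Putting the two steps together for the coin example, here is a minimal sketch of the full EM loop in Python. It assumes the same per-round head counts as above and an equal prior on picking either coin; the E-step computes each round's posterior probability of having come from coin A, and the M-step re-estimates θA and θB from the expected head counts.

```python
import numpy as np
from scipy.stats import binom

heads = np.array([5, 9, 8, 4, 7])    # heads per round (data assumed from the figure)
tosses = 10                          # tosses per round

theta_A, theta_B = 0.6, 0.5          # initialization used in the article

for _ in range(10):                  # a few iterations are enough to converge here
    # E-step: posterior probability that each round was generated by coin A
    # (the equal prior on choosing either coin cancels out)
    like_A = binom.pmf(heads, tosses, theta_A)
    like_B = binom.pmf(heads, tosses, theta_B)
    w_A = like_A / (like_A + like_B)
    w_B = 1.0 - w_A

    # M-step: re-estimate each coin's head probability from expected counts
    theta_A = np.sum(w_A * heads) / np.sum(w_A * tosses)
    theta_B = np.sum(w_B * heads) / np.sum(w_B * tosses)

print(theta_A, theta_B)              # converges to roughly 0.80 and 0.52
```

Note that when the coin assignments are estimated rather than observed, θB settles near 0.52 instead of the 0.45 obtained in the complete-data case.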

 

【Mathematical background】

 

1. Jensen's inequality

Convex function: let f be a function defined on the real numbers. If for all real x the second derivative f''(x) >= 0, then f is convex. When x is a vector, f is convex if its Hessian matrix H is positive semidefinite (H >= 0).

Strictly convex function: if the second derivative is strictly greater than 0 (never equal to 0), then f is strictly convex.

Property 1: if f is a convex function and X is a random variable, then E[f(X)] >= f(E[X]).

Property 2: if f is strictly convex, then E[f(X)] = f(E[X]) holds if and only if P(X = E[X]) = 1, i.e., X is a constant.

In the figure, the solid curve f is a convex function and X is a random variable; the diagram illustrates that E[f(X)] >= f(E[X]) holds.
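One point worth keeping in mind for the EM derivation below: the logarithm is concave (its second derivative is negative), so Jensen's inequality applies with the direction reversed:

$$E[\log X] \le \log E[X],$$

with equality if and only if X is constant with probability 1.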

2. Expectation formula (the "Lazy Statistician" rule)
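The rule (often called the law of the unconscious statistician) says that the expectation of g(X) can be computed directly from the distribution of X, without first deriving the distribution of g(X):

$$E[g(X)] = \sum_x g(x)\,p(x) \quad\text{(discrete case)}, \qquad E[g(X)] = \int g(x)\,f(x)\,dx \quad\text{(continuous case, density } f\text{)}.$$

This is what lets us write the expected log-likelihood in the E-step as a sum over the hidden variable weighted by its posterior probabilities.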

 

【E-step and M-step】

 

The EM algorithm is an extension of maximum likelihood estimation. Its purpose is explicit: determine the parameters θ of the probability model p(x, z). The difference from ordinary maximum likelihood is that the likelihood function contains an additional unobserved variable Z. The main body of the EM derivation is given below.
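A sketch of the standard lower-bound argument, using Jensen's inequality from above. Let Q(z) be any distribution over the hidden variable; then

$$\log p(x;\theta) = \log \sum_z p(x,z;\theta) = \log \sum_z Q(z)\,\frac{p(x,z;\theta)}{Q(z)} \;\ge\; \sum_z Q(z)\,\log\frac{p(x,z;\theta)}{Q(z)}.$$

The E-step makes this bound tight by choosing Q(z) = p(z | x; θ) at the current parameter value, and the M-step maximizes the right-hand side over θ with Q held fixed; alternating the two steps increases the log-likelihood monotonically until convergence.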

 

【References】

The EM Algorithm

A detailed derivation of the EM algorithm

What is the expectation maximization algorithm?
 

 
