Maximum likelihood estimation and maximum a posteriori

Foreword

This series of articles is a set of study notes for the book *Deep Learning*; reading them alongside the original book works best.

MLE VS MAP

Maximum likelihood estimation (MLE) and maximum a posteriori estimation (MAP) are two quite different estimation methods. MLE belongs to the frequentist school of statistics (which holds that there is a single true value of θ), while MAP belongs to the Bayesian school (which treats θ as a random variable following some probability distribution); this is the difference in how the two methods view the parameter. For the same model, probability reasons from parameters to data, while statistics reasons from data to parameters.

Maximum likelihood estimation

The likelihood function is a function of the model parameters: given observed data, it is used to estimate the values of the model's parameters. For a given output x, the likelihood function L(θ|x) of the parameter θ equals the probability that the variable X takes the value x given the parameter θ. Its mathematical definition is:

\[L(θ|x)=f_θ(x)=P_θ(X=x) \]

Maximum likelihood estimation is a good estimator: as the sample size tends to infinity, the maximum likelihood estimate has the best rate of asymptotic convergence, and because of its statistical consistency and efficiency it is also the preferred estimation method in machine learning. For independent and identically distributed samples:

\[\hatθ_{MLE}=argmaxP(X;θ)=argmaxP(x_1;θ)P(x_2;θ)...P(x_n;θ)=argmax\log\prod_{i=1}^{n}P(x_i;θ) \\ =argmax\sum_{i=1}^{n}\log P(x_i;θ)=argmin-\sum_{i=1}^{n}\log P(x_i;θ) \quad \text{(negative log-likelihood)} \]

Since the logarithm is a monotonically increasing function, maximizing L is equivalent to maximizing its logarithm, so both give the same result. In essence, the cross-entropy loss used for classification tasks in deep learning is maximizing the likelihood function.
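
As a minimal, illustrative sketch (not from the original article), the following Python snippet estimates the parameter of a Bernoulli distribution by minimizing the negative log-likelihood, and compares the numerical result with the closed-form MLE (the sample mean); the simulated data and the true parameter 0.7 are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Simulated i.i.d. coin flips (1 = heads) with an assumed true parameter 0.7.
rng = np.random.default_rng(0)
x = rng.binomial(1, 0.7, size=1000)

def neg_log_likelihood(theta):
    """Negative log-likelihood of i.i.d. Bernoulli(theta) data."""
    return -np.sum(x * np.log(theta) + (1 - x) * np.log(1 - theta))

# Minimizing the negative log-likelihood is the same as maximizing the likelihood.
result = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 1 - 1e-6), method="bounded")

print("numerical MLE:", result.x)                 # close to 0.7
print("closed-form MLE (sample mean):", x.mean())
```

The numerical optimum matches the sample mean, which is exactly the analytic argmax of the Bernoulli log-likelihood.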

Conditional maximum likelihood estimation

\[\hatθ_{MLE}=argmaxP(Y|X;θ)=argmax\sum_{i=1}^{m}\log{P(y^{(i)}|x^{(i)};θ)} \]
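
To make the link with the cross-entropy loss mentioned above concrete, here is a hedged sketch (the names and data are illustrative, not from the original article): for binary labels, the negative conditional log-likelihood is exactly the binary cross-entropy that classifiers minimize.

```python
import numpy as np

def binary_cross_entropy(y, p):
    """Negative conditional log-likelihood -sum_i log P(y_i | x_i; theta),
    where p[i] is the model's predicted probability P(y=1 | x_i; theta)."""
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

y = np.array([1, 0, 1, 1])           # observed labels
p = np.array([0.9, 0.2, 0.8, 0.6])   # hypothetical model outputs
print(binary_cross_entropy(y, p))    # minimizing this maximizes the conditional likelihood
```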

Maximum a posteriori estimation

Bayes' formula:

\[P(θ|x)=\frac{P(x|θ)P(θ)}{P(x)} \]

where P(x|θ) is the likelihood function and P(θ) is the prior probability.

The mathematical definition of maximum a posteriori estimation is as follows:

\[\hatθ_{MAP}(x)=\arg\max_θ f(θ|x)=\arg\max_θ \frac{f(x|θ)g(θ)}{\int_\vartheta f(x|\vartheta)g(\vartheta)\,d\vartheta}=\arg\max_θ f(x|θ)g(θ) \]

Here θ is the parameter to be estimated, f is the likelihood, and g is the prior; the maximum a posteriori estimate is obtained by maximizing f · g (the denominator does not depend on θ, so it can be dropped from the maximization). When the prior distribution is uniform (a constant), the maximum a posteriori estimate coincides with the maximum likelihood estimate.
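
As a sketch under stated assumptions (a Bernoulli likelihood with a Beta(a, b) prior, a conjugate pair whose MAP estimate has a known closed form), the following compares MAP with MLE and checks the claim above: with a uniform prior, Beta(1, 1), the MAP estimate reduces to the MLE.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.binomial(1, 0.7, size=50)   # assumed data: 50 coin flips
heads, n = x.sum(), x.size

def map_bernoulli_beta(heads, n, a, b):
    """Closed-form MAP estimate for Bernoulli(theta) with a Beta(a, b) prior:
    theta_MAP = (heads + a - 1) / (n + a + b - 2), for a, b >= 1."""
    return (heads + a - 1) / (n + a + b - 2)

print("MLE:", heads / n)
print("MAP with Beta(2, 2) prior:", map_bernoulli_beta(heads, n, 2, 2))
# A uniform prior Beta(1, 1) is constant in theta, so MAP coincides with MLE:
print("MAP with uniform prior:", map_bernoulli_beta(heads, n, 1, 1))
```

The informative prior pulls the estimate toward 0.5, and its influence shrinks as n grows, which is the usual Bayesian behavior.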

Summary

Maximum likelihood estimation is a frequentist method that estimates θ from the observed data alone, while maximum a posteriori estimation is a Bayesian method that additionally weights the likelihood by a prior distribution over θ; when the prior is uniform, the two estimates coincide.

Source: www.cnblogs.com/renyuzhuo/p/12630174.html