Probability and Statistics for Noob - Maximum Likelihood Estimation (MLE) - IMAX - IMAX Blog

Put simply, maximum likelihood estimation uses the known sample results to infer the parameter values that are most likely (have the maximum probability) to have produced those results (the model is known, the parameters are unknown).

Basic idea
When n groups of sample observations are randomly drawn from the model population, the most reasonable parameter estimator is the one that maximizes the probability of drawing exactly those n groups of observations from the model; in other words, the parameter estimator under which the observed sample is most probable.
Likelihood function

For a parameter α and sample observations x1, x2, …, xn, the likelihood function is L(α) = p(x1; α) p(x2; α) … p(xn; α): the joint probability of the observed sample, viewed as a function of α.

Log-likelihood function
When the sample is independent and identically distributed, the likelihood function can be written compactly as L(α) = Π p(xi; α). A product is awkward to work with, so we take the logarithm and study the log-likelihood function instead: l(α) = ln L(α) = Σ ln p(xi; α).

Solving the maximum likelihood
This again uses the standard method for finding the extreme value of a multivariate function: differentiate the (log-)likelihood with respect to the parameters, set the derivatives to zero, and solve.

For example: a sack contains white balls and black balls in an unknown ratio. I make 10 draws with replacement and get 8 black balls and 2 white balls. To find the most likely ratio of black to white balls, I use maximum likelihood estimation: suppose the probability of drawing a black ball is p. Then the probability of drawing 8 black balls and 2 white balls is
P(black = 8) = p^8 * (1-p)^2
(up to a binomial coefficient that does not depend on p), and finding the p that maximizes this is exactly an extreme-value problem.
You may wonder why we need ln. The logarithm turns multiplication into addition and does not change the location of the extreme value (monotonicity is preserved), so differentiation becomes much more convenient.
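A minimal numeric check of the ball example above (the grid search is purely illustrative; the analytic answer is p = 8/10):

```python
import math

# Likelihood of 8 black and 2 white balls in 10 draws with replacement,
# as a function of the black-ball probability p. The binomial coefficient
# C(10, 8) does not depend on p, so it is omitted.
def likelihood(p):
    return p**8 * (1 - p)**2

def log_likelihood(p):
    return 8 * math.log(p) + 2 * math.log(1 - p)

# Grid search over p in (0, 1): likelihood and log-likelihood peak at the
# same point, because ln is strictly increasing.
grid = [i / 1000 for i in range(1, 1000)]
p_mle = max(grid, key=likelihood)
assert p_mle == max(grid, key=log_likelihood)
print(p_mle)  # 0.8, matching the analytic solution 8/10
```

Maximizing the likelihood and maximizing its logarithm pick out the same p, which is the point made above.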
Similarly, consider this question: let the probability density of the population X be f(x; θ).
Known: X1, X2, …, Xn are sample observations.
Find: the maximum likelihood estimate of θ.
The reasoning is the same. The probability (density) of obtaining the sample observations X1, X2, …, Xn is
P{x1 = X1, x2 = X2, …, xn = Xn} = f(X1; θ) f(X2; θ) … f(Xn; θ).
We then find the θ that maximizes P. This is again an extreme-value problem, so the details are not repeated.
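As a concrete (hypothetical) instance of the general recipe, suppose the density is f(x; θ) = θ e^(-θx), an exponential distribution. Setting the derivative of the log-likelihood to zero gives the closed form θ̂ = n / ΣXi, which we can sanity-check numerically:

```python
import math
import random

# Assumed density for illustration: f(x; theta) = theta * exp(-theta * x).
# The joint density of the sample is the product f(X1;theta)...f(Xn;theta);
# we maximize its logarithm.
random.seed(0)
theta_true = 2.0
xs = [random.expovariate(theta_true) for _ in range(10_000)]

def log_likelihood(theta):
    # ln L(theta) = n*ln(theta) - theta * sum(Xi)
    return len(xs) * math.log(theta) - theta * sum(xs)

# d/dtheta [n*ln(theta) - theta*sum(Xi)] = 0  =>  theta_hat = n / sum(Xi)
theta_hat = len(xs) / sum(xs)
print(round(theta_hat, 2))  # close to theta_true = 2.0
```

With 10,000 samples the estimate lands very near the true parameter, and the closed-form θ̂ indeed beats nearby values of θ on the log-likelihood.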

Regarding the relationship with the least squares method:

     (1) For least squares: after randomly drawing n groups of sample observations from the model population, the most reasonable parameter estimator is the one that makes the model fit the sample data best, i.e. minimizes the sum of squared differences between the estimated and observed values. Least squares can be viewed from the cost/loss-function angle: building a model generally means minimizing a loss function, and least squares is the special case loss function = (y_hat - y)^2. Likewise, various distance metrics, not only Euclidean distance, can serve as the loss function, so "loss function" is the more general notion.
     (2) For maximum likelihood: after randomly drawing n groups of sample observations from the model population, the most reasonable parameter estimator is the one that maximizes the probability of drawing those n groups of observations from the model. Maximum likelihood approaches the problem from the probability angle: intuitively, the likelihood function, given the parameters, is the probability (or probability density) of observing the realized data, and the idea is to ask which parameter values make the data we actually observed most probable. Another probability-based estimator is moment estimation, which sets up equations using the first moment, second moment, and so on, and solves them for the parameters.

Obviously, these are two parameter estimation methods from different principles.

     (3) Another major difference: maximum likelihood estimation requires a distributional assumption and belongs to parametric statistics; if even the distribution function is unknown, how could one write down the likelihood function? Least squares makes no such assumption. What the two share is that both turn estimation into an optimization problem, but least squares is a convex optimization problem, while maximum likelihood is not necessarily. In maximum likelihood, the parameters are chosen so that the known data are most likely in some sense; that sense usually means the likelihood function is maximized, and the likelihood function is usually built from the probability distribution of the data. Unlike least squares, maximum likelihood needs this probability distribution to be known, which is difficult in practice. It is commonly assumed to be a normal (Gaussian) distribution, in which case the maximum likelihood estimate and the least squares estimate coincide.
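The Gaussian case can be checked directly. The sketch below fits a slope on synthetic data (the model y = 3x + Gaussian noise and all values are made up for illustration): under Gaussian noise, the negative log-likelihood equals the residual sum of squares up to constants, so the MLE of the slope and the least squares estimate agree.

```python
import random

# Synthetic data: y = 3*x + Gaussian noise (illustrative values).
random.seed(1)
xs = [i / 10 for i in range(1, 51)]
ys = [3.0 * x + random.gauss(0, 0.5) for x in xs]

# Least squares slope for a line through the origin (closed form).
a_ls = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Gaussian MLE: maximizing the log-likelihood is minimizing the residual
# sum of squares; done here via a simple grid search over candidate slopes.
def rss(a):
    return sum((y - a * x) ** 2 for x, y in zip(xs, ys))

a_mle = min((2.0 + i / 1000 for i in range(2001)), key=rss)
print(round(a_ls, 3), round(a_mle, 3))  # the two agree to grid precision
```

The grid search stands in for the general likelihood-maximization step; with a Gaussian likelihood it finds the same slope as the closed-form least squares formula, which is the equivalence stated above.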

All in all, the least squares method takes the sum of squared differences between the estimated and observed values as its loss function, while the maximum likelihood method takes maximizing the likelihood of the target values as its objective function, treating linear regression from the perspective of probability and statistics. Under the assumption of a Gaussian likelihood, the two are related: maximizing the likelihood reduces to least squares.