Gaussian distribution
It is what we usually call the normal distribution ~ the two names refer to the same thing, and in machine learning it is uniformly called the Gaussian distribution.
Its bell-shaped curve is very common ~
to take the simplest examples, human height and exam scores roughly follow a Gaussian distribution.
One-dimensional Gaussian distribution:
If a random variable X obeys a probability distribution with location parameter μ and scale parameter σ, its probability density function is:

f(x) = (1/(σ√(2π))) · exp(−(x−μ)²/(2σ²))

where μ is the mean and σ is the standard deviation.
Then X is called a normal random variable, and the distribution it obeys is called a normal distribution, denoted X ~ N(μ, σ²), read as "X obeys N(μ, σ²)", or "X obeys a Gaussian distribution".
That's all for the Gaussian distribution for now; for the details of the formula's derivation, go back to a probability theory textbook.
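As a quick sanity check on the density formula above, here is a minimal Python sketch (standard library only) that evaluates it; the parameter values in the example are illustrative, not taken from the text:

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x: (1/(sigma*sqrt(2*pi))) * exp(-(x-mu)^2 / (2*sigma^2))."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# The density peaks at the mean: for the standard normal N(0, 1),
# the value at x = 0 is 1/sqrt(2*pi) ≈ 0.3989.
print(gaussian_pdf(0.0, 0.0, 1.0))
```

Note the symmetry around μ: the density at μ + d equals the density at μ − d, which is the bell shape mentioned above.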
Maximum likelihood estimation
The first time I saw this name I was a little confused and didn't know what it meant, but after reading its English name, Maximum Likelihood Estimate (MLE), it becomes easier to understand, doesn't it?
Its main purpose is to use known sample results to infer, in reverse, the parameter values that are most likely (i.e., with maximum probability) to have produced those results.
When the model is known but its parameters are unknown, we run several experiments, observe the results, and then find the parameter values that maximize the probability of the observed sample; this is maximum likelihood estimation.
Let's learn maximum likelihood estimation through an example.
Example:
Suppose the lifetime of a batch of batteries follows the Gaussian distribution whose probability density is the formula above, where μ and σ > 0 are unknown parameters. Three batteries are randomly selected from this batch for a life test; their failure times are 4, 5, and 7.
Find the maximum likelihood estimates of μ and σ.
First, each data point is generated independently of the others. Since the events (i.e., the battery life tests) are independent, the total probability of observing all the data is the product of the probabilities of observing each data point individually (i.e., the product of the marginal probabilities).
1. Write the joint probability density (the likelihood) of the sample:

L(μ, σ) = ∏ᵢ₌₁³ (1/(σ√(2π))) · exp(−(xᵢ−μ)²/(2σ²))

We only need to find the μ and σ that maximize L(μ, σ).
But maximizing this product directly is difficult, so we convert it to a log-likelihood function.
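To make the product of marginal probabilities concrete, a small sketch that evaluates L(μ, σ) for the observed failure times 4, 5, 7; the trial parameter values are illustrative only:

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def likelihood(data, mu, sigma):
    """L(mu, sigma): product of the marginal densities of independent observations."""
    result = 1.0
    for x in data:
        result *= gaussian_pdf(x, mu, sigma)
    return result

data = [4, 5, 7]  # observed failure times
# The likelihood is larger for parameters near the data than for a bad guess:
print(likelihood(data, 5.3, 1.3))   # near the data
print(likelihood(data, 10.0, 1.3))  # far from the data: much smaller
```

Trying a few (μ, σ) pairs this way shows why we want the pair that maximizes L.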
2. The log-likelihood function is:

ln L(μ, σ) = −(3/2)·ln(2π) − 3·ln σ − (1/(2σ²))·∑ᵢ₌₁³ (xᵢ−μ)²

What remains is the familiar problem of finding the extremum of a function of two variables.
3. First find the mean μ: take the partial derivative of the function with respect to μ and set it to zero:

∂ln L/∂μ = (1/σ²)·∑ᵢ₌₁³ (xᵢ−μ) = 0

which gives μ = (4 + 5 + 7)/3 = 16/3 ≈ 5.333.
Similarly, to find σ², take the partial derivative with respect to σ and set it to zero:

∂ln L/∂σ = −3/σ + (1/σ³)·∑ᵢ₌₁³ (xᵢ−μ)² = 0

Substituting μ = 16/3 gives σ² = 14/9 ≈ 1.56, i.e.
σ ≈ 1.25
Think about the shape of the Gaussian curve and you will see that the likelihood equations have a unique solution, and it must be the maximum ~
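The closed-form result is easy to check numerically: the MLE of μ is the sample mean, and the MLE of σ² is the divide-by-n (not n−1) sample variance. A quick sketch:

```python
data = [4, 5, 7]  # observed failure times
n = len(data)

mu_hat = sum(data) / n                              # MLE of mu: the sample mean
var_hat = sum((x - mu_hat) ** 2 for x in data) / n  # MLE of sigma^2: divide by n, not n-1
sigma_hat = var_hat ** 0.5

print(mu_hat, var_hat, sigma_hat)
```

Note that the MLE of the variance divides by n, which differs from the unbiased sample variance that divides by n−1.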
Conclusion:
The two most important steps in finding a maximum likelihood estimate:
(1) Write the likelihood function and take its logarithm to get the log-likelihood function;
(2) Take the partial derivatives, set them to zero, and solve the resulting equations.