Three methods of parameter estimation

1. Bayesian parameter estimation

Probability and Statistics

  • Probability: given a data-generating process, study the properties of the observed data; model and parameters -> data; deduction
  • Statistics: given observed data, reason backwards about the process that generated it; data -> model and parameters; induction
  • Relationship: probability theory is the mathematical foundation of statistics, and statistics is an application of probability theory.

Descriptive statistics and inferential statistics

  • Descriptive statistics: summarize or depict the basic features of the observations (mean, variance, median, quartiles, etc.)
  • Inferential statistics: infer the situation of the population from partial data (parametric statistics, non-parametric statistics, estimators, the true distribution, the empirical distribution)

"Likelihood" and "probability":

  • In English: likelihood (likelihood) and probability (probability) all refer to the likelihood of an event occurring
  • In statistics: the probability is known parameters to predict the result of the possibility, the likelihood is known results, the parameters are predicting the likelihood of a certain value.
  • For the function \ (P (x | \ theta ) \)
    • If \ (\ theta \) is known and remains unchanged, \ (the X-\) is a variable, the function \ (P (x | \ theta ) \) called probability function, represent different (x \) \ probability of
    • If \ (X \) is known and remains constant, \ (\ Theta \) is a variable, the function \ (P (x | \ theta ) \) is called the likelihood function, representing different \ (\ Theta \) under , \ (the X-\) probability of occurrence, also denoted \ (L (\ theta | x ) \) or \ (L (X; \ theta ) \) or \ (f (x; \ theta ) \)
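To make the two readings concrete, here is a minimal Python sketch using a Binomial model; the sample size n = 10 and the specific values of \(x\) and \(\theta\) below are assumptions chosen for illustration.

```python
from math import comb

# The same function P(x | theta) read two ways, using a Binomial model
# P(x | theta) = C(n, x) * theta^x * (1 - theta)^(n - x) with n = 10 (assumed).
n = 10

def p(x, theta):
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

# Probability function: theta fixed at 0.5, x varies.
print([round(p(x, 0.5), 3) for x in (3, 5, 7)])      # [0.117, 0.246, 0.117]

# Likelihood function: x fixed at 7, theta varies -- L(theta | x=7).
print([round(p(7, t), 3) for t in (0.3, 0.5, 0.7)])  # [0.009, 0.117, 0.267]
```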

The frequentist school and the Bayesian school

  • The frequentist and Bayesian schools differ only in the angle from which they approach problems
  • The frequentist school takes the viewpoint of "nature": the model parameters are objective, fixed quantities; sample information comes from the population; by studying samples one can make reasonable inferences and estimates about the population, and the more samples, the more accurate these become
  • The Bayesian school takes the viewpoint of the "observer": unknown parameters may be approached subjectively; any unknown quantity can be regarded as random and should be described by a probability distribution
    • The representative method of the frequentist school is maximum likelihood estimation; the representative method of the Bayesian school is maximum a posteriori estimation.
    • Frequentists treat probability primarily as frequency; Bayesians treat probability primarily as degree of belief

Bayes' formula: \(P(A|B) = \frac{P(B|A)}{P(B)} P(A)\) (a numerical example follows the list below)

  • \(P(A|B)\) is the conditional probability of A given that B has occurred; because it is conditioned on the observed value of B, it is called the posterior probability of A. It represents the degree of belief in event A after event B has occurred
  • \(P(A)\) is the prior probability (or marginal probability) of A; it represents the degree of belief in event A beforehand
  • \(P(B|A)\) is the conditional probability of B given that A has occurred; because it is conditioned on the value of A, it is called the posterior probability of B. It is also referred to as the likelihood function
  • \(P(B)\) is the prior probability (or marginal probability) of B, and acts as a normalizing constant
  • \(\frac{P(B|A)}{P(B)}\) is called the standardized likelihood ratio; it indicates how much support the occurrence of event B provides for event A
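As a quick numerical illustration of the formula, consider a hypothetical disease-testing scenario in which A is "has the disease" and B is "the test is positive"; all numbers are invented for illustration.

```python
# Bayes' formula P(A|B) = P(B|A) * P(A) / P(B) on made-up numbers.
p_a = 0.01              # P(A): prior probability of having the disease
p_b_given_a = 0.95      # P(B|A): probability of a positive test given the disease
p_b_given_not_a = 0.05  # P(B|~A): false-positive rate

# P(B): marginal probability of a positive test, the normalizing constant.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

p_a_given_b = p_b_given_a * p_a / p_b

print(f"P(B)   = {p_b:.4f}")          # 0.0590
print(f"P(A|B) = {p_a_given_b:.4f}")  # 0.1610
```

The standardized likelihood ratio here is \(0.95 / 0.059 \approx 16\), which is what scales the 1% prior belief up to a roughly 16% posterior belief.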

1.2. Maximum likelihood estimation (MLE)

Maximum likelihood estimation regards the parameter \(\theta\) as a fixed but unknown value. The idea is that the \(\theta\) that maximizes the probability \(P(X|\theta)\) of the observed sample is the best \(\theta\).

Maximum likelihood estimation solution steps (a worked sketch follows the list):

  1. Write down the likelihood of a single sample
  2. Write down the likelihood function of the whole sample, \(L(X;\theta)\)
  3. Take the logarithm of the likelihood function
  4. Maximize the log-likelihood function (take derivatives, solve the likelihood equations)
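A minimal sketch of the four steps for a Bernoulli model follows; the ten coin flips are made-up data. For this model the maximum has the well-known closed form \(\hat{\theta} = k/n\), which the numerical search recovers.

```python
import numpy as np

# Assumed data: 10 Bernoulli trials (coin flips), 7 successes.
x = np.array([1, 1, 0, 1, 1, 0, 1, 1, 0, 1])
n, k = len(x), x.sum()

# Steps 1-2: single-sample likelihood P(x_i|theta) = theta^x_i * (1-theta)^(1-x_i),
# so the whole-sample likelihood is L(X; theta) = theta^k * (1-theta)^(n-k).
# Step 3: take the logarithm: log L = k*log(theta) + (n-k)*log(1-theta).
def log_likelihood(theta):
    return k * np.log(theta) + (n - k) * np.log(1 - theta)

# Step 4: maximize. Setting dlogL/dtheta = k/theta - (n-k)/(1-theta) = 0
# gives theta_hat = k/n; a grid search confirms it numerically.
grid = np.linspace(0.001, 0.999, 999)
theta_numeric = grid[np.argmax(log_likelihood(grid))]

print(f"closed form k/n = {k / n:.3f}")  # 0.700
print(f"grid maximum    = {theta_numeric:.3f}")
```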

1.3. Maximum a posteriori probability estimation (MAP)

Unlike maximum likelihood, maximum a posteriori estimation assumes that \(\theta\) follows a certain probability distribution, called the prior distribution. When solving, in addition to the likelihood function \(P(X|\theta)\) we must also take into account the prior distribution of \(\theta\), \(P(\theta)\). The \(\theta\) at which \(P(X|\theta)P(\theta)\) attains its maximum is therefore considered the best \(\theta\).

  • Since the distribution of X, \(P(X)\), is fixed, the function to maximize can equivalently be written as \(\frac{P(X|\theta)P(\theta)}{P(X)} = P(\theta|X)\)

Solution steps for maximum a posteriori estimation (a worked sketch follows the list):

  1. Determine the prior distribution of the parameter, \(P(\theta)\), and the likelihood function \(L(X;\theta)\)
  2. Determine the (unnormalized) posterior distribution function of the parameter, \(L(X;\theta)P(\theta)\)
  3. Take the logarithm of the posterior distribution function
  4. Maximize the logarithmic function (take derivatives, solve the equations)
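Continuing the Bernoulli example, here is a sketch of the MAP steps with an assumed Beta(2, 2) prior on \(\theta\); both the data and the prior hyperparameters are illustrative choices.

```python
import numpy as np

# Same assumed data as before: 10 flips, 7 heads.
x = np.array([1, 1, 0, 1, 1, 0, 1, 1, 0, 1])
n, k = len(x), x.sum()
a, b = 2.0, 2.0  # assumed Beta(a, b) prior, weakly favoring theta near 0.5

# Steps 1-3: log posterior (up to an additive constant) = log L(X; theta) + log P(theta).
def log_posterior(theta):
    log_lik = k * np.log(theta) + (n - k) * np.log(1 - theta)
    log_prior = (a - 1) * np.log(theta) + (b - 1) * np.log(1 - theta)
    return log_lik + log_prior

# Step 4: maximize. For this conjugate pair the maximizer has the closed
# form (k + a - 1) / (n + a + b - 2), the mode of Beta(a + k, b + n - k).
grid = np.linspace(0.001, 0.999, 999)
theta_map = grid[np.argmax(log_posterior(grid))]

print(f"closed-form MAP = {(k + a - 1) / (n + a + b - 2):.3f}")  # 0.667
print(f"grid maximum    = {theta_map:.3f}")
```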

1.4. Bayesian estimation

Bayesian estimation is an extension of maximum a posteriori estimation. Here the value of the parameter \(\theta\) is not estimated directly; instead the parameter is allowed to follow a probability distribution. Maximum likelihood estimation and maximum a posteriori estimation both compute a point value for \(\theta\); Bayesian estimation does not. It extends MAP by combining the prior distribution of the parameter, \(P(\theta)\), with the observations X (whose distribution \(P(X)\) can no longer be ignored) to obtain the posterior distribution of the parameter, \(P(\theta|X)\), and then takes the expected value of that posterior as the final estimate. A variance of the parameter can also be computed, to assess the accuracy of, or confidence in, the estimate.

Solution steps for Bayesian estimation:

  1. Determine the likelihood function of the parameter, \(P(X|\theta)\)
  2. Determine the prior distribution of the parameter, \(P(\theta)\); it should be a conjugate prior, so that the posterior has the same form as the prior
  3. Solve for the posterior distribution of the parameter using Bayes' formula
    • \(P(\theta|X)=\frac{P(X|\theta)P(\theta)}{\int P(X|\theta)P(\theta)d\theta}\)
  4. Compute the Bayesian estimate (a worked sketch follows this list)
    • \(\hat{\theta}=\int \theta p(\theta|X)d\theta\)
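For the Beta-Bernoulli setup used above the integral in step 3 has a closed form, because the Beta prior is conjugate to the Bernoulli likelihood: the posterior is again a Beta distribution. A sketch with the same assumed data and prior:

```python
import numpy as np

# Same assumed data and prior as in the earlier sketches.
x = np.array([1, 1, 0, 1, 1, 0, 1, 1, 0, 1])
n, k = len(x), x.sum()
a, b = 2.0, 2.0  # assumed Beta prior

# Steps 1-3: by conjugacy, the posterior P(theta|X) = Beta(a + k, b + n - k).
a_post, b_post = a + k, b + n - k

# Step 4: the Bayesian estimate is the posterior mean; the posterior
# variance indicates how confident the estimate is.
theta_hat = a_post / (a_post + b_post)
theta_var = a_post * b_post / ((a_post + b_post) ** 2 * (a_post + b_post + 1))

print(f"posterior: Beta({a_post:.0f}, {b_post:.0f})")  # Beta(9, 5)
print(f"posterior mean     = {theta_hat:.3f}")         # 0.643
print(f"posterior variance = {theta_var:.5f}")         # 0.01531
```

Note how the three point estimates differ on the same data: MLE 0.700, MAP 0.667, posterior mean 0.643; the prior pulls the Bayesian answers toward 0.5.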

1.5. When the MAP estimate equals the maximum likelihood estimate

When the prior distribution is uniform (there is no prior information; in this case the Bayesian method is equivalent to the frequentist method), the MAP estimate equals the MLE. Intuitively, a uniform prior expresses the absence of any prior knowledge about which parameter value is most likely. In this case all of the weight is assigned to the likelihood function, so when we multiply the likelihood by the prior we obtain a posterior that is essentially identical to the likelihood. Maximum likelihood can therefore be seen as a special case of MAP, as the sketch below shows.
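A quick check using the Beta-Bernoulli closed form from the earlier sketches: a Beta(1, 1) prior is exactly the uniform distribution on [0, 1], and with it the MAP estimate collapses to the MLE.

```python
# With a uniform Beta(1, 1) prior, the MAP closed form reduces to the MLE:
# (k + a - 1) / (n + a + b - 2) = k / n.
n, k = 10, 7  # same assumed data as before
a = b = 1.0   # uniform prior on [0, 1]

print((k + a - 1) / (n + a + b - 2))  # 0.7, the MAP estimate
print(k / n)                          # 0.7, the MLE
```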

As the amount of data increases, the prior becomes ever weaker and the data play an ever larger role, so the parameter distribution moves toward the maximum likelihood estimate. It can also be shown that the posterior estimate is a convex combination of the prior and the maximum likelihood estimate, as the sketch below illustrates.
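For the Beta-Bernoulli posterior mean this convex combination is exact: the posterior mean equals \(\lambda\) times the prior mean plus \(1-\lambda\) times the MLE, with \(\lambda = \frac{a+b}{a+b+n}\), so the weight of the prior shrinks toward zero as \(n\) grows. A sketch with the same assumed numbers:

```python
# Posterior mean as a convex combination of prior mean and MLE (Beta-Bernoulli).
n, k = 10, 7     # assumed data, as in the earlier sketches
a, b = 2.0, 2.0  # assumed Beta prior

prior_mean = a / (a + b)
mle = k / n
lam = (a + b) / (a + b + n)  # weight on the prior; -> 0 as n grows

print(f"{lam * prior_mean + (1 - lam) * mle:.3f}")  # 0.643
print(f"{(a + k) / (a + b + n):.3f}")               # 0.643, the direct posterior mean
```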
