(Repost) Gamma Distribution, Beta Distribution, Multinomial Distribution, and Dirichlet Distribution

1. Gamma Function

First, let us look at the definition of the Gamma function:

$$\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\, dt \quad (x > 0)$$

The important properties of the Gamma function include the following:

1. Recurrence formula:

$$\Gamma(x+1) = x\,\Gamma(x)$$

2. For any positive integer $n$:

$$\Gamma(n) = (n-1)!$$

Therefore the Gamma function can be seen as a generalization of the factorial.

3. $\Gamma(1) = 1$

4. $\Gamma(1/2) = \sqrt{\pi}$

The recurrence formula can be proved using integration by parts:

$$\Gamma(x+1) = \int_0^\infty t^{x} e^{-t}\, dt = \left[-t^{x} e^{-t}\right]_0^\infty + x\int_0^\infty t^{x-1} e^{-t}\, dt = x\,\Gamma(x)$$
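These properties are easy to check numerically. The following is a small illustrative sketch (not from the original post) using only Python's standard library, where `math.gamma` implements the Gamma function:

```python
import math

# Recurrence: Gamma(x + 1) = x * Gamma(x), for a few arbitrary test points.
for x in [0.5, 1.7, 3.2]:
    assert math.isclose(math.gamma(x + 1), x * math.gamma(x))

# For positive integers n: Gamma(n) = (n - 1)!
for n in range(1, 8):
    assert math.gamma(n) == math.factorial(n - 1)

# Gamma(1) = 1 and Gamma(1/2) = sqrt(pi).
assert math.gamma(1) == 1.0
assert math.isclose(math.gamma(0.5), math.sqrt(math.pi))
```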

2. Beta Function

The Beta function (B function), also known as the Euler integral of the first kind, is a special function defined as follows:

$$B(\alpha, \beta) = \int_0^1 t^{\alpha-1}(1-t)^{\beta-1}\, dt \quad (\alpha, \beta > 0)$$

The Beta function has the following properties:

1. Symmetry: $B(\alpha, \beta) = B(\beta, \alpha)$

2. Relation to the Gamma function: $B(\alpha, \beta) = \dfrac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$
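The Gamma-function identity can be checked against a direct numerical evaluation of the defining integral. The sketch below is illustrative Python (standard library only; the midpoint rule and the step count are arbitrary choices, not part of the original post):

```python
import math

def beta_fn(a, b):
    """Beta function via the Gamma identity: B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def beta_integral(a, b, steps=200_000):
    """Direct numerical evaluation of the integral of t^(a-1) * (1-t)^(b-1) over [0, 1] (midpoint rule)."""
    h = 1.0 / steps
    return sum(((i + 0.5) * h) ** (a - 1) * (1 - (i + 0.5) * h) ** (b - 1)
               for i in range(steps)) * h

a, b = 2.5, 3.0
assert math.isclose(beta_fn(a, b), beta_integral(a, b), rel_tol=1e-4)  # Gamma identity
assert math.isclose(beta_fn(a, b), beta_fn(b, a))                      # symmetry
```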

3. Beta distribution

Before introducing the Beta distribution, we need to be clear about the concepts of prior probability, posterior probability, likelihood function, and conjugate distribution.

1. Informally, the prior probability is our estimate of the probability of an event before it has happened. A prior probability computed from past historical data is called an objective prior; when historical data is unavailable or incomplete, a prior judged from subjective experience is called a subjective prior. For example, taking the probability of a coin landing heads up to be 0.5 is a subjective prior.
2. The posterior probability is the probability obtained by revising the prior probability via Bayes' formula after acquiring new or additional information, for example through investigation.
3. The difference between the two: the prior probability is not computed from complete information about the state of nature, but from existing material (mainly historical data); the posterior probability uses more comprehensive information, both the prior data and the additional evidence. The prior describes the probability of a variable in the absence of evidence; the posterior is the probability of an outcome after the experiment has been performed and a certain event has occurred. In other words, the posterior is a conditional probability given the observed facts.
4. Conjugate distribution (conjugacy): the posterior probability distribution has the same functional form as the prior probability distribution.

The relationship between the prior probability and the posterior probability is:

$$\text{posterior} \propto \text{likelihood} \times \text{prior}$$

The probability density function of the Beta distribution is:

$$f(x;\alpha,\beta) = \frac{1}{B(\alpha,\beta)}\, x^{\alpha-1}(1-x)^{\beta-1}, \quad 0 \le x \le 1$$

A random variable $X$ that follows a Beta distribution with parameters $\alpha$ and $\beta$ is usually written as:

$$X \sim \mathrm{Beta}(\alpha, \beta)$$

The relationship between the Beta distribution and the Gamma function is as follows:

$$B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}, \qquad f(x;\alpha,\beta) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha-1}(1-x)^{\beta-1}$$

In one sentence: the Beta distribution can be viewed as a distribution over probabilities. When we do not know the exact value of a probability, it describes how plausible each possible value of that probability is.

The expectation and variance of the Beta distribution are as follows:

$$E[X] = \frac{\alpha}{\alpha+\beta}, \qquad \mathrm{Var}(X) = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$$
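The density and the expectation formula can be cross-checked numerically. This is a minimal illustrative Python sketch (standard library only; the parameter values and step count are arbitrary):

```python
import math

def beta_pdf(x, a, b):
    """Beta(a, b) density: x^(a-1) * (1-x)^(b-1) / B(a, b)."""
    B = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return x ** (a - 1) * (1 - x) ** (b - 1) / B

a, b = 2.0, 5.0
mean = a / (a + b)                          # expectation formula
var = a * b / ((a + b) ** 2 * (a + b + 1))  # variance formula

# Check the mean against numerical integration of x * f(x) over [0, 1].
steps = 100_000
h = 1.0 / steps
num_mean = sum((i + 0.5) * h * beta_pdf((i + 0.5) * h, a, b)
               for i in range(steps)) * h
assert math.isclose(mean, num_mean, rel_tol=1e-6)
```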

4. The Beta distribution is the conjugate prior of the binomial distribution

This conclusion is important and widely used in practice.
Before proving it, let us briefly recall the Bernoulli distribution and the binomial distribution.
The Bernoulli distribution, also called the 0-1 distribution, arises from the Bernoulli trial (Bernoulli experiment).

A Bernoulli trial is a single random experiment with only two possible outcomes, that is, for a random variable $X$:

$$X \in \{0, 1\}$$

Bernoulli trials are, in essence, "yes or no" experiments. The most common example is a coin toss.
If in a Bernoulli trial the probability of success ($X = 1$) is $p$ ($0 \le p \le 1$) and the probability of failure ($X = 0$) is $1 - p$, we say the random variable $X$ follows a Bernoulli distribution.

The binomial distribution is the discrete probability distribution of the number of successes in $n$ repeated Bernoulli trials.
If an experiment $E$ consists of $n$ repeated Bernoulli trials, each with success probability $p$, and $X$ denotes the number of successes, then $X$ follows a binomial distribution, written $X \sim B(n, p)$, with probability mass function:

$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, \ldots, n$$

From the definitions above it is apparent that the Bernoulli distribution is the special case of the binomial distribution with $n = 1$.
The most widely used example of the binomial distribution is again the coin toss: assume the probability of the coin landing face up is $p$ and the coin is tossed $n$ times; the probability of getting exactly $k$ heads is given by the binomial distribution.

With little experimental data, directly using maximum likelihood estimation for the binomial parameter can lead to overfitting. For example, if a coin is tossed three times and comes up heads every time, maximum likelihood naturally predicts that all future tosses will be heads. To avoid this, we can introduce a prior probability distribution $p(\mu)$ over the parameter $\mu$ to prevent overfitting. So how should we choose $p(\mu)$?

As we mentioned earlier, the relationship between the prior probability and the posterior probability is:

$$\text{posterior} \propto \text{likelihood} \times \text{prior}$$

$$p(\mu \mid D) \propto p(D \mid \mu)\, p(\mu)$$

The likelihood function of the binomial distribution is:

$$p(D \mid \mu) = \binom{n}{k}\, \mu^{k} (1-\mu)^{n-k}$$

If we choose a prior $p(\mu)$ that is proportional to a product of powers of $\mu$ and $(1-\mu)$, then the posterior distribution will have the same form as the prior, that is, the prior will be conjugate to the posterior distribution.

From the third section, we know that the probability density function of the Beta distribution is:

$$f(\mu;\alpha,\beta) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, \mu^{\alpha-1}(1-\mu)^{\beta-1}$$

This exactly meets the requirement above! So the Beta distribution is the conjugate prior of the binomial distribution: with a $\mathrm{Beta}(\alpha, \beta)$ prior and $k$ successes observed in $n$ trials, the posterior is $\mathrm{Beta}(\alpha + k,\, \beta + n - k)$.
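Under a $\mathrm{Beta}(\alpha, \beta)$ prior with $k$ successes in $n$ trials, the posterior is $\mathrm{Beta}(\alpha+k, \beta+n-k)$. The sketch below (illustrative Python, standard library only; the parameter and data values are arbitrary) checks numerically that likelihood times prior is proportional to exactly this posterior:

```python
import math

def beta_pdf(x, a, b):
    """Beta(a, b) density."""
    B = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return x ** (a - 1) * (1 - x) ** (b - 1) / B

# Prior Beta(alpha, beta); data D: k heads in n coin tosses.
alpha, beta, n, k = 2.0, 2.0, 10, 7

def posterior(mu):
    """Conjugacy claim: the posterior is Beta(alpha + k, beta + n - k)."""
    return beta_pdf(mu, alpha + k, beta + n - k)

def unnormalized(mu):
    """Binomial likelihood times Beta prior."""
    likelihood = math.comb(n, k) * mu ** k * (1 - mu) ** (n - k)
    return likelihood * beta_pdf(mu, alpha, beta)

# If conjugacy holds, this ratio is the same constant (the evidence p(D)) at every mu.
ratios = [unnormalized(mu) / posterior(mu) for mu in (0.2, 0.5, 0.8)]
assert all(math.isclose(r, ratios[0]) for r in ratios)
```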

5. Multinomial distribution

Extending the binomial distribution gives the multinomial distribution. The binomial distribution is based on $n$ Bernoulli trials, each with only two possible outcomes. Now suppose we still perform $n$ trials, but each trial has $m$ mutually exclusive possible outcomes whose probabilities sum to 1; the distribution of the number of times each outcome occurs is the multinomial distribution.
Throwing dice is a typical multinomial distribution. A die has six faces, so there are six mutually exclusive outcomes with probabilities $p_1, \ldots, p_6$ (each would be $1/6$ for a fair die, but they need not be, e.g., for an irregularly shaped die; they only need to be mutually exclusive and sum to 1). If the die is thrown $n$ times repeatedly, we can ask, for instance, for the probability that a six comes up exactly $k$ times.

The general probability mass function of the multinomial distribution is as follows. The experiment is carried out $n$ times; let $m_i$ denote the number of times the $i$-th outcome occurs. Then:

$$P(X_1 = m_1, \ldots, X_k = m_k) = \frac{n!}{m_1!\, m_2! \cdots m_k!}\, p_1^{m_1} p_2^{m_2} \cdots p_k^{m_k}$$

where

$$\sum_{i=1}^{k} m_i = n, \qquad \sum_{i=1}^{k} p_i = 1$$

A brief derivation of the probability mass function:
There are $k$ possible mutually exclusive outcomes and $n$ trials; each trial yields outcome $i$ with probability $p_i$.
The probability that the first outcome occurs $m_1$ times, the second $m_2$ times, ..., and the $k$-th $m_k$ times is:

$$\binom{n}{m_1}\binom{n - m_1}{m_2} \cdots \binom{n - m_1 - \cdots - m_{k-1}}{m_k}\, p_1^{m_1} p_2^{m_2} \cdots p_k^{m_k}$$

Expanding the binomial coefficients and cancelling gives the result above.
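The pmf can be implemented directly from the formula. The sketch below (illustrative Python, standard library only; the probabilities and $n$ are arbitrary) also checks that the pmf sums to 1 over all valid count vectors:

```python
import math

def multinomial_pmf(counts, probs):
    """P(X1=m1,...,Xk=mk) = n! / (m1! ... mk!) * p1^m1 * ... * pk^mk."""
    n = sum(counts)
    coef = math.factorial(n)
    for m in counts:
        coef //= math.factorial(m)  # multinomial coefficient, computed exactly
    return coef * math.prod(p ** m for p, m in zip(probs, counts))

probs = [0.5, 0.3, 0.2]
n = 4

# The pmf summed over all count vectors with m1 + m2 + m3 = n must equal 1.
total = sum(multinomial_pmf((m1, m2, n - m1 - m2), probs)
            for m1 in range(n + 1) for m2 in range(n + 1 - m1))
assert math.isclose(total, 1.0)
```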

6. Dirichlet distribution

Earlier we showed that the Beta distribution is the conjugate prior of the binomial distribution; likewise, the Dirichlet distribution is the conjugate prior of the multinomial distribution.
The Dirichlet distribution can also be seen as the extension of the Beta distribution to the multivariate case. Its probability density function is defined as follows:

$$f(x_1, \ldots, x_k; \alpha_1, \ldots, \alpha_k) = \frac{1}{B(\boldsymbol{\alpha})} \prod_{i=1}^{k} x_i^{\alpha_i - 1}$$

where

$$x_i \ge 0, \qquad \sum_{i=1}^{k} x_i = 1$$

and $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_k)$, with $\alpha_i > 0$, are the parameters of the Dirichlet distribution.

$B(\boldsymbol{\alpha})$ denotes the normalization constant of the Dirichlet distribution:

$$B(\boldsymbol{\alpha}) = \frac{\prod_{i=1}^{k} \Gamma(\alpha_i)}{\Gamma\!\left(\sum_{i=1}^{k} \alpha_i\right)}$$

Similar to the Beta function, the following equation holds:

$$B(\boldsymbol{\alpha}) = \int \prod_{i=1}^{k} x_i^{\alpha_i - 1}\, d\mathbf{x}$$

where the integral is taken over the simplex $\sum_i x_i = 1$, $x_i \ge 0$.

The expectation of the Dirichlet distribution is:

$$E[X_i] = \frac{\alpha_i}{\sum_{j=1}^{k} \alpha_j}$$
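The expectation formula can be checked by sampling. A standard way to draw from a Dirichlet is to normalize independent Gamma draws (this also illustrates the Gamma-Dirichlet connection); the sketch below is illustrative Python using `random.gammavariate` from the standard library, with arbitrary parameter values:

```python
import math
import random

def sample_dirichlet(alpha, rng):
    """Sample Dirichlet(alpha) by normalizing independent Gamma(alpha_i, 1) draws."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    s = sum(g)
    return [x / s for x in g]

alpha = [2.0, 3.0, 5.0]
rng = random.Random(0)
samples = [sample_dirichlet(alpha, rng) for _ in range(100_000)]

# Every sample lies on the simplex: its components sum to 1.
assert all(math.isclose(sum(s), 1.0) for s in samples[:100])

# Empirical means should be close to alpha_i / sum(alpha) = [0.2, 0.3, 0.5].
means = [sum(s[i] for s in samples) / len(samples) for i in range(3)]
expected = [a / sum(alpha) for a in alpha]
assert all(math.isclose(m, e, abs_tol=0.01) for m, e in zip(means, expected))
```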

Reference: https://blog.csdn.net/bitcarmanlee/article/details/82156281#commentBox


Source: www.cnblogs.com/boceng/p/11748246.html