Learning about probability distributions

Conjugate prior:

Conjugacy is a concept from Bayesian theory. It generally refers to the relationship between the prior distribution and the likelihood function: the prior is conjugate to the likelihood if the resulting posterior distribution has the same functional form as the prior, that is, both belong to the same family of distributions. (By Bayes' formula, the posterior distribution is proportional to the product of the prior distribution and the likelihood function.)

There are two reasons for using a conjugate prior. First, a posterior with the same form as the prior is intuitive: both describe the same parameter, so they should have the same form. Second, it allows priors to be chained: the posterior p(θ | x) from the current calculation can serve as the prior p(θ) in the next one, and because the form never changes, the updates form a chain.
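The chaining idea can be sketched with a concrete conjugate pair. The choice of a Beta prior with binomial data is an assumption made here for illustration (the text above does not name a specific pair), and the function name and the counts are hypothetical. With a Beta(alpha, beta) prior on the success probability, observing s successes and f failures gives a Beta(alpha + s, beta + f) posterior, so the update is just addition:

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Return the Beta posterior parameters after observing binomial data.

    Assumes a Beta(alpha, beta) prior on the success probability theta.
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Start from a uniform prior, Beta(1, 1).
prior = (1.0, 1.0)

# First batch of (made-up) data: 7 successes, 3 failures.
posterior_1 = beta_binomial_update(*prior, successes=7, failures=3)

# Chain: the first posterior is used as the prior for the second batch.
posterior_2 = beta_binomial_update(*posterior_1, successes=2, failures=8)

# Processing all the data at once gives the same posterior.
posterior_all = beta_binomial_update(*prior, successes=9, failures=11)

print(posterior_1, posterior_2, posterior_all)
print(beta_mean(*posterior_2))
```

Because the posterior stays in the Beta family, each step only has to track two numbers, and the order in which the batches arrive does not matter.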

 

Likelihood function:

In Bayes' formula p(θ | x) = p(x | θ) p(θ) / p(x), the term p(x | θ) is called the likelihood function. At first sight this looks different from the L(θ | x) used in maximum likelihood estimation, but in fact L(θ | x) = p(x | θ). Explanation: given the observed value x, the likelihood of the parameter θ equals the probability of x given the parameter θ. Note: in maximum likelihood estimation θ is a fixed unknown rather than a random variable, so the vertical bar in p(x | θ) does not denote a conditional probability; it only indicates that θ takes a particular value.
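A small sketch may make the two readings of the same expression concrete. The binomial model, the data (7 successes in 10 trials) and the grid search are assumptions chosen for illustration, not something the text above prescribes. The same function is a probability distribution when θ is fixed and x varies, and a likelihood when x is fixed and θ varies:

```python
import math


def binomial_pmf(k, n, theta):
    """p(x | theta): probability of k successes in n trials."""
    return math.comb(n, k) * theta**k * (1 - theta) ** (n - k)


def likelihood(theta, k, n):
    """L(theta | x): the same formula, read as a function of theta."""
    return binomial_pmf(k, n, theta)


k, n = 7, 10  # hypothetical observed data

# With theta fixed, summing over all possible outcomes gives 1.
total_over_x = sum(binomial_pmf(j, n, 0.3) for j in range(n + 1))

# With the data fixed, scan theta on a grid and keep the maximizer.
grid = [i / 100 for i in range(101)]
theta_hat = max(grid, key=lambda t: likelihood(t, k, n))

print(total_over_x, theta_hat)
```

The grid maximizer coincides with the closed-form maximum likelihood estimate k / n for this model.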

 

Probability density function:

The probability distribution function (cumulative distribution function) gives the probability that the variable takes a value less than or equal to a given value x, F(x) = P(X ≤ x):

For a discrete variable it is a cumulative sum, F(x) = P(x1) + P(x2) + ... + P(xk), taken over all values xi ≤ x;

For a continuous variable it is the integral of the probability density function f(t) from -∞ to x.

The discrete probability function (probability mass function) describes a discrete variable: it gives the probability of each value as a function, P(x) for x = x1, x2, x3, ....

The probability density function describes a continuous variable. Its value f(a) at a point a is not the probability of the event {X = a}; for a continuous variable that probability is 0, and f(a) may even exceed 1. However, the larger f(a) is, the more likely X is to fall near a. A density yields a probability only over an interval: P(a ≤ X ≤ b) equals the integral of f(x) over [a, b].
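The three functions described in this section can be compared in a short sketch. The fair die and the exponential distribution with rate 2 are assumptions picked for illustration, and all function names are hypothetical. The discrete distribution function is a running sum of probabilities, while the continuous one is obtained by integrating the density (here with a simple midpoint rule):

```python
import math

# Discrete case: a fair six-sided die.
pmf = {value: 1 / 6 for value in range(1, 7)}


def discrete_cdf(x):
    """F(x) = P(X <= x), summing the probability of every value <= x."""
    return sum(p for value, p in pmf.items() if value <= x)


# Continuous case: exponential distribution with rate 2.
RATE = 2.0


def pdf(x):
    """Density f(x); note that f(0) = 2, which is not a probability."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0


def cdf(x):
    """Closed-form distribution function of the exponential distribution."""
    return 1 - math.exp(-RATE * x) if x >= 0 else 0.0


def interval_probability(a, b, steps=10_000):
    """P(a <= X <= b), integrating the density with the midpoint rule."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width


print(discrete_cdf(3))
print(pdf(0.0))
print(interval_probability(0.1, 0.5), cdf(0.5) - cdf(0.1))
```

The last line prints the numerical integral next to the difference of the closed-form distribution function, and the two agree closely.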

 

Binomial distribution (p is the probability of success in a single trial, here and below):

The Bernoulli (0-1) distribution is the special case of the binomial distribution with n = 1.
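As a minimal sketch, the binomial probability of k successes in n independent trials is C(n, k) p^k (1 - p)^(n - k) for k = 0, 1, ..., n. The symbols k and n and the example values n = 10, p = 0.3 are notation assumed here; only p comes from the heading above. Setting n = 1 recovers the Bernoulli case:

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) = C(n, k) * p**k * (1 - p)**(n - k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.3  # example values chosen for illustration

probabilities = [binomial_pmf(k, n, p) for k in range(n + 1)]
total = sum(probabilities)
mean = sum(k * q for k, q in enumerate(probabilities))

# Bernoulli (0-1) distribution: the same formula with n = 1.
bernoulli = [binomial_pmf(k, 1, p) for k in (0, 1)]

print(total, mean, bernoulli)
```

The probabilities sum to 1 and the mean comes out as n * p, which is a quick sanity check on the formula.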

 

Geometric distribution:
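The geometric distribution has two common conventions. The sketch below assumes the one where X counts the number of trials up to and including the first success, so P(X = k) = (1 - p)^(k - 1) p for k = 1, 2, 3, .... This choice, the symbol k and the value p = 0.25 are assumptions made here; the other convention counts only the failures before the first success and starts at k = 0.

```python
def geometric_pmf(k, p):
    """P(X = k) = (1 - p)**(k - 1) * p, with k = 1, 2, 3, ...

    X is the trial on which the first success occurs (assumed convention).
    """
    if k < 1:
        return 0.0
    return (1 - p) ** (k - 1) * p


p = 0.25  # example value chosen for illustration

# Truncate the infinite support; the tail beyond 200 trials is negligible.
probabilities = {k: geometric_pmf(k, p) for k in range(1, 201)}
total = sum(probabilities.values())
mean = sum(k * q for k, q in probabilities.items())

print(geometric_pmf(1, p), total, mean)
```

Under this convention the mean is 1 / p, which the truncated sum reproduces to high accuracy.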

 

Poisson distribution (x is the number of occurrences of the event within a given time range, u is the average number of occurrences within that time range):
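With x and u as defined in the heading, the Poisson probability is P(X = x) = u^x e^(-u) / x! for x = 0, 1, 2, .... The following is a minimal sketch of that formula; the value u = 3 and the truncation of the support at 60 are assumptions chosen for illustration.

```python
import math


def poisson_pmf(x, u):
    """P(X = x) = u**x * exp(-u) / x!, for x = 0, 1, 2, ..."""
    return u**x * math.exp(-u) / math.factorial(x)


u = 3.0  # average number of occurrences in the time range (example value)

# Truncate the infinite support; for u = 3 the tail beyond 60 is negligible.
probabilities = [poisson_pmf(x, u) for x in range(60)]
total = sum(probabilities)
mean = sum(x * q for x, q in enumerate(probabilities))

print(total, mean)
```

The probabilities sum to 1 and the mean equals u, matching the description of u as the average count.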

 

 



Origin www.cnblogs.com/boceng/p/11668172.html