Machine Learning: Some Basic Knowledge of Probability Theory

Conditional Probability

\(P(B|A)\) denotes the probability that event B occurs given that event A has occurred.

The formula is
\[P(B|A) = \frac{P(AB)}{P(A)}\]

\[P(AB) = P(B|A)*P(A)=P(A|B)*P(B)\]

\[ P(A|B) = \frac{P(B|A)*P(A)}{P(B)}\]
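To make these identities concrete, here is a minimal Python sketch on a toy sample space (a fair die; the events A and B below are hypothetical, chosen only for illustration). It checks that both factorizations of P(AB) agree and that Bayes' formula recovers P(A|B).

```python
from fractions import Fraction

# Toy sample space: one roll of a fair die (1..6), all outcomes equally likely.
# Hypothetical events, chosen only to illustrate the formulas:
#   A = "roll is even" = {2, 4, 6},  B = "roll is greater than 3" = {4, 5, 6}
omega = set(range(1, 7))
A = {2, 4, 6}
B = {4, 5, 6}

def P(event):
    """Probability of an event under the uniform distribution on omega."""
    return Fraction(len(event & omega), len(omega))

P_A, P_B, P_AB = P(A), P(B), P(A & B)

P_B_given_A = P_AB / P_A  # P(B|A) = P(AB) / P(A)
P_A_given_B = P_AB / P_B  # P(A|B) = P(AB) / P(B)

# Both factorizations of P(AB) agree, and Bayes' formula recovers P(A|B).
assert P_B_given_A * P_A == P_A_given_B * P_B == P_AB
assert P_A_given_B == P_B_given_A * P_A / P_B
print(P_B_given_A, P_A_given_B)  # 2/3 2/3
```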

Law of Total Probability

If \(B_1, B_2, \ldots, B_n\) is a partition of the sample space S, then
\[P(A) = P(A|B_1)*P(B_1) + P(A|B_2)*P(B_2) + \ldots + P(A|B_n)*P(B_n) = \sum_{i=1}^{n}P(A|B_i)*P(B_i)\]
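As a quick numerical illustration (with made-up numbers, not from the text): suppose a part comes from one of three factories \(B_1, B_2, B_3\), a partition of the sample space, each with its own defect rate; the law of total probability then gives the overall defect rate.

```python
from fractions import Fraction as F

# Hypothetical numbers: a part comes from one of three factories B_1, B_2, B_3
# (a partition of the sample space); A = "the part is defective".
P_B = [F(1, 2), F(1, 3), F(1, 6)]                # P(B_i), sums to 1
P_A_given_B = [F(1, 100), F(2, 100), F(3, 100)]  # defect rate of each factory

# Law of total probability: P(A) = sum_i P(A|B_i) * P(B_i)
P_A = sum(pa * pb for pa, pb in zip(P_A_given_B, P_B))
print(P_A)  # 1/60 -- the overall defect rate
```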

Bayes' formula

\[P(B_i|A) = \frac{P(A|B_i)*P(B_i)}{\sum_{j=1}^{n}P(A|B_j)*P(B_j)}\]
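Reusing the same made-up factory numbers, a short sketch of Bayes' formula over a partition: the posterior probability that a defective part came from each factory.

```python
from fractions import Fraction as F

# Same hypothetical factory numbers as in the previous sketch.
P_B = [F(1, 2), F(1, 3), F(1, 6)]
P_A_given_B = [F(1, 100), F(2, 100), F(3, 100)]
P_A = sum(pa * pb for pa, pb in zip(P_A_given_B, P_B))  # total probability

# Bayes' formula: P(B_i|A) = P(A|B_i) * P(B_i) / P(A)
posteriors = [pa * pb / P_A for pa, pb in zip(P_A_given_B, P_B)]
print(posteriors)            # [Fraction(3, 10), Fraction(2, 5), Fraction(3, 10)]
assert sum(posteriors) == 1  # posteriors over a partition sum to 1
```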

Some ways to understand and interpret Bayes' formula

\[P(A|B) = \frac{P(B|A)*P(A)}{P(B)}\]
where P(A) is the prior probability; in machine learning it usually refers to the probability that a given class appears.

P(B|A) is the conditional probability (the likelihood): the probability that B occurs within class A.

P(A|B) is the posterior probability. Concretely it means: given that event B has occurred, the probability that the sample belongs to class A.
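To tie prior, likelihood, and posterior together, here is a small worked example with hypothetical spam-filter numbers (all values are invented for illustration):

```python
from fractions import Fraction as F

# Hypothetical spam-filter numbers, purely illustrative:
#   A = "message is spam", B = "message contains the word 'free'"
P_A = F(3, 10)             # prior: probability a message is spam
P_B_given_A = F(8, 10)     # likelihood: 'free' appears in a spam message
P_B_given_notA = F(1, 10)  # 'free' appears in a non-spam message

# P(B) by the law of total probability over the partition {A, not A}
P_B = P_B_given_A * P_A + P_B_given_notA * (1 - P_A)

# Posterior: probability the message is spam, given that 'free' appears
P_A_given_B = P_B_given_A * P_A / P_B
print(P_A_given_B)  # 24/31 -- the evidence pushes the prior 3/10 up sharply
```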

Maximum Likelihood Estimation (maximum likelihood)

Principle

Using known samples, work backwards to the parameter values that most plausibly produced them. Maximum likelihood estimation is a statistical method based on the maximum likelihood principle and is an application of probability theory to statistics. It provides a way to estimate model parameters given observed data; in short, "the model is fixed, the parameters are unknown." After a number of trials, we observe the results and take as the estimate the parameter value that maximizes the probability of the observed sample. This is called the maximum likelihood estimate.

Since the samples are independent and identically distributed, we can consider a single sample set D and use it to estimate the parameter vector θ. Denote the known sample set by
\[D = \{x_1, x_2, x_3, \ldots, x_N\}\]

\[l(\theta) = p(D|\theta) = p(x_1, x_2, x_3, \ldots, x_N|\theta) = \prod_{i=1}^{N} P(x_i|\theta)\]
is the likelihood function of D.

How to maximize the likelihood function

Find the value of θ that maximizes the probability of the observed sample set occurring.

\[\hat{\theta} = \arg\max\, l(\theta) = \arg\max \prod_{i=1}^{N} P(x_i|\theta)\]

Put simply: we seek the \(\theta\) under which the observed sequence D is most probable. A long product is awkward to work with, so we can transform it: the logarithm is monotonically increasing, so maximizing the product is equivalent to maximizing its log.
\[\hat{\theta} = \arg\max\, l(\theta) = \arg\max \prod_{i=1}^{N} P(x_i|\theta) = \arg\max \ln\Big(\prod_{i=1}^{N} P(x_i|\theta)\Big) = \arg\max \sum_{i=1}^{N} \ln P(x_i|\theta)\]
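As a minimal end-to-end sketch, assuming a Bernoulli model and a made-up sample D (neither is from the text), the following Python grid search maximizes the log-likelihood and matches the closed-form Bernoulli MLE, the sample mean:

```python
import math

# Assumed model (not from the text): Bernoulli, P(x|theta) = theta^x * (1-theta)^(1-x).
# Made-up sample D: 10 coin flips with 7 heads.
D = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]

def log_likelihood(theta, data):
    """sum_i ln P(x_i | theta) -- the log of the product form above."""
    return sum(math.log(theta if x == 1 else 1 - theta) for x in data)

# argmax over a coarse grid of candidate theta values in (0, 1)
grid = [i / 1000 for i in range(1, 1000)]
theta_hat = max(grid, key=lambda t: log_likelihood(t, D))

print(theta_hat)        # 0.7
print(sum(D) / len(D))  # 0.7 -- matches the closed-form Bernoulli MLE (sample mean)
```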

Source: www.cnblogs.com/bbird/p/11519772.html