Naive Bayes: theory and practice in Python

Copyright: This article is licensed under Creative Commons Attribution-ShareAlike; others may create derivative works, but must distribute them under the same license.

Principle

The naive Bayes (Naive Bayes) method is a generative (generative approach) classification method based on Bayes' theorem and the assumption of conditional independence among features. (The independence assumption is a strong one: it makes the method simple, but sometimes at the cost of some classification accuracy.)

Why is it a generative method?

Because it learns the joint probability distribution $P(X, Y)$ from the training data set; this models the distribution of the data from a statistical point of view and captures the resemblance among data of the same class.

Specifically, our goal is to find the posterior probability $P(Y = c_k \mid X = (x^{(1)}, \dots, x^{(n)}))$: given an input $X = x$, we compute the probability of each output $Y = c_k$, and the class with the largest probability is the classification result.

By the conditional probability formula (Bayes' theorem):

$$P(Y \mid X) = \frac{P(X \mid Y)\, P(Y)}{P(X)}$$
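As a quick numeric illustration (the numbers are made up for the example): if $P(Y = 1) = 0.3$, $P(X = x \mid Y = 1) = 0.8$, and $P(X = x) = 0.45$, then $P(Y = 1 \mid X = x) = 0.8 \times 0.3 / 0.45 \approx 0.533$.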

The prior probability distribution $P(Y = c_k)$, $k = 1, 2, \dots, K$, can be obtained directly from the training data.
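For instance, the maximum likelihood estimate of the prior is simply the class frequency in the training set (a standard formula, stated here for completeness; $I(\cdot)$ is the indicator function and $N$ the number of training examples):

$$P(Y = c_k) = \frac{\sum_{i=1}^{N} I(y_i = c_k)}{N}, \qquad k = 1, 2, \dots, K$$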

For the conditional probability $P(X = x \mid Y = c_k)$, expanding it gives:

$$P(X = x \mid Y = c_k) = P(X^{(1)} = x^{(1)}, \dots, X^{(n)} = x^{(n)} \mid Y = c_k), \qquad k = 1, 2, \dots, K$$

It turns out to have an exponential number of parameters: if the feature $x^{(j)}$ can take $S_j$ distinct values and $Y$ can take $K$ values, there are $K \prod_{j=1}^{n} S_j$ parameters in total, so estimating it directly is practically infeasible.
Here is where the naive Bayes method shows its naive (simple) character: we make a conditional independence assumption about this conditional probability distribution:

$$P(X = x \mid Y = c_k) = \prod_{j=1}^{n} P(X^{(j)} = x^{(j)} \mid Y = c_k)$$
The conditional independence assumption means that, once the class is determined, the features used for classification are conditionally independent of each other.
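To see concretely how much the assumption saves (a back-of-the-envelope count added for illustration): with $n$ binary features and $K$ classes, the full conditional distribution above has $K(2^n - 1)$ free parameters, while under conditional independence it has only $Kn$, i.e., linear rather than exponential in $n$.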

The denominator $P(X)$ of our target $P(Y \mid X)$ expands by the law of total probability as:

$$P(X = x) = \sum_{k} P(X = x \mid Y = c_k)\, P(Y = c_k)$$
Then, substituting the conditional independence assumption into the formula above:

$$P(Y = c_k \mid X = x) = \frac{P(Y = c_k) \prod_{j} P(X^{(j)} = x^{(j)} \mid Y = c_k)}{\sum_{k} P(Y = c_k) \prod_{j} P(X^{(j)} = x^{(j)} \mid Y = c_k)}$$
Since we choose the class with the largest probability, the naive Bayes classifier can be expressed as:

$$y = f(x) = \arg\max_{c_k} \frac{P(Y = c_k) \prod_{j} P(X^{(j)} = x^{(j)} \mid Y = c_k)}{\sum_{k} P(Y = c_k) \prod_{j} P(X^{(j)} = x^{(j)} \mid Y = c_k)}$$
Note that the denominator is the same for all $c_k$, so the final classification only needs to compute:

$$y = \arg\max_{c_k} P(Y = c_k) \prod_{j=1}^{n} P(X^{(j)} = x^{(j)} \mid Y = c_k)$$

Practice

To be added
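Until that section is filled in, here is a minimal from-scratch sketch of the classification rule derived above (a sketch for illustration, not a definitive implementation: it uses plain maximum likelihood counts for categorical features, and sums log probabilities to avoid floating-point underflow):

```python
from collections import Counter, defaultdict
from math import log, inf

class NaiveBayes:
    """Naive Bayes for categorical features, estimated by maximum likelihood."""

    def fit(self, X, y):
        n = len(y)
        # Prior P(Y = c_k): the class frequency in the training set.
        class_counts = Counter(y)
        self.log_prior = {c: log(m / n) for c, m in class_counts.items()}
        # Conditional P(X^(j) = v | Y = c_k): per-class, per-feature value frequency.
        counts = defaultdict(Counter)  # (class, feature index j) -> Counter of values
        for xi, yi in zip(X, y):
            for j, v in enumerate(xi):
                counts[(yi, j)][v] += 1
        self.cond = {
            key: {v: m / class_counts[key[0]] for v, m in cnt.items()}
            for key, cnt in counts.items()
        }
        return self

    def predict(self, x):
        # y = argmax_{c_k} [ log P(Y=c_k) + sum_j log P(X^(j)=x^(j) | Y=c_k) ]
        best, best_score = None, -inf
        for c in self.log_prior:
            score = self.log_prior[c]
            for j, v in enumerate(x):
                p = self.cond[(c, j)].get(v, 0.0)
                if p > 0:
                    score += log(p)
                else:
                    # Pure MLE gives probability 0 to values never seen with this
                    # class; Laplace smoothing is the usual fix for this.
                    score = -inf
                    break
            if score > best_score:
                best, best_score = c, score
        return best
```

A quick check on the classic toy data of Example 4.1 in 《统计学习方法》 (two categorical features, labels in $\{-1, +1\}$):

```python
X = [(1, 'S'), (1, 'M'), (1, 'M'), (1, 'S'), (1, 'S'),
     (2, 'S'), (2, 'M'), (2, 'M'), (2, 'L'), (2, 'L'),
     (3, 'L'), (3, 'M'), (3, 'M'), (3, 'L'), (3, 'L')]
y = [-1, -1, 1, 1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, -1]

clf = NaiveBayes().fit(X, y)
print(clf.predict((2, 'S')))  # -> -1, matching the book's result
```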


Reference

This article is the blogger's summary of 《统计学习方法》 (Statistical Learning Methods) and 《机器学习实战》 (Machine Learning in Action).

Origin: blog.csdn.net/yexiaohhjk/article/details/92729521