Machine learning algorithm notes: Bayesian classifiers

Bayesian decision theory: the theoretical basis of the Bayes classifier.

What is Bayesian decision theory?

Bayesian decision theory seeks a decision rule (the Bayes decision rule) that minimizes the overall risk; that is, for each sample it chooses the class label that minimizes the conditional risk. When the goal is to minimize the classification error rate, this amounts to choosing, for each sample, the class with the largest posterior probability.

The posterior probability P(c | x) is obtained via Bayes' theorem, P(c | x) = P(c) P(x | c) / P(x), which converts the problem into estimating the class prior P(c) and the class-conditional probability P(x | c) (the likelihood).
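The conversion above can be illustrated with a toy computation (the class names and probability values below are hypothetical, chosen only to show the arithmetic):

```python
# Toy illustration of Bayes' theorem with made-up numbers:
# two classes with given priors and class-conditional likelihoods
# for a single observed feature value x.
priors = {"spam": 0.3, "ham": 0.7}       # P(c)
likelihoods = {"spam": 0.8, "ham": 0.1}  # P(x | c) for the observed x

# Evidence P(x) = sum over classes of P(c) * P(x | c)
evidence = sum(priors[c] * likelihoods[c] for c in priors)

# Posterior P(c | x) = P(c) * P(x | c) / P(x)
posteriors = {c: priors[c] * likelihoods[c] / evidence for c in priors}

print(posteriors)  # spam ≈ 0.774, ham ≈ 0.226
```

Note that the evidence P(x) is the same for every class, so for classification it suffices to compare the numerators P(c) P(x | c).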

 

How do we estimate the class-conditional probability?

A common strategy is to assume that it follows some known form of probability distribution, and then estimate the distribution's parameters from the training samples. Maximum Likelihood Estimation (MLE), rooted in the frequentist school of thought, is the classical method for estimating probability-distribution parameters from sampled data.
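As a small sketch of this strategy: if we assume the attribute is Gaussian-distributed within a class, the MLE solution has a closed form, namely the sample mean and the (biased) sample variance. The sample values below are hypothetical.

```python
# MLE sketch: assume the samples come from a 1-D Gaussian N(mu, sigma^2).
# Maximizing the log-likelihood gives closed-form estimates:
#   mu_hat     = sample mean
#   sigma2_hat = mean squared deviation (divides by n, not n - 1)
samples = [4.9, 5.1, 5.0, 4.8, 5.2]  # hypothetical measurements

n = len(samples)
mu_hat = sum(samples) / n
sigma2_hat = sum((x - mu_hat) ** 2 for x in samples) / n

print(mu_hat, sigma2_hat)  # ≈ 5.0 and 0.02
```

The caveat is that the quality of the estimate depends entirely on whether the assumed distribution form matches the true data-generating distribution.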

 

The class-conditional probability is a joint probability over all attributes, which is difficult to estimate directly from a limited number of training samples. What can be done?

1. The naive Bayes classifier adopts the attribute conditional independence assumption: given the class, all attributes are assumed to be mutually independent.

In real tasks, this assumption rarely holds exactly.

2. Semi-naive Bayes classifiers relax the attribute conditional independence assumption to some degree, taking into account dependency information among a subset of attributes; this avoids computing the full joint probability without completely ignoring relatively strong attribute dependencies. A common strategy is one-dependent estimation (One-Dependent Estimator, ODE), which assumes that each attribute depends on at most one other attribute besides the class. Concrete algorithms include SPODE (Super-Parent ODE), TAN (Tree Augmented naive Bayes), and AODE (Averaged One-Dependent Estimator).

3. The Bayesian network, also called a belief network, characterizes dependencies among attributes with a directed acyclic graph and uses conditional probability tables (Conditional Probability Table, CPT) to describe the joint probability distribution over the attributes.
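As a concrete illustration of item 1 above, here is a minimal Gaussian naive Bayes sketch. It classifies by comparing log P(c) + Σᵢ log p(xᵢ | c), with each per-attribute density modeled as a Gaussian. The function names and the two-attribute data are hypothetical, for illustration only.

```python
import math

# Minimal Gaussian naive Bayes under the attribute conditional
# independence assumption: P(c | x) ∝ P(c) * prod_i p(x_i | c).
def fit(X, y):
    """Estimate the class prior and per-attribute Gaussian parameters."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        params = []
        for i in range(len(X[0])):
            vals = [r[i] for r in rows]
            mu = sum(vals) / len(vals)
            var = sum((v - mu) ** 2 for v in vals) / len(vals) + 1e-9  # smoothed
            params.append((mu, var))
        model[c] = (prior, params)
    return model

def predict(model, x):
    """Pick the class with the largest log-posterior (evidence term dropped)."""
    best_c, best_score = None, -math.inf
    for c, (prior, params) in model.items():
        score = math.log(prior)
        for xi, (mu, var) in zip(x, params):
            # log of the Gaussian density p(x_i | c)
            score += -0.5 * math.log(2 * math.pi * var) - (xi - mu) ** 2 / (2 * var)
        if score > best_score:
            best_c, best_score = c, score
    return best_c

# Hypothetical 2-attribute data: two well-separated classes.
X = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1], [5.0, 6.0], [5.2, 5.9], [4.9, 6.1]]
y = [0, 0, 0, 1, 1, 1]
model = fit(X, y)
print(predict(model, [1.1, 2.0]))  # → 0
print(predict(model, [5.1, 6.0]))  # → 1
```

Working in log space avoids floating-point underflow when many per-attribute probabilities are multiplied together.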

 

How can model parameters be estimated in the presence of unobserved variables (latent variables)?

The EM (Expectation-Maximization) algorithm: alternate between an E-step, which infers the expected values of the latent variables given the current parameters, and an M-step, which re-estimates the parameters by maximum likelihood given those expectations.
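The alternation can be sketched with EM for a two-component 1-D Gaussian mixture, where each point's component assignment is the latent variable. The initialization scheme and the data below are hypothetical choices for illustration.

```python
import math

# EM sketch for a two-component 1-D Gaussian mixture; the component
# assignment of each point is the unobserved (latent) variable.
def em_gmm(data, iters=50):
    # Crude initialization (hypothetical choice): start the means at the extremes.
    mu = [min(data), max(data)]
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            w = [pi[k] / math.sqrt(2 * math.pi * var[k])
                 * math.exp(-(x - mu[k]) ** 2 / (2 * var[k])) for k in range(2)]
            s = sum(w)
            resp.append([wk / s for wk in w])
        # M-step: re-estimate parameters from the soft assignments.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk + 1e-6
            pi[k] = nk / len(data)
    return mu, var, pi

# Hypothetical data: two clusters, one near 0 and one near 5.
data = [-0.2, 0.1, 0.0, 0.3, 4.8, 5.1, 5.0, 5.2]
mu, var, pi = em_gmm(data)
print(sorted(mu))  # one mean near 0, the other near 5
```

Each iteration is guaranteed not to decrease the data log-likelihood, though EM converges only to a local optimum, so initialization matters.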

 

Reference material

[1] Zhou Zhihua. Machine Learning. Tsinghua University Press, 2016, pp. 147-170 (Chapter 7: Bayesian Classifiers).

Origin: www.cnblogs.com/klchang/p/11279905.html