1. The most intuitive understanding
From Andrew Ng's CS229 course notes:
http://cs229.stanford.edu/notes/cs229-notes2.pdf
Consider a classification problem in which we want to learn to distinguish between elephants (y = 1) and dogs (y = 0), based on some features of an animal. Given a training set, an algorithm like logistic regression or the perceptron algorithm (basically) tries to find a straight line—that is, a decision boundary—that separates the elephants and dogs. Then, to classify a new animal as either an elephant or a dog, it checks on which side of the decision boundary it falls, and makes its prediction accordingly.
Here’s a different approach. First, looking at elephants, we can build a model of what elephants look like. Then, looking at dogs, we can build a separate model of what dogs look like. Finally, to classify a new animal, we can match the new animal against the elephant model, and match it against the dog model, to see whether the new animal looks more like the elephants or more like the dogs we had seen in the training set.
In short:
To distinguish between elephants and dogs, a discriminative model finds a decision boundary from the training set, while a generative model builds a separate model of the elephants and of the dogs and then checks which model a test case is closer to. This is the origin of the names discriminative model and generative model (the literal meanings are a good guide to what each does). Clearly, for the narrow task of telling elephants from dogs, the generative model does a lot of redundant work and may well perform worse.
Generative models can generate data; discriminative models can discriminate between data.
See: https://www.zhihu.com/question/22374366/answer/155544744
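A minimal sketch of the two approaches on the elephant/dog example, using a single made-up feature (body weight in kg) and hypothetical data. The generative side fits a Gaussian to each class and compares likelihoods; the discriminative side just places a threshold between the two class means.

```python
import math

# Hypothetical training set: (weight in kg, label), 1 = elephant, 0 = dog.
data = [(3000, 1), (4200, 1), (5100, 1), (20, 0), (35, 0), (8, 0)]

# --- Generative approach: model what each class "looks like" ---
def fit_gaussian(xs):
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, var

def log_pdf(x, mu, var):
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

eleph = fit_gaussian([x for x, y in data if y == 1])
dog = fit_gaussian([x for x, y in data if y == 0])

def generative_predict(x):
    # Equal class priors assumed; pick the class whose model fits x better.
    return 1 if log_pdf(x, *eleph) > log_pdf(x, *dog) else 0

# --- Discriminative approach: find the boundary directly ---
# With one feature, the simplest decision boundary is a threshold.
threshold = (eleph[0] + dog[0]) / 2

def discriminative_predict(x):
    return 1 if x > threshold else 0

print(generative_predict(4000), discriminative_predict(4000))  # 1 1
print(generative_predict(25), discriminative_predict(25))      # 0 0
```

Both predictors agree here, but they got there differently: one learned two models of the data, the other only learned where to draw the line.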
2. A classic example for contrast
Suppose there are four samples:
|   | sample 1 | sample 2 | sample 3 | sample 4 |
|---|---|---|---|---|
| x | 0 | 0 | 1 | 1 |
| y | 0 | 0 | 0 | 1 |
The generative model learns the joint distribution P(x, y), specifically:

|       | y = 0 | y = 1 |
|---|---|---|
| x = 0 | 1/2 | 0 |
| x = 1 | 1/4 | 1/4 |
The discriminative model learns the conditional distribution P(y | x), specifically:

|       | y = 0 | y = 1 |
|---|---|---|
| x = 0 | 1 | 0 |
| x = 1 | 1/2 | 1/2 |
As can be seen:
- The generative model models the joint distribution P(x, y), while the discriminative model directly models the posterior conditional distribution P(y | x).
- The generative model's result can be used to derive (the equivalent of) the discriminative model, but the discriminative model cannot recover the joint distribution, which means the generative model contains more information.
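The tables above can be reproduced directly from the four samples. The snippet below estimates the joint distribution P(x, y) and the conditional P(y | x) by counting, and then checks the second observation: the conditional can be recovered from the joint via P(y | x) = P(x, y) / P(x).

```python
from fractions import Fraction

# The four training samples from the table above, as (x, y) pairs.
samples = [(0, 0), (0, 0), (1, 0), (1, 1)]
n = len(samples)

# Generative view: estimate the joint distribution P(x, y).
joint = {(x, y): Fraction(samples.count((x, y)), n)
         for x in (0, 1) for y in (0, 1)}

# Discriminative view: estimate the conditional P(y | x) directly.
cond = {}
for x in (0, 1):
    n_x = sum(1 for xi, _ in samples if xi == x)
    for y in (0, 1):
        cond[(x, y)] = Fraction(samples.count((x, y)), n_x)

print(joint[(0, 0)], joint[(1, 1)])  # 1/2 1/4
print(cond[(0, 0)], cond[(1, 1)])    # 1 1/2

# The joint recovers the conditional: P(y | x) = P(x, y) / P(x).
p_x1 = joint[(1, 0)] + joint[(1, 1)]
assert joint[(1, 1)] / p_x1 == cond[(1, 1)]
```

The reverse derivation is impossible: knowing P(y | x) alone says nothing about how often each x occurs, so the joint cannot be reconstructed from the conditional.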
Comparison of characteristics
Generative models: they can recover the joint probability distribution, which discriminative methods cannot; they often converge faster, i.e. as the sample size grows the learned model approaches the true model more quickly; and they can still be used when latent variables are present, whereas discriminative methods cannot.
Discriminative models: they learn the conditional probability distribution or the decision function directly, so prediction is direct and accuracy is often higher; they allow abstracting the data at various levels and defining and using features, which can simplify the learning problem.
Common examples
See Statistical Learning Methods by Li Hang.
Generative: naive Bayes, hidden Markov models
Discriminative: perceptron, k-nearest neighbors, decision trees, logistic regression, support vector machines, boosting methods, conditional random fields (CRFs)