[Machine Learning] Basics: Discriminative Models vs. Generative Models

- Concept distinction and representative algorithms:

Both belong to the category of supervised learning; the most direct basis for distinguishing them is that they learn different target probability distributions from the given training data.

A generative model learns the joint probability distribution P(X, Y) from the training data, and then obtains the conditional probability distribution P(Y|X) as the prediction model via the formula P(Y|X) = P(X, Y) / P(X). Representative algorithms include Naive Bayes, hidden Markov models (HMM), etc. Typically one model is learned per class (n classes, n models); an input instance is evaluated under each class model, the results are compared, and the best-matching class is output.

A discriminative model learns the conditional probability distribution P(Y|X) directly, or learns a decision function f(X). Representative algorithms include the k-nearest neighbor algorithm, decision trees, logistic regression and the maximum entropy model (which learn P(Y|X)), the perceptron and support vector machines (which learn a decision function f(X)), boosting algorithms, and conditional random fields. Typically a single model is learned; an input instance is fed in and the prediction is output directly.
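The two definitions above can be contrasted on a tiny discrete dataset. This is a minimal sketch with a made-up sheep/goat feature: the generative side estimates the joint P(X, Y) and the marginal P(X) from counts and divides, exactly as in the formula above; the discriminative side estimates P(Y|X) directly from conditional counts, never modeling P(X) at all. (With plain maximum-likelihood counting on fully observed discrete data, the two estimates coincide.)

```python
from collections import Counter

# Hypothetical training data: one discrete feature X, label Y.
data = [("fluffy", "sheep"), ("fluffy", "sheep"), ("fluffy", "goat"),
        ("horned", "goat"), ("horned", "goat"), ("fluffy", "sheep")]
n = len(data)

# Generative route: estimate the joint P(X, Y) and marginal P(X),
# then derive the posterior as P(Y|X) = P(X, Y) / P(X).
joint = Counter(data)
p_x = Counter(x for x, _ in data)

def generative_posterior(x, y):
    return (joint[(x, y)] / n) / (p_x[x] / n)

# Discriminative route: estimate P(Y|X) directly from the rows
# where X = x, without ever modeling the distribution of X.
def discriminative_posterior(x, y):
    labels = [yy for xx, yy in data if xx == x]
    return labels.count(y) / len(labels)

print(generative_posterior("fluffy", "sheep"))      # 0.75
print(discriminative_posterior("fluffy", "sheep"))  # 0.75
```

The agreement here is an artifact of unsmoothed counting on a fully observed discrete table; with parametric assumptions (e.g. Gaussian class-conditionals vs. a logistic posterior) the two routes generally give different models.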

Angle 1, the illustrative case on Wikipedia:

[Figure from the Wikipedia article not reproduced here.]
Angle 2, an example from the blog post "Discriminative model and generative model in machine learning" by nolonely on Blog Park:

Example of a discriminative model: to determine whether an animal is a goat or a sheep, the discriminative approach learns a model from historical data, then extracts the animal's features and feeds them to the model to predict the probability that it is a goat.

Example of a generative model: the generative approach first learns a goat model from the features of goats, and separately learns a sheep model from the features of sheep. To classify a new animal, its features are extracted and evaluated under the goat model and under the sheep model; whichever model assigns the higher probability wins.

Consider the example above carefully: the discriminative model directly outputs the class probability from the animal's features (for example, logistic regression predicts a positive example when the probability exceeds 0.5 and a negative example otherwise), while the generative model tries every class model in turn and takes the one with the greatest probability as the final result.
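The generative "try them all" decision rule can be sketched with made-up per-class models. Here each class is a one-dimensional Gaussian over a hypothetical feature (say, horn length), with assumed fitted parameters; classification evaluates the feature's likelihood under each class model and picks the larger one.

```python
import math

# Assumed class-conditional Gaussian parameters (mean, std. dev.),
# as if fitted separately on goat data and on sheep data.
goat_mu, goat_sigma = 12.0, 2.0
sheep_mu, sheep_sigma = 5.0, 1.5

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def classify(x):
    # Generative decision: score the feature under each class model
    # and return the class whose model assigns the higher likelihood.
    p_goat = gaussian_pdf(x, goat_mu, goat_sigma)
    p_sheep = gaussian_pdf(x, sheep_mu, sheep_sigma)
    return "goat" if p_goat > p_sheep else "sheep"

print(classify(11.0))  # goat
print(classify(4.0))   # sheep
```

A discriminative counterpart would instead fit a single function (e.g. a logistic curve) mapping the feature straight to P(goat | x), with no per-class density models at all.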

Angle 3, from the perspective of prior, likelihood, and posterior.

Both the generative model and the discriminative model ultimately yield a posterior probability, namely P(Y|X). By Bayes' formula, P(Y|X) = P(Y)P(X|Y) / P(X), so the posterior is proportional to the prior times the likelihood.

If we estimate the prior and the likelihood from the training data and then multiply them to obtain the posterior before making a decision, the model is generative: the posterior distribution is generated by that multiplication. Naive Bayes is an example; see formula 4.6 on page 48 of Li Hang's Statistical Learning Methods.

If we bypass the prior and the likelihood, directly assume a form for the posterior distribution, and fit it to the training data, the model is discriminative, such as logistic regression.
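The generative multiplication in Bayes' formula can be worked through with made-up numbers. Suppose the prior P(Y) and the likelihood P(X = fluffy | Y) below have been estimated from training data; the posterior is their product, renormalized by P(X).

```python
# Assumed estimates from training data (illustrative numbers only).
prior = {"goat": 0.3, "sheep": 0.7}        # P(Y)
likelihood = {"goat": 0.1, "sheep": 0.8}   # P(X = fluffy | Y)

# Unnormalized posterior: prior times likelihood for each class.
unnorm = {y: prior[y] * likelihood[y] for y in prior}

# Normalizing constant P(X): the same data probability under every
# hypothesis, summed over classes.
p_x = sum(unnorm.values())

posterior = {y: unnorm[y] / p_x for y in unnorm}
print(round(posterior["sheep"], 3))  # 0.949
```

A discriminative model such as logistic regression would skip `prior` and `likelihood` entirely and fit the mapping from X to `posterior` directly.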

Supplementary concepts:

The posterior probability is P(Y|X): the probability of a hypothesis after seeing the new data.

The prior probability is P(Y): the probability of a hypothesis before any data is observed.

The likelihood is P(X|Y): the probability of the data under a given hypothesis.

The normalizing constant P(X) is the overall probability of the data, summed over all hypotheses.


- Pros and cons of both:

Generative model: it can recover the joint probability distribution P(X, Y); as the sample size grows, the learned model converges quickly to the true model; when hidden variables are present, generative learning methods can still be used, while discriminative methods cannot; it can be used for outlier detection.

Discriminative model: it faces the prediction task directly, so accuracy is often higher; it need not learn and compare multiple distributions as a generative model does, so it is more efficient; the data can be abstracted, i.e., feature engineering such as creating new features, to simplify the learning problem.


Reference books:

Li Hang: Statistical Learning Methods

Zhihu: answers by politer, etc.
