Types of classification problems

1. Two categories:
logistic regression
Fisher linear discriminant analysis — LDA
2. Multiple categories: running multi-class linear discriminant analysis and multi-class logistic regression in SPSS

Two categories

Convert qualitative variables to numeric variables: SPSS uses dummy variables for this.

Create dummy variables:

• The number of dummy variables introduced is generally the number of categories minus 1. For example, a qualitative variable with two categories (male/female) needs one dummy variable (0 = male, 1 = female).
• For example, with two types of fruit, three variables are added: two correspond to the sample label (one per fruit type), and one marks whether the sample belongs to the test set or the training set.
For binary classification, the newly added label may need to be modified to obtain the dummy variable.
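The "number of categories minus 1" rule can be illustrated with a minimal Python sketch. The function name `dummy_code` and the choice of reference category are assumptions made for this example; SPSS does this coding through its own dialogs.

```python
def dummy_code(values, reference=None):
    """Encode a categorical column as k-1 dummy (0/1) columns.

    `reference` is the baseline category that is coded as all zeros.
    By default the first category in sorted order is used, which is
    an arbitrary choice for this sketch.
    """
    categories = sorted(set(values))
    if reference is None:
        reference = categories[0]
    kept = [c for c in categories if c != reference]
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows


# Two categories -> one dummy variable (0 = male, 1 = female).
columns, rows = dummy_code(["male", "female", "female", "male"], reference="male")
print(columns)  # ['female']
print(rows)     # [[0], [1], [1], [0]]
```

With three categories the same function returns two columns, and the reference category is the row of all zeros.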

Logistic regression

1. Dependent variable: the category (its numeric value)
2. Covariates: the independent variables

Stepwise regression: generally backward stepwise regression is used

1. If stepwise regression is not used, select the Enter method

Save button in the interface

• Probabilities: in binary classification, the probability that y = 1 occurs
• Group membership: the classification result, that is, the group the sample is predicted to belong to
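The relationship between the two saved values can be sketched in Python: the probability comes from the logistic function, and the group is obtained by comparing it with a cutoff. The intercept, the coefficients, and the cutoff of 0.5 are assumptions chosen for illustration, not values taken from any fitted model.

```python
import math


def predict_probability(x, intercept, coefficients):
    """P(y = 1 | x) under a logistic regression model."""
    z = intercept + sum(b * xi for b, xi in zip(coefficients, x))
    return 1.0 / (1.0 + math.exp(-z))


def predict_group(probability, cutoff=0.5):
    """Assign group 1 when the probability reaches the cutoff, else group 0."""
    return 1 if probability >= cutoff else 0


# Hypothetical fitted parameters, for illustration only.
intercept = -1.0
coefficients = [0.8, -0.5]

p = predict_probability([2.0, 1.0], intercept, coefficients)
group = predict_group(p)
print(round(p, 4), group)
```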

Option buttons in the interface

• Entry: when using forward stepwise regression, the probability threshold for a variable to enter the model
• Removal: when using backward stepwise regression, the probability threshold for a variable to be removed from the model
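How these two probabilities act as thresholds can be sketched as a single selection step in Python. The thresholds 0.05 and 0.10 and all p-values below are assumptions for illustration. A real stepwise procedure refits the model and recomputes the test statistics after every step.

```python
def backward_step(p_values, removal=0.10):
    """Pick the variable to drop in one backward step, or None to stop.

    The variable with the largest p-value is dropped, but only if that
    p-value exceeds the removal threshold.
    """
    name, p = max(p_values.items(), key=lambda item: item[1])
    return name if p > removal else None


def forward_step(candidate_p_values, entry=0.05):
    """Pick the candidate to add in one forward step, or None to stop.

    The candidate with the smallest p-value is added, but only if that
    p-value is below the entry threshold.
    """
    name, p = min(candidate_p_values.items(), key=lambda item: item[1])
    return name if p < entry else None


# Hypothetical p-values of the variables currently in the model.
in_model = {"age": 0.003, "income": 0.42, "height": 0.08}
dropped = backward_step(in_model)
print(dropped)  # income
```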

Result analysis

• ŷ: in binary classification, the probability that y = 1 for this sample

Poor prediction

But be aware that it may overfit

Fisher Linear Discriminant Analysis — LDA

• Grouping variable: the variable whose values define the groups
Use Define Range to set the range of its values

Classify button

• Summary table: reports the classification accuracy
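A summary table of this kind is essentially a cross-tabulation of actual against predicted groups. The Python sketch below shows how the accuracy follows from it. The labels are made-up data, and the function name `summary_table` is chosen only for this example.

```python
from collections import Counter


def summary_table(actual, predicted):
    """Cross-tabulate actual vs. predicted groups and compute the accuracy."""
    counts = Counter(zip(actual, predicted))
    labels = sorted(set(actual) | set(predicted))
    # table[a][p] = number of samples from group a predicted as group p
    table = {a: {p: counts[(a, p)] for p in labels} for a in labels}
    correct = sum(counts[(label, label)] for label in labels)
    return table, correct / len(actual)


# Made-up labels for eight samples.
actual = [0, 0, 0, 1, 1, 1, 1, 0]
predicted = [0, 0, 1, 1, 1, 0, 1, 0]

table, accuracy = summary_table(actual, predicted)
print(table)     # {0: {0: 3, 1: 1}, 1: {0: 1, 1: 3}}
print(accuracy)  # 0.75
```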

Save button

• Predicted group membership: whether the two-class prediction is 0 or 1
• Probabilities of group membership: the probability of belonging to group 1 and the probability of belonging to group 0
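To make the predicted group membership concrete, here is a minimal Python sketch of a two-class Fisher discriminant with two features. It projects each sample onto w = Sw⁻¹(μ1 − μ0) and compares the projection with the midpoint of the projected class means. The midpoint rule assumes equal priors, the data are made up, and this is not necessarily how SPSS computes its output. Group membership probabilities are not covered here.

```python
def mean(rows):
    """Column-wise mean of a list of 2-feature samples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, mu):
    """2x2 within-class scatter matrix of one class."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def fit_fisher(class0, class1):
    """Return the projection direction w and the decision threshold."""
    mu0, mu1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, mu0), scatter(class1, mu1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu1[0] - mu0[0], mu1[1] - mu0[1]]
    w = [dot(inv[0], diff), dot(inv[1], diff)]
    # Midpoint of the projected class means (assumes equal priors).
    threshold = 0.5 * (dot(w, mu0) + dot(w, mu1))
    return w, threshold


def predict_group(x, w, threshold):
    """Group 1 if the projection exceeds the threshold, else group 0."""
    return 1 if dot(w, x) > threshold else 0


# Made-up training samples for group 0 and group 1.
class0 = [(1, 2), (2, 1), (2, 3), (3, 2)]
class1 = [(5, 6), (6, 5), (6, 7), (7, 6)]

w, threshold = fit_fisher(class0, class1)
print(w, threshold)
print(predict_group((3, 3), w, threshold), predict_group((5, 5), w, threshold))
```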

Multi-category

Fisher Linear Discriminant Analysis — LDA

Suppose there are four categories.
Only the range defined for the grouping variable (Define Range) needs to be modified.

Logistic regression

Save button

• Estimated response probabilities: the predicted probability of each category
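In a multi-class logistic model, per-category probabilities of this kind can be obtained by passing one linear score per category through the softmax function. The Python sketch below assumes four categories and made-up scores, with the reference category fixed at a score of 0.

```python
import math


def response_probabilities(scores):
    """Softmax: turn one linear score per category into probabilities."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical linear scores for four categories (reference category = 0.0).
scores = [0.0, 1.2, -0.3, 0.5]

probs = response_probabilities(scores)
predicted = probs.index(max(probs))  # index of the most probable category
print([round(p, 3) for p in probs], predicted)
```

The probabilities sum to 1, and the predicted category is the one with the largest probability.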

Option button

• The settings here can be adjusted when stepwise regression is used

Result analysis

• Canonical discriminant function coefficients: with n categories, n-1 boundaries are needed to divide the plane, so there are n-1 discriminant functions (fewer if there are fewer than n-1 predictor variables)

Detect overfitting
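One common way to check for overfitting is to compare accuracy on the training set with accuracy on a held-out test set: a large gap suggests the model has fitted the training samples too closely. The Python sketch below uses made-up labels, and what counts as a "large" gap is left to the reader.

```python
def accuracy(actual, predicted):
    """Share of samples whose predicted group equals the actual group."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def accuracy_gap(train_actual, train_pred, test_actual, test_pred):
    """Training accuracy minus test accuracy; a large gap hints at overfitting."""
    return accuracy(train_actual, train_pred) - accuracy(test_actual, test_pred)


# Made-up labels: perfect on the training set, weaker on the test set.
train_actual = [0, 1, 0, 1, 1, 0, 1, 0, 0, 1]
train_pred = [0, 1, 0, 1, 1, 0, 1, 0, 0, 1]
test_actual = [0, 1, 1, 0, 1]
test_pred = [0, 0, 1, 1, 1]

gap = accuracy_gap(train_actual, train_pred, test_actual, test_pred)
print(round(gap, 2))  # 0.4
```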


Origin blog.csdn.net/qq_43779658/article/details/108125007