Types of classification problems
- Two categories: logistic regression and Fisher linear discriminant analysis (LDA)
- Multi-class: multi-class linear discriminant analysis and multi-class logistic regression in SPSS
Two categories
Convert qualitative variables to numeric variables: SPSS uses dummy variables for this
Create dummy variables:
- The number of dummy variables introduced is generally the number of categories minus 1. For example, a qualitative variable (male/female) has two categories, so one dummy variable is set (0 = male, 1 = female)
- For example, with two types of fruit, three variables will be added: two correspond to the sample labels for the two types of fruit, and one marks whether the sample belongs to the test set or the training set
For binary classification, it may be necessary to recode the newly added label variable to obtain the 0/1 dummy variable
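The "categories minus 1" rule above can be sketched outside SPSS as follows. This is a minimal illustration, not what SPSS does internally; the helper name `dummy_code` and the sample data are hypothetical.

```python
def dummy_code(values, reference):
    """Return (names, rows): one 0/1 column per category except the reference.

    With k distinct categories this yields k - 1 dummy variables; a row of
    all zeros stands for the reference category.
    """
    categories = sorted(set(values))
    if reference not in categories:
        raise ValueError(f"reference {reference!r} not found in data")
    kept = [c for c in categories if c != reference]
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows


# Hypothetical sample: two categories, so a single dummy variable
sex = ["male", "female", "female", "male"]
names, rows = dummy_code(sex, reference="male")
print(names)  # ['female']
print(rows)   # [[0], [1], [1], [0]]
```

Here 0 stands for male and 1 for female, matching the coding described above.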
Logistic regression
- Dependent variable: category (value)
- Covariate: independent variable
Stepwise regression: generally backward stepwise regression is used
- If stepwise regression is not used, select the Enter method
Categorical button in the interface: select the qualitative variables and choose the reference (control) group
Save button in the interface
- Probability: in binary classification, the probability that $y = 1$ occurs
- Group membership: the result of classification, predicting which group the sample belongs to
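The two saved quantities are related in a simple way: the probability comes from the logistic function, and the group is obtained by comparing it with a cutoff. The sketch below assumes a cutoff of 0.5; the coefficients and the sample are made-up values for illustration, not output from a fitted model.

```python
import math


def predict_probability(intercept, coefficients, x):
    """P(y = 1 | x) under a logistic regression model."""
    z = intercept + sum(b * xi for b, xi in zip(coefficients, x))
    return 1.0 / (1.0 + math.exp(-z))


def predict_group(probability, cutoff=0.5):
    """Assign group 1 when the probability reaches the cutoff, else group 0."""
    return 1 if probability >= cutoff else 0


# Hypothetical coefficients and sample, for illustration only
intercept = -1.0
coefficients = [2.0, -0.5]
sample = [1.0, 1.0]

p = predict_probability(intercept, coefficients, sample)
group = predict_group(p)
print(round(p, 4), group)
```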
Options button in the interface
- Entry: when using forward stepwise regression, the probability threshold for a variable to enter the model
- Removal: when using backward stepwise regression, the probability threshold for a variable to be removed
Result analysis
- $\hat{y}$: in binary classification, the probability that $y = 1$ for this sample
If prediction is poor, the model can be made more flexible, but be aware that it may then overfit
Fisher Linear Discriminant Analysis — LDA
- Grouping variable: the variable whose values define the groups; use Define Range to set the range of its values
Statistics button
Classify button
- Summary table: the accuracy of classification can be obtained
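The accuracy reported in such a summary table is the share of samples whose predicted group equals the actual group. A minimal sketch of that calculation, using hypothetical labels and a hypothetical helper `classification_table`:

```python
from collections import Counter


def classification_table(actual, predicted):
    """Count (actual, predicted) pairs and return them with overall accuracy."""
    counts = Counter(zip(actual, predicted))
    correct = sum(n for (a, p), n in counts.items() if a == p)
    return counts, correct / len(actual)


# Hypothetical labels, for illustration only
actual = [0, 0, 0, 1, 1, 1, 1, 0]
predicted = [0, 0, 1, 1, 1, 0, 1, 0]

table, accuracy = classification_table(actual, predicted)
for (a, p), n in sorted(table.items()):
    print(f"actual={a} predicted={p}: {n}")
print("accuracy:", accuracy)
```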
Save button
- Predicted group membership: whether the two-class prediction for the sample is 0 or 1
- Probabilities of group membership: the probability of belonging to group 1 and the probability of belonging to group 0
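The idea behind the predicted group can be sketched with the textbook two-class Fisher rule: project each sample onto $w = S_w^{-1}(m_1 - m_0)$ and compare with the midpoint of the projected class means. This is an illustration under simplifying assumptions (two predictors, equal priors, midpoint threshold) and is not claimed to match SPSS's computation, which also reports membership probabilities. All function names and data are hypothetical.

```python
def mean(points):
    """Component-wise mean of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]


def scatter(points, m):
    """2x2 scatter matrix: sum of (x - m)(x - m)^T over the points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def fisher_rule(class0, class1):
    """Return the Fisher direction w and a midpoint threshold."""
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [dot(inv[0], diff), dot(inv[1], diff)]
    threshold = 0.5 * (dot(w, m0) + dot(w, m1))
    return w, threshold


def predict(w, threshold, x):
    """Group 1 if the projection exceeds the threshold, else group 0."""
    return 1 if dot(w, x) > threshold else 0


# Hypothetical training samples for the two groups
class0 = [(1, 2), (2, 1), (2, 3), (3, 2)]
class1 = [(6, 5), (7, 6), (7, 4), (8, 5)]

w, threshold = fisher_rule(class0, class1)
print(w, threshold)
print(predict(w, threshold, (2, 2)), predict(w, threshold, (7, 5)))
```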
Multi-category
Fisher Linear Discriminant Analysis — LDA
Suppose there are four categories: only the Define Range setting needs to be changed so that it covers all four group codes
Logistic regression
Save button
- Estimated response probability: the probability for each category
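One common way to obtain a probability for each category is the softmax of per-category linear scores, with the reference category's score fixed at 0. The sketch below assumes that formulation; the scores are made-up numbers rather than fitted values.

```python
import math


def response_probabilities(scores):
    """Softmax: turn one linear score per category into probabilities."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for four categories; the first is the reference (0.0)
scores = [0.0, 1.2, -0.3, 0.5]
probs = response_probabilities(scores)

# The predicted category is the one with the largest probability
predicted = max(range(len(probs)), key=probs.__getitem__)
print([round(p, 3) for p in probs], predicted)
```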
Options button
- Its settings can be adjusted when stepwise regression is used
Criteria button
Result analysis
- Canonical discriminant function coefficients
If there are n categories, n-1 separating planes are needed, so there are n-1 discriminant functions (fewer if the number of predictors is smaller than n-1)
Detect overfitting
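A simple check for overfitting is to compare accuracy on the training set with accuracy on a held-out test set: a large gap suggests the model has fit the training data too closely. The sketch below uses hypothetical labels and predictions, and what counts as a "large" gap is left as a judgment call.

```python
def accuracy(actual, predicted):
    """Share of samples whose predicted group equals the actual group."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def overfitting_gap(train_actual, train_pred, test_actual, test_pred):
    """Training accuracy minus test accuracy."""
    return accuracy(train_actual, train_pred) - accuracy(test_actual, test_pred)


# Hypothetical results, for illustration only
train_actual = [0, 1, 1, 0, 1]
train_pred = [0, 1, 1, 0, 1]   # perfect on the training set
test_actual = [1, 0, 1, 0, 0]
test_pred = [1, 1, 0, 0, 0]    # noticeably worse on the test set

gap = overfitting_gap(train_actual, train_pred, test_actual, test_pred)
print(round(gap, 2))
```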