Types of classification problems
 Two categories: logistic regression; Fisher linear discriminant analysis (LDA)
 Multiclass: multiclass linear discriminant analysis and multiclass logistic regression in SPSS
Two categories
Convert qualitative variables to numeric variables: SPSS uses dummy variables
Create dummy variables:
 The number of dummy variables introduced is generally the number of categories minus 1
 For example: a qualitative variable (male/female) has two categories, so one dummy variable is set (0 = male, 1 = female)
 For example, with two types of fruit, three variables will be added (two correspond to the sample labels for the two fruit types, and one distinguishes the test set from the training set)
For binary classification, it may be necessary to modify the newly added label so that it becomes a single 0/1 dummy variable
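The coding rule above (k categories give k - 1 dummy variables) can be sketched outside SPSS in a few lines of plain Python. This is only an illustration: the helper name `dummy_code` and the sample data are assumptions, not anything SPSS produces.

```python
def dummy_code(values, reference):
    """Return k-1 indicator columns for a qualitative variable with k categories.

    `reference` is the category coded as all zeros (the baseline group).
    """
    categories = sorted(set(values))
    kept = [c for c in categories if c != reference]
    # One 0/1 column per non-reference category.
    return {c: [1 if v == c else 0 for v in values] for c in kept}


# Illustrative data: two categories, so a single dummy variable (0 male, 1 female).
sex = ["male", "female", "female", "male"]
dummies = dummy_code(sex, reference="male")
print(dummies)  # {'female': [0, 1, 1, 0]}
```

With three categories the same helper would return two columns, which matches the "number of categories minus 1" rule.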
Logistic regression
 Dependent variable: category (value)
 Covariate: independent variable
 Stepwise regression: generally backward stepwise regression
 If stepwise regression is not used, select Enter as the method
Categorical button in the interface: select the qualitative variables and choose the reference (control) group
Save button in the interface
 Probabilities: in binary classification, the probability that $y = 1$ occurs
 Group membership: the result of classification, i.e. the group the sample is predicted to belong to
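As a rough sketch of how the two saved columns relate, the probability is the logistic function of the linear predictor, and the group membership follows from comparing that probability with a cutoff. The coefficients and the sample below are hypothetical values chosen for illustration, and the cutoff of 0.5 is an assumption for this sketch.

```python
import math


def predicted_probability(x, intercept, coefs):
    """P(y = 1 | x) under a fitted binary logistic regression model."""
    z = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))


def predicted_group(p, cutoff=0.5):
    """Assign group 1 when the probability reaches the cutoff, else group 0."""
    return 1 if p >= cutoff else 0


# Hypothetical fitted coefficients and one sample (not taken from any real output).
intercept, coefs = -1.0, [0.8, -0.5]
sample = [2.0, 1.0]

p = predicted_probability(sample, intercept, coefs)
group = predicted_group(p)
print(round(p, 4), group)  # 0.525 1
```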
Options button in the interface
 Entry: when using forward stepwise regression, the probability for entering a variable
 Removal: when using backward stepwise regression, the probability for removing a variable
Result analysis
 $\hat{y}$: in binary classification, the probability that $y = 1$ for this sample
 If prediction is poor, the model can be adjusted to fit better, but be aware that it may then overfit
Fisher Linear Discriminant Analysis — LDA
 Grouping variable: the variable whose values define the groups
 Define range: set the range of values of the grouping variable
Statistics button
Classify button
 Summary table: the accuracy of classification can be obtained
Save button
 Predicted group membership: whether the result of the two-class prediction is 0 or 1
 Group membership probability: the probability of belonging to 1 and the probability of belonging to 0
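To make the two-class case concrete, here is a minimal sketch of Fisher's discriminant direction $w = S_W^{-1}(m_1 - m_0)$ for two features, written in plain Python. The toy data, the function names, and the midpoint threshold (which assumes equal priors) are illustrative assumptions. SPSS reports canonical coefficients on a different scale, so the numbers are not directly comparable.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def mean(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, m):
    """2x2 within-class scatter matrix around the class mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class0, class1):
    """Return the projection vector w and a midpoint threshold."""
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [dot(inv[0], diff), dot(inv[1], diff)]
    # Midpoint of the projected class means (assumes equal priors).
    threshold = 0.5 * (dot(w, m0) + dot(w, m1))
    return w, threshold


def predict(x, w, threshold):
    """Predicted group: 1 if the projection exceeds the threshold, else 0."""
    return 1 if dot(w, x) > threshold else 0


# Toy data for illustration only.
class0 = [[1, 2], [2, 1], [2, 3], [3, 2]]
class1 = [[6, 5], [7, 6], [6, 7], [5, 6]]
w, threshold = fisher_direction(class0, class1)
print(w, threshold)  # [1.0, 1.0] 8.0
print(predict([2, 2], w, threshold), predict([6, 5], w, threshold))  # 0 1
```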
Multiclass
Fisher Linear Discriminant Analysis — LDA
Suppose there are four categories
Only the defined range of the grouping variable needs to be modified
Logistic regression
Save button
 Estimated response probabilities: the predicted probability of each category
Options button
 Can be adjusted when stepwise regression is used
Criteria button
Result analysis
 Canonical discriminant function coefficients
 If there are $n$ categories, the space is divided using $n - 1$ discriminant functions (at most, since the count also cannot exceed the number of predictors)
Detect overfitting
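A common way to check for overfitting is to compare accuracy on the training set with accuracy on a held-out test set: a large gap suggests the model has fitted noise. The labels below are made-up illustrative values, and the 0.1 gap threshold is an arbitrary assumption, not a rule from SPSS.

```python
def accuracy(y_true, y_pred):
    """Share of samples whose predicted group matches the true group."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels for a four-category problem (illustration only).
train_true = [0, 0, 1, 1, 2, 2, 3, 3]
train_pred = [0, 0, 1, 1, 2, 2, 3, 3]
test_true = [0, 1, 2, 3, 0, 1, 2, 3]
test_pred = [0, 1, 1, 3, 2, 1, 2, 0]

train_acc = accuracy(train_true, train_pred)
test_acc = accuracy(test_true, test_pred)
gap = train_acc - test_acc

# Hypothetical threshold: flag the model when training accuracy is much higher.
possibly_overfit = gap > 0.1
print(train_acc, test_acc, possibly_overfit)  # 1.0 0.625 True
```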