01 Machine Learning: Evaluation Metrics

01 cross-validation

Splitting the data into a training set and a test set, fitting the model on the training set, and scoring it on the test set is the basic hold-out evaluation; cross-validation extends this by rotating which portion serves as the test set (e.g., k-fold), so that every sample is used for testing exactly once, and averaging the scores.
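As a minimal sketch (assuming scikit-learn and a synthetic toy dataset, both illustrative only), 5-fold cross-validation might look like this:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy binary-classification data (hypothetical, for illustration)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# 5-fold cross-validation: each fold serves once as the test set
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores.mean())  # average accuracy across the 5 folds
```

Each of the five scores comes from a model trained on the other four folds, so the mean is a less optimistic estimate than a single train/test split.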

02 classification problem - confusion matrix

Confusion matrix for binary classification as an example
                 Predicted positive    Predicted negative
Actual positive  A (true positive)     B (false negative)
Actual negative  C (false positive)    D (true negative)

Accuracy

The proportion of predictions that are correct among all samples:
Accuracy = (A + D) / (A + B + C + D)

Recall

The proportion of actual positive samples that are correctly predicted as positive:
Recall = A / (A + B)
Significance: find as many of the true positive examples as possible.

Precision

The proportion of samples predicted as positive that are actually positive:
Precision = A / (A + C)
Meaning: how reliable the model is when it predicts positive.

F value

The harmonic mean of precision and recall, balancing the two:
F1 = 2 * Precision * Recall / (Precision + Recall)
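All four metrics above can be computed with scikit-learn. A small sketch on hypothetical labels (1 = positive, 0 = negative):

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             f1_score, precision_score, recall_score)

# Hypothetical labels: 1 = positive, 0 = negative
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

# confusion_matrix returns rows = actual, columns = predicted
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

acc = accuracy_score(y_true, y_pred)    # (A + D) / (A + B + C + D)
rec = recall_score(y_true, y_pred)      # A / (A + B)
prec = precision_score(y_true, y_pred)  # A / (A + C)
f1 = f1_score(y_true, y_pred)           # 2 * prec * rec / (prec + rec)
print(tp, fp, fn, tn, acc, rec, prec, f1)
```

Here A = tp, B = fn, C = fp, D = tn in the notation of the confusion matrix above.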

Summary Q&A

Q: A model has an accuracy rate of 90%. Is the performance of this model necessarily good?

  • uncertain
  • Suppose the prevalence of a certain disease is 10%. If we predict that no sample has the disease, the model's accuracy reaches 90%, yet the model is useless.
  • In this case we must also consider recall and precision. Taking diseased as the positive class, both the recall and the precision of such a model equal 0, because A = 0 (no true positives).
  • Therefore accuracy alone is not enough to judge a model's performance, especially when the data is imbalanced.
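This trap can be demonstrated directly; a minimal sketch using hypothetical labels where 10% of 100 people are sick:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical screening data: 10 of 100 people are sick (label 1)
y_true = [1] * 10 + [0] * 90
y_pred = [0] * 100  # a "model" that predicts nobody is sick

acc = accuracy_score(y_true, y_pred)   # looks impressive
rec = recall_score(y_true, y_pred)     # but finds no sick person at all
prec = precision_score(y_true, y_pred, zero_division=0)  # no positive predictions
print(acc, rec, prec)
```

Accuracy comes out at 0.9 while recall and precision are both 0, which is exactly the situation described above.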

Q: What is the relationship between recall and precision?

  • When screening for a disease, we want to find as many sick people as possible, so we want recall to be as high as possible.
  • One way to reach 100% recall is to predict that everyone is sick ("better to flag a healthy person than to miss a sick one") — but would you call that a good model?
  • Therefore precision must also be considered, so that as few healthy people as possible are wrongly flagged.
  • Precision and recall are not contradictory metrics; they emphasize different things, and in practice raising one tends to lower the other.
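The trade-off can be illustrated by sweeping the decision threshold over hypothetical predicted probabilities (values chosen for illustration only):

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical predicted probabilities of the positive (sick) class
y_true = np.array([1, 1, 1, 0, 0, 0])
proba = np.array([0.9, 0.6, 0.4, 0.5, 0.2, 0.1])

results = {}
for threshold in (0.3, 0.7):
    y_pred = (proba >= threshold).astype(int)
    results[threshold] = (recall_score(y_true, y_pred),
                          precision_score(y_true, y_pred))

# A low threshold favors recall; a high threshold favors precision
print(results[0.3])
print(results[0.7])
```

At threshold 0.3 recall is perfect but precision drops; at threshold 0.7 every positive prediction is correct but most sick cases are missed.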

Q: In what scenarios should precision take priority, with recall considered afterwards?

  • Scenario: correctly identifying a true positive earns a reward, but labeling a non-positive as positive incurs a penalty — that is, false alarms are costly.

ROC curve

The ROC curve describes how FPR and TPR change relative to each other.
TPR (true positive rate): equal to recall, A / (A + B); the closer to 1, the better.
FPR (false positive rate): C / (C + D); the closer to 0, the better.

In ROC, the meaning of several special points

[Figure: ROC curves, with z1 marking the ideal point (FPR = 0, TPR = 1)]
The closer the model's curve gets to z1, the better.
The ROC curve traces the model's (FPR, TPR) pairs as the decision threshold varies.

AUC

Usually we evaluate a model by the area under the ROC curve. Why is the range of this area 0.5–1 rather than 0–1? Because for a binary classification problem, any model with AUC below 0.5 can be turned into one with AUC above 0.5 simply by flipping its predictions; in the extreme, a model with accuracy 0 becomes, after flipping, a model with accuracy 1.

The AUC (Area Under the Curve) reaches its maximum value of 1 when the ROC curve passes through point z1.
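A minimal ROC/AUC sketch with scikit-learn, on hypothetical labels and scores:

```python
from sklearn.metrics import auc, roc_curve

# Hypothetical true labels and predicted positive-class scores
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

# roc_curve sweeps the threshold and returns one (FPR, TPR) pair per step
fpr, tpr, thresholds = roc_curve(y_true, y_score)
area = auc(fpr, tpr)  # area under the (FPR, TPR) curve
print(area)
```

With untied scores this area equals the probability that a randomly chosen positive sample is scored above a randomly chosen negative one (here 8 of the 9 positive/negative pairs are ranked correctly).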

scikit-learn code

Metric             scikit-learn
Precision          from sklearn.metrics import precision_score
Recall             from sklearn.metrics import recall_score
F1                 from sklearn.metrics import f1_score
Confusion matrix   from sklearn.metrics import confusion_matrix
ROC                from sklearn.metrics import roc_curve
AUC                from sklearn.metrics import auc

03 regression problem

Regression problems require the error to be as small as possible. But the errors cannot simply be summed, because positive and negative errors would cancel each other out; instead we usually take the absolute value or the square of each error.

mean absolute error

MAE = (1/m) * Σ |y_i − ŷ_i|, summed over the m samples
Value range: 0 to positive infinity.

mean squared error

MSE = (1/m) * Σ (y_i − ŷ_i)², summed over the m samples
Value range: 0 to positive infinity.
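A minimal sketch computing both metrics on hypothetical values:

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical regression targets and predictions
y_true = [3.0, 5.0, 2.0]
y_pred = [2.0, 5.0, 4.0]

mae = mean_absolute_error(y_true, y_pred)  # mean of |y_i - y_hat_i|
mse = mean_squared_error(y_true, y_pred)   # mean of (y_i - y_hat_i)^2
print(mae, mse)
```

Note that squaring weights large errors more heavily: the single error of 2 contributes 4 to the MSE sum but only 2 to the MAE sum.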

R2

R² = 1 − RSS / TSS, where RSS = Σ (y_i − ŷ_i)² and TSS = Σ (y_i − ȳ)²
TSS is (up to the 1/m factor, which cancels in the ratio) the MSE of a model that always predicts the mean. What R² measures is whether your model is at least better than this simple baseline model.
R² value range: negative infinity to 1.

A few special points are explained below

  • R² = 0: your model's error (RSS) is equivalent to that of a model that only predicts the mean (TSS).
  • R² = 1: your model predicts perfectly; the closer to 1, the better.
  • R² < 0: your model is very poor, worse than a model that always predicts the mean.
  • R² approaching negative infinity: your model may be oscillating and failing to converge.
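The three finite cases above can be checked directly with scikit-learn (toy values, illustrative only):

```python
from sklearn.metrics import r2_score

y_true = [1.0, 2.0, 3.0, 4.0]
y_mean = sum(y_true) / len(y_true)  # 2.5

r2_perfect = r2_score(y_true, y_true)             # predictions exactly right
r2_mean = r2_score(y_true, [y_mean] * 4)          # always predict the mean
r2_bad = r2_score(y_true, [4.0, 3.0, 2.0, 1.0])  # worse than the mean
print(r2_perfect, r2_mean, r2_bad)
```

Perfect predictions give 1, predicting the mean gives 0, and the reversed predictions give a negative score (here RSS = 20 against TSS = 5, so R² = 1 − 4 = −3).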

scikit-learn code

Metric      scikit-learn
MSE, RMSE   from sklearn.metrics import mean_squared_error
MAE         from sklearn.metrics import mean_absolute_error
R²          from sklearn.metrics import r2_score


Origin blog.csdn.net/qq_42911863/article/details/125681991