Why the ROC curve can accurately evaluate model performance even when the gap between the numbers of positive and negative samples is relatively large, how AUC relates to ROC, and why AUC can judge whether a model is good or bad
Confusion matrix: the horizontal axis holds the actual classes (actual positive samples, actual negative samples) and the vertical axis holds the predicted classes (predicted positive samples, predicted negative samples)
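As a minimal sketch of the four cells of the confusion matrix, the counts can be tallied directly from labels and predictions. The function name confusion_counts and the toy labels are made up for illustration; 1 is assumed to mean positive and 0 negative.

```python
def confusion_counts(y_true, y_pred):
    """Count the four confusion-matrix cells for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # actual positive, predicted positive
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # actual positive, predicted negative
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # actual negative, predicted positive
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # actual negative, predicted negative
    return tp, fn, fp, tn


# Hypothetical toy data, not taken from the notes.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
tpr = tp / (tp + fn)        # true positive rate (recall)
fpr = fp / (fp + tn)        # false positive rate
precision = tp / (tp + fp)  # share of predicted positives that are actually positive

print(tp, fn, fp, tn)       # 3 1 1 3
print(tpr, fpr, precision)  # 0.75 0.25 0.75
```

TPR and FPR are the two quantities the ROC curve plots, while precision and recall (the same value as TPR) are the two the PR curve plots.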
What the PR curve is, and what the benefits of using PR are when the samples are balanced
Reasons why ROC measures performance more accurately on unbalanced samples
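(a) and (c) are ROC curves, and (b) and (d) are PR curves. (a) and (b) show the classifier's results on the original test set, where the positive and negative samples are balanced; (c) and (d) show its results after the number of negative samples in the test set is increased to 10 times the original. It can be clearly seen that the ROC curve basically keeps its original shape, while the PR curve changes greatly. The reason is that TPR and FPR are each computed within a single actual class, so adding more negatives from the same distribution leaves both roughly unchanged, whereas precision is computed over the predicted positives from both classes and drops as the false positives multiply.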
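The effect in the figure can be reproduced in miniature at a single threshold. This is only a sketch under stated assumptions: the scores are invented, the 10x increase is simulated by repeating the same negative scores ten times (so the negative score distribution is exactly unchanged), and rates_at_threshold is a hypothetical helper name.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (TPR, FPR, precision), predicting positive when score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    pos = sum(labels)
    neg = len(labels) - pos
    precision = tp / (tp + fp) if tp + fp else 1.0  # convention when nothing is predicted positive
    return tp / pos, fp / neg, precision


# Hypothetical classifier scores for 4 positives and 4 negatives.
pos_scores = [0.9, 0.8, 0.7, 0.4]
neg_scores = [0.6, 0.3, 0.2, 0.1]

# Balanced test set: 4 positives, 4 negatives.
balanced = rates_at_threshold(pos_scores + neg_scores, [1] * 4 + [0] * 4, 0.5)

# Imbalanced test set: same positives, negatives repeated 10 times (40 negatives).
imbalanced = rates_at_threshold(pos_scores + neg_scores * 10, [1] * 4 + [0] * 40, 0.5)

print(balanced)    # TPR 0.75, FPR 0.25, precision 0.75
print(imbalanced)  # TPR 0.75, FPR 0.25, precision 3/13 (about 0.23)
```

The ROC point (FPR, TPR) is identical in both test sets, while the PR point moves because precision falls from 0.75 to 3/13. With real data the extra negatives would be new samples rather than copies, so the ROC curve would stay only approximately the same.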
AUC is the integral of the ROC curve, that is, the area under it, where the curve plots TPR against FPR. The ideal point is TPR = 1, FPR = 0, so AUC summarizes the relationship between TPR and FPR in a single value: the closer it is to 1, the better the model. Each point on the curve corresponds to one classification threshold.
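To make "AUC is the integral of the ROC curve" concrete, here is a minimal sketch that sweeps the threshold over every distinct score to get the ROC points and then integrates them with the trapezoidal rule. The helper names roc_points and auc_trapezoid and the scores are assumptions for illustration, not part of the original notes.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points, one per distinct threshold, starting from (0, 0)."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
        points.append((fp / neg, tp / pos))
    return points  # the lowest threshold yields (1, 1)


def auc_trapezoid(points):
    """Integrate TPR over FPR with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical scores: one positive (0.4) is ranked below one negative (0.6).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
auc_value = auc_trapezoid(roc_points(scores, labels))
print(auc_value)  # 0.9375

# A classifier that ranks every positive above every negative reaches the ideal corner.
perfect_auc = auc_trapezoid(roc_points(scores, [1, 1, 1, 1, 0, 0, 0, 0]))
print(perfect_auc)  # 1.0
```

The value 0.9375 equals 15/16: of the 16 positive-negative pairs, 15 are ranked correctly. This is the usual reading of AUC as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one.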