Common evaluation metrics for machine learning: ACC, AUC, and the ROC curve


1. Confusion matrix

(Figure: the confusion matrix)

|                    | True positive | True negative |
| ------------------ | ------------- | ------------- |
| Predicted positive | TP            | FP            |
| Predicted negative | FN            | TN            |

Based on whether a sample's predicted value matches its true value, there are four possible outcomes:
TP (True Positive): the predicted value and the true value agree and both are positive, i.e. a true positive.
FP (False Positive): the predicted value is positive but the true value is negative, i.e. a false positive.
FN (False Negative): the predicted value is negative but the true value is positive, i.e. a false negative.
TN (True Negative): the predicted value and the true value agree and both are negative, i.e. a true negative.

In plain terms: we simply record whether each prediction agrees with the true label, broken down by class.
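As a rough illustration, the sketch below counts the four cells from a pair of binary label lists; the toy labels are made up for this example, not taken from the post:

```python
# Counting TP, FP, FN, TN for binary labels: 1 = positive, 0 = negative.
# Toy labels, invented purely for illustration.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # predicted positive, truly positive
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # predicted positive, truly negative
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # predicted negative, truly positive
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # predicted negative, truly negative

print(tp, fp, fn, tn)  # 3 1 1 3
```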

2. Evaluation metrics

1. Accuracy (ACC)

$ACC = \frac{TP + TN}{TP + FP + FN + TN}$
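As a quick sketch of the formula (assuming scikit-learn is available; the toy labels are the same made-up ones as above), accuracy can be computed directly from the confusion-matrix counts and cross-checked against `accuracy_score`:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

y_true = [1, 1, 1, 0, 0, 1, 0, 0]   # toy labels, invented for illustration
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]

# For binary labels, ravel() returns the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
acc = (tp + tn) / (tp + fp + fn + tn)
print(acc, accuracy_score(y_true, y_pred))  # both 0.75
```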

2. AUC

ROC curve

Here we consider a binary classifier. An example ROC curve is shown below:
(Figure: example ROC curve)
As the example shows, the horizontal axis (abscissa) of the ROC curve is the FPR (False Positive Rate) and the vertical axis (ordinate) is the TPR (True Positive Rate):
$FPR = \frac{FP}{FP + TN}$
$TPR = \frac{TP}{TP + FN}$
In plain terms: FPR is the probability that a negative sample is wrongly classified as positive; TPR is the probability that a positive sample is correctly classified as positive.
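The two rates can be written as tiny helper functions, a minimal sketch of the formulas above:

```python
def fpr(fp, tn):
    # Fraction of truly negative samples that were wrongly predicted positive.
    return fp / (fp + tn)

def tpr(tp, fn):
    # Fraction of truly positive samples that were correctly predicted positive.
    return tp / (tp + fn)
```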

For example

Suppose we have 100 samples, of which 90 are positive and 10 are negative, and we have the predictions of two classifiers, A and B. The predictions of classifier A:

| Classifier A       | True positive | True negative |
| ------------------ | ------------- | ------------- |
| Predicted positive | 90            | 10            |
| Predicted negative | 0             | 0             |

FPR = 10/10 = 1; TPR = 90/90 = 1
The prediction result of classifier B:

| Classifier B       | True positive | True negative |
| ------------------ | ------------- | ------------- |
| Predicted positive | 70            | 5             |
| Predicted negative | 20            | 5             |

FPR = 5/10 = 0.5; TPR = 70/90 ≈ 0.78
Plotted as points in ROC space, the results of classifiers A and B look like this:
(Figure: classifiers A and B plotted as points in ROC space)
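The small sketch below re-derives these (FPR, TPR) points from the two tables with plain arithmetic:

```python
# Classifier A: predicts every sample positive.
tp_a, fp_a, fn_a, tn_a = 90, 10, 0, 0
print(fp_a / (fp_a + tn_a), tp_a / (tp_a + fn_a))   # FPR = 1.0, TPR = 1.0

# Classifier B.
tp_b, fp_b, fn_b, tn_b = 70, 5, 20, 5
print(fp_b / (fp_b + tn_b), tp_b / (tp_b + fn_b))   # FPR = 0.5, TPR ≈ 0.78
```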

Next, consider four special points and one line in the ROC plot.
The first point, (0, 1), corresponds to FPR = 0 and TPR = 1. This is a perfect classifier: no negative sample is misclassified and every positive sample is found, so every prediction matches its true label.
The second point, (1, 0), corresponds to FPR = 1 and TPR = 0. This is the worst possible classifier: every negative sample is misclassified and no positive sample is found, so every prediction contradicts its true label.
The third point, (0, 0), corresponds to FPR = 0 and TPR = 0. Neither negative nor positive samples are ever predicted positive, i.e. the classifier labels every sample as negative.
The fourth point, (1, 1), corresponds to FPR = 1 and TPR = 1, i.e. the classifier labels every sample as positive.
From this analysis we can conclude that the closer the ROC curve is to the upper-left corner, the better the classifier performs.
Now consider the dashed line y = x in the ROC plot. Points on this diagonal correspond to a classifier that uses a random-guessing strategy. For example, (0.5, 0.5) means the classifier randomly labels half of the samples positive and the other half negative.

How to draw the ROC curve

For a specific classifier and test set, we obviously obtain only one classification result, i.e. a single pair of FPR and TPR values. To draw a curve we need a whole series of (FPR, TPR) values. How do we get them?
If the classifier outputs (for example via a sigmoid) the probability that each sample is positive, we can apply different thresholds to those probabilities and obtain one (FPR, TPR) pair per threshold.
Suppose, then, that we already have the predicted probability of being positive for every sample. To vary the threshold systematically, sort the samples by this probability in descending order. The figure below shows an example with 20 test samples: the "Class" column gives the true label of each sample (p for positive, n for negative), and the "Score" column gives the predicted probability that the sample is positive.

(Figure: 20 test samples with their true class and predicted score)
Next we take each "Score" value in turn, from high to low, as the threshold: a test sample is predicted positive if its probability of being positive is greater than or equal to the threshold, and negative otherwise.
For example, for the fourth sample in the figure the "Score" is 0.6, so samples 1, 2, 3 and 4 are predicted positive (their scores are all at least 0.6) and all other samples are predicted negative.
At this threshold, FPR = 1/10 = 0.1 and TPR = 3/10 = 0.3.
Proceeding in the same way, each choice of threshold yields one (FPR, TPR) pair, i.e. one point on the ROC curve. In total we obtain 20 such pairs, and plotting them on the ROC axes gives the following:
(Figure: ROC curve traced out by the 20 thresholds)
Setting the threshold to 1 and to 0 yields the points (0, 0) and (1, 1) respectively. Connecting all the (FPR, TPR) pairs gives the ROC curve; the more threshold values we use, the smoother the curve becomes.
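The whole procedure can be sketched in a few lines. The labels and scores below are invented (not the 20 samples from the figure), and matplotlib is assumed to be available for the plot:

```python
import matplotlib.pyplot as plt

labels = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]                        # 1 = positive, 0 = negative (toy data)
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.35, 0.3, 0.1]  # predicted probability of being positive

P = sum(labels)        # number of positive samples
N = len(labels) - P    # number of negative samples

points = []
# Sweep the threshold from above the highest score down to the lowest score,
# so that the points (0, 0) and (1, 1) are both included.
for thr in [2.0] + sorted(set(scores), reverse=True):
    pred = [1 if s >= thr else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, pred) if y == 0 and p == 1)
    points.append((fp / N, tp / P))                             # one (FPR, TPR) point per threshold

xs, ys = zip(*points)
plt.plot(xs, ys, marker="o")
plt.xlabel("FPR")
plt.ylabel("TPR")
plt.title("ROC curve (toy data)")
plt.show()
```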

AUC calculation

AUC (Area Under Curve) is defined as the area under the ROC curve, which clearly cannot exceed 1. Since the ROC curve generally lies above the line y = x, the AUC typically ranges between 0.5 and 1. AUC is used as an evaluation criterion because the ROC curve alone often does not make clear which of two classifiers is better, whereas AUC is a single number: the classifier with the larger AUC performs better.
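A minimal sketch of the area computation, assuming scikit-learn and NumPy are available and reusing the same toy labels and scores as before: the trapezoidal area under the (FPR, TPR) points agrees with `roc_auc_score`.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

labels = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.35, 0.3, 0.1]

fpr, tpr, thresholds = roc_curve(labels, scores)     # one (FPR, TPR) point per threshold
auc_trapezoid = np.trapz(tpr, fpr)                   # area under the curve via the trapezoidal rule
print(auc_trapezoid, roc_auc_score(labels, scores))  # both 0.8 for this toy data
```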

What does AUC mean

So what does the AUC value mean? According to (Fawcett, 2006): "The AUC value is equivalent to the probability that a randomly chosen positive example is ranked higher than a randomly chosen negative example."
In plain terms: the AUC is a probability. If you pick one positive sample and one negative sample at random, the AUC is the probability that the classifier, going by its computed scores, ranks the positive sample ahead of the negative one. The larger the AUC, the more likely the classifier is to rank positive samples ahead of negative ones, i.e. the better it separates the classes.
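This probabilistic reading can be checked directly by enumerating all (positive, negative) pairs; a sketch on the same toy data, counting ties as 0.5 per the usual convention:

```python
labels = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.35, 0.3, 0.1]

pos = [s for s, y in zip(scores, labels) if y == 1]   # scores of positive samples
neg = [s for s, y in zip(scores, labels) if y == 0]   # scores of negative samples

pairs = [(p, n) for p in pos for n in neg]
auc = sum((p > n) + 0.5 * (p == n) for p, n in pairs) / len(pairs)
print(auc)   # 0.8, matching the area under the ROC curve above
```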

Why use the ROC curve

With so many evaluation criteria available, why use ROC and AUC? Because the ROC curve has a very useful property: it remains essentially unchanged when the distribution of positive and negative samples in the test set changes. Real data sets often exhibit class imbalance, i.e. there are many more negative samples than positive ones (or vice versa), and the class distribution of the test data may also change over time. The figure below compares ROC curves with Precision-Recall curves:
(Figure: ROC curves vs. Precision-Recall curves on balanced and imbalanced test sets)
In the figure, (a) and (c) are ROC curves, while (b) and (d) are Precision-Recall curves. Panels (a) and (b) show the classifier's results on the original test set (balanced positive and negative samples); (c) and (d) show the results after the number of negative samples in the test set is increased to 10 times the original. The ROC curve keeps essentially its original shape, while the Precision-Recall curve changes drastically.
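This claim can be checked on toy data (a sketch assuming scikit-learn is available): duplicating every negative sample 10 times scales FP and TN by the same factor, so FPR, the ROC curve, and the AUC are unchanged, while the precision-based summary drops.

```python
from sklearn.metrics import roc_auc_score, average_precision_score

labels = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.35, 0.3, 0.1]

# Build an imbalanced test set by repeating each negative sample 10 times.
labels_imb, scores_imb = [], []
for y, s in zip(labels, scores):
    reps = 10 if y == 0 else 1
    labels_imb += [y] * reps
    scores_imb += [s] * reps

print(roc_auc_score(labels, scores), roc_auc_score(labels_imb, scores_imb))  # identical AUC
print(average_precision_score(labels, scores),
      average_precision_score(labels_imb, scores_imb))                       # average precision drops
```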

