Accuracy, precision and recall

1. What are these three rates for?

These three rates let us compare the quality of models.

For example, say I have 10 bananas. I mark a good banana as 1 and a bad banana as 0, and they are lined up in this order:

1  1  1  1  1  0  0  0  0  0

I ask model a to pick out the good bananas for me, and it gives this result:

1  1  1  1  0  0  0  0  0  1

Okay, let's analyze the work done by model a.

Its answers fall into the following four cases:

Actually a good banana, and a also says it is good: there are 4

Actually a good banana, but a says it is bad: there is 1

Actually a bad banana, and a also says it is bad: there are 4

Actually a bad banana, but a says it is good: there is 1

Feeling dizzy? Time to bring in some proper terminology! We classify the four cases systematically by actual value and predicted value:

            predicted 1           predicted 0           total
actual 1    TP                    FN                    actual positives (TP + FN)
actual 0    FP                    TN                    actual negatives (FP + TN)
total       predicted positives   predicted negatives   total number of samples

The data for model a is as follows:

TP: 4

FN: 1

FP: 1

TN: 4

With the classification done, let's analyze it.
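The counting above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name confusion_counts is made up for illustration, and the two lists are simply the banana sequences from the example.

```python
# Labels from the banana example: 1 = good banana, 0 = bad banana.
actual = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]  # model a's output


def confusion_counts(actual, predicted):
    """Return (tp, fn, fp, tn) for two equal-length lists of 0/1 labels.

    Hypothetical helper written for this example only.
    """
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1  # good banana, model says good
        elif a == 1 and p == 0:
            fn += 1  # good banana, model says bad
        elif a == 0 and p == 1:
            fp += 1  # bad banana, model says good
        else:
            tn += 1  # bad banana, model says bad
    return tp, fn, fp, tn


tp, fn, fp, tn = confusion_counts(actual, predicted)
print(tp, fn, fp, tn)  # 4 1 1 4
```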

2. Precision

Precision is measured relative to the prediction results: it indicates how many of the samples predicted as positive really are positive.

The numerator of precision is the number of positive samples identified correctly, and the denominator is the number of samples the model considers positive.

P = TP / (TP + FP)

The precision of model a is: 4 / (4 + 1) = 4/5
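As a quick check of P = TP / (TP + FP), here is a small sketch using exact fractions. The function name precision and the choice to return None when nothing is predicted positive are assumptions made for this example, not something the formula itself prescribes.

```python
from fractions import Fraction


def precision(tp, fp):
    """P = TP / (TP + FP), kept as an exact fraction.

    Returning None when the model predicts no positives is an
    assumption of this sketch; the ratio is undefined in that case.
    """
    predicted_positive = tp + fp
    if predicted_positive == 0:
        return None
    return Fraction(tp, predicted_positive)


p_a = precision(tp=4, fp=1)  # model a's counts from the example
print(p_a)  # 4/5
```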

3. Recall

Recall is measured relative to the samples themselves: it indicates how many of the actually positive samples are predicted correctly.

The numerator of recall is the number of positive samples identified correctly, and the denominator is the number of positive samples there actually are.

R = TP / (TP + FN)

The recall of model a is: 4 / (4 + 1) = 4/5
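The same number can be read straight off the two banana sequences, without building the table first. A rough sketch (the variable names are made up for illustration): collect the bananas that are actually good, then see how many of them the model also marked as good.

```python
actual = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]  # model a's output

# Positions of the bananas that are actually good (TP + FN of them).
actual_positive = [i for i, a in enumerate(actual) if a == 1]

# Of those, the ones the model also marked as good (TP of them).
found = [i for i in actual_positive if predicted[i] == 1]

recall_a = len(found) / len(actual_positive)
print(len(found), len(actual_positive), recall_a)  # 4 5 0.8
```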

4. Accuracy

Accuracy indicates how many of all the judgments are correct, that is, positives judged positive and negatives judged negative.

The numerator of accuracy is the number of positive samples identified as positive plus the number of negative samples identified as negative, and the denominator is the total number of samples.

A = (TP + TN) / (TP + TN + FN + FP)

The accuracy of model a is: (4 + 4) / 10 = 4/5
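Since TP + TN is just the number of positions where the model agrees with reality, this rate can also be sketched by comparing the two sequences element by element (variable names here are illustrative only):

```python
actual = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]  # model a's output

# A judgment is correct when prediction and reality agree (TP or TN).
correct = sum(1 for a, p in zip(actual, predicted) if a == p)

accuracy_a = correct / len(actual)
print(correct, accuracy_a)  # 8 0.8
```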

 

Okay, the precision, recall and accuracy of model a are all 4/5.

Next I ask model b to sort the bananas. They are the same bananas, in the same order:

1  1  1  1  1  0  0  0  0  0

Model b gives this result:

1  0  0  0  0  0  0  0  0  0  

Using the formulas above (for b, TP = 1, FN = 4, FP = 0, TN = 5), we get the three rates for b:

Precision: 1

Recall: 1/5

Accuracy: 3/5

So the question is: which is better, model a or model b?

Model b has the higher precision, but model a has the higher recall. How do we weigh one against the other?
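To double-check model b's numbers, here is a sketch that goes from the two sequences to all three rates in one step. The helper name three_rates is hypothetical, and the sketch assumes there is at least one predicted positive and at least one actual positive, otherwise the divisions are undefined.

```python
from fractions import Fraction


def three_rates(actual, predicted):
    """Return (precision, recall, accuracy) as exact fractions.

    Hypothetical helper; assumes TP + FP > 0 and TP + FN > 0.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return (
        Fraction(tp, tp + fp),      # TP / (TP + FP)
        Fraction(tp, tp + fn),      # TP / (TP + FN)
        Fraction(tp + tn, len(pairs)),  # (TP + TN) / total
    )


actual = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
model_b = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

rates_b = three_rates(actual, model_b)
print(*rates_b)  # 1 1/5 3/5
```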

5. Averaging methods

Let's set accuracy aside for now and look at the precision and recall of a and b.

1) Arithmetic mean (the ordinary average of precision and recall)

a model: (4/5 + 4/5) / 2 = 4/5

b model: (1 + 1/5) / 2 = 3/5

Model a scores 80 and model b scores 60. But model b obviously doesn't deserve 60 points, so let's try another method.
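The two scores above can be verified with a short sketch. The function name arithmetic_mean is just for illustration, and exact fractions are used so the results come out as 4/5 and 3/5 rather than rounded decimals.

```python
from fractions import Fraction


def arithmetic_mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)


# (first rate, recall) for each model, taken from the example.
model_a = [Fraction(4, 5), Fraction(4, 5)]
model_b = [Fraction(1), Fraction(1, 5)]

mean_a = arithmetic_mean(model_a)
mean_b = arithmetic_mean(model_b)
print(mean_a, mean_b)  # 4/5 3/5
```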

2) Geometric mean: multiply the n values together, then take the n-th root

a model: 0.8 * 0.8, then take the square root, giving 0.8

b model: 1 * 0.2, then take the square root, giving about 0.45

Model a still gets 80, while model b gets only about 45 points this time. That feels more reasonable. Are there any other methods?
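Here is a minimal sketch of the "multiply, then take the n-th root" recipe. The function name geometric_mean is chosen for this example, and math.prod needs Python 3.8 or later. The results are floating-point values, so they are rounded for display.

```python
import math


def geometric_mean(values):
    """Multiply the n values together, then take the n-th root."""
    return math.prod(values) ** (1 / len(values))


g_a = geometric_mean([0.8, 0.8])  # model a
g_b = geometric_mean([1.0, 0.2])  # model b

print(f"{g_a:.3f} {g_b:.3f}")  # 0.800 0.447
```

Python's standard library also ships statistics.geometric_mean, which could be used instead of the hand-written version.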

3) Harmonic mean, also known as the reciprocal mean: the reciprocal of the arithmetic mean of the reciprocals of the values.

In other words, take the reciprocal of each of the n values, add them up and divide by n, then take the reciprocal of the result.

a model: (5/4 + 5/4) / 2 = 5/4, whose reciprocal is 4/5

b model: (1 + 5) / 2 = 3, whose reciprocal is 1/3

Model a still gets 80, while model b gets 33 points this time, which feels much more reasonable.
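The three steps (reciprocals, average, reciprocal again) can be sketched as follows. The function name harmonic_mean is made up for this example, and the sketch assumes no value is zero, since a zero has no reciprocal.

```python
from fractions import Fraction


def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals.

    Assumes every value is non-zero.
    """
    mean_of_reciprocals = sum(1 / v for v in values) / len(values)
    return 1 / mean_of_reciprocals


h_a = harmonic_mean([Fraction(4, 5), Fraction(4, 5)])  # model a
h_b = harmonic_mean([Fraction(1), Fraction(1, 5)])     # model b
print(h_a, h_b)  # 4/5 1/3
```

For two values P and R this works out to 2 * P * R / (P + R), which is the quantity commonly called the F1 score when P is precision and R is recall.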

 

By the way, the three averages rank as follows in how sensitive they are to the smallest value: harmonic mean > geometric mean > arithmetic mean
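One way to see this ordering is to hold one value at 1 and let the other shrink, then watch how fast each average falls. The sketch below does that for a few illustrative values of the smaller number; the function name three_means and the chosen values are assumptions of this example.

```python
import math


def three_means(p, r):
    """Arithmetic, geometric and harmonic mean of two positive numbers."""
    arithmetic = (p + r) / 2
    geometric = math.sqrt(p * r)
    harmonic = 2 * p * r / (p + r)
    return arithmetic, geometric, harmonic


# Keep one value at 1.0 and shrink the other.
rows = []
for r in (0.8, 0.5, 0.2, 0.05):
    a, g, h = three_means(1.0, r)
    rows.append((r, a, g, h))
    print(f"small={r:.2f}  arithmetic={a:.3f}  geometric={g:.3f}  harmonic={h:.3f}")
```

In every row the harmonic mean is the lowest of the three and the arithmetic mean the highest, and the gap widens as the smaller value drops.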

Origin www.cnblogs.com/0-lingdu/p/12671727.html