Model Evaluation Metrics for Machine Learning

The four possible outcomes of a binary classifier form the confusion matrix:

                        Predicted positive        Predicted negative
  Actual positive       True Positive (TP)        False Negative (FN)
  Actual negative       False Positive (FP)       True Negative (TN)
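As a concrete illustration, here is a minimal Python sketch that tallies the four cells from lists of actual and predicted labels (the 1 = positive / 0 = negative encoding and the confusion_counts helper are illustrative assumptions, not from the original):

```python
def confusion_counts(actual, predicted):
    # Tally the four confusion-matrix cells; 1 = positive, 0 = negative (assumed encoding).
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    return tp, fn, fp, tn

actual    = [1, 1, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
print(confusion_counts(actual, predicted))  # (3, 1, 1, 3)
```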

Model performance is generally measured by four metrics:

Accuracy: the number of correctly classified samples / the total number of samples, i.e. (TP + TN) / (TP + FN + FP + TN)

Recall: the number of correctly predicted positive samples / the number of actual positive samples, i.e. TP / (TP + FN); also called the recall rate

Precision: the number of correctly predicted positive samples / the number of samples predicted as positive, i.e. TP / (TP + FP); also called the precision rate

F-value: the harmonic mean of recall and precision, i.e. F1 = 2 × Precision × Recall / (Precision + Recall)
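In code, all four metrics follow directly from the confusion-matrix counts. A minimal sketch (the function names are my own):

```python
def accuracy(tp, fn, fp, tn):
    # Fraction of all samples that were classified correctly.
    return (tp + tn) / (tp + fn + fp + tn)

def recall(tp, fn):
    # Fraction of actual positives that were found.
    return tp / (tp + fn)

def precision(tp, fp):
    # Fraction of predicted positives that are truly positive.
    return tp / (tp + fp)

def f1(tp, fn, fp):
    # Harmonic mean of precision and recall.
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# Counts from Case 1 in the tables below: TP=60, FN=20, FP=10, TN=10
print(accuracy(60, 20, 10, 10))   # 0.7
print(recall(60, 20))             # 0.75
print(precision(60, 10))          # 0.857...
print(f1(60, 20, 10))             # 0.8
```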

It should be noted that recall and precision pull against each other: improving one tends to lower the other. The reasons are as follows:

1. For recall, the denominator is fixed: it is the number of actual positives in the data. We want recall to be as high as possible, and in the limit we can force it to 1 by predicting every sample as positive. The drawback is that such a classifier no longer separates positives from negatives. For example, if positive means sick and negative means healthy, achieving a recall of 1 by declaring every test sample sick is clearly unreasonable;

2. If we want recall to rise, we must increase the number of samples predicted as positive, as the following tables show (positive means sick, negative means healthy; the goal is to find the sick people in the sample):

Case 1

                        Predicted positive: 70    Predicted negative: 30
  Actual positive: 80   TP: 60                    FN: 20
  Actual negative: 20   FP: 10                    TN: 10

Recall: 60/80 = 3/4

Precision: 60/70 = 6/7

Case 2

                        Predicted positive: 75    Predicted negative: 25
  Actual positive: 80   TP: 64                    FN: 16
  Actual negative: 20   FP: 11                    TN: 9

Recall: 64/80 = 4/5

Precision: 64/75 ≈ 0.853

3. The analysis above shows why: when the number of samples predicted as positive increases, TP and FP both increase. Recall's denominator (TP + FN, the number of actual positives) is fixed, so recall rises with TP. Precision's denominator (TP + FP), however, grows along with the numerator, and the extra predictions bring in additional false positives, so precision tends to fall. Here recall rose from 3/4 to 4/5, while precision dropped from 6/7 ≈ 0.857 to 64/75 ≈ 0.853.

An intuitive reading: if we declare almost everyone sick, we will catch nearly all the truly sick people, so recall goes up; but most of those positive verdicts are wrong, so precision goes down. A numeric check of both cases is sketched below.
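Here is a small plain-Python sketch of that check, using the counts from the two tables above plus the degenerate classifier from point 1 that predicts every sample positive (that third row is my own addition for illustration):

```python
# Confusion-matrix counts: Case 1 and Case 2 from the tables above,
# plus the degenerate "predict everyone sick" classifier from point 1.
cases = {
    "Case 1":       dict(tp=60, fn=20, fp=10, tn=10),
    "Case 2":       dict(tp=64, fn=16, fp=11, tn=9),
    "All positive": dict(tp=80, fn=0,  fp=20, tn=0),
}

for name, c in cases.items():
    r = c["tp"] / (c["tp"] + c["fn"])  # recall: denominator fixed at the 80 actual positives
    p = c["tp"] / (c["tp"] + c["fp"])  # precision: denominator grows with FP
    print(f"{name}: recall={r:.3f}, precision={p:.3f}")

# Output:
# Case 1: recall=0.750, precision=0.857
# Case 2: recall=0.800, precision=0.853
# All positive: recall=1.000, precision=0.800
```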

 
