Summary of model evaluation metrics in deep learning (confusion matrix, recall, precision, F1 score, AUC, ROC curve, error rate)


Navigation

  • 0. Confusion matrix
  • 1. AUC (Area Under the Curve)
  • 2. ROC curve
  • 3. F1 score

0. Confusion matrix

Confusion matrix for the worked example used below:

                        Predicted positive    Predicted negative    Total
  Actually positive         TP = 38               FN = 2              40
  Actually negative         FP = 2                TN = 18             20

  • True positives (TP): we predict "yes" (they have the disease), and they actually do. // a positive sample correctly predicted as positive
  • True negatives (TN): we predict "no", and they actually do not have the disease. // a negative sample correctly predicted as negative
  • False positives (FP): we predict "yes", but they do not actually have the disease (also known as a "Type I error"). // a negative sample incorrectly predicted as positive
  • False negatives (FN): we predict "no", but they actually do have the disease (also known as a "Type II error"). // a positive sample incorrectly predicted as negative

  • False positive rate (FPR): the proportion of samples predicted as positive but actually negative among all negative samples. // the fraction of all negative samples that are incorrectly predicted as positive
    • FPR = FP / (FP + TN)

  • Recall / sensitivity / true positive rate (TPR): the proportion of samples predicted as positive and actually positive among all positive samples. // the fraction of all positive samples that are correctly predicted as positive
    • TPR = TP / (TP + FN)

  • Specificity: the fraction of all negative samples that are correctly predicted as negative.
    • Specificity = TN / (TN + FP)

  • Positive predictive value (PPV) / precision: the fraction of all samples predicted as positive that are truly positive. // how many of the predicted positives are true positives
    • PPV / Precision = TP / (TP + FP)

  • Negative predictive value (NPV): the fraction of all samples predicted as negative that are truly negative. // how many of the predicted negatives are true negatives
    • NPV = TN / (TN + FN)

  • Reading the table above:
    • There are 40 positive samples and 20 negative samples in total;
    • Of the positive samples, 38 are predicted as positive and 2 are predicted as negative;
    • Of the negative samples, 18 are predicted as negative and 2 are predicted as positive;
    • The false positive rate is FPR = 2 / (2 + 18) = 0.1;
    • The recall / sensitivity / true positive rate is TPR = 38 / (38 + 2) = 0.95.
    • A short code sketch reproducing these numbers is given at the end of this section.

  • In the medical field:
    • Sensitivity / recall focuses on the missed-diagnosis rate (sick people must not be missed);
    • Specificity focuses on the misdiagnosis rate (healthy people must not be wrongly flagged);
    • False positive rate = 1 - specificity; more false positives mean more misdiagnoses;
    • Positive predictive value / precision measures how many of the predicted positives are true positives;
    • Negative predictive value measures how many of the predicted negatives are true negatives.
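
As a quick check, here is a minimal added sketch (plain Python; not part of the original post) that reproduces the numbers of the worked example above, using TP = 38, FN = 2, FP = 2, TN = 18.

# Sketch: metrics from the confusion-matrix example above (TP=38, FN=2, FP=2, TN=18).
TP, FN, FP, TN = 38, 2, 2, 18

TPR = TP / (TP + FN)                           # recall / sensitivity / true positive rate
FPR = FP / (FP + TN)                           # false positive rate = 1 - specificity
specificity = TN / (TN + FP)
PPV = TP / (TP + FP)                           # precision / positive predictive value
NPV = TN / (TN + FN)
error_rate = (FP + FN) / (TP + TN + FP + FN)   # error rate: fraction of wrong predictions

print(f"TPR={TPR:.2f}, FPR={FPR:.2f}, specificity={specificity:.2f}")
print(f"PPV={PPV:.2f}, NPV={NPV:.2f}, error rate={error_rate:.3f}")
# Expected: TPR=0.95, FPR=0.10, specificity=0.90, PPV=0.95, NPV=0.90, error rate=0.067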

1. AUC (Area Under the Curve)

  • Commonly used to evaluate binary classification models

  • Understanding 1: the area under the ROC curve

  • Understanding 2: randomly draw one pair of samples (one positive and one negative) and score both with the trained model; the AUC is the probability that the positive sample receives a higher predicted score than the negative one

  • Advantages:

    • It is not affected by class imbalance: different positive/negative sample ratios do not change the AUC evaluation.
    • AUC can also serve as the training objective (in practice through a differentiable surrogate, since AUC itself is not differentiable)
  • Calculation method 1:

    • In a data set with M positive samples and N negative samples there are M*N sample pairs in total (each pair consists of one positive sample and one negative sample). Count how many of these M*N pairs have the positive sample's predicted probability greater than the negative sample's, then divide by M*N.
    • (figure: four example samples with their predicted scores; c and d are positive, a and b are negative)
    • Suppose there are 4 samples: 2 positive (c, d) and 2 negative (a, b), so M*N = 4.
      That is, there are 4 sample pairs in total:
      (d, b), (d, a), (c, b), (c, a)
      In the pair (d, b), the positive sample d has a higher predicted probability than the negative sample b (the score of d exceeds that of b), so the pair is recorded as 1; the same holds for (d, a) and (c, a).
      In the pair (c, b), the positive sample c has a lower predicted probability than the negative sample b, so it is recorded as 0.
      Therefore AUC = (1 + 1 + 1 + 0) / 4 = 0.75
  • Calculation method 2:

    • Sort predicted probabilities from high to low

    • Assign a rank to each probability value (the highest probability gets rank n, the second highest n - 1, and so on down to rank 1 for the lowest)

    • The rank of a sample effectively counts how many samples have a score no higher than its own (including itself).
      We want the number of pairs in which a positive sample scores higher than a negative sample. Take the top-ranked sample: if it is positive, its rank n counts comparisons with every sample, but the comparisons with itself and with the other M - 1 positive samples must be excluded; the same reasoning applies to the positive sample ranked second, and so on. Summed over all M positive samples, the excluded comparisons amount to M*(M+1)/2, which is why this term is subtracted from the sum of the positive samples' ranks. As a check, if every positive sample scores higher than every negative sample, the formula below gives AUC = 1.

    • Subtract M*(M+1)/2 and divide by M*N:
      AUC = ( sum of the ranks of the positive samples - M*(M+1)/2 ) / (M*N)
      (figure: a worked example with 7 samples sorted by predicted score, where the positive class is "dog")

    • Tip: for samples with equal predicted scores it does not matter whether the positive or the negative one comes first; each tied sample is given the average of the rank positions its tied group occupies (as in the example below).

    • The positive samples are dogs: there are 4 of them;

    • The negative samples are the others: there are 3;

    • Only the rank values of the positive samples are needed:

    • For positive sample b (tied with three other samples over rank positions 2-5), its rank value is (5 + 4 + 3 + 2) / 4 = 7/2

    • For positive sample c (in the same tied group), its rank value is (5 + 4 + 3 + 2) / 4 = 7/2

    • For positive sample f, its rank value is 6

    • For positive sample g, its rank value is 7

    • AUC = { 6 + 7 + 7/2 + 7/2 - [ 4 * (4 + 1) ] / 2 } / (4 * 3) = 10 / 12 ≈ 0.833
  • Python implementation (using sklearn; a sketch verifying the two hand calculations follows this block)

    import numpy as np
    from sklearn.metrics import roc_curve
    from sklearn.metrics import auc
    
    # toy ground-truth labels and predicted probabilities
    y = np.array([1, 1, 0, 0, 1, 0, 1, 0])
    pred = np.array([0.77, 0.8, 0.6, 0.1, 0.4, 0.9, 0.66, 0.7])
    
    # compute the ROC points, then the area under the curve
    fpr, tpr, thresholds = roc_curve(y, pred, pos_label=1)
    print("AUC:", auc(fpr, tpr))
    
    # output: AUC: 0.5625
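
To connect the two hand calculations with the library result, here is a minimal added sketch (not part of the original post) that computes the AUC for the same y and pred arrays both by direct pairwise counting (calculation method 1) and by the rank formula (calculation method 2); both give 0.5625, matching the sklearn value above.

import numpy as np

y = np.array([1, 1, 0, 0, 1, 0, 1, 0])
pred = np.array([0.77, 0.8, 0.6, 0.1, 0.4, 0.9, 0.66, 0.7])

# Calculation method 1: count the positive/negative pairs where the positive sample scores higher.
pos_scores = pred[y == 1]
neg_scores = pred[y == 0]
wins = 0.0
for p in pos_scores:
    for n in neg_scores:
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5          # ties count as half, consistent with the averaged-rank convention
auc_pairwise = wins / (len(pos_scores) * len(neg_scores))

# Calculation method 2: AUC = (sum of positive ranks - M*(M+1)/2) / (M*N),
# with rank 1 for the lowest score and tied scores given the average of their rank positions.
order = np.argsort(pred)                     # indices sorted by ascending score
ranks = np.empty(len(pred))
ranks[order] = np.arange(1, len(pred) + 1)
for s in np.unique(pred):                    # average ranks over tied scores (none in this example)
    mask = pred == s
    ranks[mask] = ranks[mask].mean()
M, N = len(pos_scores), len(neg_scores)
auc_rank = (ranks[y == 1].sum() - M * (M + 1) / 2) / (M * N)

print(auc_pairwise, auc_rank)   # both 0.5625, matching the sklearn value above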
    

2. ROC curve (Receiver Operating Characteristic curve)

  • Used to measure the quality of a binary classifier;
  • If the ROC curve of one learner completely encloses the ROC curve of another, the first learner performs better;
  • Ordinate: TPR = TP / (TP + FN) (true positive rate / recall / sensitivity)
  • Abscissa: FPR = FP / (FP + TN) (false positive rate)
  • Python implementation (a threshold-sweep sketch showing how the curve's points arise follows the example)


import numpy as np
import matplotlib.pyplot as plt
import sklearn.metrics as metrics

def plot_ROC(labels, preds, savepath):
    """
    Args:
        labels : ground truth
        preds : model predictions (scores or probabilities)
        savepath : path of the saved figure
    """
    # compute the false positive rate and true positive rate at each threshold
    fpr, tpr, thresholds = metrics.roc_curve(labels, preds, pos_label=1)
    # compute the AUC: the area enclosed under the curve, the larger the better
    roc_auc1 = metrics.auc(fpr, tpr)

    plt.figure(figsize=(10, 10))
    lw = 2
    # plot the curve: false positive rate on the x axis, true positive rate on the y axis
    plt.plot(fpr, tpr, color='darkorange',
             lw=lw, label='AUC = %0.2f' % roc_auc1)
    plt.plot([0, 1], [0, 1], color='navy', lw=lw, linestyle='--')
    plt.xlim([-0.05, 1.05])
    plt.ylim([-0.05, 1.05])
    plt.xlabel('1 - Specificity')
    plt.ylabel('Sensitivity')
    # plt.title('ROCs for Densenet')
    plt.legend(loc="lower right")
    # plt.show()
    plt.savefig(savepath)  # save the figure

if __name__ == "__main__":
    y = np.array([1, 1, 0, 0, 1, 0, 1, 0])
    pred = np.array([0.77, 0.8, 0.6, 0.1, 0.4, 0.9, 0.66, 0.7])
    savepath = "./ROC.jpg"
    plot_ROC(y, pred, savepath)

The resulting ROC curve is saved to ./ROC.jpg (the dashed diagonal is the random-guess baseline).
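
To make explicit how the points of the curve arise, the following added sketch (an illustration, not part of the original code) sweeps the candidate thresholds by hand: at each threshold every sample with a score at or above it is predicted positive, giving one (FPR, TPR) point. These points lie on the same curve that roc_curve traces (sklearn may additionally drop collinear intermediate points).

import numpy as np

y = np.array([1, 1, 0, 0, 1, 0, 1, 0])
pred = np.array([0.77, 0.8, 0.6, 0.1, 0.4, 0.9, 0.66, 0.7])

# sweep thresholds from high to low; at each threshold, predict positive when score >= t
for t in sorted(np.unique(pred), reverse=True):
    pred_label = (pred >= t).astype(int)
    TP = np.sum((pred_label == 1) & (y == 1))
    FP = np.sum((pred_label == 1) & (y == 0))
    FN = np.sum((pred_label == 0) & (y == 1))
    TN = np.sum((pred_label == 0) & (y == 0))
    tpr = TP / (TP + FN)   # ordinate of the ROC curve
    fpr = FP / (FP + TN)   # abscissa of the ROC curve
    print(f"threshold={t:.2f}  FPR={fpr:.2f}  TPR={tpr:.2f}")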

Plotting the ROC curves of two models

def plot_ROC_2(labels1, preds1, labels2, preds2, savepath):
    """
    Args:
        labels1, preds1 : ground truth and predictions of the first model
        labels2, preds2 : ground truth and predictions of the second model
        savepath : path of the saved figure
    """
    plt.figure(figsize=(10, 10))

    # ROC points and AUC of the first model
    fpr1, tpr1, threshold1 = metrics.roc_curve(labels1, preds1)
    roc_auc1 = metrics.auc(fpr1, tpr1)
    plt.plot(fpr1, tpr1, color='darkorange', lw=2, label='AUC = %0.4f' % roc_auc1)

    # ROC points and AUC of the second model
    fpr2, tpr2, threshold2 = metrics.roc_curve(labels2, preds2)
    roc_auc2 = metrics.auc(fpr2, tpr2)
    plt.plot(fpr2, tpr2, color='red', lw=2, label='AUC = %0.4f' % roc_auc2)

    plt.xlim([-0.05, 1.05])
    plt.ylim([-0.05, 1.05])
    plt.xlabel('1 - Specificity')
    plt.ylabel('Sensitivity')
    # plt.title('ROCs for Densenet')
    plt.legend(loc="lower right")
    plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
    plt.show()
    # plt.savefig(savepath)  # or save the figure instead of showing it

if __name__ == "__main__":
    y1 = np.array([1, 1, 0, 0, 1, 0, 1, 0])
    pred1 = np.array([0.77, 0.8, 0.6, 0.1, 0.4, 0.9, 0.66, 0.7])

    y2 = np.array([0, 1, 1, 1, 1, 1, 0, 0])
    pred2 = np.array([0.87, 0.91, 0.6, 0.67, 0.3, 0.9, 0.16, 0.8])
    savepath = "./"
    plot_ROC_2(y1, pred1, y2, pred2, savepath)

(figure: ROC curves of the two models drawn on the same axes)


3. F1 score

  • The F1 score balances precision and recall (sensitivity / true positive rate); it is high only when both are high.
  • The Python script below plots the F1 surface over precision and recall (a numeric check with sklearn follows the plot):
"""
Precision = tp/tp+fp
Recall = tp/tp+fn
进而计算得到:
F1score = 2 * Precision * Recall /(Precision + Recall)

"""
import numpy as np
import matplotlib.pyplot as plt
 
fig = plt.figure()  #定义新的三维坐标轴
ax3 = plt.axes(projection='3d')
 
#定义三维数据
precision = np.arange(0.01, 1, 0.1)
recall = np.arange(0.01, 1, 0.1)
X, Y = np.meshgrid(precision, recall)   # 用两个坐标轴上的点在平面上画网格
Z = 2*X*Y/(X+Y)
 
# 作图
ax3.plot_surface(X, Y, Z, rstride = 1, cstride = 1, cmap='rainbow')
plt.xlabel('precision')
plt.ylabel('recall')
plt.title('F1 score')
plt.show()

(figure: 3D surface of the F1 score as a function of precision and recall)
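
As a numeric companion to the surface plot, here is a small added sketch (reusing the toy labels and scores from the AUC section, with an assumed decision threshold of 0.5) that computes precision, recall and the F1 score with sklearn.

import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score

y = np.array([1, 1, 0, 0, 1, 0, 1, 0])
pred = np.array([0.77, 0.8, 0.6, 0.1, 0.4, 0.9, 0.66, 0.7])
pred_label = (pred >= 0.5).astype(int)       # assumed threshold of 0.5 turns scores into labels

precision = precision_score(y, pred_label)   # TP / (TP + FP) = 3/6 = 0.50
recall = recall_score(y, pred_label)         # TP / (TP + FN) = 3/4 = 0.75
f1 = f1_score(y, pred_label)                 # 2*P*R / (P + R) = 0.60
print(precision, recall, f1)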

Source: blog.csdn.net/crist_meng/article/details/125640618