Chapter 7 Model assessment

7.1 Classification model assessment

1. Binary classification

  The diagonal elements of the confusion matrix are the numbers of correctly classified samples;

  the off-diagonal elements are the numbers of misclassified samples.

So the ideal model (evaluated on the test set) produces a diagonal confusion matrix. If a purely diagonal matrix is out of reach, a model whose diagonal entries dominate the total is still acceptable.
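As a small illustration of reading the diagonal, here is a minimal sketch using scikit-learn's confusion_matrix. The labels are made up for the example and are not from the model in this chapter.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical true labels and predictions, invented for illustration
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 0])
y_pred = np.array([0, 0, 0, 1, 1, 1, 1, 0, 1, 0])

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)

correct = np.trace(cm)      # sum of the diagonal: correctly classified
wrong = cm.sum() - correct  # everything off the diagonal: misclassified

print(cm)
print("correct:", correct, "misclassified:", wrong)
```

Here the diagonal holds 8 of the 10 samples, so the matrix is not diagonal but the diagonal clearly dominates.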

Pursuing precision alone may lower recall: a stricter decision threshold removes false positives, but it also turns some true positives into false negatives.
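The trade-off can be seen by scoring the same predictions at two thresholds. This is only a sketch: the probabilities, the two threshold values, and the helper precision_recall_at are hypothetical and chosen for illustration.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels and predicted positive-class probabilities
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1])
y_prob = np.array([0.1, 0.3, 0.4, 0.6, 0.65, 0.8, 0.2, 0.9])

def precision_recall_at(threshold):
    """Binarize the probabilities at `threshold` and score the result."""
    y_pred = (y_prob >= threshold).astype(int)
    return precision_score(y_true, y_pred), recall_score(y_true, y_pred)

p_low, r_low = precision_recall_at(0.35)   # lenient threshold
p_high, r_high = precision_recall_at(0.7)  # strict threshold

print("threshold 0.35: precision", p_low, "recall", r_low)
print("threshold 0.70: precision", p_high, "recall", r_high)
```

On this toy data the stricter threshold raises precision from 0.8 to 1.0 while recall drops from 1.0 to 0.5.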

 

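 2. Multi-class classification

 Values of the average parameter of recall_score():

  binary: for binary classification

  micro: the first multi-class approach, computed globally from the pooled confusion-matrix counts

  macro: the unweighted mean of the per-class scores

  weighted: the mean of the per-class scores, weighted by the number of samples in each class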
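To compare the averaging modes side by side, here is a minimal sketch with recall_score on a made-up three-class example. The labels are hypothetical; average=None is used only to show the per-class values the other modes are built from.

```python
from sklearn.metrics import recall_score

# Hypothetical three-class labels, invented for illustration
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 2, 2, 2, 2, 2]

per_class = recall_score(y_true, y_pred, average=None)      # one recall per class
micro = recall_score(y_true, y_pred, average="micro")       # pooled counts
macro = recall_score(y_true, y_pred, average="macro")       # plain mean of per_class
weighted = recall_score(y_true, y_pred, average="weighted") # mean weighted by class size

print("per class:", per_class)
print("micro:", micro, "macro:", macro, "weighted:", weighted)
```

The per-class recalls are 0.75, 0.5, and 1.0, so macro gives 0.75, while micro and weighted both give 0.8. For recall specifically, the weighted average always coincides with the micro average; the two differ for precision and F1.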

3. Charts that reflect classification results, and choosing a threshold from the ROC curve

(1) ROC and AUC

 

 Selection criterion: make TPR as large as possible and FPR as small as possible, so choose the threshold at the curve's inflection point.

AUC is the area under the ROC curve; it directly reflects how close the ROC curve comes to the top-left corner of the plot.
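One way to make the selection criterion concrete is to pick the ROC point closest to the top-left corner (FPR = 0, TPR = 1). This is a sketch under that assumption; it is one common heuristic, not necessarily the rule intended above, and the data is invented.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical labels and predicted positive-class probabilities
y_true = np.array([1, 1, 1, 0, 1, 1, 0, 0, 0, 0])
y_prob = np.array([0.95, 0.9, 0.85, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1])

fpr, tpr, thresholds = roc_curve(y_true, y_prob)

# Euclidean distance of every ROC point to the ideal corner (0, 1)
dist = np.sqrt(fpr ** 2 + (1 - tpr) ** 2)
best = np.argmin(dist)
best_threshold = thresholds[best]

auc_value = roc_auc_score(y_true, y_prob)

print("best threshold:", best_threshold, "FPR:", fpr[best], "TPR:", tpr[best])
print("AUC:", auc_value)
```

On this toy data the chosen threshold is 0.5, where TPR is 1.0 and FPR is 0.2, and the AUC is 0.92.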

 

How do you plot an ROC curve?

 

https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html#sklearn.metrics.roc_curve

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc, roc_auc_score

# mdl (the trained model) and the X_*/Y_* splits are assumed to be defined earlier
xy_lst = [(X_train, Y_train), (X_validation, Y_validation), (X_test, Y_test)]

f = plt.figure()

for i in range(len(xy_lst)):
    X_part = xy_lst[i][0]
    Y_part = xy_lst[i][1]
    y_pred = mdl.predict(X_part)
    # y_pred = mdl.predict_classes(X_part)
    # predict() outputs continuous values; predict_classes() outputs class labels
    # print(i)
    print(y_pred)
    # Keep only the positive-class column and flatten it to a 1-D array
    y_pred = np.array(y_pred[:, 1]).reshape((1, -1))[0]
    # from sklearn.metrics import accuracy_score, recall_score, f1_score
    # print(i, '---:', 'Neural Network', 'accuracy:', accuracy_score(Y_part, y_pred),
    #       'recall:', recall_score(Y_part, y_pred),
    #       'F1 score:', f1_score(Y_part, y_pred))
    f.add_subplot(1, 3, i + 1)
    fpr, tpr, thresholds = roc_curve(Y_part, y_pred)
    plt.plot(fpr, tpr)
    # These two functions give the same result
    print('Neural Network', 'AUC', auc(fpr, tpr))
    print('Neural Network', 'AUC Score', roc_auc_score(Y_part, y_pred))

# Show the three subplots once, after the loop has drawn all of them
plt.show()
'''
Neural Network AUC 0.9610879734019506
Neural Network AUC Score 0.9610879734019506
Neural Network AUC 0.961721658936862
Neural Network AUC Score 0.961721658936862
Neural Network AUC 0.9637020039792525
Neural Network AUC Score 0.9637020039792525
'''

(2) Gain chart and KS chart

The KS chart focuses on the gap between the TPR curve and the FPR curve; this gap reflects how well the model discriminates positive-class samples.
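The gap can be computed directly from the output of roc_curve. The sketch below takes the largest TPR minus FPR gap as the KS value, which is the usual definition but is stated here as an assumption; the data is hypothetical.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Hypothetical labels and predicted positive-class probabilities
y_true = np.array([1, 1, 0, 1, 1, 0, 0, 0])
y_prob = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2])

# Keep every threshold so the gap is evaluated along the whole curve
fpr, tpr, thresholds = roc_curve(y_true, y_prob, drop_intermediate=False)

gaps = tpr - fpr             # vertical distance between the two curves
ks_index = np.argmax(gaps)
ks_value = gaps[ks_index]
ks_threshold = thresholds[ks_index]

print("KS:", ks_value, "at threshold:", ks_threshold)
```

On this toy data the largest gap is 0.75, reached at a threshold of 0.5. Plotting tpr and fpr against the thresholds gives the two curves of the KS chart.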

 

 

 

 

 

 


Origin www.cnblogs.com/Cheryol/p/11442812.html