Model Evaluation Methods (2019-05-30)


Confusion matrix

|                    | Actual positive | Actual negative |
| ------------------ | --------------- | --------------- |
| Predicted positive | TP              | FP              |
| Predicted negative | FN              | TN              |

Evaluation metrics

Basic metrics

Recall / sensitivity: TP / (TP + FN)
Precision: TP / (TP + FP)
Accuracy: (TP + TN) / (TP + FP + TN + FN)
F1 score: 2PR / (P + R), the harmonic mean of precision (P) and recall (R), a combined measure of the two.
F1 variants: e.g., the more general Fβ score, (1 + β²)PR / (β²P + R), which weights recall β times as heavily as precision.
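The basic metrics above can be sketched directly from the four confusion-matrix counts. A minimal illustration (the function name and example counts are my own, not from the original post):

```python
# Basic metrics from confusion-matrix counts (TP, FP, FN, TN).
# Assumes the denominators are non-zero.

def basic_metrics(tp, fp, fn, tn):
    recall = tp / (tp + fn)                              # sensitivity / TPR
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of P and R
    return recall, precision, accuracy, f1

r, p, acc, f1 = basic_metrics(tp=40, fp=10, fn=20, tn=30)
print(round(r, 3), round(p, 3), round(acc, 3), round(f1, 3))
```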

Other metrics

  • ROC: Before evaluation, sort the model's predicted scores and treat each score as a candidate probability threshold; classifying at each threshold yields its own confusion matrix. From each confusion matrix, compute the following two quantities:

    • TPR: TP / (TP + FN), the true positive rate, i.e., an analysis of the first column of the confusion matrix (the actual positives).
    • FPR: FP / (FP + TN), the false positive rate, i.e., an analysis of the second column (the actual negatives).

    The ROC curve is then generated with FPR on the x-axis and TPR on the y-axis.
  • PRC: generally similar in spirit to the ROC, but the PRC is the curve of P (precision) against R (recall), whereas the ROC is the curve of TPR against FPR.

  • AUC: the area under the ROC curve.
    An AUC of 0.5 means the model's predictions are indistinguishable from random guessing; an AUC below 0.5 suggests the positive and negative labels may have been swapped and should be checked.

  • Gini coefficient: 2 * AUC - 1. Its geometric meaning: the ratio of the area between the ROC curve and the diagonal to the area of the triangle above the diagonal (within the unit square bounded by the x-axis, the y-axis, x = 1, and y = 1).

  • KS (Kolmogorov-Smirnov): max(TPR - FPR), the maximum gap between the TPR and FPR curves. KS reflects the model's best achievable separation, so the threshold at which KS is reached is often taken as the model's optimal classification threshold. Threshold selection should still consider the application scenario: for recall-sensitive applications, the threshold can be lowered to obtain a higher recall.
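The threshold sweep described above (one confusion matrix per threshold, then AUC, Gini, and KS from the resulting curve) can be sketched in plain Python. This is my own illustration, not code from the original post; it assumes binary labels in {0, 1} and, for simplicity, distinct scores:

```python
# Sweep every predicted score as a threshold, accumulate (FPR, TPR) points,
# then compute AUC (trapezoidal rule), Gini = 2*AUC - 1, and KS = max(TPR - FPR).

def roc_points(y_true, scores):
    # Sort by descending score; each prefix of the sorted list corresponds
    # to classifying everything above one threshold as positive.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    pos = sum(y_true)
    neg = len(y_true) - pos
    tpr_list, fpr_list = [0.0], [0.0]
    tp = fp = 0
    for i in order:
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        tpr_list.append(tp / pos)
        fpr_list.append(fp / neg)
    return fpr_list, tpr_list

def auc_gini_ks(y_true, scores):
    fpr, tpr = roc_points(y_true, scores)
    # Trapezoidal area under the ROC curve.
    auc = sum((fpr[i + 1] - fpr[i]) * (tpr[i + 1] + tpr[i]) / 2
              for i in range(len(fpr) - 1))
    gini = 2 * auc - 1
    ks = max(t - f for t, f in zip(tpr, fpr))
    return auc, gini, ks

y = [1, 1, 0, 1, 0, 0]
s = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(auc_gini_ks(y, s))
```

In practice a library routine (e.g. scikit-learn's `roc_curve` and `auc`) would handle tied scores and edge cases; the sketch above only shows the mechanics.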

Discussion

Since TPR = Recall, the ordinate of the ROC curve equals the abscissa of the PRC curve.
Comparing TPR, FPR, Precision, and Recall, note that the denominators of TPR and Recall are the actual positives, while the denominator of FPR is the actual negatives. So once the dataset is fixed, these three metrics increase monotonically with their numerators. Precision's denominator, however, is the number of predicted positives, which changes with the threshold, so Precision varies less predictably (TP and FP are constrained by the sample composition; when the classes are imbalanced, one of them will be large and the other small). Compared with the PRC, the ROC is therefore much more stable, and when the samples are sufficient, the ROC curve adequately reflects the model's quality.
Both the ROC and the PRC offer some insight into model performance, so both can be drawn when evaluating a model.

Model selection methods

Hold-out, cross-validation, and the bootstrap (bootstrapping):
When the dataset is small and hard to split into training and test sets, the bootstrap works well.
When the dataset is small but can still be split effectively, leave-one-out is an effective method.
When the dataset is large, cross-validation is the better choice.
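The three split strategies can be sketched as follows. This is my own minimal illustration under simplifying assumptions (the function names are not from the post, and the k-fold split here is a simple striped partition rather than a shuffled one):

```python
import random

def holdout(n, test_ratio=0.3, seed=0):
    """Hold-out: one shuffled train/test split."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(n * (1 - test_ratio))
    return idx[:cut], idx[cut:]  # train indices, test indices

def kfold(n, k=5):
    """k-fold cross-validation: each fold serves as the test set once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

def bootstrap(n, seed=0):
    """Bootstrap: sample n indices with replacement; the never-drawn
    ("out-of-bag", roughly 36.8% of) indices form the test set."""
    rng = random.Random(seed)
    train = [rng.randrange(n) for _ in range(n)]
    test = [i for i in range(n) if i not in set(train)]
    return train, test
```

In a real project, library implementations such as scikit-learn's `train_test_split` and `KFold` would be used instead.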


Source: blog.csdn.net/weixin_33924312/article/details/91034191