scikit-learn multi-classification confusion matrix


Preface

sklearn.metrics.multilabel_confusion_matrix is a function added in scikit-learn 0.21. As the name suggests, it computes confusion matrices for multilabel classification, but it can also be used to compute the confusion matrices of a multi-class problem. MCM splits the multi-class data into several binary classification problems using a one-vs-rest strategy: one class is treated as the positive class and all remaining classes as negative. A confusion matrix is computed with each class in turn as the positive class, and the matrices are returned in the order of the labels.
For each binary split, MCM returns a 2x2 confusion matrix, with TN at [0, 0], FP at [0, 1], FN at [1, 0], and TP at [1, 1], i.e.,
| TN | FP |
| - | - |
| FN | TP |

Official example

## If the import fails, check that your scikit-learn version is >= 0.21
>>> from sklearn.metrics import multilabel_confusion_matrix
>>> y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
>>> y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
>>> mcm = multilabel_confusion_matrix(y_true, y_pred,
...                             labels=["ant", "bird", "cat"])
>>> mcm
array([[[3, 1],
        [0, 2]],
       [[5, 0],
        [1, 0]],
       [[2, 1],
        [1, 2]]])

Take the first class, 'ant', as an example: two of its positive samples are predicted correctly; of its negative samples ('bird' and 'cat'), three are predicted correctly ('bird' being predicted as 'cat' still counts as correct here, because both belong to the same negative class); and one negative sample is predicted as positive.
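
The one-vs-rest splitting can be checked by hand: binarize the labels for one class and feed them to the ordinary confusion_matrix, which should reproduce the corresponding slice of mcm. A minimal sketch for the 'ant' class (the y_true_bin / y_pred_bin names are just for illustration):

>>> from sklearn.metrics import confusion_matrix
>>> y_true_bin = [1 if y == "ant" else 0 for y in y_true]   # 'ant' as positive, the rest as negative
>>> y_pred_bin = [1 if y == "ant" else 0 for y in y_pred]
>>> confusion_matrix(y_true_bin, y_pred_bin)                # same as mcm[0]
array([[3, 1],
       [0, 2]])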

Evaluation metrics

The TP, FP, and other counts of each class can be extracted as follows:

>>> tp = mcm[:, 1, 1]
>>> tn = mcm[:, 0, 0]
>>> fn = mcm[:, 1, 0]
>>> tp, tn
(array([2, 0, 2], dtype=int64), array([3, 5, 2], dtype=int64))

Here are a few common evaluation metrics:

  1. Sensitivity, also known as recall. This metric is the proportion of actual positive samples that are predicted as positive. It measures how sensitive the predictor is to positive samples: the larger the value, the more sensitive the predictor is to positive samples.
    $$ sn = \frac{tp}{tp + fn} $$
  2. Specificity. In contrast to sensitivity, which focuses on positive samples, specificity focuses on negative samples; in other words, it is the sensitivity to negative samples. After all, a predictor should not be sensitive only to positive samples while being lax about negative ones, so its sensitivity to negative samples needs to be evaluated as well.
    $$ sp = \frac{tn}{tn + fp} $$
  3. Precision. This metric is the proportion of correctly predicted positive samples among all samples predicted as positive.
    $$ precision = \frac{tp}{tp + fp} $$
  4. F1 score. In general, recall and precision cannot both be high at the same time. For example: suppose you have 100 samples of class A and 100 of class B, and you train a model to predict A. It predicts 80 samples as A, of which 75 really are A. The precision is 75 / 80 = 0.9375 and the recall is 75 / 100 = 0.75. You think the recall is too low, so you keep improving the model and predict again; this time 95 samples are predicted as A, of which 85 are correct, so the recall is 85 / 100 = 0.85, an increase of 0.10, but the precision drops to 85 / 95 ≈ 0.895. The more you try to cover, the more mistakes you make. To take care of both, so that both metrics are reasonably good, the F1 score is used (the F1 for both rounds is worked out in the short sketch after this list):
    $$ f1 = \frac{2 \times precision \times recall}{precision + recall} $$
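
As a quick sanity check on the numbers in item 4 (the two prediction rounds there are purely hypothetical), the F1 for each round can be computed directly from the formula:

>>> p1, r1 = 75 / 80, 75 / 100   # first round: precision 0.9375, recall 0.75
>>> p2, r2 = 85 / 95, 85 / 100   # second round: precision ~0.895, recall 0.85
>>> round(2 * p1 * r1 / (p1 + r1), 4), round(2 * p2 * r2 / (p2 + r2), 4)
(0.8333, 0.8718)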

The per-class values of these metrics for the multi-class problem are easy to obtain in code:

>>> sn = tp / (tp + fn)  ## the other metrics follow the same pattern
>>> sn
array([1.        , 0.        , 0.66666667])
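
Specificity works the same way once fp is pulled out of each 2x2 matrix, and the per-class recall can be cross-checked against scikit-learn's own recall_score with average=None; a short sketch continuing from the arrays above:

>>> fp = mcm[:, 0, 1]
>>> sp = tn / (tn + fp)   # specificity per class
>>> sp
array([0.75      , 1.        , 0.66666667])
>>> from sklearn.metrics import recall_score
>>> recall_score(y_true, y_pred, labels=["ant", "bird", "cat"], average=None)
array([1.        , 0.        , 0.66666667])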

When one-vs-rest is used to split a multi-class problem into binary problems, some information is lost. The negative samples contain several different classes, but as long as a negative sample is not predicted as the positive class, the prediction counts as correct, regardless of whether its own label was predicted correctly within the "rest". So this does not properly evaluate the predictions inside the "rest". For a better evaluation of multi-class classification, macro-averaged or micro-averaged metrics should be considered.
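
As a rough illustration of those averages on the same toy data (not part of the original post), scikit-learn's f1_score can compute both; note that a warning about the undefined precision for 'bird' may be printed, since 'bird' is never predicted:

>>> from sklearn.metrics import f1_score
>>> round(f1_score(y_true, y_pred, average="macro"), 4)   # unweighted mean of per-class F1
0.4889
>>> round(f1_score(y_true, y_pred, average="micro"), 4)   # from the pooled tp / fp / fn
0.6667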

Reference

sklearn.metrics.multilabel_confusion_matrix

Original post: www.cnblogs.com/huanping/p/10959271.html