Machine Learning Metric Calculation

1. Summary of classification indicators, based on the confusion matrix shown in the figure below.

[Figure: confusion matrix defining TP, FP, FN and TN]

  1. accuracy

$\text{accuracy}=\frac{TP+TN}{TP+TN+FP+FN}$

Accuracy is the ratio of the number of correctly predicted samples to the total number of samples. It does not distinguish whether the predicted samples are positive or negative; all samples are counted.
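As a minimal sketch (the toy labels and predictions below are made up for illustration, with 1 as the positive class), the four counts and the accuracy can be computed directly:

# Sketch: count TP/TN/FP/FN for binary labels (1 = positive, 0 = negative)
# and compute accuracy. Toy data, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

TP = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
TN = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
FP = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
FN = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = (TP + TN) / (TP + TN + FP + FN)
print(TP, TN, FP, FN, accuracy)  # counts and accuracy for the toy data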

  2. precision (precision rate)

$\text{precision}=\frac{TP}{TP+FP}$

Precision is the ratio of the number of correctly predicted positive samples to the number of all samples predicted as positive, that is, how many of the predicted positives are truly positive. Precision therefore only pays attention to the samples that are predicted to be positive.

  3. recall (recall rate)

$\text{recall}=\frac{TP}{TP+FN}$

Recall is the ratio of the number of correctly predicted positive samples to the total number of true positive samples, that is, how many of the actual positive samples the model manages to find.

  4. F-score

$\text{F-score}=\frac{2}{1/\text{precision}+1/\text{recall}}$

The F-score is the harmonic mean of precision and recall, and the intent is to take both indicators into account. From the formula we can see that if either recall or precision decreases, the F-score decreases, and vice versa.
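As a hedged illustration (reusing the toy labels and predictions from the accuracy sketch above), precision, recall and F-score can be obtained with scikit-learn:

# Sketch: precision, recall and F-score with scikit-learn (toy data, illustration only)
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

p = precision_score(y_true, y_pred)  # TP / (TP + FP)
r = recall_score(y_true, y_pred)     # TP / (TP + FN)
f = f1_score(y_true, y_pred)         # harmonic mean of precision and recall
print(p, r, f)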

  5. specificity

$\text{specificity}=\frac{TN}{TN+FP}$

The specificity indicator is seen less often. It is the counterpart of sensitivity (recall): the ratio of the number of correctly predicted negative samples to the total number of true negative samples, that is, how many of the actual negative samples the model manages to find.

  6. sensitivity (TPR)

$\text{sensitivity}=\frac{TP}{TP+FN}=\text{recall}$
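Since specificity is simply the recall of the negative class, a hedged way to compute sensitivity and specificity with scikit-learn (same toy data as above) is:

# Sketch: specificity as recall of the negative class (toy data, illustration only)
from sklearn.metrics import recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

sensitivity = recall_score(y_true, y_pred, pos_label=1)  # TP / (TP + FN)
specificity = recall_score(y_true, y_pred, pos_label=0)  # TN / (TN + FP)
print(sensitivity, specificity)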

  7. PR curve

We put precision on the vertical axis and recall on the horizontal axis, vary the classification threshold to obtain a series of (recall, precision) pairs, and connect them into a curve. For different models evaluated on the same data set we can draw their PR curves; in general, if one curve completely "encloses" another, the corresponding model can be considered the better classifier.
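A minimal sketch of obtaining such (recall, precision) pairs with scikit-learn's precision_recall_curve (the toy labels and scores below are made up for illustration):

# Sketch: (recall, precision) pairs across thresholds with scikit-learn (toy data)
import numpy as np
from sklearn.metrics import precision_recall_curve

y_true = np.array([1, 1, 0, 0, 1, 0])
scores = np.array([0.9, 0.7, 0.6, 0.4, 0.3, 0.1])  # model scores, made up

precision, recall, thresholds = precision_recall_curve(y_true, scores)
print(precision)
print(recall)
print(thresholds)
# Plotting recall on the x-axis and precision on the y-axis gives the PR curve.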

An example is shown in the figure below:
[Figure: example PR curves]
Indicators under sample imbalance

Background:

In most cases the costs of misclassifying different categories are not equal, that is, the cost of labeling a sample as positive is not comparable to the cost of labeling it as negative. For example, in spam filtering we hope that important emails are never misjudged as spam, while in cancer detection we would rather raise a false alarm than miss a case. In such situations the classification error rate alone is not a sufficient metric, because it conceals how the samples were misclassified. Therefore, when one category is more important than the others, precision, recall and the related indicators introduced above are more informative than the plain classification error rate.

  8. ROC (Receiver Operating Characteristic curve)

Class imbalance is common in real data sets, that is, there are many more negative samples than positive samples (or vice versa), and the distribution of positive and negative samples in the test data may also change over time. In this situation the ROC curve remains essentially unchanged. Moreover, the closer the ROC curve is to the upper-left corner, the better the classifier performs: it obtains a high true positive rate while keeping the false positive rate low.

The following is an example of a ROC curve:
[Figure: example ROC curve]
The abscissa of the curve is the false positive rate (FPR) and the ordinate is the true positive rate (TPR), where N is the number of true negative samples, FP is the number of those N negative samples that the classifier predicts as positive, and P is the number of true positive samples:

$FPR=\frac{FP}{FP+TN}$, $TPR=\frac{TP}{TP+FN}$

For example, suppose there are 20 samples in a binary classification task, with the following classification results:
[Table: 20 samples with their ground-truth labels and predicted scores]
Now set the threshold to 0.9: only the first sample (score 0.9) is classified as positive, and all the other samples are classified as negative. For this threshold we get FPR = 0 and TPR = 0.1 (there are 10 positive samples in total and 1 of them is correctly predicted), so the point (0, 0.1) must lie on the curve. Selecting different thresholds ("cut-off points") in turn, plotting all the key points, and connecting them finally gives the ROC curve shown in the figure below.

[Figure: ROC curve obtained by sweeping the threshold over all samples]
In fact there is a more intuitive way to draw the ROC curve: set the scale interval of the horizontal axis to $\frac{1}{N}$ and the scale interval of the vertical axis to $\frac{1}{P}$, where N and P are the numbers of negative and positive samples respectively. Sort the samples in descending order of the model's output score and traverse them, drawing the curve from the origin: every time a positive sample is encountered, move one interval up along the vertical axis; every time a negative sample is encountered, move one interval right along the horizontal axis. After all samples have been traversed, the curve is complete.
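A hedged sketch of this step-wise procedure, using the same toy labels and scores as the sklearn example below (label 2 is treated as the positive class; this is illustrative code, not the sklearn implementation):

# Sketch: build the ROC polyline by stepping 1/P up for each positive and
# 1/N right for each negative, after sorting samples by score descending.
import numpy as np

labels = np.array([1, 1, 2, 2])          # toy labels, 2 = positive class
scores = np.array([0.1, 0.4, 0.35, 0.8])

P = np.sum(labels == 2)                  # number of positives
N = np.sum(labels == 1)                  # number of negatives
order = np.argsort(-scores)              # indices sorted by score, descending

fpr_pts, tpr_pts = [0.0], [0.0]
for idx in order:
    if labels[idx] == 2:                 # positive: step up by 1/P
        fpr_pts.append(fpr_pts[-1])
        tpr_pts.append(tpr_pts[-1] + 1.0 / P)
    else:                                # negative: step right by 1/N
        fpr_pts.append(fpr_pts[-1] + 1.0 / N)
        tpr_pts.append(tpr_pts[-1])

print(fpr_pts)  # x coordinates of the ROC polyline
print(tpr_pts)  # y coordinates of the ROC polyline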

ROC curve drawing using sklearn:

>>> from sklearn import metrics
>>> import numpy as np
>>> y = np.array([1, 1, 2, 2])  # assume 4 samples
>>> scores = np.array([0.1, 0.4, 0.35, 0.8])
>>> fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)
>>> fpr  # false positive rate
array([ 0. ,  0.5,  0.5,  1. ])
>>> tpr  # true positive rate
array([ 0.5,  0.5,  1. ,  1. ])
>>> thresholds  # thresholds
array([ 0.8 ,  0.4 ,  0.35,  0.1 ])
>>> # AUC (explained later)
>>> auc = metrics.auc(fpr, tpr)
>>> auc
0.75

Plot the curve:

import matplotlib.pyplot as plt
plt.figure()
lw = 2
plt.plot(fpr, tpr, color='darkorange',
         lw=lw, label='ROC curve (area = %0.2f)' % auc)
plt.plot([0, 1], [0, 1], color='navy', lw=lw, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic example')
plt.legend(loc="lower right")
plt.show()

The drawn image is as shown in the figure:
[Figure: ROC curve produced by the plotting code above]
  9. AUC (Area Under Curve)

AUC is the area under the ROC curve. The AUC value can be interpreted as a probability: if you randomly pick one positive sample and one negative sample, the AUC is the probability that the classifier, according to its computed score, ranks the positive sample ahead of the negative one. The larger the AUC, the more likely the classifier is to rank positive samples ahead of negative samples, and therefore the better it separates the two classes.

def AUC(label, pre):
    """
    Works with Python 3.0 and above.
    """
    # Indices of the positive and negative samples, used to look up their scores.
    pos = [i for i in range(len(label)) if label[i] == 1]
    neg = [i for i in range(len(label)) if label[i] == 0]

    auc = 0
    for i in pos:
        for j in neg:
            if pre[i] > pre[j]:
                auc += 1
            elif pre[i] == pre[j]:
                auc += 0.5

    return auc / (len(pos) * len(neg))


if __name__ == '__main__':
    label = [1, 0, 0, 0, 1, 0, 1, 0]
    pre = [0.9, 0.8, 0.3, 0.1, 0.4, 0.9, 0.66, 0.7]
    print(AUC(label, pre))

Of course, AUC can also be calculated with the rank formula:

$AUC=\frac{\sum_{i \in \text{positiveClass}} \operatorname{rank}_{i}-\frac{M(1+M)}{2}}{M \times N}$

where M is the number of positive samples, N is the number of negative samples, and $\operatorname{rank}_i$ is the rank of sample i when all samples are sorted by predicted score in ascending order.
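A minimal sketch of this rank-based formula is shown below (the helper name auc_by_rank is my own; labels are assumed to be 0/1 with 1 as positive, and tied scores receive the average rank):

# Sketch: AUC via the rank formula; tied scores get the average rank.
def auc_by_rank(labels, preds):
    order = sorted(range(len(preds)), key=lambda i: preds[i])  # ascending by score
    ranks = [0.0] * len(preds)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and preds[order[j + 1]] == preds[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1  # 1-based average rank for the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    M = sum(1 for l in labels if l == 1)
    N = len(labels) - M
    rank_sum = sum(ranks[i] for i in range(len(labels)) if labels[i] == 1)
    return (rank_sum - M * (M + 1) / 2.0) / (M * N)

label = [1, 0, 0, 0, 1, 0, 1, 0]
pre = [0.9, 0.8, 0.3, 0.1, 0.4, 0.9, 0.66, 0.7]
print(auc_by_rank(label, pre))  # same value as the pairwise AUC() above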

An alternative, histogram-based implementation (which buckets the scores into bins and counts positive-negative pairs) is shown below:

import numpy as np

def auc_calculate(labels, preds, n_bins=100):
    positive_len = sum(labels)
    negative_len = len(labels) - positive_len
    total_case = positive_len * negative_len
    pos_histogram = [0 for _ in range(n_bins)]
    neg_histogram = [0 for _ in range(n_bins)]
    bin_width = 1.0 / n_bins
    for i in range(len(labels)):
        # clamp so that a score of exactly 1.0 falls into the last bin
        nth_bin = min(int(preds[i] / bin_width), n_bins - 1)
        if labels[i] == 1:
            pos_histogram[nth_bin] += 1
        else:
            neg_histogram[nth_bin] += 1
    accumulated_neg = 0
    satisfied_pair = 0
    for i in range(n_bins):
        satisfied_pair += (pos_histogram[i] * accumulated_neg + pos_histogram[i] * neg_histogram[i] * 0.5)
        accumulated_neg += neg_histogram[i]

    return satisfied_pair / float(total_case)


y = np.array([1, 0, 0, 0, 1, 0, 1, 0])
pred = np.array([0.9, 0.8, 0.3, 0.1, 0.4, 0.9, 0.66, 0.7])
print("----auc is :", auc_calculate(y, pred))
  10. AUROC (Area Under the Receiver Operating Characteristic curve)

Most of the time, "AUC" actually refers to AUROC. Strictly speaking this is poor practice: AUC by itself is ambiguous (it could be the area under any curve), whereas AUROC is unambiguous.

In all other respects it is the same as the AUC described above.

2. Summary of image segmentation indicators

  1. pixel accuracy (correctly classified pixels / total number of pixels)

For ease of explanation, assume there are $k+1$ classes in total (from $L_0$ to $L_k$, including an empty class or background). $p_{ij}$ denotes the number of pixels that belong to class $i$ but are predicted as class $j$. Thus $p_{ii}$ counts the true positives, while $p_{ij}$ and $p_{ji}$ are interpreted as false positives and false negatives respectively.

Its calculation formula is as follows:

$PA=\frac{\sum_{i=0}^{k} p_{ii}}{\sum_{i=0}^{k}\sum_{j=0}^{k} p_{ij}}$

There are $k+1$ classes in the image; $p_{ii}$ is the number of pixels of class $i$ predicted as class $i$ (the correctly classified pixels), and $p_{ij}$ is the number of pixels of class $i$ predicted as class $j$ (summed over all $i$ and $j$, this gives all pixels). The ratio therefore represents the proportion of correctly classified pixels among the total number of pixels.

As far as PA is concerned, the advantage is simplicity! The disadvantage: if a large area of the image is background and the target is small, predicting the entire image as background still yields a high PA score, so this indicator is not suitable for evaluating the segmentation of small targets.
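A minimal sketch of PA computed from a confusion matrix cm, where cm[i, j] is the number of pixels of true class i predicted as class j (the matrix below is made up for illustration):

# Sketch: pixel accuracy from a segmentation confusion matrix.
import numpy as np

cm = np.array([[50, 2, 1],
               [3, 30, 2],
               [0, 4, 8]])   # made-up 3-class (k + 1 = 3) confusion matrix

PA = np.diag(cm).sum() / cm.sum()   # correctly classified pixels / all pixels
print(PA)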

  2. MPA (Mean Pixel Accuracy)

Its calculation formula is as follows:

$MPA=\frac{1}{k+1}\sum_{i=0}^{k}\frac{p_{ii}}{\sum_{j=0}^{k} p_{ij}}$

Calculate the accuracy of each category and take the mean!

  3. MIoU (Mean Intersection over Union)

Computes the ratio of the intersection to the union of two sets; in semantic segmentation, these are the ground-truth and predicted pixels of each class:

$MIoU=\frac{1}{k+1} \sum_{i=0}^{k} \frac{p_{ii}}{\sum_{j=0}^{k} p_{ij}+\sum_{j=0}^{k} p_{ji}-p_{ii}}$

  4. FWIoU (Frequency Weighted Intersection over Union)

An improvement over MIoU: each class is weighted by its frequency of occurrence:

$FWIoU=\frac{1}{\sum_{i=0}^{k} \sum_{j=0}^{k} p_{ij}} \sum_{i=0}^{k} \frac{\left(\sum_{j=0}^{k} p_{ij}\right) p_{ii}}{\sum_{j=0}^{k} p_{ij}+\sum_{j=0}^{k} p_{ji}-p_{ii}}$
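A hedged sketch of MPA, MIoU and FWIoU computed from the same kind of confusion matrix as in the PA sketch above (toy numbers, illustrative only):

# Sketch: MPA, MIoU and FWIoU from a confusion matrix cm
# (cm[i, j] = pixels of true class i predicted as class j).
import numpy as np

cm = np.array([[50, 2, 1],
               [3, 30, 2],
               [0, 4, 8]], dtype=float)

per_class_acc = np.diag(cm) / cm.sum(axis=1)                  # p_ii / sum_j p_ij
MPA = per_class_acc.mean()

iou = np.diag(cm) / (cm.sum(axis=1) + cm.sum(axis=0) - np.diag(cm))
MIoU = iou.mean()

freq = cm.sum(axis=1) / cm.sum()                              # class frequency weights
FWIoU = (freq * iou).sum()

print(MPA, MIoU, FWIoU)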

3. Summary of target detection indicators

The following indicators are mainly used:

mAP (mean Average Precision): the mean of the AP over all classes
AP: the area under the PR curve (explained in detail below)
PR curve: the Precision-Recall curve
Precision: TP / (TP + FP)
Recall: TP / (TP + FN)
TP: the number of detection boxes with IoU > 0.5 (the same Ground Truth is counted only once)
FP: the number of detection boxes with IoU <= 0.5, or redundant detection boxes for the same GT
FN: the number of GTs that are not detected
IoU: the ratio of the intersection to the union of two boxes
NMS: Non-Maximum Suppression
AP calculation

To calculate AP, we first need to compute TP, FP, and FN.

For a single image, first traverse the ground-truth objects in the image and extract the GT objects of the category we want to evaluate. Then read in the detection boxes of that category produced by the detector (ignoring other categories) and discard those whose confidence score is below the confidence threshold (some setups do not use a confidence threshold at all). Sort the remaining detection boxes by confidence score from high to low and start with the highest-scoring box: compute its IoU with the GT bboxes. If the IoU is greater than the chosen IoU threshold, the box is counted as a TP and the matched GT bbox is marked as detected; subsequent redundant detections of the same GT are counted as FP. (This is exactly why the boxes are sorted by confidence first: the highest-scoring box is compared against the IoU threshold first, and if it passes, all later detection boxes of the same GT object are treated as FP.) If the IoU is below the threshold, the box is an FP. The total number of GTs of a given category in the image is fixed, so subtracting the number of TPs gives the number of FNs.
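A hedged sketch of this greedy matching for a single image and a single class (the iou helper, the [x1, y1, x2, y2] box format, and the name match_detections are my own illustrative assumptions, not a specific detector's evaluation code):

# Sketch: greedy TP/FP assignment for one image and one class.
def iou(a, b):
    # boxes are [x1, y1, x2, y2]
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def match_detections(dets, scores, gts, iou_thr=0.5):
    order = sorted(range(len(dets)), key=lambda i: -scores[i])  # high confidence first
    matched = [False] * len(gts)
    tp = fp = 0
    for i in order:
        best_iou, best_gt = 0.0, -1
        for g, gt in enumerate(gts):
            v = iou(dets[i], gt)
            if v > best_iou:
                best_iou, best_gt = v, g
        if best_iou > iou_thr and not matched[best_gt]:
            tp += 1
            matched[best_gt] = True      # later detections of this GT count as FP
        else:
            fp += 1
    fn = len(gts) - tp
    return tp, fp, fn

dets = [[10, 10, 60, 60], [12, 12, 58, 58]]   # two detections of the same object
scores = [0.9, 0.8]
gts = [[11, 11, 59, 59]]
print(match_detections(dets, scores, gts))    # -> (1, 1, 0): one TP, one redundant FP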

Once we have the TP, FP, and FN values, we can compute precision and recall, and from them the AP.

Before VOC 2010, you simply take the maximum Precision at Recall >= 0, 0.1, 0.2, ..., 1 (11 points); AP is then the average of these 11 Precision values.

For VOC 2010 and later, for every distinct Recall value (including 0 and 1), take the maximum Precision among the points whose Recall is greater than or equal to that value, and then compute the area under the resulting PR curve as the AP.

For the COCO data set, multiple IoU thresholds are used (0.5 to 0.95 with a step of 0.05). Under each IoU threshold there is an AP for a given class; the final AP is the average of the APs over the different IoU thresholds.
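For illustration, here are hedged sketches of the two VOC-style AP computations (the helper names ap_11_point and ap_all_points and the toy recall/precision arrays are assumptions, not the official evaluation code):

# Sketch: 11-point interpolated AP (pre-VOC2010) and all-point AP (VOC2010+),
# given recall/precision arrays from a ranked list of detections.
import numpy as np

def ap_11_point(recall, precision):
    recall, precision = np.asarray(recall), np.asarray(precision)
    ap = 0.0
    for t in np.arange(0.0, 1.1, 0.1):
        mask = recall >= t
        p = precision[mask].max() if mask.any() else 0.0   # max precision at recall >= t
        ap += p / 11.0
    return ap

def ap_all_points(recall, precision):
    # pad the curve, make precision monotonically decreasing, then integrate
    r = np.concatenate(([0.0], np.asarray(recall), [1.0]))
    p = np.concatenate(([0.0], np.asarray(precision), [0.0]))
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

rec = np.array([0.2, 0.4, 0.4, 0.8, 1.0])      # toy PR points
prec = np.array([1.0, 1.0, 0.67, 0.6, 0.5])
print(ap_11_point(rec, prec), ap_all_points(rec, prec))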
mAP calculation

As the name suggests, mAP is the average of the AP values over all classes.

4. Model efficiency measurement

FLOPs (floating point operations)

Assume that the convolution is implemented as a sliding window and that the nonlinear activation function consumes no computing resources. Then the FLOPs of a convolutional layer are:

$FLOPs = 2HW(C_{in}K^{2}+1)C_{out}$

where $H$, $W$ and $C_{in}$ are the height, width and number of channels of the input feature map, $K$ is the side length of the convolution kernel, and $C_{out}$ is the number of output channels. It is also assumed that the input and output have the same spatial size.

For a fully connected layer: $FLOPs = (2I-1)O$, where $I$ is the input dimension and $O$ is the output dimension.
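A minimal sketch of these two formulas as Python helpers (the layer sizes in the example calls are made up for illustration):

# Sketch: FLOPs of a conv layer (same input/output spatial size assumed) and of
# a fully connected layer, following the two formulas above.
def conv_flops(H, W, C_in, K, C_out):
    return 2 * H * W * (C_in * K ** 2 + 1) * C_out

def fc_flops(I, O):
    return (2 * I - 1) * O

print(conv_flops(56, 56, 64, 3, 128))   # e.g. a 3x3 conv, 64 -> 128 channels, 56x56 output
print(fc_flops(2048, 1000))             # e.g. a 2048 -> 1000 classifier head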


Source: blog.csdn.net/qq_52302919/article/details/131652253