YOLO performance metrics

Each entry below gives the term, its full name, and an explanation.

True: the inference is correct.

False: the inference is wrong; the conclusion reached after comparing with the ground truth (position, category).

Positive: the inference is a positive example: IoU > threshold and category probability > threshold.

Negative: the inference is a negative example; it does not meet the conditions for a positive example.

TP (True Positive): the inference is a positive example, and the inference is correct.

FP (False Positive): the inference is a positive example, but the inference is wrong. Why does this happen? Unlike a classification network, an object-detection network judges a result on two criteria (position, category). Even if a box is inferred to be a positive example, the result may still be incorrect.

For example: a box has IoU > threshold and category probability > threshold, so it is inferred as positive, but it actually contains no object, or the object it contains is not of the inferred category.

TN (True Negative): the inference is a negative example, and the inference is correct; the box contains no object (IoU < threshold, or every category probability < threshold).

FN (False Negative): the inference is a negative example, but the inference is wrong; the box does contain a target object, but it was not detected or was classified incorrectly.

P (Precision): the proportion of correct predictions among the boxes the model infers as positive. Precision = TP / (TP + FP)

R (Recall): among the real target objects, the proportion that are successfully detected. Recall = TP / (TP + FN)

Accuracy: the most intuitive of the three metrics; the proportion of all data that the model predicts correctly.
Accuracy = (TP + TN) / (TP + FP + TN + FN)

The difference between the accuracy of an object-detection network and that of a classification network: a classification result can be correct in only one way, so the numerator contains only a single term — because an image fed to a classification network is guaranteed to contain an object.
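The three formulas above can be sketched directly in Python; the TP/FP/TN/FN counts below are made up purely for illustration:

```python
def precision(tp, fp):
    """Proportion of predicted positives that are correct: TP / (TP + FP)."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Proportion of real objects that were detected: TP / (TP + FN)."""
    return tp / (tp + fn)

def accuracy(tp, tn, fp, fn):
    """Proportion of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)

# Hypothetical counts, for illustration only
tp, fp, tn, fn = 80, 20, 50, 10
print(precision(tp, fp))         # 0.8
print(recall(tp, fn))            # 80/90 ≈ 0.889
print(accuracy(tp, tn, fp, fn))  # 130/160 = 0.8125
```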

IOU (Intersection over Union): a metric measuring the degree of overlap between two regions. (In the original figure: blue box = ground truth; yellow box = prediction box.)
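A minimal sketch of the IoU computation for two axis-aligned boxes given as (x1, y1, x2, y2); the function name is my own:

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) format."""
    # Corners of the intersection rectangle
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # intersection 1, union 7 → 1/7 ≈ 0.1429
```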

NMS (Non-Maximum Suppression): obtains the optimal solution by filtering for local maxima — a localization algorithm.

An example of what the three Prediction branches do: take a 76x76 input image and divide it into cells of size 1x1, 2x2, and 4x4.

The smaller the cell, the smaller the objects it can locate. The reason: in a 4x4 cell, a small object's IoU is very likely to fall below the threshold and be filtered out:

      -> 1x1: split into a 76x76 grid, for locating small objects

      -> 2x2: split into a 38x38 grid, for locating medium objects

      -> 4x4: split into a 19x19 grid, for locating large objects

1. Divide the image into an NxN grid of anchor cells, then take 3 boxes for each cell, giving N*N*3 boxes in total (denoted boxes_0).

2. Compute the IoU between each box in boxes_0 and the ground truth, keeping the boxes above a certain threshold (denoted boxes_1).

3. For each ground truth, keep only the one box with the largest IoU and discard the rest (denoted boxes_2).

*** boxes_0 count: 19x19x3

*** boxes_1 count: the candidate boxes in boxes_0 whose IoU exceeds the threshold

*** boxes_2 count: the number of boxes left in boxes_1 after deduplication

[ How is the deduplication performed? NMS ]

Method: IOU_value is the IoU between a candidate box and the ground truth. Now compute the IoU between pairs of boxes within boxes_1 (denoted IOU_ab). Two candidate boxes whose IOU_ab exceeds the threshold are assumed to correspond to the same ground truth; between the two, keep the one with the larger IOU_value.

boxes_1 sorted by IoU value in descending order:

  box_id    0  1  2  3  4  5  6  7  8

  valid?    1  0  1  0  0  1  0  0  0

Suppose the situation is as shown above:

[Step 1] Start from box0 and compare box1 through box8 against it in turn.

box1/3/4/8 each have an IoU with box0 above the threshold, so they are considered the same box as box0 and marked invalid (box0 has the largest iou_value and is the best choice); the remaining boxes are considered different from box0.

[Step 2] Move to the next valid box, box2.

box6/7 have an IoU with box2 above the threshold and are marked invalid.

[Step 3] Move to the next valid box, box5.

There are no valid boxes after it, so the procedure ends.

[Result]

All boxes on the image are obtained, each corresponding to a different target object; object classification is performed afterwards.

The prediction result contains: bounding_box(x1, y1, x2, y2) and the first of the two confidences, Pc (the probability that this is a valid box, i.e. IOU_value).
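The marking procedure above can be sketched as follows. This is a minimal pure-Python NMS, assuming boxes in (x1, y1, x2, y2) format, a score per box playing the role of IOU_value, and an illustrative 0.5 overlap threshold:

```python
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, mirroring the step1/step2/step3 walk."""
    # Visit boxes in descending score order (the "sorted by IoU value" list)
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    valid = [True] * len(boxes)
    keep = []
    for pos, i in enumerate(order):
        if not valid[i]:
            continue
        keep.append(i)  # best remaining box for this object
        for j in order[pos + 1:]:
            if valid[j] and iou(boxes[i], boxes[j]) > iou_threshold:
                valid[j] = False  # same object as box i, lower score: mark invalid
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]: box 1 overlaps box 0 heavily and is suppressed
```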

AP (Average Precision): average precision, computed per class.

Here we obtain the second of the two confidences: confidence (the probability that the box belongs to a given class).

-> Criteria for a correct prediction:

1. Pc > iou_threshold;

2. confidence > p_threshold

The "average" in AP refers to the average of the results over different values of p_threshold.

-> The data obtained after YOLO's NMS is as follows. Taking a 19x19x255 output as an example, i.e. 19x19x3x85: 19x19x3 is the number of boxes, and each box has 85 output elements.

See "AP calculation" below for the detailed procedure.

mAP (Mean Average Precision): compute the AP for every class, then take the mean.

mAP50: mAP at IoU = 0.5.

mAP (IoU=0.5:0.95): take IoU thresholds from 0.5 to 0.95 in steps of 0.05, compute the mAP at each threshold, then average the results.
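The IoU sweep behind mAP@[0.5:0.95] is just ten thresholds in steps of 0.05. A minimal sketch, where `map_at_iou` is a hypothetical callable mapping an IoU threshold to the mAP at that threshold:

```python
# The ten IoU thresholds used by mAP@[0.5:0.95]
thresholds = [0.5 + 0.05 * i for i in range(10)]
print([round(t, 2) for t in thresholds])
# [0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95]

def map_50_95(map_at_iou):
    """Average a per-threshold mAP function over the ten thresholds.

    `map_at_iou` is a hypothetical callable: iou_threshold -> mAP value.
    """
    return sum(map_at_iou(t) for t in thresholds) / len(thresholds)
```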

AP calculation

| Rank    | Box ID | Pc        | x1 | y1 | x2 | y2 | class-1 prob | class-2 prob | … | class-80 prob |
|---------|--------|-----------|----|----|----|----|--------------|--------------|---|---------------|
| 1       |        | IOU_value |    |    |    |    | P max        |              |   |               |
| 2       |        |           |    |    |    |    | P second_max |              |   |               |
| …       |        |           |    |    |    |    | …            |              |   |               |
| 19x19x3 |        |           |    |    |    |    | P min        |              |   |               |

Pc is the probability that the box is a valid box (its IOU_value); the rows are sorted from largest to smallest.

1. p_threshold takes values from large to small. How does the COCO library choose them?
    COCO iterates the rank from small to large, which corresponds to p_threshold decreasing step by step. Here p_threshold is not expressed as a specific value but through the rank — the number of boxes whose probability qualifies. For example, when rank = 2:

  • Boxes with rank > 2 are assumed to have P below the threshold and are all counted as negatives; P is the probability that the box belongs to a given class.
  • Boxes with rank <= 2 have P above the threshold; for these, check further whether their IoU exceeds iou_threshold and whether they are the best candidate box.
    • What counts as the "best candidate box"? If duplicate boxes exist among the top rank boxes, the one with the largest IoU is taken as the positive and the rest as negatives (implemented via NMS deduplication).
    • Boxes with IoU below the threshold are negatives.

2. Compute precision and recall, and plot the P-R curve.

From the positives and negatives obtained in the previous step, compute precision and recall and plot the P-R curve.

AP = the average of the precision values on the smoothed P-R curve at the 11 recall points 0, 0.1, 0.2, …, 1.0.

AP = (1 + 1 + 1 + 0.5 + 0.5 + 0.5 + 0.5 + 0.5 + 0 + 0 + 0) / 11 = 0.5
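The 11-point average can be reproduced directly; `ap_11_point` is my own name for the helper, and the precision values are the ones from the example above:

```python
def ap_11_point(precision_at_recall):
    """11-point interpolated AP: average precision at recall = 0, 0.1, ..., 1.0."""
    recalls = [i / 10 for i in range(11)]
    return sum(precision_at_recall(r) for r in recalls) / 11

# Smoothed precision values from the example, one per recall point
example = dict(zip([i / 10 for i in range(11)],
                   [1, 1, 1, 0.5, 0.5, 0.5, 0.5, 0.5, 0, 0, 0]))
print(ap_11_point(lambda r: example[r]))  # (3 + 2.5 + 0) / 11 = 0.5
```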

Note: this is a personal summary written while studying; some images were borrowed from other bloggers whose links I have lost. If anything is incorrect or debatable, please feel free to point it out.


Origin blog.csdn.net/zmj1582188592/article/details/127449521