What is the meaning of mAP in object detection?

1. mAP definition and related concepts

mAP: mean Average Precision, i.e., the average of the AP over all categories

AP: Average Precision, the area under the PR curve (explained in detail later)

PR curve: Precision-Recall curve

Precision: TP/(TP + FP)

Recall: TP/(TP + FN) = TP / all ground truths

TP: the number of detection boxes with IoU > 0.5 (the same Ground Truth is counted only once)

FP: the number of detection boxes with IoU <= 0.5, plus redundant detection boxes that detect the same GT

FN: the number of GTs that are not detected
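
As a quick sanity check of these definitions, here is a minimal Python sketch (the counts are made up for illustration) that turns TP/FP/FN counts into precision and recall:

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision = TP / (TP + FP); Recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall

# Hypothetical counts: 8 correct detections, 2 spurious ones, 4 missed GTs.
p, r = precision_recall(tp=8, fp=2, fn=4)
print(f"precision={p:.3f}, recall={r:.3f}")  # precision=0.800, recall=0.667
```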

Intersection Over Union (IOU)

Intersection over Union (IoU) measures the degree of overlap between two boxes (in object detection, between a predicted box and a ground-truth box). The formula is as follows:

IoU = area(B_p ∩ B_gt) / area(B_p ∪ B_gt)

B_gt represents the ground-truth box (Ground Truth, GT) of the target, and B_p represents the predicted box. By computing the IoU of the two, we can judge whether the predicted box is a valid detection. IoU is illustrated in the following figure:

(Figure: IoU illustrated as the overlap area of the predicted box and the ground-truth box divided by the area of their union.)
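
A minimal sketch of the IoU computation, assuming axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates:

```python
def iou(box_p, box_gt):
    """IoU of two axis-aligned boxes, each given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_p[0], box_gt[0])
    iy1 = max(box_p[1], box_gt[1])
    ix2 = min(box_p[2], box_gt[2])
    iy2 = min(box_p[3], box_gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_gt = (box_gt[2] - box_gt[0]) * (box_gt[3] - box_gt[1])
    union = area_p + area_gt - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1 / 7 ≈ 0.143
```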

2. The meaning of precision and recall

Precision is the proportion of the samples you predicted as positive that are really positive; recall is the proportion of the real positive samples that you managed to find.

The core question: why do we need a threshold on the score? For example, suppose a bounding box is identified as a duck with the highest score among all classes, but that score is only 0.1. Is it really a duck? Most likely it is still a negative sample. So we need a threshold: if a box is identified as a duck and its score is greater than the threshold, we say it is a positive sample; otherwise it is a negative sample.

So how does the threshold affect precision and recall? Let's stick with the duck example.

If the threshold is too high, the prediction is very strict, so the boxes we call ducks are almost all really ducks, and precision is high; but because the screening is so strict, we also let go of some real ducks with lower scores, so recall is low.

If the threshold is too low, almost everything is treated as a duck: precision is low and recall is high.
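
This trade-off is easy to see in a small sketch (all scores, labels, and the GT count below are made up for illustration; each detection is a (confidence, is-it-really-a-duck) pair):

```python
# Hypothetical detections: (confidence, whether the box really is a duck).
dets = [(0.95, True), (0.90, True), (0.80, True), (0.70, False),
        (0.60, True), (0.40, False), (0.30, True), (0.10, False)]
num_gt = 6  # hypothetical number of real ducks (all ground truths)

for thr in (0.2, 0.5, 0.8):
    kept = [is_duck for conf, is_duck in dets if conf >= thr]
    tp = sum(kept)
    precision = tp / len(kept) if kept else 0.0
    recall = tp / num_gt
    print(f"threshold={thr}: precision={precision:.2f}, recall={recall:.2f}")
# threshold=0.2: precision=0.71, recall=0.83
# threshold=0.5: precision=0.80, recall=0.67
# threshold=0.8: precision=1.00, recall=0.50
```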

3. Specific calculation of mAP

From the definitions above, to compute mAP we must first draw the PR curve of each category and compute its AP.

Before VOC2010, AP was computed by taking, for each of the 11 recall points 0, 0.1, 0.2, …, 1, the maximum precision among all points with recall >= that value, and then averaging these 11 precision values.

In VOC2010 and later, for every distinct recall value r (including 0 and 1), the maximum precision among all points with recall >= r is taken, and AP is the area under this interpolated PR curve.
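
To make the two variants concrete, here is a minimal Python sketch of both (the function names are my own; the inputs are the per-point recall and precision values obtained after sorting detections by confidence, as in the worked example below):

```python
import numpy as np

def ap_11point(recall, precision):
    """Pre-VOC2010 AP: mean of the max precision at recall >= 0, 0.1, ..., 1."""
    recall, precision = np.asarray(recall), np.asarray(precision)
    ap = 0.0
    for t in np.linspace(0.0, 1.0, 11):
        mask = recall >= t
        ap += (precision[mask].max() if mask.any() else 0.0) / 11.0
    return ap

def ap_all_points(recall, precision):
    """VOC2010+ AP: area under the monotonically interpolated PR curve."""
    r = np.concatenate(([0.0], np.asarray(recall), [1.0]))
    p = np.concatenate(([0.0], np.asarray(precision), [0.0]))
    # Interpolation: make precision non-increasing from right to left.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangles over the segments where recall changes.
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))
```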

mAP calculation example

Let's use an example to illustrate the calculation of AP and mAP

First, recall the two formulas, one for precision and one for recall. They are the same as above, but expanded into another form, where *all detections* denotes the number of all predicted boxes and *all ground truths* denotes the number of all GT boxes.

Precision = TP / (TP + FP) = TP / all detections

Recall = TP / (TP + FN) = TP / all ground truths

AP is the area under the PR curve of a single category, and mAP is the average of this area over all categories.

Suppose we have 7 images (Image1-Image7) containing 15 targets (green boxes; this is the number of GTs, the *all ground truths* mentioned above) and 24 predicted boxes (red boxes, labeled with the letters A-Y, each with a confidence value).

(Figure: the 7 images, with the 15 ground-truth boxes drawn in green and the 24 predicted boxes drawn in red, each red box labeled with a letter and its confidence value.)

From the figure and description above we can build the following table, where Images is the image number, Detections is the label of the predicted box, Confidences is its confidence, and TP or FP records whether the predicted box is marked as TP or FP. Here a predicted box whose IoU with a GT is greater than or equal to 0.3 is marked as TP; if a GT has multiple predicted boxes, only the one with the largest IoU (provided it is at least 0.3) is counted as TP and the others are marked as FP. In other words, each GT can have at most one predicted box labeled TP. The threshold 0.3 here is an arbitrary choice.

(Table: each predicted box with its image number, label, confidence, and its TP or FP mark.)
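
A minimal sketch of this marking rule (one common greedy variant: predictions are processed in descending confidence, and each GT can be claimed at most once; `iou()` is the function sketched earlier, and `preds`/`gts` are assumed to belong to the same image and class):

```python
def mark_tp_fp(preds, gts, iou_thr=0.3):
    """Mark each prediction TP or FP; each GT can yield at most one TP.
    preds: list of (confidence, box) tuples; gts: list of GT boxes."""
    preds = sorted(preds, key=lambda p: p[0], reverse=True)  # high confidence first
    matched = [False] * len(gts)
    flags = []
    for conf, box in preds:
        # Find the GT with the largest IoU against this prediction.
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_iou >= iou_thr and not matched[best_j]:
            matched[best_j] = True       # this GT is now claimed
            flags.append((conf, "TP"))
        else:
            flags.append((conf, "FP"))   # too little overlap, or GT already claimed
    return flags
```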

From the table we can draw the PR curve (since AP is the area under the PR curve). But first we need the coordinates of each point on the curve: sort all the predicted boxes by confidence from high to low, then compute the precision and recall values cumulatively, as in the table below. (Keep in mind the notion of accumulation: the ACC TP and ACC FP columns below are the accumulated TP and FP counts.)

(Table: the predicted boxes sorted by confidence, with accumulated TP (ACC TP), accumulated FP (ACC FP), precision, and recall for each row.)

  • Row 1: Precision = TP/(TP+FP) = 1/(1+0) = 1; Recall = TP/(TP+FN) = TP / all ground truths = 1/15 = 0.0666 (all ground truths was defined above)
  • Row 2: Precision = TP/(TP+FP) = 1/(1+1) = 0.5; Recall = TP/(TP+FN) = TP / all ground truths = 1/15 = 0.0666
  • Row 3: Precision = TP/(TP+FP) = 2/(2+1) = 0.6666; Recall = TP/(TP+FN) = TP / all ground truths = 2/15 = 0.1333
  • And so on for the remaining rows (see the sketch below)
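
A minimal sketch of this cumulative computation (the `pr_points` helper is my own; the flags encode rows 1-3 of the table above, i.e. TP, FP, TP, and num_gt is the 15 ground truths):

```python
def pr_points(flags, num_gt):
    """(recall, precision) after each row of confidence-sorted TP/FP flags."""
    acc_tp = acc_fp = 0
    points = []
    for is_tp in flags:           # flags are already sorted by confidence
        acc_tp += int(is_tp)      # ACC TP column
        acc_fp += int(not is_tp)  # ACC FP column
        precision = acc_tp / (acc_tp + acc_fp)
        recall = acc_tp / num_gt  # num_gt = all ground truths
        points.append((recall, precision))
    return points

for r, p in pr_points([True, False, True], num_gt=15):
    print(f"recall={r:.4f}, precision={p:.4f}")
# recall=0.0667, precision=1.0000
# recall=0.0667, precision=0.5000
# recall=0.1333, precision=0.6667
```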

Then you can draw the PR curve

(Figure: the PR curve, with recall on the x-axis and precision on the y-axis.)

Once the PR curve is obtained, AP (the area under it) can be computed. The area is generally computed by interpolation: take the interpolated (maximum) precision at each of the 11 recall points [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1] and average them.

(Figure: the PR curve with the interpolated precision marked at the 11 recall points.)

The resulting AP for this category is computed as follows:

(Figure: the AP obtained by averaging the 11 interpolated precision values.)

Source: blog.csdn.net/better_boy/article/details/109334234