Object Detection Evaluation Metrics: mAP, Precision, Recall, Accuracy

Evaluation metrics

Note that in multi-class problems these metrics are computed per class, not over all categories together; each class has its own metric value!

TP  FP  TN  FN

Concepts

TP = the prediction is positive and consistent with the ground truth; FP = the prediction is positive but inconsistent with the ground truth.

TN = the prediction is negative and consistent with the ground truth; FN = the prediction is negative but inconsistent with the ground truth (the ground truth is positive).

Calculation process

Among all predicted boxes of the cat class, which boxes count as TP and which as FP?

The steps are as follows (a code sketch follows the list):

  1. For the predictions of the cat class, given a chosen score threshold:
  • Sort the predictions by score
  • Predictions whose score is greater than the threshold are marked Positive
  2. For the Positive predictions of the cat class, given a chosen IoU threshold:
  • If a prediction's IoU with a cat ground-truth (GT) box is greater than the threshold and that GT has not been matched by another prediction, mark the prediction True Positive (TP) and mark the matching GT as used
  • If a prediction's IoU with every cat GT box is below the threshold, or the matching GT has already been used, mark the prediction False Positive (FP)
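
A minimal sketch of this matching step, assuming boxes are given as [x1, y1, x2, y2] lists; the function names (iou, match_predictions) and the default thresholds are illustrative, not taken from any particular library:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_predictions(preds, scores, gts, score_thresh=0.5, iou_thresh=0.5):
    """Greedy TP/FP assignment for one class on one image.

    preds: predicted boxes, scores: their confidences,
    gts: ground-truth boxes of the same class.
    Returns a list of "TP"/"FP" flags for the Positive predictions,
    ordered by descending score.
    """
    # Keep only predictions above the score threshold, sorted by score.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_thresh),
        key=lambda i: scores[i], reverse=True,
    )
    gt_matched = [False] * len(gts)
    flags = []
    for i in order:
        # Find the best-overlapping, not-yet-matched ground-truth box.
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            ov = iou(preds[i], gt)
            if ov > best_iou and not gt_matched[j]:
                best_iou, best_j = ov, j
        if best_iou >= iou_thresh:
            flags.append("TP")
            gt_matched[best_j] = True   # mark the GT as used
        else:
            flags.append("FP")
    return flags
```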

Accuracy  Precision  Recall

Let sample \(i\) have true label \(x_i\) and let the network output confidence be \(y_i\).

Accuracy

\[ \frac{CorrectNum}{TotalNum}=\frac{TP+TN}{TotalNum} \]

Precision: how many of the positive judgments are correct [among the samples the model judged as positive, how many are truly positive]

\[ p(t)=P(x_i \in C\ |\ y_i \ge t ) = \frac{TP}{TP+FP}=\frac{TP}{all\ detections} \]

Recall: how many of the positive samples are found [among the samples whose true label is positive, how many are found]

\[ r(t)=P(y_i \ge t \ | \ x_i \in C) = \frac{TP}{TP+FN}=\frac{TP}{all\ groundtruth} \]
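
A small sketch of the three formulas for a single threshold \(t\), assuming labels[i] is True when sample \(i\) belongs to class \(C\) and confidences[i] is the network output \(y_i\) (the function name is made up for illustration):

```python
def precision_recall_accuracy(labels, confidences, t):
    """Compute accuracy, p(t) and r(t) for a single threshold t.

    labels[i] is True if sample i belongs to class C (x_i in C),
    confidences[i] is the network confidence y_i.
    """
    tp = sum(1 for x, y in zip(labels, confidences) if y >= t and x)
    fp = sum(1 for x, y in zip(labels, confidences) if y >= t and not x)
    fn = sum(1 for x, y in zip(labels, confidences) if y < t and x)
    tn = sum(1 for x, y in zip(labels, confidences) if y < t and not x)

    accuracy = (tp + tn) / len(labels)                  # (TP+TN)/TotalNum
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0  # TP/(TP+FP)
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0     # TP/(TP+FN)
    return accuracy, precision, recall
```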

https://blog.csdn.net/asasasaababab/article/details/79994920 explains why accuracy cannot be used directly here:

  • Mainly because of class imbalance: if most samples are negative and most of them are easy for the model to distinguish, accuracy will be very high but has no discriminating power and no practical significance (the negatives are not what we are interested in)

Precision vs Accuracy

Precision is defined for a specific class; precision without a specified class is meaningless. In binary classification it defaults to the precision of the positive class (e.g., in object detection the detected category is the positive class).

Accuracy is the proportion of correctly classified samples among samples of all classes.

Average Precision

Average Precision (AP) is defined as:

\[ AP = \frac{1}{11}\sum_{r\in \{0, 0.1, 0.2, ..., 1\}} p_{interp}(r) \\ p_{interp}(r) = \max_{\hat r:\hat r \ge r} p(\hat r) \]

  • Explanation of the second equation: r takes values from 0 to 1 in steps of 0.1, 11 values in total. For each r, \(p_{interp}(r)\) is the maximum precision over all recall values \(\hat r \ge r\)

  • \(AP_{2D}\): the AP computed on the 2D image plane is denoted \(AP_{2D}\)

3D detection additionally uses the following metrics:

  • \(AP_{BV}\): project the 3D detections into the bird's-eye view (BEV) and compute \(AP_{BV}\); this avoids the overlap that different objects may have when projected to 2D

  • \(AP_{3D}\): compute the IoU between the predicted 3D bbox and the ground-truth 3D bbox directly; this still cannot accurately measure the precise orientation of the bbox

PR curve

Precision-Recall Curve

  • Recall is on the horizontal axis and Precision on the vertical axis. As Recall increases, Precision tends to drop: to make the model find more objects (higher recall) it may need to output more detections, which can introduce many false positives and lower precision

  • Sort the detections by score, then recompute the current precision and recall after each additional detection in the ranked list (a code sketch follows this list); an example with images is given at https://blog.csdn.net/asasasaababab/article/details/79994920

  • How is the curve drawn? By interpolation

  • A limitation of comparing only the curves is that it is sometimes hard to tell different models apart, because the curves often intersect; a more intuitive comparison is the area enclosed by the curve and the coordinate axes, a single number, which is exactly the AP
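
A sketch of how the curve's points can be obtained from the ranked detections, assuming the TP/FP flags come from a matching step like the one sketched earlier and total_gt is the number of ground-truth boxes of the class:

```python
def pr_curve(flags, total_gt):
    """Cumulative precision/recall along the ranked detection list.

    flags: list of "TP"/"FP" for detections sorted by descending score,
    total_gt: number of ground-truth boxes of this class.
    Returns two lists (recalls, precisions), one point per detection.
    """
    tp_cum, fp_cum = 0, 0
    recalls, precisions = [], []
    for flag in flags:
        if flag == "TP":
            tp_cum += 1
        else:
            fp_cum += 1
        precisions.append(tp_cum / (tp_cum + fp_cum))  # TP / all detections so far
        recalls.append(tp_cum / total_gt)              # TP / all ground truth
    return recalls, precisions
```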

Calculating Average Precision (AP)

There are two ways: 11-point interpolation, or computing the area under the curve (AUC, Area Under the Precision-Recall Curve); both are sketched in code after the list below.

  • 11-point interpolation: take 11 recall points from 0 to 1 in steps of 0.1, and compute \(p_{interp}(r)\) at each point as in the formula above
  • Area calculation: smooth the curve into a right-angled step curve, then approximate the area between the curve and the axis by summing the areas of the small rectangles
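
Both ways can be sketched from the (recall, precision) points computed above; the helper names are illustrative only:

```python
def ap_11_point(recalls, precisions):
    """11-point interpolated AP: average of p_interp(r) at r = 0, 0.1, ..., 1."""
    ap = 0.0
    for r in [i / 10 for i in range(11)]:
        # p_interp(r) = max precision over all points with recall >= r
        ps = [p for rec, p in zip(recalls, precisions) if rec >= r]
        ap += max(ps) if ps else 0.0
    return ap / 11

def ap_area(recalls, precisions):
    """Area under the PR curve after smoothing it into a right-angled step curve."""
    rec = [0.0] + list(recalls)
    prec = [0.0] + list(precisions)
    # Make precision monotonically non-increasing from right to left (smoothing).
    for i in range(len(prec) - 2, -1, -1):
        prec[i] = max(prec[i], prec[i + 1])
    # Sum the rectangle areas where recall changes.
    return sum((rec[i + 1] - rec[i]) * prec[i + 1] for i in range(len(rec) - 1))
```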

https://github.com/rafaelpadilla/Object-Detection-Metrics

The metric used to measure recognition accuracy in object detection is mAP (mean Average Precision).

Detection involves multiple object categories; for each category a Precision-Recall curve can be drawn, its AP is the area under that curve, and mAP is the average of the APs over all categories.
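
A final sketch, assuming ap_per_class is a hypothetical dict mapping each category name to its AP (computed, e.g., with one of the functions above):

```python
def mean_average_precision(ap_per_class):
    """mAP: the mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)

# Example (hypothetical values):
# mAP = mean_average_precision({"cat": 0.72, "dog": 0.65, "car": 0.81})
```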
