Evaluation Metrics
Note that in multi-class problems, the metrics are computed per class: each class is treated as the positive class in turn, and every class gets its own metric values!
TP FP TN FN
Concept
- TP = prediction is positive and agrees with the ground truth (ground truth is positive)
- FP = prediction is positive but disagrees with the ground truth (ground truth is negative)
- TN = prediction is negative and agrees with the ground truth (ground truth is negative)
- FN = prediction is negative but disagrees with the ground truth (ground truth is positive)
Calculation process
Among all predicted boxes for the cat class, which boxes count as TP and which as FP?
- For the cat class, fix a score threshold
- Sort the predictions by score
- Predictions whose score exceeds the threshold are Positive
- For the positive cat predictions, fix an IoU threshold
- A prediction whose IoU with a cat ground-truth box exceeds the threshold, where that GT box has not yet been matched to another prediction, is marked True Positive (TP); the matched GT box is marked as used
- A prediction whose IoU with every cat ground-truth box is below the threshold, or whose matching GT box has already been used, is marked False Positive (FP)
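The matching steps above can be sketched in plain Python (a minimal single-class sketch; the function names, the `(score, box)` pairs, and the `(x1, y1, x2, y2)` box format are my own illustrative choices):

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def match_tp_fp(preds, gts, score_thresh=0.5, iou_thresh=0.5):
    """preds: list of (score, box) for one class; gts: list of GT boxes.
    Returns (score, is_tp) pairs for predictions above score_thresh."""
    # keep only positive predictions, sorted by descending score
    positives = sorted((p for p in preds if p[0] >= score_thresh),
                       key=lambda p: -p[0])
    matched = [False] * len(gts)   # which GT boxes are already used
    results = []
    for score, box in positives:
        # find the GT box with the highest IoU against this prediction
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            ov = iou(box, gt)
            if ov > best_iou:
                best_iou, best_j = ov, j
        if best_iou >= iou_thresh and not matched[best_j]:
            matched[best_j] = True
            results.append((score, True))   # TP: good match, GT unused
        else:
            results.append((score, False))  # FP: low IoU or duplicate match
    return results
```

A duplicate detection of an already-matched GT box becomes an FP, which is what penalizes detectors that output many overlapping boxes for one object.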
Accuracy Precision Recall
For a sample \(i\), let \(x_i\) be its true label and \(y_i\) the confidence output by the network.
Accuracy
\[ \frac{CorrectNum}{TotalNum}=\frac{TP+TN}{TotalNum} \]
Precision: of the samples the model judged positive, the fraction that are truly positive [judged positive → how many judged correctly]
\[ p(t)=P(x_i \in C\ |\ y_i \ge t ) = \frac{TP}{TP+FP}=\frac{TP}{all\ detections} \]
Recall: of the samples truly labeled positive, the fraction the model found [truly positive → how many were found]
\[ r(t)=P(y_i \ge t \ | \ x_i \in C) = \frac{TP}{TP+FN}=\frac{TP}{all\ groundtruth} \]
https://blog.csdn.net/asasasaababab/article/details/79994920 explains why accuracy cannot be used directly here:
- Mainly because of class imbalance: if negatives dominate and the model distinguishes most of them easily, accuracy is very high yet has no discriminative power and no practical meaning (the negatives are not what we care about)
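The formulas and the imbalance argument can be sketched directly from the TP/FP/TN/FN counts (plain Python; the function names are mine, not from the source):

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP): of everything judged positive, how much is correct."""
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    """Recall = TP / (TP + FN): of all real positives, how many were found."""
    return tp / (tp + fn) if tp + fn else 0.0

def accuracy(tp, tn, total):
    """Accuracy = (TP + TN) / total: fraction of all samples classified correctly."""
    return (tp + tn) / total

# Class-imbalance illustration: 990 negatives, 10 positives, and a model
# that simply predicts everything as negative.
print(accuracy(tp=0, tn=990, total=1000))  # 0.99 -- looks great
print(recall(tp=0, fn=10))                 # 0.0  -- yet it finds nothing
```

The degenerate model scores 99% accuracy but zero recall, which is exactly why precision/recall are preferred for the positive class we care about.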
Precision vs Accuracy
Precision is defined for a specific class; without specifying the class it is meaningless. In binary classification it defaults to the positive class (e.g. in object detection, the target category is the positive class).
Accuracy is the proportion of correctly classified samples among all samples of all classes.
Average Precision
\[ AP = \frac{1}{11}\sum_{r\in \{0, 0.1, 0.2, ..., 1\}} p_{interp}(r) \\ p_{interp}(r) = \max_{\hat r:\hat r \ge r} p(\hat r) \]
Explanation of the second equation: r takes 11 values from 0 to 1 in steps of 0.1. For a given r, collect the precisions of all points whose recall is greater than or equal to r, and return the maximum.
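The 11-point formula above can be sketched as follows (illustrative function name; `recalls`/`precisions` are parallel lists taken from the ranked predictions):

```python
def ap_11_point(recalls, precisions):
    """11-point interpolated AP: the mean over r in {0, 0.1, ..., 1.0} of
    the maximum precision among points whose recall is >= r."""
    total = 0.0
    for i in range(11):
        r = i / 10.0
        # p_interp(r): best precision achievable at recall >= r
        candidates = [p for rec, p in zip(recalls, precisions) if rec >= r]
        total += max(candidates) if candidates else 0.0
    return total / 11.0
```

When no point reaches recall r, the interpolated precision is taken as 0, matching the convention used in the PASCAL VOC evaluation.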
\(AP_{2D}\): AP computed on the 2D image plane is denoted \(AP_{2D}\)
3D detection additionally uses the following metrics:
\(AP_{BV}\): project the 3D detections to the bird's-eye view (BEV) and compute \(AP_{BV}\); this avoids the overlap that different objects may have when projected to 2D
\(AP_{3D}\): compute IoU directly between the 3D bbox and the ground-truth bbox; even so, it still cannot precisely measure the orientation of the bbox
PR curve
Precision-Recall Curve
Recall is the horizontal axis and Precision the vertical axis. As Recall increases, Precision tends to drop: to make the model find more of the positives (high recall), more objects must be detected, which may introduce many false positives (lowering precision).
Rank the predictions by score; each time one more sample is added to the ranked list, compute the current Precision and Recall. https://blog.csdn.net/asasasaababab/article/details/79994920 gives a worked image example.
How is the curve drawn? By interpolation.
Comparing only the curves is sometimes not enough to distinguish models, because different curves often intersect; a more intuitive comparison uses the area enclosed by the curve and the coordinate axes, which is a single number: AP.
Calculating Average Precision (AP)
There are two ways: 11-point interpolation, or computing the area under the curve (AUC, Area Under the Precision-Recall Curve)
- 11-point interpolation: take 11 recall points from 0 to 1 in steps of 0.1; at each point compute \(p_{interp}(r)\) as defined above
- Area calculation: smooth the curve into a right-angled staircase line, then approximate the area between the curve and the axes by summing the areas of the small rectangles
https://github.com/rafaelpadilla/Object-Detection-Metrics
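The area method can be sketched like this (illustrative names; `pr_curve` builds the running Precision/Recall from ranked `(score, is_tp)` pairs as described above, and `ap_area` sums the rectangles under the right-angled envelope):

```python
def pr_curve(results, num_gt):
    """results: (score, is_tp) pairs sorted by descending score.
    Returns parallel recall and precision lists, one point per prediction."""
    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in results:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)          # TP / all ground truth
        precisions.append(tp / (tp + fp))    # TP / all detections so far
    return recalls, precisions

def ap_area(recalls, precisions):
    """Area under the staircase-smoothed PR curve: each precision is
    replaced by the max precision at any recall to its right, then the
    rectangles between successive recall values are summed."""
    r = [0.0] + list(recalls)
    p = [0.0] + list(precisions)
    # envelope: sweep right-to-left so p[i] holds the max to its right
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    area = 0.0
    for i in range(1, len(r)):
        area += (r[i] - r[i - 1]) * p[i]
    return area
```

The right-to-left sweep is what turns the zigzag PR curve into the monotone staircase before the rectangle areas are summed.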
mAP (mean average precision) is the metric that measures recognition accuracy in object detection.
With multiple object categories to detect, each category yields a curve from its Precision and Recall values; AP is the area under that curve, and mAP is the average of the APs over all categories.
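Given per-class AP values (computed by either method above), mAP is just their unweighted mean; a one-line sketch with hypothetical class names:

```python
def mean_ap(ap_per_class):
    """mAP: the unweighted mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)

print(mean_ap({"cat": 0.5, "dog": 0.75}))  # 0.625
```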