[Target detection] YOLOv5: Add missed detection rate and false detection rate output

Foreword

In object detection, the metric used to judge a model is usually mAP. In actual engineering, however, the missed detection rate and the false detection rate often matter more. The original YOLOv5 code does not output these two metrics, so I use its existing confusion matrix to compute them.

Indicator Interpretation

A missed detection means a target is present but not detected: a real target is classified as background.
A false detection (false alarm) means a target is reported where none exists: background is classified as a target.
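
Stated as formulas (exactly which totals belong in the denominators is a design decision, revisited later in this post):

missed detection rate = (number of ground-truth boxes detected as background) / (number of ground-truth boxes)
false detection rate = (number of predicted boxes matching no ground-truth box) / (total count)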

First look at the confusion matrix originally output by YOLOv5. The gray-shaded area in the figure holds the original output categories, i.e., the positive classes; the last row and last column are the background class.
The rows are the model's predictions and the columns are the ground-truth labels. The values in the last row are missed detections (a real target predicted as background), and the values in the last column are false detections (background predicted as a target).

[Figure: YOLOv5 confusion matrix]
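
As a toy illustration (made-up numbers, not from this post's experiments), here is a 2-class matrix in that layout and how the two quantities read off it:

import numpy as np

# Hypothetical 2-class confusion matrix: rows = predicted, columns = true,
# last row/column = background (illustrative counts only)
m = np.array([
    [50.,  2.,  4.],   # predicted class 0: 50 correct, 2 misclassified, 4 false detections
    [ 3., 60.,  1.],   # predicted class 1: 3 misclassified, 60 correct, 1 false detection
    [ 5.,  6.,  0.],   # predicted background: 5 + 6 ground-truth boxes were missed
])
print(m[-1, :].sum())  # missed detections: 11.0
print(m[:, -1].sum())  # false detections: 5.0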

Code Improvement

Now look at the part of the YOLOv5 code that builds the confusion matrix; it lives in metrics.py, in the ConfusionMatrix class.

class ConfusionMatrix:
    # Updated version of https://github.com/kaanakan/object_detection_confusion_matrix
    def __init__(self, nc, conf=0.25, iou_thres=0.45):
        """
        params nc: 数据集类别个数
        params conf: 预测框置信度阈值
        Params iou_thres: iou阈值
        """
        self.matrix = np.zeros((nc + 1, nc + 1))  # +1的目的是添加背景类
        self.nc = nc  # number of classes
        self.conf = conf
        self.iou_thres = iou_thres
        self.lou = 0
        self.total = 0
        self.xu = 0

    def process_batch(self, detections, labels):
        """
        Return intersection-over-union (Jaccard index) of boxes.
        Both sets of boxes are expected to be in (x1, y1, x2, y2) format.
        Arguments:
            detections (Array[N, 6]), x1, y1, x2, y2, conf, class
            labels (Array[M, 5]), class, x1, y1, x2, y2
        Returns:
            None, updates confusion matrix accordingly
        """
        detections = detections[detections[:, 4] > self.conf]  # drop low-confidence predictions (similar to NMS filtering)
        gt_classes = labels[:, 0].int()
        detection_classes = detections[:, 5].int()
        iou = general.box_iou(labels[:, 1:], detections[:, :4])

        x = torch.where(iou > self.iou_thres)
        if x[0].shape[0]:
            matches = torch.cat((torch.stack(x, 1), iou[x[0], x[1]][:, None]), 1).cpu().numpy()
            if x[0].shape[0] > 1:
                matches = matches[matches[:, 2].argsort()[::-1]]
                matches = matches[np.unique(matches[:, 1], return_index=True)[1]]
                matches = matches[matches[:, 2].argsort()[::-1]]
                matches = matches[np.unique(matches[:, 0], return_index=True)[1]]
        else:
            matches = np.zeros((0, 3))

        n = matches.shape[0] > 0
        m0, m1, _ = matches.transpose().astype(np.int16)
        for i, gc in enumerate(gt_classes):
            j = m0 == i
            if n and sum(j) == 1:
                #  sum(j) == 1: the ground-truth box gt[i] was matched by exactly one prediction
                self.matrix[detection_classes[m1[j]], gc] += 1  # correct (row = predicted class, column = true class)
            else:
                #  sum(j) == 0: gt[i] was matched by no prediction, i.e. this real target was detected as background
                self.matrix[self.nc, gc] += 1  # background FP (missed detection)

        if n:
            for i, dc in enumerate(detection_classes):
                if not any(m1 == i):
                    self.matrix[dc, self.nc] += 1  # background FN (false detection)

        self.lou = sum(self.matrix[-1, :])  # missed detections: sum of the last row
        self.total = sum(sum(self.matrix))  # sum of all matrix entries
        self.xu = sum(self.matrix[:, -1])   # false detections: sum of the last column

    def matrix(self):
        return self.matrix

    def plot(self, save_dir='', names=()):
        try:
            import seaborn as sn
            # normalize each column of the matrix
            array = self.matrix / (self.matrix.sum(0).reshape(1, self.nc + 1) + 1E-6)  # normalize
            array[array < 0.005] = np.nan  # don't annotate (would appear as 0.00)

            fig = plt.figure(figsize=(12, 9), tight_layout=True)
            sn.set(font_scale=1.0 if self.nc < 50 else 0.8)  # for label size
            labels = (0 < len(names) < 99) and len(names) == self.nc  # apply names to ticklabels
            sn.heatmap(array, annot=self.nc < 30, annot_kws={"size": 8}, cmap='Blues', fmt='.2f', square=True,
                       xticklabels=names + ['background FP'] if labels else "auto",
                       yticklabels=names + ['background FN'] if labels else "auto").set_facecolor((1, 1, 1))
            fig.axes[0].set_xlabel('True')
            fig.axes[0].set_ylabel('Predicted')
            fig.savefig(Path(save_dir) / 'confusion_matrix.png', dpi=250)
        except Exception as e:
            pass

    def print(self):
        for i in range(self.nc + 1):
            print(' '.join(map(str, self.matrix[i])))

Reading the code, we can see that each column of the confusion matrix is normalized separately when it is plotted; before plotting, the matrix holds the raw counts for each (predicted, true) pair.

So I added three attributes, self.lou, self.total, and self.xu, to count the number of missed targets, the total number of targets, and the number of false targets, respectively.

The number of missed targets is the sum of the last row of the confusion matrix; the number of false targets is the sum of the last column; and the total number of targets is the sum of all entries in the matrix.
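
Before wiring this into test.py, a quick synthetic check of the new counters (a minimal sketch; it assumes the ConfusionMatrix class above and its general.box_iou dependency are importable, as in the YOLOv5 repo):

import torch

cm = ConfusionMatrix(nc=2)
labels = torch.tensor([[0., 10., 10., 50., 50.],      # class 0: matched by the first prediction below
                       [1., 60., 60., 90., 90.]])     # class 1: overlaps no prediction -> missed
detections = torch.tensor([[11., 11., 49., 49., 0.9, 0.],       # IoU ≈ 0.90 with the first label
                           [200., 200., 240., 240., 0.8, 1.]])  # overlaps no label -> false detection
cm.process_batch(detections, labels)
print(int(cm.lou), int(cm.xu), int(cm.total))  # expected: 1 1 3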

Then add the following in test.py:

    # Print speeds
    t = tuple(x / seen * 1E3 for x in (t0, t1, t0 + t1)) + (imgsz, imgsz, batch_size)  # tuple
    if not training:
        print('Speed: %.1f/%.1f/%.1f ms inference/NMS/total per %gx%g image at batch-size %g' % t)

    # Compute the missed detection rate
    print("Number of missed detections:")
    print(int(confusion_matrix.lou))
    print("Missed detection rate:")
    print(confusion_matrix.lou / confusion_matrix.total)
    # Compute the false detection rate
    print("Number of false detections:")
    print(int(confusion_matrix.xu))
    print("False detection rate:")
    print(confusion_matrix.xu / confusion_matrix.total)

    # Plots
    if plots:
        confusion_matrix.plot(save_dir=save_dir, names=list(names.values()))
        if wandb_logger and wandb_logger.wandb:
            val_batches = [wandb_logger.wandb.Image(str(f), caption=f.name) for f in sorted(save_dir.glob('test*.jpg'))]
            wandb_logger.log({"Validation": val_batches})
    if wandb_images:
        wandb_logger.log({"Bounding Box Debugger/Images": wandb_images})

Example output:

[Figure: console output showing the missed/false detection counts and rates]


2022.8.8 Update

Bug fixes

It suddenly occurred to me that the earlier code has a bug: the missed detection rate should not be computed over the entire confusion matrix, but only over the positive (labeled) samples; otherwise the denominator also includes false detections, making the result too small.

For an intuitive picture, look at the confusion matrix visualization: the denominator should cover only what is inside the red box.
[Figure: confusion matrix visualization with a red box marking the entries that belong in the denominator]
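
A quick numeric example of why the denominator matters (made-up counts):

missed, gt_total, false_dets = 10, 100, 30   # hypothetical counts
print(missed / (gt_total + false_dets))      # ≈ 0.077: old denominator, diluted by false detections
print(missed / gt_total)                     # 0.100: class_total, the intended missed detection rate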

Revised metrics.py:

class ConfusionMatrix:
    # Updated version of https://github.com/kaanakan/object_detection_confusion_matrix
    def __init__(self, nc, conf=0.25, iou_thres=0.45):
        """
        params nc: 数据集类别个数
        params conf: 预测框置信度阈值
        Params iou_thres: iou阈值
        """
        self.matrix = np.zeros((nc + 1, nc + 1))  # +1的目的是添加背景类
        self.nc = nc  # number of classes
        self.conf = conf
        self.iou_thres = iou_thres
        self.lou = 0
        self.total = 0
        self.xu = 0
        self.class_total = 0

    def process_batch(self, detections, labels):
        """
        Return intersection-over-union (Jaccard index) of boxes.
        Both sets of boxes are expected to be in (x1, y1, x2, y2) format.
        Arguments:
            detections (Array[N, 6]), x1, y1, x2, y2, conf, class
            labels (Array[M, 5]), class, x1, y1, x2, y2
        Returns:
            None, updates confusion matrix accordingly
        """
        detections = detections[detections[:, 4] > self.conf]  # drop low-confidence predictions (similar to NMS filtering)
        gt_classes = labels[:, 0].int()
        detection_classes = detections[:, 5].int()
        iou = general.box_iou(labels[:, 1:], detections[:, :4])

        x = torch.where(iou > self.iou_thres)
        if x[0].shape[0]:
            matches = torch.cat((torch.stack(x, 1), iou[x[0], x[1]][:, None]), 1).cpu().numpy()
            if x[0].shape[0] > 1:
                matches = matches[matches[:, 2].argsort()[::-1]]
                matches = matches[np.unique(matches[:, 1], return_index=True)[1]]
                matches = matches[matches[:, 2].argsort()[::-1]]
                matches = matches[np.unique(matches[:, 0], return_index=True)[1]]
        else:
            matches = np.zeros((0, 3))

        n = matches.shape[0] > 0
        m0, m1, _ = matches.transpose().astype(np.int16)
        for i, gc in enumerate(gt_classes):
            j = m0 == i
            if n and sum(j) == 1:
                #  sum(j) == 1: the ground-truth box gt[i] was matched by exactly one prediction
                self.matrix[detection_classes[m1[j]], gc] += 1  # correct (row = predicted class, column = true class)
            else:
                #  sum(j) == 0: gt[i] was matched by no prediction, i.e. this real target was detected as background
                self.matrix[self.nc, gc] += 1  # background FP (missed detection)

        if n:
            for i, dc in enumerate(detection_classes):
                if not any(m1 == i):
                    self.matrix[dc, self.nc] += 1  # background FN (false detection)

        self.lou = sum(self.matrix[-1, :])  # missed detections: sum of the last row
        self.total = sum(sum(self.matrix))  # sum of all matrix entries
        self.xu = sum(self.matrix[:, -1])   # false detections: sum of the last column
        self.class_total = sum(sum(self.matrix)[:-1])  # drop the false-detection column: counts only labeled ground-truth boxes


    def matrix(self):
        return self.matrix

    def plot(self, save_dir='', names=()):
        try:
            import seaborn as sn
            # normalize each column of the matrix
            array = self.matrix / (self.matrix.sum(0).reshape(1, self.nc + 1) + 1E-6)  # normalize
            array[array < 0.005] = np.nan  # don't annotate (would appear as 0.00)

            fig = plt.figure(figsize=(12, 9), tight_layout=True)
            sn.set(font='SimHei', font_scale=1.0 if self.nc < 50 else 0.8)  # for label size
            labels = (0 < len(names) < 99) and len(names) == self.nc  # apply names to ticklabels
            sn.heatmap(array, annot=self.nc < 30, annot_kws={"size": 8}, cmap='Blues', fmt='.2f', square=True,
                       xticklabels=names + ['background FP'] if labels else "auto",
                       yticklabels=names + ['background FN'] if labels else "auto").set_facecolor((1, 1, 1))
            fig.axes[0].set_xlabel('True')
            fig.axes[0].set_ylabel('Predicted')
            fig.savefig(Path(save_dir) / 'confusion_matrix.png', dpi=250)
        except Exception as e:
            pass

    def print(self):
        for i in range(self.nc + 1):
            print(' '.join(map(str, self.matrix[i])))

Revised test.py:

# Compute the missed detection rate
print("Number of missed detections:")
print(int(confusion_matrix.lou))
print("Missed detection rate:")
print(confusion_matrix.lou / confusion_matrix.class_total)
# Compute the false detection rate
print("Number of false detections:")
print(int(confusion_matrix.xu))
print("False detection rate:")
print(confusion_matrix.xu / confusion_matrix.total)

2022.8.10 Update

Extending to training

A question suddenly occurred to me: YOLOv5's own output metrics already include precision (P) and recall (R).
Some blog posts state that missed detection rate = 1 - recall. Does that hold for YOLOv5?
Recall the formula: R = TP / (TP + FN). In plain terms, recall measures what fraction of the real targets is found correctly.
Here TP (true positive) means a correctly predicted box: each box predicted by the model has its IoU computed against the image's label boxes, and it counts as a TP only if the maximum IoU exceeds the preset IoU threshold and the predicted class matches the class of the label box found by that IoU match.
In other words, the TP in the numerator of YOLOv5's recall covers only the diagonal of the confusion matrix; a box that is detected but misclassified still counts as an FN. Therefore, the missed detection rate computed here and the recall are not strictly complementary.
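
To make the gap concrete, reuse the toy matrix from earlier. Recall computed from a confusion matrix at a single confidence threshold is not exactly YOLOv5's PR-curve recall, but the counting argument is the same:

import numpy as np

m = np.array([
    [50.,  2.,  4.],
    [ 3., 60.,  1.],
    [ 5.,  6.,  0.],
])
n_gt = m[:, :-1].sum()             # 126 labeled boxes (everything except the false-detection column)
tp = np.diag(m)[:-1].sum()         # 110 boxes detected AND correctly classified (the diagonal)
recall = tp / n_gt                 # ≈ 0.873
miss_rate = m[-1, :].sum() / n_gt  # 11 / 126 ≈ 0.087
print(1 - recall, miss_rate)       # ≈ 0.127 vs 0.087: the 5 misclassified boxes explain the gap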

Going a step further: is there a way to steer training toward a lower missed detection rate, i.e., to save the checkpoint with the highest recall?

Let's take a look at YOLOv5's model-saving logic.
In train.py, a fitness value fi is defined:

# Update best mAP
fi = fitness(np.array(results).reshape(1, -1))  # weighted combination of [P, R, mAP@0.5, mAP@0.5:0.95]
if fi > best_fitness:
    best_fitness = fi
wandb_logger.end_epoch(best_result=best_fitness == fi)

This metric is computed by fitness(), which is defined in metrics.py:

def fitness(x):
    # Model fitness as a weighted combination of metrics
    w = [0.0, 0.0, 0.1, 0.9]  # weights for [P, R, mAP@0.5, mAP@0.5:0.95]
    return (x[:, :4] * w).sum(1)

That is, YOLOv5's checkpoint-selection score is actually 0.1 × mAP@0.5 + 0.9 × mAP@0.5:0.95. Four weights are defined here, so to favor R you only need to increase its corresponding weight.
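
For example, one could shift weight onto R so that checkpoint selection favors recall. The weights below only illustrate the mechanism and are not a recommendation:

def fitness(x):
    # Model fitness as a weighted combination of metrics
    w = [0.0, 0.4, 0.1, 0.5]  # illustrative weights for [P, R, mAP@0.5, mAP@0.5:0.95], shifted toward R
    return (x[:, :4] * w).sum(1)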

Origin: blog.csdn.net/qq1198768105/article/details/126214241