[YOLOv7/YOLOv5 series algorithm improvement NO.8] Improving the non-maximum suppression (NMS) algorithm with Soft-NMS

​Foreword: Although YOLOv5, a state-of-the-art deep learning object detection algorithm, already assembles a large number of tricks, there is still room for improvement. Different improvement methods can target the detection difficulties of specific application scenarios. This series of articles describes in detail how to improve YOLOv5, with the aim of offering modest help and reference to students engaged in research who need innovation, and to friends working on engineering projects who want better results.

Code for the Soft-NMS improvement applied to YOLOv7:

Link: https://pan.baidu.com/s/1N9D5xjbhQjBoH12BxVsgsw 
Extraction code: available by private message after following the author

Problem addressed: By default, YOLOv5 uses the NMS algorithm, which filters candidate boxes mainly by IoU. NMS works iteratively: the highest-scoring box is compared by IoU against all remaining boxes, and any box whose IoU with it is too large (that is, whose overlap is too large) is filtered out. Disadvantages of NMS: 1. The biggest problem with the NMS algorithm is that it forces the scores of adjacent detection boxes to zero (that is, it removes every detection box whose overlap exceeds the threshold Nt). If a real object lies in the overlapping region, its detection fails, which lowers the algorithm's average detection rate. 2. The NMS threshold is hard to choose: set it too low and correct boxes are wrongly deleted; set it too high and false detections increase. Soft-NMS is used to address these problems.
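The score-zeroing problem can be seen on a toy pair of boxes. The sketch below is a minimal illustration in plain Python with made-up coordinates, not part of the YOLOv5 code:

```python
def iou(a, b):
    # a, b are boxes in (x1, y1, x2, y2) format
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Two boxes covering two real, heavily overlapping objects
box_a = (0, 0, 100, 100)    # score 0.9, kept first
box_b = (30, 0, 130, 100)   # score 0.8, a second real object
print(round(iou(box_a, box_b), 3))  # 0.538 > Nt = 0.5, so hard NMS deletes box_b entirely
```

Even though box_b covers a second real object, hard NMS with Nt = 0.5 removes it, so that object is never detected.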

Principle:

The NMS algorithm is somewhat crude, because it simply deletes every box whose IoU with the selected box exceeds the threshold. Soft-NMS learns from this: during execution, it does not delete a detection box whose IoU exceeds the threshold, but instead lowers its score. The algorithm flow is the same as NMS; the original confidence score is merely rescaled by a decay function. 1. Soft-NMS can be introduced into an object detection algorithm without retraining the original model; it is easy to implement, adds negligible computation (compared to the detector as a whole), and integrates easily into any current object detection algorithm that uses NMS. 2. Soft-NMS keeps traditional NMS during training and is applied only in the inference code. 3. NMS is a special case of Soft-NMS: when the score-reset function is a binary function, Soft-NMS and NMS are identical, so Soft-NMS is the more general non-maximum suppression algorithm.
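The score-reset functions can be compared directly. This sketch (illustrative values only) shows the binary function of hard NMS, the linear Soft-NMS decay, and the Gaussian decay used later in this article:

```python
import math

def hard_weight(iou, nt=0.5):
    # binary reset function: with this choice, Soft-NMS collapses to plain NMS
    return 0.0 if iou > nt else 1.0

def linear_weight(iou, nt=0.5):
    # linear Soft-NMS decay: reduce the score in proportion to the overlap
    return 1.0 - iou if iou > nt else 1.0

def gaussian_weight(iou, sigma=0.5):
    # Gaussian Soft-NMS decay: smooth penalty, no hard threshold needed
    return math.exp(-(iou ** 2) / sigma)

score, overlap = 0.8, 0.6  # a neighbour box with score 0.8 and IoU 0.6
print(score * hard_weight(overlap))               # 0.0  -> box effectively deleted
print(score * linear_weight(overlap))             # 0.32 -> box survives, lower score
print(round(score * gaussian_weight(overlap), 3)) # 0.389
```

With a binary weight, a neighbouring box is either kept untouched or removed entirely; both soft variants keep it with a reduced confidence, which is exactly point 3 above.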

Method:

Step 1: Modify general.py to add the Soft-NMS module; the code is as follows.

def my_soft_nms(bboxes, scores, iou_thresh=0.5, sigma=0.5, score_threshold=0.25):
    # Gaussian Soft-NMS. bboxes: (N, 4) tensor of (x1, y1, x2, y2); scores: (N,) tensor.
    # torch is already imported at the top of general.py.
    bboxes = bboxes.contiguous()

    x1 = bboxes[:, 0]
    y1 = bboxes[:, 1]
    x2 = bboxes[:, 2]
    y2 = bboxes[:, 3]
    # Area of each box
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    # Sort all scores in descending order once, to speed up later maximum lookups;
    # order holds the indices after the descending sort
    _, order = scores.sort(0, descending=True)
    # Indices of the boxes kept after NMS
    keep = []

    while order.numel() > 0:
        if order.numel() == 1:  # only one box index left
            i = order.item()
            keep.append(i)
            break
        else:
            i = order[0].item()  # keep the index of the current highest-scoring box; i indexes into scores
            keep.append(i)
        # Use tensor.clamp() to compute, for every box in order other than the current
        # box order[0], the coordinates of its intersection with the current box
        xx1 = x1[order[1:]].clamp(min=x1[i])
        yy1 = y1[order[1:]].clamp(min=y1[i])
        xx2 = x2[order[1:]].clamp(max=x2[i])
        yy2 = y2[order[1:]].clamp(max=y2[i])
        # Intersection area of every other box in order with the current box
        inter = (xx2 - xx1).clamp(min=0) * (yy2 - yy1).clamp(min=0)
        # IoU of every other box in order with the current box
        iou = inter / (areas[i] + areas[order[1:]] - inter)  # order.numel() - 1 values

        idx = (iou > iou_thresh).nonzero().squeeze()  # indices (into order[1:]) of boxes whose IoU exceeds the threshold
        if idx.numel() > 0:
            iou = iou[idx]
            newScores = torch.exp(-torch.pow(iou, 2) / sigma)  # Gaussian score decay
            scores[order[idx + 1]] *= newScores  # decay the scores of those boxes whose IoU exceeds the threshold

        # Keep only boxes whose (possibly decayed) score is still above score_threshold
        newOrder = (scores[order[1:]] > score_threshold).nonzero().squeeze()
        if newOrder.numel() == 0:
            break
        else:
            newScores = scores[order[newOrder + 1]]
            maxScoreIndex = torch.argmax(newScores)
            # Move the highest-scoring remaining box to the front
            if maxScoreIndex != 0:
                newOrder[[0, maxScoreIndex],] = newOrder[[maxScoreIndex, 0],]
            # Update order
            order = order[newOrder + 1]

    # Return the indices of all kept boxes, as a torch.LongTensor
    return torch.LongTensor(keep)
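To check the behaviour outside the YOLOv5 pipeline, the same logic can be run on toy data. The sketch below is a plain-Python re-implementation of the function above (same Gaussian decay, applied only when IoU exceeds iou_thresh); the boxes and scores are made up for the demo:

```python
import math

def iou(a, b):
    # a, b are boxes in (x1, y1, x2, y2) format
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def soft_nms_py(boxes, scores, iou_thresh=0.5, sigma=0.5, score_threshold=0.25):
    scores = list(scores)  # work on a copy; scores are decayed in place below
    idxs = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while idxs:
        i = idxs.pop(0)  # highest remaining score
        keep.append(i)
        survivors = []
        for j in idxs:
            o = iou(boxes[i], boxes[j])
            if o > iou_thresh:
                scores[j] *= math.exp(-(o ** 2) / sigma)  # Gaussian decay
            if scores[j] > score_threshold:
                survivors.append(j)
        idxs = sorted(survivors, key=lambda j: -scores[j])
    return keep

boxes = [(0, 0, 100, 100), (30, 0, 130, 100), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]
print(soft_nms_py(boxes, scores))  # [0, 2, 1]: the overlapping box survives with a decayed score
```

Box 1 overlaps box 0 with IoU ≈ 0.54, so hard NMS with Nt = 0.5 would delete it; Soft-NMS keeps it with its score decayed from 0.8 to roughly 0.45.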

Step 2: In general.py, replace the NMS call with a call to the Soft-NMS function.

     i = my_soft_nms(boxes, scores, iou_thres)  # replaces the default NMS call

Results: I have run extensive experiments on multiple datasets; the effect varies from dataset to dataset, with a slight improvement overall.

Preview: the next article will share an anchor-box optimization algorithm based on K-Means++. Interested friends can follow me; if you have any questions, leave a comment or message me privately.

PS: This NMS improvement is not only suitable for YOLOv5; it can also improve other YOLO networks, such as YOLOv4, YOLOv3, etc.

Finally, I hope that we can follow each other, be friends, and learn and communicate together.


Origin blog.csdn.net/m0_70388905/article/details/125448230