YOLOv8 improvement strategy: Shape-IoU, a metric that accounts for bounding box shape and scale

Summary

This article uses the recently proposed Shape-IoU to improve YOLOv8 and obtains an improvement on my own dataset.

Paper: "Shape-IoU: More Accurate Metric considering Bounding Box Shape and Scale"

https://arxiv.org/pdf/2312.17663.pdf
As an important part of the detector's localization branch, the bounding box regression loss plays a significant role in object detection tasks. Existing bounding box regression methods usually consider the geometric relationship between the ground truth box (GT box) and the predicted box, and compute the loss from the relative position and shape of the two boxes, while ignoring the influence of the inherent properties of the bounding box itself (such as its shape and scale) on bounding box regression. To make up for this shortcoming of existing research, this paper proposes a bounding box regression method that focuses on the shape and scale of the bounding box itself. First, we analyze the regression characteristics of the bounding box and find that the shape and scale of the bounding box itself affect the regression results. Based on this conclusion, we propose the Shape-IoU method, which computes the loss by focusing on the shape and scale of the bounding box itself, thereby making bounding box regression more accurate. Finally, we validate our method through a large number of comparative experiments. The results show that our method effectively improves detection performance and outperforms existing methods, achieving state-of-the-art performance on different detection tasks. Code is available at https://github.com/malagoutou/Shape-IoU.
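
To make the idea in the abstract concrete, here is a minimal sketch of a Shape-IoU style loss for a single pair of boxes in plain Python. It follows the formulation as I understand it from the paper: the GT box's width and height produce two weights, which rescale the center-distance penalty and the width/height mismatch penalty. The function names, the `(x1, y1, x2, y2)` box layout, the default `scale=0.0`, the exponent 4 and the factor 0.5 are assumptions of this sketch and should be checked against the official repository before use. It is not the YOLOv8 integration itself.

```python
import math


def shape_iou(pred, gt, scale=0.0, eps=1e-7):
    """Shape-IoU style similarity for two boxes given as (x1, y1, x2, y2).

    Assumes both boxes have positive width and height. `scale` is the
    dataset-dependent scale factor; 0.0 is an assumed default that makes
    both shape weights equal to 1.
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain IoU.
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    union = pw * ph + gw * gh - inter + eps
    iou = inter / union

    # Shape weights computed from the GT box only (ww + hh == 2).
    denom = gw ** scale + gh ** scale
    ww = 2.0 * gw ** scale / denom
    hh = 2.0 * gh ** scale / denom

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Center distance, weighted per axis by the GT shape.
    dx2 = ((gx1 + gx2) - (px1 + px2)) ** 2 / 4.0
    dy2 = ((gy1 + gy2) - (py1 + py2)) ** 2 / 4.0
    distance = (hh * dx2 + ww * dy2) / c2

    # Width/height mismatch, also weighted by the GT shape.
    omega_w = hh * abs(pw - gw) / (max(pw, gw) + eps)
    omega_h = ww * abs(ph - gh) / (max(ph, gh) + eps)
    shape_cost = (1 - math.exp(-omega_w)) ** 4 + (1 - math.exp(-omega_h)) ** 4

    return iou - distance - 0.5 * shape_cost


def shape_iou_loss(pred, gt, scale=0.0, eps=1e-7):
    """Regression loss: 1 - Shape-IoU (0 for a perfect match)."""
    return 1.0 - shape_iou(pred, gt, scale, eps)
```

With `scale=0.0` both weights are 1 and the GT shape has no effect. With a positive `scale`, a wide GT box makes a vertical center offset cost more than the same offset would with uniform weights, and a horizontal offset cost less.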

Index terms: Object detection, loss function, and bounding box regression

1 Introduction

Object detection is one of the fundamental tasks in computer vision; its goal is to locate and identify objects in images. Depending on whether anchors are generated, object detection methods can be divided into anchor-based and anchor-free methods. Anchor-based algorithms include Faster R-CNN [1], the YOLO (You Only Look Once) series [2], SSD (Single Shot MultiBox Detector)

Origin blog.csdn.net/m0_47867638/article/details/135435658