[Improvements to IoU in object detection] A detailed introduction to GIoU, DIoU, and CIoU

1、IoU

  • IoU (Intersection over Union) is the ratio of the intersection area to the union area of the predicted box (pred) and the ground truth (GT):
    $\mathrm{IoU} = \dfrac{|A \cap B|}{|A \cup B|}$, where A and B are the two boxes.

IoU can be used as an evaluation metric, or to construct a loss: IoU loss = 1 - IoU.
Disadvantages:
1. When pred and GT intersect, IoU is nonzero and IoU loss has a gradient that can be back-propagated. When the two do not intersect, the gradient is 0 and the loss cannot be optimized.
2. When pred and GT do not intersect, IoU is always 0, so it cannot tell whether the two boxes are close together or far apart.
3. IoU cannot reflect how the two boxes overlap. In the original figure (image not preserved), cases (a) and (b) both have IoU = 0.14, but the two boxes in (a) are better aligned than those in (b).
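
The definition and the second disadvantage can be checked with a few lines of Python. This is only a minimal sketch: the function name `iou` and the `(x1, y1, x2, y2)` corner format are assumptions made for illustration, not something the original post specifies.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) corners (assumed format)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the intersection, clamped to 0 when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


gt = (0, 0, 2, 2)
print(iou((1, 1, 3, 3), gt))    # overlapping boxes: 1/7, about 0.143
print(iou((3, 0, 5, 2), gt))    # disjoint, nearby: 0.0
print(iou((30, 0, 32, 2), gt))  # disjoint, far away: 0.0
```

The last two calls return the same value although one box is ten times farther from GT, which is exactly why IoU loss gives no useful signal for disjoint boxes.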

2、GIoU (Generalized Intersection over Union)

  • GIoU was proposed to address the shortcomings of IoU listed above:
    $\mathrm{GIoU} = \mathrm{IoU} - \dfrac{|C \setminus (A \cup B)|}{|C|}$
  • Here A and B are the two boxes (pred and GT), and C is the smallest box containing both A and B, that is, the bounding rectangle of the two boxes.

1. When pred and GT are disjoint, GIoU is generally nonzero (it becomes negative), so GIoU loss = 1 - GIoU still provides a gradient for backpropagation.
2. GIoU reflects how the two boxes overlap: the better aligned they are, the larger the GIoU, as in the two cases of the second figure above.
3. GIoU better reflects the distance between pred and GT, because the enclosing box C grows as the two boxes move apart.
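
To make points 1 and 3 concrete, here is a small Python sketch of GIoU. It assumes the standard GIoU definition (IoU minus the fraction of the enclosing box C not covered by the union), boxes in `(x1, y1, x2, y2)` form, and boxes with positive area; the name `giou` is hypothetical.

```python
def giou(box_a, box_b):
    """GIoU of two axis-aligned boxes in (x1, y1, x2, y2) form (assumed format)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # C: smallest axis-aligned box enclosing both inputs.
    c_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return inter / union - (c_area - union) / c_area


gt = (0, 0, 2, 2)
print(giou((3, 0, 5, 2), gt))    # disjoint, nearby: -0.2
print(giou((30, 0, 32, 2), gt))  # disjoint, far away: -0.875
```

Unlike IoU, the value keeps decreasing toward -1 as the disjoint box moves away, so 1 - GIoU still tells the optimizer which direction is better.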

3、DIoU

  • DIoU mainly addresses the following situation (the original figure is not preserved):
  • When pred lies completely inside GT, the enclosing box coincides with GT, so GIoU equals IoU, and neither can tell whether the center of pred is close to the center of GT.
  • Therefore, DIoU loss adds a distance penalty term to IoU loss, defined as follows:
    $L_{\mathrm{DIoU}} = 1 - \mathrm{IoU} + \dfrac{\rho^2(b, b^{gt})}{c^2}$
  • In the loss above, $b$ and $b^{gt}$ are the center points of the predicted box and the target box, and $\rho(\cdot)$ is the Euclidean distance between the two center points. $c$ is the diagonal length of the smallest rectangle that covers both the predicted box and the target box. Writing $d = \rho(b, b^{gt})$, the penalty $d^2/c^2$ is the squared center distance normalized by the enclosing box.

1. Like GIoU loss, DIoU loss still has a nonzero gradient when the predicted box does not intersect the target box, so it can be optimized.
2. Faster convergence: DIoU loss directly minimizes the distance between the two box centers, while GIoU loss works by shrinking the empty area of the enclosing box, so DIoU loss converges much faster than GIoU loss.
3. When one box contains the other, or the two boxes are aligned horizontally or vertically, DIoU loss still regresses very quickly, while GIoU loss almost degenerates into IoU loss.

  • DIoU fits the bbox regression mechanism better than GIoU. It takes into account the distance, overlap rate, and scale between GT and pred, which makes box regression more stable and avoids the divergence problems that IoU and GIoU losses can run into during training.
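
The "pred inside GT" case is easy to reproduce numerically. The sketch below assumes the DIoU loss form 1 - IoU + d²/c² described above, boxes as `(x1, y1, x2, y2)` with positive area, and a hypothetical function name `diou_loss`.

```python
def diou_loss(pred, gt):
    """DIoU loss for two axis-aligned boxes in (x1, y1, x2, y2) form (assumed format)."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (gx2 - gx1) * (gy2 - gy1) - inter
    iou = inter / union
    # d^2: squared Euclidean distance between the two box centers.
    d2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    # c^2: squared diagonal of the smallest box enclosing both inputs.
    c2 = (max(px2, gx2) - min(px1, gx1)) ** 2 + (max(py2, gy2) - min(py1, gy1)) ** 2
    return 1.0 - iou + d2 / c2


gt = (0, 0, 4, 4)
print(diou_loss((1, 1, 3, 3), gt))  # pred centered inside GT: 0.75
print(diou_loss((0, 0, 2, 2), gt))  # same size, in a corner of GT: 0.8125
```

Both predictions have IoU = GIoU = 0.25, so IoU loss and GIoU loss cannot separate them; the center-distance term is what makes the off-center box cost more.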

4、CIoU

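  • Because the consistency of aspect ratio between pred and GT is also important, CIoU loss adds a box aspect-ratio penalty term on top of DIoU loss:
    $L_{\mathrm{CIoU}} = 1 - \mathrm{IoU} + \dfrac{\rho^2(b, b^{gt})}{c^2} + \alpha v$
    $v = \dfrac{4}{\pi^2}\left(\arctan\dfrac{w^{gt}}{h^{gt}} - \arctan\dfrac{w}{h}\right)^2, \quad \alpha = \dfrac{v}{(1 - \mathrm{IoU}) + v}$
    Here $w, h$ and $w^{gt}, h^{gt}$ are the widths and heights of pred and GT, and $\rho$, $b$, $b^{gt}$, $c$ have the same meaning as in DIoU loss.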
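
As a rough illustration of the aspect-ratio term, the sketch below assumes the commonly cited CIoU form, with v = (4/π²)(arctan(w_gt/h_gt) - arctan(w/h))² and α = v / ((1 - IoU) + v). The function name `ciou_loss` and the `(x1, y1, x2, y2)` box format are assumptions for this example, and boxes are assumed to have positive width and height.

```python
import math


def ciou_loss(pred, gt):
    """CIoU loss for two axis-aligned boxes in (x1, y1, x2, y2) form (assumed format)."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    iou = inter / (pw * ph + gw * gh - inter)
    # Center-distance term, same as in DIoU loss.
    d2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    c2 = (max(px2, gx2) - min(px1, gx1)) ** 2 + (max(py2, gy2) - min(py1, gy1)) ** 2
    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(pw / ph)) ** 2
    alpha = v / ((1.0 - iou) + v) if v > 0 else 0.0
    return 1.0 - iou + d2 / c2 + alpha * v


gt = (0, 0, 4, 4)
square = (1, 1, 3, 3)      # 2 x 2, same aspect ratio as GT
wide = (0, 1.5, 4, 2.5)    # 4 x 1, same area and same center as the square
print(ciou_loss(square, gt))  # 0.75, no aspect-ratio penalty
print(ciou_loss(wide, gt))    # larger than 0.75 because of the alpha * v term
```

The two predictions share the same IoU (0.25) and the same center as GT, so DIoU loss scores them identically; only the aspect-ratio term distinguishes the box whose shape matches GT.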

Reference link: https://blog.csdn.net/leonardohaig/article/details/103394369

Origin blog.csdn.net/m0_48086806/article/details/132363118