[Object Detection Series: 5] ECCV 2018 IoU-Net: Paper Reading Notes and Summary

ECCV 2018

Acquisition of Localization Confidence for Accurate Object Detection

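PreciseRoIPooling code
ECCV 2018 | Megvii Oral paper walkthrough: IoU-Net brings localization confidence to object detection

A blog post used as reference

It is best to read the paper yourself first and then read the summary below.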

IoU-Net

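Problem addressed: NMS keeps the box with the highest classification confidence, but that box is not necessarily the most accurately localized one.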

Two drawbacks in object localization

  • the misalignment between classification confidence and localization accuracy
  • the non-monotonic bounding box regression

joint training

  • Backbone
    ResNet-FPN

  • FPN

  • Precise RoI Pooling

  • Head
    works in parallel
    based on the same visual feature from the backbone

    • IoU predictor
    • R-CNN
      • classification and regression branches take 512 RoIs per image from RPNs

Training

  • input images resized to 800 px on the short side, at most 1200 px on the long side
  • batch size 16
  • lr 0.01
  • 160k iterations
  • warm-up: lr 0.004 for the first 10k iterations
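
The warm-up setting above can be sketched as a small schedule function. This is a minimal sketch that reads the note as a constant warm-up rate followed by the base rate; the function name `learning_rate` is hypothetical, and any later learning-rate decay is left out because the notes do not list it.

```python
def learning_rate(iteration, base_lr=0.01, warmup_lr=0.004, warmup_iters=10_000):
    """Return the learning rate for a 0-based iteration index.

    Assumption: a constant warm-up rate for the first `warmup_iters`
    iterations, then the base rate. Decay steps are not modelled.
    """
    if iteration < warmup_iters:
        return warmup_lr
    return base_lr
```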

Training the IoU detector

  • smooth-L1 loss
  • IoU labels
    normalized so that the values are distributed over [-1, 1]
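
To make the two bullets above concrete, here is a minimal NumPy sketch of a label normalization and a smooth-L1 loss. The exact mapping is not given in these notes; the linear map from [0.5, 1] to [-1, 1] is an assumption based on the fact that training candidates are kept only when their IoU is at least 0.5. The names `normalize_iou` and `smooth_l1` and the `beta` parameter are illustrative.

```python
import numpy as np

def normalize_iou(iou, lo=0.5, hi=1.0):
    """Linearly map IoU values from [lo, hi] to [-1, 1] (assumed mapping)."""
    iou = np.asarray(iou, dtype=np.float64)
    return 2.0 * (iou - lo) / (hi - lo) - 1.0

def smooth_l1(pred, target, beta=1.0):
    """Smooth-L1 loss: quadratic below `beta`, linear above, averaged over elements."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    diff = np.abs(pred - target)
    loss = np.where(diff < beta, 0.5 * diff ** 2 / beta, diff - 0.5 * beta)
    return loss.mean()
```

At inference the predictor output would have to be mapped back with the inverse of whatever normalization was used in training.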

Inference

  • first apply bounding box regression for the initial coordinates
  • IoU-guided NMS
    on all detected bounding boxes
  • refine using optimization-based algorithm
    100 bounding boxes with highest classification confidence

Predict IoU

IoU predictor

  • aim

    • takes features from the FPN
    • estimates the localization accuracy (IoU) for each bounding box
  • data generation

    • generate candidate bounding box set
      generate bounding boxes and labels for training the IoU-Net by augmenting the ground truth, instead of taking proposals from RPNs
      for every ground-truth bounding box in the training set, manually transform it with a set of randomized parameters

    • remove the bounding boxes having an IoU < 0.5 with the matched ground-truth

  • feature

    • extracted from the output of FPN with the proposed PrRoI-Pooling layers
    • then fed into a two-layer feedforward network for the IoU prediction
  • use class-aware IoU predictors
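
The data generation step above (jitter each ground-truth box, then drop candidates with IoU below 0.5) can be sketched as follows. The notes do not say which random transforms are used, so the uniform shift and scale ranges here are assumptions, and the function names are hypothetical.

```python
import numpy as np

def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def jitter_ground_truth(gt, num_samples, rng, max_shift=0.3, max_scale=0.3, min_iou=0.5):
    """Randomly shift and rescale one ground-truth box.

    Returns the candidates whose IoU with `gt` is at least `min_iou`,
    together with those IoU values (the regression labels).
    """
    x1, y1, x2, y2 = gt
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    boxes, labels = [], []
    for _ in range(num_samples):
        # Assumed transform: centre shift and size scale, both relative to the box size.
        dx, dy = rng.uniform(-max_shift, max_shift, size=2)
        sw, sh = rng.uniform(1.0 - max_scale, 1.0 + max_scale, size=2)
        nw, nh = w * sw, h * sh
        ncx, ncy = cx + dx * w, cy + dy * h
        cand = (ncx - nw / 2, ncy - nh / 2, ncx + nw / 2, ncy + nh / 2)
        iou = box_iou(cand, gt)
        if iou >= min_iou:
            boxes.append(cand)
            labels.append(iou)
    return np.array(boxes, dtype=np.float64).reshape(-1, 4), np.array(labels)
```

A call such as `jitter_ground_truth((10, 10, 50, 90), 200, np.random.default_rng(0))` returns the surviving candidates and their IoU labels for one ground-truth box.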

IoU-guided NMS

  • use the predicted IoU instead of the classification confidence as the ranking keyword for bounding boxes.

  • to determine the classification scores

    • select the box having the highest IoU with a ground-truth
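    • eliminate all other boxes having an overlap greater than the NMS threshold
    • for a group of bounding boxes matching the same ground-truth, we take the most confident prediction for the class label.
      That is, the classification confidence assigned to the highest-IoU box is the maximum classification confidence over itself and the boxes that match the same gt and were suppressed for exceeding the threshold.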
  • Algorithm

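    • ① From the bounding box set B, repeatedly pick the bounding box with the highest predicted IoU (localization confidence), denoted $b_m$.
    • ② Pick out, one by one, the bounding boxes whose IoU with $b_m$ exceeds a given threshold, and record the highest classification confidence among these boxes (including the initially chosen $b_m$) as $s$.
    • ③ Record the pair $(b_m, s)$ in the set D (in essence, a reassignment of bounding boxes and classification confidences).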
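
Steps ① to ③ can be sketched in plain Python as below. This is an illustrative sketch rather than the authors' code; the function names and the default threshold of 0.5 are assumptions, and it handles a single class.

```python
import numpy as np

def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def iou_guided_nms(boxes, cls_scores, loc_scores, nms_thresh=0.5):
    """Return a list D of (box index, reassigned classification score).

    Boxes are ranked by predicted IoU (`loc_scores`); each kept box takes
    the highest classification score found in its suppressed cluster.
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        # Step 1: pick the box with the highest predicted IoU.
        m = max(remaining, key=lambda i: loc_scores[i])
        remaining.remove(m)
        # Step 2: suppress overlapping boxes and track the best class score.
        s = cls_scores[m]
        survivors = []
        for j in remaining:
            if box_iou(boxes[m], boxes[j]) > nms_thresh:
                s = max(s, cls_scores[j])
            else:
                survivors.append(j)
        remaining = survivors
        # Step 3: record the pair (b_m, s).
        kept.append((m, s))
    return kept
```

With two heavily overlapping boxes, the one with the higher predicted IoU is kept even if its own classification score is lower, and it inherits the higher classification score of the box it suppresses.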

Optimization-based bounding box refinement

  • Algorithm
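    • For each detected bounding box, extract its features with PrPool and compute the IoU predicted by IoU-Net; denote the gradient of this IoU with respect to the box coordinates as grad and the IoU itself as PrevScore
    • Then update the bounding box along grad
    • After the update, predict the IoU again; the result is NewScore
    • If PrevScore and NewScore differ by less than an early-stop threshold, or NewScore falls below PrevScore by more than a "localization degeneration tolerance", the bounding box is considered fully updated.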
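
The loop above can be sketched for a single box as follows. Several parts are assumptions made for illustration: `score_fn` is a hypothetical stand-in for the IoU predictor applied to PrPool features, the gradient is approximated by finite differences instead of backpropagation through PrPool, the step is scaled by the box width and height, and the threshold values are arbitrary.

```python
import numpy as np

def numerical_grad(score_fn, box, eps=1e-3):
    """Central finite-difference gradient of score_fn w.r.t. (x1, y1, x2, y2)."""
    grad = np.zeros(4)
    for k in range(4):
        step = np.zeros(4)
        step[k] = eps
        grad[k] = (score_fn(box + step) - score_fn(box - step)) / (2 * eps)
    return grad

def refine_box(score_fn, box, step_size=0.5, early_stop=1e-3,
               tolerance=-1e-2, max_iters=50):
    """Gradient-ascent refinement of one box on a predicted-IoU score."""
    box = np.asarray(box, dtype=np.float64).copy()
    for _ in range(max_iters):
        prev_score = score_fn(box)
        grad = numerical_grad(score_fn, box)
        # Assumption: scale each coordinate's gradient by the box size on that axis.
        w, h = box[2] - box[0], box[3] - box[1]
        box = box + step_size * grad * np.array([w, h, w, h])
        new_score = score_fn(box)
        # Finished: score change below the early-stop threshold, or the
        # score dropped by more than the degeneration tolerance.
        if abs(prev_score - new_score) < early_stop or new_score - prev_score < tolerance:
            break
    return box
```

In a real detector `score_fn` would run the IoU head on features pooled from the box, and the gradient would come from automatic differentiation, which works because PrPool is differentiable with respect to the box coordinates.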

PrPool

  • continuous
  • differentiable
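
As a rough illustration of why PrPool has these two properties: as I understand the paper's definition, the discrete feature map is made continuous by bilinear interpolation, and each bin value is the integral of that continuous map over the bin divided by the bin area. The NumPy sketch below computes this average in closed form for one bin of a single-channel map; the function names are hypothetical, and coordinates are in feature-map units with `feature[y, x]` indexing.

```python
import numpy as np

def _tri_cdf(t):
    """Antiderivative of the triangle kernel max(0, 1 - |t|), equal to 0 for t <= -1."""
    t = np.clip(t, -1.0, 1.0)
    return np.where(t < 0, 0.5 * (t + 1.0) ** 2, 1.0 - 0.5 * (1.0 - t) ** 2)

def pr_pool_bin(feature, x1, y1, x2, y2):
    """Average of the bilinearly interpolated map over the bin [x1, x2] x [y1, y2]."""
    height, width = feature.shape
    xs = np.arange(width)
    ys = np.arange(height)
    # Integral of each column / row interpolation kernel over the bin extent.
    wx = _tri_cdf(x2 - xs) - _tri_cdf(x1 - xs)
    wy = _tri_cdf(y2 - ys) - _tri_cdf(y1 - ys)
    # The bilinear kernel is separable, so the double integral factorizes.
    return (wy @ feature @ wx) / ((x2 - x1) * (y2 - y1))
```

Because the result is a smooth function of `x1, y1, x2, y2` with no coordinate rounding, it can be differentiated with respect to the box coordinates, which is what the refinement step relies on.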

Reposted from blog.csdn.net/qq_31622015/article/details/100776672