Paper: Cascade R-CNN: High Quality Object Detection and Instance Segmentation

https://arxiv.org/abs/1906.09756
In object detection, the intersection over union (IoU) threshold is frequently used to define positives/negatives. The threshold used to train a detector defines its quality. While the commonly used threshold of 0.5 leads to noisy (low-quality) detections, detection performance frequently degrades for larger thresholds. This paradox of high-quality detection has two causes: 1) overfitting, due to vanishing positive samples for large thresholds, and 2) inference-time quality mismatch between detector and test hypotheses. A multi-stage object detection architecture, the Cascade R-CNN, composed of a sequence of detectors trained with increasing IoU thresholds, is proposed to address these problems. The detectors are trained sequentially, using the output of a detector as training set for the next. This resampling progressively improves hypotheses quality, guaranteeing a positive training set of equivalent size for all detectors and minimizing overfitting. The same cascade is applied at inference, to eliminate quality mismatches between hypotheses and detectors. An implementation of the Cascade R-CNN without bells or whistles achieves state-of-the-art performance on the COCO dataset, and significantly improves high-quality detection on generic and specific object detection datasets, including VOC, KITTI, CityPerson, and WiderFace. Finally, the Cascade R-CNN is generalized to instance segmentation, with nontrivial improvements over the Mask R-CNN. To facilitate future research, two implementations are made available, one in Caffe and one in Detectron.
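To make the resampling idea in the abstract concrete, here is a minimal toy sketch in Python. It is not the authors' implementation. The names `iou`, `label_positives`, `refine_toward` and `run_cascade` are hypothetical, and the thresholds (0.5, 0.6, 0.7) are an assumed example of "increasing IoU thresholds". The learned box regressor of each stage is replaced by a stand-in that simply moves a box halfway toward its best-matching ground truth, which only illustrates how better hypotheses keep the positive set from vanishing as the threshold rises.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_positives(proposals: Sequence[Box], gt_boxes: Sequence[Box],
                    threshold: float) -> List[bool]:
    """A proposal is positive if its best IoU with any ground truth >= threshold."""
    return [
        max((iou(p, g) for g in gt_boxes), default=0.0) >= threshold
        for p in proposals
    ]


def refine_toward(p: Box, g: Box, step: float = 0.5) -> Box:
    """Toy stand-in for a learned regressor (assumption): move p part-way to g."""
    return tuple(pc + step * (gc - pc) for pc, gc in zip(p, g))


def run_cascade(proposals: Sequence[Box], gt_boxes: Sequence[Box],
                thresholds: Sequence[float] = (0.5, 0.6, 0.7)) -> List[int]:
    """Count positives per stage; each stage's output feeds the next stage."""
    counts = []
    boxes = list(proposals)
    for t in thresholds:
        counts.append(sum(label_positives(boxes, gt_boxes, t)))
        # Resampling: the refined boxes become the next stage's hypotheses.
        boxes = [
            refine_toward(p, max(gt_boxes, key=lambda g: iou(p, g)))
            for p in boxes
        ]
    return counts


if __name__ == "__main__":
    gt = [(0.0, 0.0, 10.0, 10.0)]
    props = [(0.0, 0.0, 10.0, 6.0), (0.0, 0.0, 10.0, 4.0)]
    # Without refinement, a stricter threshold leaves no positives.
    print(label_positives(props, gt, 0.7))
    # With stage-by-stage refinement, positives survive the stricter thresholds.
    print(run_cascade(props, gt))
```

In this toy setup, labelling the raw proposals directly at 0.7 yields no positives, while the cascade keeps positives at every stage because each stage sees the previous stage's improved boxes.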

