First, the paper
Cascade R-CNN: Delving into High Quality Object Detection
https://arxiv.org/abs/1712.00726
https://github.com/zhaoweicai/cascade-rcnn
Second, notes on the paper
1. Background
A) If the IoU threshold is set too low, the positive and negative samples are poorly balanced: too many low-quality boxes that are effectively negatives get through, so the detector is not sensitive enough to reject false detections at higher IoU thresholds.
B) If the IoU threshold is set too high, too few training samples remain, so detector performance drops and the model over-fits.
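To make the trade-off in A) and B) concrete, here is a minimal sketch of IoU-based labeling. The boxes and the helper names (`iou`, `count_positives`) are made up for illustration and are not taken from the paper's code; it only shows how the pool of positives shrinks as the threshold rises.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_positives(proposals, gt, threshold):
    """Number of proposals labeled positive at the given IoU threshold."""
    return sum(1 for p in proposals if iou(p, gt) >= threshold)


# Hypothetical ground-truth box and proposals shifted along x by 0, 1, 2, 3, 5.
gt = (0, 0, 10, 10)
proposals = [(0, 0, 10, 10), (1, 0, 11, 10), (2, 0, 12, 10),
             (3, 0, 13, 10), (5, 0, 15, 10)]

for u in (0.5, 0.6, 0.7):
    print(u, count_positives(proposals, gt, u))
```

With a low threshold, loosely aligned boxes count as positives; with a high one, only a few well-aligned boxes are left to train on.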
2. The idea
Use a multi-stage R-CNN (built on the two-stage Faster R-CNN) in which each stage is trained on the output of the previous stage, because the IoU of each stage's output is almost always higher than the IoU of its input.
3. Details
Regression loss: R-CNN uses the L2 loss, while Fast R-CNN uses the smooth L1 loss.
Smooth L1 behaves like the L2 norm where the absolute error is less than 1 (smoother, and convenient to differentiate) and like the L1 norm where it is greater than 1 (avoiding exploding gradients and reducing the effect of outliers). With a pure L2 loss, an outlier whose output is far from the ground truth produces a very large loss, because the loss grows with the square of the deviation.
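The difference between the two losses can be checked numerically. The sketch below uses the smooth L1 definition from Fast R-CNN (0.5x² for |x| < 1, |x| − 0.5 otherwise); the function names are chosen here for illustration, and the 0.5 factor on the L2 loss is an assumption made so the two curves match near zero.

```python
def l2_loss(x):
    """Quadratic loss; the 0.5 factor is an illustrative convention."""
    return 0.5 * x * x


def smooth_l1_loss(x):
    """Smooth L1: quadratic for |x| < 1, linear beyond that."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def smooth_l1_grad(x):
    """Gradient of smooth L1: equals x near zero, clipped to +/-1 elsewhere."""
    if abs(x) < 1.0:
        return x
    return 1.0 if x > 0 else -1.0


for err in (0.5, 1.0, 10.0):
    print(err, l2_loss(err), smooth_l1_loss(err), smooth_l1_grad(err))
```

For a small error the two losses agree, while for an outlier with error 10 the L2 loss is 50 and smooth L1 is 9.5, with a gradient of 1 instead of 10.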
4. Model structure
The last structure is the one used: several cascaded heads, where each stage uses a different IoU threshold and the thresholds increase from stage to stage.
Does structure (b) use the same head at every stage, i.e. with shared parameters?
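The cascade with rising thresholds can be illustrated with a toy simulation. Everything here is an assumption for illustration: `toy_regress` is a hypothetical stand-in that simply moves a box halfway toward the ground truth, whereas the real model learns a separate regressor per stage; the thresholds 0.5, 0.6, 0.7 follow the values reported in the paper.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def toy_regress(box, gt, step=0.5):
    """Hypothetical regressor: move each coordinate a fraction toward gt."""
    return tuple(c + step * (g - c) for c, g in zip(box, gt))


def run_cascade(proposals, gt, thresholds=(0.5, 0.6, 0.7)):
    """Feed each stage the boxes refined by the previous stage."""
    boxes = list(proposals)
    history = []  # (stage, threshold, positives, mean IoU of stage input)
    for stage, u in enumerate(thresholds, start=1):
        ious = [iou(b, gt) for b in boxes]
        positives = sum(1 for v in ious if v >= u)
        history.append((stage, u, positives, sum(ious) / len(ious)))
        boxes = [toy_regress(b, gt) for b in boxes]  # input to next stage
    return boxes, history


gt = (0, 0, 10, 10)
proposals = [(2, 0, 12, 10), (3, 0, 13, 10), (4, 0, 14, 10)]
final_boxes, history = run_cascade(proposals, gt)
for row in history:
    print(row)
```

Because each stage receives better-localized boxes than the one before, the number of positives does not collapse even though the threshold rises, which is the motivation for resampling stage by stage.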
5. Process flow