Paper notes: SDP + RCNN | Exploit All the Layers: Fast and Accurate CNN Object Detector with SDP and CRC

Abstract

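This paper proposes two techniques for object detection that together deliver both accuracy and efficiency: 1. scale-dependent pooling (accuracy) 2. layer-wise cascaded rejection classifiers (efficiency)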

1 Introduction

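The authors first briefly review R-CNN and related methods, then list the shortcomings of Fast R-CNN:
1. Because Fast R-CNN pools every bounding box at the pooling layer regardless of its size, it cannot recognize small objects accurately and cannot account for a bbox being too small.
2. Multi-scale input is hard to apply to very deep networks because of its memory and computation cost.
3. Pooling hundreds of bounding boxes and feeding them into high-dimensional fc layers is very time-consuming.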
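The paper handles the scale variation of objects with a scale-dependent pooling (SDP) layer. The idea is that objects of different sizes may respond differently at different layers: a small object may produce a strong activation in a shallow layer, while a large object may produce a strong activation in a deep layer.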
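To make the scale argument concrete, the sketch below estimates how many feature-map rows a proposal of a given height covers at different depths. The strides (4, 8, 16 for conv3, conv4, conv5) are an assumption based on a VGG16-style backbone, which these notes do not specify, and `cells_covered` and `footprint_table` are hypothetical helper names.

```python
# Assumed cumulative strides for a VGG16-style backbone (not stated in these notes).
ASSUMED_STRIDES = {"conv3": 4, "conv4": 8, "conv5": 16}


def cells_covered(height_px, stride):
    """Approximate number of feature-map rows spanned by a proposal of this pixel height."""
    return max(1, height_px // stride)


def footprint_table(height_px):
    """Map each layer name to the number of rows the proposal covers on that layer."""
    return {layer: cells_covered(height_px, s) for layer, s in ASSUMED_STRIDES.items()}


if __name__ == "__main__":
    for h in (32, 96, 256):
        print(h, footprint_table(h))
```

Under these assumed strides, a 32-pixel-tall proposal spans only 2 rows at conv5 but 8 rows at conv3, which is one way to see why a shallow layer may carry more usable signal for small objects.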
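The second contribution is the cascaded rejection classifier (CRC). Following the principle of boosting classifiers, the authors treat the first few layers as weak classifiers that can quickly reject easy negatives. This leads to the following framework:
[Figure: overall framework]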

P. A. Viola and M. J. Jones. Rapid object detection using a boosted cascade of simple features. In CVPR, pages 511–518, 2001.

Z. Cai, M. Saberian, and N. Vasconcelos. Learning complexity-aware cascades for deep pedestrian detection. In ICCV, 2015.

M. Mathias, R. Benenson, R. Timofte, and L. J. V. Gool. Handling occlusions with franken-classifiers. In ICCV, pages 1505–1512, 2013.

A. Angelova, A. Krizhevsky, V. Vanhoucke, A. Ogale, and D. Ferguson. Real-time pedestrian detection with deep network cascades. In BMVC, 2015.

Unlike these works, the cascade in this paper is built inside a single network, so it adds almost no extra computation.
Using convolutional features

S. Xie and Z. Tu. Holistically-nested edge detection. CoRR, abs/1504.06375, 2015.

C. Lee, S. Xie, P. W. Gallagher, Z. Zhang, and Z. Tu. Deeply-supervised nets. CoRR, abs/1409.5185, 2014.

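This paper does not fuse the features of the different layers directly; instead, it builds a separate classifier for each of them.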

3 Scale-Dependent pooling

[Figure: SDP architecture]

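As shown in the figure above, SDP routes Fast R-CNN proposals of different sizes to different SDP layers: a proposal 0-64 pixels tall goes to the SDP on the third convolutional layer, and one taller than 128 pixels goes to the SDP on the fifth. Each SDP layer is followed by its own fc layers. The benefit is that the image does not have to be resized many times, which saves computation, and handling different proposals with different feature layers gives a more consistent signal.
There are three SDP branches in total, each with two fc layers plus ReLU and dropout.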
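The routing rule can be sketched as follows. The 0-64 and above-128 ranges come from the text above; sending the 64-128 range to the fourth convolutional layer is an assumption inferred from there being three SDP branches, and the handling of heights exactly at 64 or 128 is also a guess. `sdp_branch` and `group_proposals` are hypothetical names.

```python
def sdp_branch(height_px):
    """Return the conv layer whose SDP branch should pool a proposal of this height."""
    if height_px <= 64:
        return "conv3"
    if height_px <= 128:
        return "conv4"  # assumed: the notes only state the conv3 and conv5 ranges
    return "conv5"


def group_proposals(boxes):
    """Group (x1, y1, x2, y2) boxes by SDP branch, using box height as the scale."""
    groups = {"conv3": [], "conv4": [], "conv5": []}
    for box in boxes:
        _, y1, _, y2 = box
        groups[sdp_branch(y2 - y1)].append(box)
    return groups


if __name__ == "__main__":
    print(group_proposals([(0, 0, 10, 50), (0, 0, 40, 100), (0, 0, 90, 200)]))
```

Each group would then be RoI-pooled from its own layer and passed to that branch's fc layers, so the image itself is never rescaled.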

4 Cascaded Rejection classifiers

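Fast R-CNN with SDP also adds four extra fc layers, so to reduce computation the authors propose CRC, which cuts down the number of proposals. The structure is as follows:
[Figure: CRC structure]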

The authors learn the classifiers with AdaBoost:

Y. Freund, R. Schapire, and N. Abe. A short introduction to boosting. Journal-Japanese Society For Artificial Intelligence, 14(771-780):1612, 1999.

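The idea is as follows. When a proposal passes through a convolutional layer it yields a set of features, and the RoI pooling strategy from Fast R-CNN turns these into an m x m x c feature for the CRC R_l^s. Every proposal is labeled 1 if it is foreground and 0 otherwise, which casts the problem in the standard AdaBoost training form. Each R_l^s trains 50 weak classifiers; once it is trained, training moves on to R_{l+1}^s.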
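A minimal sketch of this training form is shown below: discrete AdaBoost on 0/1 foreground labels with 50 rounds by default. Using decision stumps as the weak learners is an assumption made for illustration (the notes do not say what the weak learners are), and the function names are hypothetical.

```python
import math


def stump_predict(x, feat, thresh, sign):
    """Weak learner: returns sign if x[feat] > thresh, otherwise -sign."""
    return sign if x[feat] > thresh else -sign


def train_adaboost(X, labels, n_rounds=50):
    """Discrete AdaBoost over decision stumps; labels are 1 (foreground) or 0."""
    y = [1 if t == 1 else -1 for t in labels]
    w = [1.0 / len(X)] * len(X)
    model = []
    for _ in range(n_rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for feat in range(len(X[0])):
            for thresh in sorted({x[feat] for x in X}):
                for sign in (1, -1):
                    err = sum(wi for xi, yi, wi in zip(X, y, w)
                              if stump_predict(xi, feat, thresh, sign) != yi)
                    if best is None or err < best[0]:
                        best = (err, feat, thresh, sign)
        err, feat, thresh, sign = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # clamp to avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, feat, thresh, sign))
        # Up-weight misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, feat, thresh, sign))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def score(model, x):
    """Weighted vote of all weak learners; positive means foreground."""
    return sum(a * stump_predict(x, f, t, s) for a, f, t, s in model)
```

In a cascade, a proposal whose `score` falls below a chosen rejection threshold would be dropped before the next layer's classifier is evaluated.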
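As shown in the figure above, the first fc layer transforms the features and the second fc layer ensembles the weak learners. At test time, using CRC gives a 3.2x speed-up, and adding SVD on top raises this to 4.6x.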
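The test-time behaviour can be sketched as a chain of filters: each stage scores the surviving proposals and drops those below its threshold, so later stages see fewer of them. The layer names, scorers, and thresholds below are placeholders, not values from the paper, and `run_cascade` is a hypothetical helper.

```python
def run_cascade(proposals, stages):
    """Filter proposals through rejection stages in order.

    proposals: list of dicts mapping a layer name to that layer's pooled features.
    stages: list of (layer_name, scorer, threshold); scorer maps features to a float.
    Returns the surviving proposals and the survivor count after each stage.
    """
    alive = list(proposals)
    counts = []
    for layer, scorer, threshold in stages:
        # Only proposals that survived the earlier stages are scored here.
        alive = [p for p in alive if scorer(p[layer]) >= threshold]
        counts.append(len(alive))
    return alive, counts


if __name__ == "__main__":
    props = [
        {"conv1": [0.1], "conv2": [0.2]},
        {"conv1": [0.9], "conv2": [0.1]},
        {"conv1": [0.8], "conv2": [0.9]},
    ]
    stages = [("conv1", sum, 0.5), ("conv2", sum, 0.5)]
    print(run_cascade(props, stages))
```

The saving comes from the expensive fc layers running only on the survivors; the speed-up figures quoted above are the paper's measurements and are not reproduced by this toy example.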

5 Experiments

The authors use EdgeBox proposals, augmented with ACF.

ACF: P. Dollár, R. Appel, S. Belongie, and P. Perona. Fast feature pyramids for object detection. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 36(8):1532–1545, 2014.

Reposted from:
https://blog.csdn.net/bea_tree/article/details/51880175