Conference: ECCV 2018
Title: "DetNet: A Backbone network for Object Detection"
Paper link: https://arxiv.org/abs/1804.06215
Table of Contents
The main idea
Network architecture
Experimental comparison
Conclusion
DetNet: A backbone network designed for object detection
The main idea
Traditional backbone networks are designed mainly for image classification. Which kind of backbone is better suited to object detection is still an open area of exploration. Recent object detectors based on convolutional neural networks, whether one-stage methods such as YOLO, SSD and RetinaNet or two-stage methods such as Faster R-CNN, R-FCN and FPN, are all fine-tuned from models pre-trained for image classification, which is not optimal for object detection, because detection and classification differ in several ways. 1) Recent detectors such as FPN and RetinaNet usually need extra stages, compared with image classification, to handle objects at multiple scales. 2) Object detection must not only recognize the category an object belongs to but also locate its position. A large down-sampling factor brings a large receptive field, which is good for classification, but it sacrifices spatial resolution, making it hard to localize large objects precisely and to recognize small objects.
To address these problems, the paper proposes DetNet. The key idea of DetNet is a new backbone designed specifically for object detection.
In detail, DetNet adds extra stages, as FPN does, to handle objects of different sizes. Even so, its advantage over ImageNet pre-trained classification models is that it preserves the spatial resolution of the features, although this also increases the computational and memory cost of the network. To keep DetNet efficient, the paper introduces a low-complexity dilated bottleneck. In this way, DetNet obtains both higher resolution and a larger receptive field.
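To make the relation between dilation, resolution and receptive field concrete, here is a minimal sketch (not taken from the paper) that applies the standard receptive-field recurrence to a stack of convolutions. The function name and the three layer lists are illustrative assumptions and do not reflect DetNet's actual configuration.

```python
def receptive_field(layers):
    """Return (receptive_field, total_stride) of one output unit.

    layers: list of (kernel, stride, dilation) tuples, applied in order.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent units
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf, jump


# Hypothetical two-layer stacks of 3x3 convolutions, for illustration only.
plain = [(3, 1, 1), (3, 1, 1)]    # no stride, no dilation
strided = [(3, 2, 1), (3, 1, 1)]  # down-sample by 2 in the first layer
dilated = [(3, 1, 2), (3, 1, 2)]  # dilation 2, resolution unchanged

for name, layers in [("plain", plain), ("strided", strided), ("dilated", dilated)]:
    print(name, receptive_field(layers))
```

In this toy setting the strided stack enlarges the receptive field from 5 to 7 pixels but halves the resolution (total stride 2), while the dilated stack reaches 9 pixels with the total stride still 1. This is the general effect that a dilated block exploits.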
Network Architecture
Because the design principles of classification models do not suit detection, the spatial resolution of the feature maps gradually decreases in standard networks such as VGG16 and ResNet. Techniques such as FPN (shown in Fig. 1A) and dilation are therefore applied to these networks to preserve spatial resolution. However, three problems remain:
1. The backbone network and the detection network have different numbers of stages.
2. Weak visibility of large objects: too much down-sampling blurs the boundaries of large objects, so they are hard to localize accurately.
3. Invisibility of small objects: a large down-sampling factor may also lose the information of small objects.
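The effect of down-sampling on small objects can be illustrated with simple arithmetic. The sketch below is only an illustration: the input size, the object size, the list of strides and the helper names are all assumptions made for this example, and the feature map size assumes padding that rounds up.

```python
import math


def feature_map_size(input_px, stride):
    """Side length of the feature map, assuming padding that rounds up."""
    return math.ceil(input_px / stride)


def cells_covered(object_px, stride):
    """How many feature-map cells an object spans along one side."""
    return object_px / stride


INPUT_PX = 512        # assumed input side length
SMALL_OBJECT_PX = 24  # assumed side length of a small object

for stride in (4, 8, 16, 32, 64):
    size = feature_map_size(INPUT_PX, stride)
    cells = cells_covered(SMALL_OBJECT_PX, stride)
    print(f"stride {stride:2d}: feature map {size}x{size}, object spans {cells:.2f} cells")
```

With these assumed numbers, the 24-pixel object spans 1.5 cells at stride 16 but only 0.75 of a cell at stride 32, which is why information about small objects is easily lost on heavily down-sampled feature maps.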
To be continued...