(CVPR 2019) Semantic Image Segmentation (17): DFANet, Deep Feature Aggregation for Real-Time Semantic Segmentation

Disclaimer: This is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/kevin_zhao_zl/article/details/90200955

Paper: DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation
Project: GitHub link

0. Summary

  This paper presents an extremely efficient network architecture for real-time semantic segmentation. Starting from a single lightweight backbone, it aggregates discriminative features through a cascade of sub-networks and sub-stages. Building on multi-scale feature propagation, DFANet greatly reduces the number of model parameters while still obtaining a sufficient receptive field and strengthening the model's learning ability, striking a good balance between speed and segmentation accuracy. Experimental results show that DFANet needs 8x less computation and runs 2x faster than existing methods without sacrificing accuracy.

1. Introduction

  The paper first points out that most existing semantic segmentation solutions fail to strike a good balance between accuracy and speed. Most approaches spend too much time processing high-resolution feature maps in the shallow layers. Some methods speed up inference by shrinking the input image or reducing the number of channels, but this loses boundary detail and small objects, and a shallow network yields feature maps that are not discriminative enough. To overcome these problems, other methods adopt a multi-branch structure, but the extra branches operating on high-resolution feature maps limit the speed, and because the branches are independent of one another, the model's learning ability suffers.
  Semantic segmentation usually builds on a "funnel-shaped" backbone pre-trained on image classification, such as ResNet, Xception, or DenseNet. For real-time inference, a lightweight backbone has to be made to deliver better results. In mainstream segmentation architectures, pyramid-style feature fusion such as spatial pyramid pooling enriches features with high-level semantic context, but at a notable increase in computation. Moreover, most approaches enrich features only at the final output of a single path and make little use of the earlier features. This raises the question of how to fuse multi-level context into the encoded features in a lightweight way.
  The paper proposes two kinds of cross-level feature aggregation: one reuses the high-level features output by the backbone to bridge semantic information and structural detail; the other combines features from different stages of the network to strengthen the representational capacity of the feature maps. The figure compares, from left to right: multi-branch, spatial pyramid pooling, network-level feature reuse, and stage-level feature reuse.

  The proposed Deep Feature Aggregation Network (DFANet) consists of three parts: a lightweight backbone, a sub-network aggregation module, and a sub-stage aggregation module.
  Because depthwise separable convolution is so efficient, the paper adopts a modified Xception as the backbone and appends a fully connected (FC) attention module at its tail to preserve the maximum receptive field.
  Sub-network aggregation upsamples the feature maps of the previous backbone and feeds them as input to the next backbone to refine the prediction. From another perspective, sub-network aggregation can also be seen as a coarse-to-fine process.
  Sub-stage aggregation fuses feature representations from corresponding stages, passing on both the receptive field and high-dimensional structural detail by combining layers with the same dimensions.
  After these three parts, a simple decoder module generates the final prediction.
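
To make the efficiency argument concrete, the sketch below counts the weights of a standard convolution against a depthwise separable one (a depthwise k x k convolution followed by a 1 x 1 pointwise convolution). The layer sizes are illustrative assumptions, not values quoted from the paper, and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Illustrative layer sizes (assumed for this example only).
k, c_in, c_out = 3, 96, 192
std = standard_conv_params(k, c_in, c_out)
sep = separable_conv_params(k, c_in, c_out)
print("standard:", std, "separable:", sep, "ratio:", round(std / sep, 2))
```

For a 3 x 3 kernel the saving approaches 9x as the number of output channels grows, which is why this kind of layer suits a lightweight backbone.
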
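
The two aggregation modules can be sketched as shape bookkeeping. This is a minimal sketch under stated assumptions, not the authors' implementation: features are fused by channel concatenation, every stage halves the resolution, the previous backbone's output is upsampled by 4x, and the channel widths in `CHANNELS` are illustrative. The helper names (`concat`, `upsample`, `stage`, `run_backbone`) are hypothetical.

```python
CHANNELS = [48, 96, 192]  # illustrative per-stage widths (assumption)


def concat(*shapes):
    """Channel-wise concatenation of (C, H, W) shapes."""
    spatial = shapes[0][1:]
    if any(s[1:] != spatial for s in shapes):
        raise ValueError("spatial sizes differ: %r" % (shapes,))
    return (sum(s[0] for s in shapes),) + spatial


def upsample(shape, factor):
    """Spatial upsampling of a (C, H, W) shape by an integer factor."""
    c, h, w = shape
    return (c, h * factor, w * factor)


def stage(shape, out_channels):
    """Hypothetical encoder stage: halves resolution, sets the channel count."""
    _, h, w = shape
    return (out_channels, h // 2, w // 2)


def run_backbone(x, prev_outputs=None):
    """Run three stages, fusing with the previous backbone's stage outputs."""
    outputs = []
    for i, channels in enumerate(CHANNELS):
        if prev_outputs is not None:
            # Sub-stage aggregation: stage i of the previous backbone has the
            # same spatial size as the input of stage i in this backbone.
            x = concat(x, prev_outputs[i])
        x = stage(x, channels)
        outputs.append(x)
    return outputs


outs = run_backbone((8, 256, 256))        # first backbone, no fusion
for _ in range(2):                        # second and third backbones
    # Sub-network aggregation: upsample the previous backbone's last output
    # and use it as the input of the next backbone.
    outs = run_backbone(upsample(outs[-1], 4), outs)
    print(outs)
```

Under these assumptions each later backbone works at half the resolution of the one before it, which is one way to read the coarse-to-fine description above.
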
  The main contributions of the paper are:

  • State-of-the-art results for real-time semantic segmentation
  • A new segmentation network structure in which multiple interconnected encoding streams fuse high-level semantic information into the encoded features
  • Better use of multi-scale receptive fields, with high-level feature maps refined several times
  • A modified Xception backbone with an FC attention module added at the tail to enhance the receptive field of the feature maps
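
The FC attention idea in the last bullet can be illustrated in plain Python. This is a sketch of channel attention in general, not the paper's exact module: global average pooling produces one value per channel, a fully connected layer turns those values into per-channel gates, and the gates rescale the original feature map. The sigmoid gate and the function name `fc_attention` are assumptions made for the example.

```python
import math


def fc_attention(feature, weights, bias):
    """Rescale each channel of `feature` by a gate computed from global context.

    feature: list of C channels, each an H x W list of floats.
    weights: C x C matrix of a (hypothetical) fully connected layer.
    bias:    list of C floats.
    """
    # 1. Global average pooling: one scalar per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
              for ch in feature]
    # 2. Fully connected layer over the pooled vector.
    logits = [sum(w * p for w, p in zip(w_row, pooled)) + b
              for w_row, b in zip(weights, bias)]
    # 3. Sigmoid gate (an assumption borrowed from common channel attention).
    gates = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    # 4. Channel-wise rescaling of the original features.
    return [[[g * v for v in row] for row in ch]
            for ch, g in zip(feature, gates)]


feature = [[[1.0, 1.0], [1.0, 1.0]],   # channel 0
           [[3.0, 3.0], [3.0, 3.0]]]   # channel 1
identity = [[1.0, 0.0], [0.0, 1.0]]
out = fc_attention(feature, identity, [0.0, 0.0])
print(out[0][0][0], out[1][0][0])
```

Because the gate is computed from a globally pooled vector, every output position is influenced by the whole feature map, which matches the stated goal of enlarging the receptive field at little extra cost.
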

2. Related Work

  • Real-time semantic segmentation: SegNet (pooling indices), ENet (fewer downsampling operations), ESPNet (a new spatial pyramid module), ICNet (multi-scale images as input), BiSeNet (two-path network architecture)
  • Depthwise separable convolution
  • High-level features: combining the positional detail of shallow layers with semantic information
  • Context encoding: FC attention module
  • Feature aggregation

3. DFANet network architecture

  As shown below, the network is an encoder-decoder structure: the encoder aggregates three Xception backbones, linked by sub-network and sub-stage connections, and the decoder is a simple upsampling reconstruction module.
[Figure: DFANet network architecture]
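
As a rough illustration of what an upsampling decoder does, the sketch below bilinearly upsamples a coarse deep feature map to the size of a shallow one and fuses the two by element-wise addition. Fusion by addition, the corner-aligned interpolation, and the function name `bilinear_upsample` are assumptions for this example; the post only states that the decoder is a simple upsampling module.

```python
def bilinear_upsample(grid, factor):
    """Bilinearly upsample a 2D list by an integer factor (corner-aligned)."""
    h, w = len(grid), len(grid[0])
    out_h, out_w = h * factor, w * factor
    out = []
    for i in range(out_h):
        # Map the output row back to a fractional source row.
        y = i * (h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        dy = y - y0
        row = []
        for j in range(out_w):
            x = j * (w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            dx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = grid[y0][x0] * (1 - dx) + grid[y0][x1] * dx
            bottom = grid[y1][x0] * (1 - dx) + grid[y1][x1] * dx
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out


deep = [[0.0, 2.0], [4.0, 6.0]]               # coarse, high-level map
shallow = [[1.0] * 4 for _ in range(4)]       # finer, low-level map
up = bilinear_upsample(deep, 2)
fused = [[s + d for s, d in zip(s_row, d_row)]
         for s_row, d_row in zip(shallow, up)]
for row in fused:
    print([round(v, 2) for v in row])
```
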

4. Experiments

  Different backbone modifications

  Different structures

  Different degrees of feature aggregation

  Outputs of the three DFANet backbones

  Aggregation of shallow and deep features

  Cityscapes benchmark

  CamVid benchmark
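
Assuming, as is standard on these benchmarks, that accuracy is reported as mean intersection-over-union (mIoU), the metric can be sketched as follows. The function name and the toy labels are hypothetical; real evaluation scripts accumulate counts over the whole dataset and handle ignored labels.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that occur in the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:  # skip classes absent from both prediction and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: four pixels flattened into lists of class indices.
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
print(mean_iou(pred, target, num_classes=2))
```
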
