Plain-Language Detailed Interpretation (6) ----- BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation

1. Introduction to the paper

Link to the paper: https://openaccess.thecvf.com/content_ECCV_2018/html/Changqian_Yu_BiSeNet_Bilateral_Segmentation_ECCV_2018_paper.html
So far it has been cited nearly 500 times, which suggests the paper is still attracting a lot of interest, so it is the one covered today.

2. Innovative points and problems to be solved

  • 1. The paper proposes a Spatial Path that preserves rich spatial information. Existing methods either use dilated (atrous) convolution to enlarge the receptive field while keeping the spatial size of the feature map, use spatial pyramid pooling or atrous spatial pyramid pooling to obtain a sufficiently large receptive field, or use large convolution kernels. Both the receptive field and spatial information are crucial for high accuracy, but these methods struggle to deliver speed and accuracy at the same time. In particular, for real-time semantic segmentation, existing methods speed up inference by using small input images or lightweight base models. A smaller input image loses most of the spatial information of the original image, and a lightweight model damages spatial information through channel pruning. To resolve this conflict between speed and accuracy, the paper proposes a Spatial Path that is fast and still preserves spatial information.
  • 2. The paper proposes a Context Path that provides a sufficiently large receptive field. The receptive field is a very important concept in semantic segmentation. Most modern methods enlarge it with atrous spatial pyramid pooling or pyramid pooling, but these operations consume a lot of memory and computation, which results in low speed. To obtain a large receptive field at high computational efficiency, the paper proposes the Context Path.
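The trade-off described above can be made concrete with a little receptive-field arithmetic. The sketch below is not from the paper; it only uses the standard formulas for stacked convolutions, and the three layer stacks are hypothetical examples chosen for illustration. It compares a plain stack, a dilated stack and a strided stack of 3x3 convolutions.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Span, in input pixels, covered by one (possibly dilated) kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers):
    """Return (receptive field, total downsampling factor) of a conv stack.

    `layers` is a list of (kernel_size, stride, dilation) tuples,
    ordered from the first layer to the last.
    """
    rf = 1    # receptive field of one output pixel, in input pixels
    jump = 1  # spacing, in input pixels, between neighbouring outputs
    for kernel_size, stride, dilation in layers:
        rf += (effective_kernel(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf, jump


# Hypothetical three-layer stacks, used only for illustration.
plain = receptive_field([(3, 1, 1)] * 3)
dilated = receptive_field([(3, 1, 1), (3, 1, 2), (3, 1, 4)])
strided = receptive_field([(3, 2, 1)] * 3)

print(plain)    # (7, 1)
print(dilated)  # (15, 1)
print(strided)  # (15, 8)
```

In this toy comparison the dilated stack reaches a receptive field of 15 while staying at full resolution, so every layer still runs on the full-size feature map. The strided stack reaches the same receptive field but works on a map 8 times smaller per side, which is cheaper and loses spatial detail.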

3. Network structure


  • main idea
    • 1. The paper proposes a small-stride Spatial Path that preserves spatial information and produces high-resolution features.
    • 2. The paper proposes a context path to provide sufficient receptive fields.
    • 3. The paper proposes a feature fusion module (Feature Fusion Module) to efficiently combine different features from Spatial path and Context path.
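To see how the two paths fit together, it can help to track feature-map sizes. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the Spatial Path is three stride-2 convolutions (3x3 kernels with padding 1 are an assumption here), that the Context Path yields features at 1/16 and 1/32 of the input size, and that fusion begins by concatenating channels. The input size and the channel counts are hypothetical.

```python
def conv_out(size: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Output length of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def spatial_path_shape(height: int, width: int, num_layers: int = 3):
    """Feature-map size after `num_layers` stride-2 convolutions (assumed 3x3)."""
    for _ in range(num_layers):
        height, width = conv_out(height), conv_out(width)
    return height, width


def context_path_shapes(height: int, width: int):
    """Assumed Context Path outputs at 1/16 and 1/32 of the input size."""
    return (height // 16, width // 16), (height // 32, width // 32)


# Hypothetical input size, used only for illustration.
h, w = 768, 1536
sp = spatial_path_shape(h, w)
cp16, cp32 = context_path_shapes(h, w)

print(sp)    # (96, 192)
print(cp16)  # (48, 96)
print(cp32)  # (24, 48)

# Upsampling factors needed to match the Spatial Path before fusion.
print(sp[0] // cp16[0], sp[0] // cp32[0])  # 2 4

# Hypothetical channel counts; concatenation simply adds them.
spatial_channels, context_channels = 128, 128
print(spatial_channels + context_channels)  # 256
```

Under these assumptions the Spatial Path keeps a map at 1/8 of the input size, while the Context Path works on much smaller maps and has to be upsampled before the Feature Fusion Module combines the two.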

4. Experimental hyperparameters

Optimizer               mini-batch stochastic gradient descent (SGD)
Momentum                0.9
Initial learning rate   2.5e-2
Learning rate policy    poly

Only a few common hyperparameters are listed here; please refer to the original paper for the full details.
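For readers unfamiliar with the "poly" policy, here is a minimal sketch of how it is usually computed: the initial learning rate is multiplied by (1 - iter / max_iter) ** power at each iteration. The initial rate of 2.5e-2 comes from the list above. The power of 0.9 is not stated in this post and should be treated as an assumption to check against the paper, and `max_iterations` is a hypothetical value.

```python
def poly_lr(iteration: int, max_iterations: int,
            base_lr: float = 2.5e-2, power: float = 0.9) -> float:
    """Learning rate under the 'poly' decay policy.

    power=0.9 is an assumed default; verify it against the paper.
    """
    if not 0 <= iteration <= max_iterations:
        raise ValueError("iteration must lie in [0, max_iterations]")
    return base_lr * (1 - iteration / max_iterations) ** power


# Hypothetical schedule length, used only for illustration.
max_iterations = 1000
for it in (0, 250, 500, 750, 1000):
    print(it, poly_lr(it, max_iterations))
```

The rate starts at the initial value, decays slightly faster than linearly for most of training, and reaches zero at the final iteration.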

5. Experiments

(1) Cityscapes data set

  • 1. Ablation for the spatial path
  • 2. Accuracy and parameter analysis of our baseline model
  • 3. Speed comparison of our method against other state-of-the-art methods
  • 4. Accuracy and speed comparison of our method against other state-of-the-art methods on the Cityscapes test dataset
  • 5. Accuracy comparison of our method against other state-of-the-art methods on the Cityscapes test dataset
  • 6. Accuracy results on the CamVid test dataset
  • 7. Accuracy results on the COCO-Stuff validation dataset

Origin blog.csdn.net/dongjinkun/article/details/114482265