Connectivity-related papers and code collection

2018

Non-local Neural Networks

code: https://paperswithcode.com/paper/non-local-neural-networks

Abstract: Both convolutional and recurrent operations are building blocks that process one local neighborhood at a time. In this paper, we propose non-local operations as a generic family of building blocks for capturing long-range dependencies. Inspired by the classical non-local means method [4] in computer vision, our non-local operation computes the response at a position as a weighted sum of the features at all positions. This building block can be plugged into many computer vision architectures. On the task of video classification, even without any bells and whistles, our non-local models can compete with or outperform current competition winners on the Kinetics and Charades datasets. In static image recognition, our non-local models improve object detection/segmentation and pose estimation.
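
The core of the paper is the generic non-local operation y_i = (1/C(x)) Σ_j f(x_i, x_j) g(x_j): the response at position i is a weighted sum over all positions j. Below is a minimal PyTorch sketch of the embedded-Gaussian variant; the layer names, channel-reduction factor, and residual placement are illustrative assumptions, not the authors' exact implementation:

```python
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    # Embedded-Gaussian non-local block: every position attends to all others.
    def __init__(self, channels, reduction=2):
        super().__init__()
        inter = channels // reduction
        self.theta = nn.Conv2d(channels, inter, 1)  # query embedding
        self.phi = nn.Conv2d(channels, inter, 1)    # key embedding
        self.g = nn.Conv2d(channels, inter, 1)      # value embedding
        self.out = nn.Conv2d(inter, channels, 1)    # restore channel count

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).reshape(b, -1, h * w).transpose(1, 2)  # (B, N, C')
        k = self.phi(x).reshape(b, -1, h * w)                    # (B, C', N)
        v = self.g(x).reshape(b, -1, h * w).transpose(1, 2)      # (B, N, C')
        attn = torch.softmax(q @ k, dim=-1)                      # (B, N, N)
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)  # residual connection
```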


2019

ConnNet: A Long-Range Relation-Aware Pixel-Connectivity Network for Salient Segmentation

Abstract: Salient segmentation, a critical yet challenging task, underlies many advanced computer vision applications. It requires semantic-aware grouping of pixels into salient regions and benefits from leveraging global multi-scale context for good local reasoning. Previous work has often addressed it as a two-class segmentation problem, utilizing complex multi-step procedures including refinement networks and complex graphical models. We argue that semantic salient segmentation can instead be effectively resolved by reformulating it as a simple and intuitive pixel-pair connectivity prediction task. Following the intuition that salient objects can be naturally grouped via semantic-aware connectivity between neighboring pixels, we propose a pure connectivity network (ConnNet). ConnNet predicts the connectivity probability of each pixel with its neighboring pixels by exploiting multi-level cascade contexts and long-range pixel relations embedded in the image. We investigate our approach on two tasks, salient object segmentation and salient instance segmentation, and show that consistent improvements can be obtained by modeling these tasks as connectivity prediction rather than binary segmentation for a variety of network architectures. We achieve state-of-the-art performance, outperforming or being comparable to existing methods while reducing inference time due to our less complex approach.
Contributions of the paper:

  • We illustrate that connectivity modeling can be a good alternative to traditional segmentation approaches for salient segmentation tasks. Comparing our method with the same architecture trained for segmentation, we find that ConnNet outperforms the segmentation network on a wide range of benchmark datasets.
  • We develop a method for salient object segmentation that not only outperforms previous state-of-the-art methods on several datasets, but also greatly reduces inference time due to its simplicity. We also extend this idea to the task of instance-level saliency segmentation.
  • We investigate the impact of different pixel connectivity modeling methods on the overall performance.
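
To make the reformulation concrete: ConnNet-style training replaces the single binary saliency label with one channel per neighbor direction. Below is a minimal sketch, assuming 8-connectivity, of how such connectivity labels could be derived from a binary mask; the function name and tensor layout are illustrative, not the authors' code:

```python
import torch
import torch.nn.functional as F

def connectivity_labels(mask):
    # mask: (B, 1, H, W) binary saliency mask (0/1 floats).
    # Returns (B, 8, H, W): channel k is 1 where a pixel and its
    # k-th 8-neighbour are both salient (i.e., they are "connected").
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    padded = F.pad(mask, (1, 1, 1, 1))  # zero-pad so borders have no links
    H, W = mask.shape[-2:]
    chans = []
    for dy, dx in offsets:
        nb = padded[..., 1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
        chans.append(mask * nb)
    return torch.cat(chans, dim=1)
```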

A Cost Effective Solution for Road Crack Inspection using Cameras and Deep Neural Networks

Abstract: Automatic detection of pavement cracks is an important research field for the development of intelligent transportation infrastructure systems. This paper presents a cost-effective solution for road crack inspection, enabled by mounting a commercial-grade action camera (GoPro) on the rear of a moving vehicle. A road crack detection method combining a conditional Wasserstein generative adversarial network (cWGAN) and connectivity maps is also proposed. The method uses a 121-layer densely connected neural network with deconvolution layers for multi-level feature fusion as the generator, and a 5-layer fully convolutional network as the discriminator. To overcome the scattered-output problem associated with deconvolution layers, connectivity maps are introduced to represent the crack information. The proposed method is tested on a publicly available dataset as well as data collected by us. The results show that the proposed method achieves state-of-the-art performance in terms of precision, recall, and F1 score compared with other existing methods.
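
The adversarial part follows the standard conditional WGAN formulation, in which the critic scores (image, crack-map) pairs. The snippet below is a generic sketch of that objective under assumed tensor layouts; the `critic` module, the function name, and the omission of the gradient penalty are simplifications, not the paper's implementation:

```python
import torch

def cwgan_losses(critic, image, fake_mask, real_mask):
    # Condition the critic on the input image by concatenating it
    # channel-wise with the generated or ground-truth crack map.
    fake_pair = torch.cat([image, fake_mask], dim=1)
    real_pair = torch.cat([image, real_mask], dim=1)
    # Wasserstein critic loss (gradient-penalty term omitted for brevity).
    d_loss = critic(fake_pair.detach()).mean() - critic(real_pair).mean()
    # The generator tries to maximize the critic's score on fake pairs.
    g_loss = -critic(fake_pair).mean()
    return d_loss, g_loss
```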

Asymmetric Non-local Neural Networks for Semantic Segmentation

code: https://paperswithcode.com/paper/asymmetric-non-local-neural-networks-for
Abstract: The non-local module serves as a particularly useful technique for semantic segmentation, while being criticized for its prohibitive computation cost and GPU memory footprint. In this paper, we propose Asymmetric Non-local Neural Networks for semantic segmentation, which have two prominent components: the Asymmetric Pyramid Non-local Block (APNB) and the Asymmetric Fusion Non-local Block (AFNB). APNB leverages a pyramid sampling module inside the non-local block, greatly reducing computation and memory consumption without sacrificing performance. AFNB is adapted from APNB and fuses features of different levels with full consideration of long-range dependencies, resulting in considerably improved performance. Extensive experiments on semantic segmentation benchmarks demonstrate the effectiveness and efficiency of our work. In particular, we report state-of-the-art performance of 81.3 mIoU on the Cityscapes test set. For a 256×128 input, APNB is about 6 times faster than a non-local block on GPU and occupies 28 times less GPU running memory.
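
The asymmetry in APNB is that only the keys and values are subsampled: queries stay at full resolution N = H×W while keys/values are pyramid-pooled down to S ≪ N points, shrinking the affinity matrix from N×N to N×S. Below is a minimal PyTorch sketch under assumed pool sizes and layer names, not the authors' exact code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class APNBlock(nn.Module):
    # Asymmetric non-local block: full-resolution queries, pooled keys/values.
    def __init__(self, channels, key_channels=64, pool_sizes=(1, 3, 6, 8)):
        super().__init__()
        self.query = nn.Conv2d(channels, key_channels, 1)
        self.key = nn.Conv2d(channels, key_channels, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.pool_sizes = pool_sizes

    def pyramid_sample(self, x):
        # Concatenate adaptive-average-pooled grids: S = 1+9+36+64 = 110.
        b, c = x.shape[:2]
        samples = [F.adaptive_avg_pool2d(x, s).reshape(b, c, -1)
                   for s in self.pool_sizes]
        return torch.cat(samples, dim=2)                          # (B, C, S)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).reshape(b, -1, h * w).transpose(1, 2)   # (B, N, Ck)
        k = self.pyramid_sample(self.key(x))                      # (B, Ck, S)
        v = self.pyramid_sample(self.value(x)).transpose(1, 2)    # (B, S, C)
        attn = torch.softmax(q @ k, dim=-1)                       # (B, N, S)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return out + x
```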

2020

Real-time Semantic Segmentation with Fast Attention

code: https://github.com/feinanshan/FANet
Abstract: In deep CNN-based models for semantic segmentation, high accuracy relies on rich spatial context (large receptive fields) and fine spatial details (high resolution), both of which incur high computational costs. In this paper, we propose a novel architecture that addresses both challenges and achieves state-of-the-art performance for semantic segmentation of high-resolution images and videos. The proposed architecture relies on our fast spatial attention, a simple yet efficient modification of the popular self-attention mechanism that captures the same rich spatial context at a small fraction of the computational cost by changing the order of operations. Moreover, to efficiently process high-resolution input, we apply an additional spatial reduction to intermediate feature stages of the network, with minimal loss in accuracy thanks to the use of the fast attention module to fuse features. We validate our method with a series of experiments that show superior performance in accuracy and speed on multiple datasets compared to existing real-time semantic segmentation methods. On Cityscapes, our network achieves 74.4% mIoU at 72 FPS and 75.5% mIoU at 58 FPS on a single Titan X GPU, which is ∼50% faster than the state of the art while retaining the same accuracy.
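
The "changed order of operations" is the key trick: if the softmax is replaced by L2 normalization of queries and keys (a cosine-similarity affinity), the attention product factorizes and can be re-associated as Q(KᵀV), which is linear rather than quadratic in the number of positions. A minimal sketch of this reordering follows; the tensor shapes and normalization details are assumptions based on the abstract, not the released code:

```python
import torch
import torch.nn.functional as F

def fast_attention(q, k, v):
    """q, k, v: (B, N, C) flattened spatial features.
    L2-normalizing q and k (cosine affinity, no softmax) lets the product
    be re-associated as q @ (k^T v): O(N*C^2) instead of O(N^2*C)."""
    n = q.shape[1]
    q = F.normalize(q, dim=-1)
    k = F.normalize(k, dim=-1)
    context = k.transpose(1, 2) @ v   # (B, C, C), computed first
    return (q @ context) / n          # (B, N, C)
```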

Disentangled Non-Local Neural Networks

code: https://github.com/yinh17/DNL-Semantic-Segmentation
Abstract: The non-local block is a popular module for strengthening the context modeling ability of a regular convolutional neural network. This paper first studies the non-local block in depth and finds that its attention computation can be split into two terms: a whitened pairwise term accounting for the relationship between two pixels, and a unary term representing the saliency of every pixel. We also observe that the two terms, trained alone, tend to model different visual cues, e.g., the whitened pairwise term learns within-region relationships while the unary term learns salient boundaries. However, the two terms are tightly coupled in the non-local block, which hinders the learning of each. Based on these findings, we present the disentangled non-local block, where the two terms are decoupled to facilitate learning for both. We demonstrate the effectiveness of the decoupled design on various tasks, including semantic segmentation on Cityscapes, ADE20K and PASCAL-Context, object detection on COCO, and action recognition on Kinetics. The code will be made publicly available.
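
The decomposition can be illustrated as a whitened pairwise term plus a unary term. The sketch below shows this split under assumed shapes; the learned unary query `m` and the exact normalization are illustrative simplifications of the paper's formulation:

```python
import torch

def disentangled_attention(q, k, m):
    # q, k: (B, N, C) query/key embeddings; m: (B, 1, C) learned unary query.
    # Whitened pairwise term: subtract per-image means from q and k so the
    # dot product captures pure pairwise (within-region) relations.
    qw = q - q.mean(dim=1, keepdim=True)
    kw = k - k.mean(dim=1, keepdim=True)
    pairwise = torch.softmax(qw @ kw.transpose(1, 2), dim=-1)  # (B, N, N)
    # Unary term: per-pixel saliency of the keys, shared by all queries.
    unary = torch.softmax(m @ k.transpose(1, 2), dim=-1)       # (B, 1, N)
    # Broadcast adds the same unary map to every query's attention row;
    # the result would then be applied to the values as usual.
    return pairwise + unary
```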

Non-local U-Nets for Biomedical Image Segmentation

code: https://paperswithcode.com/paper/global-deep-learning-methods-for

Abstract: Deep learning has shown great promise in various biomedical image segmentation tasks. Existing models are typically based on U-Net and rely on an encoder-decoder architecture with stacked local operators to progressively aggregate long-range information. However, using only local operators limits both efficiency and effectiveness. In this work, we propose Non-local U-Nets, equipped with flexible global aggregation blocks, for biomedical image segmentation. These blocks can be inserted into U-Net as size-preserving processes, as well as down-sampling and up-sampling layers. We conduct thorough experiments on the 3D multimodality isointense infant brain MR image segmentation task to evaluate the Non-local U-Nets. Results show that our proposed models achieve top performance with fewer parameters and faster computation.
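
A global aggregation block used as a down-sampling layer can be sketched as attention in which the queries are produced at the reduced resolution while keys and values attend over the full-resolution input, so every output location aggregates information from all input locations. A minimal 2D PyTorch sketch follows (the paper works in 3D; the layer choices here are illustrative assumptions):

```python
import torch
import torch.nn as nn

class GlobalAggregationDown(nn.Module):
    # Down-sampling via global aggregation: stride-2 queries, global keys/values.
    def __init__(self, channels, key_channels=64):
        super().__init__()
        self.query = nn.Conv2d(channels, key_channels, 3, stride=2, padding=1)
        self.key = nn.Conv2d(channels, key_channels, 1)
        self.value = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x)                                  # (B, Ck, H/2, W/2)
        ho, wo = q.shape[-2:]
        q = q.reshape(b, -1, ho * wo).transpose(1, 2)      # (B, M, Ck)
        k = self.key(x).reshape(b, -1, h * w)              # (B, Ck, N)
        v = self.value(x).reshape(b, c, h * w).transpose(1, 2)  # (B, N, C)
        attn = torch.softmax(q @ k, dim=-1)                # (B, M, N)
        return (attn @ v).transpose(1, 2).reshape(b, c, ho, wo)
```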

2021

Vectorization of Historical Maps Using Deep Edge Filtering and Closed Shape Extraction

code: https://github.com/soduco/ICDAR-2021-Vectorization

Abstract: Maps have been a unique source of knowledge for centuries. Such historical documents provide invaluable information for analyzing complex spatial changes in the landscape over important time frames. This is especially true for urban areas that encompass multiple interleaved research domains (social sciences, economics, etc.). The large and significant diversity of map sources calls for automatic image processing techniques to extract the relevant objects as vector shapes. The complexity of maps (text, noise, digitization artifacts, etc.) has hindered the capacity to produce a versatile and efficient raster-to-vector approach for decades. We propose a learnable, reproducible, and reusable solution for the automatic transformation of raster maps into vector objects (building blocks, streets, rivers), built on the complementary strengths of mathematical morphology and convolutional neural networks through efficient edge filtering. Furthermore, we modify ConnNet and combine it with the deep edge filtering architecture to exploit pixel connectivity information and build an end-to-end system that does not require any post-processing. In this paper, we focus on a comprehensive benchmark of various architectures on multiple datasets, coupled with a novel vectorization step. Our experimental results on a new public dataset using the COCO Panoptic Quality (PQ) metric are very encouraging, as confirmed by a qualitative analysis of the success and failure cases of our method.

BiconNet: An Edge-preserved Connectivity-based Approach for Salient Object Detection

code: https://github.com/Zyun-Y/BiconNets

Abstract: Conventional deep learning-based methods treat salient object detection (SOD) as a pixel-level saliency modeling task. A limitation of current SOD models is insufficient utilization of inter-pixel information, which usually results in imperfect segmentation of near-edge regions and low spatial coherence. As we demonstrate, using the saliency mask as the only label is suboptimal. To address this limitation, we propose a connectivity-based method, called Bilateral Connectivity Network (BiconNet), which uses a connectivity mask together with the saliency mask as labels to effectively model pixel-to-pixel relationships and object saliency. Furthermore, we propose a bilateral voting module to enhance the output connectivity map, and a novel edge feature enhancement method that efficiently utilizes edge-specific features. Through comprehensive experiments on five benchmark datasets, we demonstrate that our proposed method can be plugged into any existing state-of-the-art saliency-based SOD framework to improve its performance with negligible parameter increase.
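
The bilateral voting module keeps a directional connectivity prediction only when both endpoints agree: each of the eight channels is multiplied by the spatially shifted reciprocal channel of its neighbor. A minimal sketch under an assumed channel ordering, not the authors' released code:

```python
import torch
import torch.nn.functional as F

def bilateral_voting(conn):
    # conn: (B, 8, H, W) predicted connectivity probabilities; channel k is
    # the link from each pixel to its k-th 8-neighbour. A link is kept only
    # if both endpoints agree, so each channel is multiplied by the shifted
    # reciprocal channel of the neighbour it points to.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    padded = F.pad(conn, (1, 1, 1, 1))
    H, W = conn.shape[-2:]
    voted = []
    for idx, (dy, dx) in enumerate(offsets):
        recip = 7 - idx  # opposite direction under this channel ordering
        nb = padded[:, recip:recip + 1, 1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
        voted.append(conn[:, idx:idx + 1] * nb)
    return torch.cat(voted, dim=1)
```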

Contributions of the paper:

  • We propose a connectivity-based SOD framework, called BiconNet, to explicitly model pixel connectivity, enhance edge modeling, and preserve the spatial coherence of salient regions. BiconNet can be easily plugged into any existing SOD model with negligible parameter increase.
  • We propose an efficient, connectivity-based edge feature extraction method that directly emphasizes edge-specific information in the network output. We also introduce a new loss function, the Bicon loss, to further improve the utilization of edge features and maintain the spatial consistency of the output.
  • We build BiconNet on the backbones of seven state-of-the-art SOD models. By comparing these BiconNets with the corresponding baselines, we show that our models outperform the latter on five widely used benchmarks under different evaluation metrics.

Introducing the Boundary-Aware loss for deep image segmentation (BMVC)

code: https://github.com/onvungocminh/MBD_BAL

Abstract: Most contemporary supervised image segmentation methods do not preserve the initial topology of the given input (such as the closeness of contours). When comparing a binary prediction with the ground truth, one can usually notice that edge points have been inserted or removed. This can be critical when accurate localization of multiple interconnected objects is required. In this paper, a new loss function, the Boundary-Aware Loss (BALoss), based on the Minimum Barrier Distance (MBD) cut algorithm, is proposed. It is able to locate what we call leakage pixels and to encode the boundary information coming from the given ground truth. Thanks to this adapted loss, we are able to significantly refine the quality of the predicted boundaries during the learning procedure. Furthermore, our loss function is differentiable and can be applied to any kind of neural network used in image processing. We apply this loss function to the standard U-Net and DC U-Net on electron microscopy datasets, which are known for their high noise levels and for the challenge of separating close or even connected objects in image space. Our segmentation performance, in terms of Variation of Information (VOI) and Adjusted Rand Index (ARI), is very promising, yielding ∼15% better VOI scores and ∼5% better ARI scores than the state of the art.
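
BALoss itself relies on an MBD-based cut to locate leakage pixels, which is beyond a short snippet. As a much simpler illustration of the general idea of emphasizing boundary pixels in a differentiable loss, the sketch below extracts a ground-truth boundary band with a morphological gradient and up-weights the cross-entropy there; this is a stand-in for intuition, not the paper's MBD algorithm:

```python
import torch
import torch.nn.functional as F

def boundary_weighted_bce(pred, target, alpha=4.0, width=3):
    # pred, target: (B, 1, H, W), pred in (0, 1), target binary.
    # Morphological gradient via max-pooling: dilation minus erosion
    # yields a thin band around the ground-truth boundaries.
    dil = F.max_pool2d(target, width, stride=1, padding=width // 2)
    ero = 1.0 - F.max_pool2d(1.0 - target, width, stride=1, padding=width // 2)
    boundary = (dil - ero).clamp(0, 1)
    weights = 1.0 + alpha * boundary  # up-weight pixels near boundaries
    return F.binary_cross_entropy(pred, target, weight=weights)
```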


Origin: blog.csdn.net/weixin_42990464/article/details/123270660