【Sparse Convolution】Submanifold Sparse Convolutional Networks


Publication time: 2017 (arXiv)

Authors: Benjamin Graham, Laurens van der Maaten (Facebook AI Research)

Code address

Many recent point cloud networks use this method, so it is worth reading.

Summary

The data processed by convolutions (such as images) is generally dense, but some data is naturally sparse (such as point clouds, or pen strokes on paper). Applying a dense convolutional network directly to such sparse data is very inefficient. This paper introduces a sparse convolution operation tailored to sparse data. Unlike previous work on sparse convolutional networks, it operates strictly on submanifolds instead of dilating the set of observations at every layer of the network.

(In short, it is a convolution well suited to processing sparse data.)

Why can't we use normal convolution operations to process sparse data?

"Submanifold" dilation problem

Left: the original input, a hand-drawn circle — a one-dimensional curve embedded in a two-dimensional grid.

Middle: the result after one 3x3 convolution.

Right: the result after two 3x3 convolutions.

The input in the figure above is an example from handwritten digit recognition. The author calls the problem shown there the "submanifold" dilation problem. The original data is very sparse, but after a few traditional convolutions the sparsity disappears quickly.

What if, instead of convolving all pixels, we convolve only the white pixels on the curve?

As a result, a lot of information would be lost and the input could no longer be classified.

Note: the author calls these meaningful white pixels the active sites.

 

What is a submanifold?

From the paper: "We use the term 'submanifold' to refer to input data that is sparse because it has a lower effective dimension than the space in which it lives, for example a one-dimensional curve in 2+ dimensional space, or a two-dimensional surface in 3+ dimensional space."


Sparse convolutions SC and VSC

The author proposes two convolution operations: SC and VSC.

(1) SC convolution operation

(2) VSC convolution operation
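The key difference between the two operations is which output sites they make active. This is an illustrative sketch (assumed from the paper's description, not its implementation): SC activates an output site if any input in its receptive field is active, while VSC (the valid/submanifold convolution) keeps the output activity pattern identical to the input's.

```python
import numpy as np

def conv_active_sites(mask, mode):
    """Which output sites are active after a 3x3 sparse convolution.

    mode='SC':  an output site is active if ANY input in its 3x3
                receptive field is active (sparsity dilates).
    mode='VSC': an output site is active only if the CENTRE input
                site is active (the sparsity pattern is preserved).
    """
    if mode == "VSC":
        return mask.copy()
    padded = np.pad(mask, 1)
    out = np.zeros_like(mask)
    for di in range(3):
        for dj in range(3):
            out |= padded[di:di + mask.shape[0], dj:dj + mask.shape[1]]
    return out

curve = np.zeros((7, 7), dtype=bool)
curve[3, 1:6] = True  # a 1-D stroke on a 2-D grid, 5 active sites

print(conv_active_sites(curve, "SC").sum())   # 21: sparsity lost
print(conv_active_sites(curve, "VSC").sum())  # 5: sparsity kept
```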

(3) Activation function and pooling function

(4) Calculation and memory overhead

The following figure shows the computation and memory overhead at active and non-active sites for traditional convolution C, SC, and VSC.

Context: the filter size is 3, convolving a single d-dimensional location.

Computation and memory overhead of traditional convolution C, SC, VSC

Here, a is the number of active sites, m is the number of input channels, and n is the number of output channels.
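The per-site costs can be compared with a quick calculation. This is a rough sketch of the multiply-add counts implied by the table (treat the numbers as order-of-magnitude estimates): dense convolution touches all 3^d inputs at every site, while SC/VSC touch only the a active inputs in the receptive field.

```python
def flops_per_site(d, a, m, n, conv):
    """Approximate multiply-add cost of one output site (filter size 3).

    d: spatial dimension, a: active inputs in the receptive field,
    m/n: input/output channels.
    """
    if conv == "C":          # dense conv reads all 3^d inputs
        return (3 ** d) * m * n
    return a * m * n         # SC / VSC read only active inputs

# e.g. 3-D data, 2 active inputs in a 27-site neighbourhood, 64 -> 64 channels
print(flops_per_site(3, 2, 64, 64, "C"))    # 110592
print(flops_per_site(3, 2, 64, 64, "SC"))   # 8192
```

For very sparse 3-D data (a much smaller than 27), the saving per site is more than an order of magnitude, and non-active sites cost nothing at all.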

Constructing common networks with the proposed convolutions

Modules used for building sub-manifold sparse convolutional networks

The authors used the proposed VSC and SC to build the modules of several very popular networks: VGG, ResNet, and DenseNet.

(a) VGG block: composed of two VSCs and a max pooling

(b) ResNet block that keeps the input and output resolution unchanged: the output of two VSCs is added to the input

(c) ResNet block with reduced resolution

(d) DenseNet block that keeps the input and output resolution unchanged

(e) DenseNet block with reduced resolution
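The block structures are easy to express as function composition. Below is a toy sketch of block (a), the VGG block: the `vsc` stand-in only illustrates the interface (it updates features at active sites and leaves the mask untouched; the arithmetic inside it is a placeholder, not a real convolution), and pooling downsamples both the features and the activity mask.

```python
import numpy as np

def vsc(features, mask):
    """Stand-in for a valid sparse convolution: transforms features only at
    active sites and preserves the sparsity mask (placeholder arithmetic)."""
    out = np.zeros_like(features)
    out[mask] = features[mask] * 0.5 + 1.0
    return out, mask

def max_pool(features, mask):
    """2x2 max pooling applied to both the features and the activity mask."""
    h, w = features.shape
    f = features[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
    m = mask[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).any(axis=(1, 3))
    return f, m

def vgg_block(features, mask):
    """Block (a): two VSCs followed by a max pooling."""
    features, mask = vsc(features, mask)
    features, mask = vsc(features, mask)
    return max_pool(features, mask)

feats = np.arange(64, dtype=float).reshape(8, 8)
diag_mask = np.eye(8, dtype=bool)
pooled_f, pooled_m = vgg_block(feats, diag_mask)
print(pooled_m.shape, pooled_m.sum())  # (4, 4) 4: resolution halves, sparsity pattern survives
```

The ResNet and DenseNet blocks compose the same `vsc` pieces with addition and concatenation, respectively.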

 

Implementation process

(1) Several key parts are involved:

 

(2) The implementation of SC:

(3) The implementation of VSC

The only difference from SC is the construction of the rule book.
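A rule book for VSC can be sketched as follows. This is a toy reconstruction, assumed from the description above rather than copied from the paper's code: a hash table maps each active site's coordinates to its row in the feature matrix, and for each filter offset the rule book lists (input_row, output_row) pairs so a kernel can gather, multiply by that offset's weights, and scatter-add.

```python
def build_rulebook(active_sites):
    """Toy rule book for a 3x3 VSC in 2-D (names are illustrative).

    Output sites equal input sites (that is what makes it 'valid').
    For each filter offset, collect (input_row, output_row) pairs.
    """
    index = {site: row for row, site in enumerate(active_sites)}  # hash table
    rules = {}
    for out_row, (x, y) in enumerate(active_sites):
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                in_site = (x + dx, y + dy)
                if in_site in index:
                    rules.setdefault((dx, dy), []).append((index[in_site], out_row))
    return rules

sites = [(0, 0), (0, 1), (5, 5)]
rb = build_rulebook(sites)
print(rb[(0, 1)])  # [(1, 0)]: site (0, 1) feeds site (0, 0) at offset (0, 1)
print(rb[(0, 0)])  # every active site pairs with itself at the centre offset
```

For SC the same construction would run over the (larger) set of output sites reachable from any active input, which is exactly why the rule book is the only part that differs.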

Experiments

Results on the ModelNet-40 dataset.

It can be seen that the accuracy of VSC is not much worse than that of traditional convolution C, while speed and memory usage are greatly improved.

 

Finally

If the results had included a visualization like the one below, it would better demonstrate that SC/VSC does not reduce the data's sparsity.

But there is none — try visualizing it yourself.

This article was not published on its own, but appeared later as part of "3D Semantic Segmentation with Submanifold Sparse Convolutional Networks".

If some details of this article are unclear, that's okay — read the clearer and more detailed "3D Semantic Segmentation with Submanifold Sparse Convolutional Networks". Somewhat frustratingly, I only read that paper after writing this post, and many of the details I had to guess at here are stated explicitly there. Go read it; it is much clearer.

https://blog.csdn.net/zcgyq/article/details/83088283


Origin blog.csdn.net/u013066730/article/details/108512125