Understanding the pooling pyramid series: SPP and ASPP


The problem

Before SPP, neural networks required the input image to have a fixed size, such as the commonly seen 224 × 224. The input image had to be resized to 224 × 224 before entering the network, which distorted the image and deformed its features, limiting recognition accuracy.
SPP and ASPP were proposed to solve this problem: they let the network accept images without resizing.

SPP structure

[Figure: SPP structure diagram]
Looking at this chart alone, many readers (myself included) will be confused. Other blog posts explain it with fairly terse text and few familiar terms; it only became clear to me when I saw the figure below:
[Figure: SPP pooling at three scales] The leftmost part of the figure is the 256-dimensional convolutional feature map. Each region (with a depth of 256) is pooled in three ways:

(1) Pool the entire feature map directly, obtaining one pooled value per channel, i.e. a single 1×256 vector.

(2) Divide the feature map into 2×2 = 4 parts and pool each part separately, each yielding a 1×256 vector, for a total of 2×2 = 4 vectors of size 1×256.

(3) Divide the feature map into 4×4 = 16 parts and pool each part separately, yielding 4×4 = 16 vectors of size 1×256.

Concatenating the results of the three pooling scales gives (1 + 4 + 16) × 256 = 21 × 256 features.

As the figure shows, the whole process is completely independent of the input size, so candidate regions of arbitrary size can be processed.

The spatial pyramid pooling layer is in fact an adaptive pooling layer: no matter what size the input is, the output is fixed (21 × channels).
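The three-scale pooling above can be sketched in a few lines of NumPy. This is a minimal illustration, not the original SPP implementation; the function names `adaptive_max_pool` and `spp` are my own:

```python
import numpy as np

def adaptive_max_pool(feat, n):
    """Max-pool a (C, H, W) feature map down to a fixed (C, n, n) grid."""
    C, H, W = feat.shape
    out = np.empty((C, n, n), dtype=feat.dtype)
    for i in range(n):
        for j in range(n):
            # bin boundaries chosen so the n bins cover the whole map
            h0, h1 = (i * H) // n, -((-(i + 1) * H) // n)
            w0, w1 = (j * W) // n, -((-(j + 1) * W) // n)
            out[:, i, j] = feat[:, h0:h1, w0:w1].max(axis=(1, 2))
    return out

def spp(feat, levels=(1, 2, 4)):
    """Concatenate the pooled values from all pyramid levels into one
    fixed-length vector, regardless of the input's H and W."""
    return np.concatenate(
        [adaptive_max_pool(feat, n).reshape(-1) for n in levels]
    )

feat = np.random.rand(256, 13, 17)   # arbitrary spatial size
print(spp(feat).shape)               # (5376,) since (1+4+16) * 256 = 5376
```

Feeding in a feature map of any other spatial size produces a vector of exactly the same length, which is the whole point of SPP.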

ASPP structure

Before introducing ASPP, we first need to introduce atrous convolution (dilated convolution), a method for enlarging the receptive field. It was proposed for FCN-style semantic segmentation, where the output must match the input size and therefore requires upsampling; because FCN uses pooling to enlarge the receptive field while reducing resolution, upsampling cannot recover the detail lost to pooling. To reduce this loss, the pooling layers naturally need to be removed, and dilated convolution was born.
We will not describe ordinary convolution here; an animation of dilated convolution makes it clear at a glance:
[Animation: dilated convolution]
Dilated convolution is easy to understand literally: holes are injected into a standard convolution to enlarge the receptive field. Compared with ordinary convolution, it has one extra parameter called the dilation rate, the spacing between kernel taps (ordinary convolution has dilation rate = 1).
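A quick way to see what the dilation rate does is a 1-D version: with rate r, a kernel of size k covers an effective span of k + (k − 1)(r − 1) inputs. This is an illustrative sketch (the function name `dilated_conv1d` is my own), not any library's API:

```python
import numpy as np

def dilated_conv1d(x, kernel, rate):
    """1-D dilated (atrous) convolution with valid padding.
    The kernel taps are spaced `rate` apart, so a size-k kernel
    covers an effective span of k + (k - 1) * (rate - 1) inputs."""
    k = len(kernel)
    span = k + (k - 1) * (rate - 1)       # effective kernel size
    return np.array([
        sum(kernel[j] * x[i + j * rate] for j in range(k))
        for i in range(len(x) - span + 1)
    ])

x = np.arange(10, dtype=float)
print(dilated_conv1d(x, [1, 1, 1], rate=1))  # ordinary convolution
print(dilated_conv1d(x, [1, 1, 1], rate=2))  # taps at i, i+2, i+4
```

The rate-2 output is shorter because the same 3-tap kernel now spans 5 input positions: a larger receptive field with no extra weights.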
However, dilated convolution also has some potential problems:

Potential problem 1: the gridding effect

Suppose we simply stack 3 × 3 kernels with dilation rate 2 many times; this problem then appears:
[Figure: gridding effect of stacked dilation-rate-2 kernels]
We find that the kernel is not continuous: not all pixels take part in the computation, so the information is sampled in a checkerboard fashion and loses continuity. This is fatal for pixel-level dense prediction tasks.
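The gridding effect can be checked directly by tracing, along one axis, which input offsets ever contribute to a single output pixel after stacking two dilation-2 layers (a 1-D slice of the 2-D case, purely for illustration):

```python
# Which input offsets reach one output pixel after stacking
# 3x3 convolutions that all use dilation rate 2 (1-D slice)?
offsets = {0}
for _ in range(2):                        # two stacked layers
    offsets = {o + t * 2 for o in offsets for t in (-1, 0, 1)}
print(sorted(offsets))                    # [-4, -2, 0, 2, 4]
```

Every reachable offset is even: the odd-numbered pixels are never used, which is exactly the checkerboard loss described above.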

Potential problem 2: long-ranged information might not be relevant

From the design background of dilated convolution, we can see that it is meant to capture long-ranged information. However, information obtained with a large dilation rate may only help segment large objects, while for small objects it may even be harmful. How to handle objects of different sizes at the same time is the key to designing a good dilated-convolution network.

HDC(Hybrid Dilated Convolution)

For the problems above, a paper from the TuSimple group presented a better solution: a structure they call HDC (Hybrid Dilated Convolution).
It has several properties that alleviate the problem to some extent, which we will not discuss in detail here. A figure comparing ordinary dilated convolution with HDC shows the effect:
[Figure: ordinary dilated convolution vs. HDC]
As you can see, after convolution HDC captures more image information and does not produce the small isolated squares seen with ordinary dilated convolution.
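The core HDC idea is to vary the dilation rates across stacked layers (e.g. 1, 2, 3) instead of repeating one rate, so the receptive field has no holes. A 1-D sketch (the helper `coverage` is my own, for illustration only) shows the contrast:

```python
def coverage(rates):
    """Input offsets (1-D slice) that reach one output pixel after
    stacking 3x3 layers with the given dilation rates."""
    offsets = {0}
    for r in rates:
        offsets = {o + t * r for o in offsets for t in (-1, 0, 1)}
    return sorted(offsets)

print(coverage([2, 2, 2]))  # only even offsets: gridding holes
print(coverage([1, 2, 3]))  # contiguous -6..6: no holes
```

Both rate schedules have the same maximum reach, but the mixed-rate stack touches every pixel in between, which matches the HDC comparison figure.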

Atrous Spatial Pyramid Pooling (ASPP)

First, look at the block diagram of ASPP:
[Figure: ASPP block diagram]
Here several dilated convolutions with different sampling rates are designed to capture multi-scale information. Note that a larger sampling rate (dilation rate) is not always better: if the rate is too large, many filter taps fall on the padding, producing meaningless weights, so an appropriate sampling rate must be chosen.
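The parallel-branch structure can be sketched for a single-channel map: apply the same kernel at several dilation rates and stack the responses. This is a simplified NumPy illustration (the names `dilated_conv2d` and `aspp` are my own; real ASPP uses separate learned kernels per branch plus 1×1 conv and image-level pooling):

```python
import numpy as np

def dilated_conv2d(x, k, rate):
    """'Same'-padded 2-D dilated convolution of a single-channel map x
    with a (kh, kw) kernel k."""
    kh, kw = k.shape
    ph, pw = (kh - 1) * rate // 2, (kw - 1) * rate // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    H, W = x.shape
    out = np.zeros((H, W))
    for i in range(kh):
        for j in range(kw):
            # shift the padded map by the dilated tap offset
            out += k[i, j] * xp[i * rate:i * rate + H, j * rate:j * rate + W]
    return out

def aspp(x, kernel, rates=(6, 12, 18)):
    """ASPP sketch: run the kernel at several dilation rates in
    parallel and stack the responses as feature channels."""
    return np.stack([dilated_conv2d(x, kernel, r) for r in rates])

x = np.random.rand(65, 65)
k = np.ones((3, 3)) / 9.0
print(aspp(x, k).shape)   # (3, 65, 65): one channel per branch
```

Each branch sees the same input at a different effective scale; the padding issue in the text is visible here too, since large rates push many taps into the zero-padded border.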

Reference links

https://blog.csdn.net/sinat_33486980/article/details/81902746
https://www.cnblogs.com/bupt213/p/10823653.html


Origin blog.csdn.net/m0_37798080/article/details/103163397