[TensorFlow] How does tf.nn.atrous_conv2d implement dilated (atrous) convolution?

Introduction
The theory of dilated convolution is covered in the following references, so we will not go into it in detail here:

1. Long J, Shelhamer E, Darrell T, et al. Fully convolutional networks for semantic segmentation[C]. Computer Vision and Pattern Recognition, 2015.

2. Yu, Fisher, and Vladlen Koltun. "Multi-scale context aggregation by dilated convolutions." arXiv preprint arXiv:1511.07122 (2015).

3. How to understand dilated convolution?

In one sentence: dilated convolution expands the receptive field without using pooling (pooling layers cause information loss).
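As a quick illustration of this point, here is a minimal sketch (the helper `receptive_field` is my own, not a TensorFlow function) of how stacking 3*3 convolutions with growing dilation rates expands the receptive field without any pooling:

```python
# Receptive field of stacked 3x3 convolutions, stride 1.
# Each layer adds (kernel_size - 1) * dilation to the receptive field.
def receptive_field(dilations, kernel_size=3):
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

print(receptive_field([1, 1, 1]))  # plain convolutions: 7
print(receptive_field([1, 2, 4]))  # dilated convolutions: 15
```

With dilations doubling per layer the receptive field grows much faster, which is exactly the trick the references above exploit.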

For convenience, here are some related links:

[TensorFlow] How does tf.nn.conv2d implement convolution?

[TensorFlow] How does tf.nn.conv2d_transpose implement deconvolution?

Let's look at the function signature first:

tf.nn.atrous_conv2d(value, filters, rate, padding, name=None)

Apart from the name parameter, which names the operation, there are four relevant parameters:

value:
The input image to be convolved. It must be a 4-D Tensor with shape [batch, height, width, channels], i.e. [number of images in the batch, image height, image width, number of image channels].

filters:
The convolution kernel, equivalent to the kernel in a CNN. It must be a 4-D Tensor with shape [filter_height, filter_width, channels, out_channels], i.e. [kernel height, kernel width, number of input channels, number of kernels]. The third dimension, channels, must match the channels of value, and the fourth dimension is the number of output channels.

rate:
A positive int. An ordinary convolution has a stride (the sliding step of the kernel), but atrous convolution has no stride parameter; pay particular attention to this. Instead it uses a new rate parameter. What does rate do? It defines the sampling interval on the input image during convolution. You can think of it as inserting (rate - 1) zeros between adjacent taps of the kernel, punching "holes" into the original kernel, so that the convolution is equivalent to sampling the input image at a larger interval. Exactly how the zeros are inserted is shown in more detail below. Clearly, when rate = 1 no zeros are inserted and the function reduces to an ordinary convolution.
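The zero-insertion described above can be sketched in a few lines of NumPy (the helper `dilate_kernel` is hypothetical, for illustration only):

```python
import numpy as np

# Insert (rate - 1) zeros between adjacent kernel taps.
# A k x k kernel grows to k + (k - 1) * (rate - 1) cells per side.
def dilate_kernel(kernel, rate):
    k = kernel.shape[0]
    size = k + (k - 1) * (rate - 1)
    out = np.zeros((size, size), dtype=kernel.dtype)
    out[::rate, ::rate] = kernel  # original taps land every `rate` cells
    return out

k3 = np.ones((3, 3))
print(dilate_kernel(k3, 2).shape)  # (5, 5) -- the 3x3 kernel becomes 5x5
```

With rate = 1 the kernel is returned unchanged, which matches the claim that the function then reduces to ordinary convolution.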

padding:
A string, either "SAME" or "VALID"; the value determines how the edges are padded.

OK, that's all of them; there are no more parameters. Some readers may ask about a "stride" parameter: in fact this function implicitly fixes stride = 1, so the sliding step cannot be changed.

The function returns a Tensor. With "VALID" padding (and rate = 2, as used below; the comment at the end of this post gives the general formula) it returns a Tensor of shape [batch, height - 2*(filter_height - 1), width - 2*(filter_width - 1), out_channels]; with "SAME" padding it returns a Tensor of shape [batch, height, width, out_channels]. How do these results come about? Don't worry; let's visualize atrous convolution with a program.
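These shapes can be checked with a tiny helper (my own sketch; `atrous_out_size` is not a TensorFlow function) built on the effective kernel size k + (k - 1)*(rate - 1):

```python
# Output spatial size of atrous_conv2d (stride is fixed at 1).
def atrous_out_size(size, k, rate, padding):
    if padding == 'SAME':
        return size                      # SAME keeps the spatial size
    eff = k + (k - 1) * (rate - 1)       # dilated (effective) kernel size
    return size - eff + 1                # VALID: no edge padding

print(atrous_out_size(4, 3, 1, 'VALID'))  # 2
print(atrous_out_size(4, 3, 2, 'VALID'))  # 0 -> kernel too large, TF errors
```

Note that for rate = 2 the VALID formula reduces to size - 2*(k - 1), matching the shape quoted in the text.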

Experiment
First create a 4*4 image with 2 channels:

img = tf.constant(value=[[[[1],[2],[3],[4]],[[1],[2],[3],[4]],[[1],[2],[3],[4]],[[1],[2],[3],[4]]]], dtype=tf.float32)
img = tf.concat(values=[img, img], axis=3)
Next, convolve it with a 3*3 kernel:

filter = tf.constant(value=1, shape=[3,3,2,5], dtype=tf.float32)
out_img = tf.nn.atrous_conv2d(value=img, filters=filter, rate=1, padding='SAME')
With img and filter set up, we can also run an ordinary convolution for comparison:

out_img = tf.nn.conv2d(input=img, filter=filter, strides=[1,1,1,1], padding='VALID')
The output has five channels. Since we set rate = 1, atrous convolution here can be viewed as ordinary convolution. The outputs in SAME and VALID modes are:

[Figure: rate = 1 outputs in SAME and VALID modes]
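If you want to check those numbers without running TensorFlow, the rate = 1 VALID result can be recomputed by hand with NumPy (a sketch assuming the all-ones 3*3*2*5 kernel above; every output channel is identical):

```python
import numpy as np

# 4x4 image whose pixel value equals its column number (1..4);
# both channels are identical, so each output value is
# 2x the 3x3 window sum of one channel.
img = np.tile(np.array([1., 2., 3., 4.]), (4, 1))
out = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        out[i, j] = img[i:i + 3, j:j + 3].sum()
out *= 2  # second (identical) channel
print(out)  # [[36. 54.] [36. 54.]]
```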

OK, now set rate = 2 and continue:

out_img = tf.nn.atrous_conv2d(value=img, filters=filter, rate=2, padding='SAME')

View the output:

[[[[ 16. 16. 16. 16. 16.]
[ 24. 24. 24. 24. 24.]
[ 16. 16. 16. 16. 16.]
[ 24. 24. 24. 24. 24.]]

[[ 16. 16. 16. 16. 16.]
[ 24. 24. 24. 24. 24.]
[ 16. 16. 16. 16. 16.]
[ 24. 24. 24. 24. 24.]]

[[ 16. 16. 16. 16. 16.]
[ 24. 24. 24. 24. 24.]
[ 16. 16. 16. 16. 16.]
[ 24. 24. 24. 24. 24.]]

[[16. 16. 16. 16. 16.]
[24. 24. 24. 24. 24.]
[16. 16. 16. 16. 16.]
[24. 24. 24. 24. 24.]] ]]
How do these results come about? Here is a diagram:

[Figure: the 3*3 kernel dilated with zeros into a 5*5 kernel]
Here we can see that with rate = 2, zeros are interspersed into the kernel, dilating it from 3*3 to 5*5. Now what happens in "VALID" mode?
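The 16/24 pattern printed above can be reproduced with plain NumPy by padding the image and sampling the window at stride rate (a sketch under the same all-ones kernel assumption):

```python
import numpy as np

# 4x4 image whose pixel value equals its column number (1..4); two
# identical channels, all-ones 3x3 kernel dilated with rate = 2.
img = np.tile(np.array([1., 2., 3., 4.]), (4, 1))
rate, k = 2, 3
pad = (k - 1) * rate // 2          # SAME padding for the 5x5 dilated kernel
padded = np.pad(img, pad, mode='constant')
out = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        # sample the 3x3 taps at stride `rate` inside the 5x5 window
        out[i, j] = padded[i:i + 2 * pad + 1:rate,
                           j:j + 2 * pad + 1:rate].sum()
out *= 2                           # second (identical) channel
print(out[0])  # [16. 24. 16. 24.]
```

Every row comes out identical, matching the TensorFlow output shown above.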

[Screenshot: the error raised with rate = 2 and 'VALID' padding]

It raises an error directly, because the size of the dilated kernel (5*5) exceeds the size of the original image (4*4).

Well, if you have read this far, I'm sure you have a basic understanding of dilated convolution. And the claim that "VALID" padding returns a Tensor of shape [batch, height - 2*(filter_height - 1), width - 2*(filter_width - 1), out_channels] when rate = 2 should now be easy to verify.

Listing
import tensorflow as tf


img = tf.constant(value=[[[[1],[2],[3],[4]],[[1],[2],[3],[4]],[[1],[2],[3],[4]],[[1],[2],[3],[4]]]], dtype=tf.float32)
img = tf.concat(values=[img, img], axis=3)
filter = tf.constant(value=1, shape=[3,3,2,5], dtype=tf.float32)
out_img1 = tf.nn.atrous_conv2d(value=img, filters=filter, rate=1, padding='SAME')
out_img2 = tf.nn.atrous_conv2d(value=img, filters=filter, rate=1, padding='VALID')
out_img3 = tf.nn.atrous_conv2d(value=img, filters=filter, rate=2, padding='SAME')

# error: the 5*5 dilated kernel is larger than the 4*4 image
# out_img4 = tf.nn.atrous_conv2d(value=img, filters=filter, rate=2, padding='VALID')

with tf.Session() as sess:
    print('rate=1, SAME mode result:')
    print(sess.run(out_img1))

    print('rate=1, VALID mode result:')
    print(sess.run(out_img2))

    print('rate=2, SAME mode result:')
    print(sess.run(out_img3))

    # error
    # print('rate=2, VALID mode result:')
    # print(sess.run(out_img4))


wsdgh (comment): Nice, but `height - [filter_width + (filter_width - 1) * (rate - 1)] + 1` makes more sense when `padding=VALID`.
import tensorflow as tf
import numpy as np

kernel_height = 3
kernel_width = kernel_height

img_height = 9
img_width = img_height
rate = 2
padding = 'VALID'
sz_same = img_height
sz_valid = img_height - ((kernel_height - 1)*(rate - 1) + kernel_height) + 1

img = np.random.randn(1, img_height, img_width, 3)
kernel = np.random.randn(kernel_height, kernel_width, 3, 1)

imgT = tf.constant(img)
kernelT = tf.constant(kernel)

imgO1 = tf.nn.atrous_conv2d(imgT, kernelT, rate=rate, padding=padding)
print(imgO1.shape)
print(sz_same if padding.upper() == 'SAME' else sz_valid)
# (1, 5, 5, 1)
# 5

---------------------
Author: xf__mao
Source: CSDN
Original: https://blog.csdn.net/mao_xiao_feng/article/details/78003730
Disclaimer: This is an original article by the blogger; please include a link to the original when reposting!

Origin www.cnblogs.com/jfdwd/p/11184384.html