Transposed convolution is used in FCN, U-Net, and GANs. It should not be simply understood as "deconvolution": it does not invert the convolution, it only reverses its shape transformation.
The forward convolution in a CNN shrinks the spatial size of the feature map; transposed convolution gradually enlarges it, until it is as large as the original image.
A good way to understand transposed convolution is to rewrite the convolution as a matrix multiplication: if the forward pass multiplies the flattened input by a sparse matrix C, then transposed convolution multiplies by the transpose of C.
So, roughly speaking, the small feature map obtained by convolution can be mapped back to a larger one by transposed convolution (the analogy is loose: it recovers the original size, not the original values, but it is good enough for intuition).
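The matrix view above can be made concrete with a small NumPy sketch (my own illustration, not from the TensorFlow source): a 4x4 input convolved with a 3x3 kernel at stride 1 becomes a 2x2 output, and this is exactly a multiplication by a 4x16 matrix C. Multiplying by C.T maps the 4 output values back to 16 values, i.e. back to the 4x4 spatial size:

```python
import numpy as np

# Toy sizes: 4x4 input, 3x3 kernel of ones, stride 1, no padding -> 2x2 output.
x = np.arange(16, dtype=float).reshape(4, 4)
k = np.ones((3, 3))

# Build the sparse convolution matrix C of shape (4, 16): each row holds the
# kernel weights at the positions the kernel covers for one output element,
# so that C @ x.ravel() equals the ordinary convolution of x with k.
C = np.zeros((4, 16))
for row, (i, j) in enumerate([(0, 0), (0, 1), (1, 0), (1, 1)]):
    patch = np.zeros((4, 4))
    patch[i:i + 3, j:j + 3] = k
    C[row] = patch.ravel()

y = C @ x.ravel()       # forward conv: 16 values -> 4 values (a 2x2 map)
x_back = C.T @ y        # transposed conv: 4 values -> 16 values (a 4x4 map)

print(y.reshape(2, 2).shape)       # (2, 2)
print(x_back.reshape(4, 4).shape)  # (4, 4): original size, not original values
```

Note that `x_back` has the input's shape but not its values; this is why transposed convolution is not a true "deconvolution".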
The transposed convolution API in TensorFlow is tf.nn.conv2d_transpose(value, filter, output_shape, strides, padding='SAME').
The padding parameter in tf takes either 'SAME' or 'VALID', and the two modes compute the output size differently.
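The two size formulas (as given in the TensorFlow documentation for convolution) can be written as two small helper functions; the function names here are my own:

```python
import math

def out_valid(i, k, s):
    # VALID: no padding, the kernel must fit entirely inside the input
    return math.ceil((i - k + 1) / s)

def out_same(i, s):
    # SAME: the input is padded so only the stride affects the output size
    return math.ceil(i / s)

print(out_valid(255, 5, 2))  # 126 -- matches the conv2d example below
print(out_same(255, 2))      # 128
```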
Finally, here is how to use transposed convolution in tf:
import tensorflow as tf

# Forward convolution
# Input shape: [batch, height, width, in_channels]
inputx = tf.random_normal([100, 255, 255, 3], dtype=tf.float32)
# Kernel shape: [height, width, in_channels, output_channels]
w = tf.random_normal(shape=[5, 5, 3, 10], dtype=tf.float32)
# Convolution output has shape (100, 126, 126, 10); note that VALID padding is used here
outputy = tf.nn.conv2d(input=inputx, filter=w, strides=[1, 2, 2, 1], padding='VALID')
Computing the output of a convolution is easy. But what if I now want to choose the kernel size and strides of the convolution according to a desired output image size? Code first:
# Transposed convolution
# Input value shape: [batch, height, width, in_channels]
value = tf.random_normal(shape=[100, 126, 126, 10])
# Filter shape: [height, width, output_channels, in_channels]
w = tf.random_normal(shape=[4, 4, 3, 10])
# Result of the transposed convolution
result = tf.nn.conv2d_transpose(value=value, filter=w,
                                output_shape=[100, 255, 255, 3],
                                strides=[1, 2, 2, 1], padding='VALID')
with tf.Session() as sess:
    tf.global_variables_initializer().run()
    # sess.run(outputy)
    # print(outputy.shape)
    sess.run(result)
    print(result.shape)
In fact, there is a very simple idea: do the size arithmetic for a transposed convolution as if it were a forward convolution. Concretely, in the code above I want to obtain a 255*255*3 image (the image enlarged by the transposed convolution). The first step is to decide which padding mode to use, because each mode has its own size formula. With VALID, the input to the transposed convolution is a 126*126*10 feature map, and the forward-direction formula is

out = ceil((in - kernel + 1) / stride)

Setting kernel = 4 and solving 126 = ceil((255 - 4 + 1) / stride) gives stride = 2 exactly.

If the padding is set to SAME, the formula becomes

out = ceil(in / stride)

so the kernel size no longer affects the output shape and only the stride has to be chosen.
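This "solve the forward formula backwards" step can be checked mechanically. conv2d_transpose accepts an output_shape only if a forward convolution of that shape reproduces the input shape, so for the VALID case we can search for every (kernel, stride) pair that maps 255 back down to 126 (a small sketch of mine, using the VALID formula above):

```python
import math

def out_valid(i, k, s):
    # VALID forward-convolution output size
    return math.ceil((i - k + 1) / s)

# Which small (kernel, stride) pairs turn a 255-wide image into a 126-wide one?
# This is the consistency condition conv2d_transpose enforces for output_shape=255.
pairs = [(k, s) for k in range(1, 8) for s in range(1, 4)
         if out_valid(255, k, s) == 126]
print(pairs)  # [(4, 2), (5, 2)]
```

Both pairs appear in the code above: the forward conv2d uses kernel 5 with stride 2, and the conv2d_transpose uses kernel 4 with stride 2, and either choice is consistent with 255 -> 126.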
References:
http://blog.csdn.net/u013250416/article/details/78247818
https://www.tensorflow.org/api_docs/python/tf/nn/conv2d_transpose
https://www.tensorflow.org/api_docs/python/tf/nn/convolution
"A guide to convolution arithmetic for deep learning"