Accessing Tensor Elements by Index + Unpooling: TensorFlow Code

Reposts must credit the source and include a link; otherwise legal action will be pursued.

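First of all, readers who have found their way here probably agree with me that the unpool operation matters a great deal. Models such as autoencoders need an inverted network to reconstruct the original data. For example, a convolutional neural network run in reverse can generate images of chairs from features, and the generated images differ from the images seen before. The figure below shows such an inverted architecture together with an example application: a description of the features is given on the left, the corresponding features are looked up, and chair images are generated at the end. Think of it as letting your imagination run wild: you type in a description such as "Song Joong-ki's small, sparkling eyes, Lei Jiayin's adorably gloomy eyebrows, Pan Yueming's cute, well-behaved legs...", and you get back a batch of photos that match it.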

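TensorFlow currently ships a ready-made function for the reverse of convolution: deconvolution is available as tf.nn.conv2d_transpose, which you can look up yourself. Unfortunately, there is no ready-made function for unpool, the reverse of max-pool. This post summarizes the existing methods and also publishes code for your reference. While implementing it, I went through many Chinese and English Q&A threads without finding a solution. The obstacle is that a Tensor cannot be accessed by index the way a NumPy array can (in particular, it does not support assigning to an element), and the only way to turn a Tensor into a NumPy array is to execute it, for example with sess.run(Tensor) or Tensor.eval(). A Tensor is really just a container with no contents; it only holds data once it is run. Running a Tensor, however, makes no sense while we are defining the layer structure of a model for training: when you have many layers and need to define a deconvolution layer that serves the training of the whole network, you cannot obtain sess and feed_dict for one layer in isolation. Feel free to try it yourself, and leave a comment to discuss. To the best of our knowledge, this post is the first to publish TensorFlow unpool code, and we hope it helps.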

1. Two Unpooling Methods from the Literature

Method 1: "we perform unpooling by simply replacing each entry of a feature map by an s*s block with the entry value in the top left corner and zeros elsewhere. This increases the width and the height of the feature map s times. We used s=2 in our networks." In other words, if max-pool takes the maximum out of each 2*2 cell, then unpooling can assign that value to the top-left element of a 2*2 cell and set the other elements to 0.
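To make Method 1 concrete, here is a minimal pure-Python sketch on a single 2-D feature map stored as nested lists. The function name unpool_top_left and the sample values are hypothetical and only serve as illustration; this is not the TensorFlow implementation given later in this post.

```python
def unpool_top_left(feature_map, s=2):
    """Replace each entry by an s*s block: value in the top-left corner, zeros elsewhere."""
    rows, cols = len(feature_map), len(feature_map[0])
    out = [[0] * (cols * s) for _ in range(rows * s)]
    for i in range(rows):
        for j in range(cols):
            out[i * s][j * s] = feature_map[i][j]
    return out


pooled = [[5, 7],
          [9, 3]]
unpooled = unpool_top_left(pooled)
for row in unpooled:
    print(row)
# [5, 0, 7, 0]
# [0, 0, 0, 0]
# [9, 0, 3, 0]
# [0, 0, 0, 0]
```

Width and height both grow by the factor s, as the quoted passage says.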

Method 2: "In the convnet, the max pooling operation is non-invertible, however we can obtain an approximate inverse by recording the locations of the maxima within each pooling region in a set of switch variables." (http://www.matthewzeiler.com/wp-content/uploads/2017/07/arxive2013.pdf) "It records the locations of maximum activations selected during pooling operation in switch variables, which are employed to place each activation back to its original pooled location." (https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Noh_Learning_Deconvolution_Network_ICCV_2015_paper.pdf) This method should be somewhat more accurate than Method 1: during unpooling we take the index of the maximum that max-pool selected, put the value back at its original position, and set everything else to 0.
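The switch idea can be sketched in a few lines of pure Python. This is only an illustration under simplifying assumptions (one 2-D map, non-overlapping s*s regions, input size divisible by s); the names max_pool_with_switches and unpool_with_switches are made up for this sketch and do not come from either paper.

```python
def max_pool_with_switches(x, s=2):
    """Non-overlapping s*s max pooling that also records where each maximum was."""
    rows, cols = len(x) // s, len(x[0]) // s
    pooled = [[0] * cols for _ in range(rows)]
    switches = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # All (row, col) positions inside the current pooling region.
            region = [(i * s + di, j * s + dj) for di in range(s) for dj in range(s)]
            best = max(region, key=lambda rc: x[rc[0]][rc[1]])
            pooled[i][j] = x[best[0]][best[1]]
            switches[i][j] = best
    return pooled, switches


def unpool_with_switches(pooled, switches, out_rows, out_cols):
    """Put every pooled value back at its recorded location; zeros elsewhere."""
    out = [[0] * out_cols for _ in range(out_rows)]
    for i, row in enumerate(pooled):
        for j, value in enumerate(row):
            r, c = switches[i][j]
            out[r][c] = value
    return out


x = [[1, 2, 0, 1],
     [3, 4, 6, 2],
     [7, 0, 1, 1],
     [2, 5, 3, 8]]
pooled, switches = max_pool_with_switches(x)
restored = unpool_with_switches(pooled, switches, 4, 4)
```

Here pooled is [[4, 6], [7, 8]], and restored holds 4, 6, 7 and 8 at the positions they occupied in x, with zeros everywhere else.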

https://arxiv.org/pdf/1506.02351.pdf

2. Implementation

2.1 Accessing a Tensor by Index: Modifying/Reading a Single Element

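To complete the code, the first fact we have to face is that a tensor cannot be accessed by index the way a NumPy array can. Yet that is exactly what we need to do: access a tensor by index to modify or read one of its elements. We give this problem its own section because many other pieces of code that need index-based access to a tensor will surely run into it too.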

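Solution: since we cannot treat tensor[1][2] like a NumPy element at index (1, 2) and simply overwrite it (here tensor is the name of a tensor object we created), we get there by a workaround that splits the job into two steps: take the value at the given index and wrap it as a one-element tensor (tf.expand_dims), then place a given value at a given index of a tensor (using tf.SparseTensor(indices, values, shape)). If you want to use these two functions, search Baidu/Google; there are plenty of easy-to-follow posts about them.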
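The write step can be pictured as building an all-zero tensor with a single non-zero entry and adding it to the tensor being updated. The following pure-Python sketch mimics that idea on 2-D nested lists; sparse_to_dense and add are hypothetical helpers written for this illustration, not TensorFlow functions.

```python
def sparse_to_dense(index, value, shape):
    """Dense 2-D list of the given shape: `value` at `index`, zeros elsewhere."""
    dense = [[0.0] * shape[1] for _ in range(shape[0])]
    dense[index[0]][index[1]] = value
    return dense


def add(a, b):
    """Element-wise sum of two equally shaped 2-D lists."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


target = [[0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0]]
# "Assign" 2.5 to position (1, 2) without mutating anything in place.
updated = add(target, sparse_to_dense((1, 2), 2.5, (2, 3)))
```

Note that adding equals assigning only while the target entry is still zero, which is the case when the result tensor starts out as all zeros.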

2.2 Finally, the Code for Unpooling with Method 1 Above


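import tensorflow as tf  # written against the TensorFlow 1.x API


def unpool2(pool, ksize, stride, padding='VALID'):
    """
    simple unpool method

    :param pool : the tensor to run unpool operation (NHWC, fully known static shape)
    :param ksize : integer
    :param stride : integer
    :param padding : 'VALID' or 'SAME', the padding used by the pooling layer
    :return : the tensor after the unpool operation

    """
    # NHWC -> NCHW, so the loops below can index [batch][channel][w][h]
    pool = tf.transpose(pool, perm=[0, 3, 1, 2])
    pool_shape = pool.shape.as_list()
    if padding == 'VALID':
        size = (pool_shape[2] - 1) * stride + ksize
    else:
        size = pool_shape[2] * stride
    # the same size is used for both spatial dimensions (square feature maps)
    unpool_shape = [pool_shape[0], pool_shape[1], size, size]
    # start from an all-zero tensor; no Variable is needed
    unpool = tf.zeros(unpool_shape, dtype=tf.float32)
    for batch in range(pool_shape[0]):
        for channel in range(pool_shape[1]):
            for w in range(pool_shape[2]):
                for h in range(pool_shape[3]):
                    # dense tensor that is zero except at the top-left corner of the block
                    diff_matrix = tf.sparse_tensor_to_dense(tf.SparseTensor(
                        indices=[[batch, channel, w * stride, h * stride]],
                        values=tf.expand_dims(pool[batch][channel][w][h], axis=0),
                        dense_shape=unpool_shape))
                    unpool = unpool + diff_matrix
    # back to NHWC, assuming the caller expects the same layout as the input
    return tf.transpose(unpool, perm=[0, 2, 3, 1])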
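The output-size arithmetic in unpool2 can be checked by hand. The helper below simply restates the two branches of the function in plain Python; the name unpooled_size and the sample numbers are hypothetical.

```python
def unpooled_size(pooled_size, ksize, stride, padding='VALID'):
    """Spatial size produced by unpool2 for a pooled feature map of the given size."""
    if padding == 'VALID':
        return (pooled_size - 1) * stride + ksize
    return pooled_size * stride


print(unpooled_size(14, ksize=2, stride=2, padding='VALID'))  # 28
print(unpooled_size(14, ksize=3, stride=2, padding='VALID'))  # 29
print(unpooled_size(14, ksize=2, stride=2, padding='SAME'))   # 28
```

For 'VALID' the formula gives the smallest input size that pools down to pooled_size: with ksize=3 and stride=2, inputs of size 29 and 30 both pool to 14, and the formula returns 29.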

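With this code in hand, Method 2 should be easy to implement.

However, this approach is slow, extremely slow: the four nested Python loops add one sparse-to-dense op and one addition to the graph for every single element. In practice, unpooling can be done with the following instead: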

# PI is the 4-D Tensor output by the layer above
unpool1 = tf.image.resize_images(PI, size = [resize_width, resize_width], method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
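Nearest-neighbor resizing is not the same operation as Method 1: it copies each value into the whole block instead of padding it with zeros. A minimal pure-Python sketch of what this amounts to for an integer scale factor (the name upsample_nearest is hypothetical, and this is an illustration rather than TensorFlow's implementation):

```python
def upsample_nearest(feature_map, s=2):
    """Each output pixel (i, j) copies the input pixel (i // s, j // s)."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [[feature_map[i // s][j // s] for j in range(cols * s)]
            for i in range(rows * s)]


pooled = [[5, 7],
          [9, 3]]
upsampled = upsample_nearest(pooled)
for row in upsampled:
    print(row)
# [5, 5, 7, 7]
# [5, 5, 7, 7]
# [9, 9, 3, 3]
# [9, 9, 3, 3]
```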

There are two more ways to unpool:

def max_unpool_2x2(x, output_shape):
    # tf.concat_v2 was renamed to tf.concat in TensorFlow 1.0
    out = tf.concat([x, tf.zeros_like(x)], 3)
    out = tf.concat([out, tf.zeros_like(out)], 2)
    out_size = output_shape
    return tf.reshape(out, out_size)

# The max unpool layer doubles the height and width of its input
# output_shape_d_pool2 = tf.stack([tf.shape(x)[0], 28, 28, 1])
# h_d_pool2 = max_unpool_2x2(h_d_conv2, output_shape_d_pool2)
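It may not be obvious where the values end up after the two concatenations and the reshape. The pure-Python sketch below replays the same steps on one image stored as nested lists [H][W][C] (the batch dimension is left out, and the helper name concat_reshape_unpool is hypothetical). Under row-major flattening, each value lands in the top-left corner of its 2*2 block, which matches Method 1.

```python
def concat_reshape_unpool(x):
    """x is one image as nested lists [H][W][C]; returns nested lists [2H][2W][C]."""
    h, w, c = len(x), len(x[0]), len(x[0][0])
    # Concatenate zeros along the channel axis: [H][W][2C].
    out = [[px + [0] * c for px in row] for row in x]
    # Concatenate zeros along the width axis: [H][2W][2C].
    out = [row + [[0] * (2 * c) for _ in range(w)] for row in out]
    # Flatten in row-major order, then reshape to [2H][2W][C].
    flat = [v for row in out for px in row for v in px]
    return [[flat[(i * 2 * w + j) * c:(i * 2 * w + j + 1) * c]
             for j in range(2 * w)] for i in range(2 * h)]


x = [[[5], [7]],
     [[9], [3]]]
result = concat_reshape_unpool(x)
plane = [[px[0] for px in row] for row in result]
for row in plane:
    print(row)
# [5, 0, 7, 0]
# [0, 0, 0, 0]
# [9, 0, 3, 0]
# [0, 0, 0, 0]
```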

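def max_unpool_2x2(x, shape):
    # shape is assumed to be the shape of x as [batch, height, width, channels]
    # tf.pack was renamed to tf.stack in TensorFlow 1.0
    inference = tf.image.resize_nearest_neighbor(x, tf.stack([shape[1]*2, shape[2]*2]))
    return inference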

Other references:

https://github.com/jon-sch/tensorflow/commit/16cd223508fa234cd4ae0f2fa4d63ee812d6cada

https://ithelp.ithome.com.tw/articles/10188326