Keras/TFLearn: Time Distributed

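Introduction

A while ago I wrote a post about the time_distributed() function in TFLearn, in which I noted that this function does not share parameters in any straightforward way. While reading through Keras recently, I found that the Keras TimeDistributed() wrapper, by contrast, shares parameters by default.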

TFLearn

First, without passing scope:

import tflearn
import tensorflow as tf
from tflearn.layers.core import input_data, fully_connected, time_distributed

# Input with 2 timesteps of 3 features each
input_layer = input_data(shape=[2, 3], name="input")
print(input_layer)
# Apply an 8-unit fully connected layer at every timestep
net = time_distributed(input_layer, fully_connected, [8])

sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())

# List every trainable variable together with its current value
with sess.as_default():
    vs = tf.trainable_variables()
    for v in vs:
        print(v)
        print(tflearn.variables.get_value(v))

The output is:

If we pass the scope argument, that is, change one line of code:

net = time_distributed(input_layer, fully_connected, [8], scope="td")

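The output is now:

These results show that TFLearn's time_distributed() function does not share parameters whether or not the scope argument is passed. For how to share parameters, see my earlier post "TFLearn之Time Distributed".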
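To make the cost of not sharing concrete, here is a small sketch in plain Python (it does not call TFLearn) that counts trainable parameters for the example above. It assumes that shape=[2, 3] means 2 timesteps of 3 features and that each timestep gets its own 8-unit fully connected layer; the helper name dense_param_count is hypothetical.

```python
def dense_param_count(n_in, n_units):
    """Number of weights plus biases in one fully connected layer."""
    return n_in * n_units + n_units


timesteps, n_in, n_units = 2, 3, 8

# One 3 -> 8 fully connected layer: 3*8 weights + 8 biases
per_layer = dense_param_count(n_in, n_units)

# Without sharing, each timestep owns a separate copy of the layer
unshared_total = timesteps * per_layer

# With sharing, all timesteps reuse a single layer
shared_total = per_layer

print(per_layer, unshared_total, shared_total)
```

Under these assumptions the unshared variant carries one full set of weights and biases per timestep, so the parameter count grows linearly with the sequence length, while the shared variant stays constant.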

Keras

Now let's run the same test with Keras:

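import keras
from keras.models import Sequential
from keras.layers import Dense, TimeDistributed

model = Sequential()
# Wrap an 8-unit Dense layer so it is applied to each of the 10 timesteps
model.add(TimeDistributed(Dense(8), input_shape=(10, 16)))

# summary() prints the table itself, so it does not need to be wrapped in print
model.summary()
# Shapes of the kernel W and the bias b
print(model.get_weights()[0].shape)
print(model.get_weights()[1].shape)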

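The output is:

Here the input tensor has shape (None, 10, 16), where 10 is the number of timesteps. The parameter count is 136, that is, 16*8+8, which is exactly the number of parameters in one fully connected layer mapping 16 inputs to 8 units. We can also print the parameter list with model.get_weights(): it returns a list of two arrays, a W of shape (16, 8) and a b of shape (8,). In other words, all 10 timesteps share the same fully connected layer, which is applied at each timestep to produce an output of shape (None, 10, 8). This is why we say the Keras TimeDistributed wrapper shares parameters by default.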
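The shared-layer behaviour described above can be illustrated with a short NumPy sketch. This is not how Keras implements the wrapper internally; it only shows that applying one (16, 8) weight matrix and one (8,) bias at every timestep yields a (batch, 10, 8) output with 136 parameters. The batch size of 4 and the random values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, timesteps, n_in, n_units = 4, 10, 16, 8  # batch size is arbitrary

# A single set of dense-layer parameters, shared by all timesteps
W = rng.standard_normal((n_in, n_units))
b = np.zeros(n_units)

x = rng.standard_normal((batch, timesteps, n_in))

# Apply the same layer to each timestep in turn, then restack along axis 1
y_loop = np.stack([x[:, t, :] @ W + b for t in range(timesteps)], axis=1)

# Equivalent form: one broadcasted matrix product over the last axis
y_vec = x @ W + b

n_params = W.size + b.size
print(y_loop.shape, n_params, np.allclose(y_loop, y_vec))
```

The per-timestep loop and the single broadcasted product give the same result, which is why sharing one layer across timesteps adds no parameters beyond those of the layer itself.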

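Summary

To summarize this post:

TFLearn time_distributed() function: does not share the parameters of fn, whether or not the scope argument is passed.
Keras TimeDistributed wrapper: shares the parameters of the wrapped layer by default.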