Getting Started with Deep Learning in TensorFlow, Part 09: A Detailed Guide to tf.contrib.slim

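tf.contrib.slim greatly reduces the amount of code needed for complex networks and makes building, training, and evaluating them simpler. In the next post we will use it to implement VGGNet and the 42-layer-deep InceptionV3 model; here we introduce slim's features through a simplified CNN. Slim lives under tensorflow.contrib and is imported like this: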

    import tensorflow.contrib.slim as slim

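Note, however, that the code in tensorflow.contrib is not officially supported: it may be changed or removed, or merged into core TensorFlow, so test it before relying on it. (tf.contrib was in fact removed in TensorFlow 2.0, so the code in this post requires TensorFlow 1.x.)
First, we build a CNN with a conv-pool-conv-pool-FC-FC structure to classify the MNIST digits. The source code is as follows: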

    import tensorflow as tf
    from tensorflow.examples.tutorials.mnist import input_data
    def weight_variable(shape):
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial)
    def bias_variable(shape):
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial)
    def conv2d(x, W):
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
    def max_pool_2x2(x):
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    x = tf.placeholder(tf.float32, shape=[None, 784])
    y_ = tf.placeholder(tf.float32, shape=[None, 10])
    x_image = tf.reshape(x, [-1, 28, 28, 1])
    #conv1
    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])
    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)
    #conv2
    W_conv2 = weight_variable([5, 5, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)
    #fc1
    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
    #dropout
    keep_prob = tf.placeholder(tf.float32)
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
    #fc2
    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])
    y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
    #Train and Evaluate the Model
    cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y_conv, labels=y_))
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    mnist = input_data.read_data_sets('./../MNISTDat', one_hot=True)
    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())
    for i in range(20000):
        batch = mnist.train.next_batch(50)
        if i%100 == 0:
            train_accuracy = accuracy.eval(feed_dict={
                x:batch[0], y_:batch[1], keep_prob: 1.0})
            print("step %d, training accuracy %g"%(i, train_accuracy))
        train_step.run(feed_dict={x:batch[0], y_:batch[1], keep_prob: 0.5})
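
The `7 * 7 * 64` input size of fc1 in the listing comes from the two 2x2 max-pooling steps. As a rough check, the sketch below walks the spatial size through the network in plain Python, assuming TensorFlow's rule that 'SAME' padding gives an output size of ceil(input / stride). The helper name `same_padding_output_size` is made up for this illustration.

```python
import math

def same_padding_output_size(input_size, stride):
    """Spatial output size under 'SAME' padding: ceil(input / stride)."""
    return math.ceil(input_size / stride)

image_size = 28  # MNIST images are 28x28

# The convolutions use stride 1 with 'SAME' padding, so they keep the size;
# only the 2x2 max-pooling layers (stride 2) shrink it.
size_after_pool1 = same_padding_output_size(image_size, stride=2)
size_after_pool2 = same_padding_output_size(size_after_pool1, stride=2)

# conv2 produces 64 feature maps, so the flattened vector has 7 * 7 * 64 entries.
flat_features = size_after_pool2 * size_after_pool2 * 64

print(size_after_pool1, size_after_pool2, flat_features)  # 14 7 3136
```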

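In the hand-written listing, the conv1 code (lines 17-19 of the listing, W_conv1 through h_conv1) and the conv2 code (lines 22-25, W_conv2 through h_pool2) are almost identical and could be factored out into a function. Here we use slim instead: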

    with slim.arg_scope([slim.conv2d],
                        activation_fn=tf.nn.relu,
                        weights_initializer=tf.truncated_normal_initializer(stddev=0.1),
                        biases_initializer=tf.constant_initializer(0.1),
                        padding='SAME'):
        #conv1
        h_conv1 = slim.conv2d(x_image, 32, [5, 5])
        h_pool1 = slim.max_pool2d(h_conv1, [2, 2], padding='SAME')
        #conv2
        h_conv2 = slim.conv2d(h_pool1, 64, [5, 5])
        h_pool2 = slim.max_pool2d(h_conv2, [2, 2], padding='SAME')

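The slim.conv2d in square brackets after arg_scope says that we want to set default values for the arguments of the slim.conv2d op, and the defaults themselves follow the brackets. The activation function is set to ReLU with activation_fn=tf.nn.relu, the kernels are initialized from a truncated normal distribution with standard deviation 0.1 with weights_initializer=tf.truncated_normal_initializer(stddev=0.1), and the biases are initialized to 0.1 with biases_initializer=tf.constant_initializer(0.1). The 5*5 kernel size is given at each call, as is the number of kernels, 32 (which is also the number of biases); the kernel depth is determined by the number of channels of the input x_image. A single slim.conv2d call covers kernel and bias initialization, the convolution itself, the bias addition, and the nonlinear activation. In addition, downsampling is now done with slim.max_pool2d, and its padding mode can likewise be given a default through arg_scope: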

    with slim.arg_scope([slim.conv2d],
                        activation_fn=tf.nn.relu,
                        weights_initializer=tf.truncated_normal_initializer(stddev=0.1),
                        biases_initializer=tf.constant_initializer(0.1),
                        padding='SAME'):
        with slim.arg_scope([slim.max_pool2d], padding='SAME'):
            #conv1
            h_conv1 = slim.conv2d(x_image, 32, [5, 5])
            h_pool1 = slim.max_pool2d(h_conv1, [2, 2])
            #conv2
            h_conv2 = slim.conv2d(h_pool1, 64, [5, 5])
            h_pool2 = slim.max_pool2d(h_conv2, [2, 2])
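
To make the idea of scoped defaults more concrete, here is a toy re-implementation in plain Python. It is only a sketch of the concept, not slim's actual code: the `arg_scope` and `add_arg_scope` helpers below are simplified stand-ins, the `conv2d` and `max_pool2d` functions just return their arguments as a dict, and the toy's built-in defaults are arbitrary rather than slim's.

```python
import contextlib
import functools

# Stack of {function_name: default_kwargs}; the last entry is the innermost scope.
_scope_stack = [{}]

def add_arg_scope(func):
    """Make `func` pick up defaults from the enclosing toy arg_scope blocks."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        merged = dict(_scope_stack[-1].get(func.__name__, {}))
        merged.update(kwargs)  # arguments given at the call site win over scope defaults
        return func(*args, **merged)
    return wrapper

@contextlib.contextmanager
def arg_scope(funcs, **defaults):
    """Set `defaults` for every function in `funcs` inside the with-block."""
    # Copy the enclosing scope so that nested scopes inherit its defaults.
    current = {name: dict(kw) for name, kw in _scope_stack[-1].items()}
    for func in funcs:
        current.setdefault(func.__name__, {}).update(defaults)
    _scope_stack.append(current)
    try:
        yield
    finally:
        _scope_stack.pop()

@add_arg_scope
def conv2d(inputs, num_outputs, kernel_size, padding='VALID', activation_fn=None):
    # Toy layer: only records the arguments it was called with.
    return {'num_outputs': num_outputs, 'kernel_size': kernel_size,
            'padding': padding, 'activation_fn': activation_fn}

@add_arg_scope
def max_pool2d(inputs, kernel_size, padding='VALID'):
    return {'kernel_size': kernel_size, 'padding': padding}

with arg_scope([conv2d], activation_fn='relu', padding='SAME'):
    with arg_scope([max_pool2d], padding='SAME'):
        conv_default = conv2d('x', 32, [5, 5])                     # uses scope defaults
        conv_override = conv2d('x', 32, [5, 5], padding='VALID')   # call site overrides
        pool_default = max_pool2d('x', [2, 2])
conv_outside = conv2d('x', 32, [5, 5])                             # scope no longer active

print(conv_default['padding'], conv_override['padding'],
      pool_default['padding'], conv_outside['padding'])  # SAME VALID SAME VALID
```

The point to take away is the precedence order: an explicit argument at the call beats the scope default, which in turn beats the function's own default.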

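The nested arg_scope blocks in the slim code also show how to set defaults for several different ops. Looking back at the original program, the fully connected layers (lines 26-34 of the listing, fc1 through fc2) are highly repetitive too: their weights are also drawn from a truncated normal distribution, their biases are also initialized to the constant 0.1, and the activation function is again ReLU. The fully connected layers can therefore share the convolutional layers' defaults. The one difference is that the convolutional defaults included padding='SAME', which fully connected layers do not have, so that item must be dropped from the shared defaults and passed to slim.conv2d at each call instead. With that change the declaration becomes: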

    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        weights_initializer=tf.truncated_normal_initializer(stddev=0.1),
                        biases_initializer=tf.constant_initializer(0.1)):
        with slim.arg_scope([slim.max_pool2d], padding='SAME'):
            #conv1
            h_conv1 = slim.conv2d(x_image, 32, [5, 5], padding='SAME')
            h_pool1 = slim.max_pool2d(h_conv1, [2, 2])
            #conv2
            h_conv2 = slim.conv2d(h_pool1, 64, [5, 5], padding='SAME')
            h_pool2 = slim.max_pool2d(h_conv2, [2, 2])
            #Densely Connected Layer
            h_pool2_flat = slim.flatten(h_pool2)
            h_fc1 = slim.fully_connected(h_pool2_flat, 1024)
            #dropout
            keep_prob = tf.placeholder(tf.float32)
            h_fc1_drop = slim.dropout(h_fc1, keep_prob)
            #Readout Layer
            y_conv = slim.fully_connected(h_fc1_drop, 10, activation_fn=None)

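Adding slim.fully_connected to the bracketed list after arg_scope makes it share the defaults that follow with slim.conv2d. Comparing this with the separate scope for slim.max_pool2d shows the rule: ops that share the same defaults can be declared in one arg_scope. The readout layer overrides the shared default with activation_fn=None because it outputs raw logits. Note also that slim.flatten is used to turn h_pool2 into a one-dimensional vector per example. The complete slim version of the program is: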

    import tensorflow as tf
    import tensorflow.contrib.slim as slim
    from tensorflow.examples.tutorials.mnist import input_data
    x = tf.placeholder(tf.float32, shape=[None, 784])
    y_ = tf.placeholder(tf.float32, shape=[None, 10])
    x_image = tf.reshape(x, [-1, 28, 28, 1])
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        weights_initializer=tf.truncated_normal_initializer(stddev=0.1),
                        biases_initializer=tf.constant_initializer(0.1)):
        with slim.arg_scope([slim.max_pool2d], padding='SAME'):
            #conv1
            h_conv1 = slim.conv2d(x_image, 32, [5, 5], padding='SAME')
            h_pool1 = slim.max_pool2d(h_conv1, [2, 2])
            #conv2
            h_conv2 = slim.conv2d(h_pool1, 64, [5, 5], padding='SAME')
            h_pool2 = slim.max_pool2d(h_conv2, [2, 2])
            #Densely Connected Layer
            h_pool2_flat = slim.flatten(h_pool2)
            h_fc1 = slim.fully_connected(h_pool2_flat, 1024)
            #dropout
            keep_prob = tf.placeholder(tf.float32)
            h_fc1_drop = slim.dropout(h_fc1, keep_prob)
            #Readout Layer
            y_conv = slim.fully_connected(h_fc1_drop, 10, activation_fn=None)
    #Train and Evaluate the Model
    cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y_conv, labels=y_))
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    mnist = input_data.read_data_sets('./../MNISTDat', one_hot=True)
    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())
    for i in range(20000):
        batch = mnist.train.next_batch(50)
        if i%100 == 0:
            train_accuracy = accuracy.eval(feed_dict={
                x:batch[0], y_:batch[1], keep_prob: 1.0})
            print("step %d, training accuracy %g"%(i, train_accuracy))
        train_step.run(feed_dict={x:batch[0], y_:batch[1], keep_prob: 0.5})
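
Since slim infers the kernel depth from the input and creates one bias per output, the slim program should end up with the same variable shapes as the hand-written one. The sketch below works those shapes out in plain Python and counts the trainable parameters. The helper names are made up for this illustration, and the shapes assume the [height, width, in_channels, out_channels] kernel layout used in the first listing.

```python
from functools import reduce
from operator import mul

def conv2d_variable_shapes(in_channels, num_outputs, kernel_size):
    """Weight and bias shapes of a 2-D convolution layer."""
    kernel_h, kernel_w = kernel_size
    return [kernel_h, kernel_w, in_channels, num_outputs], [num_outputs]

def fully_connected_variable_shapes(in_features, num_outputs):
    """Weight and bias shapes of a fully connected layer."""
    return [in_features, num_outputs], [num_outputs]

def num_elements(shape):
    """Number of scalars in a tensor of the given shape."""
    return reduce(mul, shape, 1)

layers = [
    ('conv1', conv2d_variable_shapes(1, 32, [5, 5])),    # x_image has 1 channel
    ('conv2', conv2d_variable_shapes(32, 64, [5, 5])),   # h_pool1 has 32 channels
    ('fc1', fully_connected_variable_shapes(7 * 7 * 64, 1024)),
    ('fc2', fully_connected_variable_shapes(1024, 10)),
]

total_params = 0
for name, (weights_shape, biases_shape) in layers:
    count = num_elements(weights_shape) + num_elements(biases_shape)
    total_params += count
    print(name, weights_shape, biases_shape, count)

print('total', total_params)  # total 3274634
```

Almost all of the parameters sit in fc1, which is why the flattened size of h_pool2 matters so much for the model size.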

One more note: by default slim.conv2d uses a stride of 1 and padding='SAME', which the following experiment confirms:

    import tensorflow as tf  
    import tensorflow.contrib.slim as slim
    x1 = tf.random_normal(shape=[1, 64, 64, 3]) 
    w = tf.fill([5, 5, 3, 64], 1.0)
    y1 = tf.nn.relu(tf.nn.conv2d(x1, w, strides=[1, 1, 1, 1], padding='SAME'))
    y2 = slim.conv2d(x1, 64, [5, 5], weights_initializer=tf.ones_initializer, padding='SAME')
    y3 = slim.conv2d(x1, 64, [5, 5], stride=[3, 3], weights_initializer=tf.ones_initializer, padding='SAME')
    y4 = slim.conv2d(x1, 64, [5, 5], stride=[3, 3], weights_initializer=tf.ones_initializer, padding='VALID')
    y5 = slim.conv2d(x1, 64, [5, 5], stride=[3, 3], weights_initializer=tf.ones_initializer)
    with tf.Session() as sess: 
        sess.run(tf.global_variables_initializer()) 
        y1_value,y2_value,y3_value,y4_value,y5_value=sess.run([y1,y2,y3,y4,y5])
    print("shapes are", y1_value.shape, y2_value.shape, y3_value.shape, y4_value.shape, y5_value.shape)
    # Output: shapes are (1, 64, 64, 64) (1, 64, 64, 64) (1, 22, 22, 64) (1, 20, 20, 64) (1, 22, 22, 64)
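
The printed shapes can be cross-checked by hand. y2 (no stride given) matches y1 (explicit stride 1), and y5 (no padding given) matches y3 ('SAME') rather than y4 ('VALID'). The sketch below reproduces the spatial sizes with TensorFlow's output-size rules, ceil(input / stride) for 'SAME' and ceil((input - kernel + 1) / stride) for 'VALID'. The function name `conv_output_size` is made up for this illustration.

```python
import math

def conv_output_size(input_size, kernel_size, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    if padding == 'SAME':
        return math.ceil(input_size / stride)
    if padding == 'VALID':
        return math.ceil((input_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

# Same settings as y1/y2, y3/y5 and y4 in the experiment: 64x64 input, 5x5 kernel.
size_stride1_same = conv_output_size(64, 5, stride=1, padding='SAME')
size_stride3_same = conv_output_size(64, 5, stride=3, padding='SAME')
size_stride3_valid = conv_output_size(64, 5, stride=3, padding='VALID')

print(size_stride1_same, size_stride3_same, size_stride3_valid)  # 64 22 20
```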

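Some readers may feel that the slim features used so far offer no great advantage, since the same effect can be had by defining functions with default arguments. So let us look at what is arguably the most attractive part of slim: the repeat and stack operations.
repeat handles the case where the same op is applied several times with identical arguments. Suppose there are three identical consecutive convolutional layers: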

    net = slim.conv2d(net, 256, [3, 3], scope='conv3_1')  
    net = slim.conv2d(net, 256, [3, 3], scope='conv3_2')  
    net = slim.conv2d(net, 256, [3, 3], scope='conv3_3')

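With slim's repeat operation the three calls collapse into one line (the layers are then scoped as 'conv3/conv3_1', 'conv3/conv3_2', and 'conv3/conv3_3'):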

    net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
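
What repeat does can be sketched in a few lines of plain Python. This is a simplified stand-in, not slim's implementation: the real function opens a variable scope instead of building a scope string, and `record_layer` is a made-up layer that only records how it was called.

```python
def repeat(inputs, repetitions, layer, *args, scope):
    """Toy repeat: call `layer` several times with the same arguments."""
    outputs = inputs
    for i in range(repetitions):
        # Each call gets its own numbered scope nested under `scope`.
        layer_scope = '%s/%s_%d' % (scope, scope, i + 1)
        outputs = layer(outputs, *args, scope=layer_scope)
    return outputs

def record_layer(inputs, num_outputs, kernel_size, scope):
    """Made-up layer: appends a record of the call to a list of calls."""
    return inputs + [(scope, num_outputs, kernel_size)]

calls = repeat([], 3, record_layer, 256, [3, 3], scope='conv3')
for call in calls:
    print(call)
# ('conv3/conv3_1', 256, [3, 3])
# ('conv3/conv3_2', 256, [3, 3])
# ('conv3/conv3_3', 256, [3, 3])
```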

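stack handles the case where the kernel sizes or output dimensions differ from layer to layer. Suppose there are three consecutive FC layers: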

    x = slim.fully_connected(x, 32, scope='fc/fc_1')  
    x = slim.fully_connected(x, 64, scope='fc/fc_2')  
    x = slim.fully_connected(x, 128, scope='fc/fc_3')

With stack this becomes:

    x = slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')

Now suppose there are three consecutive convolutions whose kernels differ:

    x = slim.conv2d(x, 32, [3, 3], scope='core/core_1')  
    x = slim.conv2d(x, 32, [1, 1], scope='core/core_2')  
    x = slim.conv2d(x, 64, [3, 3], scope='core/core_3')

With stack this becomes:

    x = slim.stack(x, slim.conv2d, [(32, [3, 3]), (32, [1, 1]), (64, [3, 3])], scope='core')
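
The way stack unpacks its argument list can also be sketched in plain Python. As before this is a simplified stand-in rather than slim's code, and `record_layer` is a made-up layer that only records its arguments. Each entry of the list is either a single value or a tuple of positional arguments for one layer call.

```python
def stack(inputs, layer, stack_args, scope):
    """Toy stack: call `layer` once per entry of `stack_args`."""
    outputs = inputs
    for i, layer_args in enumerate(stack_args):
        # A bare value such as 32 is treated as a one-element argument list.
        if not isinstance(layer_args, (list, tuple)):
            layer_args = [layer_args]
        layer_scope = '%s/%s_%d' % (scope, scope, i + 1)
        outputs = layer(outputs, *layer_args, scope=layer_scope)
    return outputs

def record_layer(inputs, *layer_args, scope):
    """Made-up layer: appends (scope, *args) to a list of calls."""
    return inputs + [(scope,) + tuple(layer_args)]

fc_calls = stack([], record_layer, [32, 64, 128], scope='fc')
core_calls = stack([], record_layer,
                   [(32, [3, 3]), (32, [1, 1]), (64, [3, 3])], scope='core')

print(fc_calls)
# [('fc/fc_1', 32), ('fc/fc_2', 64), ('fc/fc_3', 128)]
print(core_calls)
# [('core/core_1', 32, [3, 3]), ('core/core_2', 32, [1, 1]), ('core/core_3', 64, [3, 3])]
```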

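Convenient, isn't it? Of course, if you would rather write your own functions or classes for this, that works too. In later posts we use slim to implement VGGNet-16 and Inception V3, and the code turns out to be very concise.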

TensorFlow implementation: https://pan.baidu.com/s/14A91inmZmSC55dgDv8ZZ3Q
