TensorFlow-Slim Official Tutorial

Posted 2018-06-10 00:32:37

TensorFlow 1.9.0
The TF-Slim module is the best API in TensorFlow,
especially the arg_scope, model_variables, repeat, and stack that it introduces.

TF-slim

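TF-Slim is a lightweight library in TensorFlow for defining, training, and evaluating complex models. Components of slim can be freely mixed with other TensorFlow components, such as the native TensorFlow API and tf.contrib.learn.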

1. Importing the slim module:

import tensorflow as tf
slim = tf.contrib.slim
      
      
       

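2. Why does the slim module exist?

slim makes building, training, and evaluating models simpler:

It lets users define models compactly. This is achieved mainly through arg_scope and a large set of high-level layers and variables. These tools improve readability and maintainability, reduce the chance of errors from copying and pasting hyperparameter values, and simplify hyperparameter tuning.
It simplifies model development by providing commonly used regularizers. Many widely used computer vision models (for example VGG and AlexNet) are already implemented in slim. These models work out of the box and can be extended in various ways (for example, by adding multiple heads to different internal layers).
slim makes it easy to extend complex models and to warm-start training from existing model checkpoints.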

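3. Components of the slim module:

slim is composed of several independent modules.

arg_scope: lets users define default arguments for the operations within the scope.
data: contains slim's dataset definition, data providers, parallel_reader, and decoding utilities.
evaluation: routines for evaluating models.
layers: high-level layers for building models.
learning: routines for training models.
losses: commonly used loss functions.
metrics: commonly used evaluation metrics.
nets: definitions of popular networks (for example VGG and AlexNet).
queues: provides a context manager for easily and safely starting and closing QueueRunners.
regularizers: contains weight regularizers.
variables: provides convenience wrappers for variable creation and manipulation.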

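4. Defining models

Models can be defined succinctly by combining slim's variables, layers, and scopes. Each part is described in detail below:

4.1 Variables

Creating a variable in native TensorFlow requires either a predefined value or an initialization mechanism (for example, random sampling from a Gaussian). Furthermore, if a variable needs to be created on a specific device, that must be specified explicitly. To reduce the code needed to create variables, slim provides a set of thin wrapper functions in variables.py that make defining variables easier.

For example, to create a weights variable, initialize it with a truncated normal distribution, regularize it with an L2 loss, and place it on the CPU, only the following declaration is needed: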

weights = slim.variable('weights',
                        shape=[10, 10, 3 , 3],
                        initializer=tf.truncated_normal_initializer(stddev=0.1),
                        regularizer=slim.l2_regularizer(0.05),
                        device='/CPU:0')
      
      
       

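Note: native TensorFlow has two types of variables: regular variables and local (transient) variables. The vast majority of variables are regular variables; once created, they can be saved to disk with a saver. Local variables exist only for the duration of a session and are not saved to disk.

slim further differentiates variables by defining model variables, which represent the parameters of a model. Model variables are trained or fine-tuned during learning and are loaded from a checkpoint during evaluation or inference (for example, the variables created by slim.fully_connected and slim.conv2d). Non-model variables are all other variables that are used during training or evaluation but are not needed for inference (for example, global_step is used during training and evaluation but not during inference). Similarly, moving average variables might mirror model variables, but the moving averages are not themselves model variables.

Both model variables and regular variables are easy to create and retrieve through slim: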

# Model Variables
weights = slim.model_variable('weights',
                              shape=[10, 10, 3 , 3],
                              initializer=tf.truncated_normal_initializer(stddev=0.1),
                              regularizer=slim.l2_regularizer(0.05),
                              device='/CPU:0')
model_variables = slim.get_model_variables()

# Regular variables
my_var = slim.variable('my_var',
                       shape=[20, 1],
                       initializer=tf.zeros_initializer())
regular_variables_and_model_variables = slim.get_variables()
      
      
       

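How does this work internally? When you create a model variable through slim's layers or directly through the slim.model_variable function, slim adds the variable to the tf.GraphKeys.MODEL_VARIABLES collection. If you have your own custom layers or variable creation routine but still want slim to manage or be aware of your model variables, slim provides a convenience function for adding a model variable to that collection: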

my_model_variable = CreateViaCustomCode()

# Letting TF-Slim know about the additional variable.
slim.add_model_variable(my_model_variable)
      
      
       

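4.2 Layers

Although the set of TensorFlow operations is quite extensive, developers of neural networks typically think of models in terms of higher-level concepts such as "layers", "losses", "metrics", and "networks". A layer (for example a convolutional layer, a fully connected layer, or a BatchNorm layer) is more abstract than a single TensorFlow op and typically involves several ops.
Furthermore, a layer usually (but not always) has variables (tunable parameters) associated with it, unlike most primitive operations. For example, a convolutional layer in a neural network is composed of several low-level operations:

1. Creating the weight and bias variables
2. Convolving the weights with the input from the previous layer
3. Adding the biases to the result of the convolution
4. Applying an activation function

Using only plain TensorFlow code, this can be rather laborious: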

input = ...
with tf.name_scope('conv1_1') as scope:
  kernel = tf.Variable(tf.truncated_normal([3, 3, 64, 128], dtype=tf.float32,
                                           stddev=1e-1), name='weights')
  conv = tf.nn.conv2d(input, kernel, [1, 1, 1, 1], padding='SAME')
  biases = tf.Variable(tf.constant(0.0, shape=[128], dtype=tf.float32),
                       trainable=True, name='biases')
  bias = tf.nn.bias_add(conv, biases)
  conv1 = tf.nn.relu(bias, name=scope)
      
      
       

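To avoid duplicating this code, slim provides a number of convenient operations defined at the more abstract level of neural network layers. For example, the slim equivalent of the code above is: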

input = ...
net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')
      
      
       

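slim provides standard implementations for numerous components used to build neural networks. These include, but are not limited to, the ops in the following table: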

Layer	TF-Slim
BiasAdd	slim.bias_add
BatchNorm	slim.batch_norm
Conv2d	slim.conv2d
Conv2dInPlane	slim.conv2d_in_plane
Conv2dTranspose (Deconv)	slim.conv2d_transpose
FullyConnected	slim.fully_connected
AvgPool2D	slim.avg_pool2d
Dropout	slim.dropout
Flatten	slim.flatten
MaxPool2D	slim.max_pool2d
OneHotEncoding	slim.one_hot_encoding
SeparableConv2	slim.separable_conv2d
UnitNorm	slim.unit_norm

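slim also provides two meta-operations: repeat and stack. They are also available as tf.contrib.layers.repeat and tf.contrib.layers.stack, and they can be applied to ordinary functions as well. They allow users to perform the same operation repeatedly. For example, consider the following snippet from the VGG network, whose layers perform several convolutions in a row between pooling layers: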

net = ...
net = slim.conv2d(net, 256, [3, 3], scope='conv3_1')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_2')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')
      
      
       

One way to reduce this code duplication is a for loop:

net = ...
for i in range(3):
  net = slim.conv2d(net, 256, [3, 3], scope='conv3_%d' % (i+1))
net = slim.max_pool2d(net, [2, 2], scope='pool2')
      
      
       

Using slim.repeat makes the code above even cleaner:

net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')
      
      
       

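Note: slim.repeat not only applies the same arguments to each repeated unit, it also names the scopes of the repeated units sensibly, appending an underscore and the iteration number. Concretely, the scopes in the example above are named 'conv3/conv3_1', 'conv3/conv3_2', and 'conv3/conv3_3'.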
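
The naming rule above can be mimicked in a few lines of plain Python. This is a minimal sketch, not slim's implementation: repeat_sketch and record_layer are hypothetical names, and a "layer" here is just a function that records the scope it was given instead of building TensorFlow ops.

```python
def repeat_sketch(inputs, repetitions, layer, *args, scope="repeat", **kwargs):
    """Call `layer` `repetitions` times, feeding each output to the next call.

    Hypothetical stand-in for the behavior described for slim.repeat: the
    i-th call (1-based) receives the scope 'scope/scope_i'.
    """
    outputs = inputs
    for i in range(repetitions):
        unit_scope = "%s/%s_%d" % (scope, scope, i + 1)
        outputs = layer(outputs, *args, scope=unit_scope, **kwargs)
    return outputs


def record_layer(net, num_outputs, kernel_size, scope=None):
    """Toy 'layer': appends its scope to a list instead of building ops."""
    return net + [scope]


scopes = repeat_sketch([], 3, record_layer, 256, [3, 3], scope="conv3")
print(scopes)
```

The same arguments (256 and [3, 3]) reach every call, and only the scope changes between iterations.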

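Furthermore, slim.stack lets a caller apply the same operation repeatedly with different arguments to create a stack, or tower, of layers. slim.stack also creates a new tf.variable_scope for each operation it creates. For example, here is a simple way to create a Multi-Layer Perceptron (MLP):

# Verbose way: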
x = slim.fully_connected(x, 32, scope='fc/fc_1')
x = slim.fully_connected(x, 64, scope='fc/fc_2')
x = slim.fully_connected(x, 128, scope='fc/fc_3')

# Equivalent, TF-Slim way using slim.stack:
x = slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')
      
      
       

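In this example, slim.stack calls slim.fully_connected three times, passing the output of one call to the next. The number of hidden units, however, changes from call to call: 32, then 64, then 128. Similarly, stack can simplify a tower of multiple convolutions:

# Verbose way: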
x = slim.conv2d(x, 32, [3, 3], scope='core/core_1')
x = slim.conv2d(x, 32, [1, 1], scope='core/core_2')
x = slim.conv2d(x, 64, [3, 3], scope='core/core_3')
x = slim.conv2d(x, 64, [1, 1], scope='core/core_4')

# Using stack:
x = slim.stack(x, slim.conv2d, [(32, [3, 3]), (32, [1, 1]), (64, [3, 3]), (64, [1, 1])], scope='core')
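
For intuition, the chaining that slim.stack is described as doing can be sketched in plain Python. This is an illustration under assumptions, not the library's code: stack_sketch, toy_dense, and toy_conv are hypothetical, and the toy layers only record their scope and arguments.

```python
def stack_sketch(inputs, layer, stack_args, scope="stack"):
    """Apply `layer` once per entry of `stack_args`, chaining the outputs.

    Hypothetical stand-in for the behavior described for slim.stack: each
    entry is either one argument or a tuple/list of positional arguments.
    """
    outputs = inputs
    for i, layer_args in enumerate(stack_args):
        if not isinstance(layer_args, (list, tuple)):
            layer_args = [layer_args]
        unit_scope = "%s/%s_%d" % (scope, scope, i + 1)
        outputs = layer(outputs, *layer_args, scope=unit_scope)
    return outputs


def toy_dense(x, num_outputs, scope=None):
    """Toy fully connected layer: records (scope, num_outputs)."""
    return x + [(scope, num_outputs)]


def toy_conv(x, num_outputs, kernel_size, scope=None):
    """Toy convolution: records (scope, num_outputs, kernel_size)."""
    return x + [(scope, num_outputs, tuple(kernel_size))]


mlp = stack_sketch([], toy_dense, [32, 64, 128], scope="fc")
tower = stack_sketch([], toy_conv, [(32, [3, 3]), (32, [1, 1])], scope="core")
print(mlp)
print(tower)
```

Each entry of the argument list becomes one call, so the recorded scopes line up with the verbose versions ('fc/fc_1', 'core/core_1', and so on).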
      
      
       

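4.3 Scopes

In addition to the scope mechanisms in TensorFlow (name_scope, variable_scope), slim adds a new scoping mechanism called arg_scope. This scope lets a user specify one or more operations and a set of default arguments that will be passed to each of those operations when they are used inside the arg_scope. This functionality is best illustrated by example. Consider the following snippet: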

net = slim.conv2d(inputs, 64, [11, 11], 4, padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv1')
net = slim.conv2d(net, 128, [11, 11], padding='VALID',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv2')
net = slim.conv2d(net, 256, [11, 11], padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv3')
      
      
       

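Clearly, these three convolutional layers share many of the same hyperparameters. Two have the same padding, and all three have the same weights_initializer and weights_regularizer. The code is hard to read and contains many repeated values. One solution is to hold the default values in variables: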

padding = 'SAME'
initializer = tf.truncated_normal_initializer(stddev=0.01)
regularizer = slim.l2_regularizer(0.0005)
net = slim.conv2d(inputs, 64, [11, 11], 4,
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv1')
net = slim.conv2d(net, 128, [11, 11],
                  padding='VALID',
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv2')
net = slim.conv2d(net, 256, [11, 11],
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv3')
      
      
       

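This solution ensures that all three convolutions share exactly the same parameter values, but the code is still not as clean as it could be. With an arg_scope, we can both ensure that each layer uses the same values and simplify the code: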

with slim.arg_scope([slim.conv2d], padding='SAME',
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
  net = slim.conv2d(inputs, 64, [11, 11], scope='conv1')
  net = slim.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')
  net = slim.conv2d(net, 256, [11, 11], scope='conv3')
      
      
       

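As the example shows, arg_scope makes the code cleaner, simpler, and easier to maintain. Note that an argument value passed directly to an op inside the arg_scope overrides the scope's default. Here the default padding is set to 'SAME', but the second convolution overrides it with 'VALID'.

arg_scopes can also be nested, and a single scope can apply to multiple ops. For example: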

with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      activation_fn=tf.nn.relu,
                      weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                      weights_regularizer=slim.l2_regularizer(0.0005)):
  with slim.arg_scope([slim.conv2d], stride=1, padding='SAME'):
    net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')
    net = slim.conv2d(net, 256, [5, 5],
                      weights_initializer=tf.truncated_normal_initializer(stddev=0.03),
                      scope='conv2')
    net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc')
      
      
       

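In this example, the first arg_scope applies the same weights_initializer and weights_regularizer to the conv2d and fully_connected layers in its scope. The second arg_scope specifies additional default arguments for conv2d only.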
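
The override and nesting rules can be illustrated with a small pure-Python sketch. It is not slim's implementation: arg_scope_sketch, add_arg_scope_sketch, and the toy conv2d and fully_connected functions are hypothetical, and the sketch assumes that scoped arguments are always passed by keyword at the call site.

```python
import contextlib
import functools

# Stack of active defaults; each entry maps a function name to its kwargs.
_scope_stack = [{}]


def add_arg_scope_sketch(func):
    """Let `func` receive defaults from the innermost active scope."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        merged = dict(_scope_stack[-1].get(func.__name__, {}))
        merged.update(kwargs)  # arguments given at the call site win
        return func(*args, **merged)
    return wrapper


@contextlib.contextmanager
def arg_scope_sketch(funcs, **defaults):
    """Set default keyword arguments for `funcs` inside the with-block."""
    # Start from a copy of the enclosing scope so nested scopes accumulate.
    current = {name: dict(kw) for name, kw in _scope_stack[-1].items()}
    for func in funcs:
        current.setdefault(func.__name__, {}).update(defaults)
    _scope_stack.append(current)
    try:
        yield
    finally:
        _scope_stack.pop()


@add_arg_scope_sketch
def conv2d(inputs, num_outputs, stride=1, padding="SAME", activation="relu"):
    """Toy layer: returns the arguments it ended up with."""
    return {"stride": stride, "padding": padding, "activation": activation}


@add_arg_scope_sketch
def fully_connected(inputs, num_outputs, activation="relu"):
    """Toy layer: returns the arguments it ended up with."""
    return {"activation": activation}


with arg_scope_sketch([conv2d, fully_connected], activation="tanh"):
    with arg_scope_sketch([conv2d], padding="VALID"):
        inherited = conv2d(None, 64)
        overridden = conv2d(None, 64, padding="SAME")
        dense = fully_connected(None, 10, activation=None)
outside = conv2d(None, 64)
```

Inside the nested block, conv2d sees the defaults of both scopes, a keyword given at the call site replaces a scoped default, and once the block exits the function's own defaults apply again.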

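4.4 Working Example: Specifying the VGG16 Layers

By combining slim's variables, operations, and scopes, we can write a normally very complex network in very few lines of code. For example, the entire VGG architecture can be defined with just the following snippet: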

def vgg16(inputs):
  with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      activation_fn=tf.nn.relu,
                      weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                      weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
    net = slim.max_pool2d(net, [2, 2], scope='pool1')
    net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
    net = slim.max_pool2d(net, [2, 2], scope='pool2')
    net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
    net = slim.max_pool2d(net, [2, 2], scope='pool3')
    net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
    net = slim.max_pool2d(net, [2, 2], scope='pool4')
    net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
    net = slim.max_pool2d(net, [2, 2], scope='pool5')
    net = slim.fully_connected(net, 4096, scope='fc6')
    net = slim.dropout(net, 0.5, scope='dropout6')
    net = slim.fully_connected(net, 4096, scope='fc7')
    net = slim.dropout(net, 0.5, scope='dropout7')
    net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')
  return net
      
      
       

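5. Training Models

Training a model requires a model, a loss function, the gradient computation, and a training routine that iteratively computes the gradients of the loss with respect to the model weights and updates the weights accordingly. slim provides both common loss functions and a set of helper functions for training and evaluation.

5.1 Losses

Note: according to the official deprecation notice, the slim.losses module will be removed. Use tf.losses instead, which provides equivalent functionality (check each function's signature when migrating, since argument order can differ; for example, tf.losses.softmax_cross_entropy takes the labels first).

The loss function defines a quantity that we want to minimize. For classification problems, this is typically the cross entropy between the true distribution and the predicted probability distribution across classes. For regression problems, this is often the sum of squared differences between the predicted and true values.

Certain models, such as multi-task learning models, require several loss functions at once. In other words, the quantity ultimately minimized is the sum of those loss functions. For example, consider a model that predicts both the type of scene in an image and the depth from the camera of each pixel. This model's loss function would be the sum of the classification loss and the depth prediction loss.

slim provides an easy-to-use mechanism for defining and keeping track of loss functions through the losses module. Consider the simple case of training the VGG network: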

import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg

# Load the images and labels.
images, labels = ...

# Create the model.
predictions, _ = vgg.vgg_16(images)

# Define the loss functions and get the total loss.
loss = slim.losses.softmax_cross_entropy(predictions, labels)
      
      
       

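In this example, we start by creating the model (using slim's VGG implementation) and add the standard classification loss. Now let's turn to the case of a multi-task model that produces multiple outputs: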

# Load the images and labels.
images, scene_labels, depth_labels = ...

# Create the model.
scene_predictions, depth_predictions = CreateMultiTaskModel(images)

# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)

# The following two lines have the same effect:
total_loss = classification_loss + sum_of_squares_loss
total_loss = slim.losses.get_total_loss(add_regularization_losses=False)
      
      
       

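In this example, we have two losses, added by calling slim.losses.softmax_cross_entropy and slim.losses.sum_of_squares. We can obtain the total loss (total_loss) either by adding them together or by calling slim.losses.get_total_loss(). How does this work? When you create a loss function through slim, slim adds the loss to a special collection of loss functions. This lets you either manage the total loss manually or let slim manage it for you.

What if you have a custom loss function and want slim to manage the losses for you? loss_ops.py also has a function that adds a custom loss to slim's collection. For example: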

# Load the images and labels.
images, scene_labels, depth_labels, pose_labels = ...

# Create the model.
scene_predictions, depth_predictions, pose_predictions = CreateMultiTaskModel(images)

# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)
pose_loss = MyCustomLossFunction(pose_predictions, pose_labels)
slim.losses.add_loss(pose_loss) # Letting TF-Slim know about the additional loss.

# The following two ways to compute the total loss are equivalent:
regularization_loss = tf.add_n(slim.losses.get_regularization_losses())
total_loss1 = classification_loss + sum_of_squares_loss + pose_loss + regularization_loss

# (Regularization Loss is included in the total loss by default).
total_loss2 = slim.losses.get_total_loss()
      
      
       

In this example, we can again either produce the total loss manually or let slim know about the additional loss and let slim handle the losses.
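
The collection mechanism can be pictured with a toy registry in plain Python. This is only a sketch of the idea, with plain numbers standing in for loss tensors; LossCollectionSketch and its methods are hypothetical and are not part of slim.

```python
class LossCollectionSketch:
    """Toy registry that keeps task losses and regularization losses apart."""

    def __init__(self):
        self.losses = []
        self.regularization_losses = []

    def add_loss(self, value):
        """Register a task loss and hand it back to the caller."""
        self.losses.append(value)
        return value

    def add_regularization_loss(self, value):
        """Register a regularization loss and hand it back to the caller."""
        self.regularization_losses.append(value)
        return value

    def get_total_loss(self, add_regularization_losses=True):
        """Sum the registered losses, optionally including regularization."""
        total = sum(self.losses)
        if add_regularization_losses:
            total += sum(self.regularization_losses)
        return total


registry = LossCollectionSketch()
classification_loss = registry.add_loss(0.75)
depth_loss = registry.add_loss(0.5)
pose_loss = registry.add_loss(0.25)          # the "custom" loss
weight_decay = registry.add_regularization_loss(0.125)

# Managing the total by hand and asking the registry give the same number.
manual_total = classification_loss + depth_loss + pose_loss + weight_decay
managed_total = registry.get_total_loss()
```

Because every loss is registered at creation time, the caller does not need to keep a reference to each one in order to build the total later.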

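5.2 Training Loop

slim provides a simple but powerful set of tools for training models, found in learning.py. These include a training function that repeatedly measures the loss, computes gradients, and saves the model to disk, as well as several convenience functions for manipulating gradients. For example, once we have specified the model, the loss function, and the optimization scheme, we can call slim.learning.create_train_op and slim.learning.train to perform the optimization: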

g = tf.Graph()

# Create the model and specify the losses...
...

total_loss = slim.losses.get_total_loss()
optimizer = tf.train.GradientDescentOptimizer(learning_rate)

# create_train_op ensures that each time we ask for the loss, the update_ops
# are run and the gradients being computed are applied too.
train_op = slim.learning.create_train_op(total_loss, optimizer)
logdir = ... # Where checkpoints are stored.

slim.learning.train(
    train_op,
    logdir,
    number_of_steps=1000,
    save_summaries_secs=300,
    save_interval_secs=600)
      
      
       

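In this example, the train_op passed to slim.learning.train does two things: (a) it computes the loss and (b) it applies the gradient step. logdir specifies the directory where the checkpoints and event files are stored. The number of gradient steps can be limited to any number; here we ask for only 1000 steps. save_summaries_secs=300 means summaries are computed every 5 minutes, and save_interval_secs=600 means a model checkpoint is saved every 10 minutes.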
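
The shape of such a loop can be sketched without TensorFlow. The code below is a simplified illustration, not slim.learning.train: train_sketch is a hypothetical helper, the "model" is a single number fitted to a quadratic loss, and checkpoints are triggered by step count rather than by elapsed seconds.

```python
def train_sketch(train_step, number_of_steps, save_every_steps, save_fn):
    """Run `train_step` repeatedly and checkpoint every `save_every_steps`.

    Simplification: the tutorial describes time-based intervals
    (save_interval_secs); this sketch counts steps instead.
    """
    loss = None
    for step in range(1, number_of_steps + 1):
        loss = train_step()
        if step % save_every_steps == 0:
            save_fn(step)
    return loss


state = {"w": 0.0}
learning_rate = 0.1


def train_step():
    """Compute the loss (w - 3)^2, then apply one gradient descent update."""
    w = state["w"]
    loss = (w - 3.0) ** 2
    gradient = 2.0 * (w - 3.0)
    state["w"] = w - learning_rate * gradient
    return loss


saved_steps = []  # stands in for checkpoints written to logdir
final_loss = train_sketch(train_step, number_of_steps=100,
                          save_every_steps=25, save_fn=saved_steps.append)
```

As with the train_op described above, a single call both reports the loss and applies the update, so the outer loop only has to count steps and decide when to save.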

5.3 Working Example: Training the VGG16 Model

To illustrate how slim is used, let's look at training the VGG network:

import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg

...

train_log_dir = ...
if not tf.gfile.Exists(train_log_dir):
  tf.gfile.MakeDirs(train_log_dir)

with tf.Graph().as_default():
  # Set up the data loading:
  images, labels = ...

  # Define the model:
  # vgg_16 returns a (logits, end_points) pair.
  predictions, _ = vgg.vgg_16(images, is_training=True)

  # Specify the loss function:
  slim.losses.softmax_cross_entropy(predictions, labels)

  total_loss = slim.losses.get_total_loss()
  tf.summary.scalar('losses/total_loss', total_loss)

  # Specify the optimization scheme:
  optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001)

  # create_train_op that ensures that when we evaluate it to get the loss,
  # the update_ops are done and the gradient updates are computed.
  train_tensor = slim.learning.create_train_op(total_loss, optimizer)

  # Actually runs training.
  slim.learning.train(train_tensor, train_log_dir)
      
      
       

6. Fine-Tuning Existing Models

6.1 Brief Recap on Restoring Variables from a Checkpoint

After a model has been trained, tf.train.Saver() can restore its Variables from a given checkpoint. In many cases, tf.train.Saver() provides a simple way to restore all of the variables or just a few of them.

# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to restore all the variables.
restorer = tf.train.Saver()

# Add ops to restore some variables.
restorer = tf.train.Saver([v1, v2])

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
  # Restore variables from disk.
  restorer.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")
  # Do some work with the model
  ...
      
      
       

6.2 Partially Restoring Models

It is often desirable to fine-tune a pre-trained network on an entirely new dataset or even a new task. In these situations, slim's helper functions can select a subset of variables to restore:

# Create some variables.
v1 = slim.variable(name="v1", ...)
v2 = slim.variable(name="nested/v2", ...)
...

# Get list of variables to restore (which contains only 'v2'). These are all
# equivalent methods:
variables_to_restore = slim.get_variables_by_name("v2")
# or
variables_to_restore = slim.get_variables_by_suffix("2")
# or
variables_to_restore = slim.get_variables(scope="nested")
# or
variables_to_restore = slim.get_variables_to_restore(include=["nested"])
# or
variables_to_restore = slim.get_variables_to_restore(exclude=["v1"])

# Create the saver which will be used to restore the variables.
restorer = tf.train.Saver(variables_to_restore)

with tf.Session() as sess:
  # Restore variables from disk.
  restorer.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")
  # Do some work with the model
  ...
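
To make the include/exclude idea concrete, here is a plain-Python sketch that filters variable names by scope prefix. It is a simplification under stated assumptions, not slim's implementation: select_variables_sketch is a hypothetical helper, it works on name strings rather than variables, and it treats a scope as matching a name that equals the scope or starts with the scope followed by '/'.

```python
def select_variables_sketch(names, include=None, exclude=None):
    """Pick variable names by scope prefix.

    `include=None` means "start from every name"; `exclude` then removes
    any name that falls under one of the excluded scopes.
    """
    def in_scope(name, scope):
        scope = scope.rstrip("/")
        return name == scope or name.startswith(scope + "/")

    if include is None:
        selected = list(names)
    else:
        selected = [n for n in names if any(in_scope(n, s) for s in include)]
    if exclude:
        selected = [n for n in selected
                    if not any(in_scope(n, s) for s in exclude)]
    return selected


names = ["v1", "nested/v2"]
by_include = select_variables_sketch(names, include=["nested"])
by_exclude = select_variables_sketch(names, exclude=["v1"])
```

Both calls leave only 'nested/v2', which mirrors the claim in the example above that the listed selection methods are equivalent for these two variables.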
      
      
       

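6.3 Restoring models with different variable names

When restoring variables from a checkpoint, the Saver locates the variable names in the checkpoint file and maps them to variables in the current graph. Above, we created a saver by passing it the list of variables to restore. In that case, the names to look up in the checkpoint were obtained implicitly from each variable's var.op.name.

This works well when the variable names in the checkpoint file match those in the current graph. Sometimes, however, the variables in the checkpoint have different names from those in the graph. In that case, we must give the Saver a dictionary that maps each checkpoint variable name to a graph variable. In the example below, a simple function supplies the variable names used in the checkpoint: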

# Assuming that 'conv1/weights' should be restored from 'vgg16/conv1/weights'
def name_in_checkpoint(var):
  return 'vgg16/' + var.op.name

# Alternatively, assuming that 'conv1/weights' and 'conv1/bias' should be restored from 'conv1/params1' and 'conv1/params2'
def name_in_checkpoint(var):
  if "weights" in var.op.name:
    return var.op.name.replace("weights", "params1")
  if "bias" in var.op.name:
    return var.op.name.replace("bias", "params2")

variables_to_restore = slim.get_model_variables()
variables_to_restore = {name_in_checkpoint(var):var for var in variables_to_restore}
restorer = tf.train.Saver(variables_to_restore)

with tf.Session() as sess:
  # Restore variables from disk.
  restorer.restore(sess, "/tmp/model.ckpt")
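
The dictionary handed to the Saver is easier to picture with concrete keys. The sketch below builds the same kind of mapping over plain name strings; build_restore_map and add_prefix are hypothetical helpers, and strings stand in for the graph variables that the real dictionary would hold as values.

```python
def build_restore_map(graph_names, name_in_checkpoint):
    """Map each checkpoint name to the graph variable name it should fill."""
    restore_map = {}
    for graph_name in graph_names:
        ckpt_name = name_in_checkpoint(graph_name)
        if ckpt_name in restore_map:
            # Two graph variables must not compete for one checkpoint entry.
            raise ValueError("two graph variables map to %r" % ckpt_name)
        restore_map[ckpt_name] = graph_name
    return restore_map


def add_prefix(graph_name):
    """Assumed naming rule: the checkpoint prefixes every name with 'vgg16/'."""
    return "vgg16/" + graph_name


graph_names = ["conv1/weights", "conv1/biases"]
restore_map = build_restore_map(graph_names, add_prefix)
print(restore_map)
```

The keys are the names as stored in the checkpoint and the values identify the variables in the current graph, which is the direction the Saver expects.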
      
      
       

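6.4 Fine-Tuning a Model on a different task

Consider the case where we have a pre-trained VGG16 model. The model was trained on the ImageNet dataset, which has 1000 classes. However, we would like to apply it to the Pascal VOC dataset, which has only 20 classes. To do so, we can initialize our new model with the parameters of the pre-trained model, excluding the final layer: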

# Load the Pascal VOC data
image, label = MyPascalVocDataLoader(...)
images, labels = tf.train.batch([image, label], batch_size=32)

# Create the model
predictions, _ = vgg.vgg_16(images)

train_op = slim.learning.create_train_op(...)

# Specify where the Model, trained on ImageNet, was saved.
model_path = '/path/to/pre_trained_on_imagenet.checkpoint'

# Specify where the new model will live:
log_dir = '/path/to/my_pascal_model_dir/'

# Restore only the convolutional layers:
variables_to_restore = slim.get_variables_to_restore(exclude=['fc6', 'fc7', 'fc8'])
init_fn = slim.assign_from_checkpoint_fn(model_path, variables_to_restore)

# Start training.
slim.learning.train(train_op, log_dir, init_fn=init_fn)
      
      
       

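7. Evaluating Models

Once we have trained a model (or even while the model is still training), we want to see how well it performs in practice. This is done by choosing a set of evaluation metrics, which grade the model's performance, and writing evaluation code that actually loads the data, performs inference, compares the results to the ground truth, and records the evaluation scores. This step may be performed once or repeated periodically.

7.1 Metrics

We define a metric to be a performance measure that is not a loss function (losses are the quantities optimized during training) but that still matters for evaluating the model. For example, we might want to minimize log loss, while the metrics we care about might be F1 score (test accuracy) or the IoU (Intersection over Union) score, which are not differentiable and therefore cannot be used as losses.

slim provides a set of metric operations that make evaluating models easy. Conceptually, computing the value of a metric can be divided into three parts:

Initialization: initialize the variables used to compute the metric.
Aggregation: perform the operations (sums, etc.) needed to compute the metric.
Finalization: (optional) perform any final operation to compute the metric value, for example computing means, mins, or maxes.

For example, to compute mean_absolute_error, two variables, count and total, are initialized to zero. During aggregation, we observe some predictions and labels, compute their absolute differences, and add them to total. Each time we observe another value, count is incremented. Finally, during finalization, total is divided by count to obtain the mean absolute error.

The following example demonstrates the API for declaring metrics. Because metrics are usually evaluated on a test set, which is different from the training set (on which the loss is usually computed), we'll assume we're using test data: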

images, labels = LoadTestData(...)
predictions = MyModel(images)

mae_value_op, mae_update_op = slim.metrics.streaming_mean_absolute_error(predictions, labels)
mre_value_op, mre_update_op = slim.metrics.streaming_mean_relative_error(predictions, labels)
pl_value_op, pl_update_op = slim.metrics.percentage_less(mean_relative_errors, 0.3)
      
      
       

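As the example shows, creating a metric returns two values: a value_op and an update_op. The value_op is an idempotent operation that returns the current value of the metric. The update_op performs the aggregation step described above and also returns the value of the metric.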
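
The split between the two ops follows directly from the count/total bookkeeping, as this plain-Python sketch of a streaming mean absolute error suggests. It is an illustration only: StreamingMeanAbsoluteErrorSketch is a hypothetical class, with update() playing the role of the update_op and value() the role of the value_op.

```python
class StreamingMeanAbsoluteErrorSketch:
    """Toy streaming metric that accumulates over batches."""

    def __init__(self):
        # Initialization: both accumulators start at zero.
        self.total = 0.0
        self.count = 0

    def update(self, predictions, labels):
        """Aggregation: fold one batch into the running sums."""
        for prediction, label in zip(predictions, labels):
            self.total += abs(prediction - label)
            self.count += 1
        return self.value()

    def value(self):
        """Finalization: read the current mean without changing any state."""
        return self.total / self.count if self.count else 0.0


metric = StreamingMeanAbsoluteErrorSketch()
after_first = metric.update([1.0, 2.0], [1.5, 4.0])   # errors 0.5 and 2.0
after_second = metric.update([3.0, 0.0], [3.0, 1.5])  # errors 0.0 and 1.5
```

Calling value() any number of times returns the same result, while each update() call changes the accumulated state, which is why the two are kept separate.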

Keeping track of every value_op and update_op can be laborious. To deal with this, slim provides two convenience functions:


# Aggregates the value and update ops in two lists:
value_ops, update_ops = slim.metrics.aggregate_metrics(
    slim.metrics.streaming_mean_absolute_error(predictions, labels),
    slim.metrics.streaming_mean_squared_error(predictions, labels))

# Aggregates the value and update ops in two dictionaries:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})

      
      
       

7.2 Working example: Tracking Multiple Metrics

Putting it all together:

import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg


# Load the data
images, labels = load_data(...)

# Define the network
predictions, _ = vgg.vgg_16(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})

# Evaluate the model using 1000 batches of data:
num_batches = 1000

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  sess.run(tf.local_variables_initializer())

  for batch_id in range(num_batches):
    sess.run(list(names_to_updates.values()))

  metric_values = sess.run(list(names_to_values.values()))
  for metric, value in zip(names_to_values.keys(), metric_values):
    print('Metric %s has value: %f' % (metric, value))
      
      
       

Note: metric_ops.py can be used on its own, without layers.py or loss_ops.py.

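7.3 Evaluation Loop

slim provides an evaluation module (evaluation.py), which contains helper functions for writing model evaluation scripts using the metrics defined in metric_ops.py. These include functions for running evaluations periodically, computing metrics over batches of data, and printing and summarizing metric results. For example:

import math
import tensorflow as tf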

slim = tf.contrib.slim

# Load the data
images, labels = load_data(...)

# Define the network
predictions = MyModel(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'accuracy': slim.metrics.accuracy(predictions, labels),
    'precision': slim.metrics.precision(predictions, labels),
    'recall': slim.metrics.recall(predictions, labels),
})

# Create the summary ops such that they also print out to std output:
summary_ops = []
for metric_name, metric_value in names_to_values.items():
  op = tf.summary.scalar(metric_name, metric_value)
  op = tf.Print(op, [metric_value], metric_name)
  summary_ops.append(op)

num_examples = 10000
batch_size = 32
num_batches = math.ceil(num_examples / float(batch_size))

# Setup the global step.
slim.get_or_create_global_step()

checkpoint_dir = ... # Where the model checkpoints are read from.
log_dir = ... # Where the summaries are stored.
eval_interval_secs = ... # How often to run the evaluation.
slim.evaluation.evaluation_loop(
    'local',
    checkpoint_dir,
    log_dir,
    num_evals=num_batches,
    eval_op=list(names_to_updates.values()),
    summary_op=tf.summary.merge(summary_ops),
    eval_interval_secs=eval_interval_secs)
      
      
       

8. Authors

Sergio Guadarrama and Nathan Silberman

9. References:

Official documentation for the tensorflow.contrib.slim module: README.md

</div>

TensorFlow 1.9.0
The TF-slim module is the best API in TensorFlow,
especially the arg_scope, model_variables, repeat, and stack that it introduces.

TensorFlow-Slim Module Official Tutorial
TF-slim

1. Importing the slim module:

import tensorflow as tf
slim = tf.contrib.slim  # tf.contrib is only available in TensorFlow 1.x

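2. Why the slim module exists

The slim module makes building, training, and evaluating models simpler:

It lets users define models in compact code, mainly through arg_scope and a large set of high-level layers and variables. These tools improve readability and maintainability, reduce the chance of errors from copying and pasting hyperparameter values, and simplify hyperparameter tuning.
It simplifies model development by providing commonly used regularizers. Several widely used computer vision models (e.g. VGG, AlexNet) are already implemented in slim. These models work out of the box and can be extended in several ways (for example, by adding multiple heads to different internal layers).
slim makes it easy to extend complex models and to warm-start training from existing model checkpoints.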

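3. Components of the slim module:

slim is composed of several parts that are designed to exist independently.

arg_scope: lets users define default arguments for specific operations within the scope.
data: contains slim's dataset definition, data providers, parallel_reader, and decoding utilities.
evaluation: routines for evaluating models.
layers: high-level layers for building models.
learning: routines for training models.
losses: commonly used loss functions.
metrics: commonly used evaluation metrics.
nets: definitions of popular networks (e.g. VGG, AlexNet).
queues: provides a context manager for easily and safely starting and closing QueueRunners.
regularizers: contains weight regularizers.
variables: provides convenience wrappers for variable creation and manipulation.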

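4. Defining Models

Models can be defined succinctly by combining slim's variables, layers, and scopes. Each of these elements is described below:

4.1 Variables

For example, to create a weight variable, initialize it from a truncated normal distribution, regularize it with an L2 loss, and place it on the CPU, the following declaration is all that is needed:

weights = slim.variable('weights',
                        shape=[10, 10, 3, 3],
                        initializer=tf.truncated_normal_initializer(stddev=0.1),
                        regularizer=slim.l2_regularizer(0.05),
                        device='/CPU:0')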

Creating and retrieving both model variables and regular variables is straightforward with slim:

# Model Variables
weights = slim.model_variable('weights',
                              shape=[10, 10, 3, 3],
                              initializer=tf.truncated_normal_initializer(stddev=0.1),
                              regularizer=slim.l2_regularizer(0.05),
                              device='/CPU:0')
model_variables = slim.get_model_variables()

# Regular variables
my_var = slim.variable('my_var',
                       shape=[20, 1],
                       initializer=tf.zeros_initializer())
regular_variables_and_model_variables = slim.get_variables()

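Variables created by your own layers or variable-creation routines can also be handed to slim so that it manages them as model variables:

# CreateViaCustomCode is a placeholder for your own variable-creation code.
my_model_variable = CreateViaCustomCode()

# Letting TF-Slim know about the additional variable.
slim.add_model_variable(my_model_variable)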

4.2 Layers

A convolutional layer in a neural network is typically made up of several low-level operations:

1. Create the weight and bias variables
2. Convolve the weights with the input (which comes from the previous layer)
3. Add the biases to the convolution result
4. Apply an activation function

Using only plain TensorFlow code, this can be rather laborious:

input = ...
with tf.name_scope('conv1_1') as scope:
  kernel = tf.Variable(tf.truncated_normal([3, 3, 64, 128], dtype=tf.float32,
                                           stddev=1e-1), name='weights')
  conv = tf.nn.conv2d(input, kernel, [1, 1, 1, 1], padding='SAME')
  biases = tf.Variable(tf.constant(0.0, shape=[128], dtype=tf.float32),
                       trainable=True, name='biases')
  bias = tf.nn.bias_add(conv, biases)
  conv1 = tf.nn.relu(bias, name=scope)

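To avoid this code duplication, slim provides a number of convenient higher-level ops for neural network layers. For example, the slim equivalent of the code above is:

input = ...
net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')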

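slim provides standard implementations for many of the components used to build neural networks. These include, but are not limited to, the ops in the following table: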

Layer	TF-Slim
BiasAdd	slim.bias_add
BatchNorm	slim.batch_norm
Conv2d	slim.conv2d
Conv2dInPlane	slim.conv2d_in_plane
Conv2dTranspose (Deconv)	slim.conv2d_transpose
FullyConnected	slim.fully_connected
AvgPool2D	slim.avg_pool2d
Dropout	slim.dropout
Flatten	slim.flatten
MaxPool2D	slim.max_pool2d
OneHotEncoding	slim.one_hot_encoding
SeparableConv2D	slim.separable_conv2d
UnitNorm	slim.unit_norm

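slim also provides two meta-operations, repeat and stack, for applying the same operation repeatedly. Consider the following snippet from the VGG network, whose layers perform several convolutions in a row between pooling layers:

net = ...
net = slim.conv2d(net, 256, [3, 3], scope='conv3_1')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_2')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')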

One way to reduce this code duplication is a for loop:

net = ...
for i in range(3):
  net = slim.conv2d(net, 256, [3, 3], scope='conv3_%d' % (i+1))
net = slim.max_pool2d(net, [2, 2], scope='pool2')

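Using slim.repeat makes the code above cleaner. It also unrolls the scopes, so the three layers are named 'conv3/conv3_1', 'conv3/conv3_2' and 'conv3/conv3_3':

net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')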
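The behaviour of a repeat-style helper can be pictured with a small pure-Python sketch. This is not slim's implementation: `repeat` and `fake_conv2d` below are hypothetical stand-ins that only record how a layer would be called, assuming the scope naming described above.

```python
def repeat(inputs, repetitions, layer, *args, **kwargs):
    """Apply `layer` to its own output `repetitions` times, numbering the scopes."""
    scope = kwargs.pop('scope', 'repeat')
    outputs = inputs
    for i in range(repetitions):
        # Each call gets its own scope: <scope>/<scope>_1, <scope>/<scope>_2, ...
        kwargs['scope'] = '%s/%s_%d' % (scope, scope, i + 1)
        outputs = layer(outputs, *args, **kwargs)
    return outputs


def fake_conv2d(inputs, num_outputs, kernel_size, scope=None):
    """Hypothetical stand-in for a layer: it only records its arguments."""
    return inputs + [(scope, num_outputs, tuple(kernel_size))]


calls = repeat([], 3, fake_conv2d, 256, [3, 3], scope='conv3')
for call in calls:
    print(call)
```

Each call receives the output of the previous one, which is the same chaining the for loop performs by hand.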

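slim.stack, in turn, applies the same operation repeatedly with different arguments, building a stack (tower) of layers, and creates a new scope for each operation. For example, a simple way to build a multi-layer perceptron (MLP):

# Verbose way:
x = slim.fully_connected(x, 32, scope='fc/fc_1')
x = slim.fully_connected(x, 64, scope='fc/fc_2')
x = slim.fully_connected(x, 128, scope='fc/fc_3')

# Equivalent, TF-Slim way using slim.stack:
x = slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')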

To stack several convolutions whose arguments differ, pass a list of argument tuples:

# Verbose way:
x = slim.conv2d(x, 32, [3, 3], scope='core/core_1')
x = slim.conv2d(x, 32, [1, 1], scope='core/core_2')
x = slim.conv2d(x, 64, [3, 3], scope='core/core_3')
x = slim.conv2d(x, 64, [1, 1], scope='core/core_4')

# Using stack:
x = slim.stack(x, slim.conv2d, [(32, [3, 3]), (32, [1, 1]), (64, [3, 3]), (64, [1, 1])], scope='core')
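As a rough illustration of how a stack-style helper dispatches its argument list, here is a pure-Python sketch. It is not slim's code: `stack` and `fake_layer` are hypothetical names, and the layer only records the arguments it receives.

```python
def stack(inputs, layer, stack_args, scope='stack'):
    """Call `layer` once per entry in `stack_args`, chaining the outputs."""
    outputs = inputs
    for i, layer_args in enumerate(stack_args):
        # A bare value such as 32 is treated as a one-element argument list.
        if not isinstance(layer_args, (list, tuple)):
            layer_args = [layer_args]
        layer_scope = '%s/%s_%d' % (scope, scope, i + 1)
        outputs = layer(outputs, *layer_args, scope=layer_scope)
    return outputs


def fake_layer(inputs, *args, **kwargs):
    """Hypothetical stand-in for a layer: it only records its arguments."""
    return inputs + [(kwargs.get('scope'),) + args]


fc_calls = stack([], fake_layer, [32, 64, 128], scope='fc')
conv_calls = stack([], fake_layer,
                   [(32, [3, 3]), (32, [1, 1]), (64, [3, 3]), (64, [1, 1])],
                   scope='core')
print(fc_calls)
print(conv_calls)
```

The only difference from a repeat-style helper is that every call gets its own arguments instead of sharing one set.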

4.3 Scopes

In addition to TensorFlow's own scope mechanisms (name_scope, variable_scope), slim adds a new one called arg_scope. It lets a user name one or more operations and a set of default arguments that are passed to each of those operations inside the scope. To see why this is useful, consider the following code:

net = slim.conv2d(inputs, 64, [11, 11], 4, padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv1')
net = slim.conv2d(net, 128, [11, 11], padding='VALID',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv2')
net = slim.conv2d(net, 256, [11, 11], padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv3')

These three convolution layers share many hyperparameters: two use the same padding, and all three use the same weights_initializer and weights_regularizer. Such code is hard to read and repeats values that should be factored out. One solution is to hold the default values in variables:

padding = 'SAME'
initializer = tf.truncated_normal_initializer(stddev=0.01)
regularizer = slim.l2_regularizer(0.0005)
net = slim.conv2d(inputs, 64, [11, 11], 4,
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv1')
net = slim.conv2d(net, 128, [11, 11],
                  padding='VALID',
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv2')
net = slim.conv2d(net, 256, [11, 11],
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv3')

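This guarantees that all three layers share the same values, but it does not shorten the code. With arg_scope, the same values are shared and the code becomes shorter as well:

with slim.arg_scope([slim.conv2d], padding='SAME',
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
  net = slim.conv2d(inputs, 64, [11, 11], scope='conv1')
  # Arguments passed explicitly override the defaults set by the scope.
  net = slim.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')
  net = slim.conv2d(net, 256, [11, 11], scope='conv3')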

arg_scopes can also be nested, and a single scope can cover several ops. For example:

with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    activation_fn=tf.nn.relu,
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
  with slim.arg_scope([slim.conv2d], stride=1, padding='SAME'):
    # Overrides are passed by keyword so they replace the scope defaults.
    net = slim.conv2d(inputs, 64, [11, 11], stride=4, padding='VALID', scope='conv1')
    net = slim.conv2d(net, 256, [5, 5],
                      weights_initializer=tf.truncated_normal_initializer(stddev=0.03),
                      scope='conv2')
    net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc')

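In this example, the first arg_scope applies the same activation_fn, weights_initializer and weights_regularizer to the conv2d and fully_connected layers. The second arg_scope specifies additional default arguments for conv2d only.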

4.4 Working Example: Specifying the VGG16 Layers

By combining slim's variables, operations, and scopes, very complex networks can be written in very few lines of code. For example, the entire VGG architecture can be defined with just the following snippet:

def vgg16(inputs):
  with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      activation_fn=tf.nn.relu,
                      weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                      weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
    net = slim.max_pool2d(net, [2, 2], scope='pool1')
    net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
    net = slim.max_pool2d(net, [2, 2], scope='pool2')
    net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
    net = slim.max_pool2d(net, [2, 2], scope='pool3')
    net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
    net = slim.max_pool2d(net, [2, 2], scope='pool4')
    net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
    net = slim.max_pool2d(net, [2, 2], scope='pool5')
    net = slim.fully_connected(net, 4096, scope='fc6')
    net = slim.dropout(net, 0.5, scope='dropout6')
    net = slim.fully_connected(net, 4096, scope='fc7')
    net = slim.dropout(net, 0.5, scope='dropout7')
    net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')
  return net
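To get a feel for what the five convolution blocks above add up to, the following pure-Python sketch traces them without TensorFlow. It assumes a 224x224 RGB input (an assumption; the snippet above does not fix the input size), 3x3 convolutions that preserve spatial size, and 2x2 pooling that halves it. `CONV_BLOCKS` and `trace_vgg16_convs` are names made up for this illustration.

```python
# (scope, repetitions, output channels) for the five conv blocks of VGG16.
CONV_BLOCKS = [('conv1', 2, 64), ('conv2', 2, 128), ('conv3', 3, 256),
               ('conv4', 3, 512), ('conv5', 3, 512)]


def trace_vgg16_convs(height=224, width=224, in_channels=3):
    """Return per-layer parameter counts and the shape after the last pool."""
    layers = []
    channels = in_channels
    for scope, repetitions, out_channels in CONV_BLOCKS:
        for i in range(repetitions):
            # 3x3 kernel weights plus one bias per output channel.
            params = 3 * 3 * channels * out_channels + out_channels
            layers.append(('%s/%s_%d' % (scope, scope, i + 1), params))
            channels = out_channels
        # Each block ends with a 2x2 max-pool that halves height and width.
        height, width = height // 2, width // 2
    return layers, (height, width, channels)


layers, final_shape = trace_vgg16_convs()
total_conv_params = sum(params for _, params in layers)
print(len(layers), total_conv_params, final_shape)
```

Under these assumptions the trace lists 13 convolution layers with 14,714,688 parameters in total and a 7x7x512 feature map after the last pooling layer.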

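5. Training Models

5.1 Losses

According to the official deprecation notice, the slim.losses module is slated for removal; use tf.losses instead. It offers equivalent functionality, although some function names and argument orders differ (for example, tf.losses.softmax_cross_entropy takes the labels first).

slim's losses module provides an easy-to-use mechanism for defining and keeping track of loss functions. Consider the simple case of training the VGG network:

import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg

# Load the images and labels.
images, labels = ...

# Create the model.
predictions, _ = vgg.vgg_16(images)

# Define the loss functions and get the total loss.
loss = slim.losses.softmax_cross_entropy(predictions, labels)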

A model may also have several outputs, each with its own loss. In that case slim keeps track of every loss that is created through its loss functions:

# Load the images and labels.
images, scene_labels, depth_labels = ...

# Create the model.
scene_predictions, depth_predictions = CreateMultiTaskModel(images)

# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)

# The following two lines have the same effect:
total_loss = classification_loss + sum_of_squares_loss
total_loss = slim.losses.get_total_loss(add_regularization_losses=False)

What if you have a custom loss and still want slim to manage the losses? loss_ops.py also has a function that adds a custom loss to slim's collection. For example:

# Load the images and labels.
images, scene_labels, depth_labels, pose_labels = ...

# Create the model.
scene_predictions, depth_predictions, pose_predictions = CreateMultiTaskModel(images)

# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)
pose_loss = MyCustomLossFunction(pose_predictions, pose_labels)
slim.losses.add_loss(pose_loss) # Letting TF-Slim know about the additional loss.

# The following two ways to compute the total loss are equivalent:
regularization_loss = tf.add_n(slim.losses.get_regularization_losses())
total_loss1 = classification_loss + sum_of_squares_loss + pose_loss + regularization_loss

# (Regularization Loss is included in the total loss by default).
total_loss2 = slim.losses.get_total_loss()

In this example, the total loss can either be assembled by hand or be computed by slim, once slim has been told about the additional loss.
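The bookkeeping described above can be pictured as a small registry of loss values. The pure-Python sketch below is only an analogy, with plain numbers in place of tensors; `LossCollection` and its methods are hypothetical names, not part of slim.

```python
class LossCollection(object):
    """Toy registry that tracks task losses and regularization losses."""

    def __init__(self):
        self.losses = []
        self.regularization_losses = []

    def add_loss(self, value):
        """Register a task loss and hand it back to the caller."""
        self.losses.append(value)
        return value

    def add_regularization_loss(self, value):
        """Register a regularization term separately from the task losses."""
        self.regularization_losses.append(value)
        return value

    def get_total_loss(self, add_regularization_losses=True):
        """Sum the task losses, optionally including regularization."""
        total = sum(self.losses)
        if add_regularization_losses:
            total += sum(self.regularization_losses)
        return total


collection = LossCollection()
classification_loss = collection.add_loss(0.5)
sum_of_squares_loss = collection.add_loss(0.25)
pose_loss = collection.add_loss(0.125)  # a custom loss, registered by hand
regularization_loss = collection.add_regularization_loss(0.0625)

manual_total = (classification_loss + sum_of_squares_loss
                + pose_loss + regularization_loss)
print(manual_total, collection.get_total_loss())
print(collection.get_total_loss(add_regularization_losses=False))
```

The manual sum and the registry's total agree only because the custom loss was registered; an unregistered loss would be missing from the registry's total.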

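5.2 Training Loop

slim provides a simple but powerful set of training tools in learning.py, including a train function that repeatedly measures the loss, computes gradients, and saves the model to disk. For example, once the model, loss function, and optimizer are specified, slim.learning.create_train_op and slim.learning.train perform the optimization:

g = tf.Graph()

# Create the model and specify the losses...
...

total_loss = slim.losses.get_total_loss()
optimizer = tf.train.GradientDescentOptimizer(learning_rate)

# create_train_op ensures that each time we ask for the loss, the update_ops
# are run and the gradients being computed are applied too.
train_op = slim.learning.create_train_op(total_loss, optimizer)
logdir = ... # Where checkpoints are stored.

slim.learning.train(
    train_op,
    logdir,
    number_of_steps=1000,     # Stop after 1000 gradient steps.
    save_summaries_secs=300,  # Write summaries every 5 minutes.
    save_interval_secs=600)   # Save a checkpoint every 10 minutes.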

5.3 Working Example: Training the VGG16 Model

To illustrate how slim is used, consider training the VGG network:

import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg

...

train_log_dir = ...
if not tf.gfile.Exists(train_log_dir):
  tf.gfile.MakeDirs(train_log_dir)

with tf.Graph().as_default():
  # Set up the data loading:
  images, labels = ...

  # Define the model (vgg_16 returns the logits and a dict of end points):
  predictions, _ = vgg.vgg_16(images, is_training=True)

  # Specify the loss function:
  slim.losses.softmax_cross_entropy(predictions, labels)

  total_loss = slim.losses.get_total_loss()
  tf.summary.scalar('losses/total_loss', total_loss)

  # Specify the optimization scheme:
  optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001)

  # create_train_op that ensures that when we evaluate it to get the loss,
  # the update_ops are done and the gradient updates are computed.
  train_tensor = slim.learning.create_train_op(total_loss, optimizer)

  # Actually runs training.
  slim.learning.train(train_tensor, train_log_dir)

6. Fine-Tuning Existing Models

6.1 Brief Recap on Restoring Variables from a Checkpoint

After a model has been trained, it can be restored with tf.train.Saver(), which restores variables from a given checkpoint. In many cases, tf.train.Saver() is a simple way to restore all or just some of the variables:

# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to restore all the variables.
restorer = tf.train.Saver()

# Add ops to restore some variables.
restorer = tf.train.Saver([v1, v2])

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
  # Restore variables from disk.
  restorer.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")
  # Do some work with the model
  ...

6.2 Partially Restoring Models

It is often useful to fine-tune a pre-trained model on a new dataset or even a new task. In such cases, slim's helper functions can select the subset of variables to restore:

# Create some variables.
v1 = slim.variable("v1", ...)
v2 = slim.variable("nested/v2", ...)
...

# Get list of variables to restore (which contains only 'v2'). These are all
# equivalent methods:
variables_to_restore = slim.get_variables_by_name("v2")
# or
variables_to_restore = slim.get_variables_by_suffix("2")
# or
variables_to_restore = slim.get_variables(scope="nested")
# or
variables_to_restore = slim.get_variables_to_restore(include=["nested"])
# or
variables_to_restore = slim.get_variables_to_restore(exclude=["v1"])

# Create the saver which will be used to restore the variables.
restorer = tf.train.Saver(variables_to_restore)

with tf.Session() as sess:
  # Restore variables from disk.
  restorer.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")
  # Do some work with the model
  ...
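The include/exclude selection can be illustrated on plain variable names. The sketch below is a simplified pure-Python model that treats a scope as a path prefix; slim's own matching rules may differ in detail, and `in_scope` and `select_variables` are hypothetical helpers.

```python
def in_scope(name, scope):
    """True if `name` is the scope itself or lives underneath it."""
    scope = scope.rstrip('/')
    return name == scope or name.startswith(scope + '/')


def select_variables(names, include=None, exclude=None):
    """Keep names under any `include` scope and under no `exclude` scope."""
    selected = [n for n in names
                if include is None or any(in_scope(n, s) for s in include)]
    return [n for n in selected
            if not any(in_scope(n, s) for s in (exclude or []))]


names = ['v1', 'nested/v2']
print(select_variables(names, include=['nested']))
print(select_variables(names, exclude=['v1']))

# Hypothetical VGG-style names: restore everything except the last layers.
vgg_names = ['conv1/conv1_1/weights', 'fc6/weights', 'fc7/weights', 'fc8/weights']
print(select_variables(vgg_names, exclude=['fc6', 'fc7', 'fc8']))
```

Both calls on `names` select only 'nested/v2', mirroring the equivalent include and exclude forms shown above.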

6.3 Restoring Models with Different Variable Names

When variables are restored from a checkpoint, the Saver locates them in the checkpoint file by name and maps them to the variables in the current graph. If the names in the checkpoint differ from those in the graph, pass the Saver a dictionary that maps each checkpoint variable name to a graph variable. Two possible mapping functions are shown below; define only the one that applies:

# Assuming that 'conv1/weights' should be restored from 'vgg16/conv1/weights'
def name_in_checkpoint(var):
  return 'vgg16/' + var.op.name

# Assuming that 'conv1/weights' and 'conv1/bias' should be restored from 'conv1/params1' and 'conv1/params2'
def name_in_checkpoint(var):
  if "weights" in var.op.name:
    return var.op.name.replace("weights", "params1")
  if "bias" in var.op.name:
    return var.op.name.replace("bias", "params2")

variables_to_restore = slim.get_model_variables()
variables_to_restore = {name_in_checkpoint(var):var for var in variables_to_restore}
restorer = tf.train.Saver(variables_to_restore)

with tf.Session() as sess:
  # Restore variables from disk.
  restorer.restore(sess, "/tmp/model.ckpt")
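The name mapping itself is ordinary dictionary construction, which can be tried out on plain strings before involving a Saver. In this pure-Python sketch, `build_restore_map`, `add_prefix`, and `rename_params` are hypothetical helpers, and strings stand in for the variables.

```python
def build_restore_map(variable_names, name_in_checkpoint):
    """Map each checkpoint name to the graph variable it should fill."""
    return {name_in_checkpoint(name): name for name in variable_names}


def add_prefix(name):
    """Checkpoint stores every variable under an extra 'vgg16/' scope."""
    return 'vgg16/' + name


def rename_params(name):
    """Checkpoint uses 'params1'/'params2' instead of 'weights'/'bias'."""
    if name.endswith('weights'):
        return name[:-len('weights')] + 'params1'
    if name.endswith('bias'):
        return name[:-len('bias')] + 'params2'
    return name  # leave any other variable name unchanged


graph_names = ['conv1/weights', 'conv1/bias']
print(build_restore_map(graph_names, add_prefix))
print(build_restore_map(graph_names, rename_params))
```

Returning the name unchanged for unmatched variables avoids a None key in the mapping, which a mapping function without a final return would otherwise produce.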

6.4 Fine-Tuning a Model on a Different Task

Consider a VGG16 model that was trained on ImageNet and is now to be applied to the Pascal VOC dataset. The new model can be initialized from the pre-trained checkpoint, leaving out the final layers:

# Load the Pascal VOC data
image, label = MyPascalVocDataLoader(...)
images, labels = tf.train.batch([image, label], batch_size=32)

# Create the model (vgg_16 returns the logits and a dict of end points)
predictions, _ = vgg.vgg_16(images)

train_op = slim.learning.create_train_op(...)

# Specify where the Model, trained on ImageNet, was saved.
model_path = '/path/to/pre_trained_on_imagenet.checkpoint'

# Specify where the new model will live:
log_dir = '/path/to/my_pascal_model_dir/'

# Restore only the convolutional layers:
variables_to_restore = slim.get_variables_to_restore(exclude=['fc6', 'fc7', 'fc8'])
init_fn = slim.assign_from_checkpoint_fn(model_path, variables_to_restore)

# Start training.
slim.learning.train(train_op, log_dir, init_fn=init_fn)

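7. Evaluating Models

7.1 Metrics

slim provides a set of metric operations that make evaluating models easy. Conceptually, computing the value of a metric can be divided into three parts:

Initialization: initialize the variables used to compute the metric.
Aggregation: perform the operations (sums, etc.) used to compute the metric.
Finalization: (optional) perform any final operation needed to compute the metric value, for example computing means, mins, or maxes.

Each metric function returns a value_op, which returns the current value of the metric, and an update_op, which performs the aggregation step:

images, labels = LoadTestData(...)
predictions = MyModel(images)

mae_value_op, mae_update_op = slim.metrics.streaming_mean_absolute_error(predictions, labels)
# streaming_mean_relative_error also takes a normalizer; the labels are used here.
mre_value_op, mre_update_op = slim.metrics.streaming_mean_relative_error(predictions, labels, labels)
# mean_relative_errors stands for a tensor of per-example errors; it is not defined in this snippet.
pl_value_op, pl_update_op = slim.metrics.streaming_percentage_less(mean_relative_errors, 0.3)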
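The split between an update step and a value step can be sketched without TensorFlow. The class below is a minimal pure-Python analogy of a streaming mean absolute error, with hypothetical names; it is not how slim implements its metrics.

```python
class StreamingMeanAbsoluteError(object):
    """Accumulates absolute errors batch by batch and reports their mean."""

    def __init__(self):
        # Initialization: the running total and count start at zero.
        self.total = 0.0
        self.count = 0

    def update(self, predictions, labels):
        # Aggregation: fold one batch into the running total and count.
        for prediction, label in zip(predictions, labels):
            self.total += abs(prediction - label)
            self.count += 1

    def value(self):
        # Finalization: divide to obtain the mean over everything seen so far.
        return self.total / self.count if self.count else 0.0


metric = StreamingMeanAbsoluteError()
metric.update([1.0, 2.0], [1.5, 2.0])  # absolute errors: 0.5, 0.0
metric.update([4.0, 0.0], [3.0, 0.5])  # absolute errors: 1.0, 0.5
print(metric.value())
```

Calling `value()` does not consume any data; only `update()` changes the state, which is why an evaluation loop runs the update step once per batch and reads the value at the end.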

Keeping track of every value_op and update_op by hand is laborious. To deal with this, slim provides two convenience functions:

# Aggregates the value and update ops in two lists:
value_ops, update_ops = slim.metrics.aggregate_metrics(
    slim.metrics.streaming_mean_absolute_error(predictions, labels),
    slim.metrics.streaming_mean_squared_error(predictions, labels))

# Aggregates the value and update ops in two dictionaries:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})
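What the dictionary form does is essentially split a map of (value, update) pairs into two maps with the same keys. The pure-Python sketch below shows that reshaping with strings in place of ops; the function is a hypothetical re-creation for illustration, not slim's source.

```python
def aggregate_metric_map(names_to_tuples):
    """Split {name: (value_op, update_op)} into two dicts keyed by name."""
    names = list(names_to_tuples.keys())
    value_ops, update_ops = zip(*names_to_tuples.values())
    return dict(zip(names, value_ops)), dict(zip(names, update_ops))


metrics = {
    'eval/mean_absolute_error': ('mae_value', 'mae_update'),
    'eval/mean_squared_error': ('mse_value', 'mse_update'),
}
names_to_values, names_to_updates = aggregate_metric_map(metrics)
print(names_to_values)
print(names_to_updates)
```

Keeping the metric names as keys is what later allows summaries and printed results to be labelled without any extra bookkeeping.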

7.2 Working Example: Tracking Multiple Metrics

Putting it all together:

import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg


# Load the data
images, labels = load_data(...)

# Define the network (vgg_16 returns the logits and a dict of end points)
predictions, _ = vgg.vgg_16(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})

# Evaluate the model using 1000 batches of data:
num_batches = 1000

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  sess.run(tf.local_variables_initializer())

  # Run the update ops once per batch to aggregate the metrics.
  for batch_id in range(num_batches):
    sess.run(list(names_to_updates.values()))

  # Read the final metric values.
  metric_values = sess.run(list(names_to_values.values()))
  for metric, value in zip(names_to_values.keys(), metric_values):
    print('Metric %s has value: %f' % (metric, value))

Note: metric_ops.py can be used on its own, without layers.py or loss_ops.py.

7.3 Evaluation Loop

slim provides an evaluation module (evaluation.py) with helper functions for writing model evaluation scripts that use the metrics from metric_ops.py. These include a function that periodically runs evaluations, evaluates metrics over batches of data, and prints and summarizes the metric results. For example:

import math
import tensorflow as tf

slim = tf.contrib.slim

# Load the data
images, labels = load_data(...)

# Define the network
predictions = MyModel(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'accuracy': slim.metrics.streaming_accuracy(predictions, labels),
    'precision': slim.metrics.streaming_precision(predictions, labels),
    'recall': slim.metrics.streaming_recall(predictions, labels),
})

# Create the summary ops such that they also print out to std output:
summary_ops = []
for metric_name, metric_value in names_to_values.items():
  op = tf.summary.scalar(metric_name, metric_value)
  op = tf.Print(op, [metric_value], metric_name)
  summary_ops.append(op)

num_examples = 10000
batch_size = 32
num_batches = int(math.ceil(num_examples / float(batch_size)))

# Setup the global step.
slim.get_or_create_global_step()

checkpoint_dir = ... # Where the checkpoints to evaluate are read from.
log_dir = ... # Where the summaries are stored.
eval_interval_secs = ... # How often to run the evaluation.
slim.evaluation.evaluation_loop(
    'local',
    checkpoint_dir,
    log_dir,
    num_evals=num_batches,
    eval_op=list(names_to_updates.values()),
    summary_op=tf.summary.merge(summary_ops),
    eval_interval_secs=eval_interval_secs)
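The number of evaluation batches in the example above is the example count divided by the batch size, rounded up so that a final partial batch is still counted. A minimal pure-Python check of that arithmetic, with `num_eval_batches` as a hypothetical helper name:

```python
import math


def num_eval_batches(num_examples, batch_size):
    """Number of batches needed to cover every example at least once."""
    return int(math.ceil(num_examples / float(batch_size)))


# 10000 / 32 = 312.5, so 313 batches are needed.
print(num_eval_batches(10000, 32))
# The same result using integer arithmetic only.
print((10000 + 32 - 1) // 32)
```

Whether the last batch is actually partial depends on how the input pipeline is configured; the rounding only determines how many evaluation steps are run.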
