TensorFlow (2): A Walkthrough of the tf.slim Library


Official documentation: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim
tf.contrib.slim is a lightweight library for defining, training, and evaluating complex models in TensorFlow.
Version: 2018.6.14

Key Features

  • Wraps common boilerplate code, chiefly high-level layers and variables, so users can write more compact model code
  • Includes several widely used computer-vision models (VGG, AlexNet)
  • Provides high-level routines for training (losses, learning) and evaluation

Main Components

Components marked [o] are covered below; those marked [x] have no official documentation yet.

  • [o]variables: provides convenience wrappers for variable creation and manipulation.
  • [o]layers: contains high level layers for building models using tensorflow.
  • [o]arg_scope: provides a new scope named arg_scope that allows a user to define default arguments for specific operations within that scope.
  • [o]losses: contains commonly used loss functions.
  • [o]learning: contains routines for training models.
  • [o]metrics: contains popular evaluation metrics.
  • [o]evaluation: contains routines for evaluating models.
  • [o]data: contains TF-slim’s dataset definition, data providers, parallel_reader, and decoding utilities.
  • [x]nets: contains popular network definitions such as VGG and AlexNet models.
  • [x]queues: provides a context manager for easily and safely starting and closing QueueRunners.
  • [x]regularizers: contains weight regularizers.

Defining Models

Variable

  1. Wrapping variables
    Example:
weights = slim.variable('weights',
                        shape=[10, 10, 3, 3],
                        initializer=tf.truncated_normal_initializer(stddev=0.1),
                        regularizer=slim.l2_regularizer(0.05),
                        device='/CPU:0')
  2. Managing model variables
    Native TF has two kinds of variables: regular variables and local variables. Regular variables can be saved to disk with a saver; local variables only exist within a session and are never saved.
    TF-Slim further distinguishes model variables from non-model variables. Model variables are the trainable parameters that are loaded during evaluation or inference, e.g. the parameters of slim.fully_connected or slim.conv2d layers. Non-model variables are needed during training or evaluation but not during inference, e.g. global_step.
    Example:
# Model Variables
weights = slim.model_variable('weights',
                              shape=[10, 10, 3, 3],
                              initializer=tf.truncated_normal_initializer(stddev=0.1),
                              regularizer=slim.l2_regularizer(0.05),
                              device='/CPU:0')
model_variables = slim.get_model_variables()

# Regular variables
my_var = slim.variable('my_var',
                       shape=[20, 1],
                       initializer=tf.zeros_initializer())
regular_variables_and_model_variables = slim.get_variables()

How slim.get_model_variables() works: whenever TF-Slim creates a model variable, it adds that variable to the tf.GraphKeys.MODEL_VARIABLES collection. If you create a variable yourself and want TF-Slim to manage it, add it like this:

my_model_variable = CreateViaCustomCode()
# Letting TF-Slim know about the additional variable.
slim.add_model_variable(my_model_variable)

Layer

TF-Slim wraps layers as well. Creating a convolutional layer in native TF requires several low-level operations:
* create the weight and bias variables
* convolve the weights with the input from the previous layer
* add the biases to the result of the convolution
* apply an activation function
Example:

input = ...
with tf.name_scope('conv1_1') as scope:
  kernel = tf.Variable(tf.truncated_normal([3, 3, 64, 128], dtype=tf.float32,
                                           stddev=1e-1), name='weights')
  conv = tf.nn.conv2d(input, kernel, [1, 1, 1, 1], padding='SAME')
  biases = tf.Variable(tf.constant(0.0, shape=[128], dtype=tf.float32),
                       trainable=True, name='biases')
  bias = tf.nn.bias_add(conv, biases)
  conv1 = tf.nn.relu(bias, name=scope)

TF-Slim provides a concise replacement for the code above:

input = ...
net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')

Other commonly used layers are wrapped too: slim.batch_norm, slim.fully_connected, slim.max_pool2d, and so on.
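
TF-Slim additionally provides two meta-operations, slim.repeat and slim.stack, that condense repeated layer patterns (the VGG16 example below uses slim.repeat). A minimal sketch, assuming net is a 4-D feature map and x a 2-D tensor:

net = ...  # a 4-D feature map, e.g. [batch, height, width, channels]
x = ...    # a 2-D tensor, e.g. [batch, features]

# slim.repeat applies the same op three times, scoping the layers as
# conv3/conv3_1 through conv3/conv3_3:
net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')

# slim.stack applies an op repeatedly with different arguments each time;
# this builds three fully connected layers of sizes 32, 64 and 128:
x = slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')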

Scope

To make the computation graph more modular and manageable, TF introduces scopes, which partition variables by prefixing their names. Native TF provides name_scope and variable_scope.
The difference: when variables are created with tf.Variable(), both tf.name_scope() and tf.variable_scope() prefix the names of variables and ops;
when variables are created with tf.get_variable(), tf.name_scope() adds no prefix to the variable's name.
The main difference between get_variable and Variable: every call to tf.Variable() creates a brand-new variable, so reuse=True has no effect on it, whereas tf.get_variable() returns the existing variable of that name when the enclosing variable_scope is reusable.
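
A minimal sketch of these rules (TF 1.x graph semantics):

import tensorflow as tf

with tf.name_scope('ns'):
    v1 = tf.Variable([0.0], name='v1')     # named 'ns/v1:0'
    v2 = tf.get_variable('v2', shape=[1])  # named 'v2:0' -- name_scope is ignored

with tf.variable_scope('vs'):
    v3 = tf.get_variable('v3', shape=[1])  # named 'vs/v3:0'
with tf.variable_scope('vs', reuse=True):
    v4 = tf.get_variable('v3')             # returns the existing variable
print(v3 is v4)  # True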

Beyond the native scopes, TF-Slim introduces arg_scope: an arg_scope supplies a set of default arguments to every one of the listed operations used inside its scope. Example:

# Verbose version without arg_scope
padding = 'SAME'
initializer = tf.truncated_normal_initializer(stddev=0.01)
regularizer = slim.l2_regularizer(0.0005)
net = slim.conv2d(inputs, 64, [11, 11], 4,
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv1')
net = slim.conv2d(net, 128, [11, 11],
                  padding='VALID',
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv2')
net = slim.conv2d(net, 256, [11, 11],
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv3')

# Equivalent code using slim.arg_scope
with slim.arg_scope([slim.conv2d], padding='SAME',
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
  net = slim.conv2d(inputs, 64, [11, 11], scope='conv1')
  net = slim.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')
  net = slim.conv2d(net, 256, [11, 11], scope='conv3')
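
arg_scopes can also be nested, and any default can still be overridden at an individual call; the sketch below follows the official README:

with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    activation_fn=tf.nn.relu,
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
  with slim.arg_scope([slim.conv2d], stride=1, padding='SAME'):
    net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')
    net = slim.conv2d(net, 256, [5, 5],
                      weights_initializer=tf.truncated_normal_initializer(stddev=0.03),
                      scope='conv2')
    net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc')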

Example: Building VGG16

With these TF-Slim operations, the VGG16 network can be defined very concisely:

def vgg16(inputs):
  with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      activation_fn=tf.nn.relu,
                      weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                      weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
    net = slim.max_pool2d(net, [2, 2], scope='pool1')
    net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
    net = slim.max_pool2d(net, [2, 2], scope='pool2')
    net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
    net = slim.max_pool2d(net, [2, 2], scope='pool3')
    net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
    net = slim.max_pool2d(net, [2, 2], scope='pool4')
    net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
    net = slim.max_pool2d(net, [2, 2], scope='pool5')
    net = slim.fully_connected(net, 4096, scope='fc6')
    net = slim.dropout(net, 0.5, scope='dropout6')
    net = slim.fully_connected(net, 4096, scope='fc7')
    net = slim.dropout(net, 0.5, scope='dropout7')
    net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')
  return net

Training Models

Training a model in TF typically requires a model, a loss function, the gradient computation, and a training routine.
The previous sections covered concise model definition; TF-Slim also provides commonly used loss functions and training and evaluation routines.

Loss

Using the losses module, a loss can be defined in a single line:

import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg

# Load the images and labels.
images, labels = ...

# Create the model.
predictions, _ = vgg.vgg_16(images)

# Define the loss functions and get the total loss.
loss = slim.losses.softmax_cross_entropy(predictions, labels)

A multi-task loss can be defined the same way:

# Load the images and labels.
images, scene_labels, depth_labels = ...

# Create the model.
scene_predictions, depth_predictions = CreateMultiTaskModel(images)

# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)

# The following two lines have the same effect:
total_loss = classification_loss + sum_of_squares_loss
total_loss = slim.losses.get_total_loss(add_regularization_losses=False)

Losses in TF-Slim also work through a special TensorFlow collection of losses, which manages all the losses of the program. Custom losses can be added to the collection as well:

# Load the images and labels.
images, scene_labels, depth_labels, pose_labels = ...

# Create the model.
scene_predictions, depth_predictions, pose_predictions = CreateMultiTaskModel(images)

# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)
pose_loss = MyCustomLossFunction(pose_predictions, pose_labels)
slim.losses.add_loss(pose_loss) # Letting TF-Slim know about the additional loss.

# The following two ways to compute the total loss are equivalent:
regularization_loss = tf.add_n(slim.losses.get_regularization_losses())
total_loss1 = classification_loss + sum_of_squares_loss + pose_loss + regularization_loss

# (Regularization Loss is included in the total loss by default).
total_loss2 = slim.losses.get_total_loss()

Training Loop

slim.learning.create_train_op builds a train op that computes the loss, applies the gradients and parameter updates, and returns the loss value; slim.learning.train then runs the iterative training loop.

g = tf.Graph()

# Create the model and specify the losses...
...

total_loss = slim.losses.get_total_loss()
optimizer = tf.train.GradientDescentOptimizer(learning_rate)

# create_train_op ensures that each time we ask for the loss, the update_ops
# are run and the gradients being computed are applied too.
train_op = slim.learning.create_train_op(total_loss, optimizer)
logdir = ... # Where checkpoints are stored.

slim.learning.train(
    train_op,
    logdir,
    number_of_steps=1000,
    save_summaries_secs=300,
    save_interval_secs=600)

Example: Training VGG16

import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg

...

train_log_dir = ...
if not tf.gfile.Exists(train_log_dir):
  tf.gfile.MakeDirs(train_log_dir)

with tf.Graph().as_default():
  # Set up the data loading:
  images, labels = ...

  # Define the model:
  predictions, _ = vgg.vgg_16(images, is_training=True)

  # Specify the loss function:
  slim.losses.softmax_cross_entropy(predictions, labels)

  total_loss = slim.losses.get_total_loss()
  tf.summary.scalar('losses/total_loss', total_loss)

  # Specify the optimization scheme:
  optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001)

  # create_train_op that ensures that when we evaluate it to get the loss,
  # the update_ops are done and the gradient updates are computed.
  train_tensor = slim.learning.create_train_op(total_loss, optimizer)

  # Actually runs training.
  slim.learning.train(train_tensor, train_log_dir)

Fine-Tuning Models

  1. In native TF, variables are restored from a checkpoint with tf.train.Saver():
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to restore all the variables.
restorer = tf.train.Saver()

# Add ops to restore some variables.
restorer = tf.train.Saver([v1, v2])

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
  # Restore variables from disk.
  restorer.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")
  # Do some work with the model
  ...
  2. TF-Slim makes it easy to restore only a subset of the variables (a sketch for restoring under different checkpoint names follows after this list):
# Create some variables.
v1 = slim.variable(name="v1", ...)
v2 = slim.variable(name="nested/v2", ...)
...

# Get list of variables to restore (which contains only 'v2'). These are all
# equivalent methods:
variables_to_restore = slim.get_variables_by_name("v2")
# or
variables_to_restore = slim.get_variables_by_suffix("2")
# or
variables_to_restore = slim.get_variables(scope="nested")
# or
variables_to_restore = slim.get_variables_to_restore(include=["nested"])
# or
variables_to_restore = slim.get_variables_to_restore(exclude=["v1"])

# Create the saver which will be used to restore the variables.
restorer = tf.train.Saver(variables_to_restore)
  3. Example: Fine-Tuning VGG16
# Load the Pascal VOC data
image, label = MyPascalVocDataLoader(...)
images, labels = tf.train.batch([image, label], batch_size=32)

# Create the model
predictions, _ = vgg.vgg_16(images)

train_op = slim.learning.create_train_op(...)

# Specify where the Model, trained on ImageNet, was saved.
model_path = '/path/to/pre_trained_on_imagenet.checkpoint'

# Specify where the new model will live:
log_dir = '/path/to/my_pascal_model_dir/'

# Restore only the convolutional layers:
variables_to_restore = slim.get_variables_to_restore(exclude=['fc6', 'fc7', 'fc8'])
init_fn = slim.assign_from_checkpoint_fn(model_path, variables_to_restore)

# Start training.
slim.learning.train(train_op, log_dir, init_fn=init_fn)
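
A related case from the official README: when the variable names in a checkpoint differ from those in the current graph, tf.train.Saver can instead be given a dictionary mapping each checkpoint name to a graph variable:

# Assuming that 'conv1/weights' should be restored from 'vgg16/conv1/weights'
def name_in_checkpoint(var):
  return 'vgg16/' + var.op.name

variables_to_restore = slim.get_model_variables()
variables_to_restore = {name_in_checkpoint(var): var for var in variables_to_restore}
restorer = tf.train.Saver(variables_to_restore)

with tf.Session() as sess:
  # Restore variables from disk.
  restorer.restore(sess, "/tmp/model.ckpt")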

Evaluating Models

Wrapped Metrics

images, labels = LoadTestData(...)
predictions = MyModel(images)

# Each metric yields two ops: value_op returns the current value of the
# metric, while update_op folds in the statistics of the current batch and
# returns the updated value.
mae_value_op, mae_update_op = slim.metrics.streaming_mean_absolute_error(predictions, labels)
mre_value_op, mre_update_op = slim.metrics.streaming_mean_relative_error(predictions, labels, labels)
# `mean_relative_errors` is assumed to be a tensor of per-example relative
# errors computed elsewhere.
pl_value_op, pl_update_op = slim.metrics.percentage_less(mean_relative_errors, 0.3)

Automatic Updating

slim.metrics.aggregate_metric_map keeps the value_ops and update_ops organized automatically:

import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg


# Load the data
images, labels = load_data(...)

# Define the network
predictions, _ = vgg.vgg_16(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})

# Evaluate the model using 1000 batches of data:
num_batches = 1000

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  sess.run(tf.local_variables_initializer())

  for batch_id in range(num_batches):
    sess.run(list(names_to_updates.values()))

  metric_values = sess.run(list(names_to_values.values()))
  for metric, value in zip(names_to_values.keys(), metric_values):
    print('Metric %s has value: %f' % (metric, value))

evaluation_loop

import math

import tensorflow as tf

slim = tf.contrib.slim

# Load the data
images, labels = load_data(...)

# Define the network
predictions = MyModel(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'accuracy': slim.metrics.streaming_accuracy(predictions, labels),
    'precision': slim.metrics.streaming_precision(predictions, labels),
    'recall': slim.metrics.streaming_recall(predictions, labels),
})

# Create the summary ops such that they also print out to std output:
summary_ops = []
for metric_name, metric_value in names_to_values.items():
  op = tf.summary.scalar(metric_name, metric_value)
  op = tf.Print(op, [metric_value], metric_name)
  summary_ops.append(op)

num_examples = 10000
batch_size = 32
num_batches = int(math.ceil(num_examples / float(batch_size)))

# Setup the global step.
slim.get_or_create_global_step()

checkpoint_dir = ... # Where the model checkpoints are stored.
log_dir = ... # Where the summaries are stored.
eval_interval_secs = ... # How often to run the evaluation.
slim.evaluation.evaluation_loop(
    'local',
    checkpoint_dir,
    log_dir,
    num_evals=num_batches,
    eval_op=list(names_to_updates.values()),
    summary_op=tf.summary.merge(summary_ops),
    eval_interval_secs=eval_interval_secs)
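
For a single evaluation pass instead of a recurring loop, slim also provides slim.evaluation.evaluate_once; a minimal sketch, where checkpoint_path is a placeholder for a specific checkpoint file:

checkpoint_path = ... # A specific checkpoint file to evaluate.
metric_values = slim.evaluation.evaluate_once(
    '',
    checkpoint_path,
    log_dir,
    num_evals=num_batches,
    eval_op=list(names_to_updates.values()),
    final_op=list(names_to_values.values()))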

Data

A slim dataset is a tuple that encapsulates several components of a given dataset:
* data_sources: the file paths that make up the dataset
* reader: a reader suited to the data type of data_sources
* decoder: a decoder for decoding the records read from the files
* num_samples: the number of samples in the dataset
* items_to_descriptions: a map from the items the dataset provides to descriptions of each item
In short, a slim dataset opens the data_sources files with the reader class (yielding serialized records), decodes them with the decoder, and lets the user request a list of items to be returned as Tensors.
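
A minimal sketch of assembling these five pieces into a slim dataset (the file pattern and sample count are placeholder assumptions; decoder is the TFExampleDecoder built in the next subsection):

dataset = slim.dataset.Dataset(
    data_sources='/tmp/train-*.tfrecord',  # assumed file pattern
    reader=tf.TFRecordReader,
    decoder=decoder,
    num_samples=10000,  # assumed number of samples
    items_to_descriptions={
        'image': 'A 28x28 grayscale image.',
        'label': 'An integer class label.',
    })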

Decoder Example: TFExampleDecoder

A TFExampleDecoder maps serialized TFExample records to items such as images or labels.
A TFExample protocol buffer maps string keys to tf.FixedLenFeature or tf.VarLenFeature values. A TFExampleDecoder is therefore configured with this keys-to-features map, plus ItemHandlers that each map one or more keys to a decoded item; running the features through the ItemHandlers yields the final items.

keys_to_features = {
    'image/encoded': tf.FixedLenFeature((), tf.string, default_value=''),
    'image/format': tf.FixedLenFeature((), tf.string, default_value='raw'),
    'image/class/label': tf.FixedLenFeature(
        [1], tf.int64, default_value=tf.zeros([1], dtype=tf.int64)),
}

items_to_handlers = {
    'image': tfexample_decoder.Image(
        image_key='image/encoded',
        format_key='image/format',
        shape=[28, 28],
        channels=1),
    'label': tfexample_decoder.Tensor('image/class/label'),
}

decoder = tfexample_decoder.TFExampleDecoder(
    keys_to_features, items_to_handlers)

This decoder parses a TFExample with three keys (image/encoded, image/format, and image/class/label) and maps the first two to a single item named image; in the end it provides two items ('image' and 'label').

DataProvider Example: DatasetDataProvider

dataset = GetDataset(...)
data_provider = tf.contrib.slim.dataset_data_provider.DatasetDataProvider(
    dataset, common_queue_capacity=32, common_queue_min=8)

DatasetDataProvider offers control over num_readers, num_epochs, and shuffle, as sketched below.
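
A usage sketch under those options (the item names match the decoder above):

data_provider = slim.dataset_data_provider.DatasetDataProvider(
    dataset,
    num_readers=4,    # read with four parallel readers
    shuffle=True,     # shuffle the examples
    num_epochs=None)  # cycle through the data indefinitely
image, label = data_provider.get(['image', 'label'])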
