
TensorFlow-Slim

TF-Slim is a lightweight library for defining, training and evaluating complex models in TensorFlow. Components of TF-Slim can be freely mixed with native TensorFlow, as well as with other frameworks (such as tf.contrib.learn).

Usage

import tensorflow.contrib.slim as slim

Why TF-Slim?

What are the components of TF-Slim?

Defining Models

Variables

Scopes

Example: Implementing VGG16

Training Models

Training TensorFlow models requires a model, a loss function, the gradient computation, and a training routine that iteratively computes the gradients of the model weights relative to the loss and updates the weights accordingly. TF-Slim provides both common loss functions and a set of helper functions that run the training and evaluation routines.

Losses

The loss function defines a quantity that we want to minimize. For classification problems, this is typically the cross entropy between the true distribution and the predicted probability distribution across classes. For regression problems, this is often the sum-of-squares differences between the predicted and true values.

Certain models, such as multi-task learning models, require the use of multiple loss functions simultaneously. In other words, the loss function ultimately being minimized is the sum of various other loss functions. For example, consider a model that predicts both the type of scene in an image as well as the depth from the camera of each pixel. This model's loss function would be the sum of the classification loss and depth prediction loss.
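A sketch of that multi-task case follows; the data loading and CreateMultiTaskModel below are hypothetical placeholders, and the loss calls use TF-Slim's losses module (introduced next). Note that in some TF-Slim versions sum_of_squares is named mean_squared_error:

# Load the images and labels (loading code elided).
images, scene_labels, depth_labels = ...

# Create the model; CreateMultiTaskModel is a hypothetical model function.
scene_predictions, depth_predictions = CreateMultiTaskModel(images)

# Define the individual loss functions.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)

# The following two lines have the same effect: every loss created via
# slim.losses is added to an internal collection that get_total_loss reads.
total_loss = classification_loss + sum_of_squares_loss
total_loss = slim.losses.get_total_loss(add_regularization_losses=False)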

TF-Slim provides an easy-to-use mechanism for defining and keeping track of loss functions via the losses module. Consider the simple case where we want to train the VGG network:
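A minimal sketch, assuming the image and label loading is handled elsewhere (vgg_16 comes from tf.contrib.slim.nets):

import tensorflow as tf
import tensorflow.contrib.slim as slim
import tensorflow.contrib.slim.nets as nets

vgg = nets.vgg

# Load the images and labels (loading code elided).
images, labels = ...

# Create the model; vgg_16 returns the logits and a dict of end points.
predictions, _ = vgg.vgg_16(images)

# Define the loss function and get the total loss.
loss = slim.losses.softmax_cross_entropy(predictions, labels)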

Training Loop

TF-Slim provides a simple but powerful set of tools for training models, found in learning.py. These include a train function that repeatedly measures the loss, computes gradients, and saves the model to disk, as well as several convenience functions for manipulating gradients. For example, once we've specified the model, the loss function and the optimization scheme, we can call slim.learning.create_train_op and slim.learning.train to perform the optimization:
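A sketch of that flow, assuming the model and its losses have already been defined as above (the log directory is a placeholder):

total_loss = slim.losses.get_total_loss()
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)

# create_train_op ensures that each time we ask for the loss, the
# update ops are run and the gradients are computed and applied.
train_op = slim.learning.create_train_op(total_loss, optimizer)

logdir = ...  # Where checkpoints and event files are stored.

slim.learning.train(
    train_op,
    logdir,
    number_of_steps=1000,     # Number of gradient steps to take.
    save_summaries_secs=300,  # Save summaries every 5 minutes.
    save_interval_secs=600)   # Save a checkpoint every 10 minutes.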

Example: Training the VGG16 Model

To illustrate this, let's examine the following sample of training the VGG network:
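The sketch below assumes a data-loading pipeline that yields batched images and labels (elided), and train_log_dir is a placeholder path:

import tensorflow as tf
import tensorflow.contrib.slim as slim
import tensorflow.contrib.slim.nets as nets

vgg = nets.vgg

train_log_dir = ...  # Where checkpoints are written.
if not tf.gfile.Exists(train_log_dir):
  tf.gfile.MakeDirs(train_log_dir)

with tf.Graph().as_default():
  # Set up the data loading (elided).
  images, labels = ...

  # Define the model.
  predictions, _ = vgg.vgg_16(images, is_training=True)

  # Specify the loss function; it is recorded in the losses collection.
  slim.losses.softmax_cross_entropy(predictions, labels)

  total_loss = slim.losses.get_total_loss()
  tf.summary.scalar('losses/total_loss', total_loss)

  # Specify the optimization scheme.
  optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)

  # create_train_op ensures that each evaluation of the loss also
  # runs the update ops and applies the gradients.
  train_tensor = slim.learning.create_train_op(total_loss, optimizer)

  # Actually runs training.
  slim.learning.train(train_tensor, train_log_dir)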

Fine-Tuning Existing Models


Brief Recap on Restoring Variables from a Checkpoint

After a model has been trained, it can be restored using tf.train.Saver() which restores Variables from a given checkpoint. For many cases, tf.train.Saver() provides a simple mechanism to restore all or just a few variables.
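For instance, a minimal sketch of both variants, where the initial values and the checkpoint path are placeholders:

# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")

# Add ops to restore all the variables.
restorer = tf.train.Saver()

# Or add ops to restore only some of the variables.
restorer = tf.train.Saver([v1, v2])

with tf.Session() as sess:
  # Restore variables from disk.
  restorer.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")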

Partially Restoring Models

It is often desirable to fine-tune a pre-trained model on an entirely new dataset or even a new task. In these situations, one can use TF-Slim's helper functions to select a subset of variables to restore:
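For example, variables can be selected by name with slim.get_variables_by_name (slim.get_variables_to_restore offers more general filtering) and handed to a standard tf.train.Saver. A sketch, with placeholder shapes and checkpoint path:

# Create some variables.
v1 = slim.variable(name="v1", ...)
v2 = slim.variable(name="nested/v2", ...)

# Get the list of variables to restore (contains only 'v2').
variables_to_restore = slim.get_variables_by_name("v2")

# Create the saver that will be used to restore the variables.
restorer = tf.train.Saver(variables_to_restore)

with tf.Session() as sess:
  # Restore variables from disk.
  restorer.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")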

Restoring models with different variable names
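When the names of the variables in a checkpoint do not match the names in the current graph, tf.train.Saver can be given a dictionary that maps each checkpoint name to a graph variable. A sketch, assuming (hypothetically) that the checkpoint names differ from the graph names only by a 'vgg16/' prefix:

# Assume each variable named 'conv1/weights' in the graph was saved
# as 'vgg16/conv1/weights' in the checkpoint.
def name_in_checkpoint(var):
  return 'vgg16/' + var.op.name

variables_to_restore = slim.get_model_variables()
variables_to_restore = {name_in_checkpoint(var): var
                        for var in variables_to_restore}
restorer = tf.train.Saver(variables_to_restore)

with tf.Session() as sess:
  # Restore variables from disk.
  restorer.restore(sess, "/tmp/model.ckpt")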

Fine-Tuning a Model on a different task

Consider the case where we have a pre-trained VGG16 model. The model was trained on the ImageNet dataset, which has 1000 classes. However, we would like to apply it to the Pascal VOC dataset which has only 20 classes. To do so, we can initialize our new model using the values of the pre-trained model excluding the final layer:
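A sketch of this, where MyPascalVocDataLoader is a hypothetical data-loading helper, the paths are placeholders, and 'fc8' is the scope of VGG16's final 1000-way layer:

# Load the Pascal VOC data; MyPascalVocDataLoader is a hypothetical helper.
image, label = MyPascalVocDataLoader(...)
images, labels = tf.train.batch([image, label], batch_size=32)

# Create the model with the new number of classes.
predictions, _ = vgg.vgg_16(images, num_classes=20)

# Define the loss and training op as before.
slim.losses.softmax_cross_entropy(predictions, labels)
total_loss = slim.losses.get_total_loss()
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
train_op = slim.learning.create_train_op(total_loss, optimizer)

# Where the model, pre-trained on ImageNet, was saved.
model_path = '/path/to/pre_trained_on_imagenet.checkpoint'

# Where the new model will be written.
log_dir = '/path/to/my_pascal_model_dir/'

# Restore all layers except the final classification layer.
variables_to_restore = slim.get_variables_to_restore(exclude=['fc8'])
init_fn = slim.assign_from_checkpoint_fn(model_path, variables_to_restore)

# Start training; init_fn initializes the weights from the checkpoint.
slim.learning.train(train_op, log_dir, init_fn=init_fn)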

Evaluating Models

Once we've trained a model (or even while the model is busy training) we'd like to see how well the model performs in practice. This is accomplished by picking a set of evaluation metrics, which will grade the model's performance, and the evaluation code which actually loads the data, performs inference, compares the results to the ground truth and records the evaluation scores. This step may be performed once or repeated periodically.

Metrics

We define a metric as a performance measure that is not a loss function (losses are directly optimized during training). For example, we might minimize log loss during training, but evaluate the model using F1 score or Intersection Over Union score (the latter is not differentiable and therefore cannot be used as a loss).

TF-Slim provides a set of metric operations. Broadly speaking, computing the value of a metric can be divided into three parts:

  1. Initialization: initialize the variables used to compute the metric.
  2. Aggregation: perform operations (sums, etc.) used to compute the metric.
  3. Finalization: (optionally) perform any final operation to compute the metric value, for example computing means, mins, maxes, etc.

For example, to compute mean_absolute_error, two variables (count and total) are initialized to zero. During aggregation, we observe a set of predictions and labels, compute their absolute differences, and add the sum to total; each time we observe another set, count is incremented accordingly. Finally, during finalization, total is divided by count to obtain the mean.

The following example demonstrates the API for declaring metrics. Because metrics are often evaluated on a test set which is different from the training set (upon which the loss is computed), we'll assume we're using test data:

images, labels = LoadTestData(...)
predictions = MyModel(images)

mae_value_op, mae_update_op = slim.metrics.streaming_mean_absolute_error(predictions, labels)
mre_value_op, mre_update_op = slim.metrics.streaming_mean_relative_error(predictions, labels, labels)

# percentage_less expects a tensor of values; compute per-element
# relative errors first.
relative_errors = tf.abs(predictions - labels) / labels
pl_value_op, pl_update_op = slim.metrics.percentage_less(relative_errors, 0.3)

As the example illustrates, creating a metric returns two values: a value_op and an update_op. The value_op is an idempotent operation that returns the current value of the metric. The update_op is an operation that performs the aggregation step described above and also returns the value of the metric.

Keeping track of each value_op and update_op by hand can be laborious. To deal with this, TF-Slim provides two convenience functions:

# Aggregate the value and update ops into two lists:
value_ops, update_ops = slim.metrics.aggregate_metrics(
    slim.metrics.streaming_mean_absolute_error(predictions, labels),
    slim.metrics.streaming_mean_squared_error(predictions, labels))

# Aggregate the value and update ops into two dictionaries:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})
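To use these, run every update_op once per batch of test data (the aggregation step) and fetch the value ops at the end (finalization). A sketch, assuming the model's weights have already been restored:

num_batches = 1000  # Evaluate over 1000 batches of test data.

with tf.Session() as sess:
  # The internal metric variables (count, total, etc.) are local variables.
  sess.run(tf.local_variables_initializer())

  # Aggregation: run every update op once per batch.
  for _ in range(num_batches):
    sess.run(list(names_to_updates.values()))

  # Finalization: fetch the final metric values.
  metric_values = sess.run(list(names_to_values.values()))
  for name, value in zip(names_to_values.keys(), metric_values):
    print('Metric %s has value: %f' % (name, value))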
