TensorFlow Slim (1): How to Use the slim Module

  The TF-Slim module is one of TensorFlow's more practical APIs: a lightweight library for building, training, and evaluating complex models. Its most useful features include arg_scope, model_variables, repeat, and stack.

  The slim module was released in 2016, and its main purpose is to "slim down" code.

  The module has become very popular; most TensorFlow code on GitHub uses it. Without it, a network implementation tends to contain a lot of redundant code, making it less concise and harder to read.

Introduction

  First, look at code that uses the slim module to build the LeNet-5 architecture:

 1 def lenet_architecture(self, is_trained=True):
 2     with slim.arg_scope([slim.conv2d], padding="VALID",
 3                         weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
 4                         weights_regularizer=slim.l2_regularizer(0.005)):
 5         # LeNet expects 32*32 input, but MNIST images are 28*28, so the first convolution uses SAME padding
 6         net = slim.conv2d(self.input_image, 6, [5, 5], 1, padding="SAME", scope="conv1")   # 28*28*6
 7         net = slim.max_pool2d(net, [2, 2], 2, scope='pool_2')   # 14*14*6
 8         net = slim.conv2d(net, 16, [5, 5], 1, scope='conv3')    # 10*10*16
 9         net = slim.max_pool2d(net, [2, 2], 2, scope='pool_4')   # 5*5*16
10         net = slim.conv2d(net, 120, [1, 1], 1, scope='conv5')   # 1*1 convolution in place of a fully connected layer
11         net = slim.flatten(net, scope='flatten')                # flatten
12         net = slim.fully_connected(net, 84, scope='fc6')
13         net = slim.dropout(net, self.dropout, is_training=is_trained, scope='dropout')
14         digits = slim.fully_connected(net, 10, scope='fc7')
15     return digits

  Lines 6-14 of the code implement the LeNet network used for MNIST handwritten-digit recognition. Notice that each line corresponds to one network layer. In particular, each convolutional layer is not implemented by first creating the kernel variables, then performing the convolution, and then adding the regularization; a single line contains all of it. This is because the arg_scope function lets the user define default parameters for the ops within its scope, which removes a lot of repetitive settings.

  This gives an initial sense that the slim module makes building, training, and evaluating models easier. This is especially true in computer vision, where slim ships implementations of many classic models (LeNet-5, AlexNet, VGG, etc.).
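
  As a quick illustration (a minimal sketch, assuming a TensorFlow 1.x environment where tensorflow.contrib.slim.nets is available), one of these predefined models can be instantiated in just a few lines:

import tensorflow as tf
import tensorflow.contrib.slim as slim
from tensorflow.contrib.slim.nets import vgg  # model definitions shipped with slim

# a batch of RGB images; 224*224 is the size vgg_16 expects by default
images = tf.placeholder(tf.float32, [None, 224, 224, 3])

# build the full VGG-16 graph and obtain logits for 1000 classes
with slim.arg_scope(vgg.vgg_arg_scope()):
    logits, end_points = vgg.vgg_16(images, num_classes=1000, is_training=False)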

  Without further ado, let's uncover slim's mysteries.

Basic use of the slim module

Importing the slim module

import tensorflow.contrib.slim as slim

  The environment used here is Python 3.6 with TensorFlow 1.12.0.

  If your Python or TensorFlow version is too new, slim may fail to import, or you will get ModuleNotFoundError: No module named 'tensorflow.contrib' (the contrib module no longer exists in TensorFlow 2.x).
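
  If you are on a newer TensorFlow and still want to follow along, one possible workaround (a sketch, assuming the separately released tf_slim pip package is installed) is to import that package under the same alias:

# pip install tf_slim   (assumed to be installed)
try:
    import tensorflow.contrib.slim as slim   # TensorFlow 1.x
except ImportError:
    import tf_slim as slim                   # standalone package for newer TensorFlow versions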

Building a model with slim in detail

slim variables (Variables)

  A model needs variables, so let's first compare how variables are created in native TensorFlow with how they are created in slim.

  Native TensorFlow creates a variable with the Variable function, which requires a predefined value or an initialization mechanism (random generation, etc.). It is used as follows:

W = tf.Variable(tf.truncated_normal([10, 4], 0, 1), trainable=True,
                name="weight", dtype=tf.float32)

  

  slim creates variables with its variable function, which wraps the native mechanism. Intuitively, slim's variable-creation function flattens the parameter settings and makes them easier to read: in one call you state the variable's name, its shape, how it is initialized, how it is regularized, which device it lives on, and so on. Besides this flattened usage, it also adds extra features such as regularization and device placement.

w = slim.variable('weight', shape=[10, 10, 3, 3],
                  initializer=tf.truncated_normal_initializer(stddev=0.1),
                  regularizer=slim.l2_regularizer(0.5),
                  device='/CPU:0')

  slim further distinguishes two kinds of variables: local variables and model variables. As the name suggests, model variables are the ones that are trained or fine-tuned during learning, saved to the .ckpt file when the model is saved, and loaded from a checkpoint during evaluation or inference. Local variables are only parameters used during the training process: they are not fine-tuned, are not saved with the model, and of course are not needed at inference time (for example the iteration count, the learning rate, etc.). They are used as follows:

# Model variables, created with model_variable()
weights = slim.model_variable('weights',
                              shape=[10, 10, 3, 3],
                              initializer=tf.truncated_normal_initializer(stddev=0.1),
                              regularizer=slim.l2_regularizer(0.05),
                              device='/CPU:0')
model_variables = slim.get_model_variables()

# Regular (local) variables, created with variable()
var = slim.variable('var',
                    shape=[20, 1],
                    initializer=tf.zeros_initializer())
regular_variables_and_model_variables = slim.get_variables()
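
  If a variable is created outside slim (for example with a plain tf.Variable), it can still be registered as a model variable so that slim will track and save it. A small sketch using slim's add_model_variable helper:

# a variable created without slim's wrappers
my_model_variable = tf.Variable(tf.zeros([10, 10]), name='my_var')

# register it so that slim.get_model_variables() (and model saving) will include it
slim.add_model_variable(my_model_variable)

model_variables = slim.get_model_variables()  # now also contains my_model_variable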

slim layers (Layers)

  As in the opening LeNet example, building a convolutional layer from basic TensorFlow functions involves several essential ops:

  • Create the convolution kernel and bias variables for the current layer
  • Convolve the kernel with the input from the previous layer
  • Add the bias to the convolution result
  • Apply the activation function to the result

  Building every layer this way produces redundant code, as the following shows:

1 # conv1
2 with tf.name_scope('conv1') as scope:
3     kernel = tf.Variable(tf.truncated_normal([11, 11, 3, 96], dtype=tf.float32,
4                                              stddev=1e-1), name='weights')
5     biases = tf.Variable(tf.constant(0.0, shape=[96], dtype=tf.float32),
6                          trainable=True, name='biases')
7     conv = tf.nn.conv2d(x, kernel, [1, 4, 4, 1], padding='SAME')
8     bias = tf.nn.bias_add(conv, biases)
9     conv1 = tf.nn.relu(bias, name=scope)

  Lines 3-6 initialize the convolution kernel and the bias, and lines 7, 8, and 9 perform the convolution, add the bias, and apply the activation function. For models of modest depth and width (such as LeNet) this style is still manageable. But for deep or wide models (such as Inception, ResNet, etc.), writing every layer this way is verbose and hard to maintain.

  To avoid this repetitive code, slim provides higher-level layer ops; the slim version of the convolution operation is shown below. Of course, this op relies on the arg_scope() function to supply default parameters before it shows its full effect.

net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')

  slim.arg_scope() sets default parameters for the specified functions. Of course, if one or two calls need settings that differ from the defaults, the keyword argument can simply be overridden in that call. This is described in detail in the slim scopes section.

with slim.arg_scope([slim.conv2d], padding="VALID",
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.005)):

  In addition, slim provides two meta-operations, repeat and stack, for repeating an operation several times. A typical application is the stack of identical convolutions that forms the basic unit of a network such as VGG, shown below:

net = slim.conv2d(net, 256, [3, 3], scope='conv3_1')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_2')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')

  This block contains three convolutions. They can be written out one by one as above, or generated with a loop:

for i in range(3):
    net = slim.conv2d(net, 256, [3, 3], scope='conv3_%d' % (i+1))
net = slim.max_pool2d(net, [2, 2], scope='pool2')

  slim's repeat method can also be used, which makes the code even more concise:

net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')

  During the repeat, the scopes are automatically named conv3_1, conv3_2, and conv3_3 in turn. The repeat function only allows repeating the same operation with the same parameters.

  Furthermore, slim's stack operation allows repeating an operation with different parameters, for example convolutions whose kernel sizes or channel counts differ, or several fully connected layers (where the number of neurons usually differs from layer to layer).

# the verbose way of writing the fully connected layers
x = slim.fully_connected(x, 4096, scope='fc_1')
x = slim.fully_connected(x, 4096, scope='fc_2')
x = slim.fully_connected(x, 1000, scope='fc_3')

  As can be seen, the fully connected layers differ in their number of nodes. Written with stack, this becomes:

x = slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')

  The differing parameters are written as a list. Unlike repeat, stack takes no repetition count, because the parameters differ on each call and the count is implied by the length of the list.

  Besides fully connected layers, stack can in fact also be applied to convolutions, for example the following alternating 3*3 and 1*1 convolutions:

x = slim.conv2d(x, 32, [3, 3], scope='conv_1')
x = slim.conv2d(x, 32, [1, 1], scope='conv_2')
x = slim.conv2d(x, 64, [3, 3], scope='conv_3')
x = slim.conv2d(x, 64, [1, 1], scope='conv_4')

  Since the kernel sizes and channel counts differ, repeat cannot be used here, but stack can:

x = slim.stack(x, slim.conv2d, [(32, [3, 3]), (32, [1, 1]),
                                (64, [3, 3]), (64, [1, 1])], scope='conv')

  The differing parameters are given as a list of tuples.

slim scopes (Scopes)

  TensorFlow already has several kinds of scope mechanism:

  • name_scope: limits the scope of ops
  • variable_scope: limits the scope of variables

  slim adds another scope mechanism, arg_scope, which lets you specify default parameters for one or more ops at once.

  Returning to the opening LeNet example: without arg_scope(), the three slim convolution layers in LeNet would have to be written like this:

net = slim.conv2d(inputs, 6, [5, 5], 1, padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.005), scope='conv1')
net = slim.conv2d(net, 16, [5, 5], 1, padding='VALID',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.005), scope='conv2')
net = slim.conv2d(net, 120, [1, 1], 1, padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.005), scope='conv3')

  That looks rather complicated: every slim.conv2d() carries many parameters. However, if the shared parameters are turned into defaults with arg_scope(), the code becomes much simpler.

with slim.arg_scope([slim.conv2d], padding='SAME',
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.conv2d(inputs, 64, [11, 11], scope='conv1')
    net = slim.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')
    net = slim.conv2d(net, 256, [11, 11], scope='conv3')

  In practice, the point is to find the parameters that the layers have in common; the more they share, the more arg_scope() can shorten the code.

  Different types of layers, however, usually do not share all of their parameters, so arg_scope() can also be nested:

with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    activation_fn=tf.nn.relu,
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    with slim.arg_scope([slim.conv2d], stride=1, padding='SAME'):
        net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')
        net = slim.conv2d(net, 256, [5, 5],
                          weights_initializer=tf.truncated_normal_initializer(stddev=0.03),
                          scope='conv2')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc')

  The outer arg_scope() specifies the parameters common to both the convolutional and the fully connected layers; the inner arg_scope() specifies parameters that apply only to the convolutional layers.
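
  A related pattern (a sketch of a common convention, with a hypothetical helper name, not code from the LeNet example above) is to wrap a frequently reused arg_scope in a small function, so the same defaults can be shared across several network definitions; slim's predefined networks follow a similar convention (e.g. vgg_arg_scope):

def lenet_arg_scope(weight_decay=0.005):
    # hypothetical helper that packages the defaults used earlier in this article
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                        weights_regularizer=slim.l2_regularizer(weight_decay)) as arg_sc:
        return arg_sc

# later, when building a network, reuse the packaged defaults
with slim.arg_scope(lenet_arg_scope()):
    net = slim.conv2d(inputs, 6, [5, 5], scope='conv1')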

Training a model with slim

  Once the model is built, training it also requires a loss function and the gradient computation.

slim loss functions (Losses)

  According to the official documentation, the slim.losses module will eventually be removed in favor of the tf.losses module, since the two are functionally identical.

  The loss function is the quantity that teaches the model to distinguish right from wrong, and it is what the parameters are optimized against. Classification problems usually use cross-entropy; regression problems generally use MSE/SSE. The following comparison shows that the slim.losses and tf.losses modules are used in the same way:

slim.losses.softmax_cross_entropy(predictions, input_label, scope='loss')
tf.losses.softmax_cross_entropy(predictions, input_label, scope='loss')

  A model that does multi-task learning has several loss functions, each measuring the loss of a different task. For example, YOLO v3 has a bounding-box coordinate loss, a confidence loss, and a classification loss. For a multi-task loss, the individual losses are usually summed, sometimes with weights. For example:

total_loss = classification_loss + sum_of_squares_loss

  slim also provides a corresponding function, get_total_loss(), which sums up all of the losses created through slim:

total_loss = slim.losses.get_total_loss(add_regularization_losses=False)

  But what if some losses are created by hand and need to be combined with the losses created through slim? How should a manually created loss be added to slim's collection? Use the slim.losses.add_loss() method:

slim.losses.add_loss(my_loss)

  After that, the summation is performed as before:

total_loss = slim.losses.get_total_loss()
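
  Putting these pieces together, here is a sketch of how slim-created losses and a hand-made loss might be combined (box_predictions and box_labels are hypothetical tensors used only for illustration):

# losses created through slim are registered in the loss collection automatically
classification_loss = slim.losses.softmax_cross_entropy(predictions, input_label)

# a loss computed by hand is not registered, so add it explicitly
my_loss = tf.reduce_mean(tf.square(box_predictions - box_labels))
slim.losses.add_loss(my_loss)

# sum everything; regularization losses from weights_regularizer can be folded in as well
total_loss = slim.losses.get_total_loss(add_regularization_losses=True)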

slim optimization and training (Training Loop)

  In plain TensorFlow, once the model and the loss have been built, the next step is to create the optimizer:

optimizer = tf.train.GradientDescentOptimizer(lr).minimize(loss)

  In slim, the train op bundles two operations:

  • Computing the loss;
  • Computing the gradients.

  It is used as follows:

total_loss = slim.losses.get_total_loss()
optimizer = tf.train.GradientDescentOptimizer(learning_rate)

train_op = slim.learning.create_train_op(total_loss, optimizer)
logdir = ...  # where checkpoints are stored

slim.learning.train(   # actually runs training
    train_op,
    logdir,
    number_of_steps=1000,
    save_summaries_secs=300,
    save_interval_secs=600)
  • the total loss;
  • the optimizer, for which .minimize(loss) is no longer needed;
  • creating the train_op, which ties the loss and the optimizer together (personally this feels a bit more cumbersome than the ordinary TF way; see the comparison sketch after this list);
  • slim.learning.train(), whose training mechanism is very simple: there is no need to open a session or to write the training loop yourself;
    • logdir: the directory where checkpoints and event files are saved
    • number_of_steps: the number of training steps
    • summaries are saved every 300 seconds and a checkpoint every 600 seconds
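
  For comparison, the "ordinary TF way" mentioned above means opening the session and writing the loop yourself. A rough sketch (with hypothetical placeholder, batch, and logdir names, just to make the contrast concrete):

optimizer = tf.train.GradientDescentOptimizer(learning_rate)
train_step = optimizer.minimize(total_loss)
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1000):
        # images/labels and batch_images/batch_labels are hypothetical names for your input pipeline
        sess.run(train_step, feed_dict={images: batch_images, labels: batch_labels})
        if step % 100 == 0:
            saver.save(sess, logdir + '/model.ckpt', global_step=step)

  slim.learning.train() hides all of this boilerplate, which is where the simplification comes from.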

 
