How TensorFlow implements and uses Batch Normalization (BN), with parameter explanations

TensorFlow offers several ways to implement BN:

For the underlying theory, see: https://blog.csdn.net/hjimce/article/details/50866313

Prerequisite:

① tf.nn.moments: the outputs of this function are the mean and variance that BN needs.
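
For example, a minimal sketch (the placeholder shape is an assumption for illustration): for an NHWC feature map, reducing over the batch, height and width axes yields one mean and one variance per channel, which is exactly what BN normalizes with.

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 32, 32, 64])   # NHWC feature map (shape chosen for illustration)
mean, variance = tf.nn.moments(x, axes=[0, 1, 2])    # reduce over batch, height and width
# mean and variance both have shape [64]: one value per channel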

Method 1:

tf.nn.batch_normalization(x, mean, variance, offset, scale, variance_epsilon, name=None): the low-level interface, wrapped by hand

· x: the input
· mean: the mean returned by tf.nn.moments
· variance: the variance returned by tf.nn.moments
· offset: a parameter BN needs to learn (beta)
· scale: a parameter BN needs to learn (gamma)
· variance_epsilon: a small constant added to the denominator during normalization to prevent division by zero

Implementation code:

import tensorflow as tf

# implement Batch Normalization by hand
def bn_layer(x, is_training, name='BatchNorm', moving_decay=0.9, eps=1e-5):
    # get the input shape and check whether it comes from a convolutional layer (4-D) or a fully connected layer (2-D)
    shape = x.shape
    assert len(shape) in [2, 4]

    param_shape = shape[-1]
    with tf.variable_scope(name):
        # declare the only two parameters BN needs to learn: y = gamma * x + beta
        gamma = tf.get_variable('gamma', param_shape, initializer=tf.constant_initializer(1))
        beta = tf.get_variable('beta', param_shape, initializer=tf.constant_initializer(0))

        # compute the mean and variance of the current batch
        axes = list(range(len(shape) - 1))
        batch_mean, batch_var = tf.nn.moments(x, axes, name='moments')

        # update the mean and variance with an exponential moving average
        ema = tf.train.ExponentialMovingAverage(moving_decay)

        def mean_var_with_update():
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)

        # during training, update the mean and variance; at test time, use the moving averages saved during training
        mean, var = tf.cond(tf.equal(is_training, True), mean_var_with_update,
                            lambda: (ema.average(batch_mean), ema.average(batch_var)))

        # finally perform Batch Normalization
        return tf.nn.batch_normalization(x, mean, var, beta, gamma, eps)
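
A hypothetical call of the bn_layer above (the placeholder shape and names are illustrative): is_training is fed as a boolean tensor so the same graph can switch between batch statistics and moving averages.

x = tf.placeholder(tf.float32, [None, 28, 28, 16])
is_training = tf.placeholder(tf.bool)
y = bn_layer(x, is_training, name='BatchNorm1')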

Method 2:

tf.contrib.layers.batch_norm: an encapsulated BN layer

In fact, tf.contrib.layers.batch_norm is a wrapper around tf.nn.moments and tf.nn.batch_normalization.

Parameters:

1 inputs: the input

2 decay: decay coefficient for the moving statistics. A suitable decay value is close to 1.0, especially values containing several 9s: 0.999, 0.99, 0.9. If the model performs well on the training set but poorly on the validation/test set, choose a smaller coefficient (0.9 is recommended). If you want better stability, set zero_debias_moving_mean to True.

3 center: if True, a beta offset is added; if False, there is no beta offset.

4 scale: if True, multiply by gamma; if False, gamma is not used. When the next layer is linear (e.g. nn.relu), this can be disabled, since the scaling can be done by the next layer.

5 epsilon: a small constant to avoid division by zero.

6 activation_fn: activation function; the default is the linear (identity) activation.

7 param_initializers: optional initializers for beta, gamma, moving mean and moving variance.

8 param_regularizers: optional regularizers for beta and gamma.

9 updates_collections: the collections in which the update ops are collected. The update ops need to be executed together with train_op. If None, a control dependency is added to ensure the updates are computed in place.

10 is_training: whether the layer is in training mode. In training mode it accumulates the statistics of the moments into moving_mean and moving_variance using the given exponential moving average decay. When it is not in training mode, it uses the stored values of moving_mean and moving_variance.

11 scope: optional scope for variable_scope.

Note: during training, moving_mean and moving_variance need to be updated. By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be added as a dependency of train_op. For example:

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = optimizer.minimize(loss)

You can set updates_collections=None to force the updates to happen in place, but this may cause a speed penalty, especially in distributed settings.

Implementation code:

import tensorflow as tf

# a thin wrapper around tf.contrib.layers.batch_norm
def batch_norm(x, epsilon=1e-5, momentum=0.9, train=True, name="batch_norm"):
    return tf.contrib.layers.batch_norm(x, decay=momentum, updates_collections=None, epsilon=epsilon,
                                        scale=True, is_training=train, scope=name)
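
A hypothetical call of this wrapper (the placeholder shape is illustrative); since updates_collections=None, the moving statistics are updated in place and no extra control dependency on train_op is needed.

x = tf.placeholder(tf.float32, [None, 28, 28, 16])
is_training = tf.placeholder(tf.bool)
y = batch_norm(x, train=is_training, name='bn1')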

 

Which layer should BN generally be placed after?

A BN layer is usually placed right after the convolution and before the activation, so a block generally has the sequence conv -> bn -> scale -> relu (a sketch follows below).
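
A minimal sketch of that ordering in TF 1.x (layer sizes and names are assumptions for illustration, not from the original post); the convolution's bias can be dropped because BN's beta already provides a per-channel shift, and in TensorFlow the scale/shift is part of the BN op itself.

x = tf.placeholder(tf.float32, [None, 28, 28, 3])
is_training = tf.placeholder(tf.bool)

conv = tf.layers.conv2d(x, filters=32, kernel_size=3, padding='same', use_bias=False)                   # conv
bn = tf.contrib.layers.batch_norm(conv, is_training=is_training, scale=True, updates_collections=None)  # bn + scale
out = tf.nn.relu(bn)                                                                                    # relu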

 

What is the difference between BN at training time and at test time?

During training, the BN layer adjusts the distribution using the mean and std of the current batch; at test time, it adjusts the distribution using the mean and std accumulated over all training samples (the moving averages).

Therefore, the BN layer must be active during training, and its learned parameters and moving statistics must be saved. At test time, the parameters saved during training are loaded and used to normalize the test set.
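
A hedged sketch of that switch at session time, reusing the hypothetical x, is_training and out tensors from the block above (train_op, train_batch and test_batch are likewise assumed to exist):

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # training step: batch statistics are used and the moving averages are updated
    sess.run(train_op, feed_dict={x: train_batch, is_training: True})
    # test step: the stored moving mean/variance are used instead
    predictions = sess.run(out, feed_dict={x: test_batch, is_training: False})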


Source: www.cnblogs.com/WSX1994/p/10949079.html