TensorFlow study notes (1): detailed explanation of tf.Variable() and tf.get_variable()

We run into tf.Variable and tf.get_variable constantly when training models, so we should first understand their syntax, what the commonly used arguments do, how to call them in actual code, and what effect each call has.

tf.Variable()

Function : Generates a variable whose value starts at initial_value. An initial value must be specified explicitly.

The syntax format is as follows:

tf.Variable(initial_value=None, trainable=True, collections=None, validate_shape=True, 
caching_device=None, name=None, variable_def=None, dtype=None, expected_shape=None, 
import_scope=None)

The most frequently used arguments are initial_value, name, shape, dtype, and trainable: the initial value, the variable's name, the required shape, the data type, and whether the variable is trainable.

Parameter explanation :

  • initial_value : A Tensor, or a Python object convertible to a Tensor, that is the initial value of the Variable. The initial value must have a specified shape unless validate_shape is set to False. It can also be a zero-argument callable that returns the initial value when called, in which case dtype must be specified.
    (Note that initializer functions from init_ops.py must be bound to a shape before they can be used here.)
  • shape : The shape of a new variable or an existing variable.
  • name : An optional name for the variable. Defaults to "Variable" and is uniquified automatically.
  • dtype : If set, initial_value is converted to this type. If None, the data type is preserved (when initial_value is a Tensor) or decided by convert_to_tensor.
  • trainable : If trainable=False, the variable is kept out of the graph's GraphKeys.TRAINABLE_VARIABLES collection (the default list of variables an optimizer updates), so its value will not be updated during training. If trainable=True, the variable is placed in that collection and is updated during training. In short, this flag controls whether the parameter is updated while training the model (see the short sketch after this list).
  • collections : A list of graph-collection keys to which the new variable is added. Defaults to [GraphKeys.GLOBAL_VARIABLES]; you can also specify other collections yourself.
  • validate_shape : If False, variables may be initialized with a value of unknown shape. If True (the default), the shape of initial_value must be known.
  • caching_device : Optional device string describing where the variable should be cached for reading. Defaults to the variable's own device; if not None, the variable is cached on the other device. Typical use is to cache on the device where the Ops that use the variable reside, deduplicating copies made through Switch and other conditional statements.
  • variable_def : A VariableDef protocol buffer. If not None, the Variable object is recreated from its contents, referencing variable nodes that must already exist in the graph; the graph itself is not changed. variable_def and the other arguments are mutually exclusive.
  • expected_shape : TensorShape. If set, the initial_value should have this shape.
  • import_scope : Optional string. The name range to be added to the variable. Only used when initializing from the protocol buffer.
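
As a small illustration of two of these arguments (a minimal sketch, assuming TensorFlow 1.x graph mode): initial_value may be a zero-argument callable as long as dtype is given, and trainable=False keeps a variable out of GraphKeys.TRAINABLE_VARIABLES so the optimizer will not touch it:

import tensorflow as tf

# initial_value given as a zero-argument callable; dtype must then be specified.
v = tf.Variable(initial_value=lambda: tf.ones([2, 2]), dtype=tf.float32, name='v')

# trainable=False keeps this counter out of GraphKeys.TRAINABLE_VARIABLES,
# so optimizers will not update it during training.
step = tf.Variable(0, trainable=False, name='global_step')

print([x.name for x in tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)])
# ['v:0']  -- 'global_step:0' is absent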

Actual code:

import tensorflow as tf

# Three variables with different initial values.
v1 = tf.Variable(tf.random_normal(shape=[4, 3], mean=0, stddev=1), name='v1')  # 4x3 normal samples
v2 = tf.Variable(tf.constant(2), name='v2')                                    # scalar constant 2
v3 = tf.Variable(tf.ones([4, 3]), name='v3')                                   # 4x3 matrix of ones

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # run every variable's initializer
    print(sess.run(v1))
    print(sess.run(v2))
    print(sess.run(v3))

Output:

[[-1.2115501   1.0484737   0.55210656]
 [-1.5301195   0.9060654  -2.6766613 ]
 [ 0.27101386 -0.32336152  0.44544214]
 [-0.0120788  -0.3409422  -0.48505628]]
2
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]

tf.get_variable()

Function : Fetches an existing variable (matching not only by name; the initializer and other attributes must also be compatible) or, if none exists, creates a new one. Various initializers can be used, so an explicit initial value need not be given.

The syntax format is as follows:

tf.get_variable(name, shape=None, dtype=None, initializer=None, regularizer=None, 
trainable=True, collections=None, caching_device=None, partitioner=None, validate_shape=True, 
use_resource=None, custom_getter=None)

Parameter explanation :

  • name : The name of a new variable or an existing variable. This parameter is required. The function will create or get the variable based on the variable name.
  • shape : The shape of a new variable or an existing variable.
  • dtype : The type of the new or existing variable (defaults to DT_FLOAT).
  • initializer : If the variable is created, used to initialize it. The available initializers are summarized below.
  • regularizer : A (Tensor -> Tensor or None) function; the result of applying it to a newly created variable is added to the collection tf.GraphKeys.REGULARIZATION_LOSSES and can be used for regularization (see the sketch after this list).
  • trainable : If True, also add variables to the graph collection GraphKeys.TRAINABLE_VARIABLES (see tf.Variable).
  • collections : The list of graph collections to add the variable to. Defaults to [GraphKeys.GLOBAL_VARIABLES] (see tf.Variable).
  • caching_device : Optional device string or function describing where the variable should be cached for reading. Defaults to the variable's own device; if not None, the variable is cached on the other device. Typical use is to cache on the device where the Ops that use the variable reside, deduplicating copies made through Switch and other conditional statements.
  • partitioner : Optional callable, accepts the fully defined TensorShape and the dtype of the Variable to be created, and returns a list of partitions for each axis (currently only one axis can be partitioned).
  • validate_shape : If False, variables may be initialized with a value of unknown shape. If True (the default), the shape of the initial value must be known.
  • use_resource : If False, a regular variable is created; if True, an experimental ResourceVariable with well-defined semantics is created instead. Defaults to False (this will later change to True). In eager mode this argument is always forced to True.
  • custom_getter : A callable that takes the true getter as its first argument and allows overriding the internal get_variable method.
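
As a sketch of how the regularizer argument behaves (a minimal example, assuming TensorFlow 1.x; the l2_reg helper here is our own illustrative function, not a TensorFlow API): the penalty returned by the function is collected under tf.GraphKeys.REGULARIZATION_LOSSES, and you sum it into the training loss yourself:

import tensorflow as tf

# A regularizer is just a function mapping the variable's Tensor to a penalty Tensor.
def l2_reg(weights):                     # hypothetical helper for illustration
    return 0.01 * tf.nn.l2_loss(weights)

w = tf.get_variable('w', shape=[3, 4],
                    initializer=tf.random_normal_initializer(),
                    regularizer=l2_reg)

# The penalty is collected automatically; add it to your loss by hand.
reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
total_reg = tf.add_n(reg_losses)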

Actual code:

import tensorflow as tf

a1 = tf.get_variable(name='a1', shape=[2, 3],
                     initializer=tf.random_normal_initializer(mean=0, stddev=1))
a2 = tf.get_variable(name='a2', shape=[1],
                     initializer=tf.constant_initializer(1))
a3 = tf.get_variable(name='a3', shape=[2, 3],
                     initializer=tf.ones_initializer())

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initialize_all_variables() is deprecated
    print(sess.run(a1))
    print(sess.run(a2))
    print(sess.run(a3))
# Output:
[[ 0.42299312 -0.25459203 -0.88605702]
 [ 0.22410156  1.34326422 -0.39722782]]
[ 1.]
[[ 1.  1.  1.]
 [ 1.  1.  1.]]

The difference between the two

1. tf.Variable always defines a genuinely new variable; if a name conflict is detected, the system resolves it automatically by uniquifying the name. tf.get_variable() instead fetches a variable by name, and the system does not resolve conflicts for you; it raises an error.
For example:

import tensorflow as tf

w_1 = tf.Variable(3, name="w_1")
w_2 = tf.Variable(1, name="w_1")
print(w_1.name)
print(w_2.name)
# Output:
# w_1:0
# w_1_1:0
import tensorflow as tf

w_1 = tf.get_variable(name="w_1", initializer=1)
w_2 = tf.get_variable(name="w_1", initializer=2)
# Error message:
# ValueError: Variable w_1 already exists, disallowed. Did
# you mean to set reuse=True in VarScope?

2. Because of these characteristics, tf.get_variable() is what we need when sharing variables; in all other situations the two are used the same way. To make variable management easier, TensorFlow also provides a variable manager, tf.variable_scope: once a variable_scope is defined, variables in different scopes may carry the same name.

import tensorflow as tf

with tf.variable_scope("scope1"): # scope name is "scope1"
    w1 = tf.get_variable("w1", shape=[])
    w2 = tf.Variable(0.0, name="w2")
with tf.variable_scope("scope1", reuse=True):
    w1_p = tf.get_variable("w1", shape=[])
    w2_p = tf.Variable(1.0, name="w2")

print(w1 is w1_p, w2 is w2_p)
# Output:
# True False

Since tf.Variable() creates a new object on every call, reuse=True has no effect on it. For tf.get_variable(), if the variable object has already been created it is returned; if it has not been created, a new one is made.
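
A common sharing pattern built on this behavior (a sketch assuming TensorFlow 1.4+, where tf.AUTO_REUSE is available) wraps variable creation in a function and lets the scope decide whether to create or fetch:

import tensorflow as tf

def dense_weights():
    # The first call creates 'layer/w'; every later call returns the same object.
    with tf.variable_scope('layer', reuse=tf.AUTO_REUSE):
        return tf.get_variable('w', shape=[4, 4],
                               initializer=tf.zeros_initializer())

w_a = dense_weights()
w_b = dense_weights()
print(w_a is w_b)
# Output:
# True   (the variable is shared, not recreated)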

The initializer argument deserves a summary of its own. The common initialization methods are:

  • tf.constant_initializer: constant initializer
  • tf.random_normal_initializer: normal distribution
  • tf.truncated_normal_initializer: truncated normal distribution
  • tf.random_uniform_initializer: uniform distribution
  • tf.zeros_initializer: all zeros
  • tf.ones_initializer: all ones
  • tf.uniform_unit_scaling_initializer: uniform distribution, without affecting the output's order of magnitude

E.g. the tf.get_variable() example shown above already demonstrates tf.random_normal_initializer, tf.constant_initializer, and tf.ones_initializer in action, so it is not repeated here.
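
For the initializers not used in that example, a brief sketch (again assuming TensorFlow 1.x):

import tensorflow as tf

b1 = tf.get_variable('b1', shape=[2, 3],
                     initializer=tf.truncated_normal_initializer(mean=0.0, stddev=1.0))
b2 = tf.get_variable('b2', shape=[2, 3],
                     initializer=tf.random_uniform_initializer(minval=-1.0, maxval=1.0))
b3 = tf.get_variable('b3', shape=[2, 3],
                     initializer=tf.zeros_initializer())

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(b1))  # samples from a truncated normal distribution
    print(sess.run(b2))  # samples uniform in [-1, 1)
    print(sess.run(b3))  # all zeros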

Note : Different variables cannot share the same name unless they live in different variable_scopes, in which case the same name may be reused.

Notes:

  • If initializer is None (the default), the initializer defined in the enclosing variable_scope() is used; if that is also None, glorot_uniform_initializer is used by default. A variable can also be initialized from another tensor, in which case its value and shape match that tensor.
  • The default regularizer is None. If none is specified, the regularizer from the enclosing variable_scope() is used; if that is also None, no regularization is applied (see the sketch below).
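
A sketch of this fallback behavior (assuming standard TensorFlow 1.x semantics; l2_reg is again our own illustrative helper): an initializer or regularizer set on the enclosing variable_scope applies to every tf.get_variable() inside it that does not specify its own:

import tensorflow as tf

def l2_reg(weights):                     # hypothetical helper for illustration
    return 0.01 * tf.nn.l2_loss(weights)

with tf.variable_scope('block',
                       initializer=tf.ones_initializer(),
                       regularizer=l2_reg):
    # u inherits both the scope's initializer and its regularizer.
    u = tf.get_variable('u', shape=[2])
    # v overrides the scope's initializer but still inherits the regularizer.
    v = tf.get_variable('v', shape=[2], initializer=tf.zeros_initializer())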

Origin blog.csdn.net/m0_51004308/article/details/112850660