TensorFlow Study Notes (1): Variable Scope and Model Loading

1. The variable scope mechanism is implemented mainly by two functions:

tf.get_variable(<name>, <shape>, <initializer>)
tf.variable_scope(<scope_name>)

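2. Commonly used initializers:

tf.constant_initializer(value) # initializes every element to a constant value
tf.random_uniform_initializer(a, b) # initializes from a uniform distribution between a and b
tf.random_normal_initializer(mean, stddev) # initializes from a normal distribution with the given mean and standard deviation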
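
As a rough illustration of what these three initializers produce, here is a minimal pure-Python sketch. It is not TensorFlow code: the toy_* functions are hypothetical stand-ins built on the standard random module, and they return plain lists instead of tensors.

```python
import random

# Hypothetical stand-ins for the TensorFlow initializers; each returns a
# function that produces n initial values as a plain Python list.
def toy_constant_initializer(value):
    return lambda n: [value] * n

def toy_random_uniform_initializer(a, b):
    return lambda n: [random.uniform(a, b) for _ in range(n)]

def toy_random_normal_initializer(mean, stddev):
    return lambda n: [random.gauss(mean, stddev) for _ in range(n)]

random.seed(0)  # fixed seed so repeated runs give the same samples
biases = toy_constant_initializer(0.3)(3)
uniform = toy_random_uniform_initializer(-1.0, 1.0)(1000)
normal = toy_random_normal_initializer(0.0, 1.0)(1000)

print(biases)                      # three copies of the constant
print(min(uniform), max(uniform))  # both stay between -1.0 and 1.0
print(sum(normal) / len(normal))   # sample mean, close to the requested mean
```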

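3. tf.variable_scope() takes a name, which is used as a prefix for the names of the variables created inside it, and a reuse flag (discussed later) that distinguishes between creating a new variable and reusing an existing one. Nested scopes append their names following rules very similar to those of file directories.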

For a network that uses variable scopes, the code is structured as follows:

import tensorflow as tf 

def my_image_filter():
    with tf.variable_scope("conv1"):
        weights = tf.get_variable("weights", [1], initializer=tf.random_normal_initializer())
    print("weights:%s" % weights.name)
    with tf.variable_scope("conv2"):
        biases = tf.get_variable("biases", [1], initializer=tf.constant_initializer(0.3))
    print("biases:%s" % biases.name)
	
result1 = my_image_filter()


Output:

weights:conv1/weights:0
biases:conv2/biases:0

4. Calling my_image_filter() twice in a row raises a ValueError:

result1 = my_image_filter()
result2 = my_image_filter()

ValueError: Variable conv1/weights already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:

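Solution: creating the variables without the variable scope mechanism (for example with tf.Variable) avoids the error, but it produces two separate sets of variables instead of shared ones. To share the variables, set the reuse flag of the scope; tf.get_variable() then behaves as follows:

a. When tf.get_variable_scope().reuse == True, tf.get_variable() searches for an existing variable with the full scoped name and returns it. For example, tf.get_variable("v", [1]) inside the scope "foo" looks for an existing "foo/v"; if no such variable is found, a ValueError is raised.

b. When tf.get_variable_scope().reuse == tf.AUTO_REUSE, tf.get_variable() returns the existing variable if there is one and creates it otherwise. This mode is intended for reusing variables and does not raise a ValueError: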

import tensorflow as tf 

def my_image_filter():
    with tf.variable_scope("conv1", reuse=tf.AUTO_REUSE):
        # The variable created (or reused) here is named "conv1/weights".
        weights = tf.get_variable("weights", [1], initializer=tf.random_normal_initializer())
    print("weights:%s" % weights.name)
    with tf.variable_scope("conv2", reuse=tf.AUTO_REUSE):
        # The variable created (or reused) here is named "conv2/biases".
        biases = tf.get_variable("biases", [1], initializer=tf.constant_initializer(0.3))
    print("biases:%s" % biases.name)
	
result1 = my_image_filter()
result2 = my_image_filter()
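
How the reuse flag changes the outcome can be summarized with a toy variable store. This is a minimal sketch under stated assumptions, not TensorFlow's implementation: ToyVariableStore and the AUTO_REUSE sentinel are hypothetical names, and a "variable" is just a one-element list so that sharing can be checked by object identity.

```python
AUTO_REUSE = "auto_reuse"  # hypothetical sentinel standing in for tf.AUTO_REUSE

class ToyVariableStore:
    """Maps full variable names to values, honouring a reuse flag."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, name, initial_value, reuse=False):
        exists = name in self._vars
        if reuse is False and exists:
            # Default mode: creating the same name twice is an error.
            raise ValueError("Variable %s already exists, disallowed." % name)
        if reuse is True and not exists:
            # Reuse mode: the variable must have been created earlier.
            raise ValueError("Variable %s does not exist." % name)
        if not exists:
            self._vars[name] = [initial_value]
        return self._vars[name]

store = ToyVariableStore()
w1 = store.get_variable("conv1/weights", 0.0)                    # created
w2 = store.get_variable("conv1/weights", 0.0, reuse=True)        # shared
w3 = store.get_variable("conv1/weights", 0.0, reuse=AUTO_REUSE)  # shared
b1 = store.get_variable("conv2/biases", 0.3, reuse=AUTO_REUSE)   # created

errors = []
for name, reuse in [("conv1/weights", False), ("conv3/weights", True)]:
    try:
        store.get_variable(name, 0.0, reuse=reuse)
    except ValueError as err:
        errors.append(str(err))

print(w1 is w2, w1 is w3)  # True True: all three names refer to one object
print(errors)
```

In the toy store only the default mode and reuse=True can fail; the AUTO_REUSE branch either creates or returns the variable, which matches the behaviour described in a and b.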

5. The same problem can appear when loading a model whose network uses variable scopes: Variable conv1/weights already exists, disallowed. Did you mean to set reuse=True

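Solution:

Restarting the kernel and running the code again makes the error go away (this amounts to restarting Spyder), but it does not address the root cause, and restarting repeatedly is inconvenient.

The error occurs because the computation graph from the previous run still exists when the code is executed again, so the variables being created conflict with the ones already in that graph. The fix is to add one line at the start of the code: tf.reset_default_graph()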

tf.reset_default_graph()
ckpt_file = tf.train.latest_checkpoint(model_path)
print(ckpt_file)
paths['model_path'] = ckpt_file
model = BiLSTM_CRF(args, embeddings, tag2label, word2id, paths, config=config)
model.build_graph()

References:

TensorFlow Study Notes (3): Variable Scope  https://blog.csdn.net/qq184861643/article/details/78116468

Error: ValueError: Variable layer1-conv1/weight already exists  https://blog.csdn.net/xiaohuihui1994/article/details/80829832

Reposted from blog.csdn.net/ai_1046067944/article/details/82967768