Notes on using global_step in TensorFlow

global_step is used for moving averages, optimizers, exponentially decaying learning rates, and more. Its meaning is easy to grasp: it counts how many training steps have run so far, so you can tell at which step a given operation should fire, or how many iterations the network has been trained for. It works like a clock for the training process.

In code, global_step is usually initialized as a non-trainable variable set to 0, i.e.:

global_step=tf.Variable(0, trainable=False) 
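
As a side note, TF 1.x also ships a helper that creates (or fetches) the canonical step counter. The helper name is real TF 1.x API; the snippet itself is just a sketch of typical usage:

import tensorflow as tf

# Hand-rolled counter, as above:
global_step = tf.Variable(0, trainable=False)
# Canonical helper: creates the variable on first call, returns the same one later.
global_step = tf.train.get_or_create_global_step()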

Beginners are often confused about when and where global_step gets updated. Following posts found online, I ran my own experiment; the test code is below.

import tensorflow as tf
import numpy as np

LEARNING_RATE_BASE=0.8   # base learning rate
LEARNING_RATE_DECAY=0.99 # decay rate
 
x = tf.placeholder(tf.float32, shape=[None, 1], name='x')
y = tf.placeholder(tf.float32, shape=[None, 1], name='y')
w = tf.Variable(tf.constant(2.0))
"""声明TensorFlow global_step变量  ,为了不和tf自带函数中形参global_step冲突,将变量名字修改为 global_step_train"""
global_step_train = tf.Variable(0, trainable=False)

"""定义指数衰减学习率    #10为过完所有训练数据需要的迭代次数   staircase=False为连续曲线状衰减,为true时为阶梯状衰减"""
learning_rate = tf.train.exponential_decay(LEARNING_RATE_BASE, global_step_train, 20, LEARNING_RATE_DECAY, staircase=False)



loss = tf.pow(w*x-y, 2)
#loss1 = tf.pow(w*x-y, 3)
"""
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss,global_step=global_steps)后面部分的global_step=global_steps去掉,global_step的自动加一就会失效"""
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss,global_step=global_step_train)


with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(10):
        sess.run(train_step,feed_dict={x:np.linspace(1,2,10).reshape([10,1]),
            y:np.linspace(1,2,10).reshape([10,1])})
        print("learning_rate:",sess.run(learning_rate))
        print("global_step_train:",sess.run(global_step_train))

The output is as follows.

learning_rate: 0.7995981
global_step_train: 1
learning_rate: 0.7991964
global_step_train: 2
learning_rate: 0.7987949
global_step_train: 3
learning_rate: 0.7983936
global_step_train: 4
learning_rate: 0.79799247
global_step_train: 5
learning_rate: 0.79759157
global_step_train: 6
learning_rate: 0.79719085
global_step_train: 7
learning_rate: 0.79679036
global_step_train: 8
learning_rate: 0.79639006
global_step_train: 9
learning_rate: 0.79598993
global_step_train: 10
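
These values match the exponential-decay formula for staircase=False: learning_rate = LEARNING_RATE_BASE * LEARNING_RATE_DECAY ** (global_step / decay_steps). A quick check of the first printed value, in plain Python using the constants from the script above:

lr = 0.8 * 0.99 ** (1 / 20)   # global_step = 1, decay_steps = 20
print(lr)                     # ≈ 0.7995981, matching the first line of output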

You can see that global_step_train increases by 1 on every training step. The mechanism: since Python passes arguments by object reference, global_step=global_step_train hands minimize() a reference to the global_step_train variable, and inside the optimizer's source code there is an increment statement, apply_updates = state_ops.assign_add(global_step, 1, name=name), which is attached to the returned training op.
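
In other words, passing global_step to minimize() has roughly the same effect as grouping the gradient update with an explicit increment. A minimal sketch of that equivalence (not the optimizer's actual source, just the same observable behavior):

# minimize() without global_step only applies the gradients:
apply_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
# The auto-increment can be reproduced by hand with assign_add:
with tf.control_dependencies([apply_op]):
    # Running train_step applies the gradients first, then bumps the counter.
    train_step = tf.assign_add(global_step_train, 1)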

The case above covers a single loss function. The code below shows how global_step is updated when several loss functions are minimized separately:

import tensorflow as tf
import numpy as np

LEARNING_RATE_BASE=0.8   # base learning rate
LEARNING_RATE_DECAY=0.99 # decay rate
 
x = tf.placeholder(tf.float32, shape=[None, 1], name='x')
y = tf.placeholder(tf.float32, shape=[None, 1], name='y')
w = tf.Variable(tf.constant(2.0))
"""声明TensorFlow global_step变量  ,为了不和tf自带函数中形参global_step冲突,将变量名字修改为 global_step_train"""
global_step_train = tf.Variable(0, trainable=False)

 
"""定义指数衰减学习率    #10为过完所有训练数据需要的迭代次数   staircase=False为连续曲线状衰减,为true时为阶梯状衰减"""
learning_rate = tf.train.exponential_decay(LEARNING_RATE_BASE, global_step_train, 20, LEARNING_RATE_DECAY, staircase=False)
#learning_rate=0.001


loss = tf.pow(w*x-y, 2)
loss1 = tf.pow(w*x-y, 3)
"""
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss,global_step=global_steps)后面部分的global_step=global_steps去掉,global_step的自动加一就会失效"""

def training(loss, loss1, learning_rate, global_step_train):
    # The first optimizer gets its own locally created counter, global_step1.
    optimizer1 = tf.train.AdamOptimizer(learning_rate=learning_rate)
    global_step1 = tf.Variable(0, name='global_step', trainable=False)
    t_op1 = optimizer1.minimize(loss, global_step=global_step1)

    # The second optimizer increments the global_step_train passed in from outside.
    optimizer2 = tf.train.AdamOptimizer(learning_rate=learning_rate)
    #global_step2 = tf.Variable(0, name='global_step', trainable=False)
    t_op2 = optimizer2.minimize(loss1, global_step=global_step_train)

    return t_op1, t_op2, global_step1

train_op1,train_op2,step1 = training(loss,loss1,learning_rate,global_step_train) 
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(10):
        _, _ = sess.run([train_op1, train_op2],
            feed_dict={x: np.linspace(1, 2, 10).reshape([10, 1]),
                       y: np.linspace(1, 2, 10).reshape([10, 1])})
        print("learning_rate:",sess.run(learning_rate))
        print("global_step_train:",sess.run(global_step_train))
        print("step1:",sess.run(step1))

The output is shown below.

learning_rate: 0.7995981
global_step_train: 1
step1: 1
learning_rate: 0.7991964
global_step_train: 2
step1: 2
learning_rate: 0.7987949
global_step_train: 3
step1: 3
learning_rate: 0.7983936
global_step_train: 4
step1: 4
learning_rate: 0.79799247
global_step_train: 5
step1: 5
learning_rate: 0.79759157
global_step_train: 6
step1: 6
learning_rate: 0.79719085
global_step_train: 7
step1: 7
learning_rate: 0.79679036
global_step_train: 8
step1: 8
learning_rate: 0.79639006
global_step_train: 9
step1: 9
learning_rate: 0.79598993
global_step_train: 10
step1: 10

The results show that even though global_step1 is defined inside the training() function, it is still updated every time its minimize() op runs. And because the global variable global_step_train is passed in as a parameter, the two counters increase in lockstep, each incremented once per iteration by its own minimize() op.
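
One related pitfall worth noting (my own addition, not from the run above): if both minimize() calls shared the same counter, e.g. both were given global_step=global_step_train, then each call would attach its own assign_add, so one sess.run over both train ops would advance the counter by 2 per iteration and the decayed learning rate would drop twice as fast. A hypothetical sketch:

t_op1 = optimizer1.minimize(loss, global_step=global_step_train)
t_op2 = optimizer2.minimize(loss1, global_step=global_step_train)
# After one sess.run([t_op1, t_op2]), global_step_train advances by 2:
# each minimize() op carries its own assign_add(global_step, 1).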

Reposted from blog.csdn.net/xiaolifeidaoer/article/details/88218224