Learning Rate Adjustment Methods

When training deep learning models, it is usually recommended to gradually decrease the learning rate as the epochs progress, and experiments have confirmed that this helps training converge. The learning rate can be changed either through a hand-specified decay schedule that controls it directly, or by an algorithm that tunes it automatically. This post mainly introduces two decay schedules that ship with TensorFlow: exponential decay and polynomial decay.

Exponential decay (tf.train.exponential_decay)

Function signature:

tf.train.exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=False, name=None)

Parameters:

learning_rate: the initial learning rate

global_step: the global step count (one step corresponds to one batch)

decay_steps: the decay period in steps; with staircase=True the learning rate changes once every decay_steps steps, otherwise it decays continuously

decay_rate: the exponential decay rate (the α in α^t)

staircase: whether to update the learning rate stepwise, i.e., whether global_step/decay_steps is kept as a float or rounded down to an integer

Formula:

decayed_learning_rate = learning_rate * decay_rate^(global_step / decay_steps)
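
A minimal sketch of how this schedule behaves, assuming TensorFlow 1.x (the initial rate, decay_steps, and decay_rate below are illustrative values, not from the original post):

import tensorflow as tf

global_step = tf.train.get_or_create_global_step()
# Start at 0.1 and multiply by 0.96 once every 1000 steps (staircase=True).
lr = tf.train.exponential_decay(learning_rate=0.1,
                                global_step=global_step,
                                decay_steps=1000,
                                decay_rate=0.96,
                                staircase=True)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in [0, 500, 1000, 2500]:
        sess.run(tf.assign(global_step, step))
        # staircase=True floors 2500/1000 to 2, giving 0.1 * 0.96^2
        print(step, sess.run(lr))

With staircase=False the exponent 500/1000 = 0.5 would not be floored, so the rate would shrink a little at every step instead of dropping once per period.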


Polynomial decay (tf.train.polynomial_decay)

Function signature:

tf.train.polynomial_decay(learning_rate, global_step, decay_steps, end_learning_rate=0.0001, power=1.0, cycle=False, name=None)

Parameters:

learning_rate: the initial learning rate

global_step: the global step count (one step corresponds to one batch)

decay_steps: the number of steps over which the learning rate decays from its initial value down to end_learning_rate

end_learning_rate: the final value of the decayed learning rate

power: the polynomial decay exponent (the α in (1 − t)^α)

cycle: whether t keeps cycling after global_step exceeds decay_steps, instead of holding the final value (see the sketch after the formulas below)

Formulas:

When cycle=False:

global_step = min(global_step, decay_steps)

decayed_learning_rate = (learning_rate - end_learning_rate) * (1 - global_step / decay_steps)^power + end_learning_rate

When cycle=True:

decay_steps = decay_steps * ceil(global_step / decay_steps)

decayed_learning_rate = (learning_rate - end_learning_rate) * (1 - global_step / decay_steps)^power + end_learning_rate

Note: ceil rounds up to the nearest integer.
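
To make the cycle behavior concrete with illustrative numbers: with decay_steps=1000 and cycle=True, at global_step=1500 the effective decay_steps becomes ceil(1500/1000) * 1000 = 2000, so the learning rate jumps back up after each multiple of decay_steps instead of staying at end_learning_rate. A minimal sketch, again assuming TensorFlow 1.x and made-up values:

import tensorflow as tf

global_step = tf.train.get_or_create_global_step()
# Decay from 0.1 toward 0.001 over 1000 steps; cycle=True restarts the
# decay (over a stretched horizon) past each multiple of decay_steps.
lr = tf.train.polynomial_decay(learning_rate=0.1,
                               global_step=global_step,
                               decay_steps=1000,
                               end_learning_rate=0.001,
                               power=0.5,
                               cycle=True)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in [500, 1000, 1500]:
        sess.run(tf.assign(global_step, step))
        # step 1000 reaches end_learning_rate (0.001); at step 1500 the
        # horizon stretches to 2000 and lr rises back to about 0.0505
        print(step, sess.run(lr))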


A typical function for configuring the learning rate during training:

def _configure_learning_rate(num_samples_per_epoch, global_step):
  """Configures the learning rate.

  Args:
    num_samples_per_epoch: The number of samples in each epoch of training.
    global_step: The global_step tensor.

  Returns:
    A `Tensor` representing the learning rate.

  Raises:
    ValueError: if FLAGS.learning_rate_decay_type is not recognized.
  """
  # Steps per decay period = (steps per epoch) * (epochs per decay).
  decay_steps = int(num_samples_per_epoch / FLAGS.batch_size *
                    FLAGS.num_epochs_per_decay)
  if FLAGS.sync_replicas:
    # Each synchronous update aggregates replicas_to_aggregate batches,
    # so fewer optimizer steps are taken per epoch.
    decay_steps /= FLAGS.replicas_to_aggregate

  if FLAGS.learning_rate_decay_type == 'exponential':
    return tf.train.exponential_decay(FLAGS.learning_rate,
                                      global_step,
                                      decay_steps,
                                      FLAGS.learning_rate_decay_factor,
                                      staircase=True,
                                      name='exponential_decay_learning_rate')
  elif FLAGS.learning_rate_decay_type == 'fixed':
    return tf.constant(FLAGS.learning_rate, name='fixed_learning_rate')
  elif FLAGS.learning_rate_decay_type == 'polynomial':
    return tf.train.polynomial_decay(FLAGS.learning_rate,
                                     global_step,
                                     decay_steps,
                                     FLAGS.end_learning_rate,
                                     power=1.0,
                                     cycle=False,
                                     name='polynomial_decay_learning_rate')
  else:
    raise ValueError('learning_rate_decay_type [%s] was not recognized' %
                     FLAGS.learning_rate_decay_type)
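
Whichever schedule is chosen, the returned tensor only decays if global_step actually advances. A minimal sketch of the wiring, assuming the FLAGS above are defined and that `loss` stands in for your model's training loss:

global_step = tf.train.get_or_create_global_step()
learning_rate = _configure_learning_rate(num_samples_per_epoch=50000,
                                         global_step=global_step)
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
# Passing global_step makes minimize() increment it after every update,
# which is what moves the decay schedule forward.
train_op = optimizer.minimize(loss, global_step=global_step)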



Author: EdwardLee
Source: 简书 (Jianshu), https://www.jianshu.com/p/f9f66a89f6ba
