Learning rate warm-up
The learning rate is one of the most important hyperparameters in neural network training, and many techniques exist for scheduling it. Warm-up is a learning rate strategy mentioned in the ResNet paper. Because the model's weights are randomly initialized at the beginning of training, choosing a large learning rate right away can make the model unstable. Learning rate warm-up therefore uses a small learning rate for the first few epochs or iterations; once the model has stabilized, training switches over to the learning rate that was set in advance.
How to implement learning rate warm-up
Gradual warm-up. The learning rate is raised gradually from a small initial value to the large target value. This ramp avoids a sudden jump in the learning rate and allows healthy convergence at the start of training.
In practice, with a large minibatch size of kn, we start from a learning rate of η and add a constant amount at each iteration, so that the learning rate reaches kη after 5 epochs (the results are robust to the exact warm-up duration). After warm-up, we go back to the original learning rate schedule.
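A minimal sketch of this linear ramp; the concrete numbers (η = 0.1, k = 8, a warm-up of 5 epochs of 100 iterations) are made up for illustration:

```python
def gradual_warmup_lr(base_lr, k, cur_iter, warmup_iters):
    """Gradual warm-up: ramp the learning rate linearly from base_lr
    to k * base_lr over warmup_iters iterations, then hold it there."""
    target_lr = k * base_lr
    if cur_iter >= warmup_iters:
        return target_lr                      # warm-up finished
    # constant increment per iteration
    return base_lr + (target_lr - base_lr) * cur_iter / warmup_iters

# Illustrative values only: eta = 0.1, k = 8, warm-up of 5 epochs x 100 iterations.
eta, k, warmup_iters = 0.1, 8, 500
print(gradual_warmup_lr(eta, k, 0, warmup_iters))    # starts at eta = 0.1
print(gradual_warmup_lr(eta, k, 250, warmup_iters))  # halfway: about 0.45
print(gradual_warmup_lr(eta, k, 500, warmup_iters))  # reaches k * eta = 0.8
```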
```python
# warm up
class LearningRateWarmUP(object):
    """Linearly ramp the learning rate up to target_lr over target_iteration
    steps, then delegate to an ordinary scheduler (after_scheduler)."""

    def __init__(self, optimizer, target_iteration, target_lr, after_scheduler=None):
        self.optimizer = optimizer
        self.target_iteration = target_iteration  # number of warm-up iterations
        self.target_lr = target_lr                # learning rate reached after warm-up
        self.after_scheduler = after_scheduler    # scheduler to use once warm-up ends

    def warmup_learning_rate(self, cur_iteration):
        # linear ramp from 0 to target_lr
        warmup_lr = self.target_lr * float(cur_iteration) / float(self.target_iteration)
        for param_group in self.optimizer.param_groups:
            param_group['lr'] = warmup_lr

    def step(self, cur_iteration):
        if cur_iteration <= self.target_iteration:
            self.warmup_learning_rate(cur_iteration)
        elif self.after_scheduler is not None:
            self.after_scheduler.step(cur_iteration)
```
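As a usage sketch: since the class above only touches `optimizer.param_groups`, a bare stand-in optimizer is enough for a dry run without installing PyTorch. The class is repeated (lightly cleaned) so the snippet runs on its own; `FakeOptimizer`, `HalvingScheduler`, and all the numbers here are illustrative stand-ins, not part of the original post. In real training you would pass a `torch.optim` optimizer and, for example, a cosine-annealing scheduler as `after_scheduler`.

```python
# A lightly cleaned copy of the warm-up class, so this demo is standalone.
class LearningRateWarmUP(object):
    def __init__(self, optimizer, target_iteration, target_lr, after_scheduler=None):
        self.optimizer = optimizer
        self.target_iteration = target_iteration
        self.target_lr = target_lr
        self.after_scheduler = after_scheduler

    def warmup_learning_rate(self, cur_iteration):
        warmup_lr = self.target_lr * float(cur_iteration) / float(self.target_iteration)
        for param_group in self.optimizer.param_groups:
            param_group['lr'] = warmup_lr

    def step(self, cur_iteration):
        if cur_iteration <= self.target_iteration:
            self.warmup_learning_rate(cur_iteration)
        elif self.after_scheduler is not None:
            self.after_scheduler.step(cur_iteration)

# Hypothetical stand-ins: the warm-up class only needs optimizer.param_groups.
class FakeOptimizer:
    def __init__(self, lr):
        self.param_groups = [{'lr': lr}]

class HalvingScheduler:
    """Toy post-warm-up schedule: halve the lr every step_size iterations."""
    def __init__(self, optimizer, base_lr, warmup_iters, step_size=100):
        self.optimizer = optimizer
        self.base_lr = base_lr
        self.warmup_iters = warmup_iters
        self.step_size = step_size

    def step(self, cur_iteration):
        halvings = (cur_iteration - self.warmup_iters) // self.step_size
        lr = self.base_lr * (0.5 ** halvings)
        for param_group in self.optimizer.param_groups:
            param_group['lr'] = lr

optimizer = FakeOptimizer(lr=0.0)
scheduler = LearningRateWarmUP(
    optimizer,
    target_iteration=100,  # warm up over the first 100 iterations
    target_lr=0.1,         # lr reached at the end of warm-up
    after_scheduler=HalvingScheduler(optimizer, base_lr=0.1, warmup_iters=100),
)
for it in range(1, 301):   # per-iteration stepping, as in the class above
    scheduler.step(it)
print(optimizer.param_groups[0]['lr'])  # 0.025 (halved twice after warm-up)
```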
References
1. https://blog.csdn.net/sinat_36618660/article/details/99650804
2. https://zhuanlan.zhihu.com/p/66080948
3. https://arxiv.org/pdf/1706.02677.pdf