Learning rate warmup

What is learning rate warmup

The learning rate is one of the most important hyperparameters in neural network training, and there are many techniques for setting it. Warmup is a learning rate technique mentioned in the ResNet paper. Because the model's weights are randomly initialized at the start of training, choosing a large learning rate right away can make the model unstable. With warmup, training starts with a smaller learning rate for the first few epochs or iterations, and once the model is reasonably stable, the learning rate is switched to the preset value for the rest of training.
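The simplest form of this idea, a fixed small rate followed by a switch to the preset rate, can be sketched as a plain function. This is a minimal illustration, not code from the original post: the function name `constant_warmup_lr` and all default values are assumptions chosen only for the example.

```python
def constant_warmup_lr(iteration, warmup_iters=400, warmup_lr=0.01, target_lr=0.1):
    """Return the learning rate for a 0-based training iteration.

    All default values are illustrative assumptions.
    """
    if iteration < warmup_iters:
        # Small fixed rate while the randomly initialized weights settle
        return warmup_lr
    # Preset rate for the rest of training
    return target_lr


# Print the rate just before and just after the switch
for it in (0, 399, 400, 1000):
    print(it, constant_warmup_lr(it))
```

The abrupt jump at the switch point is what the gradual variant described in the next section tries to avoid.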

Implementing learning rate warmup

Gradual warmup. This approach gradually raises the learning rate from a small value to a large one. The ramp avoids a sudden jump in the learning rate and allows healthy convergence at the start of training.
In practice, with a large minibatch of size kn, we start from a learning rate of η and increase it by a constant amount at each iteration so that it reaches η̂ = kη after five epochs (the results are robust to the exact warmup duration). After warmup, we return to the original learning rate schedule.
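The ramp from η to kη can be written down directly. The sketch below assumes the warmup length is given in iterations (five epochs times a hypothetical number of iterations per epoch); the function name `gradual_warmup_lr` and the sample numbers are illustrative and not taken from the referenced paper.

```python
def gradual_warmup_lr(iteration, base_lr, k, warmup_iters):
    """Linear ramp from base_lr (eta) to k * base_lr over warmup_iters iterations."""
    target_lr = k * base_lr
    if iteration >= warmup_iters:
        # Warmup is over: stay at the target rate
        return target_lr
    # Constant amount added to the learning rate at every iteration
    increment = (target_lr - base_lr) / warmup_iters
    return base_lr + increment * iteration


# Hypothetical setup: eta = 0.1, k = 8, 100 iterations per epoch, 5 warmup epochs
iters_per_epoch = 100
warmup_iters = 5 * iters_per_epoch
for it in (0, 100, 250, 500):
    print(it, round(gradual_warmup_lr(it, base_lr=0.1, k=8, warmup_iters=warmup_iters), 4))
```

Unlike the class in the listing that follows, this ramp starts at η rather than at zero, which matches the description in the text.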

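# Warm up
class LearningRateWarmUP(object):
    """Linearly ramp the learning rate to target_lr, then hand over to after_scheduler."""

    def __init__(self, optimizer, target_iteration, target_lr, after_scheduler=None):
        self.optimizer = optimizer                # optimizer exposing param_groups
        self.target_iteration = target_iteration  # number of warmup iterations
        self.target_lr = target_lr                # learning rate reached at the end of warmup
        self.num_iterations = 0
        self.after_scheduler = after_scheduler    # optional scheduler used once warmup is over

    def warmup_learning_rate(self, cur_iteration):
        # Linear ramp: 0 at iteration 0, target_lr at target_iteration
        warmup_lr = self.target_lr * float(cur_iteration) / float(self.target_iteration)
        for param_group in self.optimizer.param_groups:
            param_group['lr'] = warmup_lr

    def step(self, cur_iteration):
        if cur_iteration <= self.target_iteration:
            self.warmup_learning_rate(cur_iteration)
        elif self.after_scheduler is not None:
            # Warmup finished: delegate to the regular schedule, if one was given
            self.after_scheduler.step(cur_iteration)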
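The class above keeps the warmup and the later schedule in two separate objects. As a rough alternative, the same hand-off can be expressed as one stateless function of the iteration number. The sketch below uses the same ramp as the class (from 0 up to the target rate) and then applies a step decay; the name `lr_at`, the milestones, and the decay factor `gamma` are assumptions made for illustration only.

```python
def lr_at(iteration, target_lr=0.1, warmup_iters=500, milestones=(3000, 6000), gamma=0.1):
    """Learning rate at a 0-based iteration: linear warmup, then step decay.

    All default values are illustrative assumptions.
    """
    if iteration <= warmup_iters:
        # Same ramp as the class: target_lr * cur_iteration / target_iteration
        return target_lr * iteration / warmup_iters
    # After warmup: multiply by gamma once for every milestone already reached
    passed = sum(1 for m in milestones if iteration >= m)
    return target_lr * gamma ** passed


# Sample the schedule around warmup and around each milestone
for it in (0, 250, 500, 2999, 3000, 6000):
    print(it, round(lr_at(it), 6))
```

In a training loop, the returned value would be written into each entry of `optimizer.param_groups` under the `'lr'` key, the same way `warmup_learning_rate` does. A stateless function like this is easy to plot and to resume from a checkpoint, while the class form is more convenient when an existing scheduler object should take over after warmup.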

References

1. https://blog.csdn.net/sinat_36618660/article/details/99650804
2. https://zhuanlan.zhihu.com/p/66080948
3. https://arxiv.org/pdf/1706.02677.pdf

Origin blog.csdn.net/weixin_42990464/article/details/104640641