Batch size and learning rate

https://www.zhihu.com/question/64134994

1. Increasing the batch size makes the gradient estimate more accurate, but it also reduces the variance (noise) of the updates, which may cause the model to get stuck in a local optimum;

2. Accordingly, increasing the batch size usually calls for increasing the learning rate as well: if the batch size grows by a factor of m, the learning rate is typically scaled by m or by sqrt(m), though neither rule is fixed (see the first sketch after this list);

3. The learning rate is usually not raised to the large value in one step; it is generally ramped up gradually via warm-up;

4. For a warm-up strategy, see Bag of Freebies for Training Object Detection Neural Networks:

    Suppose warm-up uses the first m batches and the target initial learning rate is η; then at the i-th warm-up batch (1 ≤ i ≤ m), the learning rate is set to i·η/m (see the second sketch after this list).
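
As an illustration of point 2, here is a minimal sketch of the linear and sqrt(m) scaling rules. The function name and the baseline values (lr 0.1 at batch size 256) are hypothetical, not from the original post:

```python
import math

def scale_lr(base_lr, base_batch_size, new_batch_size, rule="linear"):
    """Scale the learning rate when the batch size changes by a factor m.

    rule="linear": lr grows by the same factor m as the batch size.
    rule="sqrt":   lr grows by sqrt(m) instead.
    """
    m = new_batch_size / base_batch_size
    if rule == "linear":
        return base_lr * m
    if rule == "sqrt":
        return base_lr * math.sqrt(m)
    raise ValueError(f"unknown rule: {rule}")

# Hypothetical example: baseline lr 0.1 at batch size 256, scaled up to 1024 (m = 4).
print(scale_lr(0.1, 256, 1024, rule="linear"))  # 0.4
print(scale_lr(0.1, 256, 1024, rule="sqrt"))    # 0.2
```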

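And a sketch of the gradual warm-up from point 4: with m warm-up batches and target initial learning rate η, batch i uses i·η/m. The concrete values of m and η below are made up for illustration:

```python
def warmup_lr(i, m, eta):
    """Gradual warm-up: at batch i (1 <= i <= m), lr = i * eta / m.

    After the warm-up phase, the learning rate stays at eta
    (or follows whatever decay schedule is used afterwards).
    """
    if i <= m:
        return i * eta / m
    return eta

# Hypothetical: warm up over m = 1000 batches to a target lr of eta = 0.4.
m, eta = 1000, 0.4
for i in (1, 250, 500, 1000, 2000):
    print(f"batch {i}: lr = {warmup_lr(i, m, eta):.4f}")
```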