Copyright notice: this is an original article by the blogger and may not be reproduced without permission. https://blog.csdn.net/slz0813/article/details/78780901
// The learning rate decay policy. The currently implemented learning rate
// policies are as follows:
//    - fixed: always return base_lr.
//    - step: return base_lr * gamma ^ (floor(iter / step))
//    - exp: return base_lr * gamma ^ iter
//    - inv: return base_lr * (1 + gamma * iter) ^ (- power)
//    - multistep: similar to step but it allows non-uniform steps defined by
//      stepvalue
//    - poly: the effective learning rate follows a polynomial decay, to be
//      zero by the max_iter. return base_lr * (1 - iter/max_iter) ^ (power)
//    - sigmoid: the effective learning rate follows a sigmoid decay
//      return base_lr * (1/(1 + exp(-gamma * (iter - stepsize))))
//
// where base_lr, max_iter, gamma, step, stepvalue and power are defined
// in the solver parameter protocol buffer, and iter is the current iteration.
lr_policy can be set to any of the following values; the corresponding learning rate is computed as:
- fixed: keep base_lr constant.
- step: also requires a stepsize; returns base_lr * gamma ^ (floor(iter / stepsize)), where iter is the current iteration.
- exp: returns base_lr * gamma ^ iter, where iter is the current iteration.
- inv: also requires a power; returns base_lr * (1 + gamma * iter) ^ (-power).
- multistep: also requires stepvalue entries. Similar to step, but instead of decaying at uniform intervals of stepsize, it decays at the iterations listed in stepvalue.
- poly: the learning rate follows a polynomial decay, reaching zero at max_iter; returns base_lr * (1 - iter/max_iter) ^ power.
- sigmoid: the learning rate follows a sigmoid decay; returns base_lr * (1/(1 + exp(-gamma * (iter - stepsize)))).
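The formulas above can be sketched as a single Python function. This is a reimplementation sketch, not Caffe's actual C++ code; the function name `caffe_lr` and its keyword arguments are my own, though the parameter names mirror the SolverParameter fields:

```python
import math

def caffe_lr(policy, base_lr, iteration, gamma=None, power=None,
             stepsize=None, stepvalue=None, max_iter=None):
    """Compute the learning rate under one of Caffe's lr_policy values.

    Only the parameters that the chosen policy actually uses need to be
    supplied; the names mirror the SolverParameter fields.
    """
    if policy == "fixed":
        return base_lr
    if policy == "step":
        # floor(iter / stepsize) via integer division
        return base_lr * gamma ** (iteration // stepsize)
    if policy == "exp":
        return base_lr * gamma ** iteration
    if policy == "inv":
        return base_lr * (1 + gamma * iteration) ** (-power)
    if policy == "multistep":
        # decay once for every stepvalue threshold already passed
        current_step = sum(1 for sv in stepvalue if iteration >= sv)
        return base_lr * gamma ** current_step
    if policy == "poly":
        # polynomial decay, reaching zero at max_iter
        return base_lr * (1 - iteration / max_iter) ** power
    if policy == "sigmoid":
        # sigmoid decay centered on stepsize
        return base_lr / (1 + math.exp(-gamma * (iteration - stepsize)))
    raise ValueError("unknown lr_policy: %s" % policy)
```

For example, with the step policy, base_lr = 0.1, gamma = 0.1 and stepsize = 100, the learning rate at iteration 250 is 0.1 * 0.1^2 = 0.001, since floor(250/100) = 2.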