Learning rate schedulers for object detection tasks such as mmdetection and the YOLO series

The learning rate (lr) is the most important hyperparameter in supervised learning tasks such as object detection: it determines whether the classification and bounding box regression losses can converge to a local minimum, and how quickly they do so. A well-chosen learning rate allows the objective function to converge to a local optimum in a reasonable amount of time. The learning rate can also be changed dynamically over the course of training; this dynamic adjustment process is called a learning rate scheduler.
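
As a concrete illustration, here is a minimal sketch (assuming PyTorch, which mmdetection and the YOLO series build on) of attaching a scheduler to an optimizer; the toy model and schedule length are placeholder assumptions:

```python
import torch

model = torch.nn.Linear(10, 2)                      # toy model standing in for a detector
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# decay the lr along a cosine curve over 100 epochs
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

for epoch in range(100):
    # ... one epoch of training (forward, loss.backward(), optimizer.step()) ...
    scheduler.step()                                # update the learning rate once per epoch
    print(epoch, scheduler.get_last_lr()[0])        # observe the dynamically changing lr
```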

First, we analyze the learning rate adjustment strategy of YOLOX as an example. YOLOX uses a cosine schedule with warmup, and, to cooperate with its data augmentation pipeline (which is switched off near the end of training), it holds a fixed minimum learning rate for the last 15 epochs.
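
The overall shape of this schedule can be written as a plain function of the epoch index. The sketch below is illustrative, not YOLOX's exact implementation (YOLOX itself warms up per iteration rather than per epoch); the base lr, warmup length, and epoch counts are placeholder assumptions:

```python
import math

def yolox_style_lr(epoch, base_lr=0.01, min_lr_ratio=0.05,
                   warmup_epochs=5, total_epochs=300, no_aug_epochs=15):
    """Warmup -> cosine decay -> fixed minimum lr for the last epochs."""
    min_lr = base_lr * min_lr_ratio
    if epoch < warmup_epochs:
        # linear warmup from 0 up to the preset base lr
        return base_lr * epoch / warmup_epochs
    if epoch >= total_epochs - no_aug_epochs:
        # augmentation is switched off for these epochs: hold a fixed minimum lr
        return min_lr
    # cosine decay from base_lr down to min_lr in between
    progress = (epoch - warmup_epochs) / (total_epochs - no_aug_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

The lr starts near zero, reaches `base_lr` at the end of warmup, follows the cosine curve down, and stays pinned at `min_lr` for the final 15 epochs.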


Training warmup (Warmup)

  • In the initial stage of training, the model's weights are randomly initialized, so it knows essentially nothing about the data distribution. If the full preset learning rate is used at this point, the model can become unstable or even overfit to early batches, and it may take many later epochs to recover. After a short period of training, the model picks up some correct priors about the data, and the learning rate can then be raised to its preset value to speed up training.
  • Warmup typically lasts only the first few epochs of training; see the minimal sketch after this list.
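
A minimal per-iteration linear warmup, sketched here with PyTorch's LambdaLR; the warmup length and optimizer settings are illustrative assumptions, not values from any particular detector:

```python
import torch

model = torch.nn.Linear(10, 2)                      # toy model standing in for a detector
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

warmup_iters = 500                                  # illustrative: a few epochs' worth of iterations

def warmup_factor(step):
    # ramp the lr multiplier linearly from ~0 to 1, then hold it at 1
    return min(1.0, (step + 1) / warmup_iters)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=warmup_factor)

for step in range(1000):
    # ... forward pass, loss.backward() would go here ...
    optimizer.step()
    scheduler.step()                                # advance the warmup once per iteration
```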

Origin: blog.csdn.net/qq_42308217/article/details/122590567