MATLAB Algorithms in Practice: In-Depth Lectures on Application Cases - [Deep Learning] Optimization Strategies (Final Part)

Table of contents

Nadam

AMSGrad

AdaBound

AdamW

RAdam

Lookahead


Nadam

Adam can be seen as a fusion of RMSProp and momentum: RMSProp contributes the exponentially decaying average of the historical squared gradients v_{t}, while momentum is responsible for the exponentially decaying average of the historical gradients m_{t}. Nadam builds on Adam by adding Nesterov-style accumulation to the first-order momentum, i.e. Nesterov + Adam = Nadam. To integrate NAG into Adam, we need to modify the momentum term m_{t}.
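To make the modification of the momentum term concrete, here is a minimal MATLAB sketch of a single Nadam parameter update. The function name nadam_step and the hyperparameter names (lr, beta1, beta2, eps) are assumptions chosen for illustration and do not come from the original post; this is a sketch of the standard Nadam update, not the author's own implementation.

```matlab
% Minimal sketch of one Nadam update step (illustrative; the name nadam_step
% and the hyperparameter names are assumptions, not from the original post).
function [theta, m, v] = nadam_step(theta, grad, m, v, t, lr, beta1, beta2, eps)
    % RMSProp part: exponentially decaying average of squared gradients v_t
    v = beta2 * v + (1 - beta2) * grad.^2;
    % Momentum part: exponentially decaying average of gradients m_t
    m = beta1 * m + (1 - beta1) * grad;
    % Bias correction, as in Adam
    m_hat = m / (1 - beta1^t);
    v_hat = v / (1 - beta2^t);
    % Nesterov modification: blend the corrected momentum with the
    % bias-corrected current gradient (the "look-ahead" step of NAG)
    m_bar = beta1 * m_hat + (1 - beta1) * grad / (1 - beta1^t);
    % Parameter update
    theta = theta - lr * m_bar ./ (sqrt(v_hat) + eps);
end
```

Calling this once per iteration, with t starting at 1 and m, v initialized to zeros of the same size as theta, reproduces the behavior described above; typical hyperparameter values are lr = 0.002, beta1 = 0.9, beta2 = 0.999, eps = 1e-8.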
