Deep learning hyperparameter tuning notes

1. An Adam learning rate of 0.00035 works remarkably well (see the optimizer sketch after this list);

2. For SGD + Momentum, search for a suitable learning-rate interval; it is usually much larger than Adam's;

3. Use early stopping to prevent over-fitting (a minimal loop follows the list);

4. Ensembling can significantly improve model performance; when combining two models, giving a somewhat larger weight to the better-performing one may yield better results (see the weighted-ensemble sketch below);
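
A minimal PyTorch sketch of points 1 and 2 (the framework and the SGD learning rate of 0.05 are assumptions; the notes name neither), showing both optimizer setups side by side:

```python
import torch
import torch.nn as nn

model = nn.Linear(128, 10)  # placeholder model for illustration

# Point 1: Adam with a small learning rate around 3.5e-4.
adam = torch.optim.Adam(model.parameters(), lr=3.5e-4)

# Point 2: SGD + Momentum usually wants a much larger learning rate;
# a common interval to search is roughly 1e-2 to 1e-1.
sgd = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9)
```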
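A minimal early-stopping loop for point 3; `validate` is a hypothetical helper returning the epoch's validation loss, and the patience of 5 is an illustrative choice, not a value from the notes:

```python
import torch
import torch.nn as nn

model = nn.Linear(128, 10)   # placeholder model
best_val_loss = float("inf")
patience, bad_epochs = 5, 0  # illustrative patience value

for epoch in range(100):
    val_loss = validate(model)  # hypothetical helper: returns validation loss
    if val_loss < best_val_loss:
        best_val_loss, bad_epochs = val_loss, 0
        torch.save(model.state_dict(), "best.pt")  # checkpoint the best model
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # validation stopped improving: terminate early
```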
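A sketch of the weighted two-model ensemble from point 4; the 0.6/0.4 split is an illustrative assumption, with the larger weight on the better-performing model:

```python
import torch

def weighted_ensemble(logits_a, logits_b, weight_a=0.6):
    """Blend two models' predictions, weighting the stronger model higher.
    Softmax first so probabilities, not raw logits, are averaged."""
    probs_a = torch.softmax(logits_a, dim=-1)
    probs_b = torch.softmax(logits_b, dim=-1)
    return weight_a * probs_a + (1.0 - weight_a) * probs_b

# Example: two models' outputs for a batch of 4 samples, 10 classes.
preds = weighted_ensemble(torch.randn(4, 10), torch.randn(4, 10))
```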
