1. Hands-on Kaggle Competition: Image Classification (CIFAR-10)
https://www.kaggle.com/c/cifar-10
2. Q&A
-
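- For a convex function, any local minimum is also a global minimum, so an optimal solution can be found reliably. Common loss functions (e.g. squared error, cross-entropy) are convex in the model's output, but a neural network is non-convex in its parameters, so for general neural networks there is no guarantee of reaching the global optimum.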
-
- Momentum smooths the update direction a little: each step uses a weighted average of past gradients instead of only the current, noisy one.
-
- A scheduler uses a larger learning rate in the early stage of training and a smaller one later. It's like going out to see more of the world when you are young: explore widely first, then settle down.
-
- The noise in SGD acts as a form of regularization, which is one reason SGD often generalizes better than other optimization algorithms.
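
The momentum and scheduler answers above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the course's code: the function names, `beta = 0.9`, the cosine shape of the schedule, and the learning-rate bounds are all assumptions chosen for the example.

```python
import math


def momentum_step(velocity, grad, beta=0.9):
    """Exponentially weighted average of past gradients (the 'smoothing')."""
    return beta * velocity + (1.0 - beta) * grad


def cosine_lr(step, total_steps, lr_max=0.1, lr_min=0.001):
    """Cosine schedule: large learning rate early, small learning rate late."""
    progress = step / max(1, total_steps - 1)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


def smooth(grads, beta=0.9):
    """Apply momentum smoothing to a sequence of noisy gradients."""
    velocity, out = 0.0, []
    for g in grads:
        velocity = momentum_step(velocity, g, beta)
        out.append(velocity)
    return out


# Hypothetical noisy gradients that jump between 1.5 and 0.5 (true value 1.0).
noisy = [1.0 + (0.5 if i % 2 == 0 else -0.5) for i in range(50)]
smoothed = smooth(noisy)                      # settles close to 1.0
lrs = [cosine_lr(t, 50) for t in range(50)]   # decays from lr_max to lr_min
```

The raw gradients swing by 1.0 from step to step, while the smoothed values barely move once the average has warmed up. The schedule starts at `lr_max` and ends at `lr_min`, matching the "explore first, settle later" intuition.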
Reference
https://www.bilibili.com/video/BV1Gy4y1M7Cu?p=1