Copyright notice: this is the blogger's original article and may not be reproduced without permission. https://blog.csdn.net/hajungong007/article/details/87925253
- Stochastic Versus Batch Learning
- Shuffling the Examples
- Normalizing the Inputs
- The Sigmoid: use ReLU for the hidden layers instead.
- Choosing Target Values: if the targets are [0, 1], set them to [0.1, 0.9] so they do not sit at the saturated extremes of the activation function.
- Initializing the Weights: use the Xavier weight initialization scheme.
- Choosing Learning Rates: use adaptive methods such as AdaGrad and Adam.
- Radial Basis Function vs Sigmoid
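
The input-normalization tip above can be sketched in NumPy. This is a minimal illustration (the function name and sample data are my own, not from the post): each feature is shifted to zero mean and scaled to unit variance.

```python
import numpy as np

def normalize_inputs(X):
    """Zero-center each input feature and scale it to unit variance,
    as suggested under "Normalizing the Inputs"."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against division by zero for constant features
    return (X - mean) / std

# toy data: 3 samples, 2 features on very different scales
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
Xn = normalize_inputs(X)
```

Note that the mean and standard deviation should be computed on the training set only and then reused to transform validation and test data.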
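
The target-value tip (mapping [0, 1] labels to [0.1, 0.9]) is a one-line transform; a small sketch, with an illustrative function name of my own:

```python
def soften_targets(t, low=0.1, high=0.9):
    """Map a binary target in {0, 1} to {low, high} so it does not
    sit at the saturated extremes of a sigmoid output unit."""
    return low + (high - low) * t

targets = [0, 1, 1, 0]
soft = [soften_targets(t) for t in targets]
```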
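
The post names the Xavier initialization scheme without details; a minimal sketch of the common uniform variant, drawing weights from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)) (the function name is illustrative):

```python
import numpy as np

def xavier_init(fan_in, fan_out, rng=None):
    """Xavier (Glorot) uniform initialization: keeps the variance of
    activations roughly constant across layers at the start of training."""
    rng = np.random.default_rng() if rng is None else rng
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# weight matrix for a 256 -> 128 fully connected layer
W = xavier_init(256, 128, rng=np.random.default_rng(0))
```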
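
The learning-rate tip only names AdaGrad and Adam; as an illustration, here is a minimal NumPy sketch of a single AdaGrad-style update (function name and hyperparameter values are my own). AdaGrad accumulates squared gradients per parameter, so parameters with a history of large gradients get smaller effective learning rates.

```python
import numpy as np

def adagrad_step(w, grad, cache, lr=0.1, eps=1e-8):
    """One AdaGrad update. `cache` holds the running sum of squared
    gradients; the step size is scaled down by its square root."""
    cache += grad ** 2
    w -= lr * grad / (np.sqrt(cache) + eps)
    return w, cache

w = np.array([1.0, 2.0])
cache = np.zeros(2)
grad = np.array([0.5, -0.5])
w, cache = adagrad_step(w, grad, cache, lr=0.1)
```

Adam extends this idea with exponentially decaying averages of both the gradient and its square, so the accumulated scale does not grow without bound.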