Dropout and learning rate in neural networks

  1. Try several initial learning rates, say 1e-3, 1e-4, and 1e-5. If 1e-5 turns out to be the best one, that may mean your network is too complicated; you may want to reduce the number of layers.
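
The sweep in the tip above can be sketched roughly as follows. This is only a minimal illustration, not the author's setup: the `train` and `sweep` helpers are hypothetical names, and a tiny linear fit with plain gradient descent stands in for a real network so the example runs with the standard library alone.

```python
import random


def train(lr, steps=200, seed=0):
    """Fit y = 2x + 1 with plain gradient descent and return the final MSE.

    Assumption: this toy model stands in for the real network's training run.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(64)]
    ys = [2.0 * x + 1.0 for x in xs]
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n


def sweep(learning_rates):
    """Train once per learning rate; return the best rate and all final losses."""
    results = {lr: train(lr) for lr in learning_rates}
    best = min(results, key=results.get)
    return best, results


best_lr, losses = sweep([1e-3, 1e-4, 1e-5])
for lr, loss in losses.items():
    print(f"lr={lr:g}  final loss={loss:.6f}")
print("best learning rate:", best_lr)
```

With a real network, `train` would build the model, run a short training job at the given rate, and return a validation loss. The comparison at the end is the part the tip relies on: check which of the three rates wins before deciding whether to change the architecture.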

Reposted from blog.csdn.net/seamanj/article/details/103982094