- Try several initial learning rates, say 1e-3, 1e-4, and 1e-5. If 1e-5 turns out to be the best one, your network is probably too complicated, and you may want to reduce the number of layers.
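The sweep described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the model is a toy linear regression trained with plain gradient descent rather than a real network, and the names `train` and `sweep` are hypothetical helpers, not part of any library.

```python
import random


def train(lr, steps=200, seed=0):
    """Fit y = w*x + b by full-batch gradient descent; return the final MSE.

    Toy stand-in for a real training run (an assumption for illustration).
    """
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(64)]
    ys = [3.0 * x + 0.5 for x in xs]  # synthetic targets
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        gw = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        gb = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * gw
        b -= lr * gb
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n


def sweep(rates=(1e-3, 1e-4, 1e-5)):
    """Train once per initial learning rate and report the best one."""
    losses = {lr: train(lr) for lr in rates}
    best = min(losses, key=losses.get)
    return best, losses


best_lr, losses = sweep()
for lr, loss in losses.items():
    print(f"lr={lr:g}  final loss={loss:.6f}")
print("best initial learning rate:", best_lr)
```

With a real network you would swap `train` for your own training loop and compare validation loss rather than training loss. If the smallest rate wins, the heuristic above suggests trying a shallower model.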
Dropout and learning rate in neural networks
Reposted from blog.csdn.net/seamanj/article/details/103982094