Overfitting is generally reduced by enlarging the training set, adding a regularization term to the loss function, or applying Dropout, among other techniques. The main purpose of this note is to illustrate the role of Dropout.
When configuring Dropout with torch.nn.Dropout(0.5), the 0.5 means that each element of the layer's output is randomly dropped (zeroed out) with 50% probability at every training iteration, so it does not participate in that update. Layers with many neurons are generally given a higher dropout probability than layers with few neurons.
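A minimal sketch of this behavior (the module name and p=0.5 come from the text above; the tensor shape and seed are arbitrary choices for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)  # each element is zeroed with probability 0.5

drop.train()              # dropout is active only in training mode
x = torch.ones(1, 10)
y = drop(x)               # survivors are scaled by 1/(1-p) = 2.0

drop.eval()               # in eval mode dropout is a no-op
z = drop(x)
```

Note that surviving elements come out as 2.0, not 1.0: PyTorch uses "inverted dropout", scaling by 1/(1-p) during training so that no rescaling is needed at inference time.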
Question: how does Dropout discard neurons?
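One way to answer this is to re-implement the mechanism in plain Python. This is a hedged sketch of inverted dropout, not PyTorch's actual implementation: a Bernoulli mask zeroes each element with probability p, and survivors are scaled by 1/(1-p) so the expected activation is unchanged.

```python
import random

def dropout(x, p=0.5, training=True):
    """Inverted dropout on a list of floats (illustrative sketch).

    During training each element is zeroed with probability p and
    survivors are scaled by 1/(1-p); at inference time the input
    passes through untouched.
    """
    if not training or p == 0.0:
        return list(x)
    scale = 1.0 / (1.0 - p)
    # Bernoulli mask: 0 with probability p, 1 with probability 1-p
    mask = [0.0 if random.random() < p else 1.0 for _ in x]
    return [m * v * scale for m, v in zip(mask, x)]
```

With p=0.5, an input of all ones maps each element to either 0.0 (dropped) or 2.0 (kept and rescaled), which is exactly the pattern torch.nn.Dropout(0.5) produces in training mode.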
References:
https://www.jianshu.com/p/636be9f8f046
https://blog.csdn.net/u014532743/article/details/78453990