Neural network experiment notes (continuously updated)

(1) When there are many prediction categories, make sure each layer is wide enough (roughly comparable to the number of categories); otherwise a single feature ends up driving several categories at once, a one-to-many mapping, and the model does not train well (a sketch of this sizing heuristic follows the list).
(2) At its core, a neural network does feature extraction and maps inputs into a new space, so it is well worth examining the characteristics and distribution of the data before training (a quick inspection sketch is given below).
(3) Attention pays off most in tasks where individual word features carry a lot of weight, such as translation or text-classification tasks (a minimal attention sketch is included below).
(4) ReLU activation is a good fit for layers whose feature dimension shrinks step by step, and it can improve the robustness of the model (see the last sketch below).
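
As an illustration of tip (1), here is a minimal sketch, assuming a PyTorch-style MLP classifier; `num_features`, `num_classes`, and the hidden widths are made-up values, and keeping every hidden layer at least as wide as the number of categories is the post's heuristic, not a framework requirement.

```python
import torch.nn as nn

num_features = 128   # hypothetical input dimension
num_classes = 50     # many prediction categories

# Heuristic from tip (1): keep each hidden layer at least as wide as num_classes.
hidden_sizes = [256, 128, 64]
assert all(h >= num_classes for h in hidden_sizes)

layers, in_dim = [], num_features
for h in hidden_sizes:
    layers += [nn.Linear(in_dim, h), nn.ReLU()]
    in_dim = h
layers.append(nn.Linear(in_dim, num_classes))  # one logit per category
model = nn.Sequential(*layers)
```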
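
For tip (2), a rough sketch of inspecting feature statistics and distributions before training; `X` here is random placeholder data standing in for a real dataset, and the histogram is just one convenient way to eyeball a feature's distribution.

```python
import numpy as np
import matplotlib.pyplot as plt

X = np.random.randn(1000, 20)  # placeholder: (num_samples, num_features)

print("shape:", X.shape)
print("per-feature mean:", X.mean(axis=0))
print("per-feature std: ", X.std(axis=0))
print("global min/max:", X.min(), X.max())

# Histogram of a single feature to get a feel for its distribution.
plt.hist(X[:, 0], bins=50)
plt.title("Distribution of feature 0")
plt.show()
```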
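
To make tip (3) concrete, a small sketch of standard scaled dot-product attention (not code from the original post); tensor shapes and the toy inputs are purely illustrative.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, seq_len, d_k)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)  # per-token attention over the sequence
    return weights @ v, weights

# Toy usage: 2 sentences, 5 tokens each, 16-dim word features.
q = k = v = torch.randn(2, 5, 16)
out, attn = scaled_dot_product_attention(q, k, v)
print(out.shape, attn.shape)  # (2, 5, 16) and (2, 5, 5)
```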
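
Finally, a sketch for tip (4): a "funnel" network whose feature dimension shrinks layer by layer, with ReLU after each reduction; the specific sizes are arbitrary choices for illustration.

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 64),  nn.ReLU(),
    nn.Linear(64, 10),   # output layer: no ReLU before the loss/softmax
)
```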

Origin blog.csdn.net/cyinfi/article/details/94295805