Problems of NMT Models
- Over-parameterization
- Long running time
- Overfitting
- Large storage size
Redundancy in NMT Models
Most important: higher layers, attention weights, and softmax weights
Most redundant: lower layers and embedding weights
Traditional Solutions
Optimal Brain Damage (OBD) and Optimal Brain Surgeon (OBS)
Recent Approaches
Magnitude-based pruning with iterative retraining (repeatedly pruning the smallest-magnitude weights and then retraining the remaining ones) has yielded strong results for convolutional neural networks (CNNs) on visual tasks.
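The prune-then-retrain loop can be illustrated with a minimal sketch. This is not the CNN or NMT setup from the literature: it assumes a toy linear regression model trained by plain gradient descent, and the names `magnitude_mask`, `train`, `iterative_prune`, and the sparsity `schedule` are hypothetical, chosen only for illustration.

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes the `sparsity` fraction of smallest-|w| entries."""
    k = int(round(sparsity * weights.size))
    mask = np.ones(weights.size)
    if k > 0:
        # Flat indices of the k smallest-magnitude weights
        idx = np.argsort(np.abs(weights), axis=None)[:k]
        mask[idx] = 0.0
    return mask.reshape(weights.shape)

def train(w, mask, X, y, steps=200, lr=0.1):
    """Gradient descent on mean squared error; masked weights are held at zero."""
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w = (w - lr * grad) * mask
    return w

def iterative_prune(X, y, schedule=(0.25, 0.5, 0.75)):
    """Train densely, then alternate pruning and retraining at rising sparsity."""
    w = np.zeros(X.shape[1])
    mask = np.ones_like(w)
    w = train(w, mask, X, y)
    for sparsity in schedule:
        mask = magnitude_mask(w, sparsity)   # prune by magnitude
        w = train(w * mask, mask, X, y)      # retrain the surviving weights
    return w, mask

# Toy data (assumption): only the first two of eight features matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
true_w = np.array([3.0, -2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ true_w

w, mask = iterative_prune(X, y)
print("surviving weights:", int(mask.sum()))
print("pruned model:", np.round(w, 3))
```

Raising the sparsity in stages, rather than pruning to the final level in one shot, gives the remaining weights a chance to compensate after each round. In a real network the same mask would be applied per layer or globally, which is a design choice this sketch does not explore.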