Embedded Algorithm Porting and Optimization Study Notes (4): Model Compression and Pruning

Three challenges for model deployment

1. Model size: a CNN's strong performance comes from its millions of trainable parameters. Those parameters, together with the network structure, have to be stored on disk and loaded into memory during inference, so a large model is a heavy burden for embedded devices;
2. Runtime memory usage: during inference, the intermediate activations (feature maps) of a CNN can take even more memory than the model parameters themselves, even with a batch size of 1. This is no problem for high-performance GPUs, but it is unaffordable for many applications running on low-compute hardware;
3. Computation: convolutions on high-resolution images are computationally intensive, and a large CNN may need several minutes to process a single image on an embedded device, which makes it unusable in real applications (see the sketch after this list for a way to estimate all three costs).
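
Below is a minimal PyTorch sketch of how these three costs can be estimated. The toy architecture, the 224x224 input resolution, and the fp32 (4 bytes per parameter) assumption are illustrative choices, not from the original notes:

```python
import torch
import torch.nn as nn

# Toy CNN standing in for a real deployment model (hypothetical architecture).
model = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 10),
)

# Challenge 1 -- model size: parameter count x 4 bytes for fp32 weights.
n_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {n_params:,} (~{n_params * 4 / 1e6:.2f} MB as fp32)")

# Challenges 2 and 3 -- measure activations and conv MACs in one forward pass.
stats = {"act_bytes": 0, "macs": 0}

def hook(module, inputs, output):
    if isinstance(output, torch.Tensor):
        # Intermediate activation storage produced by this layer.
        stats["act_bytes"] += output.numel() * output.element_size()
    if isinstance(module, nn.Conv2d):
        # MACs = H_out * W_out * C_out * (C_in / groups) * kH * kW (bias ignored).
        _, c_out, h_out, w_out = output.shape
        kh, kw = module.kernel_size
        stats["macs"] += h_out * w_out * c_out * (module.in_channels // module.groups) * kh * kw

# Hook every leaf module so each intermediate tensor is counted once.
handles = [m.register_forward_hook(hook)
           for m in model.modules() if len(list(m.children())) == 0]

x = torch.randn(1, 3, 224, 224)  # batch size 1, 224x224 RGB input
with torch.no_grad():
    model(x)
for h in handles:
    h.remove()

print(f"activation memory (batch=1): ~{stats['act_bytes'] / 1e6:.2f} MB")
print(f"conv compute: ~{stats['macs'] / 1e9:.3f} GMACs per image")
```

Even on this toy model the sketch illustrates the second point: the weights take well under 1 MB, while the intermediate activations for a single 224x224 image take tens of megabytes.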

