A Survey of Compression Methods for Deep Learning Models

Because of their computational complexity and parameter redundancy, deep learning models can be hard to deploy in certain scenarios and on certain devices. Model compression methods, together with system-level optimization and acceleration, are needed to break through this bottleneck. This article surveys the main methods of model compression; I hope it is helpful to everyone.

1. Overview of model compression technology

We know that, to a certain extent, the deeper the network and the more parameters it has, the more complex the model becomes, but the better its final accuracy tends to be. Model compression algorithms are designed to convert such a large, complex model into a streamlined small one.
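As a rough, hypothetical illustration (the layer sizes below are made up for this sketch, not taken from the article), the effect of replacing a large model with a streamlined one can be quantified by comparing parameter counts:

```python
def param_count(layer_sizes):
    # weights (m * n) plus biases (n) for each consecutive pair
    # of fully connected layers
    return sum(m * n + n for m, n in zip(layer_sizes[:-1], layer_sizes[1:]))

big = param_count([784, 512, 256, 10])    # a larger MLP
small = param_count([784, 128, 64, 10])   # a compact, slimmed-down MLP

print(big, small, round(big / small, 2))  # the compact net is ~4.9x smaller
```

The same arithmetic applies to convolutional layers (kernel parameters instead of dense weights); the point is simply that compression is measured against the original parameter budget.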

Model compression is necessary because embedded devices have limited compute power and memory; only a compressed model can be deployed on such devices.
The definition of the model compression problem can be approached from three angles:

[Figure: the model compression problem viewed from three angles]

1.1 Classification of model compression methods

According to the degree to which the network structure is altered during compression, the book "Analysis of Convolutional Neural Networks" divides model compression techniques into two categories: "front-end compression" and "back-end compression":

  • Front-end compression refers to techniques that do not change the original network structure, mainly including knowledge distillation, lightweight networks (compact model structure design), and pruning at the filter level (structured pruning);
  • Back-end compression refers to techniques that are not restricted to preserving the original network structure, including but not limited to low-rank approximation and pruning at the individual-weight level (unstructured pruning).
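The difference between the two pruning styles above can be sketched as follows (a minimal illustration on a random weight matrix; the matrix size and pruning rates are arbitrary assumptions, not values from the article):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))  # a hypothetical weight matrix (8 filters x 8 inputs)

# Unstructured pruning (back-end): zero out individual small-magnitude
# weights. The matrix keeps its shape but becomes sparse.
thresh = np.quantile(np.abs(W), 0.5)              # prune 50% of entries
W_unstructured = np.where(np.abs(W) < thresh, 0.0, W)

# Structured, filter-level pruning (front-end): drop whole rows (filters)
# with the smallest L1 norm, so the layer itself shrinks.
norms = np.abs(W).sum(axis=1)
keep = np.sort(np.argsort(norms)[4:])             # keep the 4 strongest filters
W_structured = W[keep]

print(W_unstructured.shape, W_structured.shape)   # (8, 8) vs (4, 8)
```

Note the practical consequence: structured pruning yields a genuinely smaller dense layer that runs fast on ordinary hardware, while unstructured pruning only pays off with sparse storage formats or sparse-aware kernels.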


Origin blog.csdn.net/weixin_38346042/article/details/131352473