Model Compression Paper List
- Structure
- [BMVC2018] IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks
- [CVPR2018] IGCV2: Interleaved Structured Sparse Convolutional Neural Networks
- [CVPR2018] MobileNetV2: Inverted Residuals and Linear Bottlenecks
- [ECCV2018] ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
- Quantization
- Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1
- [ACM2017] FINN: A Framework for Fast, Scalable Binarized Neural Network Inference
- [arXiv2016] DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
- [ECCV2016] XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks
- [arXiv2016] Ternary Weight Networks
- [CVPR2018] Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
- [ACM2017] Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations
- [CVPR2018] Two-Step Quantization for Low-bit Neural Networks
- Pruning
- Channel pruning
- [NIPS2018] Discrimination-aware Channel Pruning for Deep Neural Networks
- [ICCV2017] Channel Pruning for Accelerating Very Deep Neural Networks
- [ECCV2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices
- [ICCV2017] Learning Efficient Convolutional Networks through Network Slimming
- [ICLR2018] Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers
- [CVPR2018] NISP: Pruning Networks using Neuron Importance Score Propagation
- [ICCV2017] ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression
- Sparsity
- Fusion
- Distillation
- Comprehensive
Based on my own understanding, I divide model compression research into the following seven directions:
Structure
[BMVC2018] IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks
- intro:
- arxiv:https://arxiv.org/abs/1806.00178
- github:https://github.com/homles11/IGCV3
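
The IGC family builds blocks from group convolutions joined by channel permutations, so that channels landing in the same group of the second convolution come from different groups of the first. Below is a minimal sketch of that interleaving idea in PyTorch; the module name, group counts, and channel sizes are illustrative, and this is not the exact IGCV3 block (which additionally uses low-rank/bottleneck group convolutions):

```python
# Sketch of the interleaving idea behind the IGC papers: two group convolutions
# joined by a channel permutation. Names and sizes are illustrative only.
import torch
import torch.nn as nn

def channel_permute(x, groups):
    # Reorder channels (groups, c/groups) -> (c/groups, groups).
    n, c, h, w = x.size()
    return x.view(n, groups, c // groups, h, w).transpose(1, 2).reshape(n, c, h, w)

class InterleavedGroupConv(nn.Module):
    def __init__(self, channels, g1=2, g2=4):
        super().__init__()
        self.g2 = g2
        self.primary = nn.Conv2d(channels, channels, 3, padding=1, groups=g1, bias=False)
        self.secondary = nn.Conv2d(channels, channels, 1, groups=g2, bias=False)

    def forward(self, x):
        x = self.primary(x)              # spatial group convolution
        x = channel_permute(x, self.g2)  # interleave channels across groups
        return self.secondary(x)         # pointwise group convolution

x = torch.randn(1, 8, 32, 32)
print(InterleavedGroupConv(8)(x).shape)  # torch.Size([1, 8, 32, 32])
```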
[CVPR2018] IGCV2: Interleaved Structured Sparse Convolutional Neural Networks
- intro:
- arxiv:https://arxiv.org/abs/1804.06202
- github: same as above
[CVPR2018] MobileNetV2: Inverted Residuals and Linear Bottlenecks
- intro:
- arxiv:https://arxiv.org/abs/1801.04381
- github:https://github.com/tensorflow/models/tree/master/research/slim/nets/mobilenet
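
The block named in the title is the inverted residual with a linear bottleneck: a 1x1 expansion, a 3x3 depthwise convolution, and a 1x1 projection with no activation, plus a skip connection when input and output shapes match. A minimal PyTorch sketch; the class name, expansion ratio, and tensor sizes are illustrative, not the reference implementation linked above:

```python
# Sketch of a MobileNetV2-style inverted residual block with a linear bottleneck.
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    def __init__(self, inp, oup, stride=1, expand_ratio=6):
        super().__init__()
        hidden = inp * expand_ratio
        self.use_res = stride == 1 and inp == oup
        self.block = nn.Sequential(
            nn.Conv2d(inp, hidden, 1, bias=False),             # 1x1 expansion
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, stride, 1,
                      groups=hidden, bias=False),               # 3x3 depthwise
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, oup, 1, bias=False),              # 1x1 projection
            nn.BatchNorm2d(oup),                                # no ReLU: linear bottleneck
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_res else out

print(InvertedResidual(32, 32)(torch.randn(1, 32, 56, 56)).shape)
```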
[ECCV2018] ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
- intro:
- arxiv:https://arxiv.org/abs/1807.11164
- github:
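
ShuffleNet V2's basic unit (stride 1) splits the channels into two halves, transforms one half, concatenates the two halves, and then channel-shuffles the result so information mixes between branches. A rough sketch assuming a stride-1 unit with equal input/output channels; names and layer sizes are illustrative:

```python
# Sketch of a ShuffleNet V2 basic unit: channel split -> branch -> concat -> shuffle.
import torch
import torch.nn as nn

def channel_shuffle(x, groups=2):
    n, c, h, w = x.size()
    return x.view(n, groups, c // groups, h, w).transpose(1, 2).reshape(n, c, h, w)

class ShuffleV2Unit(nn.Module):
    def __init__(self, channels):
        super().__init__()
        half = channels // 2
        self.branch = nn.Sequential(
            nn.Conv2d(half, half, 1, bias=False), nn.BatchNorm2d(half), nn.ReLU(inplace=True),
            nn.Conv2d(half, half, 3, padding=1, groups=half, bias=False), nn.BatchNorm2d(half),
            nn.Conv2d(half, half, 1, bias=False), nn.BatchNorm2d(half), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)                     # channel split
        out = torch.cat([x1, self.branch(x2)], dim=1)  # identity half + transformed half
        return channel_shuffle(out)                    # mix the two halves

print(ShuffleV2Unit(64)(torch.randn(1, 64, 28, 28)).shape)
```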
Quantization
Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1
- intro: binarized neural networks
- arxiv:https://arxiv.org/abs/1602.02830
- github: https://github.com/MatthieuCourbariaux/BinaryNet
https://github.com/itayhubara/BinaryNet
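
Binarized networks constrain weights and activations to {-1, +1} in the forward pass and use a straight-through estimator so gradients can still flow through the sign function. A minimal PyTorch sketch of that estimator, for illustration only (not the authors' original code):

```python
# Sketch of BNN-style binarization with a clipped straight-through estimator:
# forward uses sign(x), backward passes the gradient through where |x| <= 1.
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        out = torch.sign(x)
        out[out == 0] = 1.0          # map 0 to +1 so outputs are strictly {-1, +1}
        return out

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output * (x.abs() <= 1).float()   # clipped straight-through estimator

x = torch.randn(5, requires_grad=True)
BinarizeSTE.apply(x).sum().backward()
print(x.grad)
```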
[ACM2017] FINN: A Framework for Fast, Scalable Binarized Neural Network Inference
- intro: binarized neural network inference on FPGAs (Xilinx)
- pdf:http://www.idi.ntnu.no/~yamanu/2017-fpga-finn-preprint.pdf
- github:https://github.com/Xilinx/FINN
[arXiv2016] DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
- intro: low bit-width weights, activations, and gradients
- arxiv:https://arxiv.org/abs/1606.06160
- github:https://github.com/tensorpack/tensorpack/tree/master/examples/DoReFa-Net
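
DoReFa quantizes weights and activations (and, during training, gradients) to k bits with a uniform quantizer on [0, 1]. A sketch of the weight/activation quantizers as described in the paper; the straight-through gradient estimator and gradient quantization are omitted:

```python
# Sketch of DoReFa-style k-bit quantization: quantize_k maps [0, 1] onto a
# uniform grid with 2^k levels; weights are squashed into [0, 1] with tanh first.
import torch

def quantize_k(x, k):
    n = 2 ** k - 1
    return torch.round(x * n) / n           # uniform quantizer on [0, 1]

def dorefa_weight(w, k):
    t = torch.tanh(w)
    w01 = t / (2 * t.abs().max()) + 0.5     # map weights into [0, 1]
    return 2 * quantize_k(w01, k) - 1       # back to [-1, 1]

def dorefa_activation(a, k):
    return quantize_k(torch.clamp(a, 0, 1), k)

w = torch.randn(4, 4)
print(dorefa_weight(w, k=2))
print(dorefa_activation(torch.rand(4), k=2))
```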
[ECCV2016] XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks
- intro: from the darknet team
- arxiv:https://arxiv.org/abs/1603.05279
- github:https://github.com/allenai/XNOR-Net
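
XNOR-Net approximates each real-valued filter by sign(W) times a per-filter scaling factor alpha = mean(|W|), so convolutions can be carried out with binary operations. A minimal sketch of the weight binarization step (tensor sizes are illustrative):

```python
# Sketch of XNOR-Net weight binarization: per-filter scale times sign(W).
import torch

def xnor_binarize(weight):
    # weight: (out_channels, in_channels, kH, kW)
    alpha = weight.abs().mean(dim=(1, 2, 3), keepdim=True)  # per-filter scale
    return alpha * torch.sign(weight)

w = torch.randn(16, 8, 3, 3)
w_bin = xnor_binarize(w)
print(torch.unique(w_bin[0]))   # each filter holds only {-alpha, +alpha} (and possibly 0)
```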
[arXiv2016] Ternary Weight Networks
- intro:
- arxiv:https://arxiv.org/abs/1605.04711
- github:https://github.com/fengfu-chris/caffe-twns
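
Ternary weight networks map weights to {-alpha, 0, +alpha} using a threshold proportional to the mean absolute weight. A small sketch following the paper's approximate rule delta ≈ 0.7 · mean(|W|); sizes are illustrative:

```python
# Sketch of ternarization: threshold delta ~ 0.7*mean(|W|), scale alpha from the
# surviving weights, output values in {-alpha, 0, +alpha}.
import torch

def ternarize(w):
    delta = 0.7 * w.abs().mean()
    mask = (w.abs() > delta).float()
    alpha = (w.abs() * mask).sum() / mask.sum().clamp(min=1)
    return alpha * torch.sign(w) * mask

w = torch.randn(64)
print(torch.unique(ternarize(w)))   # roughly {-alpha, 0, +alpha}
```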
[CVPR2018] Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
- intro: from Google
- arxiv:https://arxiv.org/abs/1712.05877
- github:https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/quantize
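
This scheme represents real values as r ≈ scale · (q − zero_point) with 8-bit integers, so inference can run on integer arithmetic only. A minimal NumPy sketch of the affine (asymmetric, per-tensor) quantization; the fake-quantization training part is omitted and the helper names are illustrative:

```python
# Sketch of affine 8-bit quantization: r ~= scale * (q - zero_point).
import numpy as np

def quantize_params(r_min, r_max, num_bits=8):
    qmin, qmax = 0, 2 ** num_bits - 1
    r_min, r_max = min(r_min, 0.0), max(r_max, 0.0)   # zero must be exactly representable
    scale = (r_max - r_min) / (qmax - qmin)
    zero_point = int(round(qmin - r_min / scale))
    return scale, int(np.clip(zero_point, qmin, qmax))

def quantize(x, scale, zero_point):
    return np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)

def dequantize(q, scale, zero_point):
    return scale * (q.astype(np.float32) - zero_point)

x = np.random.randn(5).astype(np.float32)
scale, zp = quantize_params(x.min(), x.max())
q = quantize(x, scale, zp)
print(x, dequantize(q, scale, zp))   # reconstruction is close to the original
```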
[ACM2017] Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations
- intro:QNNs
- arxiv:https://arxiv.org/abs/1609.07061
- github:https://github.com/peisuke/qnn
Two-Step Quantization for Low-bit Neural Networks
- intro:
- paper:http://openaccess.thecvf.com/content_cvpr_2018/papers/Wang_Two-Step_Quantization_for_CVPR_2018_paper.pdf
- github:
Pruning
Channel pruning
[NIPS2018] Discrimination-aware Channel Pruning for Deep Neural Networks
- intro:
- arxiv:https://arxiv.org/abs/1810.11809
- github:https://github.com/Tencent/PocketFlow (supports DisChnPrunedLearner)
[ICCV2017] Channel Pruning for Accelerating Very Deep Neural Networks
- intro: LASSO regression for channel selection
- arxiv:https://arxiv.org/abs/1707.06168
- github:https://github.com/yihui-he/channel-pruning
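
The method selects which input channels of a layer to keep by solving a LASSO problem that reconstructs the layer's output from a sparse subset of channel contributions, followed by least-squares reconstruction of the remaining weights. A toy sketch of the channel-selection step using scikit-learn's Lasso; the data and dimensions are made up for illustration:

```python
# Toy sketch of LASSO-based channel selection: keep channels whose coefficient
# in a sparse reconstruction of the layer output is nonzero.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
C, N = 16, 512                          # input channels, number of sampled outputs
X = rng.standard_normal((N, C))         # per-channel contributions to the output
Y = X[:, :6] @ rng.standard_normal(6)   # the true output depends on only 6 channels

lasso = Lasso(alpha=0.1).fit(X, Y)
keep = np.flatnonzero(np.abs(lasso.coef_) > 1e-6)
print("kept channels:", keep)           # channels with nonzero coefficients survive
```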
[ECCV2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices
- intro: automated, learning-based compression
- arxiv:https://arxiv.org/abs/1802.03494
- Chinese translation of the paper: https://www.jiqizhixin.com/articles/AutoML-for-Model-Compression-and-Acceleration-on-Mobile-Devices
[ICCV2017] Learning Efficient Convolutional Networks through Network Slimming
- intro:Zhuang Liu
- arxiv:https://arxiv.org/abs/1708.06519
- github:https://github.com/Eric-mingjie/network-slimming
https://github.com/foolwood/pytorch-slimming
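
Network slimming adds an L1 penalty on the BatchNorm scale factors (gamma) during training and then prunes the channels whose gamma falls below a global threshold. A rough PyTorch sketch of those two pieces; the helper names, penalty weight, and pruning ratio are illustrative:

```python
# Sketch of network slimming: L1 penalty on BN gammas, then threshold-based pruning.
import torch
import torch.nn as nn

def bn_l1_penalty(model, lam=1e-4):
    # Extra loss term: lam * sum(|gamma|) over all BatchNorm layers.
    return lam * sum(m.weight.abs().sum()
                     for m in model.modules() if isinstance(m, nn.BatchNorm2d))

def channels_to_prune(model, prune_ratio=0.5):
    gammas = torch.cat([m.weight.detach().abs().flatten()
                        for m in model.modules() if isinstance(m, nn.BatchNorm2d)])
    threshold = torch.quantile(gammas, prune_ratio)   # global threshold over all gammas
    return {name: (m.weight.detach().abs() < threshold)
            for name, m in model.named_modules() if isinstance(m, nn.BatchNorm2d)}

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU())
print(bn_l1_penalty(model), channels_to_prune(model))
```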
[ICLR2018] Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers
- intro:
- arxiv:https://arxiv.org/abs/1802.00124
- github: [PyTorch] https://github.com/jack-willturner/batchnorm-pruning
[TensorFlow] https://github.com/bobye/batchnorm_prune
[CVPR2018] NISP: Pruning Networks using Neuron Importance Score Propagation
- intro:
- arxiv:https://arxiv.org/abs/1711.05908
[ICCV2017] ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression
- intro:
- web:http://lamda.nju.edu.cn/luojh/project/ThiNet_ICCV17/ThiNet_ICCV17_CN.html
- github:https://github.com/Roll920/ThiNet
https://github.com/Roll920/ThiNet_Code
Sparsity
SBNet: Sparse Blocks Network for Fast Inference
- intro: Uber
- arxiv:https://arxiv.org/abs/1801.02108
- github:https://github.com/uber/sbnet
To Prune, or Not to Prune: Exploring the Efficacy of Pruning for Model Compression
- intro: sparsity via gradual magnitude pruning
- arxiv:https://arxiv.org/abs/1710.01878
- github:https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/model_pruning
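
The paper's gradual pruning schedule ramps sparsity from s_i to s_f over n pruning steps with a cubic curve, masking the smallest-magnitude weights at each step. A small NumPy sketch of the schedule and the magnitude mask; hyperparameters are illustrative:

```python
# Sketch of the gradual sparsity schedule s_t = s_f + (s_i - s_f)(1 - t)^3
# combined with magnitude-based masking.
import numpy as np

def target_sparsity(step, s_i=0.0, s_f=0.9, t0=0, n=100, dt=1):
    t = np.clip((step - t0) / (n * dt), 0.0, 1.0)
    return s_f + (s_i - s_f) * (1.0 - t) ** 3

def magnitude_mask(weights, sparsity):
    k = int(sparsity * weights.size)
    if k == 0:
        return np.ones_like(weights, dtype=bool)
    threshold = np.sort(np.abs(weights).ravel())[k - 1]
    return np.abs(weights) > threshold

w = np.random.randn(1000)
for step in (0, 50, 100):
    s = target_sparsity(step)
    print(step, round(s, 3), magnitude_mask(w, s).mean())   # fraction of weights kept
```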
Submanifold Sparse Convolutional Networks
- intro:Facebook
- arxiv:https://arxiv.org/abs/1706.01307
- github:https://github.com/facebookresearch/SparseConvNet
Fusion
Distillation
[NIPS2014] Distilling the Knowledge in a Neural Network
- intro: from Hinton
- arxiv:https://arxiv.org/abs/1503.02531
- github:https://github.com/peterliht/knowledge-distillation-pytorch
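
Hinton-style distillation trains the student on a weighted sum of the usual cross-entropy and a KL term between teacher and student soft targets at temperature T, scaled by T². A minimal PyTorch sketch; the temperature and weighting values are illustrative:

```python
# Sketch of the knowledge distillation loss: soft-target KL (at temperature T)
# plus hard-label cross-entropy.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student = torch.randn(8, 10, requires_grad=True)
teacher = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
print(distillation_loss(student, teacher, labels))
```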
Comprehensive
[ICLR2016] Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding
- intro: pioneering work in this area
- arxiv:https://arxiv.org/abs/1510.00149
- github:https://github.com/songhan
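
Deep Compression combines magnitude pruning, weight sharing via k-means clustering (trained quantization), and Huffman coding of the indices. A toy NumPy/scikit-learn sketch of the first two stages on a single weight vector; the threshold and cluster count are illustrative, and retraining plus Huffman coding are omitted:

```python
# Sketch of the first two Deep Compression stages: magnitude pruning, then
# weight sharing by clustering the surviving weights into a small codebook.
import numpy as np
from sklearn.cluster import KMeans

w = np.random.randn(256).astype(np.float32)

# 1. Pruning: zero out weights below a magnitude threshold.
threshold = np.percentile(np.abs(w), 70)          # keep the largest 30%
mask = np.abs(w) > threshold
pruned = w * mask

# 2. Trained quantization (weight sharing): cluster the remaining weights.
nonzero = pruned[mask].reshape(-1, 1)
kmeans = KMeans(n_clusters=16, n_init=10).fit(nonzero)
codebook = kmeans.cluster_centers_.ravel()        # 16 shared values (4-bit indices)
quantized = np.zeros_like(pruned)
quantized[mask] = codebook[kmeans.predict(nonzero)]
print("unique values:", np.unique(quantized).size)  # at most 16 centroids plus zero
```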
Model Distillation with Knowledge Transfer from Face Classification to Alignment and Verification
- intro: extensive experiments; well suited to engineering practice
- arxiv:https://arxiv.org/abs/1709.02929