Knowledge Distillation: A Collection of Must-Read Papers

1. Early Papers


Model Compression, KDD 2006
Do Deep Nets Really Need to be Deep?, NIPS 2014
Distilling the Knowledge in a Neural Network, NIPS-workshop 2014
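
The key recipe from "Distilling the Knowledge in a Neural Network" is a weighted sum of the usual cross-entropy on hard labels and a KL term between temperature-softened teacher and student outputs. Below is a minimal PyTorch sketch of that loss; the temperature `T` and mixing weight `alpha` are illustrative defaults, not values prescribed by the paper.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Hinton-style KD: cross-entropy on hard labels + KL on temperature-softened logits."""
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    log_soft_student = F.log_softmax(student_logits / T, dim=1)
    # The T*T factor keeps the soft-target gradients on the same scale as the CE term.
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * T * T
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce
```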

2. Feature Distillation


FitNets: Hints for Thin Deep Nets, ICLR 2015
Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, ICLR 2017
https://github.com/szagoruyko/attention-transfer
Learning Deep Representations with Probabilistic Knowledge Transfer, ECCV 2018
https://github.com/passalis/probabilistic_kt
Knowledge Distillation via Instance Relationship Graph, CVPR 2019
https://github.com/yufanLIU/IRG
Relational Knowledge Distillation, CVPR 2019
https://github.com/lenscloth/RKD
Similarity-Preserving Knowledge Distillation, CVPR 2019
Variational Information Distillation for Knowledge Transfer, CVPR 2019
Contrastive Representation Distillation, ICLR 2020
https://github.com/HobbitLong/RepDistiller
Heterogeneous Knowledge Distillation using Information Flow Modeling, CVPR 2020
https://github.com/passalis/pkth
Matching Guided Distillation, ECCV 2020
https://github.com/KaiyuYue/mgd
Cross-Layer Distillation with Semantic Calibration, AAAI 2021
https://github.com/DefangChen/SemCKD
Distilling Holistic Knowledge with Graph Neural Networks, ICCV 2021
https://github.com/wyc-ruiker/HKD
Knowledge Distillation with the Reused Teacher Classifier, CVPR 2022
https://github.com/DefangChen/SimKD
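
A common thread in the feature-distillation papers above is matching intermediate representations instead of (or in addition to) output logits. The sketch below shows a minimal FitNets-style hint loss as one concrete instance; the 1x1-convolution regressor and the tensor shapes are assumptions for illustration, and the later papers in this section replace the plain MSE with attention maps, relational/similarity structures, contrastive objectives, and so on.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HintLoss(nn.Module):
    """MSE between a teacher feature map and a 1x1-conv-adapted student feature map."""
    def __init__(self, student_channels, teacher_channels):
        super().__init__()
        self.regressor = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, student_feat, teacher_feat):
        # The teacher feature is a fixed target; no gradient flows back into the teacher.
        return F.mse_loss(self.regressor(student_feat), teacher_feat.detach())

# Usage with made-up shapes: 64-channel student map, 256-channel teacher map, 8x8 spatial size.
hint = HintLoss(64, 256)
loss = hint(torch.randn(2, 64, 8, 8), torch.randn(2, 256, 8, 8))
```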

3. Online Knowledge Distillation


Deep Mutual Learning, CVPR 2018
https://github.com/huanghoujing/AlignedReID-Re-Production-Pytorch
Large scale distributed neural network training through online distillation, ICLR 2018
Collaborative Learning for Deep Neural Networks, NeurIPS 2018
Knowledge Distillation by On-the-Fly Native Ensemble, NeurIPS 2018
https://github.com/Lan1991Xu/ONE_NeurIPS2018
Online Knowledge Distillation with Diverse Peers, AAAI 2020
https://github.com/DefangChen/OKDDip-AAAI2020
Online Knowledge Distillation via Collaborative Learning, CVPR 2020
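
The online-distillation papers above drop the fixed, pre-trained teacher: peer networks (or an on-the-fly ensemble) supervise each other while training from scratch. Below is a minimal two-peer update in the spirit of Deep Mutual Learning; the peer models, optimizers, and temperature are placeholders supplied by the caller, not details taken from any one paper.

```python
import torch.nn.functional as F

def mutual_step(peer_a, peer_b, opt_a, opt_b, x, y, T=1.0):
    """One mutual-learning step: each peer fits the labels and the other's softened output."""
    logits_a, logits_b = peer_a(x), peer_b(x)

    def peer_loss(own_logits, other_logits):
        ce = F.cross_entropy(own_logits, y)
        kl = F.kl_div(F.log_softmax(own_logits / T, dim=1),
                      F.softmax(other_logits.detach() / T, dim=1),  # peer output is a fixed target
                      reduction="batchmean")
        return ce + kl

    opt_a.zero_grad()
    peer_loss(logits_a, logits_b).backward()
    opt_a.step()

    opt_b.zero_grad()
    peer_loss(logits_b, logits_a).backward()
    opt_b.step()
```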

4. Multi-Teacher Knowledge Distillation


Distilling Knowledge from Ensembles of Neural Networks for Speech Recognition, INTERSPEECH 2016
Efficient Knowledge Distillation from an Ensemble of Teachers, INTERSPEECH 2017
Learning from Multiple Teacher Networks, KDD 2017
Multi-teacher Knowledge Distillation for Compressed Video Action Recognition on Deep Neural Networks, ICASSP 2019
Agree to Disagree: Adaptive Ensemble Knowledge Distillation in Gradient Space, NeurIPS 2020
https://github.com/AnTuo1998/AE-KD
Adaptive Knowledge Distillation Based on Entropy, ICASSP 2020
Reinforced Multi-Teacher Selection for Knowledge Distillation, AAAI 2021
Adaptive Distillation: Aggregating Knowledge from Multiple Paths for Efficient Distillation, BMVC 2021
https://github.com/wyze-AI/AdaptiveDistillation
Confidence-Aware Multi-Teacher Knowledge Distillation, ICASSP 2022
https://github.com/Rorozhl/CA-MKD
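
Most of the multi-teacher methods above differ mainly in how the teachers' soft targets are weighted: uniformly, by entropy or confidence, by gradient agreement, or by a learned/reinforced selector. The sketch below distills against a weighted mixture of softened teacher predictions with uniform weights; the cited papers substitute adaptive weighting schemes, and the teachers are assumed to be evaluated under `torch.no_grad()` by the caller.

```python
import torch
import torch.nn.functional as F

def multi_teacher_kd_loss(student_logits, teacher_logits_list, labels,
                          weights=None, T=4.0, alpha=0.9):
    """KD against a weighted mixture of several teachers' softened predictions."""
    n = len(teacher_logits_list)
    if weights is None:
        weights = torch.full((n,), 1.0 / n)  # uniform; adaptive schemes plug in here
    target = sum(w * F.softmax(tl / T, dim=1)
                 for w, tl in zip(weights, teacher_logits_list))
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1), target,
                  reduction="batchmean") * T * T
    return alpha * kd + (1.0 - alpha) * F.cross_entropy(student_logits, labels)
```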

5. Diffusion Distillation


Progressive Distillation for Fast Sampling of Diffusion Models, ICLR 2022
https://github.com/google-research/google-research/tree/master/diffusion_distillation
Accelerating Diffusion Sampling with Classifier-based Feature Distillation, ICME 2023
https://github.com/zju-SWJ/RCFD
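
Progressive distillation halves the number of sampling steps per round: one student DDIM step is trained to reproduce two consecutive DDIM steps of the teacher. The sketch below is heavily simplified (eps-parameterisation, direct MSE against the teacher's two-step output, and a toy `EpsNet` standing in for a real denoising U-Net); the actual paper regresses an equivalent reparameterised x0 target instead.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EpsNet(nn.Module):
    """Toy stand-in for a denoising network that predicts the noise eps(x_t, t)."""
    def __init__(self, dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 128), nn.SiLU(), nn.Linear(128, dim))

    def forward(self, x, t):
        return self.net(torch.cat([x, t.float().unsqueeze(1)], dim=1))

def ddim_step(model, x, t, t_next, acp):
    """One deterministic DDIM step from t to t_next, given cumulative alphas `acp`."""
    a, a_next = acp[t].unsqueeze(1), acp[t_next].unsqueeze(1)
    eps = model(x, t)
    x0 = (x - (1 - a).sqrt() * eps) / a.sqrt()
    return a_next.sqrt() * x0 + (1 - a_next).sqrt() * eps

def progressive_distill_loss(student, teacher, x0, acp):
    """One student step t -> t-2 should match the teacher's two steps t -> t-1 -> t-2."""
    b = x0.shape[0]
    t = torch.randint(2, len(acp), (b,))  # leave room for two teacher steps
    a = acp[t].unsqueeze(1)
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * torch.randn_like(x0)
    with torch.no_grad():  # teacher target: two consecutive small steps
        x_mid = ddim_step(teacher, x_t, t, t - 1, acp)
        target = ddim_step(teacher, x_mid, t - 1, t - 2, acp)
    return F.mse_loss(ddim_step(student, x_t, t, t - 2, acp), target)

# Hypothetical noise schedule for `acp`, e.g.:
# acp = torch.cumprod(1 - torch.linspace(1e-4, 0.02, 1000), dim=0)
```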

6. Data-Free Knowledge Distillation


Data-Free Knowledge Distillation for Deep Neural Networks, NIPS-workshop 2017
https://github.com/iRapha/replayed_distillation
DAFL: Data-Free Learning of Student Networks, ICCV 2019
https://github.com/huawei-noah/Efficient-Computing/tree/master/Data-Efficient-Model-Compression
Zero-Shot Knowledge Distillation in Deep Networks, ICML 2019
https://github.com/vcl-iisc/ZSKD
Zero-shot Knowledge Transfer via Adversarial Belief Matching, NeurIPS 2019
https://github.com/polo5/ZeroShotKnowledgeTransfer
Knowledge Extraction with No Observable Data, NeurIPS 2019
https://github.com/snudatalab/KegNet
Dream Distillation: A Data-Independent Model Compression Framework, ICML-workshop 2019
DeGAN: Data-Enriching GAN for Retrieving Representative Samples from a Trained Classifier, AAAI 2020
https://github.com/vcl-iisc/DeGAN
Dreaming to Distill: Data-free Knowledge Transfer via DeepInversion, CVPR 2020
https://github.com/NVlabs/DeepInversion
The Knowledge Within: Methods for Data-Free Model Compression, CVPR 2020
Data-Free Adversarial Distillation, ICASSP 2020
Data-Free Knowledge Distillation with Soft Targeted Transfer Set Synthesis, AAAI 2021
Learning Student Networks in the Wild, CVPR 2021
https://github.com/huawei-noah/Data-Efficient-Model-Compression
Contrastive Model Inversion for Data-Free Knowledge Distillation, IJCAI 2021
https://github.com/zju-vipa/DataFree
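
The data-free methods above synthesize transfer data from the teacher itself, either by optimizing input noise directly (DeepInversion-style) or by training a generator against the teacher (DAFL/DFAD-style). The sketch below is a stripped-down generator-based loop: push the generator toward inputs the frozen teacher classifies confidently and diversely, then distill the student on those synthetic inputs. The loss terms and weights are simplified illustrations of the DAFL objectives, not any single paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def generator_losses(teacher_logits):
    """DAFL-style objectives: sharp (one-hot-like) predictions plus class diversity."""
    probs = F.softmax(teacher_logits, dim=1)
    one_hot = F.cross_entropy(teacher_logits, teacher_logits.argmax(dim=1))
    mean_probs = probs.mean(dim=0)
    # Minimising sum(p * log p) of the batch-averaged prediction maximises its entropy,
    # which pushes the generator to cover many classes.
    diversity = (mean_probs * mean_probs.clamp_min(1e-8).log()).sum()
    return one_hot + diversity

def data_free_step(generator, teacher, student, opt_g, opt_s, z, T=4.0):
    """One alternating update; the teacher is assumed frozen (requires_grad=False)."""
    # 1) Generator step: make samples the teacher is confident about.
    opt_g.zero_grad()
    generator_losses(teacher(generator(z))).backward()
    opt_g.step()

    # 2) Student step: match the teacher's softened outputs on a fresh synthetic batch.
    with torch.no_grad():
        fake = generator(z)
        teacher_logits = teacher(fake)
    kd = F.kl_div(F.log_softmax(student(fake) / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1), reduction="batchmean")
    opt_s.zero_grad()
    kd.backward()
    opt_s.step()
```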

Reposted from blog.csdn.net/Mikasa33/article/details/130247269