Knowledge Distillation (Deep Learning Model Compression)

Enterprise 2023-06-05 08:04:53

Model compression can be roughly divided into five types:

Model pruning: remove components that contribute little to the results, for example by reducing the number of attention heads or removing layers with little effect;
Quantization: represent values at lower numerical precision, for example reducing float32 to float8;
Knowledge distillation: transfer the ability of a teacher model to a student model. The student is generally smaller than the teacher. A large, deep network can be distilled into a small network, and an ensemble of networks can also be distilled into a single small network;
Parameter sharing: reduce the number of network parameters by reusing the same parameters in several places, as ALBERT does by sharing parameters across its Transformer layers;
Parameter matrix approximation: reduce the number of matrix parameters through low-rank decomposition of the matrix or other methods.
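
To make the knowledge distillation item more concrete, here is a minimal sketch of one common way to write a distillation loss. The original post does not give a formula, so everything here is an assumption for illustration: the temperature-softened softmax, the weighting factor `alpha`, and the function names `softmax` and `distillation_loss` are all hypothetical choices, and real projects would typically use a deep learning framework rather than plain Python.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    temperature and alpha are illustrative hyperparameters (assumptions),
    not values taken from the original post.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Soft-target term: KL(teacher || student) on the softened distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Hard-label term: cross-entropy against the true class at temperature 1.
    ce = -math.log(softmax(student_logits)[label])
    # The temperature**2 factor is a common convention that keeps the
    # soft-target gradients on a scale comparable to the hard-label term.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce

# Toy logits for a 3-class problem (made-up numbers).
teacher = [5.0, 2.0, 0.5]
good_student = [4.0, 1.5, 0.2]   # roughly agrees with the teacher
bad_student = [0.5, 2.0, 5.0]    # disagrees with the teacher

loss_good = distillation_loss(good_student, teacher, label=0)
loss_bad = distillation_loss(bad_student, teacher, label=0)
print(loss_good, loss_bad)
```

In this sketch a student whose logits track the teacher's gets a lower loss than one that disagrees, which is the signal that pushes the small network toward the large network's behavior during training.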

Origin blog.csdn.net/qq_41318914/article/details/127720154