Deep learning concepts (terminology): Fine-tuning, Knowledge Distillation, etc.

The concepts covered here all build on existing pre-trained models: models that have already been trained and have some generalization ability, but need to be "reprocessed" to meet the requirements of other tasks.

In the post-GPT era, fine-tuning models is becoming a common practice, so I will take this opportunity to explain the related concepts.

1. Fine-tuning

Some people think there is no difference between fine-tuning and training, since both train a model. The difference is that fine-tuning is targeted retraining that starts from an already-trained model. It typically uses an additional, task-specific dataset and a reduced learning rate so the model adapts to the new task without losing what it has already learned.
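
A minimal sketch of this, assuming PyTorch and torchvision; the batch of images and labels below is random dummy data standing in for a real task dataset. Fine-tuning looks like ordinary training, just starting from pre-trained weights and using a smaller learning rate:

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from a pre-trained model rather than random weights.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 10)  # new head for a 10-class task

# Use a smaller learning rate than the original training run,
# so the pre-trained weights are only gently adjusted.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# Dummy batch standing in for a real DataLoader over the target-task data.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 10, (8,))

model.train()
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```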

2. Transfer Learning

Transfer learning is the broader idea of adapting a model to new tasks, which may involve modifying and retraining parts of it. Fine-tuning can be thought of as one form of transfer learning.

Compared with fine-tuning, transfer learning often does not retrain the original model at all. You can train only part of it, or add one or two new layers on top, using the pre-trained model's output as the input to the new layers and training only those added parts, as sketched below.
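
A sketch of that pattern, again assuming PyTorch/torchvision with a dummy batch: freeze the pre-trained backbone, attach a small new head, and update only the head's parameters:

```python
import torch
import torch.nn as nn
from torchvision import models

# Keep the pre-trained backbone frozen; only the new layers will be trained.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in backbone.parameters():
    p.requires_grad = False

# Replace the classifier with a small head for the new task (5 classes here).
backbone.fc = nn.Sequential(
    nn.Linear(backbone.fc.in_features, 128),
    nn.ReLU(),
    nn.Linear(128, 5),
)

# Only the head's parameters receive gradient updates.
optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

images = torch.randn(4, 3, 224, 224)   # dummy batch for illustration
labels = torch.randint(0, 5, (4,))
loss = criterion(backbone(images), labels)
loss.backward()
optimizer.step()
```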

3. Knowledge Distillation

The goal of knowledge distillation (KD) is to have a small student model learn the behavior of a large teacher model, reducing parameter count and complexity while preserving baseline performance.
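
A minimal sketch of the classic distillation loss (soft targets with a temperature, as in Hinton et al.), assuming PyTorch; the teacher and student here are tiny toy networks and the batch is random dummy data:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))  # "large" model
student = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 10))    # small model
teacher.eval()

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T, alpha = 4.0, 0.7          # temperature and weight on the distillation term

x = torch.randn(16, 32)      # dummy inputs and labels for illustration
y = torch.randint(0, 10, (16,))

with torch.no_grad():
    teacher_logits = teacher(x)   # teacher is frozen; only the student learns
student_logits = student(x)

# Soft-target loss: match the teacher's softened distribution (KL divergence),
# scaled by T^2 so its gradient magnitude is comparable to the hard loss.
soft_loss = F.kl_div(
    F.log_softmax(student_logits / T, dim=1),
    F.softmax(teacher_logits / T, dim=1),
    reduction="batchmean",
) * (T * T)
hard_loss = F.cross_entropy(student_logits, y)   # normal supervised loss
loss = alpha * soft_loss + (1 - alpha) * hard_loss
loss.backward()
optimizer.step()
```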

4. Meta Learning

Meta-learning, often described as "learning to learn," does not require a pre-trained model. Instead, the model is trained on data from many different tasks and learns what those tasks have in common, so that it can adapt quickly to new tasks.
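
A minimal sketch of one well-known meta-learning algorithm, MAML, assuming PyTorch 2.x (for torch.func.functional_call); the sine-regression task family below is a toy stand-in for "various tasks":

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call

# Tiny regression model shared across tasks.
model = nn.Sequential(nn.Linear(1, 40), nn.ReLU(), nn.Linear(40, 1))
meta_optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
inner_lr = 0.01

def sample_sine_task():
    """Toy task family: regress y = a * sin(x + b) with random a, b."""
    a = torch.rand(1) * 4 + 1
    b = torch.rand(1) * 3.14
    def draw(n):
        x = torch.rand(n, 1) * 10 - 5
        return x, a * torch.sin(x + b)
    return draw

for step in range(100):                       # outer (meta) loop
    meta_optimizer.zero_grad()
    for _ in range(4):                        # a small batch of tasks
        draw = sample_sine_task()
        x_support, y_support = draw(10)
        x_query, y_query = draw(10)

        # Inner loop: one gradient step on the support set, producing
        # task-adapted parameters without touching model.parameters().
        params = dict(model.named_parameters())
        support_loss = F.mse_loss(functional_call(model, params, (x_support,)), y_support)
        grads = torch.autograd.grad(support_loss, list(params.values()), create_graph=True)
        adapted = {name: p - inner_lr * g for (name, p), g in zip(params.items(), grads)}

        # Outer loss: how well the adapted parameters do on the query set;
        # its gradient flows back to the shared initial parameters.
        query_loss = F.mse_loss(functional_call(model, adapted, (x_query,)), y_query)
        query_loss.backward()
    meta_optimizer.step()
```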

Origin blog.csdn.net/JishuFengyang/article/details/132782541