The most comprehensive overview of Instruction Tuning for large language models: from data sets to technical analysis

Original: DataLearner · 2023-08-25 21:09 · Published in Jiangsu


Today's large language models are mainly pre-trained models: after training on large-scale unsupervised data, they can handle many tasks once supervised fine-tuning and alignment are applied. Nevertheless, for applications in vertical domains, large models still need to be fine-tuned to achieve better results. There are many ways to fine-tune a large model, including instruction fine-tuning, supervised fine-tuning, and prompt engineering. Among them, instruction tuning (Instruction Tuning) is the most important method for improving the controllability of a model, yet there has been no good reference on the relevant data. Researchers from Zhejiang University, together with Shannon AI and other institutions, have released a new survey on instruction fine-tuning that describes the field in detail, covering:

  • Introduction to Large Model Fine-tuning

  • Introduction to Instruction Tuning

  • Summary of commonly used data sets for instruction fine-tuning

  • Fine-tuning of instructions in different fields

  • Efficient instruction fine-tuning technology

Introduction to Large Model Fine-tuning

Previously, we introduced three types of fine-tuning techniques for large models (practical cases illustrating the differences between three fine-tuning techniques for large language models in the AI era - Prompt-Tuning, Instruction-Tuning and Chain-of-Thought: https://www.datalearner.com/blog/1051681306547159). In fact, though, fine-tuning of large models can be divided up in several ways.

By the scale of the parameters being updated, fine-tuning can be roughly divided into full-parameter fine-tuning and parameter-efficient fine-tuning. The former generally takes the pre-trained model as the initialization weights, continues training on a specific dataset, and updates all parameters. The latter aims to update the model with fewer resources, either by updating only a subset of the parameters or by reducing the number of trainable parameters through structural constraints on them, such as sparsification or low-rank approximation.
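The low-rank approximation idea mentioned above can be sketched in a few lines of NumPy. This is an illustrative LoRA-style update (the matrix sizes and rank below are made-up examples, not values from the survey): the pre-trained weight stays frozen, and only two small factor matrices are trained.

```python
import numpy as np

# Frozen pre-trained weight matrix (illustrative size).
d, k, r = 512, 512, 8          # r is the low-rank bottleneck
W = np.random.randn(d, k)      # stays frozen during fine-tuning

# Trainable low-rank factors: only A and B would receive gradients.
A = np.random.randn(d, r) * 0.01
B = np.zeros((r, k))           # zero init, so training starts from W exactly

x = np.random.randn(k)
# Effective forward pass: W x + (A B) x
y = W @ x + A @ (B @ x)

full_params = d * k            # what full-parameter fine-tuning updates
lora_params = r * (d + k)      # what the low-rank variant updates
print(f"trainable fraction: {lora_params / full_params:.3%}")
```

Because B is initialized to zero, the fine-tuned model starts out identical to the pre-trained one, while the trainable-parameter count drops from d·k to r·(d+k).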

If we instead distinguish by which stage of the model the fine-tuning targets, or by the goal of the fine-tuning, we can separate prompt fine-tuning, instruction fine-tuning, and supervised fine-tuning. This survey focuses on instruction fine-tuning.

Introduction to Instruction Tuning

Instruction fine-tuning is a process by which LLMs are further trained on a dataset consisting of (instruction, output) pairs, where instruction denotes a human instruction to the model and output denotes the desired response that follows the instruction. This process helps bridge the gap between the LLMs' next-word-prediction objective and the user's goal of getting the LLMs to follow human instructions.
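Concretely, each (instruction, output) pair is usually rendered into a single training sequence with a prompt template before computing the next-word-prediction loss. The template below is a minimal illustration (the field names and wording are invented for this sketch, not a format prescribed by the survey):

```python
# One (instruction, output) pair, the basic unit of instruction fine-tuning.
example = {
    "instruction": "Translate the following sentence into French.",
    "input": "The weather is nice today.",
    "output": "Il fait beau aujourd'hui.",
}

def format_example(ex: dict) -> str:
    """Concatenate an (instruction, output) pair into one training sequence.
    In practice the loss is typically computed only on the tokens after
    "Response:", i.e. on the desired output."""
    prompt = f"Instruction: {ex['instruction']}\n"
    if ex.get("input"):
        prompt += f"Input: {ex['input']}\n"
    prompt += f"Response: {ex['output']}"
    return prompt

print(format_example(example))
```

Masking the loss on the instruction portion is what keeps training focused on producing the desired output rather than on reproducing the instruction itself.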

Instruction fine-tuning can be regarded as a special form of supervised fine-tuning (Supervised Fine-Tuning, SFT). However, their goals differ. SFT fine-tunes a pre-trained model on labeled data so that the model performs a specific task better. Instruction fine-tuning further trains large language models (LLMs) on datasets of (instruction, output) pairs to enhance their capabilities and controllability. What is special about instruction fine-tuning is the structure of its dataset, which consists of pairs of human instructions and desired outputs; this structure lets instruction fine-tuning focus on getting the model to understand and follow human instructions.

In general, instruction fine-tuning is a special form of supervised fine-tuning that focuses on enhancing the capabilities and controllability of large language models by having them understand and follow human instructions. Although the goals and approaches are similar, the special data structure and task focus of instruction fine-tuning make it a distinct subset of SFT.

Summary of commonly used data sets for instruction fine-tuning

In this survey, the authors summarize 25 instruction fine-tuning datasets and divide them into the following three categories:

Generalization to Unseen Tasks
Such datasets often contain diverse tasks, each with specialized instructions and data samples. Models trained on such datasets can generalize to new tasks that have not been seen before.

Following user instructions in a single turn
This type of dataset contains instructions and their corresponding responses, and is used to train a model to respond to a user instruction in a single turn. After training, the model can understand an instruction and respond to it.

Helping Like a Human
This dataset contains multiple rounds of small talk conversations. After training, the model can perform multiple rounds of interaction, offering assistance like a human.

Generally speaking, the first category of datasets focuses on task-generalization ability, the second on single-turn instruction comprehension, and the third on sustained multi-turn dialogue. Researchers can choose a dataset type for instruction tuning according to the model capability they need.

All 25 instruction fine-tuning datasets for large language models are listed as follows:

[Table: the 25 instruction fine-tuning datasets surveyed]

Fine-tuning of instructions in different fields

In fact, many fields have a need for instruction fine-tuning of large models, but the instruction fine-tuning requirements of different fields may also be different.

This survey summarizes instruction fine-tuning of large models in 8 different fields, as shown in the following table:

[Table: comparison of instruction tuning across the 8 fields]

Efficient instruction fine-tuning technology

The main purpose of efficient instruction fine-tuning technology is to achieve the same training effect while updating only a small number of parameters. It is essentially the same idea as parameter-efficient fine-tuning in supervised learning or in large language models generally. Efficient fine-tuning techniques mainly fall into the following categories:

[Table: categories of efficient fine-tuning techniques]

Generally speaking, current efficient instruction-tuning techniques reduce computation and memory consumption mainly by reducing the number of trainable parameters, compressing gradients, and quantizing weights. These methods are very effective at cutting resource usage, but they also have drawbacks, such as accuracy loss and convergence-stability issues.
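As an illustration of the quantization idea mentioned above, the sketch below performs simple symmetric per-tensor int8 weight quantization in NumPy. Real systems use more sophisticated schemes (per-channel scales, 4-bit formats, etc.); this minimal version just shows where the memory saving and the accuracy loss come from:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ≈ scale * q."""
    scale = np.abs(w).max() / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# int8 storage is 4x smaller than float32, at the cost of rounding error.
error = np.abs(w - w_hat).max()
print(f"memory: {q.nbytes} vs {w.nbytes} bytes, max error: {error:.4f}")
```

The maximum reconstruction error is bounded by half a quantization step (scale / 2), which is exactly the accuracy-vs-memory trade-off the survey points out.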


Origin blog.csdn.net/sinat_37574187/article/details/132518045