Self-Instruct: Aligning Language Models with Self-Generated Instructions

GitHub - yizhongw/self-instruct: Aligning pretrained language models with instruction data generated by themselves.

This repository contains code and data for the Self-Instruct paper, a method for aligning pretrained language models with instructions.

Introduction

Self-Instruct is a framework that helps language models improve their ability to follow natural language instructions. It does this by using the model's own generations to create a large collection of instruction data. Through Self-Instruct, the instruction-following ability of a language model can be improved without relying on extensive manual annotation.

Background

In recent years, there has been increasing interest in building models that can follow natural language instructions to perform various tasks. These models, known as "instruction-tuned" language models, have demonstrated the ability to generalize to new tasks. However, their performance depends heavily on the quality and quantity of the human-written instruction data used to train them, which can be limited in diversity and creativity. To overcome these limitations, it is important to develop alternative ways of supervising instruction-tuned models and improving their instruction-following capabilities.

How does Self-Instruct work?

The Self-Instruct process is an iterative bootstrapping algorithm: it starts with a manually written seed set of instructions and uses them to prompt the language model to generate new instructions and corresponding input-output instances. These generations are then filtered to remove low-quality or overly similar ones, and the remaining data is added back to the task pool. Repeating this process many times yields a large body of instruction data that can be used to fine-tune the language model to follow instructions more effectively.
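To make the loop concrete, here is a minimal Python sketch of the bootstrapping idea. The generate helper, the toy seed list, and the sampling details are placeholders rather than the repository's actual code (the real pipeline also classifies task types and generates input-output instances); the ROUGE-L novelty check mirrors the similarity filtering described above.

import random
from rouge_score import rouge_scorer  # pip install rouge-score

def generate(prompt):
    # Placeholder for the language model call (e.g., GPT-3 via the
    # OpenAI API); hypothetical, not the repository's actual wrapper.
    raise NotImplementedError

# Two toy seed instructions standing in for the 175 seed tasks.
seed_instructions = [
    "Translate the following sentence into French.",
    "Write a short poem about summer.",
]

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=False)

def is_novel(candidate, pool, threshold=0.7):
    # Keep a new instruction only if its ROUGE-L similarity to every
    # instruction already in the pool stays below the threshold.
    return all(
        scorer.score(existing, candidate)["rougeL"].fmeasure < threshold
        for existing in pool
    )

task_pool = list(seed_instructions)
for _ in range(1000):  # number of bootstrapping steps (illustrative)
    # Prompt the model with a few in-context examples drawn from the pool.
    examples = random.sample(task_pool, k=min(8, len(task_pool)))
    prompt = "\n".join(f"Task: {t}" for t in examples) + "\nTask:"
    new_instruction = generate(prompt).strip()
    if new_instruction and is_novel(new_instruction, task_pool):
        task_pool.append(new_instruction)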

Here is an overview of the Self-Instruct pipeline:

[Figure: Pipeline for generating instruction data from the language model itself.]

Usage

* This work is still in progress. We may update the code and data as we make progress. Please be mindful of which version you use.

Instruction tuning using our Self-Instruct data

We release a dataset of 52K instructions, paired with 82K instance inputs and outputs. This instruction data can be used to perform instruction tuning on a language model so that it follows instructions better. The full set of model-generated data can be accessed at data/gpt3-generations/batch_221203/all_instances_82K.jsonl. This data (plus the 175 seed tasks), reformatted into the clean GPT-3 fine-tuning format (prompts + completions), is put in data/finetuning/self_instruct_221203. You can use the script ./scripts/finetune_gpt3.sh to fine-tune GPT-3 on this data.
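As a sketch of how this data might be consumed, the following Python snippet loads the released JSONL file and rewrites it into the prompt/completion format expected by GPT-3 fine-tuning. The field names ("instruction", "input", "output") are an assumption about the schema, not guaranteed by this README.

import json

# Path taken from this README; the field names are assumed, so check
# the released file for its exact schema before relying on this.
records = []
with open("data/gpt3-generations/batch_221203/all_instances_82K.jsonl") as f:
    for line in f:
        records.append(json.loads(line))

# Rewrite into OpenAI's classic fine-tuning format: one JSON object per
# line with "prompt" and "completion" keys.
with open("self_instruct_prompt_completion.jsonl", "w") as out:
    for r in records:
        prompt = r["instruction"]
        if r.get("input"):
            prompt += "\n" + r["input"]
        out.write(json.dumps({"prompt": prompt,
                              "completion": " " + r["output"]}) + "\n")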

Note: this data was generated by a language model (GPT-3) and inevitably contains some errors and biases. In the paper we analyzed the quality of 200 randomly sampled instructions and found that 46% of the data points had potential problems. We encourage users to treat this data with caution and to devise new ways of filtering or correcting the imperfections.
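For example, a user-side filter could start from simple heuristics like the sketch below; the specific checks and thresholds are illustrative assumptions, not the criteria used in the paper's analysis.

def looks_problematic(instruction, output):
    # Heuristics and thresholds here are illustrative assumptions.
    if len(instruction.split()) < 3:   # instruction too short to be meaningful
        return True
    if not output.strip():             # empty or whitespace-only output
        return True
    if output.strip() == instruction.strip():  # output just parrots the input
        return True
    return False

sample = [
    {"instruction": "Summarize the following paragraph in one sentence.",
     "output": "The paragraph argues that models can self-improve."},
    {"instruction": "Hi", "output": ""},
]
clean = [r for r in sample
         if not looks_problematic(r["instruction"], r["output"])]
print(len(clean))  # -> 1; the degenerate record is dropped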

Evaluating the ability to follow instructions

We also release a new set of 252 expert-written tasks and their instructions, motivated by user-facing applications rather than well-studied NLP tasks. This data was used in the human evaluation section of the Self-Instruct paper. See the human evaluation README for more details.

Generating Self-Instruct data from scratch

To generate Self-Instruct data using your own seed tasks or other models, we open-source the scripts for the entire pipeline here. Our current code has only been tested on the GPT-3 models accessible through the OpenAI API.

Here are the scripts for generating the data:

# 1. Generate instructions from the seed tasks
./scripts/generate_instructions.sh

# 2. Identify whether the instruction represents a classification task or not
./scripts/is_clf_or_not.sh

# 3. Generate instances for each instruction
./scripts/generate_instances.sh

# 4. Filtering, processing, and reformatting
./scripts/prepare_for_finetuning.sh
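For reference, the kind of GPT-3 call these scripts wrap might look like the minimal sketch below, assuming the legacy openai (<1.0) Python SDK from the period the repository targets; the prompt template and generation parameters are illustrative, not the repository's exact ones.

import openai

openai.api_key = "YOUR_API_KEY"

# Illustrative few-shot prompt asking the model to continue a task list.
prompt = (
    "Come up with a series of tasks:\n"
    "Task 1: Translate the following sentence into German.\n"
    "Task 2: Write a haiku about autumn.\n"
    "Task 3:"
)

response = openai.Completion.create(
    engine="davinci",     # GPT-3 base model family used in the paper
    prompt=prompt,
    max_tokens=100,
    temperature=0.7,
    stop=["Task 4:"],     # stop before the model starts another task
)
print(response["choices"][0]["text"].strip())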
