ChatGLM Efficient Tuning

An Example of Modifying ChatGLM-6B's Self-Cognition

1. Environment configuration

First, prepare a machine with sufficient computing power; a Unix-like operating system is recommended. The recommended hardware configuration for this framework is listed in the following table:

Device          Minimum configuration    Recommended configuration
CPU             Intel i7                 Intel Xeon
RAM             16 GB                    32 GB
GPU memory      12 GB                    24 GB
Disk space      10 GB                    20 GB

This example assumes that the machine already has a working CUDA environment. If you run into problems configuring CUDA, keep an eye on the Docker image that will be released for this project.

We recommend installing the dependencies inside a Conda virtual environment to avoid breaking the dependencies of other projects. Run the following commands to create the Conda environment and install the Python dependencies:

git clone https://github.com/hiyouga/ChatGLM-Efficient-Tuning.git
conda create -n chatglm_etuning python=3.10
conda activate chatglm_etuning
cd ChatGLM-Efficient-Tuning
pip install -r requirements.txt
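
After the dependencies are installed, a quick sanity check such as the following (a minimal sketch, assuming requirements.txt installed a CUDA-enabled build of PyTorch) confirms that the GPU is visible:

# Quick check that PyTorch can see the GPU before starting any training.
import torch

if torch.cuda.is_available():
    print("CUDA OK:", torch.cuda.get_device_name(0))
else:
    print("CUDA is not available; check the driver and CUDA installation")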

Note: By default, this project downloads the latest ChatGLM-6B model online. The model version may affect the behaviour of the code or even cause errors, so the latest official model files are recommended. If your network connection is unreliable, download the official model with the commands below and add --model_name_or_path [local ChatGLM-6B model path] every time you run training or inference.

git lfs install
git clone -b v0.1.0 https://huggingface.co/THUDM/chatglm-6b
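
To verify that the download is complete, you can load the local model once with Transformers. This is only a sketch: the path ./chatglm-6b below assumes the clone was made into the current directory.

# Load the locally downloaded model once to confirm the files are usable.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("./chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("./chatglm-6b", trust_remote_code=True).half().cuda()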

2. Dataset preparation

In this example we use the self_cognition dataset, which contains 18 records about the model's self-cognition. Our goal is to modify the model's knowledge of itself so that it gives the answers we want. Please check the dataset content in data/self_cognition.json; two examples are listed here:

[
    {
      "instruction": "What is your identity?",
      "input": "",
      "output": "My name is ChatGLM-6B, an artificial intelligence assistant trained and developed independently by [NAME] in 2023. My main goal is to assist users in solving problems and fulfilling their needs."
    },
    {
      "instruction": "Can you tell me your identity information?",
      "input": "",
      "output": "Of course, I am ChatGLM-6B, an artificial intelligence assistant created by [NAME]. I will complete the research and development in 2023 to provide targeted answers and assistance to users."
    }
]

Tip: By substituting your own name for [NAME], you can make the model answer that it was created by you.
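
If you prefer to do the substitution with a script, a minimal sketch such as the following works, assuming the placeholder appears literally as [NAME] in data/self_cognition.json (replace "Your Name" with your own name):

# Fill in your own name for the [NAME] placeholder in the self_cognition dataset.
import json

path = "data/self_cognition.json"
with open(path, encoding="utf-8") as f:
    records = json.load(f)

# Replace the placeholder in every string field of every record.
for record in records:
    for key, value in record.items():
        if isinstance(value, str):
            record[key] = value.replace("[NAME]", "Your Name")

with open(path, "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)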

Note: More than ten instruction datasets are built into this framework; see the data folder for a brief introduction. The framework also supports user-provided custom datasets; please make sure your dataset follows the same format as example_dataset.json in the data/example_dataset folder. The instruction and output fields are required so that supervised fine-tuning (SFT) of the model works properly.
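
Before registering a custom dataset, it can help to check the format programmatically. The sketch below is a hypothetical helper (not part of the framework); the file name data/my_dataset.json is only an example.

# Check that every record in a custom dataset has the required fields.
import json

with open("data/my_dataset.json", encoding="utf-8") as f:  # example path
    records = json.load(f)

for i, record in enumerate(records):
    missing = [key for key in ("instruction", "output") if not record.get(key)]
    if missing:
        print(f"record {i} is missing required field(s): {missing}")
print(f"checked {len(records)} records")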

3. Model supervised fine-tuning

Run the following command to perform supervised fine-tuning on a single GPU. We use the self_cognition dataset with the lora fine-tuning method, and the fine-tuned weights are saved in the cognition folder. To make sure the fine-tuning takes effect, we use a learning rate of 1e-3 and train for 10 epochs on this dataset.

CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --do_train \
    --dataset self_cognition \
    --finetuning_type lora \
    --output_dir cognition \
    --overwrite_cache \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 2 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --warmup_steps 0 \
    --learning_rate 1e-3 \
    --num_train_epochs 10.0 \
    --fp16
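
As a rough sanity check on these hyperparameters: with 18 training examples, a per-device batch size of 2 and 2 gradient-accumulation steps, each epoch takes about 5 optimizer steps, so 10 epochs amount to roughly 50 steps. This is a back-of-the-envelope estimate that assumes all 18 examples are used for training:

# Back-of-the-envelope estimate of the number of optimizer steps in this run.
import math

examples = 18                    # size of the self_cognition dataset
effective_batch = 2 * 2          # per_device_train_batch_size * gradient_accumulation_steps
steps_per_epoch = math.ceil(examples / effective_batch)   # -> 5
total_steps = steps_per_epoch * 10                        # 10 epochs -> about 50 steps
print(steps_per_epoch, total_steps)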

The framework's training log is shown in the figure below.

[Figure: screenshot of the fine-tuning training log (1.jpg)]

4. Testing the fine-tuned model

Run the following command to test the model on a single GPU. It loads the fine-tuned weights saved in the cognition folder, merges them into the parameters of the original ChatGLM-6B model, and starts a streaming interactive console.

CUDA_VISIBLE_DEVICES=0 python src/cli_demo.py \
    --checkpoint_dir cognition

Asking the fine-tuned ChatGLM-6B model some self-cognition questions, we can see that it gives the answers we expect. We also tested two additional questions, and the results show that the model's original knowledge was not seriously damaged.

[Figure: answers of the fine-tuned model (2.jpg)]

For comparison, we also tested the original ChatGLM-6B model on the same questions. The figure below shows the original model's answers; its self-cognition answers differ markedly from those of the fine-tuned model above.

[Figure: answers of the original ChatGLM-6B model (3.jpg)]

5. Model Deployment

If you want to deploy the fine-tuned model in your own project, use export_model.py to merge the fine-tuned weights into the ChatGLM-6B model and export the complete model:

python src/export_model.py \
    --checkpoint_dir cognition \
    --output_dir path_to_save_model

You can then deploy the exported model independently in any project and call it as in the following code.

# Load the exported full model from the directory produced by export_model.py.
from transformers import AutoTokenizer, AutoModel

path_to_save_model = "path_to_save_model"  # the directory passed to --output_dir above
tokenizer = AutoTokenizer.from_pretrained(path_to_save_model, trust_remote_code=True)
model = AutoModel.from_pretrained(path_to_save_model, trust_remote_code=True).half().cuda()
response, history = model.chat(tokenizer, "Who are you?", history=[])
print(response)
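
Building on the snippet above, a minimal multi-turn loop (a sketch that reuses the same model and tokenizer) looks like this:

# Simple interactive loop on top of the exported model.
history = []
while True:
    query = input("User: ")
    if query.strip().lower() in {"exit", "quit"}:
        break
    response, history = model.chat(tokenizer, query, history=history)
    print("ChatGLM-6B:", response)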
