Example: Modifying ChatGLM-6B's Self-Cognition
1. Environment configuration
First, prepare a machine with sufficient computing power. A Unix-like operating system is recommended. The hardware configuration recommended for this framework is listed in the following table:
Component | Minimum configuration | Recommended configuration |
---|---|---|
Processor | Intel i7 | Intel Xeon |
RAM | 16GB | 32GB |
GPU memory | 12GB | 24GB |
Disk space | 10GB | 20GB |
This example assumes that a CUDA computing environment is already configured on the machine. If you have trouble configuring CUDA, watch for the Docker image that this project plans to release.
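As a quick sanity check before installing anything, the minimum requirements from the table above can be compared against the local machine. The following is a minimal sketch, not part of the framework: the function names are hypothetical, the thresholds are copied from the table, and the GPU check only works if PyTorch is already installed.

```python
import shutil

MIN_DISK_GB = 10     # minimum disk space from the table above
MIN_GPU_MEM_GB = 12  # minimum GPU memory from the table above


def free_disk_gb(path="."):
    """Return the free disk space (in GiB) of the filesystem containing `path`."""
    return shutil.disk_usage(path).free / 1024**3


def gpu_memory_gb():
    """Return the total memory of GPU 0 in GiB, or None if PyTorch/CUDA is unavailable."""
    try:
        import torch  # optional: only needed for the GPU check
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1024**3


def check_environment(path="."):
    """Compare the local machine with the minimum configuration."""
    gpu = gpu_memory_gb()
    return {
        "disk_ok": free_disk_gb(path) >= MIN_DISK_GB,
        "gpu_ok": gpu is not None and gpu >= MIN_GPU_MEM_GB,
    }


if __name__ == "__main__":
    print(check_environment())
```

A result of `gpu_ok: False` can also mean that PyTorch is simply not installed yet, so rerun the check after the dependencies are in place.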
We recommend installing the dependencies in a Conda virtual environment so that they do not break the dependencies of other projects. Run the following commands to create a Conda virtual environment and install the Python dependencies:
    git clone https://github.com/hiyouga/ChatGLM-Efficient-Tuning.git
    conda create -n chatglm_etuning python=3.10
    conda activate chatglm_etuning
    cd ChatGLM-Efficient-Tuning
    pip install -r requirements.txt
Note: By default, this project downloads the latest ChatGLM-6B model online. The model version may affect how the code behaves or even cause errors, so we recommend using the latest official model files. If your network connection is unreliable, download the official model with the following commands, and pass `--model_name_or_path [local path to the ChatGLM-6B model]` every time you train or run inference.
    git lfs install
    git clone -b v0.1.0 https://huggingface.co/THUDM/chatglm-6b
2. Data set preparation
In this example, we use the `self_cognition` dataset, which contains 18 samples about the model's self-cognition. Our goal is to change what the model knows about itself so that it gives the answers we want. See `data/self_cognition.json` for the full contents of the dataset; two examples are listed here.
    [
      {
        "instruction": "What is your identity?",
        "input": "",
        "output": "My name is ChatGLM-6B, an artificial intelligence assistant trained and developed independently by [NAME] in 2023. My main goal is to assist users in solving problems and fulfilling their needs."
      },
      {
        "instruction": "Can you tell me your identity information?",
        "input": "",
        "output": "Of course, I am ChatGLM-6B, an artificial intelligence assistant created by [NAME]. My development was completed in 2023, and I aim to provide targeted answers and assistance to users."
      }
    ]
Tip: Replace `[NAME]` with your own name to make the model answer that it was created by you.
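The substitution described in the tip can be scripted instead of edited by hand. The sketch below is only an illustration: the helper `personalize` is hypothetical and not part of the framework, and it assumes the dataset uses the literal placeholder `[NAME]` as in the examples above.

```python
import json


def personalize(records, name, placeholder="[NAME]"):
    """Return a copy of the records with the placeholder replaced in every string field."""
    return [
        {
            key: value.replace(placeholder, name) if isinstance(value, str) else value
            for key, value in record.items()
        }
        for record in records
    ]


# A single record in the same format as data/self_cognition.json
sample = [
    {
        "instruction": "What is your identity?",
        "input": "",
        "output": "I am ChatGLM-6B, an artificial intelligence assistant created by [NAME].",
    }
]

updated = personalize(sample, "Alice")
print(json.dumps(updated, ensure_ascii=False, indent=2))
```

To apply this to the real file, load `data/self_cognition.json` with `json.load`, pass the list through `personalize`, and write it back with `json.dump(..., ensure_ascii=False)` so that non-ASCII text is preserved.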
Note: This framework has more than ten built-in instruction datasets; see the `data` folder for a brief introduction. The framework also supports custom datasets supplied by the user. Make sure your dataset uses the same format as the `example_dataset.json` file in `data/example_dataset`. The `instruction` and `output` fields are required for supervised fine-tuning (SFT) of the model to work properly.
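Before training on a custom dataset, it can help to confirm that every record carries the two required fields. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the file is assumed to be a JSON list of objects like the examples above, and treating empty strings as invalid is a stricter reading than the text requires.

```python
import json

REQUIRED_FIELDS = ("instruction", "output")


def find_invalid_records(records):
    """Return the indices of records that lack a non-empty string in a required field."""
    invalid = []
    for index, record in enumerate(records):
        if not isinstance(record, dict):
            invalid.append(index)
            continue
        for field in REQUIRED_FIELDS:
            value = record.get(field)
            if not isinstance(value, str) or not value.strip():
                invalid.append(index)
                break
    return invalid


def validate_file(path):
    """Load a JSON dataset file and return the indices of invalid records."""
    with open(path, encoding="utf-8") as handle:
        records = json.load(handle)
    if not isinstance(records, list):
        raise ValueError("expected the dataset file to contain a JSON list")
    return find_invalid_records(records)


records = [
    {"instruction": "What is your identity?", "input": "", "output": "I am ChatGLM-6B."},
    {"instruction": "Who made you?"},          # missing "output"
    {"instruction": "", "output": "Hello."},   # empty "instruction"
]
print(find_invalid_records(records))  # → [1, 2]
```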
3. Model supervised fine-tuning
Run the following command to perform supervised fine-tuning on a single GPU. We use the `self_cognition` dataset with the `lora` fine-tuning method, and the fine-tuned model is saved in the `cognition` folder. To make sure the fine-tuning takes effect, we use a learning rate of 0.001 and train for 10 epochs on the dataset.
    CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
        --stage sft \
        --do_train \
        --dataset self_cognition \
        --finetuning_type lora \
        --output_dir cognition \
        --overwrite_cache \
        --per_device_train_batch_size 2 \
        --gradient_accumulation_steps 2 \
        --lr_scheduler_type cosine \
        --logging_steps 10 \
        --save_steps 1000 \
        --warmup_steps 0 \
        --learning_rate 1e-3 \
        --num_train_epochs 10.0 \
        --fp16
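To get a feel for what these hyperparameters imply, the arithmetic can be worked through for the 18-sample dataset. The sketch below is an approximation, not the trainer's own code: the function names are hypothetical, the step count ignores how the trainer treats the last partial accumulation step (so the real number may differ slightly), and the schedule is the usual half-cosine decay with linear warmup.

```python
import math


def effective_batch_size(per_device, grad_accum, num_gpus=1):
    """Number of samples contributing to one optimizer update."""
    return per_device * grad_accum * num_gpus


def approx_total_steps(num_samples, batch_size, epochs):
    """Rough number of optimizer updates; the trainer's exact count may differ."""
    return math.ceil(num_samples / batch_size) * epochs


def cosine_lr(step, total_steps, base_lr, warmup_steps=0):
    """Learning rate at `step` for a half-cosine decay with optional linear warmup."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


batch = effective_batch_size(per_device=2, grad_accum=2)          # 2 * 2 * 1 = 4
total = approx_total_steps(num_samples=18, batch_size=batch, epochs=10)  # ceil(18 / 4) * 10 = 50

print(batch, total)
for step in (0, total // 2, total):
    print(step, cosine_lr(step, total, base_lr=1e-3))
```

Under this estimate the run finishes after roughly 50 updates, far below `--save_steps 1000`, so no intermediate checkpoint would be expected before training ends.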
The framework's training log is shown in the figure below.
4. Model testing
Run the following command to test the model on a single GPU. It loads the fine-tuned weights saved in the `cognition` folder, merges them into the parameters of the original ChatGLM-6B model, and starts a streaming interactive session.
    CUDA_VISIBLE_DEVICES=0 python src/cli_demo.py \
        --checkpoint_dir cognition
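The merge of fine-tuned weights into the original parameters can be illustrated with the standard LoRA formulation, in which a weight matrix W is updated to W + (alpha / r) · B · A. The sketch below uses tiny pure-Python matrices purely for illustration; it is not the framework's actual code, and the names `matmul`, `merge_lora`, `alpha`, and `rank` are assumptions made for this example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * lora_b @ lora_a.

    Shapes: weight is (d_out, d_in), lora_b is (d_out, rank), lora_a is (rank, d_in).
    """
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


weight = [[1.0, 0.0], [0.0, 1.0]]  # original 2x2 weight
lora_a = [[1.0, 2.0]]              # rank-1 factor A, shape (1, 2)
lora_b = [[1.0], [2.0]]            # rank-1 factor B, shape (2, 1)

merged = merge_lora(weight, lora_a, lora_b, alpha=2.0, rank=1)
print(merged)  # → [[3.0, 4.0], [4.0, 9.0]]
```

After merging, the model has the same shape as the original, which is why the merged weights can later be exported and loaded like an ordinary ChatGLM-6B checkpoint.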
When we ask the fine-tuned ChatGLM-6B model some self-cognition questions, it gives the answers we expect. We also tested two additional questions, and the results show that the model's original knowledge was not severely damaged.
For comparison, we also tested the original ChatGLM-6B model. The figure below shows the original model's answers; its answers about self-cognition differ significantly from those in the figure above.
5. Model deployment
If you want to deploy the fine-tuned model in your own project, use `export_model.py` to merge the fine-tuned weights into the ChatGLM-6B model and export the full model.
    python src/export_model.py \
        --checkpoint_dir cognition \
        --output_dir path_to_save_model
You can then deploy the fine-tuned model independently in any project by calling it as in the following code.
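    from transformers import AutoModel, AutoTokenizer

    # Directory passed to --output_dir when running export_model.py
    path_to_save_model = "path_to_save_model"

    tokenizer = AutoTokenizer.from_pretrained(path_to_save_model, trust_remote_code=True)
    # Load the exported model in half precision and move it to the GPU
    model = AutoModel.from_pretrained(path_to_save_model, trust_remote_code=True).half().cuda()

    # Ask one question with an empty conversation history
    response, history = model.chat(tokenizer, "Who are you", history=[])
    print(response)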