Record the process of deploying the ChatGLM large language model

1. What is ChatGLM:

ChatGLM-6B is an open-source, Chinese-English bilingual dialogue language model based on the General Language Model (GLM) architecture, with 6.2 billion parameters. Combined with model quantization, it can be deployed locally on consumer-grade graphics cards (as little as 6 GB of video memory at the INT4 quantization level). ChatGLM-6B uses technology similar to ChatGPT and is optimized for Chinese Q&A and dialogue. After training on roughly 1T tokens of Chinese and English text, supplemented by supervised fine-tuning, feedback bootstrapping, reinforcement learning from human feedback, and other techniques, the 6.2-billion-parameter ChatGLM-6B can already generate answers that align reasonably well with human preferences.
However, due to its small size, ChatGLM-6B is known to have considerable limitations, such as factual and mathematical/logical errors, possible generation of harmful or biased content, weak contextual ability, confusion about its own identity, and generating Chinese content that contradicts English instructions (and vice versa). A larger ChatGLM based on the 130-billion-parameter GLM-130B is in beta development.


2. Get the code:

Github address: https://github.com/THUDM/ChatGLM-6B

The code can be obtained via git or by downloading the zip archive of the source; git is used here:

git clone https://github.com/THUDM/ChatGLM-6B.git

3. Configure the environment:

3.1 Configure the graphics card driver and cuda:

The graphics card driver and CUDA were configured earlier (the machine already supports PyTorch, PaddlePaddle, and other frameworks), so those steps are not repeated here.

3.2 Install anaconda:

Anaconda was also installed earlier, and there are many tutorials online, so those steps are not repeated here.

3.3 Configure a standalone environment for ChatGLM

conda create --name chatglm python=3.8

After the environment is created, activate it:

conda activate chatglm 

3.4 Install dependent packages

Change to the repository directory (adjust the path to your own location):

cd /home/houshouzan/chatglm/ChatGLM-6B/

Install the dependencies; if downloads are slow, add the Tsinghua mirror with pip's -i option (https://pypi.tuna.tsinghua.edu.cn/simple):

pip install -r requirements.txt
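After installation, a quick sanity check can confirm that the key packages are importable. This is a minimal sketch; the package names below are the common ones from a typical ChatGLM-6B requirements.txt, so adjust the list to match your own file:

```python
import importlib.util

def installed(package: str) -> bool:
    """Return True if the package can be imported in this environment."""
    return importlib.util.find_spec(package) is not None

# Package names assumed from a typical ChatGLM-6B requirements.txt
for pkg in ["transformers", "torch", "gradio", "sentencepiece"]:
    print(f"{pkg}: {'OK' if installed(pkg) else 'MISSING'}")
```

If anything prints MISSING, rerun pip install inside the activated chatglm environment.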

4. Download the model:

4.1 Method 1, download elegantly with huggingface_hub:

Install huggingface_hub:

pip install huggingface_hub

Create a chatglm-6b folder under ./ChatGLM-6B/ to store the local model:

mkdir chatglm-6b

Activate the chatglm environment (created above) and start a Python interpreter:

conda activate chatglm
python

Call huggingface_hub to download the ChatGLM-6B model to the specified local path:

from huggingface_hub import snapshot_download
snapshot_download(repo_id="THUDM/chatglm-6b", local_dir="./chatglm-6b/")

The download process is often interrupted; retrying a few more times usually completes it.
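Those retries can be automated. Below is a small sketch of a generic retry wrapper, assuming that an interrupted download raises an exception (snapshot_download resumes already-downloaded files on the next attempt, so repeated calls make progress):

```python
import time

def download_with_retries(download_fn, max_attempts=5, wait_seconds=10):
    """Call download_fn repeatedly until it succeeds or attempts run out.

    download_fn: zero-argument callable that performs the download
    (e.g. a lambda wrapping snapshot_download); any exception is
    treated as an interrupted download and triggers another attempt.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return download_fn()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # out of attempts; surface the last error
            print(f"Attempt {attempt} failed ({exc!r}); retrying in {wait_seconds}s...")
            time.sleep(wait_seconds)

# Usage (assumes huggingface_hub is installed):
# from huggingface_hub import snapshot_download
# download_with_retries(lambda: snapshot_download(
#     repo_id="THUDM/chatglm-6b", local_dir="./chatglm-6b/"))
```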

4.2 Method 2, download from the official website:

Official address: https://huggingface.co/THUDM/chatglm-6b/tree/main

All files on that page must be downloaded.

4.3 Method 3, download through Thunder (Xunlei) or similar tools:

Because the direct download was slow, this method was also used this time, and the files were then uploaded to the server. It is cumbersome, so the first method is recommended.

5. Experience ChatGLM:

The demo provides two methods, command line and web page.

5.1 Command line mode:

The model-loading path in the source code must be changed. Since the model was downloaded to the chatglm-6b folder earlier, point the loading code in cli_demo.py at that folder and save the file.
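The change amounts to replacing the Hugging Face repo id with the local directory in the two loading calls. A sketch of the modified lines, assuming the model was downloaded to ./chatglm-6b as above (running this requires the downloaded weights and a CUDA-capable GPU):

```python
from transformers import AutoTokenizer, AutoModel

# Point both loaders at the local model directory instead of the
# remote repo id "THUDM/chatglm-6b".
tokenizer = AutoTokenizer.from_pretrained("./chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("./chatglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()
```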
Run the command-line demo:

python cli_demo.py

The demo then answers questions interactively in the terminal.

5.2 Web page mode:

Modify the model-loading code in web_demo.py in the same way, pointing it at your local model directory.
Run web_demo.py (the default port is 7860) and open the page in a browser to see the effect.

Origin blog.csdn.net/h363924219/article/details/131110674