[AI] Tsinghua Open Source Chinese-English Bilingual Dialogue Model ChatGLM2-6B Local Installation Notes


First, the resources: the ChatGLM2-6B source code and model files are available on a Baidu network disk:

Link: https://pan.baidu.com/s/1DciporsVT-eSiVIAeU-YmQ
Extraction code: cssa

The official README is already very detailed, so writing an installation blog post is somewhat redundant. Still, in the spirit of recording my own work, I am writing it down; after all, output is the best way to learn.

This article records the process of installing ChatGLM2-6B locally. I used an RTX 4070 with 12 GB of VRAM, which is a bit marginal for this model. In practice, however, Windows 11 provides shared GPU memory, so my 12 GB card turned out not to be the limit: the roughly 13 GB model runs normally without exhausting video memory. The official repository also provides an int4-quantized model that can run in 6 GB of VRAM, but since the full model already runs on my machine, I did not consider the int4 model. The screenshot below shows the GPU usage after the model is loaded, which is surprising.

(screenshot: GPU usage after the model is loaded)

1. Copy a Conda virtual environment

conda create -n new_env_name --clone old_env_name

This creates an operating environment for ChatGLM. Because ChatGLM uses the PyTorch framework, use conda to clone an existing PyTorch virtual environment, so that installing the required dependencies cannot damage the dependencies of your other environments.

2. Add PIP mirror source

Since installing the dependencies requires pip, configure a pip mirror source to speed up access.

To configure the pip mirror source, first switch to the corresponding virtual environment, e.g. conda activate chatglm2, then run:

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

Other mirror sources are also available:

Alibaba Cloud: https://mirrors.aliyun.com/pypi/simple/
Douban: http://pypi.douban.com/simple/
Tsinghua University: https://pypi.tuna.tsinghua.edu.cn/simple/
USTC (University of Science and Technology of China): http://pypi.mirrors.ustc.edu.cn/simple/

Configuring a mirror source for conda itself was covered in my earlier PyTorch installation post, which you can refer to.
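For completeness, a typical conda mirror configuration (written to ~/.condarc) looks like the following. These channel URLs are the ones published on the TUNA mirror help page; check the mirror site for its current recommended settings:

```yaml
# ~/.condarc — route conda's default channels through the TUNA mirror
channels:
  - defaults
show_channel_urls: true
default_channels:
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
```

After editing the file, `conda clean -i` clears the index cache so the new channels take effect.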

3. Download code

  1. Clone the code from GitHub:
git clone https://github.com/THUDM/ChatGLM2-6B.git

4. Download the model

When the model starts, the program automatically downloads the model bin files from the Hugging Face website, but the download may fail for network reasons. The official README provides a Tsinghua download address (https://cloud.tsinghua.edu.cn/d/674208019e314311ab5c/), but that address only contains the model weights; many configuration files are missing, so you still need to download them from the Hugging Face website (https://huggingface.co/THUDM/chatglm2-6b/tree/main). If you run with the weights alone, you may hit Problem 1 listed below. I have packaged all the files and put them on the Baidu network disk (link: https://pan.baidu.com/s/1DciporsVT-eSiVIAeU-YmQ, extraction code: cssa), so you can take what you need.
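If your network connection to Hugging Face is usable, another way to fetch the complete repository (weights plus configuration files) is a git clone with Git LFS installed; note the weight shards are very large, so this needs a stable connection:

```shell
# Git LFS is required to pull the large weight shards instead of pointer files
git lfs install
git clone https://huggingface.co/THUDM/chatglm2-6b
```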

After downloading, copy all the model and configuration files from the model directory into the code directory. There is no need to create a new folder; just put them alongside the original code.
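As a quick sanity check before starting the demo, the following sketch verifies that the configuration files are present alongside the weights. The file list reflects what the THUDM/chatglm2-6b Hugging Face repository ships at the time of writing; the exact set may vary by model revision:

```python
from pathlib import Path

# Configuration and tokenizer files the chatglm2-6b repo ships alongside
# the weight shards; if any are missing, AutoTokenizer/AutoModel loading
# typically fails with the error described under "Installation problems".
REQUIRED_FILES = [
    "config.json",
    "configuration_chatglm.py",
    "modeling_chatglm.py",
    "tokenization_chatglm.py",
    "tokenizer_config.json",
    "tokenizer.model",
]

def missing_files(model_dir: str) -> list[str]:
    """Return the names of required files absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Calling `missing_files(".")` from the code directory should return an empty list if everything was copied correctly.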

5. Dependency installation

The first step is to install the requirements. The dependencies in requirements.txt can be adjusted to your own environment; for example, because I cloned an existing PyTorch environment, I can delete the torch line. The pip installation here also goes through a domestic mirror source, which greatly improves installation speed; how to add one was described earlier.

The official recommendation is version 4.30.2 of the transformers library, with torch 2.0 or higher for the best inference performance.

pip install -r requirements.txt

6. Application

The official documentation provides several ways to run the model; we generally choose the web interface. Two web demos are provided in the official code: web_demo.py answers in question-and-answer form, returning the whole answer at once, while web_demo2.py streams the answer as it is generated. Note that each demo has its own dependencies that must be installed. For practicality, we choose web_demo2.py.

@AdamBear implemented web_demo2.py, a web demo based on Streamlit. Before using it, install the following dependencies:

pip install streamlit streamlit-chat

By default the demo pulls the model files from Hugging Face. To use local files, you need to modify the model-loading statements:

(screenshot: the model-loading statements in web_demo2.py)

Change lines 15 and 16 to load from the local path.
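For reference, after the change the loading statements look roughly like the following sketch. The local path here is a placeholder, and the exact variable names and surrounding caching code depend on the demo version you cloned:

```python
from transformers import AutoModel, AutoTokenizer

# Hypothetical local directory holding the model weights and config files;
# replace with wherever you copied them (e.g. the repository root).
LOCAL_MODEL_PATH = "./chatglm2-6b"

# trust_remote_code=True is needed because ChatGLM ships its own
# model/tokenizer classes (modeling_chatglm.py etc.) inside the repo.
tokenizer = AutoTokenizer.from_pretrained(LOCAL_MODEL_PATH, trust_remote_code=True)
model = AutoModel.from_pretrained(LOCAL_MODEL_PATH, trust_remote_code=True).cuda()
model = model.eval()
```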

Then run with the following command:

streamlit run web_demo2.py

In my testing, the Streamlit-based web demo runs more smoothly when the input prompt is long.


Installation problems

1. ValueError: Unrecognized configuration class <class 'transformers_modules.chatglm-6b.configuration_chatglm.ChatGLMConfig'>

Problem description
While deploying ChatGLM, the following error appeared:
ValueError: Unrecognized configuration class <class 'transformers_modules.chatglm-6b.configuration_chatglm.ChatGLMConfig'> to build an AutoTokenizer.

Cause
I downloaded the model parameters through the Tsinghua cloud disk (shown below), which is missing many important configuration files.

(screenshot: files available on the Tsinghua cloud disk)

Solution

The complete configuration files can be downloaded from the Hugging Face website and placed in the same directory as the model parameters:

https://huggingface.co/THUDM/chatglm2-6b/tree/main


Reference article: https://blog.csdn.net/weixin_40964597/article/details/131074884


Origin blog.csdn.net/zhoulizhu/article/details/131501353