Tsinghua Open Source Chinese-English Bilingual Dialogue Model ChatGLM2-6B Local Installation Notes
To get straight to the resources: the ChatGLM2-6B source code and model files are on a Baidu network disk:
Link: https://pan.baidu.com/s/1DciporsVT-eSiVIAeU-YmQ
Extraction code: cssa
The official README is already very detailed, so writing yet another installation blog post is somewhat redundant. Still, in keeping with my habit of recording my own work, I am writing it down; after all, output is the best way to learn.
This article records the process of installing ChatGLM2-6B locally. I used an RTX 4070 with 12 GB of VRAM, which is somewhat underpowered for this model. In practice, however, Windows 11 provides shared GPU memory, so my 12 GB card was able to run the roughly 13 GB model; it appears to run normally without running out of VRAM. The project also provides an int4-quantized model that can run on 6 GB of VRAM, but since the full model already runs on my machine, I did not consider the int4 version. The picture below shows the GPU usage after my model is loaded, which is quite surprising.
1. Copy a conda virtual environment
conda create -n new_env_name --clone old_env_name
Create the environment ChatGLM will run in. Because ChatGLM uses the PyTorch framework, use conda to clone an existing PyTorch virtual environment; that way, installing the required dependencies will not damage the dependencies of other environments.
2. Add a pip mirror source
Since installing the dependencies requires pip, configure a pip mirror source to speed up downloads. First switch to the corresponding virtual environment, e.g. conda activate chatglm2, then set the index URL:
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
Other mirror sources are also available:
Aliyun: https://mirrors.aliyun.com/pypi/simple/
Douban: http://pypi.douban.com/simple/
Tsinghua University: https://pypi.tuna.tsinghua.edu.cn/simple/
USTC: http://pypi.mirrors.ustc.edu.cn/simple/
How to configure mirror sources for conda was covered in my earlier PyTorch installation post, which you can refer to.
3. Download the code
- Clone the code from GitHub:
git clone https://github.com/THUDM/ChatGLM2-6B.git
4. Download the model
When the model starts, the program automatically downloads the model .bin files from the Hugging Face website, but the download may fail for network reasons. The project provides a Tsinghua download address (https://cloud.tsinghua.edu.cn/d/674208019e314311ab5c/), but it contains only the model weights; many configuration files are missing there, so you still need to download them from the Hugging Face website (https://huggingface.co/THUDM/chatglm2-6b/tree/main). If you run with the incomplete files, you may hit Problem 1 listed below. I have packaged all the files and put them on the Baidu network disk (link: https://pan.baidu.com/s/1DciporsVT-eSiVIAeU-YmQ, extraction code: cssa); take what you need.
After downloading, copy all the model and configuration files into the code directory. There is no need to create a new folder; just put them alongside the original code.
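As a quick sanity check before launching, the sketch below verifies that the configuration files missing from the Tsinghua mirror are actually present locally. The required-file list is my assumption based on the chatglm2-6b Hugging Face repository at the time of writing and may change:

```python
import os

# Configuration files from https://huggingface.co/THUDM/chatglm2-6b
# (weights-only mirrors typically lack these; list is an assumption).
REQUIRED_FILES = [
    "config.json",
    "configuration_chatglm.py",
    "modeling_chatglm.py",
    "tokenization_chatglm.py",
    "tokenizer_config.json",
    "tokenizer.model",
]

def missing_files(model_dir):
    """Return the required files that are absent from model_dir."""
    return [f for f in REQUIRED_FILES
            if not os.path.exists(os.path.join(model_dir, f))]

if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All configuration files found.")
```

If anything is reported missing, fetch it from the Hugging Face page before starting the demo.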
5. Dependency installation
First, install the requirements. The dependencies in requirements.txt can be adjusted to your own environment; for example, since I cloned an existing PyTorch environment, I can delete the torch line. The pip install here also goes through the domestic mirror source, which greatly speeds it up; how to add the mirror was described above.
The official recommendation is transformers version 4.30.2, and torch 2.0 or higher for the best inference performance.
pip install -r requirements.txt
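For reference, the dependency list in requirements.txt looks roughly like the sketch below (taken from the repo at the time of writing; versions may differ in your checkout). Commenting out the torch line keeps the PyTorch from the cloned conda environment:

```text
protobuf
transformers==4.30.2
cpm_kernels
# torch>=2.0   # already provided by the cloned conda environment
gradio
mdtex2html
sentencepiece
accelerate
```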
6. Application
The official documentation provides several ways to run the model; we generally choose the web interface. Two web demos are provided in the official code: web_demo.py works question by question, returning each answer in one piece, while web_demo2.py streams the answer as it is generated. Note that the necessary dependencies must be installed for either one. For a more practical experience, we choose web_demo2.py.
@AdamBear implemented web_demo2.py, a web demo based on Streamlit. Before using it, install the following dependencies:
pip install streamlit streamlit-chat
By default the demo pulls the model files from Hugging Face. To use local files, you need to modify the model-loading statements: change lines 15 and 16 of web_demo2.py to load from the local path.
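For example, the loading statements at the top of web_demo2.py can be pointed at a local directory like this (a sketch; the path ./chatglm2-6b is a placeholder for wherever you put the model and configuration files):

```python
from transformers import AutoModel, AutoTokenizer

# Placeholder path: the local directory holding the model weights
# and the configuration files downloaded from Hugging Face.
MODEL_PATH = "./chatglm2-6b"

# trust_remote_code is needed because ChatGLM2 ships its own model code
# (modeling_chatglm.py etc.) alongside the weights.
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True).cuda()
model = model.eval()
```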
Then run with the following command:
streamlit run web_demo2.py
In my testing, the Streamlit-based web demo is smoother when the input prompt is long.
Installation problems
1. ValueError: Unrecognized configuration class <class 'transformers_modules.chatglm-6b.configuration_chatglm.ChatGLMConfig'>
Problem description
While deploying ChatGLM, the following error appeared:
ValueError: Unrecognized configuration class <class 'transformers_modules.chatglm-6b.configuration_chatglm.ChatGLMConfig'> to build an AutoTokenizer.
The cause of the problem
I had downloaded the model parameters from the Tsinghua cloud disk (as described above), which is missing many important configuration files.
Solution
The complete configuration files can be downloaded from the Hugging Face website and placed in the same location as the model parameters.
https://huggingface.co/THUDM/chatglm2-6b/tree/main
Reference article: https://blog.csdn.net/weixin_40964597/article/details/131074884