Deploying ChatGLM2-6B on Ubuntu: a pitfall record

Table of contents

1. Environment configuration
   1. Install Anaconda or Miniconda for environment management
   2. Install CUDA
   3. Environment installation
2. Configure and load the model
   1. Create a THUDM folder
3. Problems encountered
   1. Errors during pip install -r requirements.txt
   2. Error when running python web_demo.py: TypeError: Descriptors cannot not be created directly.
   3. Error when running python web_demo.py: AttributeError: module 'numpy' has no attribute 'object'.
4. Web version Demo
   - Web version Demo based on Streamlit
5. Command line Demo
6. Summary


Preface: ChatGLM2-6B is the second-generation version of the open-source Chinese-English bilingual dialogue model ChatGLM-6B. While retaining many excellent features of the first-generation model, such as smooth dialogue and a low deployment threshold, ChatGLM2-6B offers stronger performance, a longer context, more efficient inference, and a more open license.

Project repository: https://github.com/THUDM/ChatGLM2-6B

1. Environment configuration

1. Install Anaconda or Miniconda for environment management

Installation guide: "Installing Miniconda on Ubuntu" (Baby_of_breath's blog, CSDN)

2. Install CUDA

Installation guide: "Installing CUDA 11.3 on Ubuntu" (Programmer Sought)

3. Environment installation

git clone https://github.com/THUDM/ChatGLM2-6B  # clone the repository
cd ChatGLM2-6B  # enter the project folder

# create a conda environment
conda create -n chatglm python=3.8
conda activate chatglm  # activate the new environment

# install the required dependencies with pip
pip install -r requirements.txt

2. Configure and load the model

1. Create a THUDM folder

mkdir THUDM  # create this inside the ChatGLM2-6B project folder
cd THUDM
mkdir chatglm2-6b  # put all the downloaded model and configuration files in this folder

# resulting file location
/home/wxy/ChatGLM2-6B/THUDM/chatglm2-6b

Then download all the model and configuration files from Hugging Face (THUDM/chatglm2-6b). It is recommended to download them manually and place them in ChatGLM2-6B/THUDM/chatglm2-6b.
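Before launching a demo, it can help to verify that the local snapshot is complete. Here is a minimal sketch; the file names below are an illustrative subset, not the authoritative list — check the Hugging Face repo for the full set of files a chatglm2-6b snapshot contains:

```python
import os

# Illustrative subset of files in a chatglm2-6b snapshot (assumption);
# consult the Hugging Face repo page for the authoritative list.
EXPECTED = ["config.json", "tokenizer_config.json", "tokenization_chatglm.py"]

def missing_files(model_dir, expected=EXPECTED):
    """Return the expected files that are not present in model_dir."""
    present = set(os.listdir(model_dir))
    return [name for name in expected if name not in present]
```

Calling missing_files("/home/wxy/ChatGLM2-6B/THUDM/chatglm2-6b") should return an empty list when the download is complete.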


3. Problems encountered

1. Errors during pip install -r requirements.txt

When pip installs requirements.txt, some dependencies may fail to install. In my case the failures involved the packages listed below.

Workaround: install the missing dependencies directly with pip, pinned to compatible versions:

pip install oauthlib==3.0.0
pip install tensorboard==1.15
pip install urllib3==1.25.0
pip install requests-oauthlib==1.3.1
pip install torch-tb-profiler==0.4.1 
pip install google-auth==2.18.0 

2. Error when running python web_demo.py: TypeError: Descriptors cannot not be created directly.

TypeError: Descriptors cannot not be created directly.

The TypeError shown above ("cannot not" is protobuf's literal error text) is caused by the installed version of the protobuf library: the error message states that the code needs to be regenerated with a protoc version greater than or equal to 3.19.0.

Solution:

pip uninstall protobuf  # remove the installed protobuf

pip install protobuf==3.19.0  # reinstall version 3.19.0
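To confirm the pin took effect (or to catch the same mismatch early elsewhere), a small sketch that compares version strings against the >= 3.19.0 requirement from the error message. parse_version here is a simplified, hypothetical helper for plain dotted versions, not protobuf's own API:

```python
def parse_version(v):
    """Turn a dotted version string like '3.19.0' into a comparable tuple.
    Simplified: non-numeric parts (e.g. 'rc1' suffixes) are ignored."""
    return tuple(int(part) for part in v.split(".") if part.isdigit())

def meets_minimum(installed, minimum="3.19.0"):
    """True if the installed version satisfies the minimum requirement."""
    return parse_version(installed) >= parse_version(minimum)
```

In practice you could feed it importlib.metadata.version("protobuf") to check the environment you are about to run the demo in.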

3. Error when running python web_demo.py: AttributeError: module 'numpy' has no attribute 'object'.

AttributeError: module 'numpy' has no attribute 'object'

If you encounter AttributeError: module 'numpy' has no attribute 'object', the solution is as follows:

pip uninstall numpy  # remove the installed numpy

pip install numpy==1.23.4  # install numpy 1.23.4
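For context, np.object was a deprecated alias for the builtin object and was removed in NumPy 1.24, which is why downgrading helps. If you can edit the offending code instead of downgrading, the forward-compatible fix is to use the builtin directly, as in this sketch:

```python
import numpy as np

# OLD (fails on NumPy >= 1.24): np.array([...], dtype=np.object)
# NEW (works on both old and new NumPy): use the builtin `object`
arr = np.array([1, "two", 3.0], dtype=object)
```

Either approach resolves the AttributeError; editing the code avoids pinning an older NumPy.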

4. Web version Demo

First install Gradio:

pip install gradio

Then run web_demo.py from the repository; it launches a local web page:

python web_demo.py

Web version Demo based on Streamlit

The repository also provides a Streamlit version of the web demo (web_demo2.py). Install Streamlit and run it with:

pip install streamlit streamlit-chat
streamlit run web_demo2.py

5. Command line Demo

Run cli_demo.py from the repository and an interactive dialogue will start in the terminal. Type an instruction and press Enter to generate a reply; type clear to clear the dialogue history; type stop to terminate the program.

python cli_demo.py
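The clear/stop behaviour described above boils down to a small dispatch step before each model call. A hypothetical sketch of that logic follows; cli_demo.py's actual implementation differs, and the names here are illustrative:

```python
def handle_command(user_input, history):
    """Interpret the CLI demo's control commands:
    'stop' ends the program, 'clear' wipes the dialogue history,
    and anything else is passed to the model as a prompt."""
    cmd = user_input.strip().lower()
    if cmd == "stop":
        return "stop", history
    if cmd == "clear":
        return "clear", []   # reset the dialogue history
    return "chat", history   # forward the input to the model
```

A driver loop would call this on every line of input and only invoke the model when the returned action is "chat".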

6. Summary

Overall, ChatGLM2-6B performs very well: inference is fast, and its output is more logically organized than GPT's. However, both the web demo and the command-line demo use a lot of video memory (about 13 GB), so the compute cost is significant. The repository also provides low-cost deployment options, which is considerate. Since my graphics card is an RTX 3090 with 24 GB of video memory, I did not try low-cost deployment; if you are interested, give it a try.


Origin blog.csdn.net/weixin_59961223/article/details/131538926