This time we deploy Tsinghua University's open-source ChatGLM. The source code and model addresses are given below.
1. Turn on BBR acceleration
How to enable BBR acceleration is covered in my earlier article, Linux Enables Kernel BBR Acceleration.
2. Pull ChatGLM source code and ChatGLM model
Source code: https://github.com/lukeewin/ChatGLM-6B
Model download: https://huggingface.co/THUDM/chatglm-6b-int4
Before downloading, I create a directory to hold the ChatGLM-related content.
cd /opt
mkdir ChatGLM
cd ChatGLM
After entering the ChatGLM directory, you can download the ChatGLM source code.
git clone https://github.com/lukeewin/ChatGLM-6B.git
We also need to download the model files. Because they are large, install git-lfs first.
apt install git-lfs
After the installation is complete, create a directory to store the model files. Here I create it under the /opt/ChatGLM path.
mkdir model
cd model
Then we can download the model data.
git lfs install
git clone https://huggingface.co/THUDM/chatglm-6b-int4
At this point, the ChatGLM source code and corresponding models have been cloned to the server.
3. Modify the configuration
Before modifying the configuration, we also need to install the CUDA toolkit.
apt install nvidia-cuda-toolkit
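Before going further, it is worth verifying that PyTorch can actually see the GPU. A small sketch (the `cuda_status` helper is mine, not part of the project; it degrades gracefully if torch is not yet installed):

```python
# Quick check that PyTorch can see the GPU after the CUDA install.
def cuda_status() -> str:
    try:
        import torch  # the ChatGLM demo depends on torch anyway
    except ImportError:
        return "torch not installed"
    if torch.cuda.is_available():
        return "cuda ok: " + torch.cuda.get_device_name(0)
    return "cuda not available"

print(cuda_status())
```

If this reports "cuda not available", check the driver installation before starting the web demo, since the demo calls `.cuda()` on the model.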
Then edit requirements.txt in the source code and append the following three lines:
chardet
streamlit
streamlit-chat
Then use pip to install the required libraries.
pip install -r requirements.txt
Next, we modify the web_demo2.py file.
Change the model path in the following two lines to the path of your own model; be sure to use an absolute path.
tokenizer = AutoTokenizer.from_pretrained("/opt/ChatGLM/model/chatglm-6b-int4", trust_remote_code=True)
model = AutoModel.from_pretrained("/opt/ChatGLM/model/chatglm-6b-int4", trust_remote_code=True).half().cuda()
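The absolute-path requirement matters because `from_pretrained()` may interpret a relative string as a Hugging Face Hub repository id rather than a local directory. A small guard (a sketch of my own, not part of web_demo2.py) can catch that mistake early:

```python
# Fail fast if the configured model path is not absolute, instead of
# letting from_pretrained() try to resolve it against the Hub.
import os

def checked_model_path(path: str) -> str:
    """Return `path` unchanged if it is absolute, otherwise raise."""
    if not os.path.isabs(path):
        raise ValueError("model path must be absolute, got: " + repr(path))
    return path
```

You would wrap the path argument, e.g. `AutoTokenizer.from_pretrained(checked_model_path("/opt/ChatGLM/model/chatglm-6b-int4"), trust_remote_code=True)`.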
Then we open a port for external access to the web UI.
ufw allow 8080/tcp
Here I have opened port 8080. Before opening a port, you can check which ports are currently open with the following command.
ufw status
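Besides the firewall rule, it helps to confirm nothing else is already listening on the chosen port. A quick sketch (the `port_in_use` helper is an assumption of mine, not part of the project):

```python
# Check whether a TCP port already has a listener before assigning it
# to the streamlit server.
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        return s.connect_ex((host, port)) == 0

print(port_in_use(8080))
```

If it prints True, either stop the conflicting service or pick a different port for streamlit.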
4. Start the project
python3 -m streamlit run ./web_demo2.py --server.port 8080
Then visit ip:8080 in a browser to see the result.
5. Effect
You can chat with it in both Chinese and English.
If you like this article, remember to forward, like, and bookmark.
6. Source code and model download
Source code: https://github.com/lukeewin/ChatGLM-6B
Model: https://huggingface.co/THUDM/chatglm-6b-int4
7. Video Tutorial
Building ChatGLM Based on Cloud Services