Environment deployment
- First install Anaconda (recommended, since it makes package management more convenient). Windows users need to configure the environment variables manually. The instructions below assume an Ubuntu environment.
- Create a Python environment: conda create -n your_env_name python=3.10 (note: the project officially targets Python 3.8, but Python 3.10 is now mainstream, so 3.10 is used here; your_env_name is an environment name of your choosing).
- Install the required packages. The reference package versions are listed in the attached requirements.txt; install them with pip install -r requirements.txt. A consolidated sketch of these steps follows this list.
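A minimal end-to-end sketch of the steps above, assuming conda is already on your PATH (the environment name langchain-chatglm is arbitrary):

```bash
# Create and activate an isolated Python 3.10 environment
conda create -n langchain-chatglm python=3.10
conda activate langchain-chatglm

# Install the pinned dependencies from the attached requirements.txt
pip install -r requirements.txt
```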
Pull the GitHub project
- git clone https://github.com/chatchat-space/langchain-ChatGLM.git
- cd langchain-ChatGLM
- Start the WebUI (on Ubuntu you can run the command directly): python webui.py
- Start the API server (on Ubuntu you can run the command directly): python api.py
- To run on multiple GPUs, prefix the command with CUDA_VISIBLE_DEVICES, for example: CUDA_VISIBLE_DEVICES=0,1,2,3 python api.py (see the GPU check below)
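Before choosing device IDs, it can help to confirm which GPUs the driver sees; a quick check, assuming an NVIDIA driver with nvidia-smi available:

```bash
# List the GPUs visible to the driver (these indices map to CUDA_VISIBLE_DEVICES)
nvidia-smi --list-gpus

# Expose only GPUs 0 and 1 to the API process
CUDA_VISIBLE_DEVICES=0,1 python api.py
```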
Possible problems
By default, the first run downloads the model from Hugging Face automatically, and you may run into network connection problems. Possible solutions:
- Rerun the command. Downloads resume from the breakpoint, so keep rerunning until the model weights finish downloading. The default cache location is ~/.cache/huggingface/hub/models--&lt;org&gt;--&lt;model name&gt;.
- Open Hugging Face, search for the model you need, and download its files locally from the "Files and versions" tab (a command-line sketch follows the tree below). The directory structure looks like this:
.
├── added_tokens.json
├── config.json
├── configuration_codet5p_embedding.py
├── merges.txt
├── modeling_codet5p_embedding.py
├── pytorch_model.bin
├── special_tokens_map.json
├── tokenizer.json
├── tokenizer_config.json
└── vocab.json

Then set the local_model_path field in langchain-ChatGLM/config/model_config.py to the path of the model folder (sketched below).
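The manual download can also be done from the command line; a sketch assuming git-lfs is installed (the tree above matches Salesforce's CodeT5+ embedding model, used here as the example; substitute whichever model you need):

```bash
# Fetch the full model repository, including the LFS weight files
git lfs install
git clone https://huggingface.co/Salesforce/codet5p-110m-embedding /data/models/codet5p-110m-embedding
```

And a minimal sketch of the corresponding model_config.py entry; the llm_model_dict layout and key names vary between versions of the project, so check the file in your checkout rather than copying this verbatim (chatglm-6b is an illustrative entry):

```python
# langchain-ChatGLM/config/model_config.py (illustrative; keys may differ by version)
llm_model_dict = {
    "chatglm-6b": {
        "name": "chatglm-6b",
        "pretrained_model_name": "THUDM/chatglm-6b",
        # Point this at the model folder downloaded above
        "local_model_path": "/data/models/chatglm-6b",
        "provides": "ChatGLM",
    },
}
```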
More instructions
You can use a FastChat deployment to run the model and the API on different servers (or on the same server); see the sketch below.
Reference link: https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md#restful-api-server
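A minimal sketch of the three processes described in the linked FastChat document (the flags come from that document and may differ across FastChat versions; the model path and port are examples). Run each in its own terminal or in the background:

```bash
# 1. Start the controller that tracks model workers
python -m fastchat.serve.controller

# 2. Start a model worker on the machine hosting the weights
python -m fastchat.serve.model_worker --model-path THUDM/chatglm-6b

# 3. Start the OpenAI-compatible REST API server
python -m fastchat.serve.openai_api_server --host localhost --port 8000
```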