Tsinghua GLM deployment record

Environment deployment

  1. First install Anaconda (recommended, since it makes package management more convenient). Windows users need to configure the environment variables manually. The instructions below assume an Ubuntu environment by default.
  2. Create a Python environment: conda create -n your_env_name python=3.10 (Note: the official project targets Python 3.8, but Python 3.10 is now the mainstream choice, so 3.10 is used here. your_env_name is an environment name of your own choosing.)
  3. Install the required packages. The package versions are listed in the attached requirements.txt; you can install them with pip install -r requirements.txt. A command sketch follows this list.
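
A minimal command sketch of the steps above, assuming Anaconda is already installed and requirements.txt is in the current directory (the environment name my_glm_env is just an example):

    # create and activate an isolated Python 3.10 environment
    conda create -n my_glm_env python=3.10
    conda activate my_glm_env
    # install the pinned dependency versions from the attached requirements file
    pip install -r requirements.txt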

Pull the GitHub project

  1. git clone https://github.com/chatchat-space/langchain-ChatGLM.git

  2. cd langchain-ChatGLM

  3. Start the web UI. On Ubuntu you can run the command directly:

    python webui.py 
    

    Start the API interface. On Ubuntu you can run the command directly:

    python api.py
    

    To use multiple GPUs, prefix the command with CUDA_VISIBLE_DEVICES (e.g. CUDA_VISIBLE_DEVICES=0,1), similar to this:

    CUDA_VISIBLE_DEVICES=0,1,2,3 python api.py
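
    Once api.py is running, a quick way to confirm it is reachable is to open the auto-generated FastAPI docs page. The port below (7861) is an assumption about the project's default; use whatever address the startup log actually prints:

    # probe the interactive API docs (or open the URL in a browser)
    curl http://127.0.0.1:7861/docs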
    

Possible problems

  • By default, the first time you run the command the model is downloaded from Hugging Face, and you may run into network connection problems. Possible solutions:

    • Rerun the command. Downloads resume from the breakpoint, so just keep rerunning until the model weights finish downloading. The default cache location is ~/.cache/huggingface/hub/models--<model name>.

    • Open Hugging Face, search for the model to be downloaded, and download its files locally from the Files and versions tab (a command-line alternative is sketched after this list). The directory structure is:

      .
      ├── added_tokens.json
      ├── config.json
      ├── configuration_codet5p_embedding.py
      ├── merges.txt
      ├── modeling_codet5p_embedding.py
      ├── pytorch_model.bin
      ├── special_tokens_map.json
      ├── tokenizer.json
      ├── tokenizer_config.json
      └── vocab.json

      Then set the local_model_path field in langchain-ChatGLM/config/model_config.py to the path of that model folder.
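
If network failures make the automatic download impractical, a hedged alternative is to fetch the model snapshot with huggingface-cli (from the huggingface_hub package) and point local_model_path at the resulting folder. The model name below is only an example; substitute the one your configuration actually uses:

    # download a full model snapshot into a local directory
    pip install -U huggingface_hub
    huggingface-cli download THUDM/chatglm2-6b --local-dir ./models/chatglm2-6b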

More instructions

You can use a FastChat deployment to put the model and the API on different servers (or on the same server).

Reference link: https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md#restful-api-server
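
A minimal sketch of the FastChat setup described in that link, assuming FastChat is installed on the machine(s) hosting the model; the model path is only an example, and 8000 is FastChat's documented default port for the OpenAI-compatible server:

    # terminal 1: start the controller
    python3 -m fastchat.serve.controller
    # terminal 2: start a model worker (replace the model path with your own)
    python3 -m fastchat.serve.model_worker --model-path THUDM/chatglm2-6b
    # terminal 3: expose an OpenAI-compatible RESTful API
    python3 -m fastchat.serve.openai_api_server --host localhost --port 8000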

Original post: blog.csdn.net/Climbman/article/details/133457936