[baichuan2 model deployment experience] A step-by-step guide to installing and using the baichuan2-7b-chat model on a Linux server (model download + environment configuration + error analysis)


Because we needed to test the performance of the baichuan2 model on a given dataset, we set out to deploy it on a Linux server.

1. Model download

The sample code given in baichuan2's GitHub repository is very simple: you can call AutoModelForCausalLM.from_pretrained(model_name) directly. However,
I didn't realize that huggingface.co cannot be reached when running the code on the server unless a proxy is enabled! Therefore we need to download the model to local storage first. Manually downloading the files one by one from the Hugging Face model hub page is too inefficient, so my article [Download the huggingface model directly from the server to solve the problem of huggingface being unable to connect] gives the method of downloading with code; you can click through to read it.
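As a minimal sketch of that approach (the mirror endpoint and local path here are assumptions; substitute whatever is reachable from your server), the huggingface_hub library can pull the whole model repository down on the server itself:

import os
# Point huggingface_hub at a reachable mirror BEFORE importing it.
# hf-mirror.com is only an assumed example; use any mirror or proxy that works for you.
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"

from huggingface_hub import snapshot_download

# Download every file of the model repo into a local folder.
snapshot_download(
    repo_id="baichuan-inc/Baichuan2-7B-Chat",
    local_dir="./Baichuan2-7B-Chat",  # assumed local path, reused below
)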

2. Environment configuration

Create a new conda virtual environment, get the requirements.txt configuration file from the baichuan2 repository, and run pip install -r requirements.txt
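With the environment in place, the repository's sample usage can be pointed at the local download instead of the hub name. Below is a sketch following the pattern in the baichuan2 README; the local path is the assumed download location from step 1, and float16 is used here because V100 cards do not support bfloat16:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation.utils import GenerationConfig

model_path = "./Baichuan2-7B-Chat"  # assumed local download path from step 1

tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path, device_map="auto", torch_dtype=torch.float16, trust_remote_code=True
)
model.generation_config = GenerationConfig.from_pretrained(model_path)

messages = [{"role": "user", "content": "Hello, please introduce yourself."}]
print(model.chat(tokenizer, messages))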

3. Error analysis

I thought that would be the end of it, but I ran into a series of errors. They are solved as follows.

1. AttributeError: module ‘torch.backends.cuda’ has no attribute ‘sdp_kernel’

[Cause analysis] torch 2.0 is required to run this (torch.backends.cuda.sdp_kernel was introduced in PyTorch 2.0); my previous environment had an older 1.x version.
[Solution] Rebuild a virtual environment with python=3.10 and install torch 2.0.
Select the corresponding torch installation package on the PyTorch wheel index (https://download.pytorch.org/whl/).
Right-click the package and copy its link, then, inside the conda environment created above, run wget <link> to download it. For example, I chose the first one, which is torch 2.0 + cuda11.8 + python3.10 for a Linux system (win_amd64 refers to the Windows build).

wget https://download.pytorch.org/whl/cu118/torch-2.0.0%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=4b690e2b77f21073500c65d8bb9ea9656b8cb4e969f357370bbc992a3b074764

After downloading, install it with pip install <package name>.whl:

pip install torch-2.0.0+cu118-cp310-cp310-linux_x86_64.whl 

Run python to enter the interpreter, then import torch and query torch.__version__:

python
import torch
torch.__version__

The printed version should be 2.0.0+cu118.

2. AttributeError: ‘BaichuanTokenizer’ object has no attribute ‘sp_model’

[Cause analysis] Newer transformers releases (4.34 and later) changed tokenizer initialization so that sp_model is accessed before BaichuanTokenizer has created it.
[Solution] Use transformers version 4.33.3

pip install transformers==4.33.3
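After reinstalling, a quick sanity check (just a sketch, run inside the same environment) confirms that both pinned versions are in effect before retrying the model:

import torch
import transformers
print(torch.__version__)         # expect 2.0.0+cu118
print(transformers.__version__)  # expect 4.33.3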

3. RuntimeError: Failed to import transformers.modeling_utils because of the following error (look up to see its traceback):

    CUDA Setup failed despite GPU being available. Please run the following command to get more information:

    python -m bitsandbytes

    Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
    to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
    and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

[Cause analysis] The CUDA version does not match torch: I had only installed the torch 2.0 wheel, without configuring the corresponding CUDA 11.8 toolkit.
[Solution] Execute the following command to install CUDA 11.8:

conda install cuda -c nvidia/label/cuda-11.8.0
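After the install finishes, it is worth checking (a quick sketch) that torch, its bundled CUDA build, and the GPU all line up:

import torch
print(torch.__version__)          # e.g. 2.0.0+cu118
print(torch.version.cuda)         # CUDA version torch was compiled against, e.g. 11.8
print(torch.cuda.is_available())  # should now be True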

4. torch.cuda.OutOfMemoryError: CUDA out of memory.

Tried to allocate 192.00 MiB (GPU 0; 31.75 GiB total capacity; 30.58 GiB already allocated; 49.50 MiB free; 30.73 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
[Cause analysis] The job had previously been submitted with only a single 32 GB V100 card, whose memory is insufficient; two cards need to be specified for execution.
[Solution] The machine here is a multi-user GPU server that uses the LSF scheduling system to submit jobs. Refer to the run.sh script for submitting the job:

#!/bin/bash
#BSUB -J job_name                            # job name
#BSUB -e /nfsshare/home/xxx/log/NAME_%J.err  # error log path
#BSUB -o /nfsshare/home/xxx/log/NAME_%J.out  # output log path
#BSUB -n 2                                   # specify 2 slots, one per GPU
#BSUB -q gpu                                 # use the gpu queue
#BSUB -m gpu01                               # run on node gpu01
#BSUB -R "rusage[ngpus_physical=2]"          # reserve 2 physical GPUs
#BSUB -gpu "num=2:mode=exclusive_process"    # 2 GPUs in exclusive-process mode
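Once the job lands on two cards, loading the model with device_map="auto" (as in the sketch in section 2) shards the fp16 weights across both GPUs, so a single 32 GB V100 no longer has to hold everything. A quick sketch for verifying the placement (the path is again the assumed local download):

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "./Baichuan2-7B-Chat",  # assumed local path
    device_map="auto",      # let accelerate shard layers across both visible GPUs
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
print(model.hf_device_map)  # shows which layers landed on which GPU
for i in range(torch.cuda.device_count()):
    print(f"GPU {i}: {torch.cuda.memory_allocated(i) / 2**30:.1f} GiB allocated")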

Origin blog.csdn.net/a61022706/article/details/134903717