Introduction
Although MiniGPT-4 offers a hosted web demo, heavy concurrent traffic makes the web version prone to freezing, so this article focuses on deploying it locally.
The previous article already introduced MiniGPT-4, so I won't repeat that here. If you are unfamiliar with it, see https://blog.csdn.net/qq_45066628/article/details/130231186?spm=1001.2014.3001.5501
Due to limited funds, I am using the 7B model here. According to the documentation, the 7B model needs about 12 GB of GPU memory, while the 13B model needs 24 GB.
Build process
1. Environment setup
I am using a Conda environment here. Setting up Conda is straightforward, so I won't cover it; if you are new to it, plenty of tutorials are available online.
After installing Conda, install CUDA and torch by following the official instructions:
Cuda: https://developer.nvidia.com/cuda-toolkit
Torch: https://pytorch.org/get-started/locally/
After CUDA is installed, run the nvcc -V command; if it prints version information, CUDA is installed correctly.
After torch is installed, run the following in a Python shell to check the installation:
import torch
torch.cuda.is_available()
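Since a broken CUDA install is the most common stumbling block, the check above can be wrapped in a small helper. This is a sketch; the function name `cuda_ready` is my own, not part of torch:

```python
# Sanity check for the CUDA/torch install.
# `cuda_ready` is an illustrative helper name, not a torch API.
def cuda_ready() -> bool:
    """Return True if torch imports and sees a CUDA device."""
    try:
        import torch
        return torch.cuda.is_available()
    except ImportError:
        # torch is not installed in this environment
        return False

print("CUDA ready:", cuda_ready())
```

If this prints False on a GPU machine, the usual culprit is a mismatch between the installed CUDA toolkit and the CUDA version your torch wheel was built against.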
2. Model download
1. Download related models
Both v0 and v1 versions are available. I recommend v1, which has comparatively fewer bugs, but the choice is ultimately personal.
Method 1: download directly from the Hugging Face pages
7b address (v1): https://huggingface.co/lmsys/vicuna-7b-delta-v1.1/tree/main
13b address (v0): https://huggingface.co/lmsys/vicuna-13b-delta-v0/tree/main
Method 2: pull with git
1. Use git to pull the vicuna model
v0 version:
git clone https://huggingface.co/lmsys/vicuna-13b-delta-v0 # more powerful; needs at least 24 GB of GPU memory
# or
git clone https://huggingface.co/lmsys/vicuna-7b-delta-v0 # smaller; needs 12 GB of GPU memory
v1 version:
git clone https://huggingface.co/lmsys/vicuna-13b-delta-v1.1 # more powerful; needs at least 24 GB of GPU memory
# or
git clone https://huggingface.co/lmsys/vicuna-7b-delta-v1.1 # smaller; needs 12 GB of GPU memory
2. Use git to pull the llama model
git clone https://huggingface.co/decapoda-research/llama-13b-hf
# or
git clone https://huggingface.co/decapoda-research/llama-7b-hf
3. Merge the vicuna and llama models
Once both downloads finish, merge them with FastChat, the official tool for applying the vicuna delta to the llama weights. If you have no proxy or other means of accelerating downloads, installing FastChat from source is recommended:
git clone https://github.com/lm-sys/FastChat.git
cd FastChat/
pip3 install --upgrade pip # enable PEP 660 support
pip3 install -e .
Once the installation completes without errors, apply the delta:
python -m fastchat.model.apply_delta \
  --base /path/to/llama-13bOR7b-hf/ \
  --target /path/to/save/working/vicuna/weight/ \
  --delta /path/to/vicuna-13bOR7b-delta-v0/
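Conceptually, apply_delta just adds the delta weights to the base weights, tensor by tensor. Here is a toy illustration with plain Python lists standing in for real checkpoint tensors; this `apply_delta` is my own sketch, not FastChat's actual implementation:

```python
# Toy illustration of delta-weight merging: vicuna = llama + delta,
# applied parameter by parameter. Plain lists stand in for tensors.
def apply_delta(base: dict, delta: dict) -> dict:
    """Add delta weights to base weights, key by key (illustrative only)."""
    assert base.keys() == delta.keys(), "checkpoints must share parameter names"
    return {name: [b + d for b, d in zip(base[name], delta[name])]
            for name in base}

base = {"layer0.weight": [1.0, 2.0]}
delta = {"layer0.weight": [0.5, -0.5]}
print(apply_delta(base, delta))  # {'layer0.weight': [1.5, 1.5]}
```

This is also why the delta release avoids redistributing the original llama weights: you must supply them yourself as --base.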
2. Pretrained checkpoint download
| name | download |
|---|---|
| Checkpoint Aligned with Vicuna 7B | https://drive.google.com/file/d/1RY9jV0dyqLX-o38LrumkKRh6Jtaop58R/view?usp=sharing |
| Checkpoint Aligned with Vicuna 13B | https://drive.google.com/file/d/1a4zLvaiDBr-36pasffmgpvH5P7CKmpze/view?usp=share_link |
3. Configuration file modification
1. In eval_configs/minigpt4_eval.yaml , change the value of ckpt to the path of the pretrained checkpoint downloaded above.
2. In minigpt4/configs/models/minigpt4.yaml , change llama_model to the path of the merged vicuna weights produced by apply_delta.
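For reference, the relevant lines look roughly like this. The paths are placeholders you must replace, and the exact key layout may differ between repo versions, so verify against your checkout:

```yaml
# eval_configs/minigpt4_eval.yaml (excerpt, path is a placeholder)
model:
  ckpt: '/path/to/pretrained_checkpoint.pth'
```

```yaml
# minigpt4/configs/models/minigpt4.yaml (excerpt, path is a placeholder)
model:
  llama_model: '/path/to/merged/vicuna/weights/'
```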
4. Run the project
python demo.py --cfg-path eval_configs/minigpt4_eval.yaml --gpu-id 0
Once it is running, open http://localhost:7860 in your browser.