MiniGPT-4 (enhancing vision-language understanding with large language models): introduction, hands-on demo, and deployment tutorial

NO.1 Introduction

MiniGPT-4 uses an advanced large language model to enhance vision-language understanding, combining language capabilities with image understanding.



It combines the BLIP-2 vision encoder with the large language model Vicuna; trained together, they give rise to emerging vision-language capabilities.

MiniGPT-4 on GitHub:

https://github.com/Vision-CAIR/MiniGPT-4

How it works (translated):

  • MiniGPT-4 uses a projection layer to align a frozen vision encoder from BLIP-2 with the frozen LLM Vicuna (a minimal sketch of this idea follows this list).
  • We train MiniGPT-4 in two stages. The first, traditional pre-training stage uses about 5 million image-text pairs and takes about 10 hours on 4 A100s. After this stage, Vicuna can understand images, but its generation ability is severely degraded.
  • To address this issue and improve usability, we propose a new way of creating high-quality image-text pairs using the model itself and ChatGPT. On this basis, we created a small (3,500 pairs in total) but high-quality dataset.
  • The second, fine-tuning stage trains on this dataset with dialog templates to significantly improve generation reliability and overall usability. Surprisingly, this stage is computationally efficient and takes only about 7 minutes on a single A100.
  • MiniGPT-4 exhibits many emerging vision-language capabilities similar to those demonstrated in GPT-4.
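
To make the alignment idea concrete, here is a minimal PyTorch sketch of a projection layer that maps frozen vision features into the LLM's embedding space. This is an illustration only, not the actual MiniGPT-4 code; the dimensions and names are assumptions.

import torch
import torch.nn as nn

class VisionToLLMProjection(nn.Module):
    # Toy sketch: map features from the frozen BLIP-2 vision encoder into the
    # frozen LLM's (Vicuna's) embedding space. Only this layer is trained.
    def __init__(self, vision_dim=1408, llm_dim=5120):  # dims are assumptions
        super().__init__()
        self.proj = nn.Linear(vision_dim, llm_dim)

    def forward(self, vision_feats):
        # vision_feats: (batch, num_tokens, vision_dim)
        # Returns soft "image tokens" fed to the LLM alongside text embeddings.
        return self.proj(vision_feats)

# During both training stages only the projection receives gradients;
# the vision encoder and the LLM stay frozen.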

NO.2 Demo experience

MiniGPT-4 was developed by Chinese researchers and can respond in Chinese, but its replies feel a bit stiff, not as natural as ChatGPT's.


The demo is rather clumsy: you must upload an image before you can start a conversation, which makes it awkward to use. For real applications you would presumably need to do secondary development against the API.

Unable to extract text from image


The text cannot be recognized


It can generally understand the picture content, but its language organization is relatively weak.


NO.3 Deployment requirements

Installation steps

MiniGPT-4 has different hardware requirements depending on which model you choose.

Known requirements so far:

Vicuna 7B:

- VRAM > 12 GB

- RAM > 16 GB

- Disk > 2500 GB

Vicuna 13B:

- VRAM > 24 GB

- RAM > 16 GB

- Disk > 2500 GB

Converting the weights during deployment is estimated to require about 80 GB of RAM.

For training, about 2.3 TB of image data will be downloaded.

This walkthrough deploys the 13B language model.

Note: all the files below are placed under /data. Some of them are very large, so be careful not to put them on the system disk.


1. Install conda

wget -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

# Keep pressing Enter to page through the license until the end, then type yes to accept it
# Enter the installation location: /data/conda

# Add mirror channels
conda config --add channels bioconda
conda config --add channels conda-forge

2. Get the code and set up the environment

git clone https://github.com/Vision-CAIR/MiniGPT-4.git
cd MiniGPT-4
conda env create -f environment.yml
conda activate minigpt4

# If you exit the shell during later steps, you will need to run the following
# again at the next login to activate the environment:
conda activate minigpt4

3. Get the original weights

This step is the most laborious: the weights are huge and download very slowly, and I got it wrong on my first attempt. Downloading the original weights took me two nights and left my head spinning.

The first time, I looked up original weights at https://github.com/facebookresearch/llama/issues/149. After downloading for a whole day, the files turned out to be wrong: the md5 checksums did not match, and the weight-conversion script reported a file error. So do not use these downloads (you will have to find the weights yourself):

7B: ipfs://QmbvdJ7KgvZiyaqHw5QtQxRtUd7pCAdkWWbzuvyKusLGTw
13B: ipfs://QmPCfCEERStStjg4kfj3cmCUu1TP7pVQbxdFMwnhpuJtxk

The second time I downloaded via a Xunlei (Thunder) torrent instead. This time the md5 checksums matched the checklist.

Torrent address:

https://github.com/RiseInRose/MiniGPT-4-ZH/blob/main/CDEE3052D85C697B84F4C1192F43A2276C0DAEA0.torrent

Just download the 13B model with Xunlei. The folder structure is as follows; note that every file in it needs to be downloaded. The final folder is about 25 GB.
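
Before converting, it is worth verifying the download yourself. The LLaMA releases ship a checklist.chk file of md5 sums; a minimal Python check, assuming that file layout and that the weights live under /data/LLaMa/13B (adjust the path to your setup), could look like:

import hashlib
import os

weights_dir = "/data/LLaMa/13B"  # assumed location; adjust to your layout
with open(os.path.join(weights_dir, "checklist.chk")) as f:
    for line in f:
        expected, name = line.split()
        h = hashlib.md5()
        with open(os.path.join(weights_dir, name), "rb") as w:
            # Hash in 1 MB chunks to keep memory use flat on 16 GB files.
            for chunk in iter(lambda: w.read(1 << 20), b""):
                h.update(chunk)
        print(name, "OK" if h.hexdigest() == expected else "MISMATCH")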

4. Download the incremental (delta) weights

You need to install git-lfs before downloading; get it from the official site: https://git-lfs.com

After downloading and installing it, execute:

git lfs install
mkdir /data/vicuna
cd /data/vicuna
# Run this in the background: the files are huge (almost 49 GB in total) and the
# first download took me a whole day. If you keep it attached to your bash
# session and the network drops, it will be very painful.
nohup git clone https://huggingface.co/lmsys/vicuna-13b-delta-v1.1 &

5. Install FastChat

git clone https://github.com/lm-sys/FastChat
cd FastChat
git checkout v0.2.3
# install
pip install -e .
pip install transformers[sentencepiece]

6. Convert the original weights

The downloaded original weights need to be converted (note: the delta weights cloned via git do not need conversion, only the original ones).

# Create a directory for the converted weights
mkdir -p /data/after_conv_weights/origin
mkdir /data/transformers
cd /data/transformers
git clone https://github.com/huggingface/transformers
cd transformers
# Convert the weights. Make sure the directories are written correctly:
# input_dir only needs to point to the level containing tokenizer.model
python src/transformers/models/llama/convert_llama_weights_to_hf.py --input_dir /data/LLaMa --model_size 13B --output_dir /data/after_conv_weights/origin

If the following error appears:

RuntimeError: Failed to import transformers.models.llama.tokenization_llama_fast because of the following error (look up to see its traceback): tokenizers>=0.13.3 is required for normal functioning of this module, but found tokenizers==0.13.2.

run:

pip install -U tokenizers

Then re-execute the above script

After it completes, run the following code directly in Python to load the model and tokenizer:

from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("/data/after_conv_weights/origin")
model = LlamaForCausalLM.from_pretrained("/data/after_conv_weights/origin")
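
If loading succeeds, a quick generation call confirms the converted base model actually runs (the prompt here is arbitrary, and generation on CPU will be slow for a 13B model):

# Optional sanity check on the model loaded above
inputs = tokenizer("The capital of France is", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(output[0], skip_special_tokens=True))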

7. Convert to the final working weights

This step is estimated to require about 80 GB of RAM.

mkdir -p /data/after_conv_weights/final
python -m fastchat.model.apply_delta --base /data/after_conv_weights/origin/ --target /data/after_conv_weights/final/ --delta /data/vicuna/vicuna-13b-delta-v1.1/
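
Conceptually, apply_delta recovers the Vicuna weights by adding the delta tensors to the base LLaMA tensors, parameter by parameter. A toy sketch of the idea (FastChat's real implementation also handles sharded checkpoints and tokenizer/vocabulary differences):

import torch

def apply_delta(base_state: dict, delta_state: dict) -> dict:
    # target = base + delta, for every parameter tensor
    return {name: base_state[name] + delta_state[name] for name in delta_state}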

The conversion produces the final weight folder.

After conversion, modify the configuration file

Edit /data/MiniGPT-4/minigpt4/configs/models/minigpt4.yaml and set:

llama_model: "/data/after_conv_weights/final/"

8. Download the pretrained model checkpoint

https://drive.google.com/file/d/1a4zLvaiDBr-36pasffmgpvH5P7CKmpze/view

The download is a file named pretrained_minigpt4.pth.

Put it into the /data/checkpoint folder.

In /data/MiniGPT-4/eval_configs/minigpt4_eval.yaml, change ckpt to point to /data/checkpoint/pretrained_minigpt4.pth.


At this point, the basic preparation is done.

9. Try to start

cd /data/MiniGPT-4
python demo.py --cfg-path eval_configs/minigpt4_eval.yaml --gpu-id 0

This usually fails on the first run, with errors like the following.

Question 1:

ImportError: libX11.so.6: cannot open shared object file: No such file or directory

Solution:

yum install libX11

Question 2:

ImportError: libXext.so.6: cannot open shared object file: No such file or directory

Solution:

yum install libXext

Question 3:

RuntimeError: The NVIDIA driver on your system is too old (found version 10020). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.

The NVIDIA driver is too old and needs to be updated.

10. Update the NVIDIA driver

Run nvidia-smi to check the current version; if the command is not found, there is no NVIDIA driver installed.

My current test runs on driver version 515.105.01 with CUDA Version 11.7.

Download the NVIDIA driver for the corresponding model from https://www.nvidia.cn/Download/index.aspx?lang=cn. The driver varies depending on the graphics card.

In my case it is the driver for a V100S.


Do not rush to install it right after downloading.

Install gcc and dkms first

yum -y install gcc dkms

Check the kernel version

uname -r
yum list | grep kernel-devel
yum list | grep kernel-headers

These three versions must match exactly, down to the last minor version number. My version is:

(minigpt4) [root@10-13-50-112 cc_sbu]# uname -r
3.10.0-1062.9.1.el7.x86_64

At first the other two did not match and needed updating. Download the following two rpm packages from https://buildlogs.centos.org/c7.1908.u.x86_64/kernel/20191206154625/3.10.0-1062.9.1.el7.x86_64/ to update them:

kernel-devel-3.10.0-1062.9.1.el7.x86_64.rpm
kernel-headers-3.10.0-1062.9.1.el7.x86_64.rpm

Uninstall any previously installed NVIDIA driver (skip this if none is installed):

cd /usr/bin/
./nvidia-uninstall

Install NVIDIA driver

cd /data/navida/
chmod a+x NVIDIA-Linux-x86_64-515.105.01.run
./NVIDIA-Linux-x86_64-515.105.01.run

Then follow the installer and choose yes at the prompts (navigate with the left/right arrow keys and press Enter). If it reports that xxx/build and xxx/source cannot be found, the kernel tooling is wrong and the kernel packages need to be reinstalled.

After installation, check the version with the nvidia-smi command.
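
It is also worth confirming that PyTorch inside the minigpt4 conda environment can see the GPU after the driver update:

# Run inside the minigpt4 environment
import torch

print(torch.__version__, "CUDA build:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))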

11. Start the demo

Run the same command as above:

cd /data/MiniGPT-4
python demo.py --cfg-path eval_configs/minigpt4_eval.yaml --gpu-id 0

After execution, yet another error is reported:

NameError: name 'cuda_setup' is not defined

Edit:

vim /data/conda/envs/minigpt4/lib/python3.9/site-packages/bitsandbytes/cuda_setup/main.py

Around line 149

add:

cuda_setup = CUDASetup.get_instance()

After this modification, run the command again and the demo will start. On startup it prints an address through which the demo can be accessed:

https://3c70e646a6198e3ec7.gradio.live
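
Since the web UI is clumsy for anything beyond quick tests (it requires uploading an image before chatting, as noted earlier), secondary development would likely go through the Gradio API. Below is a minimal sketch using the gradio_client package; the exposed endpoints are not documented here, so inspect them with view_api() before calling anything:

from gradio_client import Client

# Point the client at the share link printed by demo.py (yours will differ)
client = Client("https://3c70e646a6198e3ec7.gradio.live")
client.view_api()  # lists the exposed endpoints and their parameters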

12. Two-stage training

After MiniGPT-4 is built, it goes through two stages of training.

For the first stage, a pre-trained checkpoint is provided directly; there is no need to train it on your own server.

The second stage you have to train yourself.

The first stage pre-training checkpoint:

https://drive.google.com/file/d/1u9FRRBB3VovP1HxCAlpD9Lw4t4P6-Yq8/view?usp=share_link

After downloading, place it under the /data/checkpoint/ directory

The second stage of fine-tuning:

Download the data:

https://drive.google.com/file/d/1nJXhoEcy3KTExr17I7BXqY5Y9Lx_-n-9/view?usp=share_link

Put it under /data/stage_2

Then change /data/MiniGPT-4/minigpt4/configs/datasets/cc_sbu/align.yaml so that storage points to /data/stage_2/cc_sbu_align.

Enter the /data/MiniGPT-4/train_configs directory and edit minigpt4_stage2_finetune.yaml: point model.ckpt to the first-stage pre-training checkpoint, i.e. /data/checkpoint/pretrained_minigpt4_stage1.pth, and set run.output_dir to /data/checkpoint/.

At the same time, modify the following three parameters under run (if you use an A100 you can keep the defaults; on a V100 the GPU memory is insufficient, so the training sizes need to be reduced):

batch_size_train: 1
batch_size_eval: 2
num_workers: 2

Then go back to the /data/MiniGPT-4 directory and execute

torchrun --nproc-per-node 1 train.py --cfg-path train_configs/minigpt4_stage2_finetune.yaml

After training, a directory such as /data/checkpoint/20230517153 is generated, containing four files, checkpoint_1.pth through checkpoint_4.pth.

Finally, point ckpt in /data/MiniGPT-4/eval_configs/minigpt4_eval.yaml to /data/checkpoint/20230517153/checkpoint_4.pth.
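
Before relaunching, a quick sanity check is to load the fine-tuned checkpoint on the CPU and inspect its top-level keys (the exact layout of the .pth file is an assumption here):

import torch

ckpt = torch.load("/data/checkpoint/20230517153/checkpoint_4.pth", map_location="cpu")
print(type(ckpt))
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))  # e.g. model weights, optimizer state, epoch info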

Then run again:

cd /data/MiniGPT-4
conda activate minigpt4
python demo.py --cfg-path eval_configs/minigpt4_eval.yaml --gpu-id 0

