Hardware and software configuration:
CPU: AMD 5800, 8 cores / 16 threads
GPU: NVIDIA RTX 3090 x1, NVIDIA TITAN RTX x1
OS: Ubuntu 20.04
1. LoRA model multi-card training
1.1 Install xformers and other libraries
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
git clone https://github.com/facebookresearch/xformers/
cd xformers
git submodule update --init --recursive
export FORCE_CUDA="1"
# Go to https://developer.nvidia.com/cuda-gpus#compute
# and set the Compute Capability for the GPU you use; the 3090 and A5000 are both 8.6
export TORCH_CUDA_ARCH_LIST=8.6
pip install -r requirements.txt
pip install -e .
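The machine described above mixes two GPU generations, so it may be worth listing more than one Compute Capability when building. The following is a minimal sketch, assuming the values from NVIDIA's table (RTX 3090 and RTX A5000 at 8.6, TITAN RTX at 7.5). The GPU_CAPABILITY dictionary and the arch_list helper are hypothetical names used only for illustration.

```python
# Hypothetical helper: build a TORCH_CUDA_ARCH_LIST value for a mixed-GPU machine.
# Capability values are copied from https://developer.nvidia.com/cuda-gpus
GPU_CAPABILITY = {
    "RTX 3090": (8, 6),
    "RTX A5000": (8, 6),
    "TITAN RTX": (7, 5),
}


def arch_list(gpu_names):
    """Return a ';'-separated, de-duplicated, ascending list of capabilities."""
    caps = sorted({GPU_CAPABILITY[name] for name in gpu_names})
    return ";".join(f"{major}.{minor}" for major, minor in caps)


if __name__ == "__main__":
    value = arch_list(["RTX 3090", "TITAN RTX"])
    # Print the shell line to run before `pip install -e .`
    print(f'export TORCH_CUDA_ARCH_LIST="{value}"')
```

If every card in the machine shares one capability, the single value used above (8.6) is enough.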
Download the training code:
git clone https://github.com/derrian-distro/LoRA_Easy_Training_Scripts.git
cd LoRA_Easy_Training_Scripts
git submodule init
git submodule update
cd sd_scripts
pip install --upgrade -r requirements.txt
1.2 Set path
Generally speaking, three paths need to be set: the base (large) model path, the image input path, and the output path.
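Before launching a long run, it can help to confirm that the three paths exist. This is a rough sketch only: check_paths and its parameter names (base_model, img_folder, output_folder) are hypothetical and may not match the variable names used in the training script.

```python
import tempfile
from pathlib import Path


def check_paths(base_model, img_folder, output_folder):
    """Return a list of problems; an empty list means all three paths exist."""
    problems = []
    if not Path(base_model).is_file():
        problems.append(f"base model file not found: {base_model}")
    if not Path(img_folder).is_dir():
        problems.append(f"image input folder not found: {img_folder}")
    if not Path(output_folder).is_dir():
        problems.append(f"output folder not found: {output_folder}")
    return problems


if __name__ == "__main__":
    # Demo with throwaway paths; replace them with your real ones.
    with tempfile.TemporaryDirectory() as tmp:
        model = Path(tmp) / "model.safetensors"
        model.touch()
        imgs = Path(tmp) / "imgs"
        imgs.mkdir()
        out = Path(tmp) / "out"  # deliberately not created, so it is reported
        for problem in check_paths(model, imgs, out):
            print(problem)
```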
Next, generate the training configuration file:
accelerate config
Answer the prompts according to your machine and training strategy. The answers used here were:
- This machine
- 1
- No
- NO
- NO
- NO
- 0,1
- fp16
After the configuration is complete, a training configuration file will be automatically generated.
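To double-check the generated file, a small script can read it back. The sketch below assumes the file is saved at accelerate's usual default location (~/.cache/huggingface/accelerate/default_config.yaml) and that it contains the top-level keys distributed_type, gpu_ids, num_processes and mixed_precision; both may differ between accelerate versions. parse_flat_yaml and check_multi_gpu are hypothetical helpers, and the parser only handles plain top-level "key: value" lines.

```python
from pathlib import Path

# Assumed default location of the file written by `accelerate config`.
DEFAULT_CONFIG = (
    Path.home() / ".cache" / "huggingface" / "accelerate" / "default_config.yaml"
)


def parse_flat_yaml(text):
    """Parse top-level 'key: value' lines only; this is not a full YAML parser."""
    values = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith((" ", "#")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def check_multi_gpu(values):
    """Return mismatches against a multi-GPU fp16 setup like the one above."""
    issues = []
    if values.get("distributed_type") != "MULTI_GPU":
        issues.append("distributed_type is not MULTI_GPU")
    gpu_ids = values.get("gpu_ids", "all")
    if gpu_ids != "all" and str(len(gpu_ids.split(","))) != values.get("num_processes"):
        issues.append("num_processes does not match the number of gpu_ids")
    if values.get("mixed_precision") != "fp16":
        issues.append("mixed_precision is not fp16")
    return issues


if __name__ == "__main__":
    if DEFAULT_CONFIG.exists():
        found = check_multi_gpu(parse_flat_yaml(DEFAULT_CONFIG.read_text()))
        print(found or "looks consistent")
    else:
        print(f"no config found at {DEFAULT_CONFIG}")
```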
1.3 Multi-card training
accelerate launch main.py
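The command above relies on the saved configuration. As an alternative sketch, the same choices can be passed explicitly on the command line. The flags used here (--multi_gpu, --num_processes, --gpu_ids, --mixed_precision) exist in recent accelerate releases, but availability depends on the installed version, so check accelerate launch --help first. build_launch_cmd is a hypothetical helper.

```python
import shlex


def build_launch_cmd(script="main.py", gpu_ids=(0, 1), mixed_precision="fp16"):
    """Build an `accelerate launch` argument list for the given GPU ids."""
    ids = ",".join(str(i) for i in gpu_ids)
    return [
        "accelerate", "launch",
        "--multi_gpu",
        "--num_processes", str(len(gpu_ids)),  # one process per card
        "--gpu_ids", ids,
        "--mixed_precision", mixed_precision,
        script,
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed or pasted into a shell.
    print(shlex.join(build_launch_cmd()))
    # To start training from Python instead:
    #   import subprocess; subprocess.run(build_launch_cmd(), check=True)
```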
With the same model and configuration, dual-card training takes 3:46 and single-card training takes 7:57, so the dual-card setup roughly halves the training time.
Dual-card time: 3:46
Single-card time: 7:57
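The speedup implied by these two timings can be worked out directly. The sketch below assumes both values are in minutes:seconds; to_seconds and speedup are illustrative helper names.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup(single, multi, n_cards):
    """Return (speedup factor, speedup divided by the number of cards)."""
    factor = to_seconds(single) / to_seconds(multi)
    return factor, factor / n_cards


if __name__ == "__main__":
    factor, per_card = speedup("7:57", "3:46", 2)
    print(f"speedup {factor:.2f}x, per-card efficiency {per_card:.0%}")
```

For the timings reported here, that is 477 s against 226 s, or about 2.1x on two cards.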
2. HyperNetwork model multi-card training
2.1 HyperNetwork training through WebUI
First select preprocessing, then select HyperNetwork training.
Troubleshooting
Multi-card training error
After executing the multi-card training command accelerate launch main.py, the following error occurs:
The cause is that xformers 0.0.18 corresponds to PyTorch 2.0.0, which is too new for this setup. Downgrade to PyTorch 1.13.0 and xformers 0.0.16, and stop using xformers, i.e. set self.xformers: bool = False.
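A quick way to spot this kind of mismatch is to compare the installed versions against the pairing described above. The sketch below is not an official compatibility matrix: KNOWN_PAIRS only encodes the two combinations mentioned in this write-up, using the 0.0.x numbering under which xformers releases are published, and the helper names are hypothetical.

```python
from importlib import metadata

# xformers version -> PyTorch major.minor it was paired with in this write-up.
KNOWN_PAIRS = {"0.0.16": "1.13", "0.0.18": "2.0"}


def matches(torch_version, xformers_version):
    """True if the torch version belongs to the series paired with this xformers."""
    want = KNOWN_PAIRS.get(xformers_version)
    return want is not None and torch_version.startswith(want + ".")


def installed(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    torch_v, xf_v = installed("torch"), installed("xformers")
    if torch_v is None or xf_v is None:
        print(f"torch={torch_v}, xformers={xf_v}: at least one is not installed")
    else:
        print(f"torch={torch_v}, xformers={xf_v}, known pairing: {matches(torch_v, xf_v)}")
```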