Refer to the following content: https://www.bilibili.com/video/BV1Qk4y1E7nv/?spm_id_from=333.337.search-card.all.click&vd_source=3969f30b089463e19db0cc5e8fe4583a
1. Two key steps to train a LoRA
The first step is to prepare the pictures used for training, i.e. high-quality images.
The second step is to tag these pictures, i.e. write precise tags for them.
2. Picture requirements
The recommended quantity is 20-50 pictures, with a maximum of 100.
Avoid bad pictures: blurry images, motion blur, occluded faces, and complex backgrounds (remove the background if possible).
Resolution: if SD2 is used as the base model, images need to be 768*768 or larger.
Batch image resizing: https://www.birme.net/?target_width=512&target_height=512
Batch image format conversion: https://www.wdku.net/image/imageformat
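Tools like birme typically center-crop to a square before scaling to the target size. As a minimal sketch (not birme's actual code), the crop arithmetic looks like this; with Pillow installed you could then apply `img.crop(box).resize((512, 512))`:

```python
def center_crop_box(width, height):
    """Compute the largest centered square crop of a width x height image.
    Returns (left, top, right, bottom) in the original image's pixels;
    scaling that square to e.g. 512*512 then introduces no distortion."""
    side = min(width, height)      # biggest square that fits
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)

# Example: a 1920x1080 photo -> crop box (420, 0, 1500, 1080)
box = center_crop_box(1920, 1080)
```

The same box works for any square target (512*512 for SD1.5, 768*768 for SD2); only the final resize differs.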
3. Image tagging
Two plug-ins need to be installed: Tagger and dataset tag editor (address: https://github.com/toshiaki1729/stable-diffusion-webui-dataset-tag-editor )
(1) Tagger plug-in
For each image it generates a txt file containing the tag information; the input directory is usually the same as the output directory.
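Since each image should end up with a same-named txt caption next to it, a quick stdlib check (folder path and extensions are illustrative, not from the source) can catch images the Tagger pass missed:

```python
from pathlib import Path

def missing_captions(folder, exts=(".jpg", ".jpeg", ".png")):
    """Return image files in `folder` that lack a same-named .txt caption,
    which would mean the tagging pass skipped them."""
    folder = Path(folder)
    missing = []
    for img in sorted(folder.iterdir()):
        if img.suffix.lower() in exts and not img.with_suffix(".txt").exists():
            missing.append(img.name)
    return missing
```

An empty return list means every training image has a caption and the set is ready for tag editing.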
(2)Dataset Tag Editor
Process the tags:
1) Remove duplicate tags.
2) Delete the tags that describe the character's inherent features, such as eyes, eyebrows, nose, hair length and other attributes that belong to the character itself. Tags bound to the character must be deleted, because later we want these features generated directly from the LoRA name: the model should learn them under the LoRA name without needing extra prompt words.
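The two operations above (deduplicating and dropping character-feature tags) can be sketched in plain Python; the feature list below is a made-up example, not taken from the source:

```python
def clean_tags(line, character_tags):
    """Deduplicate comma-separated tags (keeping first-seen order) and drop
    any tag describing an inherent character feature, so those features get
    bound to the LoRA trigger word instead of to explicit prompt words."""
    seen, kept = set(), []
    for tag in (t.strip() for t in line.split(",")):
        if tag and tag not in seen and tag not in character_tags:
            seen.add(tag)
            kept.append(tag)
    return ", ".join(kept)

# Hypothetical character-feature tags to strip for this subject
features = {"black hair", "brown eyes", "long hair"}
clean_tags("1girl, black hair, smile, 1girl, long hair", features)
# -> "1girl, smile"
```

In practice the Dataset Tag Editor UI does this for you; the sketch just makes the rule explicit.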
Refer to the following content: https://www.jianshu.com/p/e8cb3ba45b1a
4. Training
Install kohya, a graphical training tool written by a Japanese developer.
(1) Download
Project address: https://github.com/bmaltais/kohya_ss
The location on the server after downloading: /data/work/xiehao/kohya_ss
(2) Install project dependencies
Enter the directory and install the dependencies: pip install -r requirements.txt
(3) Generate configuration files for execution
Execute the accelerate config command; my configuration is as follows:
(4) Start the training graphical interface
Execute the command: python kohya_gui.py --listen 0.0.0.0 --server_port 12348 --inbrowser
5. Hands-on walkthrough
(1) Downloaded 25 pictures of zhangluyi from Baidu
(2) Crop all pictures to 768*768
https://www.birme.net/?target_width=768&target_height=768
(3) Convert all pictures to jpg format
https://www.wdku.net/image/imageformat
(4) Use the Tagger plugin to extract tags
Batch extraction is done as follows:
After execution, the corresponding txt files are generated on Linux.
(5) Processing tags through Dataset Tag Editor
First, remove duplicate tags and character-feature tags.
Then, save this modification.
(6) Name the training set directory for the SD training module
The generated file information is as follows:
These files need to be placed in the 10_zly directory. The number before the underscore in the directory name is the number of times the network trains on each image per pass. The naming of this directory is very important; it took me an hour to track down this bug.
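Since kohya reads the repeat count from the "<repeats>_<name>" folder prefix, a small sketch can validate the name and estimate total optimizer steps (the epoch and batch-size values below are illustrative, not from the source):

```python
import re

def repeats_from_dirname(dirname):
    """Parse the repeat count kohya encodes as the '<number>_' prefix
    of a dataset folder name, e.g. '10_zly' -> 10."""
    m = re.match(r"^(\d+)_", dirname)
    if not m:
        raise ValueError(f"folder {dirname!r} lacks the required '<repeats>_' prefix")
    return int(m.group(1))

def total_steps(n_images, dirname, epochs, batch_size):
    """Rough total optimizer steps: images * repeats * epochs / batch size."""
    return n_images * repeats_from_dirname(dirname) * epochs // batch_size

total_steps(25, "10_zly", 4, 2)  # 25 * 10 * 4 / 2 = 500
```

A folder named just "zly" raises an error, which mirrors the failure mode described above: without the numeric prefix, kohya cannot find the repeat count.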
(7) Training in kohya
After completing the data set preparation, it can be trained in kohya.
First, configure the base model information.
The path given in "Pretrained model name or path" must point to a model directory in diffusers format, containing model_index.json, the tokenizer directory, and so on; a single safetensors file is not enough. https://huggingface.co/digiplay/majicMIX_realistic_v4 (18G) can be downloaded via git lfs clone.
This point is very important; locating the problem and downloading the model took me several hours.
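Before starting a run, it is worth checking that the model path really has the diffusers layout rather than a lone checkpoint file. A minimal heuristic check (the two entries tested are the ones named above; a full diffusers directory has more):

```python
import os

def looks_like_diffusers_dir(path):
    """Heuristic: a diffusers-format model directory contains
    model_index.json and a tokenizer subdirectory, unlike a folder
    holding only a single .safetensors checkpoint."""
    return (os.path.isfile(os.path.join(path, "model_index.json"))
            and os.path.isdir(os.path.join(path, "tokenizer")))
```

If this returns False for your model path, kohya will fail to load it and you will lose time the same way the author did.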
Then, configure the training directory
Next, configure the training parameters
The Optimizer cannot be left at the default value. Currently, only the following 5 types are supported in the source code:
Try them one by one to see which one does not report an error.
After successful execution, the log is shown in the figure below. Training takes about 6G of GPU memory, the training time is about 20 minutes, and the final LoRA file is about 10M.
(8) Test the LoRA model's effect in the Stable Diffusion webui
After training is complete, put the LoRA file under the SD root directory at extensions/sd-webui-additional-networks/models/lora
The interface operation on Webui is as follows: