Stable Diffusion + ControlNet + LoRA: A Full WebUI Workflow Tutorial for AI-Assisted Art Design


New AI painting tools are often hard for designers to pick up. This article gives a comprehensive walkthrough of the currently popular Stable Diffusion + ControlNet toolchain for AI-guided art design. Many people sense that, in the future, either designers will be replaced by graphics programmers, or designers who can use AI tools will replace traditional designers. Since 2023, AI-assisted (or even AI-led) design has become common practice.

Software and hardware environment:
OS: Ubuntu 20.04 (Stable Diffusion development requires a Linux environment; purely web-based use also works under Windows)
CPU: AMD Ryzen 5800, 8 cores / 16 threads
GPU: NVIDIA RTX 3090 × 2
RAM: 16 GB × 4
PyTorch (GPU): 1.13
CUDA: 11.7

1. Background knowledge

1.1 Stable Diffusion background knowledge

1.1.1 Install stable-diffusion-webui

Since the author's system is Linux, the following setup is performed according to the instructions in the official repository ( https://github.com/AUTOMATIC1111/stable-diffusion-webui ):

# Debian-based:
sudo apt install wget git python3 python3-venv
bash <(wget -qO- https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh)

After downloading stable-diffusion-webui, you need to install the gfpgan package separately ( https://gitcode.net/mirrors/TencentARC/GFPGAN?utm_source=csdn_github_accelerator ); the installation method is as follows:

git clone https://github.com/TencentARC/GFPGAN.git
cd GFPGAN
# Install basicsr - https://github.com/xinntao/BasicSR
# We use BasicSR for both training and inference
pip install basicsr

# Install facexlib - https://github.com/xinntao/facexlib
# We use face detection and face restoration helper in the facexlib package
pip install facexlib

pip install -r requirements.txt
python setup.py develop

# If you want to enhance the background (non-face) regions with Real-ESRGAN,
# you also need to install the realesrgan package
pip install realesrgan

After installation, move the GFPGAN directory under the stable-diffusion-webui directory and rename it to gfpgan at the same time. Note that if the directory is not renamed, the package will not be found.
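For example, a minimal sketch assuming GFPGAN was cloned next to stable-diffusion-webui:

# Move GFPGAN under the webui directory and rename it to gfpgan,
# otherwise the webui will not find the package
mv GFPGAN stable-diffusion-webui/gfpgan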
Then run the following command and wait for other environment dependencies to be automatically installed:

./webui.sh 

It may take a while to install the contents of requirements.txt here:

1.2 ControlNet background knowledge

2. How to use

At present, most AI art-design tools are used through interactive web interfaces: after connecting to the Internet, the request is sent to a cloud GPU server run by the AI company, which returns the result to the user once the computation finishes. Most of these services impose usage-frequency or feature limits, or charge relatively high fees. This section describes how to do web-based interactive AI drawing on a local GPU workstation instead.

2.1 Environment configuration

Download the following four source-code/model files (a set of sample download commands follows the list):

  1. The web UI for SD used throughout this tutorial (third party, unofficial):
    stable-diffusion-webui

  2. The official SD v1.5 model:
    runwayml/stable-diffusion-v1-5

  3. The ControlNet extension for the web UI (third party, unofficial):
    Mikubill/sd-webui-controlnet

  4. The official ControlNet models:
    lllyasviel/ControlNet/tree/main/models
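The following commands show one possible way to fetch these four items (an illustrative sketch; the Hugging Face repositories and exact file names may have moved or changed since writing):

# 1. The web UI itself
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git

# 2. The official SD v1.5 checkpoint
wget https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors

# 3. The ControlNet extension for the web UI
git clone https://github.com/Mikubill/sd-webui-controlnet.git

# 4. One of the official ControlNet models (repeat for the other modes you need)
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_canny.pth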

After downloading, first enter the stable-diffusion-webui directory, and note that the Mikubill/sd-webui-controlnet source code must be placed under the extensions directory:

Then copy the downloaded ControlNet models into the models directory of the extension under extensions:

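As a concrete sketch of the layout described above (assuming the default clone names and that the files downloaded in section 2.1 sit one level above stable-diffusion-webui):

cd stable-diffusion-webui

# SD v1.5 checkpoint goes into the main checkpoint directory
cp ../v1-5-pruned-emaonly.safetensors models/Stable-diffusion/

# ControlNet extension source goes under extensions/
cp -r ../sd-webui-controlnet extensions/

# ControlNet models go into the extension's own models directory
cp ../control_sd15_canny.pth extensions/sd-webui-controlnet/models/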

2.2 Run WebUI

Execute the script on the command line (be careful not to use the sudo command, otherwise it will fail):

./webui.sh 

Next, the script automatically installs the environment and loads the models. After loading, a local web URL is printed. Visit this URL to use the interactive interface in a local browser:
Copy the URL and open it in a browser to get the SD WebUI interactive interface with ControlNet support. The components on this page can be used for fast, local, interactive design and development.
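If the browser runs on a different machine than the GPU workstation, or you want to call the UI programmatically, webui.sh forwards command-line flags to the launcher. A minimal sketch (flag names follow the AUTOMATIC1111 documentation; check the project wiki for the authoritative list):

# Listen on all network interfaces, expose the HTTP API, and enable xformers attention
./webui.sh --listen --port 7860 --api --xformers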

3. Detailed parameter explanation

3.1 Detailed Explanation of Stable Diffusion Parameters

Sampling method: the sampling algorithm used during denoising

Sampling steps: number of sampling iterations

Restore faces: face restoration

Tiling: generate seamlessly tileable textures

Highres. fix: two-pass high-resolution fix

Firstpass width: width of the initial low-resolution pass

Firstpass height: height of the initial low-resolution pass

CFG scale: smaller values give the AI more freedom and diversity, larger values constrain it more tightly to the prompt

Seed: random seed

Variation seed: a sub-seed varied on top of the original seed

Denoising strength: how far the result may deviate from the original picture
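Most of these parameters map directly onto the JSON payload of the webui's txt2img HTTP API, which becomes available when the UI is started with the --api flag. A minimal sketch, assuming the server runs on 127.0.0.1:7860 and using placeholder prompts (denoising strength mainly matters for Highres. fix and img2img):

curl -s http://127.0.0.1:7860/sdapi/v1/txt2img \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a watercolor landscape, soft light",
    "negative_prompt": "lowres, blurry",
    "sampler_name": "Euler a",
    "steps": 28,
    "cfg_scale": 7,
    "width": 512,
    "height": 512,
    "seed": 42,
    "restore_faces": true,
    "tiling": false,
    "denoising_strength": 0.6
  }'

The response is JSON with the generated images base64-encoded under the "images" key.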

3.2 Detailed Explanation of ControlNet Parameters

  1. 2D redrawing
    Canny Edge
    HED Boundary
    M-LSD Lines
    Fake Scribbles

  2. Specialized fields
    Depth Map
    Normal Map
    Semantic Segmentation
    Human Pose
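As an illustration of how one of these modes can be driven programmatically, the Mikubill extension adds an alwayson_scripts section to the same txt2img API payload. The field names below reflect that extension's API at the time of writing and may differ between versions, so treat this purely as a sketch (BASE64_IMAGE is a placeholder, and the model name must match a file placed in extensions/sd-webui-controlnet/models/):

# Sketch: a Canny Edge ControlNet unit attached to a txt2img call
curl -s http://127.0.0.1:7860/sdapi/v1/txt2img \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a modern living room, interior design render",
    "steps": 28,
    "alwayson_scripts": {
      "controlnet": {
        "args": [{
          "input_image": "BASE64_IMAGE",
          "module": "canny",
          "model": "control_sd15_canny",
          "weight": 1.0
        }]
      }
    }
  }'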

4. Customization tips

4.1 Parameter tricks

Suggestions for in-depth training of realistic LoRA models (a sample training command follows this list):
General usage: use the same base model (checkpoint) that the LoRA was trained on; ideally use the same parameters as the LoRA author; set the LoRA weight correctly (0.8~0.9, below 1); add the trigger words to the prompt; using more LoRAs is not necessarily better.
1. Total number of training steps: for a 50-image dataset, about 15,000 steps of in-depth training is recommended; for larger datasets, the DAdaptation optimizer can be used to probe the optimal total step count.
2. Training epochs: a preset of 10 (or 5) epochs is recommended, with 20~30 repeats per image within a single epoch.
3. Training resolution: 768x1024 is recommended, adjusted according to available video memory.
4. Training base model: chilloutmix_NiPrunedFp32Fix; an SD 1.5-based model is recommended.
5. Text Encoder learning rate: mainly affects robustness, generalization, and fitting; too low a value makes it hard to replace features.
6. Unet learning rate: mainly affects how closely the result resembles the subject; it influences the loss and the degree of fitting; raise it when underfitting, lower it when overfitting.
7. Relationship between the Text Encoder and Unet learning rates: there is no mandatory 1/5~1/10 ratio; on a very large dataset the Unet rate can even be lower than the Text Encoder rate.
8. Network Rank (Dimension, network size): strengthens training of details; 128~192 is recommended, and gains above 128 are relatively small.
9. Network Alpha: above 96 is recommended; it weakens training of details, has a regularizing effect, and can be increased in step with the Dim value.
10. Let AI train AI: for the first run use DAdaptation with all learning rates set to 1, then adjust the learning rates manually to reach a similar balance.
12. Loss control: a lower loss means a better fit, but also makes it harder for the model to change features, and may even affect poses and expressions.
13. Lion optimizer: not recommended for in-depth training; it fits very quickly and can produce a strong likeness, but versatility suffers.
14. Local in-depth training: the run can be monitored through remote-control software, so that if the learning rate turns out to be unsuitable during training it can be modified remotely.
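The article does not name a particular trainer, but many of the suggestions above correspond to flags of the widely used kohya-ss/sd-scripts LoRA trainer. The following is a minimal sketch under that assumption; all paths are placeholders and flag defaults may differ between versions:

# Sketch: LoRA training run reflecting the suggestions above
# (chilloutmix base model, 768x1024 resolution, dim 128 / alpha 96,
#  separate Unet and Text Encoder learning rates, 10 epochs)
accelerate launch train_network.py \
  --pretrained_model_name_or_path="/models/chilloutmix_NiPrunedFp32Fix.safetensors" \
  --train_data_dir="/data/lora_dataset" \
  --output_dir="/output/my_lora" \
  --resolution="768,1024" \
  --network_module="networks.lora" \
  --network_dim=128 \
  --network_alpha=96 \
  --unet_lr=1e-4 \
  --text_encoder_lr=1e-5 \
  --max_train_epochs=10 \
  --train_batch_size=1 \
  --mixed_precision="fp16" \
  --save_every_n_epochs=1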

5. Reference sources

How to train a highly realistic LoRA model (in-depth discussion)

[2023 latest] LoRA installation and training guide

Still unsure about LoRA training sets? Practical tips and tagging explained
