AIGC text-to-image: stable-diffusion-webui deployment and use

1 Introduction to stable-diffusion-webui

Stable Diffusion Web UI is an application built on top of Stable Diffusion. It uses the gradio module to build an interactive interface, giving immediate, low-code GUI access to Stable Diffusion.

  • Stable Diffusion is an image-generating AI capable of rendering nearly any concept imaginable in visual form, guided by nothing more than a text prompt.
  • Stable Diffusion Web UI provides a variety of functions, such as txt2img, img2img and inpaint, and also bundles many extras such as model merging and image quality restoration.
  • Different effects can be produced by adjusting the parameters, so users can create according to their own needs and preferences.
  • The Web UI can also be used to train our own models; it supports several training methods, and by mastering them we can build custom models.

2 stable-diffusion-webui installation and deployment

2.1 conda environment installation

For conda environment preparation, see: Anaconda

2.2 Construction of the operating environment

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui
conda create -n sdw python=3.9
conda activate sdw 

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
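Before moving on, it is worth confirming that the CUDA build of PyTorch was installed correctly (a minimal check, assuming the sdw environment created above is active):

# should print the torch version and True if the GPU is visible
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"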

Note: select the PyTorch build that matches your local CUDA environment; download address: address

2.3 Model download

(1) Model file

Model download address: address. After the download is complete, store the file in the following directory:

mkdir -p models/Stable-diffusion/

mv /opt/v1-5-pruned-emaonly.safetensors models/Stable-diffusion/
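If the checkpoint is not yet on the machine, it can also be fetched from the command line; a sketch, assuming v1-5-pruned-emaonly.safetensors is downloaded from the runwayml/stable-diffusion-v1-5 repository on Hugging Face (substitute whichever mirror you actually use):

# download the 1.5 checkpoint into /opt before moving it into place
wget -O /opt/v1-5-pruned-emaonly.safetensors \
  https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors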

(2) Download through proxy to solve the problem of slow download or download failure

[root@localhost stable-diffusion-webui]# vi modules/launch_utils.py 

Replace https://github.com with https://ghproxy.com/https://github.com, i.e. proxy GitHub downloads through ghproxy.

The modified content is as follows:

def prepare_environment():
    torch_index_url = os.environ.get('TORCH_INDEX_URL', "https://download.pytorch.org/whl/cu118")
    torch_command = os.environ.get('TORCH_COMMAND', f"pip install torch==2.0.1 torchvision==0.15.2 --extra-index-url {torch_index_url}")
    requirements_file = os.environ.get('REQS_FILE', "requirements_versions.txt")

    xformers_package = os.environ.get('XFORMERS_PACKAGE', 'xformers==0.0.20')
    gfpgan_package = os.environ.get('GFPGAN_PACKAGE', "https://ghproxy.com/https://github.com/TencentARC/GFPGAN/archive/8d2447a2d918f8eba5a4a01463fd48e45126a379.zip")
    clip_package = os.environ.get('CLIP_PACKAGE', "https://ghproxy.com/https://github.com/openai/CLIP/archive/d50d76daa670286dd6cacf3bcd80b5e4823fc8e1.zip")
    openclip_package = os.environ.get('OPENCLIP_PACKAGE', "https://ghproxy.com/https://github.com/mlfoundations/open_clip/archive/bb6e834e9c70d9c27d0dc3ecedeebeaeb1ffad6b.zip")

    stable_diffusion_repo = os.environ.get('STABLE_DIFFUSION_REPO', "https://ghproxy.com/https://github.com/Stability-AI/stablediffusion.git")
    k_diffusion_repo = os.environ.get('K_DIFFUSION_REPO', 'https://ghproxy.com/https://github.com/crowsonkb/k-diffusion.git')
    codeformer_repo = os.environ.get('CODEFORMER_REPO', 'https://ghproxy.com/https://github.com/sczhou/CodeFormer.git')
    blip_repo = os.environ.get('BLIP_REPO', 'https://ghproxy.com/https://github.com/salesforce/BLIP.git')

    stable_diffusion_commit_hash = os.environ.get('STABLE_DIFFUSION_COMMIT_HASH', "cf1d67a6fd5ea1aa600c4df58e5b47da45f6bdbf")
    k_diffusion_commit_hash = os.environ.get('K_DIFFUSION_COMMIT_HASH', "c9fe758757e022f05ca5a53fa8fac28889e4f1cf")
    codeformer_commit_hash = os.environ.get('CODEFORMER_COMMIT_HASH', "c5b4593074ba6214284d6acd5f1719b6c5d739af")
    blip_commit_hash = os.environ.get('BLIP_COMMIT_HASH', "48211a1594f1321b00f14c9f7a5b4813144b2fb9")
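Instead of editing launch_utils.py by hand, the same replacement can be applied in one step with sed (a sketch; run it only once so the prefix is not doubled):

# prepend the ghproxy mirror to every GitHub URL in launch_utils.py
sed -i 's#https://github.com#https://ghproxy.com/https://github.com#g' modules/launch_utils.py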

2.4 Running the webui

(1) Allow the script to run as the root user (skip this step if you do not run as root)

When running as the root user, the following error is reported:

 [root@localhost stable-diffusion-webui]# sh webui.sh 

################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye)
################################################################

################################################################
ERROR: This script must not be launched as root, aborting...
################################################################

Solved by modifying webui.sh

vi webui.sh

Modify line 51 of the file as follows

# by default the script refuses to run as root; set this to 1 to allow it
can_run_as_root=1
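The same edit can be made non-interactively (a sketch, assuming the variable in your copy of webui.sh is still named can_run_as_root and defaults to 0):

# flip can_run_as_root from 0 to 1 so webui.sh no longer aborts under root
sed -i 's/can_run_as_root=0/can_run_as_root=1/' webui.sh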

 (2) Start the project

For local access only, the webui is bound to the 127.0.0.1 address, and the startup command is as follows:

sh webui.sh

For LAN access, the webui is bound to 0.0.0.0. The startup command is as follows:

sh webui.sh --listen

For LAN access with --xformers added: the xformers component is installed automatically at startup and accelerates GPU operations:

sh webui.sh --listen --xformers

Commonly used parameters are as follows:
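A few launch flags worth knowing (a non-exhaustive sketch; see modules/cmd_args.py in the repository for the full list):

sh webui.sh --listen                 # bind to 0.0.0.0 so other hosts on the LAN can connect
sh webui.sh --listen --port 7861     # serve on a port other than the default 7860
sh webui.sh --listen --xformers      # install/enable xformers to speed up attention on the GPU
sh webui.sh --listen --medvram       # trade speed for lower VRAM use (--lowvram is more aggressive)
sh webui.sh --listen --api           # expose the HTTP API under /sdapi/v1/ (used in the examples below)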

3 stable-diffusion-webui usage

The startup interface can be roughly divided into four areas: [Model], [Function], [Parameter] and [Picture].

(1) Model area: The model area is used to switch between the models we need. Downloaded model files (safetensors, ckpt and pt) should be placed in the models/Stable-diffusion directory; after clicking the refresh arrow in the model area, they become available for selection.

(2) Function area: The function area is used to switch between the corresponding features; after installing a plug-in and reloading the UI, a shortcut entry for that plug-in is added here as well. The common functions are described as follows:

  • txt2img (text-to-image) --- generate images from a text prompt;
  • img2img (image-to-image) --- generate an image from a reference image combined with a text prompt;
  • Extras (more) --- optimize (upscale, enhance) images;
  • PNG Info --- show basic image information;
  • Checkpoint Merger --- merge models;
  • Textual inversion --- train a model on a particular image style;
  • Settings --- modify default parameters.

(3) Parameter area: Depending on the function module you choose, the parameters that need adjusting also differ. For example, in the text-to-image (txt2img) module you can configure settings such as the number of sampling iterations, mask probabilities and the image dimensions.

(4) Picture area: The picture area shows the final result of the AI drawing. Here we can also see the parameters and other information used for generation.

3.1 txt2img (text-to-image)

On this page you can enter text, select a model and configure other parameters. The text prompt is required and is the basis for image generation. You can choose a predefined model or upload your own, and set other parameters such as batch size and output image dimensions. The main parameters are described below, followed by a small API example:

  • Sampling method (Sampler): selects the sampling method used to generate the image. By default it is set to "Euler a", but you can also choose one of the newer "DPM++" options, which tend to produce more detailed images than the default.
  • Sampling steps: specifies the number of iterations used for image generation. More iterations may give better image quality but also take longer to complete; a value around 50 is a common starting point.
  • Width & Height: specify the width and height of the generated image. Larger sizes require more GPU memory and compute; the default here is 512×512. If the image needs to be enlarged, it can be sent to the Extras module (send to extras) and upscaled with an upscaling algorithm.
  • Batch count: specifies how many times generation is run, i.e. how many batches of images are produced. Increasing it generates more images but takes longer (if you need many images, it is recommended to lower the batch count and raise the number of images generated per batch instead).
  • Batch size: specifies the maximum number of images generated in a single batch. This is useful if system resources are limited and images must be generated in smaller batches.
  • Prompt relevance (CFG Scale): controls how closely the image follows the prompt. Increasing it makes the image match the prompt more closely, but values that are too high over-saturate the colors; smaller values give the AI more freedom and are more likely to produce creative results (default 7).
  • Seed: specifies the random seed used to initialize the image generation process. The same seed value produces the same set of images each time, which is useful for reproducibility and consistency. If the seed is left at -1, a new random seed is generated each time txt2img is run.
  • Restore faces: can be selected to improve the drawing of faces. Close-up portraits tend to overfit and blur, so this option is most useful when the face is relatively far away in the frame.
  • Tiling: used to generate an image that can be tiled seamlessly.
  • Highres. fix: uses a two-step process: first create an image at a smaller resolution, then upscale it and refine the details without changing the composition. Selecting it exposes additional parameters. Scale latent upscales the image in latent space; the alternative is to decode the full image from the latent representation, upscale it, and encode it back into latent space. Denoising strength determines how much of the original content the algorithm keeps: at 0 nothing changes, while at 1 you get an unrelated image.
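To make the parameters above concrete, here is a minimal call to the txt2img HTTP API; a sketch that assumes the webui was started with --api and that jq is installed for JSON handling:

# generate one 512x512 image with the settings discussed above and save it locally
curl -s -X POST http://127.0.0.1:7860/sdapi/v1/txt2img \
  -H "Content-Type: application/json" \
  -d '{
        "prompt": "a portrait of a girl with black hair, highly detailed",
        "negative_prompt": "lowres, blurry",
        "sampler_name": "Euler a",
        "steps": 50,
        "width": 512,
        "height": 512,
        "cfg_scale": 7,
        "seed": -1,
        "batch_size": 1,
        "n_iter": 1
      }' | jq -r '.images[0]' | base64 -d > txt2img-out.png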

3.2 img2img (image-to-image)

The img2img function lets you use an existing image as the starting point in the stable diffusion web ui and generate new images with similar composition and color, or transform only a specified part of the content.

Compared with txt2img, img2img adds a resize mode and a redrawing strength (Denoising strength) setting. In the example above, a black-hair prompt was added, the image size was adjusted, and Denoising strength was set to 1 to regenerate a black-haired heroine picture. The relevant parameters are described below, followed by a small API example:

  • Resize mode (Scaling mode): controls how the output is fitted once the image size is changed (options: stretch / crop / fill / just resize);
  • Denoising strength (redraw strength): how freely the result may deviate from the reference image; the lower the value, the closer the output stays to the reference, and values below about 0.3 essentially just apply a filter;
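The same settings can be exercised through the img2img endpoint; a sketch, again assuming --api and jq, with input.png standing in for any local reference image:

# redraw input.png with moderate freedom; resize_mode 0 corresponds to "just resize"
curl -s -X POST http://127.0.0.1:7860/sdapi/v1/img2img \
  -H "Content-Type: application/json" \
  -d "{
        \"init_images\": [\"$(base64 -w0 input.png)\"],
        \"prompt\": \"black hair\",
        \"denoising_strength\": 0.6,
        \"resize_mode\": 0,
        \"width\": 512,
        \"height\": 512
      }" | jq -r '.images[0]' | base64 -d > img2img-out.png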

3.3 Inpaint (redraw)

The inpaint function lets you redraw a manually masked part of a picture in the stable diffusion web ui. If a picture is acceptable overall but broken in the details, you can start a partial redraw here: the web ui creates a mask layer for you, and you paint over the area that needs to be repaired with the mouse. Clicking the "Generate" button then produces a new picture from the mask layer plus the original and displays it on the right (there is an official inpainting model based on 1.5 dedicated to this). Besides inpainting, the ChatGPT-like pix2pix plug-in can also be used to modify pictures. The main options are described below, followed by a small API example:

  • Mask blur: controls how soft the edges of the mask are. The greater the mask blur, the blurrier the mask edge, the smaller the color difference between adjacent pixels, and the smoother and more natural the repaired image looks. Conversely, with less blur the mask edges are sharper and the inpainted result is likely to be sharper and more detailed. Note that a very large mask blur can lose detail in the repaired area, so blur has to be balanced against detail.
  • Mask mode:
  • Inpaint masked (redraw masked content): redraw the area we covered with the mask (e.g. generate a new scarf);
  • Inpaint not masked (redraw non-masked content): keep the masked area (the scarf in this example) and redraw the rest of the image to produce a new picture.
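A masked redraw can also be driven through the same img2img endpoint by adding a mask image to the payload; a sketch in which mask.png is a black-and-white image whose white region marks the area to repaint:

# inpaint only the white region of mask.png; inpainting_mask_invert 0 means "redraw masked content"
curl -s -X POST http://127.0.0.1:7860/sdapi/v1/img2img \
  -H "Content-Type: application/json" \
  -d "{
        \"init_images\": [\"$(base64 -w0 input.png)\"],
        \"mask\": \"$(base64 -w0 mask.png)\",
        \"prompt\": \"a red scarf\",
        \"mask_blur\": 4,
        \"inpainting_mask_invert\": 0,
        \"denoising_strength\": 0.75
      }" | jq -r '.images[0]' | base64 -d > inpaint-out.png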

3.4 Extras (image scaling)

The Extras section provides some additional functions, including super resolution. For anime-style (two-dimensional) images there are dedicated upscaling algorithms that need to be loaded separately. How to use these functions is introduced below.

Super Resolution

The super-resolution feature increases the clarity and detail of a low-resolution image by upscaling it toward a high-resolution one. To use it, input a low-resolution image and specify the factor by which to enlarge it.

The steps to use the super-resolution function are as follows (an API equivalent is sketched after the steps):

Step 1: Open Stable Diffusion and click the "Extras" button.

Step 2: Upload a low-resolution image in the "Image" field.

Step 3: Select the corresponding scaling algorithm. You can choose 2, 3 or 4 times, or specify a custom multiple.

Step 4: Click the "Process" button and wait for the processing to complete to view the resulting high-resolution image.
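The same upscaling can be scripted through the extra-single-image endpoint; a sketch that assumes --api and an upscaler name that exists in your installation (R-ESRGAN 4x+ is used here as an example):

# upscale low-res.png by a factor of 4 and save the result
curl -s -X POST http://127.0.0.1:7860/sdapi/v1/extra-single-image \
  -H "Content-Type: application/json" \
  -d "{
        \"image\": \"$(base64 -w0 low-res.png)\",
        \"upscaling_resize\": 4,
        \"upscaler_1\": \"R-ESRGAN 4x+\"
      }" | jq -r '.image' | base64 -d > high-res.png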

4 Troubleshooting

4.1 Cannot locate TCMalloc problem

[root@localhost stable-diffusion-webui]# sh webui.sh --listen

################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye)
################################################################

Cannot locate TCMalloc (improves CPU memory usage)

Solved by the following command:

yum install gperftools gperftools-devel
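You can confirm the library is now discoverable (a quick check; webui.sh locates TCMalloc through ldconfig):

# should now print one or more libtcmalloc entries
ldconfig -p | grep -i tcmalloc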

Restart after the installation is complete and the error disappears.

4.2 "Couldn't determine Stable Diffusion's hash" problem

If the git version is 1.8, upgrade git first:

Uninstall the old version of Git:
sudo yum -y remove git
sudo yum -y remove git-*

Add the Endpoint repository to CentOS 7:
sudo yum -y install https://packages.endpointdev.com/rhel/7/os/x86_64/endpoint-repo.x86_64.rpm

Install Git:
sudo yum -y install git

Then mark each git-related directory as a safe directory in turn (a one-loop equivalent is shown after the individual commands):

git config --global --add safe.directory /opt/stable-diffusion-webui/repositories/stable-diffusion-stability-ai
 
git config --global --add safe.directory /opt/stable-diffusion-webui/repositories/CodeFormer
 
git config --global --add safe.directory /opt/stable-diffusion-webui/repositories/k-diffusion
 
git config --global --add safe.directory /opt/stable-diffusion-webui/repositories/BLIP
 
git config --global --add safe.directory /opt/stable-diffusion-webui/repositories/taming-transformers
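Equivalently, all repository directories can be registered in one loop (a sketch, assuming the webui is installed under /opt/stable-diffusion-webui):

# mark every checked-out repository under repositories/ as a safe git directory
for d in /opt/stable-diffusion-webui/repositories/*/; do
    git config --global --add safe.directory "${d%/}"
done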

The solution is detailed in: Couldn't determine Stable Diffusion's hash problem solving

Origin: blog.csdn.net/lsb2002/article/details/131657117