Building Stable Diffusion XL

This article mainly introduces how to build a Stable Diffusion XL setup from scratch for AI painting.

Stable Diffusion XL is an optimized version of Stable Diffusion released by Stability AI. Compared with Stable Diffusion, Stable Diffusion XL has been comprehensively optimized. Rocky believes that Stable Diffusion will become the "YOLO" of the image-generation field, and Stable Diffusion XL is its "YOLOv3".

Building the Stable Diffusion XL inference process with ComfyUI, starting from scratch

ComfyUI is a node-based Stable Diffusion AI painting tool. Compared with Stable Diffusion WebUI, ComfyUI splits the Stable Diffusion generation and inference pipeline into independent nodes, enabling more precise workflow customization and clear reproducibility.

At the same time, its efficient model-loading and image-generation mechanisms allow a full Stable Diffusion XL workflow to run on a 2080 Ti graphics card and generate images at 1024x1024 resolution. It is very friendly to limited compute budgets and is good news for beginners.

At present, ComfyUI supports both the Base model and the Refiner model of Stable Diffusion XL. The following two figures show Rocky's complete pipelines for loading the Stable Diffusion XL Base model and the Stable Diffusion XL Base + Refiner models in ComfyUI and generating images: ComfyUI loading the Stable Diffusion XL Base model; ComfyUI loading the Stable Diffusion XL Base + Refiner models.

Below, Rocky will walk you step by step through using ComfyUI to build the Stable Diffusion XL inference process and reproduce the two workflows shown above.

First, we need to install the ComfyUI framework. This step is very simple; just enter the following command on the command line:

git clone https://github.com/comfyanonymous/ComfyUI.git  

After installation, we can see the local ComfyUI folder.

After the ComfyUI framework is installed locally, we need to install its dependent libraries. We only need to do the following:

cd ComfyUI  # enter the downloaded ComfyUI folder  
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple 

After completing these configurations, we can set up the model. We place the Stable Diffusion XL model under the ComfyUI/models/checkpoints/ path. This way, after opening the visualization interface, we can select the Stable Diffusion XL model for AI painting.

Next, we can start ComfyUI! We go to the ComfyUI/ path and run main.py:

python main.py --listen --port 8888

Once it is up and running, you can see the following log on the command line:

To see the GUI go to: http://0.0.0.0:8888  

We enter http://0.0.0.0:8888 into our local browser to open the ComfyUI visual interface shown in the figure above, and happily use the Stable Diffusion XL model to generate the images we want.

Next is an explanation of ComfyUI's node-based modules. First, only the Base model is loaded: annotated diagram of the Stable Diffusion XL Base model workflow.

Rocky has added detailed annotations. First, select the model (Stable Diffusion XL Base) in the red box, then fill in the prompt and negative prompt and configure the inference parameters (number of iterations, CFG, seed, etc.), then set the resolution of the generated image in the green box, and finally click the Queue Prompt button in the purple box to start the whole inference process. Once inference is complete, the generated picture is displayed at the place indicated by the yellow arrow in the figure, and it is also saved to the local ComfyUI/output/ path.

After completing the inference process of the Stable Diffusion XL Base model, let's look at how to build the inference process for the Base + Refiner models: annotated diagram of the Stable Diffusion XL Base + Refiner workflow.

It is very similar to the Base-only setup. First, select the Refiner model (Stable Diffusion XL Refiner) in the red box. The prompt and negative prompt used by the Refiner model are the same as those of the Base model. Configure the inference parameters (number of iterations, CFG, seed, etc.); the green arrow indicates that the latent feature output by the Base model is used as the input of the Refiner model. Then click the Queue Prompt button in the blue box to start the Refiner refinement process. Once inference is complete, the generated picture is displayed at the place indicated by the purple arrow in the figure, and it is also saved to the local ComfyUI/output/ path.

So far, Rocky has explained in detail how to use ComfyUI to build a Stable Diffusion XL workflow for AI painting; you can follow Rocky's steps and try it yourself.

Building the Stable Diffusion XL inference process with SD.Next, starting from scratch

SD.Next was originally a fork of Stable Diffusion WebUI and, after continuous iteration and optimization, eventually became an independent project.

Compared with Stable Diffusion WebUI, SD.Next contains more advanced functions, and is also compatible with Stable Diffusion, Stable Diffusion XL, Kandinsky, DeepFloyd IF and other model structures. It is a very powerful AI painting framework.

So let's start building and using SD.Next right away.

First, we need to install the SD.Next framework. This step is very simple; just enter the following command on the command line:

git clone https://github.com/vladmandic/automatic 

After installation, we can see the local automatic folder.

After the SD.Next framework is installed locally, we need to install its dependent libraries. We only need to do the following:

cd automatic  # enter the downloaded automatic folder  
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

In addition to installing the dependent libraries, we also need to configure the repositories plug-ins required by SD.Next by running the following commands:

cd automatic  # enter the downloaded automatic folder  
python installer.py 

After completing the installation of the dependent libraries and the repositories plug-ins, we can configure the model. We place the Stable Diffusion XL model in the /automatic/models/Stable-diffusion/ path. This way, after opening the visualization interface, we can select the Stable Diffusion XL model for inference and image generation.

After completing the above steps, we can start SD.Next! We go to the /automatic/ path and run launch.py:

python launch.py --listen --port 8888  

Once it is up and running, you can see the following log on the command line:

To see the GUI go to: http://0.0.0.0:8888

We enter http://0.0.0.0:8888 into our local browser to open the SD.Next visual interface shown in the figure below, and happily use the Stable Diffusion XL model for AI painting: the automatic (SD.Next) visual interface.

After entering the SD.Next visualization interface, we can select the model in the red box, and then we need to modify the configuration in Settings so that SD.Next can load the Stable Diffusion XL model.

We click Settings in the blue box above to enter the Settings configuration interface: automatic framework configuration modification [1]; automatic framework configuration modification [2].

As can be seen from the illustrations above, we need to set Settings -> Stable Diffusion -> Stable Diffusion backend to diffusers, and select the Refiner model in the Stable Diffusion refiner column.

Then we need to set Settings -> Diffusers Settings -> Select diffuser pipeline when loading from safetensors to Stable Diffusion XL.

After completing the above configuration modification, we can use SD.Next to load Stable Diffusion XL for AI painting!

Building the Stable Diffusion XL inference process with Stable Diffusion WebUI, starting from scratch

Currently, Stable Diffusion WebUI supports the Base model of Stable Diffusion XL but does not yet support the Refiner model.

Stable Diffusion WebUI is the most popular framework in the field of AI painting, and its ecosystem is extremely prosperous. Many upstream and downstream plug-ins can work with Stable Diffusion WebUI to complete workflows such as AI video generation and AI ID photo generation, which makes it very flexible and fun to use.

Next, let's use this popular framework to build the Stable Diffusion XL inference process.

First, we need to download and install the Stable Diffusion WebUI framework. We only need to enter the following command on the command line:

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git 

After installation, we can see the local stable-diffusion-webui folder.

Next, we need to install its dependent libraries. We enter the Stable Diffusion WebUI folder and do the following:

cd stable-diffusion-webui  # enter the downloaded stable-diffusion-webui folder  
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple 

Similar to the SD.Next configuration process, we also need to configure the repositories plug-ins of Stable Diffusion WebUI by running the following:

sh webui.sh  
# main dependencies include: BLIP, CodeFormer, generative-models, k-diffusion, stable-diffusion-stability-ai, taming-transformers

After completing the installation of the dependent libraries and the repositories plug-ins, we can configure the model. We place the Stable Diffusion XL model in the /stable-diffusion-webui/models/Stable-diffusion/ path. This way, after opening the visualization interface, we can select the Stable Diffusion XL model for inference and image generation.

After completing the above steps, we can start Stable Diffusion WebUI! We go to the /stable-diffusion-webui/ path and run launch.py:

python launch.py --listen --port 8888 

Once it is up and running, you can see the following log on the command line:

To see the GUI go to: http://0.0.0.0:8888

We enter http://0.0.0.0:8888 into our local browser to open the Stable Diffusion WebUI visual interface shown in the figure below, and happily use the Stable Diffusion XL model for AI painting.

Stable Diffusion WebUI visual interface

After entering the Stable Diffusion WebUI visual interface, we can select the SDXL model in the red box, enter our prompt and negative prompt in the yellow box, and set the resolution of the image we want to generate in the green box (1024x1024 is recommended). Then we can click the Generate button to start AI painting.

After waiting for a while, the image is generated and displayed in the lower right corner of the interface. It is also saved in the /stable-diffusion-webui/outputs/txt2img-images/ path, where you can view it.

Building the Stable Diffusion XL inference process with diffusers, starting from scratch

The Stable Diffusion XL inference process can also be built conveniently with diffusers. Since diffusers currently has no ready-made visual interface, Rocky will build a complete Stable Diffusion XL inference workflow in a Jupyter Notebook so that everyone can get started quickly.

First, we need to install the diffusers library and make sure its version is >= 0.18.0. We only need to enter the following command on the command line:

pip install diffusers --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple

The following log is displayed to indicate that the installation is successful:

Successfully installed diffusers-0.18.2 huggingface-hub-0.16.4

Adding -i https://pypi.tuna.tsinghua.edu.cn/simple to the command means downloading the dependencies from the Tsinghua mirror, which is very fast!

Next, we continue to install other dependent libraries:

pip install transformers==4.27.0 accelerate==0.12.0 safetensors==0.2.7 invisible-watermark -i https://pypi.tuna.tsinghua.edu.cn/simple

The following log is displayed to indicate that the installation is successful:

Successfully installed transformers-4.27.0 accelerate-0.12.0 safetensors-0.2.7 invisible-watermark-0.2.0

Note: to load the Stable Diffusion XL model with fp16 precision in diffusers, the version of the transformers library must be >= 4.27.0.
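
If you want to confirm the environment meets these version requirements before loading the model, a minimal check like the following can be run (a sketch, assuming the packaging library is available, which usually ships alongside pip):

# Minimal environment check for the version requirements mentioned above  
from packaging import version  
import diffusers, transformers  
  
assert version.parse(diffusers.__version__) >= version.parse("0.18.0"), diffusers.__version__  
assert version.parse(transformers.__version__) >= version.parse("4.27.0"), transformers.__version__  
print("diffusers:", diffusers.__version__, "| transformers:", transformers.__version__)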

After completing the installation of the above dependent libraries, we can build the complete workflow of the Stable Diffusion XL model.

We first use the Base model in Stable Diffusion XL alone to generate images:

# Load the diffusers and torch libraries  
from diffusers import DiffusionPipeline  
import torch  
  
# Build the Stable Diffusion XL Base pipeline and load the Stable Diffusion XL Base model  
pipe = DiffusionPipeline.from_pretrained("/local_path/stable-diffusion-xl-base-0.9", torch_dtype=torch.float16, variant="fp16")  
# "/local_path/stable-diffusion-xl-base-0.9" is the local path of the Stable Diffusion XL Base model.  
# You can follow Rocky's WeChat official account WeThinkIn and reply "SDXL模型" to get the resource link.  
# variant="fp16" enables fp16 precision; compared with fp32, fp16 halves the model's GPU memory usage.  
  
# Run the pipeline on the GPU  
pipe.to("cuda")  
  
# Prompt  
prompt = "Watercolor painting of a desert landscape, with sand dunes, mountains, and a blazing sun, soft and delicate brushstrokes, warm and vibrant colors"  
  
# Negative prompt, describing features we do not want in the generated image  
negative_prompt = "(EasyNegative),(watermark), (signature), (sketch by bad-artist), (signature), (worst quality), (low quality), (bad anatomy), NSFW, nude, (normal quality)"  
  
# Set the seed to fix the composition  
seed = torch.Generator("cuda").manual_seed(42)  
  
# Run inference with the pipeline  
image = pipe(prompt, negative_prompt=negative_prompt, generator=seed).images[0]  
# The pipeline returns the generated images in a list: [<PIL.Image.Image image mode=RGB size=1024x1024>]  
# so we use images[0] to get the PIL image from the list  
  
# Save the generated image  
image.save("test.png")  

After completing the entire code flow above, we can generate a watercolor-style desert landscape painting. If you use the same parameters as Rocky, you should get the following image: image generated by the Base model.
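
In the ComfyUI and WebUI workflows above, the number of iterations, the CFG scale and the output resolution were set through the interface; the same controls can be passed directly to the diffusers pipeline call. A minimal sketch, reusing the pipe, prompts and seed defined above, with purely illustrative values:

# Reuses `pipe`, `prompt`, `negative_prompt` and `seed` from the code above; the values are illustrative  
image = pipe(  
    prompt,  
    negative_prompt=negative_prompt,  
    generator=seed,  
    num_inference_steps=30,  # number of denoising iterations  
    guidance_scale=7.5,      # CFG scale  
    height=1024,             # output image height  
    width=1024,              # output image width  
).images[0]  
image.save("test_tuned.png")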

Next, we cascade the Base model and the Refiner model to generate images:

from diffusers import DiffusionPipeline  
import torch  
  
# Build the Stable Diffusion XL Base pipeline and load the Stable Diffusion XL Base model  
pipe = DiffusionPipeline.from_pretrained("/local_path/stable-diffusion-xl-base-0.9", torch_dtype=torch.float16, variant="fp16")  
  
pipe.to("cuda")  
  
prompt = "Watercolor painting of a desert landscape, with sand dunes, mountains, and a blazing sun, soft and delicate brushstrokes, warm and vibrant colors"  
  
negative_prompt = "(EasyNegative),(watermark), (signature), (sketch by bad-artist), (signature), (worst quality), (low quality), (bad anatomy), NSFW, nude, (normal quality)"  
  
seed = torch.Generator("cuda").manual_seed(42)  
  
# First run the Base pipeline and output latents (output_type="latent")  
image = pipe(prompt=prompt, negative_prompt=negative_prompt, generator=seed, output_type="latent").images  
  
# Build the Stable Diffusion XL Refiner pipeline and load the Stable Diffusion XL Refiner model  
pipe = DiffusionPipeline.from_pretrained("/local_path/stable-diffusion-xl-refiner-0.9", torch_dtype=torch.float16, variant="fp16")  
# "/local_path/stable-diffusion-xl-refiner-0.9" is the local path of the Stable Diffusion XL Refiner model.  
# You can follow Rocky's WeChat official account WeThinkIn and reply "SDXL模型" to get the resource link.  
  
pipe.to("cuda")  
  
# Run the Refiner pipeline with the Base model's latents as input  
images = pipe(prompt=prompt, negative_prompt=negative_prompt, generator=seed, image=image).images  
  
images[0].save("test.png")  
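
Note that the code above runs the Base pipeline for its full denoising schedule and then refines the finished latent. In later diffusers releases (roughly 0.19 and newer), the SDXL pipelines also expose denoising_end and denoising_start, so the Base model handles only the first part of the schedule and the Refiner completes the rest (the "ensemble of expert denoisers" mode). A minimal sketch under that version assumption, with the two pipelines kept in separate variables pipe_base and pipe_refiner:

# Assumes a newer diffusers version where denoising_end / denoising_start are available,  
# and that the Base and Refiner pipelines loaded above are kept as pipe_base and pipe_refiner  
latents = pipe_base(prompt=prompt, negative_prompt=negative_prompt, generator=seed,  
                    num_inference_steps=40, denoising_end=0.8, output_type="latent").images  
final = pipe_refiner(prompt=prompt, negative_prompt=negative_prompt, generator=seed,  
                     num_inference_steps=40, denoising_start=0.8, image=latents).images[0]  
final.save("test_ensemble.png")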

After completing the above code flow, let's take a look at the picture generated by cascading the Base model and the Refiner model: picture generated by the cascaded Base + Refiner models.

For a more intuitive comparison, we put the two images just generated side by side: we can clearly see that after using the Refiner model, the overall quality and the details of the image are greatly enhanced, and the composition and colors are softer.
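
If you want to build this comparison image yourself, the two results can be stitched together with PIL. A small sketch, assuming the Base-only result was saved as base.png and the cascaded result as test.png (both snippets above save to test.png, so rename one of them accordingly):

from PIL import Image  
  
# File names are assumptions; use whatever names you saved the two results under  
base_result = Image.open("base.png")  
refined_result = Image.open("test.png")  
  
# Paste the two images side by side into one comparison image  
comparison = Image.new("RGB", (base_result.width + refined_result.width,  
                               max(base_result.height, refined_result.height)))  
comparison.paste(base_result, (0, 0))  
comparison.paste(refined_result, (base_result.width, 0))  
comparison.save("comparison.png")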

Of course, we can also use the Refiner model alone to optimize the picture:

import torch  
from diffusers import StableDiffusionXLImg2ImgPipeline  
from diffusers.utils import load_image  
  
# Build the Refiner img2img pipeline and load the Stable Diffusion XL Refiner model  
pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained("/local_path/stable-diffusion-xl-refiner-0.9", torch_dtype=torch.float16, variant="fp16")  
  
pipe = pipe.to("cuda")  
  
# Path of the image to be refined  
image_path = "/local_path/test.png"  
  
init_image = load_image(image_path).convert("RGB")  
  
prompt = "Watercolor painting of a desert landscape, with sand dunes, mountains, and a blazing sun, soft and delicate brushstrokes, warm and vibrant colors"  
  
negative_prompt = "(EasyNegative),(watermark), (signature), (sketch by bad-artist), (signature), (worst quality), (low quality), (bad anatomy), NSFW, nude, (normal quality)"  
  
seed = torch.Generator("cuda").manual_seed(42)  
  
# Run the Refiner on the input image  
image = pipe(prompt, negative_prompt=negative_prompt, generator=seed, image=init_image).images[0]  
  
image.save("refiner.png")  
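
How strongly the Refiner reworks the input picture can be adjusted with the standard img2img strength parameter of the pipeline call: values close to 0 keep the input almost unchanged, values close to 1 regenerate it almost entirely. A small sketch with an illustrative value:

# `strength` is the img2img denoising strength; 0.3 here is only an illustrative value  
image = pipe(prompt, negative_prompt=negative_prompt, generator=seed,  
             image=init_image, strength=0.3).images[0]  
image.save("refiner_strength.png")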

Here Rocky uses future-mecha-style pictures for testing and comparison. As you can see from the picture below, the Refiner model's effect on image quality is still very obvious: the image glitches are clearly removed, the overall picture is more natural and softer, and the details are better filled in and reconstructed.

Stable Diffusion XL generation examples

Example 1: Futuristic urban style

Prompt:Stunning sunset over a futuristic city, with towering skyscrapers and flying vehicles, golden hour lighting and dramatic clouds, high detail, moody atmosphere

Negative Prompt:(EasyNegative),(watermark), (signature), (sketch by bad-artist), (signature), (worst quality), (low quality), (bad anatomy), NSFW, nude, (normal quality)

Stable Diffusion XL Base+Refiner Result: Stable Diffusion XL Result: Futuristic Urban Style

Example 2: Paradise beach style

Prompt: Serene beach scene with crystal clear water and white sand, tropical palm trees swaying in the breeze, perfect paradise, seascape

Negative Prompt:(EasyNegative),(watermark), (signature), (sketch by bad-artist), (signature), (worst quality), (low quality), (bad anatomy), NSFW, nude, (normal quality)

Stable Diffusion XL Base+Refiner Result: Stable Diffusion XL Result: Paradise Beach Style

Example 3: Future mecha style

Prompt:Giant robots fighting in a futuristic city, with buildings falling and explosions all around, intense, fast-paced, dramatic, stylized, futuristic

Negative Prompt:(EasyNegative),(watermark), (signature), (sketch by bad-artist), (signature), (worst quality), (low quality), (bad anatomy), NSFW, nude, (normal quality)

Stable Diffusion XL Base+Refiner Result: Stable Diffusion XL Result: Future Mech Style

Example 4: Musk style

Prompt:Elon Musk standing in a workroom, in the style of industrial machinery aesthetics, deutscher werkbund, uniformly staged images, soviet, light indigo and dark bronze, new american color photography, detailed facial features

Negative Prompt:(EasyNegative),(watermark), (signature), (sketch by bad-artist), (signature), (worst quality), (low quality), (bad anatomy), NSFW, nude, (normal quality)

Stable Diffusion XL Base+Refiner Result: Stable Diffusion XL Result: Musk style
