AI Image Generation: Stable Diffusion img2img Parameters and Usage Details

        This article is an original article by the blogger and may not be reproduced without the blogger's permission.

        This article belongs to the column "Python AIGC Large Model Training and Inference from Scratch", at "https://blog.csdn.net/suiyingy/article/details/130169592".

        For the detailed installation steps of the Stable Diffusion webui and a detailed introduction to the txt2img (text-to-image) function, please refer to the previous articles in this column. This section specifically introduces the use of the img2img, Extras, PNG Info, Checkpoint Merger, Train, Settings, and Extensions functions. In addition, for updates to this column you can follow the official account below the article, or follow this column. All related articles will be updated in "Python AIGC Large Model Training and Inference from Scratch", at "https://blog.csdn.net/suiyingy/article/details/130169592". Live demos of all the AIGC model deployments will be launched synchronously in the RdFast applet.

Figure 1 img2img interior design generation

1 Img2img

        Img2img (image-to-image) modifies an existing image through Stable Diffusion to generate a new one. As shown in the figure below, img2img contains six sub-tabs: img2img, Sketch, Inpaint, Inpaint sketch, Inpaint upload, and Batch. The parameters on each tab are introduced separately below.

Figure 2 img2img

1.1 img2img

        The positive prompt (Prompt) and negative prompt (Negative Prompt) work exactly as in txt2img (text-to-image), described in the previous blog post "AI Image Generation: Stable Diffusion Parameters and Usage Details", at "https://blog.csdn.net/suiyingy/article/details/130008913".

        Interrogate CLIP / Interrogate DeepBooru: automatically generate a positive prompt from the input image, i.e. an automatic description of the existing picture. The generated prompt can then be edited by hand.

        Resize mode: scaling mode. (1) Just resize simply rescales the image to the target size; if the input and output aspect ratios differ, the image is stretched and distorted. (2) Crop and resize scales the image proportionally until it fully covers the target size, then crops the overflow, working from the center of the picture outward. (3) Resize and fill scales the image proportionally until it fits inside the target size, then fills the remaining area outward from the picture's edges.
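The difference between the two proportional modes can be illustrated with a rough Pillow sketch (the webui's exact implementation may differ in details such as resampling filters and fill content):

```python
from PIL import Image

def crop_and_resize(img, target_w, target_h):
    """Scale proportionally so the target is fully covered, then center-crop the overflow."""
    scale = max(target_w / img.width, target_h / img.height)
    resized = img.resize((round(img.width * scale), round(img.height * scale)))
    left = (resized.width - target_w) // 2
    top = (resized.height - target_h) // 2
    return resized.crop((left, top, left + target_w, top + target_h))

def resize_and_fill(img, target_w, target_h, fill=(0, 0, 0)):
    """Scale proportionally so the image fits inside the target, then pad the remainder."""
    scale = min(target_w / img.width, target_h / img.height)
    resized = img.resize((round(img.width * scale), round(img.height * scale)))
    canvas = Image.new("RGB", (target_w, target_h), fill)
    canvas.paste(resized, ((target_w - resized.width) // 2, (target_h - resized.height) // 2))
    return canvas

src = Image.new("RGB", (400, 300))          # a 4:3 input image
print(crop_and_resize(src, 512, 512).size)  # (512, 512)
print(resize_and_fill(src, 512, 512).size)  # (512, 512)
```

In both cases the output has the requested size; the first loses the cropped borders, while the second keeps the whole picture and pads the rest.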

        Denoising strength: redrawing strength, in the range 0-1 with a default of 0.75. The larger the value, the more the output changes: 0 leaves the picture almost unchanged, while 1 may deviate drastically from the original image. Values between 0.6 and 0.8 are typical.

        The other parameters are exactly the same as those of txt2img in the previous blog post. For details, see "AI Image Generation: Stable Diffusion Parameters and Usage Details", at "https://blog.csdn.net/suiyingy/article/details/130008913".
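These parameters can also be driven programmatically. As a hedged sketch, the webui exposes an HTTP API (POST /sdapi/v1/img2img) when started with the --api flag; the field names below follow that public API, but the default values chosen here are purely illustrative:

```python
import base64

def build_img2img_payload(image_bytes, prompt, negative_prompt="", denoising_strength=0.75):
    """Build a request body for the webui's img2img HTTP endpoint."""
    return {
        # The input image is sent base64-encoded.
        "init_images": [base64.b64encode(image_bytes).decode("utf-8")],
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "denoising_strength": denoising_strength,  # 0-1; ~0.6-0.8 is a common range
        "resize_mode": 1,  # 0 = Just resize, 1 = Crop and resize, 2 = Resize and fill
        "steps": 20,
    }

payload = build_img2img_payload(b"...raw image bytes...", "modern interior design, bright")
print(sorted(payload))
```

The payload would then be sent with, e.g., `requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload)`, assuming the webui is running locally with the API enabled.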

1.2 Sketch

        Sketch adjusts the entire picture, that is, it redraws the whole image. As the English name suggests, if we input a hand-drawn sketch, the model redraws the entire picture based on that sketch.

1.3 Inpaint

        Inpaint refers to local redrawing, that is, modifying only a specified area. Many watermark-removal tools also use Inpaint as a name or keyword; removing a watermark is itself a kind of partial redrawing, so it can be done on this tab as well.

        Mask blur: the larger the value, the smoother the transition between the redrawn area and the surrounding original image; the smaller the value, the sharper the edge.

        Mask mode: Inpaint masked redraws only the painted (masked) area, while Inpaint not masked redraws everything except it. Paint the mask by holding the left mouse button and dragging over the picture; this painting process is how the mask is created.

        Masked content: what to initialize the masked area with. fill fills it with other (surrounding) content; original redraws on top of the original pixels, which suits watermark removal; latent noise and latent nothing are the other two modes, which initialize the masked area from the model's intermediate latent representation.

        Inpaint area: redrawing area. Whole picture redraws the entire image, while Only masked redraws only within the masked region.

        Only masked padding, pixels: my understanding is that this is the number of pixels by which the masked area is expanded outward, effectively enlarging the mask region by a certain amount.
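Under that interpretation, the padding is a morphological dilation of the mask. A minimal NumPy sketch of the idea (the webui's actual implementation may differ; real masks would also be smoothed by Mask blur):

```python
import numpy as np

def expand_mask(mask, padding):
    """Grow a binary mask outward by `padding` pixels in every direction."""
    h, w = mask.shape
    out = np.zeros_like(mask)
    ys, xs = np.nonzero(mask)
    for y, x in zip(ys, xs):
        # Stamp a (2*padding+1)-sized square around each masked pixel, clipped to bounds.
        y0, y1 = max(0, y - padding), min(h, y + padding + 1)
        x0, x1 = max(0, x - padding), min(w, x + padding + 1)
        out[y0:y1, x0:x1] = 1
    return out

mask = np.zeros((7, 7), dtype=np.uint8)
mask[3, 3] = 1
print(expand_mask(mask, 2).sum())  # 25: the single pixel grows into a 5x5 block
```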

        Denoising strength: redrawing strength, in the range 0-1 with a default of 0.75. The larger the value, the more the output changes: 0 leaves the picture almost unchanged, while 1 may deviate drastically from the original image. Values between 0.6 and 0.8 are typical.

        The image below is an example of redrawing only the ground.

 Figure 3 Inpaint ground redrawing

1.4 Inpaint Sketch

        Whereas Sketch redraws based on the complete input image, Inpaint sketch mainly redraws the masked area based on what is drawn inside it.

1.5 Inpaint upload

        This is still Inpaint mode, but instead of drawing the mask with the mouse, a mask image is uploaded. Crop to fit is equivalent to the Crop and resize mode described earlier.

1.6 Batch

        Specify input and output folder directories to perform img2img operations on images in batches.

2 Extras

        Extras is mainly for resizing pictures, for example upscaling them proportionally to a higher, clearer resolution.

 Figure 4 Extras

        Scale by: image magnification factor, default 4. The larger the factor, the higher the output resolution and the clearer the picture, but the processing time increases accordingly.

        Scale to: Resize the picture to the specified width and height, similar to the previous Resize mode.

        Upscaler 1: upsampling model, default None. Among the options, LDSR takes a long time; ScuNET PSNR is suitable for animation-style images; SwinIR_4x generally gives good results.

        Upscaler 2: a second upsampling model, default None, with the same model options as Upscaler 1. This is equivalent to upsampling with two models at once, where the weight of the second model is set by Upscaler 2 visibility.

        Similarly, GFPGAN and CodeFormer can be added on top of the upsampling, each with its own weight. Batch Process and Batch from Directory perform these operations on multiple images at once.
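Assuming the visibility weight is applied as a simple linear blend (a reasonable reading of the parameter, though the webui's internals are not spelled out here), combining the two upscalers' outputs looks like this:

```python
import numpy as np

def blend_upscalers(out1, out2, visibility):
    """Blend two upscaled images; `visibility` is the weight of the second one."""
    return out1 * (1.0 - visibility) + out2 * visibility

# Two stand-in "upscaled images" as flat grayscale arrays.
a = np.full((2, 2), 100.0)
b = np.full((2, 2), 200.0)
print(blend_upscalers(a, b, 0.25)[0, 0])  # 125.0
```

With visibility 0 the result is Upscaler 1's output alone; with visibility 1 it is entirely Upscaler 2's.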

3 PNG info

        PNG info is used to view the generation information embedded in a picture produced by Stable Diffusion, including the prompt, negative prompt, number of steps, sampler, seed, and other parameters. Image files usually carry descriptive metadata in their headers (EXIF in JPEG, and similar text chunks in PNG), which can be read with image tools or Python. Stable Diffusion writes its generation parameters into this metadata, so if we find an interesting Stable Diffusion image online, we can inspect its parameter settings in PNG info and then reproduce or fine-tune the corresponding image.
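The webui stores its settings in a PNG text chunk (commonly under the key "parameters"). This Pillow round-trip shows how such metadata is written and read back; the parameter string here is a made-up example:

```python
import io
from PIL import Image
from PIL.PngImagePlugin import PngInfo

# Write a "parameters" text chunk into a PNG, as the webui does.
meta = PngInfo()
meta.add_text("parameters", "a cozy living room\nNegative prompt: blurry\nSteps: 20, Seed: 42")

img = Image.new("RGB", (64, 64))
buf = io.BytesIO()
img.save(buf, format="PNG", pnginfo=meta)

# Read the chunk back, which is essentially what PNG info displays.
buf.seek(0)
reloaded = Image.open(buf)
print(reloaded.info["parameters"])
```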

 Figure 5 PNG info

4 Checkpoint Merger

        Checkpoint Merger merges different models to generate a new model. We may fine-tune a base model to obtain new models with different generative styles, and merging such fine-tuned models may combine the generative characteristics of both. After merging, the new model is saved in the models/Stable-diffusion directory and can be selected and used directly on the web page.

Figure 6 Checkpoint Merger

        Primary model (A), Secondary model (B), and Tertiary model (C) select the models to be merged. Note that these models should preferably derive from the same base model and share the same structural parameters; if the model structures differ, a parameter-mismatch error occurs.

        Custom Name (Optional): Customize the merged model name.

        Multiplier (M) - set to 0 to get model A: the weight of model B in the merge. Model A's weight is (1 - M), so setting M to 0 returns model A unchanged.

        Interpolation Method: Three model merging methods.

        (1) No interpolation: no merging is performed; only model A is processed. This can be used to convert the model file format (ckpt/safetensors) or to bake in a VAE (the variational autoencoder that decodes latents into images, affecting details such as color and faces).

        (2) Weighted sum: models A and B are merged as A * (1 - M) + B * M.

        (3) Add difference: the difference between models B and C is added to model A, i.e. A + (B - C) * M.
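Both formulas apply elementwise across the models' weight tensors. A toy NumPy sketch with single-entry state dicts standing in for real checkpoints (real merging iterates over all matching tensor names the same way):

```python
import numpy as np

# Toy "checkpoints": dicts mapping parameter names to weight tensors.
A = {"w": np.array([1.0, 2.0])}
B = {"w": np.array([3.0, 6.0])}
C = {"w": np.array([2.0, 2.0])}
M = 0.5

# Weighted sum: A * (1 - M) + B * M
weighted_sum = {k: A[k] * (1 - M) + B[k] * M for k in A}

# Add difference: A + (B - C) * M
add_difference = {k: A[k] + (B[k] - C[k]) * M for k in A}

print(weighted_sum["w"])    # [2. 4.]
print(add_difference["w"])  # [1.5 4. ]
```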

        Checkpoint format: model saving format. Ckpt files are saved as pickled dictionaries and can carry additional script information (including executable code), whereas safetensors files store pure tensors with no executable content. The safetensors format is therefore safer.

        Save as float16: save the model in float16 (FP16) precision, which reduces the model's file size and memory usage.

        Copy config from: Copy the configuration file of the model.

        Bake in VAE: embed a VAE (variational autoencoder) into the saved model.

        Discard weights with matching name: weights whose names match the given pattern are excluded from the saved merged model.

5 Train

        Train is used to train or fine-tune Stable Diffusion yourself, to achieve better results in specific domains. It is not covered here; the next section introduces Stable Diffusion's training mode in detail.

Figure 7 Train

6 Settings

        There are many options in Settings, including parameter settings for the functions described above. For example, Face restoration lets us choose the model used for face restoration. After changing settings, click the "Apply settings" and "Reload UI" buttons in turn.

Figure 8 Settings

7 Extensions

        Extensions provide extra functions, such as Lora or SwinIR model support. The Available tab lists additional extensions that can be installed.

 Figure 9 Extensions


