Stable Diffusion tutorial: a 4,000-word guide to img2img (image-to-image)


Table of contents


Basic use

Sketch (doodle drawing)

Inpainting (partial redraw)

Inpaint sketch (doodle mask)

Inpaint upload (uploaded mask)

Batch processing

Summary

Downloads

"Pictures to generate pictures" is one of the core functions of Stable Diffusion. Its main ability is to generate new transformed pictures based on existing pictures + prompt words, which is particularly useful in daily work and life.

Without further ado, let's see what it can do.

Basic use

This section uses converting a portrait photo into an anime-style image as the example. The steps are as follows:

1. Select an anime-style checkpoint as the Stable Diffusion base model:

2. Upload a portrait photo on the img2img tab and write the corresponding prompt and negative prompt. The prompt is optional here, but leaving it empty may lower the quality of the generated image.

Prompt:

best quality, masterpiece, super high resolution, 4k, adult women, asia, full body:1.4, long black hair, looking at viewer, beautiful detailed eyes, small breasts, white t-shirt:1.6, white pants:1.6, wide shot:1.3, strolling, beach:1.3, tree, beautiful detailed sky, blue sky

Negative prompt:

deformed,bad anatomy,disfigured,poorly drawn face,out of frame,bad hands,unclear eyes,cloned face,bad face, disfigured, deformed, cross-eye

3. Set the relevant Stable Diffusion parameters:

(1) Resize mode: because the reference image and the newly generated image may not be the same size, we need to choose how the reference image is resized when generating the new one.

There are four modes: stretch (just resize), crop and resize, resize and fill, and latent rescale. Unless you have a special need, you probably won't like the results of stretching or latent rescaling; their effects are shown in the figure below:

Of course, if the reference image and the new image are the same size, any resize mode works fine.

(2) Sampler and sampling steps: most samplers produce fairly similar results. Euler a with 20 steps is generally enough; beyond that you can try the DPM++ series. A later article will cover samplers in detail.

(3) Output image size: usually kept the same as the original image, but it can be changed; if you change it, keep the resize mode in mind.

(4) Batch count and batch size: both default to 1, meaning a single run that produces a single image. Increasing the batch count (number of runs) mainly increases generation time, while increasing the batch size (images per run) mainly increases VRAM usage.

(5) CFG scale (prompt guidance): the default is 7. The larger the value, the more closely the generated image follows the prompt; the smaller the value, the more freedom the AI takes. Common settings are 5-12.

(6) Denoising strength (redraw strength): how much the new image is allowed to depart from the reference image. The larger the value, the freer the result and the less it resembles the reference. Here we don't want the anime version to stray too far from the original photo, so the value is set relatively low. The same value can behave differently across models and images, so adjust it based on the actual results.

Then we can generate the image. Below are the results I got with different denoising strengths:
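The steps above use the WebUI interface, but the same img2img call can be scripted. Here is a minimal sketch against the AUTOMATIC1111 WebUI API, assuming the WebUI is running locally with the --api flag; the endpoint and field names below come from that API, and the file name girl.jpg is just a placeholder for the reference photo.

```python
import base64, io
import requests
from PIL import Image

WEBUI = "http://127.0.0.1:7860"  # assumes a local WebUI started with --api

def to_b64(img: Image.Image) -> str:
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return base64.b64encode(buf.getvalue()).decode()

init = Image.open("girl.jpg")  # placeholder path for the reference photo

payload = {
    "init_images": [to_b64(init)],
    "prompt": ("best quality, masterpiece, super high resolution, 4k, adult women, asia, "
               "(full body:1.4), long black hair, looking at viewer, beautiful detailed eyes, "
               "(white t-shirt:1.6), (white pants:1.6), (wide shot:1.3), strolling, (beach:1.3)"),
    "negative_prompt": "deformed, bad anatomy, disfigured, poorly drawn face, out of frame, bad hands",
    "resize_mode": 1,            # 0 stretch, 1 crop and resize, 2 resize and fill, 3 latent rescale
    "sampler_name": "Euler a",
    "steps": 20,
    "cfg_scale": 7,              # prompt guidance
    "denoising_strength": 0.45,  # redraw strength: lower keeps more of the photo
    "width": init.width,
    "height": init.height,
    "n_iter": 1,                 # batch count (number of runs)
    "batch_size": 1,             # images per run
}

r = requests.post(f"{WEBUI}/sdapi/v1/img2img", json=payload, timeout=300)
r.raise_for_status()
out = Image.open(io.BytesIO(base64.b64decode(r.json()["images"][0])))
out.save("girl_anime.png")
```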

Sketch (doodle drawing)

Sketching means drawing a rough shape of something onto the original image, and then letting Stable Diffusion, together with our prompt, turn it into the corresponding element in the picture. The example below adds a pair of angel wings to a girl.

1. Choose a suitable base model and state the elements you want to appear in the image in the prompt.

2. Select "Sketch" in the Generation tab below, upload the image to sketch on, and use the brush tool to draw the shape you want. You can also pick a brush color: I want the angel's wings to be white, but not pure white, so I chose a slightly grayish white here.

3. The Stable Diffusion parameters need little comment; just make sure the resize mode matches the image size. It is recommended to start the denoising strength at 0.5 and adjust it based on the actual results.

4. Then generate the image; the result is shown below.

As you can see, the wings are there, but other parts of the image have changed as well: sketching redraws the whole image. In some scenarios this is not ideal; to solve it you need the doodle-mask variant (inpaint sketch) introduced below.

With this technique we can also write a few characters on the image; Stable Diffusion will render them with a brush-stroke effect, which looks nicer.
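The Sketch tab is interactive, but the same idea can be approximated in a script: paint a rough shape onto the photo yourself, then send the composited image through img2img. A minimal sketch under the same local-API assumption as above; the two ellipses are just a crude placeholder for hand-drawn wings.

```python
import base64, io
import requests
from PIL import Image, ImageDraw

def to_b64(img):
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return base64.b64encode(buf.getvalue()).decode()

img = Image.open("girl.jpg").convert("RGB")   # placeholder path
draw = ImageDraw.Draw(img)

# Crude stand-in for hand-drawn wings: two grayish-white ellipses behind the shoulders.
w, h = img.size
draw.ellipse([w * 0.15, h * 0.20, w * 0.45, h * 0.55], fill=(230, 230, 225))
draw.ellipse([w * 0.55, h * 0.20, w * 0.85, h * 0.55], fill=(230, 230, 225))

payload = {
    "init_images": [to_b64(img)],
    "prompt": "1girl, white angel wings, beach, blue sky",
    "denoising_strength": 0.5,   # start around 0.5 and adjust
    "sampler_name": "Euler a",
    "steps": 20,
    "width": w,
    "height": h,
}
r = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload, timeout=300)
out = Image.open(io.BytesIO(base64.b64decode(r.json()["images"][0])))
out.save("girl_wings_sketch.png")
```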

Inpainting (partial redraw)

Inpainting changes only part of the original image. You use a brush to cover certain areas, and then choose whether to redraw only the covered areas or only the uncovered ones. In Stable Diffusion, the covered area is called the mask.

Here is an example of changing the girl’s hair color to blonde.

1. Choose a base model that matches the photo's style. Since we are changing a color, the prompt should focus on describing what we want redrawn. The prompt can describe only the part to be redrawn, or the whole picture; when the redrawn area is large, it is recommended to describe only the part being redrawn, otherwise the redrawn region may not blend well with the original image.

2. Click "Inpaint" in the Generation tab, upload the image, and use the brush tool to paint over the part to be redrawn.

3. Other Stable Diffusion parameters are as follows:

(1) Mask blur: blurs the boundary between the redrawn part and the original image, acting as a gradient so the seam looks more natural. If the seam looks bad, try turning it up a little.

(2) Mask mode: "inpaint masked" redraws the masked area, while "inpaint not masked" redraws everything except the mask. Which to choose? My rule of thumb is to pick whichever requires less painting and save yourself some work. For example, to redraw the hair here, the hair area is much smaller than the whole picture, so we paint over the hair and choose "inpaint masked"; if the hair area were large, we could paint over everything else and choose "inpaint not masked" instead.

(3) Inpaint area: with "whole picture", Stable Diffusion redraws the entire image and then pastes the unmasked regions back from the original when the image is finalized; with "only masked", it draws only the part that needs redrawing. Personally, I don't see much difference in the results; if you're interested, try both and compare.

(4) Only-masked padding (pixels): when the inpaint area is "only masked", this is how far the drawing region is expanded outward, in pixels. It helps where the mask edge is hard to paint precisely; the effect is like painting a little beyond the boundary by hand, so the newly generated part blends better with the original image.

(5) Image size: set as needed, and pay attention to choosing the appropriate resize mode. Since I am only changing a color here, it stays the same as the original image.

(6) Denoising strength: how much the original content is changed. Because hair color is relatively hard to change, I set it to the maximum of 1. The default is 0.75; adjust it based on the actual results. (A scripted version of these settings follows below.)
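For reference, here is roughly how these inpainting parameters map onto the AUTOMATIC1111 WebUI API fields, as a hedged sketch under the same local-API assumption; girl.jpg and hair_mask.png are placeholders, and the mask is assumed white where the hair should be redrawn.

```python
import base64, io
import requests
from PIL import Image

def to_b64(img):
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return base64.b64encode(buf.getvalue()).decode()

init = Image.open("girl.jpg")        # placeholder path
mask = Image.open("hair_mask.png")   # white = area to redraw, black = keep

payload = {
    "init_images": [to_b64(init)],
    "mask": to_b64(mask),
    "prompt": "long golden hair, beautiful detailed hair",
    "mask_blur": 4,                  # (1) soften the seam with the original image
    "inpainting_mask_invert": 0,     # (2) 0 = inpaint masked, 1 = inpaint not masked
    "inpaint_full_res": True,        # (3) True = only masked, False = whole picture
    "inpaint_full_res_padding": 32,  # (4) only-masked padding, in pixels
    "width": init.width,             # (5) keep the original size
    "height": init.height,
    "denoising_strength": 1.0,       # (6) hair color is hard to change, so max it out
    "inpainting_fill": 1,            # masked content: start from the original pixels
    "sampler_name": "Euler a",
    "steps": 20,
}
r = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload, timeout=300)
out = Image.open(io.BytesIO(base64.b64decode(r.json()["images"][0])))
out.save("girl_blonde.png")
```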

4. Finally, here is the result of the inpainting:

You can see the hair has turned blonde, but its shape has also changed, which shows that Stable Diffusion is redrawing rather than simply recoloring. Also, only the masked part was redrawn; nothing else in the image has changed.

This technique is widely used, for example to change a model's clothes. Of course, to swap in a specific garment you need to combine it with other techniques.

Inpaint sketch (doodle mask)

We introduced sketching above, but sketching redraws the entire image. The doodle mask (inpaint sketch) solves this problem.

Let’s take the example of adding angel wings to a character.

1. Select a base model that matches the image's style, describe what you want drawn in the prompt, upload the image, pick an appropriate color, and draw the desired shape on the image.

2. The parameters gain an extra "mask transparency" option: the larger the value, the more transparent the drawn strokes become. We can also use a higher denoising strength here, because only part of the image is redrawn, so we don't need to worry about the rest changing too much.

3. Without further ado, let’s take a look at the effects:

4. A few more examples (a scripted approximation follows after the list):

Put a small yellow flower on the girl's head.

Give the girl a haircut.

Draw a map in the sky (the area is a bit small, you can try a larger map).
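The inpaint-sketch tab is also interactive; one way to approximate it in a script is to composite your colored strokes onto the image and reuse the stroke region as the mask, with a fairly high denoising strength. This is only a rough approximation under the same API assumptions as before; the ellipse below is a placeholder stroke for the "small yellow flower" example.

```python
import base64, io
import requests
from PIL import Image, ImageDraw

def to_b64(img):
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return base64.b64encode(buf.getvalue()).decode()

img = Image.open("girl.jpg").convert("RGB")   # placeholder path
w, h = img.size

# Draw the "stroke" (a yellow blob above the head) and record the same region as the mask.
box = [w * 0.45, h * 0.10, w * 0.55, h * 0.18]
stroke_mask = Image.new("L", img.size, 0)
ImageDraw.Draw(stroke_mask).ellipse(box, fill=255)
ImageDraw.Draw(img).ellipse(box, fill=(240, 220, 60))

payload = {
    "init_images": [to_b64(img)],
    "mask": to_b64(stroke_mask),      # only the stroke area gets redrawn
    "prompt": "a small yellow flower on the girl's head",
    "inpainting_fill": 1,             # start from the painted color
    "mask_blur": 4,
    "inpaint_full_res": True,
    "denoising_strength": 0.75,       # can be high, the rest of the image is untouched
    "width": w, "height": h,
    "steps": 20, "sampler_name": "Euler a",
}
r = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload, timeout=300)
Image.open(io.BytesIO(base64.b64decode(r.json()["images"][0]))).save("girl_flower.png")
```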

Inpaint upload (uploaded mask)

Above we demonstrated the doodle mask, but a hand-drawn mask still has problems: it is not precise enough and it is tedious to draw.

Uploading a mask solves this. We can use other tools to make a precise mask, upload it to the inpaint-upload tab, and then redraw.

Still using the girl's photo from this article as the example, we will use an uploaded mask to redraw the subject and the background separately.

1. Select a base model that matches the image's style and write what you want redrawn in the prompt.

Prompt 1 (inpaint masked: replace the Asian girl with a blonde American girl):

best quality, masterpiece, super high resolution, 4k, 1girl, american, full body:1.4, long golden hair, looking at viewer, beautiful detailed eyes, brown t-shirt:1.31, blue jeans:1.31

Prompt 2 (inpaint not masked: change the background to a war-torn street):

best quality, masterpiece, super high resolution, 4k, 1girl standing in the middle of war-torn streets

Negative prompt:

deformed,bad anatomy,disfigured,poorly drawn face,out of frame,bad hands,bad fingers,unclear eyes,cloned face,bad face, disfigured, deformed, cross-eye, EasyNegative

2. In "Inpaint upload", upload the original image and the mask image.

The mask image can be made with the background-removal extension for Stable Diffusion WebUI, stable-diffusion-webui-rembg, which I introduced in a previous article (the AI cutout tutorial).
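If you prefer to script the cutout instead of using the WebUI extension, the underlying rembg library can produce a subject mask directly. A minimal sketch, assuming `pip install rembg` and using the alpha channel of the cutout as the mask:

```python
from PIL import Image
from rembg import remove  # pip install rembg

photo = Image.open("girl.jpg")           # placeholder path
cutout = remove(photo)                   # RGBA image with the background removed
mask = cutout.getchannel("A")            # alpha channel: white = subject, black = background
mask = mask.point(lambda a: 255 if a > 127 else 0)  # harden soft edges into a binary mask
mask.save("girl_mask.png")
```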

3. Stable Diffusion parameter settings: note that we need two separate runs, one for each mask mode.
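In a script, the two mask modes come down to flipping a single flag between the two runs. A hedged sketch under the same API assumptions as before, where prompt_1 and prompt_2 are the two prompts above and girl_mask.png is the uploaded mask (white = subject):

```python
import base64, io
import requests
from PIL import Image

def to_b64(img):
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return base64.b64encode(buf.getvalue()).decode()

init = Image.open("girl.jpg")
mask = Image.open("girl_mask.png")   # white = subject, black = background

prompt_1 = "best quality, masterpiece, 1girl, american, long golden hair, brown t-shirt, blue jeans"
prompt_2 = "best quality, masterpiece, 1girl standing in the middle of war-torn streets"
negative = "deformed, bad anatomy, disfigured, poorly drawn face, bad hands"

base = {
    "init_images": [to_b64(init)],
    "mask": to_b64(mask),
    "negative_prompt": negative,
    "denoising_strength": 0.75,
    "mask_blur": 4,
    "width": init.width, "height": init.height,
    "steps": 20, "sampler_name": "Euler a",
}

# Run 1: inpaint masked (redraw the subject inside the white area).
# Run 2: inpaint not masked (redraw the background outside the white area).
for name, prompt, invert in [("subject", prompt_1, 0), ("background", prompt_2, 1)]:
    payload = dict(base, prompt=prompt, inpainting_mask_invert=invert)
    r = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload, timeout=300)
    img = Image.open(io.BytesIO(base64.b64decode(r.json()["images"][0])))
    img.save(f"girl_{name}.png")
```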

4. Then generate; the results are shown below:

Batch processing

"Batch processing" can complete the "picture-to-picture" processing of a group of pictures.

1. First choose a base model and write the relevant prompt (it can also be left blank; the reason is explained below).

2. Further down, the Batch tab has its own set of parameters, as shown in the figure below:

(1) Input directory: the directory containing the original images; required.

(2) Output directory: the directory where newly generated images are stored; required.

(3) Inpaint batch mask directory: if you want mask-based inpainting, specify the directory containing the mask images here. Mask file names must correspond one-to-one with the file names in the input directory.

(4) ControlNet input directory: if ControlNet is used, specify the directory of ControlNet reference images here; leave it blank to use the files in the input directory.

(5) PNG info: extracts generation metadata from a set of images and uses it as per-image parameters during the batch img2img run. For example, if we generated a set of images earlier and now want to convert them to another style, enabling this lets the new generations reuse the original images' generation info, preserving as much of the original content as possible. To enable it, check "Append png info to prompts". It has several sub-options:

  • "PNG info directory": the directory of images from which to extract generation parameters; its file names must correspond one-to-one with those in the input directory. Defaults to the input directory.
  • "Parameters to take from png info": the generation parameters you want to reuse; check them as needed. Note that if the prompt is checked, it is appended to the img2img prompt, so if you want to use these images' prompts exactly as they are, leave the prompt and negative prompt at the top of the page blank.

3. Further down are the standard Stable Diffusion parameters. Note that if "Append png info to prompts" is checked, the parameters in the red box in the figure below are overridden during generation.

After the batch run, the newly generated images can be found in the output directory. The processing effects were already demonstrated above, so they are not shown again here.
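The Batch tab itself only exists in the browser UI, but the same workflow can be scripted: loop over the input directory, optionally read each image's embedded PNG info, and run img2img per file. A rough sketch under the same API assumptions; the directory names are placeholders, and /sdapi/v1/png-info is the WebUI API endpoint that returns a PNG's embedded generation text.

```python
import base64, io, os
import requests
from PIL import Image

WEBUI = "http://127.0.0.1:7860"
IN_DIR, OUT_DIR = "input_images", "output_images"   # placeholder directories
os.makedirs(OUT_DIR, exist_ok=True)

def to_b64(img):
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return base64.b64encode(buf.getvalue()).decode()

for name in sorted(os.listdir(IN_DIR)):
    if not name.lower().endswith((".png", ".jpg", ".jpeg")):
        continue
    path = os.path.join(IN_DIR, name)
    img = Image.open(path).convert("RGB")

    # Roughly what "Append png info to prompts" does: read the generation text
    # embedded in the original file (send the raw bytes so the metadata survives).
    raw = base64.b64encode(open(path, "rb").read()).decode()
    info = requests.post(f"{WEBUI}/sdapi/v1/png-info",
                         json={"image": raw}, timeout=60).json().get("info", "")
    # Crude parse: the prompt is the text before "Negative prompt:" / "Steps:".
    prompt = info.split("\nNegative prompt:")[0].split("\nSteps:")[0] if info else "best quality, masterpiece"

    payload = {
        "init_images": [to_b64(img)],
        "prompt": prompt,
        "denoising_strength": 0.5,
        "width": img.width, "height": img.height,
        "steps": 20, "sampler_name": "Euler a",
    }
    r = requests.post(f"{WEBUI}/sdapi/v1/img2img", json=payload, timeout=300)
    out = Image.open(io.BytesIO(base64.b64decode(r.json()["images"][0])))
    out.save(os.path.join(OUT_DIR, name))
```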

Summary

From the demonstrations above, we can see that img2img takes a reference image and generates a new image according to our instructions. During this redrawing process we can use sketches, masks, and so on to influence the result. This is already a form of precise control, but it is not enough; for finer control we need ControlNet: Stable Diffusion Basics: ControlNet for precise control.

Downloads

If you are interested in Stable Diffusion, I have compiled many SD-related models and plug-ins into a resource collection, which will continue to be updated. If you need them, follow the official account "Yinghuo Walk AI" (yinghuo6ai) and send the message "SD" to get the download link.



Origin: blog.csdn.net/javastart/article/details/134797652