Stable Diffusion for AI painting: moving toward real business use...

Preface:

        Since we are still in the early stages of learning, our knowledge is limited and misunderstandings are possible, so this article is for reference only. It is meant to share knowledge and learn together, not to serve as an authoritative guide; if you find any errors, thank you for pointing them out. The article is divided into two parts: the learning part covers theoretical knowledge, and the practical part mainly covers problems and their solutions.

Part 1: Stable Diffusion

Text-to-image (txt2img):

1. Prompt word classification and writing:

Tips:

① The AI understands phrases better than full sentences; complete grammatical structures are harder for it to parse.

1.1 Content prompt words:

Character and subject features:

Clothing: white dress

Hair: red hair, long hair

Facial features: small eyes, big mouth

Facial expression: smiling

Body movement: stretching arms

Scene features:

Indoor/outdoor: indoor, outdoor

Large scene: forest, city, street

Small details: tree, bush, white flower

Ambient lighting:

Day/night: day, night

Specific time of day: morning, sunset

Light environment: sunlight, bright, dark

Sky: blue sky, starry sky

Framing and angle:

Distance: close-up, distant

Character proportion: full body, upper body

Viewing angle: from above, view of back

Lens type: wide angle, Sony A7 III

1.2 Standardized prompt words:

Image quality prompts:

General high quality: best quality, ultra-detailed, masterpiece, hires, 8k. Specific high-resolution types: extremely detailed CG unity 8k wallpaper (ultra-fine 8K Unity game CG), unreal engine rendered (Unreal Engine rendering)

Painting style prompts: illustration, painting, paintbrush, anime, comic, game CG, photorealistic, realistic, photograph
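As a small illustration, the category lists above can be combined into a single prompt by joining phrases with commas (the group names and phrases here are just examples drawn from this section, not a fixed API):

```python
# Sketch: joining the category phrases above into one prompt string.
# Group names and phrases are illustrative examples from this section.
quality = ["best quality", "ultra-detailed", "masterpiece", "8k"]
style = ["illustration", "anime"]
subject = ["1girl", "white dress", "red hair", "long hair", "smiling"]
scene = ["outdoor", "forest", "sunset", "sunlight", "full body"]

def build_prompt(*groups):
    """Flatten phrase groups into a comma-separated prompt."""
    return ", ".join(phrase for group in groups for phrase in group)

prompt = build_prompt(quality, style, subject, scene)
print(prompt)
```

Keeping each category in its own list makes it easy to swap a scene or style without retyping the whole prompt.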

2. Prompt word weights:

① One set of parentheses: (white flower) = ×1.1 weight

② (white flower:1.5): write a colon and the desired weight directly inside the parentheses

③ (((white flower))): each additional layer multiplies the weight by 1.1, so three layers give 1.1³ ≈ 1.331×

④ Curly brackets {{{white flower}}}: each layer adds another ×1.05

⑤ Square brackets [[[white flower]]]: each layer adds another ×0.9
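The bracket rules above are plain exponentiation; a quick sketch of the arithmetic (the function name is just illustrative):

```python
# The bracket weighting rules as arithmetic: each () layer is x1.1,
# each {} layer is x1.05, each [] layer is x0.9.
def nested_weight(per_layer: float, layers: int) -> float:
    """Total weight multiplier for `layers` nested brackets."""
    return round(per_layer ** layers, 4)

print(nested_weight(1.1, 3))   # (((white flower))) -> 1.331
print(nested_weight(1.05, 3))  # {{{white flower}}} -> 1.1576
print(nested_weight(0.9, 3))   # [[[white flower]]] -> 0.729
```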

3. Negative prompt words:

As the name suggests, negative prompt words describe content that should not appear in the image.

4. Detailed explanation of drawing parameters:

4.1 Sampling method

The specific algorithm the AI uses when generating images.

Euler (a)

Suitable for illustration style

DPM++ 2M and DPM++ 2M Karras

high speed

DPM++ SDE Karras

Richer details

Tips:

Samplers whose names carry ++ are generally improved versions of the base algorithm.

4.2 Face restoration

Uses adversarial algorithms to detect faces and repair them, similar to the automatic face retouching in the Meitu app.

4.3 Tiling

Used to generate tileable texture images that can seamlessly fill the entire frame. Checking this box by accident is a common source of mistakes.

4.4 Prompt word relevance (CFG Scale)

As the name suggests, this controls how strongly the image follows the prompt. A generally safe range is 7-12; values that are too high or too low easily cause deformation.

4.5 Random seed

The seed determines the starting noise of the image. Based on my use so far, fixing the random seed lets you change content while keeping the picture close to the original.

4.6 Batch count / batch size

Because the images AI generates from the same prompt are random, the batch count generates several different images for the same prompt; you can then compare them and pick the ones that meet your expectations.

Batch size is the number of images generated within each batch; the default is one. A larger batch size generates images in parallel and can in theory improve efficiency, but it consumes more VRAM and may cause out-of-memory errors, so whether it actually helps depends on your hardware.
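A quick sketch of the arithmetic: total images are batch count times batch size. The seed rule below (each image in a run gets consecutive seeds) matches the AUTOMATIC1111 webui behavior as I understand it; treat it as an assumption to verify on your version:

```python
# Batch count runs sequentially; batch size generates images in parallel
# within one batch (limited by VRAM). Assumption: the webui gives each
# image in a run consecutive seeds (seed, seed+1, ...).
batch_count = 4
batch_size = 2
seed = 12345

total_images = batch_count * batch_size
seeds = [seed + i for i in range(total_images)]
print(total_images)  # 8
print(seeds[-1])     # 12352
```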

Part 2: Difficulties and solutions for practical business problems in AI painting (including unresolved items)

1. Drawing skills and error handling:

① Handling common drawing errors

ValueError: images do not match

Time taken: 10.55s

Torch active/reserved: 4200/5126 MiB, Sys VRAM: 7450/8192 MiB (90.94%)

Reason:

Often, to keep the pixel dimensions identical to the original image, sizes such as width 417 and height 714 are set; dimensions that are not multiples of 8 lead to this error.
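A minimal helper for avoiding this: snap any requested size to the nearest multiple of 8 (the function name `snap8` is just illustrative):

```python
# Minimal helper: snap a requested width/height to the nearest multiple
# of 8 to avoid the "images do not match" size error described above.
def snap8(x: int) -> int:
    return max(8, round(x / 8) * 8)

print(snap8(417), snap8(714))  # 416 712
```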

② Canny tips when redrawing images:

How to deal with Canny edge detection influencing the drawing too strongly:

Reduce the weight and the guidance steps. If the lines are too complex, you can raise the low threshold and the high threshold appropriately so that fewer edges are detected.

        With the above parameters in place, a redraw amplitude (denoising strength) of around 0.7 made it easier to produce a satisfactory picture in my tests.

2. AI model outfit-changing problems:

When doing partial (inpaint) redraws:

Whole picture:

① Suitable for changing the background

Only masked:

① For a full-body photo, redrawing only the masked area can deform the picture, for example flying hair or clothes extending beyond their outline.

② In sketch redraw mode, only a certain part of the picture is modified, such as fixing mismatched eyes; this suits repairing small areas.

Mask blur:

The larger the value, the softer the boundary; the smaller the value, the harder the boundary. If it is too large, extra limbs are easily produced.

In partial redraw mode, keeping the subject but changing the background can derive extra limbs, hair, or clothing from the original model. To solve this:

Approach:

① Selecting "whole picture" instead of "only masked" for the redraw area may solve the problem.

② For masked content, selecting "original" preserves the unmasked area best. "Fill" can also stay close to the original image, but it may extend hair and so on; it is unstable, sometimes good and sometimes bad.

[Image: original picture]

[Image: fill]

[Image: latent space noise]

Advanced (if the basics above are set correctly and problems remain, consider the following):

① In 3D Openpose, scale the skeleton down slightly from the original using the scroll wheel. ② Reduce the mask blur value.

Tips:

AI painting is random; even with all of the above correct, problems can still occur. Since the model still generates from the prompt, you can add prompt words that counteract likely derivations, such as short hair or bald head (to prevent derived hair) or underwear (to prevent clothes and pants from being drawn too long).

For size errors, you only need to set the width and height to multiples of 8, such as 400×600 or 600×800, depending on the situation.

3. AI model face-swapping business (still being optimized):

[Image: original picture]

[Image: result]

Due to space constraints I won't show too many examples; let's go through the implementation steps and the details to watch for.

Implementation steps:

1. Tag configuration:

Mine are all default configurations without much optimization. For example, if you want the skin to be whiter, you can add (white face) before the positive prompt, or (white face:1.2) to increase its weight. This is very simple, so I won't elaborate.

Positive tag:

best quality, ultra high res, (photorealistic:1.4),1girl,close mouth

Reverse tag:

(((gape))), paintings, sketches, (worst quality:2), (low quality:2), (normal quality:2), lowres, normal quality, ((monochrome)), ((grayscale)), skin spots, acnes, skin blemishes, age spot, glans, bikini, medium breast, nevus, nsfw, (((neck too long))), (((wrinkle)))

2. Select local redraw (inpaint sketch):

Use the brush to select the area that needs to be redrawn; the dot on the right can be dragged to adjust the brush thickness.

3. Configuration parameters

Just copy the settings below. Points worth noting: the width and height should be similar to the original picture (the result is better) and ideally multiples of 8; 417×714 will cause a drawing error. The default number of iteration steps is 20; 28 improves the fineness of the picture. It is best to enable face restoration, which makes distorted faces less likely. Mask blur works somewhat like sharpening in Photoshop; depending on the situation, 4-8 is recommended. For masked content, choosing "fill" makes the picture more harmonious. I won't go into the remaining parameters.
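For reference, the settings described above could be expressed as an inpaint request payload. The field names follow the AUTOMATIC1111 webui API (/sdapi/v1/img2img) as I know it; treat them as assumptions and check them against your own webui version:

```python
# Sketch of the settings above as an inpaint payload. Field names follow
# the AUTOMATIC1111 webui API (/sdapi/v1/img2img) as an assumption.
payload = {
    "prompt": "best quality, ultra high res, (photorealistic:1.4), 1girl, close mouth",
    "steps": 28,            # default is 20; 28 improves fineness
    "restore_faces": True,  # reduces distorted faces
    "mask_blur": 6,         # 4-8 recommended
    "inpainting_fill": 0,   # 0 = fill for masked content (assumed mapping)
    "width": 600,           # keep close to the original and a multiple of 8
    "height": 800,
}
assert payload["width"] % 8 == 0 and payload["height"] % 8 == 0
```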

4. Open 3D Openpose to adjust the character's pose

① Click the file menu as shown, select Detect from Picture, and import the image to be modified; the background image and pose shown below will be generated. Then in the settings enable move mode and adjust the skeleton until it roughly overlaps the background, as shown in Figure 2. You can adjust the distance, that is, the size of the skeleton, with the mouse wheel: the larger it is, the closer it is to the camera. Keep it similar to the background image, then click Generate. Also, don't forget to set the width and height in the upper right corner to match the canvas; for example, mine is 600×800, so I set 600×800.

5. Send to ControlNet

Send the first result to channel 0. If that channel is occupied, it can go to 1, 2, 3, and so on; close the others. Then click "send to img2img" to return to the drawing interface.

6. Enable in ControlNet

Click Enable in ControlNet, otherwise it will have no effect. Then select None for the preprocessor and Openpose for the model. There is also a width/height adjustment below; keep it the same as the canvas, for example mine is 600×800. In short, wherever width and height are involved, it is better to keep them all unified to avoid confusion.

7. Generate pictures and fine-tune details

After completing the above steps, click the Generate button. If something looks wrong, make fine adjustments, such as defining the hairstyle in the tags. AI drawing rarely produces a satisfactory picture in one pass.

You can manually adjust the pose on the left side, or fine-tune the input parameters on the right. The left side is prone to deformities and harder to fix; if tweaking the parameters on the right still produces better output, continue generating, send to ControlNet, click preprocessing preview, and keep generating until you are satisfied with the image.

Through repeated adjustment, a more satisfactory picture is finally produced. You can also partially redraw it, for example to modify the hairstyle.

 

4. Local optimization of AI painting:

1. First, the business scenario in the picture above

        AI has its own ideas about edge processing and always likes to add superfluous content. This can't really be blamed on the model; it may be interference from parameters and other reference images. What we can do is state our needs as clearly as possible. In the case below, I want to keep the original image while modifying the background and face, but many unwanted derivations appear at the edges. How can we handle this so the result matches expectations?

[Image: original picture]

[Image: modification 1.0]

2. Implementation method:

        When derived parts appear and you want to change their color with sketch redrawing, the following parameters work best: in particular, set the mask blur to 2 (too low makes the boundary too hard, too high spreads the effect too far), set the redraw amplitude to 0.3, and select "original" for masked content, which best restores the desired effect.

        The specific parameters are as follows:

3. Specific implementation:

        ① Use the color picker in FastStone Capture (search for it if you don't have it) to pick the color of the jeans, paint the derived part in the target color, and then set redraw amplitude 0.3, mask blur 2, and masked content "original". These three parameters are very important. Then constrain the pose and content through Openpose + Canny (I assume Openpose and Canny are familiar; there are tutorials on Bilibili).
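The three key settings from this step, written as webui-style inpaint parameters; the field names and the fill-mode mapping (1 = "original") are assumptions based on the AUTOMATIC1111 API, so verify them on your version:

```python
# The three key settings for graffiti touch-ups, as webui-style inpaint
# parameters. Field names and the fill-mode mapping are assumptions.
touchup = {
    "denoising_strength": 0.3,  # redraw amplitude
    "mask_blur": 2,             # too low: hard edge; too high: bleed
    "inpainting_fill": 1,       # masked content = original (assumed)
}
print(touchup)
```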

 4.Result display:

[Image: modification 2.0]

5. Summary:

       This is already fairly close to the requirements. AI drawing is actually not difficult; if something does not meet the requirements, just keep fine-tuning. In practice the steps are not as simple as described above, and producing even a reasonably satisfactory picture inevitably takes time. The pain point of AI drawing: adjusting parameters takes about 10% of the time, generating takes about 80%, and adjusting and testing take the remaining 10%. I am still in the learning stage, and I would be glad to hear of better methods.


Origin blog.csdn.net/weixin_54515240/article/details/130563820