AI painting: getting started quickly with Stable Diffusion



This is Yasin's 89th original article


mj vs sd

Recently, along with the popularity of ChatGPT, AI painting has also taken off, especially Midjourney (hereinafter "mj"), which can generate AI images from text keywords and in a variety of specified styles.

Here are some random pictures I drew with mj:

[images: sample pictures generated with mj]

mj is fairly easy to get started with, so I won't go into detail here; interested readers can search on Bilibili or Douyin, where there are plenty of tutorials. Friendly reminder: using mj requires "scientific Internet access" (i.e., a VPN).

While learning about mj, I came across another AI painting tool called Stable Diffusion (hereinafter "sd"). Compared with mj, sd has the following advantages:

  1. Runs locally, no VPN needed: once the required resources are downloaded, everything runs on your own machine;

  2. Open source and free: sd is an open-source, free painting platform, while mj is paid; each mj account only gets a free quota of a few dozen images;

  3. Highly customizable: sd has a wide range of open-source models and plug-ins; you can train your own style, do partial edits, fine-tune details, and even use photos of yourself or your pets to customize AI paintings in exactly the style you want.

So today I will sort out the basic information for getting started with sd for your reference.

Install

sd is an AI painting platform: the bottom layer consists of various AI painting models (with the official SD models as the base), and the top layer is the UI.

First you need to install the sd web UI. It is an open-source GitHub project: AUTOMATIC1111/stable-diffusion-webui (https://github.com/AUTOMATIC1111/stable-diffusion-webui).

For macOS on Apple Silicon (M-series) chips, there is a dedicated installation guide: "Installation on Apple Silicon" in the AUTOMATIC1111/stable-diffusion-webui GitHub wiki.

The installation process is fairly slow because a large number of Python packages have to be downloaded. If you are in China, you can use this command to point pip at the Tsinghua University mirror, which makes the downloads much faster:

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
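
For reference, on Linux or macOS the whole install usually boils down to something like the sketch below (based on the project's README; the exact Python version required may differ by webui release):

# Clone the web UI and launch it; the first run creates a Python virtual
# environment and downloads all dependencies, which is the slow part.
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui
./webui.sh
# When startup finishes, open http://127.0.0.1:7860 in a browser.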

After installing the web UI, you can download the SD model checkpoints. sd 1.5 should come bundled (the first launch downloads it), but 2.0 has some optimizations for landscape-type images, so you can also install 2.0 and 2.1 and switch between them at any time later.
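
If you want to add another checkpoint by hand, the downloaded file just goes into the web UI's models/Stable-diffusion folder. A rough sketch (the URL and file name below are the sd 2.1 release on huggingface.co as I remember it, so double-check the model page before downloading):

# Download the SD 2.1 (768px) checkpoint into the folder the web UI scans for models.
cd stable-diffusion-webui
wget -P models/Stable-diffusion/ \
  https://huggingface.co/stabilityai/stable-diffusion-2-1/resolve/main/v2-1_768-ema-pruned.safetensors
# Depending on the web UI version, a 2.x model may also need its matching .yaml
# config placed next to the checkpoint. Restart the web UI (or hit the refresh
# button next to the model dropdown) and select the new checkpoint.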


There is also a project called Easy Diffusion that claims to offer one-click installation, but in my own attempt the install failed partway through and I couldn't find a solution, so I ended up installing from the GitHub project above instead.

Interface operation

After installation, run webui.sh to start. The opening interface generally looks like this:

[screenshot: the stable-diffusion-webui main interface]

A rough walkthrough:

  1. Model selection: you can choose an official model or a third-party model you have loaded yourself. For example, my current model is a community anime ("two-dimensional") style model fine-tuned from sd 1.5. Click the refresh button next to the dropdown to pick up newly installed models.

  2. Operation tabs: the most commonly used are txt2img (text to image) and img2img (image to image). Some of the later tabs are for training or come from plug-ins, so I won't expand on them here.

  3. Prompts: the most important part of generating an image is describing what you want in words; there is also a negative prompt for removing unwanted elements. Prompts have some special syntax rules, which will be explained in a separate article later.

  4. Parameter area: there are many parameters here. People usually enable face restoration, set the width and height of the image, and adjust the CFG scale, which controls how closely the result follows the prompt: lower values give the model more freedom, higher values stick more closely to your words, and 7~10 is generally recommended. You can also set how many images per batch and how many batches to run in total; the API sketch after this list shows how these parameters map onto the web UI's HTTP interface.

  5. Image area: you can save the generated image or send it to img2img for further processing.
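
If you prefer scripting over clicking, the web UI also exposes an HTTP API when launched with the --api flag; the sketch below shows roughly how the parameters above map to the /sdapi/v1/txt2img endpoint (field values are just examples):

./webui.sh --api    # start the web UI with its HTTP API enabled

# Text-to-image request: prompt, negative prompt, CFG scale, size and batch settings.
curl -s -X POST http://127.0.0.1:7860/sdapi/v1/txt2img \
  -H "Content-Type: application/json" \
  -d '{
        "prompt": "1girl, school uniform, smile",
        "negative_prompt": "bad anatomy, blurry",
        "steps": 20,
        "cfg_scale": 7,
        "width": 512,
        "height": 512,
        "restore_faces": true,
        "batch_size": 1,
        "n_iter": 1
      }' > result.json
# result.json contains the generated image(s) base64-encoded in its "images" array.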

Model and Lora

The official sd models (1.5, 2.0, 2.1) are linked from the installation page above, and you can also download them from huggingface.co.

For community models, I recommend civitai.com. It hosts a large number of models and Loras trained by others, in all kinds of styles, though I don't know why they are basically all models of girls...



Here is a rough explanation of the difference between a model and a Lora:

  • Model (checkpoint): trained from the original sd base on a large number of images; you can think of it as switching to a different painter.

  • Lora: a small add-on trained on top of a base model and used together with it; you can think of it as adding a Photoshop layer on top.

A full model is relatively large (a few GB), while a Lora is small, usually on the MB scale. There are other model types, but checkpoints and Loras are the ones I use most often; I won't go into the principles and differences of each type here.
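
In practical terms, the two kinds of files live in different folders under the web UI, and a Lora is activated from the prompt itself. A quick sketch (folder layout of the AUTOMATIC1111 web UI; the Lora file name is just an example from civitai.com):

# Where files go:
#   stable-diffusion-webui/models/Stable-diffusion/   <- full checkpoints (.ckpt / .safetensors), a few GB each
#   stable-diffusion-webui/models/Lora/               <- Lora files, usually tens to hundreds of MB
#
# How a Lora is used, with an optional weight after its name:
#   <lora:cuteGirlMix4_v10:1>, 1girl, school uniform, smile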


Here are a few recommended models or plugins:

  1. Anime ("two-dimensional") style: anything-v3

  2. Chinese style: an ancient-Chinese-style game character model with a 2.5D feel

  3. dreamlike: close to mj's style; bright colors and a cool, illustration-like feel

  4. Realistic style: protogen

Highly customizable

Optimization of details

SD is not yet mature enough to handle some details well, so the community has built plug-ins to address this kind of problem; the best known is ControlNet.

Simply put, it lets you control the rough position and shape of the generated image using a sketch or an existing image, which is very useful for controlling pose and layout.
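
For reference, installing ControlNet usually means adding the sd-webui-controlnet extension plus its model files; a rough sketch (you can equally use the Extensions > Install from URL tab in the web UI, and the model files are downloaded separately):

# Install the ControlNet extension into the web UI's extensions folder.
cd stable-diffusion-webui/extensions
git clone https://github.com/Mikubill/sd-webui-controlnet.git

# Put the ControlNet model files you download (openpose, canny, etc.) here,
# then restart the web UI; a ControlNet panel appears under txt2img / img2img:
#   stable-diffusion-webui/extensions/sd-webui-controlnet/models/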


There are many ControlNet tutorials on Bilibili; you can learn there how to use it to control poses:

[image: ControlNet pose control example]

Train your own style

SD can also be trained on your own images to produce a model in your own style (for example, so that every girl it draws has the face from your training photos, or the dogs it draws look like your dog). Training is integrated into the web UI, which makes it convenient to use. I haven't tried this part yet, so I'll cover it in detail once I have.


Bonus: some pictures drawn today

Below are images generated with the same prompt and the same parameters under different models:

prompt: <lora:cuteGirlMix4_v10:1>, 1 girl, sitting posture, desk, books, backpack, school uniform, big eyes, smile, looking upward, frontal view

negative prompt: ((disfigured)), ((bad art)), ((deformed)),((extra limbs)),((close up)),((b&w)), wierd colors, blurry, (((duplicate))), ((morbid)), ((mutilated)), [out of frame], extra fingers, mutated hands, ((poorly drawn hands)), ((poorly drawn face)), (((mutation))), (((deformed))), ((ugly)), blurry, ((bad anatomy)), (((bad proportions))), ((extra limbs)), cloned face, (((disfigured))), out of frame, ugly, extra limbs, (bad anatomy), gross proportions, (malformed limbs), ((missing arms)), ((missing legs)), (((extra arms))), (((extra legs))), mutated hands, (fused fingers), (too many fingers), (((long neck))), Photoshop, video game, ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, mutation, mutated, extra limbs, extra legs, extra arms, disfigured, deformed, cross-eye, body out of frame, blurry, bad art, bad anatomy,
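
As an aside, the stacked parentheses in the negative prompt are the web UI's attention syntax (my understanding of the AUTOMATIC1111 conventions):

# (word)       increase attention to "word" by a factor of about 1.1
# ((word))     stack parentheses to multiply that effect again
# (word:1.5)   set an explicit weight
# [word]       decrease attention by a factor of about 1.1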

[images: the same prompt rendered under four different models]

Hmm... I still prefer the realistic style. How about you?


About the author

I'm Yasin, a techie who loves to blog

WeChat public account: compiled a program (blgcheng)

Personal website: https://yasinshaw.com

Welcome to follow this public account!



Reprinted from: blog.csdn.net/yasinshaw/article/details/129828377