This is Yasin's 89th original article
mj vs sd
Recently, with the popularity of ChatGPT, AI painting has also taken off. Midjourney (hereinafter referred to as mj) in particular can generate AI images from text keywords, and can produce all kinds of styles.
Here are some random pictures I drew with mj:
mj is fairly easy to get started with, so I won't go into detail here; interested readers can search Bilibili or Douyin, where there are plenty of tutorials. Friendly reminder: using mj requires a VPN to access from mainland China.
While learning about mj, I came across another AI painting tool called Stable Diffusion (hereinafter referred to as sd). Compared with mj, sd has the following advantages:
Runs locally, no VPN needed: once the required resources are downloaded, everything runs on your own machine;
Open source and free: sd is an open-source, free painting platform, while mj charges a fee (each account only gets a free quota of a few dozen images);
Highly customizable: sd has a wide range of open-source models and plug-ins. You can train your own style, make partial edits, fine-tune details, and even use photos of yourself or your pets to create highly customized AI paintings in exactly the style you want.
So today I will sort out the basic information for getting started with sd for your reference.
Install
sd is an AI painting platform: the bottom layer is a set of AI painting models (with the official SD model as the base), and the top layer is a web UI.
First you need to install the sd web interface. It is a github open source project, the address is: GitHub - AUTOMATIC1111/stable-diffusion-webui: Stable Diffusion web UI.
For macOS on Apple Silicon (M-series) chips, there is a dedicated installation document: Installation on Apple Silicon · AUTOMATIC1111/stable-diffusion-webui Wiki · GitHub
The installation process is fairly slow because a large number of Python packages need to be downloaded. If you are in China, you can use this command to point pip at the Tsinghua University mirror, which makes downloads much faster:
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
After installing the web UI, you can download the SD models themselves. The web UI should come with sd 1.5 built in, but 2.0 has some optimizations for landscape-type images, so you can also install 2.0 and 2.1 and switch between them at any time later.
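As a side note, AUTOMATIC1111's webui loads downloaded checkpoints from the models/Stable-diffusion/ directory inside the repo. Here is a small illustrative helper (find_checkpoints and the example path are my own names, not part of the webui) that lists which model files are currently installed:

```python
from pathlib import Path

def find_checkpoints(models_dir):
    """List model checkpoint files (.ckpt / .safetensors) under a directory."""
    root = Path(models_dir)
    if not root.is_dir():
        return []
    exts = {".ckpt", ".safetensors"}
    return sorted(p.name for p in root.rglob("*") if p.suffix in exts)

# Downloaded models go under stable-diffusion-webui/models/Stable-diffusion/;
# adjust the path to wherever you cloned the repo.
print(find_checkpoints("stable-diffusion-webui/models/Stable-diffusion"))
```

After dropping a new checkpoint into that folder, the refresh button next to the model dropdown in the UI will pick it up without a restart.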
There is also a tool called Easy Diffusion that claims to offer a one-click install, but in my own attempt the installation failed partway through and I couldn't find a fix, so I installed from the github address above instead.
interface operation
After installation, run webui.sh to start it. The opening interface generally looks like this:
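If the default startup doesn't suit your machine, webui.sh reads extra flags from the COMMANDLINE_ARGS variable in webui-user.sh. The flags below are real webui options, but whether you need any of them depends on your hardware; this is just a sketch of the config file:

```shell
# stable-diffusion-webui/webui-user.sh -- edit before running ./webui.sh
# Pick only the flags your setup actually needs:
# --medvram : trade some speed for lower VRAM usage on small GPUs
# --listen  : accept connections from the LAN, not just localhost
# --api     : expose the JSON API under /sdapi/v1/
export COMMANDLINE_ARGS="--medvram --listen --api"
```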
A rough explanation:
Model selection: you can choose an official model or a third-party model you have installed yourself. For example, my current model is an anime-style model from the community, trained on top of sd 1.5. Click the refresh button next to it to pick up newly installed models.
Operation types: the most commonly used are text-to-image (txt2img) and image-to-image (img2img). Some of the later tabs are training plug-ins, which I won't expand on here.
Prompt words: the most important part of generating an image is describing what you want in words; there are also negative prompt words to exclude certain elements. Prompts follow some special rules, which I'll explain in a separate article later.
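As a preview, a few of the webui's prompt weighting rules are worth knowing up front (these are AUTOMATIC1111 conventions; the example words are arbitrary):

```text
(blue sky)                  # parentheses raise a word's weight by x1.1
((blue sky))                # nesting multiplies: x1.21
(blue sky:1.5)              # or give an explicit weight
[blue sky]                  # square brackets reduce weight by /1.1
<lora:some_lora_name:0.8>   # apply an installed Lora at strength 0.8
```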
In the parameter area there are many settings. Painters generally check "restore faces", then set the width and height of the image and the CFG scale (how closely the model follows your prompt): the lower it is, the more freedom the model has; the higher it is, the more closely the output matches your words. 7~10 is generally recommended. You can also set how many images per batch and how many batches to run in total.
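These same parameters can also be driven programmatically: when the server is started with --api, the webui exposes a JSON endpoint at /sdapi/v1/txt2img. A minimal sketch that just builds the request body (the field names are the real API fields; txt2img_payload is my own helper name, and the URL in the comment assumes the default port):

```python
import json

def txt2img_payload(prompt, negative_prompt="", width=512, height=512,
                    cfg_scale=7, steps=20, batch_size=4, n_iter=2,
                    restore_faces=True):
    """Build the JSON body for webui's /sdapi/v1/txt2img endpoint
    (available when the server is started with --api)."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "width": width,
        "height": height,
        "cfg_scale": cfg_scale,      # prompt adherence: ~7-10 is a good range
        "steps": steps,
        "batch_size": batch_size,    # images per batch
        "n_iter": n_iter,            # number of batches
        "restore_faces": restore_faces,
    }

body = txt2img_payload("1girl, school uniform, smile",
                       negative_prompt="lowres, bad anatomy")
print(json.dumps(body, indent=2))
# POST it with e.g. requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=body)
```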
In the image operation area, you can save the generated images, or send one to img2img for further processing.
Model and Lora
The official SD models (1.5, 2.0, 2.1) are linked from the installation page above, and they can also be downloaded from huggingface.co.
For community models I recommend civitai.com. It hosts a huge number of models and Loras trained by others, in all kinds of styles. Though I'm not sure why they are mostly models of girls...
Here is a rough explanation of the difference between a model and a Lora:
Model: trained from a large number of images on top of the original sd model; you can think of it as swapping in a different painter.
Lora: a trained add-on used in combination with a model; you can think of it as applying an extra Photoshop-style layer on top.
A full model is relatively large (a few GB), while a Lora is small, usually at the MB level. There are other model types as well, but base models and Loras are the ones I use most often; I won't go into the underlying principles and differences here.
Here are a few recommended models or plugins:
Anime style: anything-v3
Chinese style: an ancient-Chinese-style game character model with a 2.5D feel
dreamlike: close to mj's style; bright colors, cool, illustration-like
Photorealistic style: protogen
highly customizable
Optimization of details
SD is not yet mature at handling certain details, so community plug-ins have appeared to address this kind of problem; the best-known is ControlNet.
Simply put, ControlNet lets you steer the rough position and shape of the generated image using a sketch or an existing image. It is very useful for controlling the pose and layout of generated images.
There are many ControlNet tutorials on Bilibili; for example, you can learn to control poses:
train your own style
SD can also train a model in your own style from your own pictures (for example, so that every girl it draws has the face from your training images, or the dogs it draws resemble your dog). This is integrated into the web UI, which makes it convenient to use. I haven't tried this myself yet, so I'll cover it in detail once I have.
Bonus: some pictures drawn today
The following images were generated from the same prompt and the same parameters under different models:
prompt: <lora:cuteGirlMix4_v10:1> 1 girl, sitting posture, desk, books, backpack, school uniform, big eyes, smile, looking upward, frontal view
negative prompt: ((disfigured)), ((bad art)), ((deformed)),((extra limbs)),((close up)),((b&w)), wierd colors, blurry, (((duplicate))), ((morbid)), ((mutilated)), [out of frame], extra fingers, mutated hands, ((poorly drawn hands)), ((poorly drawn face)), (((mutation))), (((deformed))), ((ugly)), blurry, ((bad anatomy)), (((bad proportions))), ((extra limbs)), cloned face, (((disfigured))), out of frame, ugly, extra limbs, (bad anatomy), gross proportions, (malformed limbs), ((missing arms)), ((missing legs)), (((extra arms))), (((extra legs))), mutated hands, (fused fingers), (too many fingers), (((long neck))), Photoshop, video game, ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, mutation, mutated, extra limbs, extra legs, extra arms, disfigured, deformed, cross-eye, body out of frame, blurry, bad art, bad anatomy,
Hmm... I still prefer the photorealistic style. How about you?
about the author
I'm Yasin, a techie who loves to blog
WeChat official account: blgcheng
Personal website: https://yasinshaw.com
Welcome to follow the account.