Stable Diffusion resource directory
Brief introduction of SD
Stable Diffusion is a deep learning text-to-image generation model released in 2022. It is mainly used to generate detailed images from text descriptions, although it can also be applied to other tasks, such as inpainting, outpainting, and prompt-guided image-to-image translation.
The technique used in SD is called a diffusion algorithm, and its detailed design is a very complicated subject. The general idea is to blur a picture with noise and then, guided by keywords, make the content concrete bit by bit until we end up with what we described.
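The blur-then-restore idea can be sketched with a toy example. This is only an illustration under stated assumptions: the linear noise schedule and its values (`beta_start`, `beta_end`, 1000 steps) are illustrative choices, the "image" is four numbers, and the noise is handed back to the restoring step directly. In a real diffusion model a trained network has to estimate that noise, guided by the prompt.

```python
import math
import random


def alpha_bar(t, steps, beta_start=1e-4, beta_end=0.02):
    """Share of the original signal's variance left after t noising steps."""
    product = 1.0
    for i in range(t):
        # Assumed linear schedule: a little more noise is added at each step.
        beta = beta_start + (beta_end - beta_start) * i / (steps - 1)
        product *= 1.0 - beta
    return product


def add_noise(x0, eps, a_bar):
    """Forward process: blend the clean values with Gaussian noise."""
    return [math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e
            for x, e in zip(x0, eps)]


def remove_noise(xt, eps, a_bar):
    """Undo the blend, given an estimate of the noise (here: the exact noise)."""
    return [(x - math.sqrt(1.0 - a_bar) * e) / math.sqrt(a_bar)
            for x, e in zip(xt, eps)]


random.seed(0)
steps = 1000
x0 = [0.0, 0.5, 1.0, 0.5]                    # toy "image": four pixel values
eps = [random.gauss(0.0, 1.0) for _ in x0]   # the noise that blurs the picture

a_bar = alpha_bar(500, steps)
xt = add_noise(x0, eps, a_bar)               # noisy version of the picture
restored = remove_noise(xt, eps, a_bar)      # matches x0 up to rounding error
```

The further the noising goes, the smaller `alpha_bar` becomes, so less of the original picture survives and the model has more to fill in from the description.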
I have known about this technology since September 2022, but I never looked in detail at how to use it in practice.
This year I looked, on and off, at how to run it on Alibaba Cloud and at articles from some public accounts. I also followed the popular hidden QR codes and the light-and-shadow street images with hidden text. But I never actually tried it myself.
This weekend, I found and re-read the materials I had seen before and put them into practice. Below is my summary of this technique.
SD installation
Running SD locally requires a good discrete graphics card. If your graphics card does not meet the requirements, you can use an Alibaba Cloud server or a cloud desktop instead.
- Local installation takes three steps. First, install the Python environment and add it to the environment variables. Second, install the CUDA version your device supports; running nvidia-smi shows the CUDA version supported by your device.
- You can download the corresponding version of CUDA from the CUDA download page.
- Then install git, because you need to clone the webui project from GitHub, and some plug-ins can also be installed with git clone.
- Besides this fully manual method, you can also find launcher installation packages made by community experts on Bilibili. These packages bundle the whole runtime environment, so they are very convenient to use.
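Before a manual install, it can help to check that the tools named above are reachable from the command line. The following is a minimal sketch using only the Python standard library; the command names in `REQUIRED` are assumptions (on some systems Python is installed as `python3`), and the helper `find_tools` is hypothetical, not part of any SD project.

```python
import shutil


def find_tools(names):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


# Tools mentioned in the installation steps; adjust the names for your system.
REQUIRED = ["python", "git", "nvidia-smi"]

status = find_tools(REQUIRED)
for name, path in status.items():
    print(f"{name}: {path or 'NOT FOUND'}")
```

If `nvidia-smi` is reported as missing, the graphics driver is probably not installed or not on PATH, and the CUDA step cannot be checked yet.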
Model download
After the project starts, you need to download some necessary large models, and SD works with several kinds of models.
- checkpoint: This is the source of data SD needs to run. These large models have large files, generally above 2 GB. It is like a large dictionary that SD needs for painting.
- VAE: This colors the picture and can be understood as a filter. It makes the picture more vivid and eye-catching.
- Embedding: SD pictures are created from text descriptions, that is, keywords. This model is like a ready-made package of keywords. You no longer need to describe a lot of content: put the model into the corresponding folder, then write the corresponding Embedding name in the keywords.
- Hypernetwork: This is like a bookmark. Because the dictionary is very large, it helps the pictures SD makes match our requirements more closely.
- Lora: Lora is like an upgraded version of Hypernetwork. It adds a more detailed description of the requirements on top of a large model and fine-tunes the large model's results.
- LyCORIS: LyCORIS is an upgrade of Lora, with a more refined algorithm and more options to adjust.
All these models exist to give our description something to build on and to give SD a range to paint within. A large model is trained on a very large data set, so at run time it needs other, smaller models to control the result precisely. That is why there are so many kinds of models.
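Each kind of model has to be placed in its own folder before SD can use it. The sketch below shows one way to look up the destination; the folder names in `MODEL_DIRS` are an assumption about a typical webui checkout and may differ between versions and plug-ins, so check them against your own installation. The helper `target_dir` is hypothetical.

```python
from pathlib import Path

# Assumed folder layout of a webui checkout; verify against your own install.
MODEL_DIRS = {
    "checkpoint": "models/Stable-diffusion",
    "vae": "models/VAE",
    "embedding": "embeddings",
    "hypernetwork": "models/hypernetworks",
    "lora": "models/Lora",
    "lycoris": "models/LyCORIS",
}


def target_dir(webui_root, model_type):
    """Return the folder a downloaded model of the given type belongs in."""
    key = model_type.lower()
    if key not in MODEL_DIRS:
        raise ValueError(f"unknown model type: {model_type!r}")
    return Path(webui_root) / MODEL_DIRS[key]
```

For example, `target_dir("webui", "Lora")` points at the `models/Lora` folder under the `webui` directory.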
The following are some commonly used model download addresses:
- of the state
- huggingface
- esheep (the connection is more stable than the first two)
- liblibai (the connection is more stable than the first two)
- openart
Keywords and descriptive sentences
Prompts: positive text prompts and negative text prompts
The most important thing is to decide what you want to express, that is, the theme, because the content serves the theme.
- Characters and subject features
- Scene features
- Ambient light
- Supplementary details and frame angle
In addition to describing the content, some keywords describing image quality are also required:
- High definition
- A specific high resolution
- Painting style prompts (illustration, anime, realistic)
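The two checklists above can be turned into a small helper that assembles a prompt from groups of keywords. This is only a sketch: `build_prompt` is a hypothetical helper, and every keyword below is a made-up example rather than a recommendation.

```python
def build_prompt(*groups):
    """Join keyword groups into one comma-separated prompt, skipping blanks."""
    keywords = [kw.strip() for group in groups for kw in group if kw.strip()]
    return ", ".join(keywords)


# Made-up example keywords, one list per item of the checklists.
subject = ["a girl", "long silver hair"]
scene = ["rainy street at night"]
lighting = ["neon light"]
framing = ["from above"]
quality = ["high definition", "8k", "illustration"]

positive = build_prompt(subject, scene, lighting, framing, quality)
negative = build_prompt(["low quality", "blurry"])
```

Keeping the groups separate makes it easy to change the theme while reusing the same quality and style keywords.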
Prompt weight: you can increase a keyword's weight by wrapping it in parentheses, or control it by adding a specific multiplier after the keyword.
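Both ways of weighting can be sketched as string helpers. The syntax here is an assumption based on the convention common in webui front ends, where each pair of parentheses multiplies the weight by about 1.1 and `(keyword:1.3)` sets an explicit multiplier; the helper names are hypothetical.

```python
def emphasize(keyword, levels=1):
    """Wrap a keyword in nested parentheses, one pair per emphasis level."""
    return "(" * levels + keyword + ")" * levels


def weighted(keyword, weight):
    """Attach an explicit multiplier, for example (keyword:1.3)."""
    return f"({keyword}:{weight:g})"


def implied_weight(levels, factor=1.1):
    """Multiplier implied by nested parentheses (assumes 1.1 per pair)."""
    return factor ** levels
```

For instance, `emphasize("neon light", 2)` gives `((neon light))`, which under the assumed factor corresponds to a weight of about 1.21, while `weighted("neon light", 1.3)` gives `(neon light:1.3)`.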
Prompt word website:
Plugin management
After installing SD from GitHub, you can start drawing. To make the software easier to use, we can also install some plug-ins:
- Chinese localization language plugin
- Gallery browser plugin (image browser)
- Prompt plugin
- Ultimate Upscale (enlarges the picture; appears as a script in the text-to-image tab)
- Local Latent Couple (LLUl) (local detail optimization plugin)
- after-detailer (helps beautify face or hand details)
- controlNet
Plug-ins are basically hosted on GitHub, and can also be downloaded and installed from other code platforms such as Gitee.
controlNet
controlNet is a plug-in for fine-grained control of the picture. It needs some metadata of the picture, for example channel information, which can be obtained with Photoshop (PS).
- Control the pose
- Control the depth of the picture
- Control the edges in the picture
- Control the light and shadow
- Generate artistic QR codes
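To make the "edges" item more concrete, here is a toy sketch of what an edge map is: a picture that keeps only the places where brightness changes sharply. This is not controlNet's own preprocessing, which uses more sophisticated detectors; it is a simple difference between neighbouring pixels, and `edge_map` and its `threshold` are hypothetical.

```python
def edge_map(gray, threshold=0.5):
    """Mark pixels whose brightness jumps sharply to the right or downwards."""
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Difference to the right and lower neighbour (0 at the border).
            dx = gray[y][x + 1] - gray[y][x] if x + 1 < width else 0.0
            dy = gray[y + 1][x] - gray[y][x] if y + 1 < height else 0.0
            if (dx * dx + dy * dy) ** 0.5 >= threshold:
                edges[y][x] = 1
    return edges


# Toy 4x4 grayscale picture: dark left half, bright right half.
picture = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
edges = edge_map(picture)   # a vertical line where dark meets bright
```

An edge map like this, taken from a reference picture, is the kind of input that lets the plug-in hold the outlines fixed while the prompt changes everything else.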
Train the model yourself
If the above is not enough to meet our requirements, we can train a model ourselves:
- Find the data source: some photos of an object
- Tag the pictures through SD, that is, identify the content of each picture, as the basis for learning
- Give them to SD to learn
- The data source should preferably be of uniform size
- If the local graphics card is not good enough, you can get three months of free PAI cloud service from Aliyun
- Use the project on GitHub, Lora_Trainer, on Colab for data training
- After training is complete, select the best-performing model with SD's xyz script
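The advice about uniform picture size can be checked before training. The sketch below reads the width and height from the header of PNG data using only the standard library and groups the pictures by size. It assumes the training pictures are PNG files; all helper names are hypothetical, and `fake_png_header` exists only to make the demo self-contained.

```python
import struct
from collections import defaultdict

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data):
    """Read (width, height) from the IHDR chunk at the start of PNG bytes."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    return struct.unpack(">II", data[16:24])


def group_by_size(images):
    """Group file names by picture size; more than one group means mixed sizes."""
    groups = defaultdict(list)
    for name, data in images.items():
        groups[png_size(data)].append(name)
    return dict(groups)


def fake_png_header(width, height):
    """Build just enough bytes for png_size (demo only, not a valid picture)."""
    return (PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR"
            + struct.pack(">II", width, height))


images = {
    "a.png": fake_png_header(512, 512),
    "b.png": fake_png_header(512, 512),
    "c.png": fake_png_header(768, 512),
}
groups = group_by_size(images)   # c.png ends up in a group of its own
```

With real files, the dictionary passed to `group_by_size` would map each file name to the bytes read from disk; any picture that lands outside the largest group is a candidate for cropping or resizing.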