InsCode Stable Diffusion Tutorial [InsCode Stable Diffusion Photo Event Phase 1]

A record of how to use InsCode Stable Diffusion for AI drawing.

1. Background introduction

Currently there are two mainstream AI painting tools that can be used in production work: Midjourney (MJ for short) and Stable Diffusion (SD for short). MJ is paid, while SD is open source and free; however, SD has a steeper learning curve and is demanding on computer hardware (graphics card, memory).

Compared with MJ, SD's biggest advantage is being open source, which gives it huge potential and rapid development. Thanks to its open-source, free nature, SD has attracted a large number of active users, and the developer community provides many free, high-quality fine-tuned models and plug-ins, which are continuously maintained and updated. With the support of third-party plug-ins and models, SD offers richer, more personalized functionality than Midjourney.

Introduction to Stable Diffusion

Stable Diffusion is a deep-learning text-to-image generation model released in 2022. It is mainly used to generate detailed images from text descriptions (i.e., the txt2img scenario), although it can also be applied to other tasks such as inpainting, outpainting, and prompt-guided image-to-image translation (img2img).

Model principle

[Image: Stable Diffusion architecture schematic]
Citing the well-known SD schematic (from the paper https://arxiv.org/abs/2112.10752), the model can be divided into three main parts:

  • Variational autoencoder (here a Vector Quantized Variational AutoEncoder, VQ-VAE)
  • Diffusion Model (DM), which plays the most important role in generating pictures
  • Conditional Controller (Conditioning)

For a detailed introduction to the principle, please refer to the article Introduction to Stable Diffusion

To summarize the SD model in one sentence: the VAE compresses the image into a low-dimensional latent space, a conditioned diffusion model (DM) generates new latent variables there, and the VAE then converts the generated latents back into an image.
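
As an illustration only, that one-sentence flow can be sketched in toy Python. Every function here is a hypothetical stand-in (no real neural networks are involved); the point is purely the data flow for txt2img: sample a noise latent, denoise it under text conditioning, then decode with the VAE.

```python
import random

LATENT_DIM, IMAGE_DIM = 64, 512  # toy sizes: the latent space is much smaller

def encode_text(prompt):
    # Stand-in for the text encoder (the Conditioning part)
    return [float(ord(c) % 7) for c in prompt]

def denoise_step(latent, cond):
    # Stand-in for one diffusion-model denoising step guided by the condition
    bias = sum(cond) / (len(cond) or 1)
    return [0.9 * x + 0.001 * bias for x in latent]

def vae_decode(latent):
    # Stand-in for the VAE decoder: low-dimensional latent -> full-size "image"
    return [latent[i * LATENT_DIM // IMAGE_DIM] for i in range(IMAGE_DIM)]

def txt2img(prompt, steps=30, seed=0):
    random.seed(seed)                                         # 随机种子
    latent = [random.gauss(0, 1) for _ in range(LATENT_DIM)]  # start from pure noise
    cond = encode_text(prompt)
    for _ in range(steps):                                    # 采样迭代步数
        latent = denoise_step(latent, cond)
    return vae_decode(latent)

image = txt2img("best quality, smile", steps=30, seed=42)
```

Note how the loop runs entirely in the small latent space and only the final decode produces the full-size output; that is the efficiency trick of latent diffusion.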

Recommended computer configuration

Stable Diffusion has certain requirements for computer configuration, and the recommended configuration is as follows:

Operating system : SD runs best on Windows; Windows 10 or Windows 11 is recommended.

Memory : at least 8 GB; 16 GB or more is recommended. With limited memory you may need to increase virtual memory to hold the model files.

Hard disk : at least 40 GB of free space; 60 GB or more is recommended, preferably on a solid-state drive.

Graphics card : a minimum of 2 GB of VRAM is required, with at least 4 GB recommended and 8 GB or more preferred. NVIDIA cards are best supported because SD relies on CUDA acceleration; AMD cards can be used, but are noticeably slower than NVIDIA cards. If your computer has no usable GPU, you can generate on the CPU, but it takes hundreds of times longer.

The following is a comparison of the speed of mainstream graphics cards when generating 512x images:

[Image: graphics-card speed comparison]

Stable Diffusion WebUI

At present, several open-source WebUI projects wrap Stable Diffusion so that it can be operated through a graphical interface and extended with plug-ins, which greatly lowers the barrier to entry.

These projects differ from typical software installs: they are not ready to use after downloading. You must prepare the execution environment, build from source, and make manual adjustments for your operating system and hardware, which requires some software-development experience.

2. Online address of Stable Diffusion model

InsCode's Stable Diffusion environment is mainly intended for learning and using Stable Diffusion. The related software and component libraries are pre-installed, so you can start the Stable Diffusion WebUI online directly for creation. You can also purchase computing power with one click and train large models, which greatly lowers the barrier to AI image generation.

Stable Diffusion model online use address : https://inscode.csdn.net/@inscode/Stable-Diffusion

After entering, click Run, and a window for purchasing computing resources pops up. For a trial that does not involve continuously generating many images, the computing power is ample; simply choose the RTX 3080 (0.51 yuan/hour), which is free to try during the current event.

After the purchase completes, you are taken to the InsCode workbench, where the machine you just selected appears under computing resources.

[Image]
After initialization, three options appear on the right; select Stable Diffusion WebUI. The interface looks like this:

[Image]

3. Stable Diffusion WebUI interface introduction and parameter analysis

  • Part 1: the Stable Diffusion checkpoint selector at the top of the interface, where you choose the model file. InsCode provides several commonly used models to choose from, such as chilloutmix, GuoFeng3, and Cute_Animals. Installing a favorite model of your own into InsCode Stable Diffusion is covered later in this article.

  • Part 2: the main function tabs and settings of the Stable Diffusion WebUI project:

    文生图 (txt2img): as the name implies, generate an image from a text description
    图生图 (img2img): generate a similar image from an existing image
    附加功能 (extras): additional functions and settings
    图片信息 (PNG info): if an image was generated by AI, uploading it here shows the prompt keywords and model parameter settings that produced it
    模型合并 (checkpoint merger): merge multiple models, with adjustable weights for each, to generate images
    训练 (train): model training; supply your own images so that images can later be generated with your own trained model
    设置 (settings): UI settings
    扩展 (extensions): plug-in extensions; install open-source plug-ins here, such as Chinese-localization plug-ins

  • Part 3: the positive and negative prompt input boxes. Enter the description of the image in these boxes: the positive prompt describes what you want in the image, and the negative prompt describes what you do not want.

    If you don't know how to write prompts at first, start from good style templates, and use prompt-description tools and websites; generate many images and study the results. Once you grasp the rules of image generation, gradually write prompts yourself, making them as detailed as possible. Running the AI is like drawing cards: generate a batch and keep the ones that match your taste.

  • Part 4: the model input parameters at the bottom left of the interface:

    采样方法 (sampling method): there are many sampling algorithms, each with its own strengths and weaknesses; try them yourself to see their effects
    采样迭代步数 (sampling steps): the number of iteration steps for the model
    平铺 (tiling): generate a seamlessly tileable image
    面部修复 (face restoration): improves facial details, but enabling it for non-photorealistic characters may distort the face
    高清修复 (hi-res fix): upscale a low-resolution result to high resolution
    宽度, 高度 (width, height): the size of the output image
    提示词相关性 (CFG scale): a higher value makes the result match the prompt more closely
    随机种子 (seed): the same seed, with the same settings, can reproduce similar images; save the seeds of images you like so that similar images can be generated again later
    生成批次 (batch count): the number of batches generated per run; the number of images per run is 生成批次 × 每批数量
    每批数量 (batch size): how many images are generated simultaneously in each batch
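
A quick worked example of the last two parameters and the seed, in plain Python (the RNG here is just a stand-in for SD's noise generator; the seed value is the one used for the first example image later in this article):

```python
import random

batch_count = 2   # 生成批次: groups of images per run
batch_size = 4    # 每批数量: images generated simultaneously per group
total_images = batch_count * batch_size   # 8 images in one run

# Reusing a saved seed reproduces the same starting noise, which is why
# the same seed plus the same settings yields similar images again.
random.seed(162297642)
noise_a = [random.random() for _ in range(4)]
random.seed(162297642)
noise_b = [random.random() for _ in range(4)]
assert noise_a == noise_b
```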

  • Part 5: the one-click Generate button. After setting the parameters above, click Generate and images are produced automatically.

    Below the Generate button are five small icons (from left to right):

    • Restore the prompts from the last generated image (recorded automatically)
    • Clear all current prompts
    • Open the model selection interface
    • Apply the selected style template to the current prompt
    • Save the current positive/negative prompts as a style
  • Part 6: the area where the generated images are displayed.

4. How to install the model for Stable Diffusion in InsCode

Commonly used model download URLs

Currently, the two sites with the largest number of models are Civitai and Hugging Face.

Civitai, also known as "C station" (C站), has many wonderful models. The downside is that the site is blocked in mainland China, so accessing it requires a VPN.

[Image]
Hugging Face is plainer, and its model review is stricter. The advantage is that no VPN is needed and download speeds are fast.

[Image]
In addition, various AI image sites host many models that have been removed from Civitai, which is also quite useful.

Commonly used models and descriptions

If you browse the sites above to download models, you will find that there are several different types.

The models on CivitAI are mainly divided into four categories: Checkpoint, LoRA, Textual Inversion, and Hypernetwork, which correspond to four different training methods.

  • Checkpoint: the base model that enables SD to draw at all, so it is also called the large model, base model, or main model; on the WebUI it is labeled the Stable Diffusion model. After installing SD you must use it with a main model, and different main models specialize in different painting styles and subject areas. Checkpoint models contain everything needed to generate images and require no additional files, but they are bulky, usually 2-7 GB. Stored in the models/Stable-diffusion directory under the Stable Diffusion installation directory.

  • LoRA: a lightweight model fine-tuning method that adapts an existing large model to output people or things with fixed characteristics. It gives good results for specific style features, trains quickly, and produces small files, generally 10-200 MB, but must be used together with a large model. Stored in the models/Lora directory under the Stable Diffusion installation directory.

  • Embedding/Textual Inversion: a method of training with text prompts; it can be understood simply as a packaged set of prompt words for generating people or things with fixed characteristics. It works well for specific style features and the model file is tiny, generally tens of KB, but training is slow, and it must be used with a large model. Stored in the embeddings directory under the Stable Diffusion installation directory.

  • Hypernetwork: currently not very useful. It is similar to LoRA, but its results are not as good. Files are generally tens of KB and must be used with a large model. Stored in the models/Hypernetworks directory under the Stable Diffusion installation directory.

Model recommendation: Checkpoint > LoRA > Textual Inversion > Hypernetwork

Usually, the Checkpoint model is used with the LoRA or Textual Inversion model to obtain better drawing results.

Supplement: there is also a class of VAE models. Simply put, a VAE improves the image's colors (so the picture doesn't look gray and washed out) and fine-tunes image details.
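
To summarize where each model type goes, here is a small lookup table in Python. The directory names follow the standard Stable Diffusion WebUI layout referenced above; treat this as a sketch (the VAE size is my own rough figure, not from this article) and verify against your own installation:

```python
# Model type -> (install subdirectory under the WebUI root, typical file size)
MODEL_LOCATIONS = {
    "Checkpoint":        ("models/Stable-diffusion", "2-7 GB"),
    "LoRA":              ("models/Lora",             "10-200 MB"),
    "Textual Inversion": ("embeddings",              "tens of KB"),
    "Hypernetwork":      ("models/Hypernetworks",    "tens of KB"),
    "VAE":               ("models/VAE",              "varies"),
}

for kind, (path, size) in MODEL_LOCATIONS.items():
    print(f"{kind}: put files in <webui>/{path} (usually {size})")
```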

Several recommended models

  • DreamShaper

    A Checkpoint model competent in multiple styles (photorealistic, original painting, 2.5D, etc.); it generates great portraits and landscapes.

    [Image]

  • Chilloutmix/Chikmix

    Chilloutmix is the famous Asian-beauty model; most of the AI beauties you see online are generated with it. The hottest images at the time were the series below.

    [Image]

    It is also this model that made AI painting go mainstream.

  • Cetus-Mix

    This is an anime-style ("two-dimensional") hybrid model merged from many anime models; it performs well in practice and is not demanding about prompts.

    [Image]

  • Guofeng series

    This is a gorgeous ancient-Chinese-style model, which could also be described as an ancient-style game-character model with a 2.5D texture. The latest version is GuoFeng3.4.

    [Image]

  • blindbox

    A LoRA model that generates blind-box (figurine) style images. When using it, ReV Animated is recommended as the main model.

    [Image]

How to install Lora for Stable Diffusion in InsCode

  1. First, download the LoRA file you want to install to your computer, then start the GPU through JupyterLab, as shown below:

    [Image]

  2. Open the JupyterLab interface, find the upload entry, and upload the downloaded LoRA file to the GPU machine.

    [Image]

  3. Open a Terminal and copy the uploaded LoRA file to the /release/stable-diffusion-webui/models/Lora folder.

    [Image]
    Specific commands:

    # cd /root/workspace
    # ls
    jupyterlab.log  shinkai_makoto_offset.safetensors  stable-diffusion-webui.log
    # cp shinkai_makoto_offset.safetensors /release/stable-diffusion-webui/models/Lora 
    # cd /release/stable-diffusion-webui/models/Lora
    # ls
    Cute_Animals.safetensors     SuoiresnuStyle-Rech44.safetensors  ZhouShuyi.safetensors   capi-09.safetensors                 mix4.safetensors
    GuoFeng3.2_Lora.safetensors  YaeMiko_mixed.safetensors          cZhouShuyi.safetensors  koreanDollLikeness_v15.safetensors  	shinkai_makoto_offset.safetensors
    

    Note: shinkai_makoto_offset.safetensors here is the LoRA file I downloaded.

  4. When the downloaded LoRA model file appears in the Lora folder, reopen the Stable Diffusion WebUI, click the icon circled in red on the right, wait a moment, and the Lora panel opens.

    [Image]

  5. In the Lora panel you can see all the LoRAs installed for Stable Diffusion. Click the LoRA you uploaded, and a reference tag for that LoRA is inserted into the prompt.

    [Image]

At this point, the current Stable Diffusion instance has a LoRA of your choice installed. Checkpoints, Embeddings, and so on can be installed the same way.
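
The manual Terminal steps above can also be scripted. Here is a minimal Python sketch, assuming the same /release/stable-diffusion-webui layout shown in the Terminal session (the helper name install_lora is my own, not part of any tool):

```python
import shutil
from pathlib import Path

def install_lora(lora_file, webui_root="/release/stable-diffusion-webui"):
    """Copy a downloaded .safetensors LoRA file into the WebUI's Lora folder."""
    dst_dir = Path(webui_root) / "models" / "Lora"
    dst_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    return Path(shutil.copy(lora_file, dst_dir / Path(lora_file).name))

# Example (mirrors the cp command above):
# install_lora("/root/workspace/shinkai_makoto_offset.safetensors")
```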

Next we use InsCode Stable Diffusion for AI drawing.

5. Use InsCode Stable Diffusion for AI drawing

Below are some examples I generated, shown with their parameter settings, prompts, and seeds:

Generated Figure 1

[Image]
Parameter configuration:

Steps(采样迭代步数): 30
Sampler(采样方法): Euler a
生成批次 (batch count): 1
批次数量 (batch size): 1
CFG scale: 7
Size: 768x1024
Model hash: 7234b76e42
Model: chilloutmix-Ni
Version: v1.2.0
Seed: 162297642

Prompt words:

Prompt: Best quality,raw photo,seductive smile,cute,realistic lighting,beautiful detailed eyes,(collared shirt:1.1),bowtie,pleated skirt,floating long hair,beautiful detailed sky,
Negative Prompt: nsfw, ng_deepnegative_v1_75t, badhandv4, (worst quality:2), (low quality:2), (normal quality:2), lowres, watermark, monochrome
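
The (text:2) notation in the negative prompt above is the WebUI's attention-weighting syntax: (worst quality:2) doubles the emphasis on that phrase, while unbracketed phrases keep a default weight of 1. A small sketch that extracts such weights (the regex handles only the simple (text:number) form, not nested brackets):

```python
import re

def parse_weights(prompt):
    """Extract '(text:weight)' attention tokens from a WebUI-style prompt."""
    return {m.group(1).strip(): float(m.group(2))
            for m in re.finditer(r"\(([^:()]+):([\d.]+)\)", prompt)}

weights = parse_weights("(worst quality:2), (low quality:2), lowres, watermark")
# "lowres" and "watermark" carry no explicit weight, so they are not captured
```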

Generated Figure 2

[Image]
Parameter configuration:

Steps(采样迭代步数): 30
Sampler(采样方法): Euler a
生成批次 (batch count): 1
批次数量 (batch size): 1
CFG scale: 7
Size: 768x1024
Model hash: 74c61c3a52
Model: GuoFeng3
Version: v1.2.0
Seed: 1110161009

Prompt words:

Prompt: best quality,red clothes,smile,handsome girl,fairy and elegant aura,delicate makeup,
Negative Prompt: nsfw,ng_deepnegative_v1_75t,badhandv4,(worst quality:2),(low quality:2),(normal quality:2),lowres,watermark,monochrome,modern element,topless female,

Generated Figure 3

Figures 3 and 4 use Makoto Shinkai's LoRA model to generate images in Shinkai's style.

[Image]
Parameter configuration:

Steps(采样迭代步数): 30
Sampler(采样方法): Euler a
生成批次 (batch count): 1
批次数量 (batch size): 1
CFG scale: 7
Size: 1440x810
Model hash: 9c321174ae
Model: ghostmix_v11
Version: v1.2.0
Seed: 2262843784

Prompt words:

Prompt: ((Best quality)), ((masterpiece)), abandoned brutalist architecture of Pripyat,sunlight,cloudy weather, hyper realistic DSLR photo, Nikon D5 <lora:add_detail:1>,mist,
Negative Prompt: ng_deepnegative_v1_75t,easynegative,(worst quality:2), (low quality:2), (normal quality:1.8), lowres, ((monochrome)), ((grayscale)),sketch,ugly,morbid, deformed,logo,text, bad anatomy,bad proportions,disfigured,extra arms, extra legs, fused fingers,extra digits, fewer digits, mutated hands, poorly drawn hands,bad hands, (loli, young, child, infant, teenager:1.5), ((((turned on lights))))

Generated Figure 4

[Image]
Parameter configuration:

Steps(采样迭代步数): 30
Sampler(采样方法): Euler a
生成批次 (batch count): 1
批次数量 (batch size): 1
CFG scale: 7
Size: 1440x810
Model hash: 9c321174ae
Model: ghostmix_v11
Version: v1.2.0
Seed: 4267252388

Prompt words:

Prompt: shinkai makoto, kimi no na wa., air conditioner, antennae, architecture, building, cable, city, cloud, cloudy sky, comet, crane (machine), house, industrial pipe, japan, light, night, night sky, no humans, outdoors, pipeline, satellite dish, shinjuku (tokyo), sky, star (sky), tokyo (city), window, <lora:shinkai_makoto_offset:1>
Negative Prompt: (painting by bad-artist-anime:0.9), (painting by bad-artist:0.9), watermark, text, error, blurry, jpeg artifacts, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, artist name, (worst quality, low quality:1.4), bad anatomy

6. Experience

This concludes the feature evaluation. Overall, the experience of running Stable Diffusion online on InsCode is very good, although it occasionally freezes and the GPU must be restarted. In addition, chilloutmix tends to generate NSFW images when few negative prompts are given, which is not suitable for young learners.

Interested friends can try it for themselves!

Source: blog.csdn.net/a2360051431/article/details/131719124