Nvidia's latest workstation lets Stable Diffusion generate 40 images in one minute. So how does Stable Diffusion actually work?

Nvidia has released a new one-stop solution, the RTX Workstation. A few figures are worth noting:

- Supports up to 4 RTX 6000 GPUs

- Fine-tuning GPT3-40B on 860 million tokens can be completed within 15 hours

- Stable Diffusion XL generates 40 images per minute, 5 times faster than an RTX 4090
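
To put these figures in perspective, here is a quick back-of-the-envelope calculation. It assumes that "5 times faster" means five times the throughput and that the fine-tuning run uses the full 15 hours. Both are readings of the list above, not numbers published by Nvidia.

```python
# Figures quoted in the list above.
images_per_minute = 40
speedup_vs_4090 = 5            # assumption: "5 times faster" = 5x throughput
finetune_tokens = 860_000_000
finetune_hours = 15            # assumption: the run takes the full 15 hours

# Time spent on a single image.
seconds_per_image = 60 / images_per_minute

# Throughput the comparison implies for a single RTX 4090.
implied_4090_images_per_minute = images_per_minute / speedup_vs_4090

# Minimum average token rate needed to finish within the time limit.
tokens_per_second = finetune_tokens / (finetune_hours * 3600)

print(f"{seconds_per_image:.1f} s per image")
print(f"{implied_4090_images_per_minute:.0f} images per minute implied for the 4090")
print(f"about {tokens_per_second:,.0f} tokens per second during fine-tuning")
```

In other words, the workstation would need about one and a half seconds per image, and the fine-tuning claim works out to roughly sixteen thousand tokens per second on average.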

Many people have used Stable Diffusion, but what is the technical principle behind SD?

shadow

Hi kids! Today I want to tell you an interesting science story. Its name is "Stable Diffusion", and it sounds great! In fact, it is a technology that lets computers learn to create paintings.

First of all, we know that a computer can't draw by itself; it needs us to tell it what to draw. "Stable Diffusion" is a way to let the computer create paintings according to our instructions.

The principle of this method is a bit complicated, but I will walk through it with you in simple words. First, we enter a text description, such as "paradise", "vast", "beach", and then the computer generates a picture that matches that description.

So how does the computer do this? It uses a model called CLIP, which converts the text into a mathematical representation that the computer can understand. That representation then guides the "UNet" model as it repeatedly removes noise from a randomly generated noisy image.
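
To make "converting text into a mathematical representation" a little more concrete, here is a tiny sketch. The function `toy_text_embedding` is made up for this story and is only a stand-in: it hashes words into numbers, while the real CLIP encoder is a trained neural network whose vectors capture meaning.

```python
import hashlib


def toy_text_embedding(prompt: str, dim: int = 8) -> list[float]:
    """Hypothetical stand-in for a text encoder: one fixed-length vector per prompt.

    This is NOT CLIP. It only shows that text goes in and numbers come out.
    """
    vector = [0.0] * dim
    words = prompt.lower().split()
    for word in words:
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        for i in range(dim):
            # Map each byte (0..255) to the range [-1, 1] and accumulate it.
            vector[i] += digest[i] / 127.5 - 1.0
    # Average over words so the scale does not depend on the prompt length.
    return [value / max(len(words), 1) for value in vector]


embedding = toy_text_embedding("paradise vast beach")
print(len(embedding), "numbers describe the prompt")
```

The same prompt always gives the same list of numbers, and a different prompt gives a different list. That list is what the denoising model receives as its instructions.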

The number of denoising passes is called the step count. By removing noise step after step, the pure-noise image is gradually converted into a vector containing rich semantic information. Finally, the image decoder turns that semantic vector into a picture that carries the same meaning.
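
The loop described above can be sketched in a few lines. Everything here is a toy under stated assumptions: `toy_predict_noise`, `toy_denoise`, `toy_decode`, and the `strength` parameter are invented for illustration, the "condition" is just a short list of numbers standing in for the text representation, and the real UNet and image decoder are large neural networks.

```python
import random


def toy_predict_noise(latent, condition):
    """Hypothetical stand-in for the UNet: here the 'noise' is simply
    whatever still separates the latent from the conditioning vector."""
    return [x - c for x, c in zip(latent, condition)]


def toy_denoise(condition, steps=20, strength=0.3, seed=0):
    """Start from pure random noise and remove a little noise at every step."""
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in condition]  # pure noise
    for _ in range(steps):
        noise = toy_predict_noise(latent, condition)
        # Subtract a fraction of the predicted noise.
        latent = [x - strength * n for x, n in zip(latent, noise)]
    return latent


def toy_decode(latent):
    """Hypothetical decoder: map each latent value to a 0..255 pixel intensity."""
    return [max(0, min(255, round((x + 1.0) * 127.5))) for x in latent]


condition = [0.5, -0.25, 0.0, 1.0]  # stands in for the text representation
latent = toy_denoise(condition, steps=20)
pixels = toy_decode(latent)
print("latent after denoising:", [round(x, 3) for x in latent])
print("decoded pixel values:", pixels)
```

With more steps the latent ends up closer to what the condition asks for, which is why the step count affects how finished the picture looks. In this toy, each step shrinks the remaining noise by the same fraction.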

This popular science story was generated with MixCopilot.

1/ Enter the original text:

jalammar.github.io/illustrated-stable-diffusion

2/ The MixCopilot workflow produces the output

#Knowledge miner demo v1.0

Origin blog.csdn.net/shadowcz007/article/details/132200356