Nvidia has released a new one-stop solution, RTX Workstation. A few figures worth noting:
- Supports up to 4 RTX 6000 GPUs
- Fine-tuning GPT3-40B on 860 million tokens can be completed within 15 hours
- Stable Diffusion XL generates 40 images per minute, 5 times faster than a 4090
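To put those figures in more familiar units, here is a quick back-of-the-envelope calculation. It only rearranges the numbers quoted above. It assumes "within 15 hours" means the full 15 hours, and that "5 times faster" means 5 times the image rate; neither reading is confirmed by the announcement.

```python
# Figures quoted in the list above.
tokens = 860_000_000           # fine-tuning tokens
hours = 15                     # upper bound on fine-tuning time
sdxl_images_per_minute = 40    # Stable Diffusion XL on the workstation
speedup_over_4090 = 5          # assumed to mean 5x the image rate

# Derived values (rough estimates, not measured numbers).
tokens_per_second = tokens / (hours * 3600)
seconds_per_image = 60 / sdxl_images_per_minute
implied_4090_images_per_minute = sdxl_images_per_minute / speedup_over_4090

print(f"Fine-tuning throughput: at least {tokens_per_second:,.0f} tokens/s")
print(f"SDXL: {seconds_per_image} s per image")
print(f"Implied 4090 rate: {implied_4090_images_per_minute} images/min")
```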
Many people have used Stable Diffusion (SD), but how does it actually work?
↓
Hi kids! Today I want to tell you an interesting science story. It is called "Stable Diffusion", which sounds great, doesn't it? In fact, it is a technology that lets computers learn to create paintings.
First of all, a computer can't draw by itself; we need to tell it what to draw. "Stable Diffusion" is a way to have the computer create paintings according to our instructions.
How this method works is a bit complicated, but I will walk through it with you in simple words. First, we enter a text description, such as "paradise", "vast", "beach", and the computer then generates a picture that matches that description.
So how does the computer do this? It uses a model called CLIP to convert the text into a mathematical representation that the computer can understand. That representation then guides the "Unet" model as it repeatedly removes noise from a randomly generated noisy image.
The number of denoising passes is called the step count. By removing a little noise at each step, the pure-noise image is gradually turned into a vector containing rich semantic information. Finally, an image decoder turns that semantic vector into the finished picture.
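For readers who like to see the idea as code, here is a toy sketch of the three stages described above: text encoding, step-by-step denoising, and decoding. It is not Stable Diffusion. The functions `encode_text`, `predict_noise`, `denoise`, and `decode` are hypothetical stand-ins invented for this illustration, and the `strength` parameter is an assumption. A real Unet is a trained neural network; the toy version simply treats everything that differs from the text vector as noise.

```python
import random

def encode_text(prompt, dim=8):
    """Toy stand-in for CLIP: turn a prompt into a fixed vector of numbers."""
    rng = random.Random(prompt)  # same prompt always gives the same vector
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

def predict_noise(latent, text_embedding):
    """Toy stand-in for the Unet: guess which part of the latent is noise.

    Here "noise" is just the difference from the text-derived target.
    """
    return [x - t for x, t in zip(latent, text_embedding)]

def denoise(text_embedding, steps=50, strength=0.2, seed=0):
    """Start from pure random noise and remove a little noise at each step."""
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in text_embedding]
    for _ in range(steps):
        noise = predict_noise(latent, text_embedding)
        latent = [x - strength * n for x, n in zip(latent, noise)]
    return latent

def decode(latent):
    """Toy stand-in for the image decoder: map values to 0-255 pixel levels."""
    return [max(0, min(255, round((x + 1.0) * 127.5))) for x in latent]

embedding = encode_text("paradise, vast, beach")
latent = denoise(embedding, steps=50)
pixels = decode(latent)
print(pixels)
```

With more steps the latent moves closer to the vector derived from the text, which mirrors the way the picture becomes clearer as denoising goes on.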
The popular-science story above was generated with MixCopilot.
1/ Provide the source text as input:
jalammar.github.io/illustrated-stable-diffusion
2/ The MixCopilot workflow generates the output
Feedback and discussion are welcome: