Midjourney vs. Stable Diffusion: the two most popular AI drawing tools right now


Hello everyone. Today I would like to introduce two recent AI drawing tools: Midjourney and Stable Diffusion.


The comparison below covers four aspects: difficulty of getting started, drawing quality, drawing efficiency, and cost of use.


1. Difficulty of getting started


First, let's look at the difficulty of getting started.

Midjourney provides a friendly interface that guides users through each step, so it is quick to get started. It is a drawing platform built on top of Discord: first register a Discord account, then use that account to log in to Midjourney directly. Once inside, you work in Midjourney's chat channels and call different functions by typing different commands. Midjourney also lets you use existing images, so you can easily add a picture to a prompt as a reference.




After entering, you can see Midjourney's public drawing channels, where many people are generating pictures and the feed updates in real time.




You can browse the channels in the left column. Channels such as newbies-110 and newbies-140 are areas for beginners; click into one to generate your own pictures.



Stable Diffusion is open source, which means it can be deployed directly on your own machine. Doing so requires more technical knowledge than Midjourney: to use all of its features, you need to be comfortable installing and configuring software yourself. Stable Diffusion also expects you to import your own images and other media, rather than offering ready-made material the way Midjourney does.

There are hardware requirements as well. As a rule of thumb, you need a graphics card with at least 8 GB of video memory. The Stable Diffusion models also take up a lot of disk space, at least 20-30 GB. Alternatively, it can be deployed in the cloud, for example on Google Colab, and then accessed through a URL. If you want a wider range of styles, you need to combine the base model with additional models such as LoRA.
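As a rough illustration of the hardware requirements above, the following Python sketch checks a machine against the two figures given in the text (8 GB of video memory and 20 GB of free disk space). The function names are hypothetical, and the amount of video memory has to be supplied by hand, since the standard library cannot query the GPU.

```python
import shutil

# Thresholds taken from the text above; treat them as rough guidance.
MIN_VRAM_GB = 8
MIN_FREE_DISK_GB = 20

def free_disk_gb(path="."):
    """Free space, in GiB, on the disk that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3

def local_deployment_problems(vram_gb, disk_gb):
    """Return a list of reasons a local install may struggle (empty if none)."""
    problems = []
    if vram_gb < MIN_VRAM_GB:
        problems.append(f"video memory: {vram_gb} GB < {MIN_VRAM_GB} GB")
    if disk_gb < MIN_FREE_DISK_GB:
        problems.append(f"free disk: {disk_gb:.0f} GB < {MIN_FREE_DISK_GB} GB")
    return problems

if __name__ == "__main__":
    # 8 is a placeholder: replace it with your own card's video memory.
    for problem in local_deployment_problems(8, free_disk_gb()):
        print("Possible problem:", problem)
```

If the list comes back non-empty, the cloud route described above is probably the easier starting point.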


To switch the main Stable Diffusion model, use the selector in the upper left corner of the interface, which lists the models you have installed.




Text-to-image (txt2img): the main difference from Midjourney here is that Stable Diffusion has a dedicated negative prompt, which lists things that should not appear in the picture. Otherwise, the keyword format for text-to-image is basically the same as Midjourney's.



The settings underneath control basic properties of the picture, such as its resolution and size.
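For readers who prefer scripts to the graphical interface, the prompt, negative prompt, and basic settings described above can be sketched as a request body. This assumes the AUTOMATIC1111 web UI started with its `--api` option, which the original does not name; the local address, the default values, and the helper names are assumptions for illustration.

```python
import json
import urllib.request

def build_txt2img_payload(prompt, negative_prompt="", width=512, height=512,
                          steps=20, cfg_scale=7.0, seed=-1):
    """Collect text-to-image settings into one dictionary."""
    if width % 8 or height % 8:
        # Stable Diffusion works on a latent image 1/8 the size of the output.
        raise ValueError("width and height should be multiples of 8")
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,  # things that should NOT appear
        "width": width,
        "height": height,
        "steps": steps,
        "cfg_scale": cfg_scale,
        "seed": seed,  # -1 asks for a random seed
    }

def send_txt2img(payload, base_url="http://127.0.0.1:7860"):
    """POST the payload to a locally running web UI (assumed address)."""
    request = urllib.request.Request(
        base_url + "/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        # The response is assumed to carry base64-encoded images under "images".
        return json.load(response)["images"]

payload = build_txt2img_payload(
    "cyberpunk castle, rose sea",
    negative_prompt="blurry, extra fingers",
)
```

Only `build_txt2img_payload` runs here; `send_txt2img` needs a web UI that is actually listening at the assumed address.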



Image-to-image (img2img) is also easy to understand: upload a reference picture and enter a prompt, as in other AI drawing software.



Picture information (PNG Info): drop a picture that you generated with Stable Diffusion here, and the parameters used to create it, including the keywords, are displayed on the right.
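One way such a tab can recover the settings is that the interface writes them into the image file itself. The sketch below reads the text chunks of a PNG using only the standard library. The `parameters` keyword mentioned in the comments is an assumption about how the web UI labels its metadata, and the sketch ignores compressed (`zTXt`) and international (`iTXt`) text chunks.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def read_png_text_chunks(data):
    """Return the tEXt chunks of a PNG file as a {keyword: text} dict."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte length, 4-byte type, body, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            # A tEXt body is "keyword", a zero byte, then Latin-1 text.
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos += 12 + length
    return chunks

def read_generation_settings(path, keyword="parameters"):
    """Look up the (assumed) metadata keyword in a PNG file on disk."""
    with open(path, "rb") as handle:
        return read_png_text_chunks(handle.read()).get(keyword)
```

Pictures that have been re-saved or converted by other software often lose these chunks, in which case `read_generation_settings` returns `None`.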




Overall, in terms of ease of use, Midjourney is easier, and Stable Diffusion is somewhat harder!


2. Drawing quality of the two AI drawing tools

Midjourney is built on deep learning, although its model is proprietary and the details have not been published. What we care about more is the output. The pictures Midjourney generates are more refined: the current version has been upgraded to v5, which renders real people more realistically and handles details well, although fingers still come out flawed. The controllability of Midjourney's output is weak, and the prompt has to be adjusted repeatedly, which is currently its biggest bottleneck.


For example, suppose you want: a cute girl holding flowers, with Valentine's Day balloon decorations in the background. Use an online translator to turn this into English keywords and enter: Lovely girl, holding flowers, with Valentine's Day balloons in the background

The bot generates four pictures in about one minute.
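To make the command format concrete, here is a small sketch that assembles the text typed into Discord. The helper name is hypothetical; `--ar` (aspect ratio) and `--v` (model version) are Midjourney parameters that the original does not discuss and are shown only as optional extras.

```python
def build_imagine_command(description, aspect_ratio=None, version=None):
    """Assemble the text of a Midjourney /imagine command."""
    parts = [description.strip()]
    if aspect_ratio:
        parts.append(f"--ar {aspect_ratio}")  # for example "3:2"
    if version:
        parts.append(f"--v {version}")        # for example 5
    return "/imagine prompt: " + " ".join(parts)

command = build_imagine_command(
    "Lovely girl, holding flowers, with Valentine's Day balloons in the background",
    version=5,
)
print(command)
```

Keeping the description and the parameters separate makes it easier to adjust the wording repeatedly without retyping the options each time.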




Stable Diffusion, by contrast, uses a latent diffusion model (LDM). The model is trained to remove Gaussian noise that has been applied to training images over successive steps, so it can be viewed as a sequence of denoising autoencoders. By combining different models, such as LoRA, Stable Diffusion can generate pictures in a wide variety of styles, and positive and negative prompts make it easier to get the picture you want.

Stable Diffusion not only supports text-to-image, image-to-image, and image-to-text; its ControlNet support also addresses the problem of spatial consistency. Previously there was simply no effective way to tell the model which parts of an input image to keep. ControlNet changes this by letting Stable Diffusion take additional input conditions that tell the model exactly what to do. It can even use 3D OpenPose to specify the poses or actions of characters, giving precise control over the generated picture. This is currently one of the most highly regarded features in AI drawing.
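The denoising idea can be illustrated with a toy version of the noising process that the model learns to undo. This is a sketch under assumptions: it uses a linear noise schedule with commonly used start and end values that do not come from the original, it works on a short list of numbers rather than an image, and in a latent diffusion model the noise is applied to a compressed latent representation rather than to pixels.

```python
import math
import random

def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, rising linearly (assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def signal_fractions(betas):
    """Cumulative product of (1 - beta): how much of the original survives."""
    fractions, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        fractions.append(product)
    return fractions

def add_noise(x0, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * noise."""
    keep = math.sqrt(alpha_bar)
    spread = math.sqrt(1.0 - alpha_bar)
    return [keep * value + spread * rng.gauss(0.0, 1.0) for value in x0]

alpha_bars = signal_fractions(linear_betas(1000))
rng = random.Random(0)
x0 = [0.5, -1.0, 0.25]
slightly_noisy = add_noise(x0, alpha_bars[0], rng)   # almost the original
mostly_noise = add_noise(x0, alpha_bars[-1], rng)    # almost pure noise
```

Training teaches a network to predict and remove this noise step by step; generation then runs the process in reverse, starting from noise.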


Example: oil painting

prompt: portrait of bob barker playing twister with scarlett johansson, an oil painting by ross tran and thomas kincade



Another example, in a cyberpunk style

prompt: Cyberpunk, 8k resolution, castle, the rose sea, dream




Judging by drawing quality, Stable Diffusion is slightly ahead in controllability and breadth of application.


3. Drawing efficiency

Stable Diffusion has one disadvantage: it is very slow, and it can take half an hour to start up.

Midjourney is an AI art research lab joined by Somnai, the original author of Disco Diffusion. Midjourney has improved on Disco Diffusion and produces pictures in about one minute on average.


4. Cost of using the AI drawing tools

(1) Midjourney

The first time you enter a keyword in the input box, a notice pops up.



The notice explains that free users are not eligible to generate pictures and need to pay. An earlier version gave new users 25 free generations, but in practice every keyword submission, every upscale, and every variation counted as one use, so the 25 free uses ran out very quickly. To continue, you still had to purchase a membership.
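To see why the free quota went so quickly, here is a back-of-the-envelope sketch. The workflow numbers (how many variations and upscales go into one finished picture) are hypothetical; only the figure of 25 free uses comes from the text.

```python
def finished_pictures(free_uses, variations_per_picture=1, upscales_per_picture=1):
    """How many finished pictures fit into a quota where every action counts."""
    # One initial generation, plus each variation and each upscale.
    uses_per_picture = 1 + variations_per_picture + upscales_per_picture
    return free_uses // uses_per_picture

print(finished_pictures(25))        # 8: one variation and one upscale each
print(finished_pictures(25, 3, 1))  # 5: a fussier workflow
```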


The current version no longer appears to offer new users a free trial quota. Free access has been closed, and a subscription is required.


At present there are three plan tiers, each billed either annually or monthly. The Basic and Standard plans differ mainly in the number of pictures you can generate, the amount of fast-mode time, and the number of jobs that can run in parallel.






(2) Stable Diffusion

Although Stable Diffusion is open source and can run locally, it has high hardware requirements. If you do not want to start by buying hardware, you can instead pay for a cloud service such as Google Colab and deploy it there. Taking Google Colab as an example, there are two subscription plans, Colab Pro and Colab Pro+, which differ in the number of compute units; Colab Pro+ also supports background execution. Finally, there is a pay-as-you-go option that lets you buy additional compute units, so the cost of use varies from person to person.

If you only want to generate a few pictures, either tool will do, and the cost is about the same. If you plan to use an AI drawing tool to support your work over the long term, I would suggest buying a high-end graphics card and deploying Stable Diffusion locally. As a long-term investment, this is the lowest-cost option.
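The long-term argument can be sketched as a break-even calculation. All prices here are made-up placeholders, not actual Colab or graphics card prices, and the function name is hypothetical.

```python
import math

def break_even_months(gpu_price, monthly_cloud_cost, monthly_power_cost=0.0):
    """Months after which buying a graphics card beats renting cloud compute.

    Returns None if running locally never becomes cheaper.
    """
    monthly_saving = monthly_cloud_cost - monthly_power_cost
    if monthly_saving <= 0:
        return None
    return math.ceil(gpu_price / monthly_saving)

# Placeholder figures in an arbitrary currency.
months = break_even_months(gpu_price=600, monthly_cloud_cost=50, monthly_power_cost=10)
print(months)  # 15
```

The shorter the break-even period compared with how long you expect to keep using the tool, the stronger the case for local deployment.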


(3) Cost comparison chart


[Image: cost comparison chart]


Origin blog.csdn.net/lizhong2008/article/details/130378277