Big ControlNet update! Edit images precisely with nothing but text prompts, while the painting style stays unchanged!


Feng Se, reporting from Aofei Temple
Reposted from: Qubit (QbitAI)

ControlNet, the Stable Diffusion plug-in known as the "master of detail control in AI painting", has received a major update:

Using text prompts alone, you can modify image details at will while keeping the image's main characteristics intact.

For example, change a woman's look from her hair to her clothes, and make her expression friendlier:


Or switch the model from a sweet girl-next-door to a cool, aloof "yujie" type, with the orientation of her body and head, and even the background, all changed:


No matter how the details are modified, the "soul" of the original image is still there.

Beyond this style, it also handles anime just as well:


AI design blogger @sundyme on Twitter said:

The effect is better than expected!

Only one reference image is needed to complete the transformations above, and some results come close to what you would get from a custom-trained model.

Ahem, friends in the AI painting community, perk up: things just got fun again.

(P.S. The first two renderings are from YouTube blogger @Olivio Sarikas, and the third one is from Twitter blogger @sundyme.)

What's new in ControlNet: an image-editing feature that preserves the original image's style

The update above refers to a new preprocessor called "reference-only".

It does not require any control model; it uses a reference image directly to guide diffusion.

According to the author, this feature works somewhat like "inpaint", but without causing the image to collapse.

(Inpaint is the partial-redraw feature in the Stable Diffusion web UI; it redraws unsatisfactory areas that have been manually masked.)

Experienced users may know a trick that uses inpaint to diffuse a new image from an existing one.

Say you have a 512x512 image of a dog, and want to generate another 512x512 image of the same dog.

You can stitch the 512x512 dog image and a blank 512x512 image side by side into a 1024x512 image, then use inpaint to mask the blank 512x512 half and have it diffuse a dog with a similar appearance.

But because the images are stitched together in this crude way, distortion creeps in, and the results are generally unsatisfactory.
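For readers who want to try that stitching trick themselves, here is a minimal sketch using Pillow. The file names are placeholders, and the mask convention assumes the web UI's default "inpaint masked" behavior, where white marks the area to be redrawn:

```python
from PIL import Image

# Reference image (hypothetical path), resized to 512x512.
dog = Image.open("dog.png").resize((512, 512))

# Paste the reference on the left half of a 1024x512 canvas; the right half stays blank.
canvas = Image.new("RGB", (1024, 512), "white")
canvas.paste(dog, (0, 0))
canvas.save("stitched.png")

# Inpaint mask: black = keep (the reference half), white = redraw (the blank half).
mask = Image.new("L", (1024, 512), 0)
mask.paste(255, (512, 0, 1024, 512))
mask.save("mask.png")

# Feed stitched.png plus mask.png to the web UI's inpaint mode, and it will
# diffuse a similar-looking dog into the blank half.
```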

With "reference-only" it's different:

It links Stable Diffusion's (SD's) attention layers directly to any standalone image, so that SD can read that image as a reference.

In other words, if you want to make changes while preserving the original image's style, you can operate on the original image directly with prompts.
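To make "linking the attention layer to a reference image" a bit more concrete, here is a conceptual sketch of the general technique. This is an illustration, not the extension's actual code; the function and tensor names are made up:

```python
import torch

def reference_attention(q, k, v, k_ref, v_ref):
    """Self-attention that also attends to features from a reference image.

    q, k, v:        (batch, seq, dim) tensors from the image being generated.
    k_ref, v_ref:   (batch, seq_ref, dim) tensors derived from the reference image.
    """
    # Concatenate the reference's keys/values with the current image's,
    # so every query can also attend to the reference image's features.
    k_all = torch.cat([k, k_ref], dim=1)
    v_all = torch.cat([v, v_ref], dim=1)
    attn = torch.softmax(q @ k_all.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
    return attn @ v_all
```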

As the official example shows, a standing puppy can be changed into a running one:


All you need to do is upgrade ControlNet to version 1.1.153 or above, select "reference-only" as the preprocessor, upload the picture of the dog, and enter the prompt "a dog running on grassland, best quality...". SD will then use only your picture as the reference for the edit.
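If you prefer to drive the web UI through its API rather than the interface, the request might look roughly like the sketch below. The endpoint and the ControlNet unit field names are assumptions based on common sd-webui-controlnet setups and can differ between versions, so check your installation's API docs:

```python
import base64
import requests

# Encode the reference picture (hypothetical path) as base64 for the API.
with open("dog.png", "rb") as f:
    reference_b64 = base64.b64encode(f.read()).decode()

payload = {
    "prompt": "a dog running on grassland, best quality",
    "width": 512,
    "height": 512,
    "alwayson_scripts": {
        "controlnet": {
            "args": [{
                "enabled": True,
                "module": "reference_only",  # the "reference-only" preprocessor
                "model": "None",             # no control model is needed
                "image": reference_b64,
            }]
        }
    },
}

# Assumes a local web UI started with the --api flag.
resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
resp.raise_for_status()
print(len(resp.json()["images"]), "image(s) returned")
```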

Netizens: one of ControlNet's best features so far

As soon as "reference-only" landed, many netizens started trying it out.

Some call this one of ControlNet's best features to date:

Feed it an anime picture with a character pose, then write a prompt that seems to have nothing to do with the original picture, and suddenly you get the effect you want, built on top of the original image. Really powerful, to the point of being a game changer.


Others say:

It's time to dig up all the failed pictures discarded before and salvage them.


Of course, some think it is not perfect yet (for example, the woman's earrings in the first rendering are wrong, and the hair in the second picture is incomplete), but netizens still say that "the direction is right, after all".


Below are results from three Twitter bloggers, mostly in anime style; enjoy:


△ New AI illustration by Rari Shingu


△ From @br_d; the one on the left is the original picture


△ From @br_d; the top one is the original picture



△ From @uoyuki667; the one on the left is the original picture

Did any of these strike a chord with you?

Reference links:
[1]https://github.com/Mikubill/sd-webui-controlnet/discussions/1236
[2]https://twitter.com/sundyme/status/1657605321052012545
[3]https://twitter.com/uoyuki667/status/1657748719155167233
[4]https://twitter.com/br_d/status/1657926233068556289
[5]https://twitter.com/aiilustnews/status/1657941855773003776
