A new vision task! ReVersion: Relation customization in image generation

New Task: Relation Inversion

This year, diffusion models and related personalization work have become increasingly popular, for example DreamBooth, Textual Inversion, and Custom Diffusion. These methods extract the concept of a specific object from images and add it to a pre-trained text-to-image diffusion model, so that people can customize generation around objects they care about, such as a particular anime character, a sculpture at home, or a water glass.

Existing customization methods mainly focus on capturing object appearance. However, besides appearance, the visual world has another important pillar: the intricate relations between objects. So far, no work has explored how to extract a specific relation from images and apply it to generation. To this end, we propose a new task: Relation Inversion.

d1500e573671037712acb349b772411d.png

As shown in the figure above, we are given several reference images that share a common relation, for example "object A is contained in object B". The goal of Relation Inversion is to find a relation prompt that describes this interaction and to use it to generate new scenes in which the objects interact according to the same relation, for example Spider-Man placed inside a basket.

a321c5eff42ec244433e316bc0177c27.png

●Paper: arxiv.org/abs/2303.13495

●Code: github.com/ziqihuangg/ReVersion

●Project page: ziqihuangg.github.io/projects/reversion.html

●Video: www.youtube.com/watch?v=pkal3yjyyKQ

●Demo: huggingface.co/spaces/Ziqi/ReVersion

6bfd910346943794cfbe3a73f62b543e.jpeg

ReVersion Framework

As a first attempt at the Relation Inversion problem, we propose the ReVersion framework: 

dc53ee6aa3a865394b946437b081c796.jpeg     

Compared with the existing Appearance Inversion task, the difficulty of Relation Inversion lies in telling the model that what we want to extract is the relatively abstract concept of a relation, rather than object appearance, which has much more salient visual features.

We propose a relation-focal importance sampling strategy that encourages the model to pay more attention to high-level relations. We also design a relation-steering contrastive learning scheme that steers the relation prompt toward relations rather than object appearance. See the paper for more details.
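
To make the two ideas a little more concrete, here is a minimal Python sketch. It is not the authors' implementation and rests on assumptions not stated in this post: that "high-level" information is mostly decided at the noisier (larger) diffusion timesteps, so sampling is skewed toward them with a cosine-shaped density; and that the contrastive term pulls the relation embedding toward "relation-like" word embeddings (positives) and away from "appearance-like" ones (negatives) with an InfoNCE-style loss. The function names, the `alpha` and `temperature` values, and the toy vectors are all hypothetical; check the paper and code for the real formulation.

```python
import math
import random


def timestep_weights(num_steps, alpha=0.5):
    """Unnormalized density over t = 1..num_steps, skewed toward large t.

    Assumed form: 1 - alpha * cos(pi * t / num_steps), with 0 < alpha <= 1.
    """
    return [1.0 - alpha * math.cos(math.pi * t / num_steps)
            for t in range(1, num_steps + 1)]


def sample_timesteps(num_steps, batch_size, alpha=0.5, rng=None):
    """Draw timesteps for one training batch using the skewed density."""
    rng = rng or random.Random()
    weights = timestep_weights(num_steps, alpha)
    return rng.choices(range(1, num_steps + 1), weights=weights, k=batch_size)


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def steering_loss(relation, positives, negatives, temperature=0.07):
    """InfoNCE-style loss: small when `relation` is close to the positives."""
    pos = sum(math.exp(cosine(relation, p) / temperature) for p in positives)
    neg = sum(math.exp(cosine(relation, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy usage with made-up 2-D "embeddings".
batch_t = sample_timesteps(num_steps=1000, batch_size=4, rng=random.Random(0))
loss = steering_loss([1.0, 0.0], positives=[[1.0, 0.1]], negatives=[[0.0, 1.0]])
```

In an actual training loop the sampled timesteps would replace uniform sampling in the denoising loss, and the steering term would be added to that loss with some weight.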

ReVersion Benchmark

We collect and release the ReVersion Benchmark:

https://github.com/ziqihuangg/ReVersion#the-reversion-benchmark

It contains a variety of relations, each with multiple exemplar images and human-annotated text descriptions. We also provide a large number of inference templates for common relations. You can use these templates to test whether a learned relation prompt is accurate, or combine them to generate interesting interaction scenes.
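
As a rough illustration of how inference templates can be used, the sketch below fills a few templates with object pairs and a relation token. The token name `<R>`, the template strings, and the object pairs are placeholders chosen for illustration; the actual templates are in the benchmark repository linked above.

```python
# Placeholder for the learned relation prompt (assumed token name).
RELATION_TOKEN = "<R>"

# Hypothetical templates; the benchmark ships its own list.
TEMPLATES = [
    "{a} {r} {b}",
    "a photo of {a} {r} {b}",
    "{a} {r} {b}, oil painting",
]


def build_prompts(pairs, templates=TEMPLATES, relation_token=RELATION_TOKEN):
    """Return one prompt per (object pair, template) combination."""
    return [
        template.format(a=a, r=relation_token, b=b)
        for a, b in pairs
        for template in templates
    ]


prompts = build_prompts([("Spider-Man", "basket"), ("cat", "cardboard box")])
for prompt in prompts:
    print(prompt)
```

Each resulting prompt would then be passed to the text-to-image model that holds the learned relation embedding, and the generated images inspected to see whether the two objects actually interact in the intended way.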

Results

  • Rich and diverse relations

We can invert a rich and diverse set of relations and apply them to new objects.

20a848f152523a6d7509b2e3ea164d81.jpeg          fadf697cb758183f9ede8eff2789dfbc.jpeg

  • Rich variety of backgrounds and styles  

The learned relation can also connect objects in the intended way across different styles and background scenes.

23f731402a695ca4b9d5f628db37c219.jpeg

  • The same relation, rich and diverse object combinations

         5a1d7f3934021c7b5680dc86c799df33.jpeg


Origin blog.csdn.net/amusi1994/article/details/132614002