Memory-guided Unsupervised Image-to-Image Translation

Background: Existing methods often fail on images that contain many distinct objects. They apply a single global style to the entire image, without accounting for the larger style differences between an instance and the background, or even within a single instance.

Methods: We propose a class-aware memory network that explicitly accounts for local style variation. A key-value memory structure with a set of read/update operations is introduced to record class-wise style variations. Keys store domain-agnostic content representations used to allocate the memory items, while values encode domain-specific style representations. We also propose a feature contrastive loss to improve the discriminative power of the memorized items.

Main innovation: class-aware (per-class) handling of styles

Our contributions can be summarized as follows:
• We propose a memory-guided unsupervised I2I translation (MGUIT) framework that stores and propagates instance-level style information in the visual domain. To our knowledge, this is the first work exploring memory networks in I2I translation.
• We introduce a key-value memory structure to efficiently record diverse style variations and access them during I2I translation. Our model does not require an explicit object detection module at test time. We also propose a feature contrastive loss to improve the diversity and discriminative power of our memorized items.
• Our method produces realistic translation results while preserving instance details well; it outperforms recent state-of-the-art methods on standard benchmarks.

This paper presents a new method for image style translation, introducing a class-aware memory network module to improve the quality of the translation results.

The basic idea of style translation is to use convolutional encoders to disentangle an image into content and style, swap in a new style, and generate the result image from the original content and the new style. Most existing methods only consider swapping a single global style, ignoring the differences between instance objects, which causes a loss of detail in the result.
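
A minimal sketch of this disentangle-and-swap idea, using hypothetical tiny PyTorch modules (E_c, E_s, G and all sizes are illustrative assumptions, not the paper's architecture):

```python
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Toy convolutional encoder (illustrative only)."""
    def __init__(self, out_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, out_ch, 3, stride=2, padding=1),
        )

    def forward(self, x):
        return self.net(x)

class TinyDecoder(nn.Module):
    """Decodes a content map combined with a global style code."""
    def __init__(self, content_ch, style_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(content_ch + style_ch, 32, 4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, c, s):
        # Broadcast the global style code over the spatial grid, then decode.
        s_map = s.expand(-1, -1, c.size(2), c.size(3))
        return self.net(torch.cat([c, s_map], dim=1))

E_c, E_s, G = TinyEncoder(64), TinyEncoder(8), TinyDecoder(64, 8)
x, y = torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64)

c_x = E_c(x)                                 # content of x
s_y = E_s(y).mean(dim=(2, 3), keepdim=True)  # global style code of y
x2y = G(c_x, s_y)                            # x rendered in the style of y
print(x2y.shape)  # torch.Size([1, 3, 64, 64])
```

Note that a single global s_y is exactly what the paper argues is insufficient: every instance in the image receives the same style code.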

Our goal is to infer instance-level styles at both training and test time to produce more realistic results. To this end, we employ a novel memory network that stores style information during training and reads the appropriate style representations at inference time.

A memory network is a learnable neural network module that stores information in an external memory and reads relevant content back from it. Key-value memory networks, originally introduced for reading documents, use a memory with a key-value structure: given a query, the keys are used to retrieve the associated memory items, and their corresponding values are returned.

We use the key-value memory to store domain-agnostic content representations (keys) and domain-specific style representations (values).
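
As a concrete picture of this layout, here is a sketch in PyTorch; the number of classes, items per class, and the per-class slicing are illustrative assumptions, not the paper's exact configuration:

```python
import torch

num_classes, items_per_class, dim = 3, 4, 64
N = num_classes * items_per_class  # total memory items

keys     = torch.randn(N, dim)   # domain-agnostic content prototypes
values_x = torch.randn(N, dim)   # domain-X style representations
values_y = torch.randn(N, dim)   # domain-Y style representations

# Class c is served by the item slice [c*items_per_class, (c+1)*items_per_class).
def class_slice(c):
    return slice(c * items_per_class, (c + 1) * items_per_class)

print(keys[class_slice(1)].shape)  # torch.Size([4, 64])
```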

Network structure:
[Figure: overall network architecture]
Class-aware Memory Network
[Figures: memory layout and read/update equations]

Read
[Figure: read operation]
Each content feature C acts as a query: it is compared against the keys to compute addressing weights over the memory items, and the retrieved style is the weighted sum of the corresponding values.
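
A hedged sketch of this read step: addressing weights come from a softmax over each query's similarity to the keys, and the retrieved style is the attention-weighted sum of the values. Cosine similarity and the temperature tau are assumptions, not necessarily the paper's exact choices.

```python
import torch
import torch.nn.functional as F

def memory_read(queries, keys, values, tau=1.0):
    """queries: [P, D] content features; keys/values: [N, D] memory items.
    Returns one style feature per query as a weighted sum of the values."""
    q = F.normalize(queries, dim=1)
    k = F.normalize(keys, dim=1)
    weights = F.softmax(q @ k.t() / tau, dim=1)  # [P, N] addressing weights
    return weights @ values                      # [P, D] retrieved styles

P, N, D = 5, 12, 64
styles = memory_read(torch.randn(P, D), torch.randn(N, D), torch.randn(N, D))
print(styles.shape)  # torch.Size([5, 64])
```

This also shows why no detector is needed at test time: reading only requires the content features and the learned keys.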
Update
[Figures: update equations]
During training, the keys and values are updated from the query features, so that the memory keeps tracking the content and style statistics of each class.
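
One common memory-update scheme, sketched under assumptions (momentum-weighted averaging; the paper's actual rule may differ): each item attends to the queries that address it and is pulled toward them.

```python
import torch
import torch.nn.functional as F

def memory_update(keys, values, queries, styles, momentum=0.999, tau=1.0):
    """Move each key (and its value) toward the queries that address it.
    queries: [P, D] content features; styles: [P, D] style features."""
    q = F.normalize(queries, dim=1)
    k = F.normalize(keys, dim=1)
    # Attention from items to queries: how strongly each item claims each query.
    w = F.softmax(k @ q.t() / tau, dim=1)  # [N, P]
    keys_new = momentum * keys + (1 - momentum) * (w @ queries)
    values_new = momentum * values + (1 - momentum) * (w @ styles)
    return keys_new, values_new

N, P, D = 12, 5, 64
k, v = torch.randn(N, D), torch.randn(N, D)
k, v = memory_update(k, v, torch.randn(P, D), torch.randn(P, D))
print(k.shape, v.shape)  # torch.Size([12, 64]) torch.Size([12, 64])
```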

Loss function:
[Figure: overall objective and the first loss term]
2. Adversarial losses: minimize the distribution discrepancy between two distributions.
• Content discriminator: a content adversarial loss between Cx and Cy, which makes the content of x keep its original content under the style of y.
• Domain discriminator: adversarial losses in the X and Y image domains.
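
As a sketch of how such adversarial terms are usually implemented (the standard non-saturating GAN form; the paper's exact formulation may differ):

```python
import torch
import torch.nn.functional as F

def d_loss(d_real, d_fake):
    # Discriminator: push real logits toward 1 and fake logits toward 0.
    return (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
            + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))

def g_loss(d_fake):
    # Generator/encoder tries to make the discriminator output 1 on fakes;
    # for the content discriminator, the "fake"/"real" logits come from Cx and Cy.
    return F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))

print(d_loss(torch.randn(4, 1), torch.randn(4, 1)))
```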

3. KL loss: makes the style representation close to a prior Gaussian distribution.

4. Latent regression loss L_latent: enforces that the mapping between styles and images is invertible.
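
Both terms are standard in DRIT-style frameworks; a minimal sketch, assuming a Gaussian style encoder with outputs mu/logvar and an L1 regression (weights and exact forms are assumptions):

```python
import torch

def kl_to_standard_normal(mu, logvar):
    # KL( N(mu, sigma^2) || N(0, I) ), summed over style dimensions.
    return 0.5 * torch.sum(mu.pow(2) + logvar.exp() - logvar - 1, dim=1).mean()

def latent_regression(z, z_recovered):
    # L1 between a sampled style code and the style re-encoded from the
    # image generated with it; this encourages an invertible style mapping.
    return (z - z_recovered).abs().mean()

mu, logvar = torch.zeros(2, 8), torch.zeros(2, 8)
print(kl_to_standard_normal(mu, logvar))  # tensor(0.)
```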

5. Feature contrastive loss: encourages the memorized items to be diverse and discriminative.
[Figure: feature contrastive loss]
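
One plausible InfoNCE-style instantiation of such a term, assuming the memory item a query addresses serves as its positive and all other items as negatives (this is an illustration, not the paper's exact formula):

```python
import torch
import torch.nn.functional as F

def feature_contrastive(queries, keys, tau=0.07):
    """Pull each query toward its best-matching memory item and push it
    away from all other items, sharpening item assignments."""
    q = F.normalize(queries, dim=1)
    k = F.normalize(keys, dim=1)
    logits = q @ k.t() / tau           # [P, N] scaled similarities
    positives = logits.argmax(dim=1)   # assumed positive: the addressed item
    return F.cross_entropy(logits, positives)

P, N, D = 5, 12, 64
print(feature_contrastive(torch.randn(P, D), torch.randn(N, D)))
```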
Ablation experiment
[Figure/table: ablation results]

Origin: blog.csdn.net/weixin_44021553/article/details/124714508