A summary of face-swapping methods: generative adversarial networks (GANs), diffusion models, and more

1、One-Shot Face Video Re-enactment using Hybrid Latent Spaces of StyleGAN2

  • StyleGAN's high-fidelity portrait generation has helped one-shot, video-driven face reenactment move past its low-resolution limitations, but existing methods rely on at least one of the following: explicit 2D/3D priors, optical flow as a motion description, or off-the-shelf encoders. These dependencies limit performance (e.g., temporally inconsistent predictions, failure to capture fine facial details and accessories, poor generalization, artifacts).

  • This paper presents an end-to-end framework that simultaneously supports facial attribute editing, facial motion and deformation, and facial identity control for video generation. It encodes a given frame into a hybrid latent representation spanning both the W+ and S (style) spaces of StyleGAN2.

  • The model can generate realistic reenacted videos while also applying latent semantic edits (e.g., beard, age, makeup). Qualitative and quantitative comparisons against state-of-the-art methods demonstrate the superiority of the proposed approach.

  • The project page is at: https://trevineooorloff.github.io/FaceVideoReenactment_HybridLatents.io/


2、DiffFace: Diffusion-based Face Swapping with Facial Guidance

  • This paper proposes a novel diffusion-based face-swapping framework, DiffFace, composed of an ID-conditional DDPM, facial-guidance sampling, and target-preserving blending.

  • Specifically, during training an ID-conditional DDPM learns to generate facial images with a desired identity. During sampling, off-the-shelf face models guide the process so that the source identity is conveyed while the target attributes are faithfully preserved. To preserve the background of the target image, a target-preserving blending strategy is additionally proposed; it lets the model transfer the source identity while keeping the target's attributes untouched by noise. Furthermore, without any retraining, the model can flexibly apply additional facial guidance and adaptively control the identity-attribute trade-off to achieve the desired result (a minimal sketch of the guidance and blending steps follows this list).

  • The paper claims this is the first application of diffusion models to face swapping. Compared with previous GAN-based methods, DiffFace offers better training stability, higher fidelity, and more controllability. Extensive experiments show that DiffFace is comparable to or better than state-of-the-art methods on standard face-swapping benchmarks.

  • Project page: https://hxngiee.github.io/DiffFace/
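
Below is a minimal sketch of the two sampling-time ideas described above: identity guidance of the intermediate denoised estimate, and target-preserving blending. The `id_encoder` callable, the tensor shapes, and the guidance scale are illustrative assumptions, not DiffFace's released interfaces.

```python
import torch
import torch.nn.functional as F

def identity_guidance(x0_hat: torch.Tensor, src_id: torch.Tensor,
                      id_encoder, scale: float = 5.0) -> torch.Tensor:
    """Nudge the current denoised estimate toward the source identity.

    x0_hat: current estimate of the clean image at this diffusion step,
            with requires_grad=True so we can differentiate through id_encoder.
    src_id: identity embedding of the source face (e.g., from an ArcFace-style model).
    id_encoder: assumed off-the-shelf face model mapping images to identity embeddings.
    """
    loss = 1.0 - F.cosine_similarity(id_encoder(x0_hat), src_id, dim=-1).mean()
    grad = torch.autograd.grad(loss, x0_hat)[0]
    return x0_hat - scale * grad  # guided estimate fed back into the sampler

def target_preserving_blend(sample: torch.Tensor, target: torch.Tensor,
                            face_mask: torch.Tensor) -> torch.Tensor:
    """Keep the target background: only the masked face region comes from the diffusion sample."""
    return face_mask * sample + (1.0 - face_mask) * target
```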


3、FlowFace: Semantic Flow-guided Shape-aware Face Swapping

  • This work proposes FlowFace, a two-stage, semantic-flow-guided framework for shape-aware face swapping. Unlike most previous methods, which transfer only the inner facial features of the source and ignore the facial contour, FlowFace transfers both to the target face, yielding a more realistic face swap.

  • Specifically, FlowFace consists of a face reshaping network and a face swapping network. The face reshaping network resolves the shape difference between the source and target faces: it first estimates a semantic flow (i.e., the face-shape difference) between them, and then explicitly warps the target face with the estimated flow (the warping step is sketched after this list). After reshaping, the face swapping network generates inner facial features that carry the identity of the source face. Facial features are extracted from both faces with a pretrained face masked autoencoder (MAE); compared with previous methods that rely on identity embeddings, these features better capture facial appearance and identity information.

  • A cross-attention fusion module is also developed to adaptively fuse the inner facial features of the source face with the attributes of the target face, better preserving identity. Extensive quantitative and qualitative experiments show that FlowFace outperforms the state of the art.
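
As a rough illustration of the reshaping step, the sketch below warps an image with a dense flow field via `grid_sample`. Estimating the semantic flow itself is the job of FlowFace's reshaping network; the pixel-unit flow convention here is an assumption.

```python
import torch
import torch.nn.functional as F

def warp_with_flow(image: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp `image` (N, C, H, W) with a dense flow field `flow` (N, 2, H, W) given in pixels."""
    n, _, h, w = image.shape
    # Base sampling grid in normalized [-1, 1] coordinates (x along width, y along height).
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    base = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(n, -1, -1, -1).to(image)
    # Convert the pixel-space flow into normalized offsets and add it to the base grid.
    offsets = torch.stack((flow[:, 0] * 2 / (w - 1), flow[:, 1] * 2 / (h - 1)), dim=-1)
    return F.grid_sample(image, base + offsets, align_corners=True)
```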


4、Landmark Enforcement and Style Manipulation for Generative Morphing

  • Morph generation with generative adversarial networks (GANs) produces high-quality morphs free of the spatial artifacts introduced by landmark-based methods, but conventional GAN-based morphing loses a significant amount of identity information.

  • This paper proposes a novel StyleGAN morph-generation technique that addresses this issue by introducing a landmark enforcement method. The model's latent space is explored with Principal Component Analysis (PCA), and identity loss is mitigated by averaging in the latent domain (a minimal sketch follows). Furthermore, to improve high-frequency reconstruction in the morphs, the trainability of the StyleGAN2 noise inputs is investigated.
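
The identity-related core of latent-domain morphing can be written in a couple of lines. The sketch below assumes `w1` and `w2` are W+ codes obtained by GAN inversion of the two subjects; a StyleGAN2 generator (not shown) would then synthesize the morph from the result.

```python
import torch

def morph_latents(w1: torch.Tensor, w2: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    """Average two StyleGAN W+ codes; feeding the result to the generator yields the morph.
    alpha = 0.5 weights both identities equally, the usual choice for face morphing."""
    return alpha * w1 + (1.0 - alpha) * w2
```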


5、SimSwap: An Efficient Framework For High Fidelity Face Swapping

  • This paper proposes an efficient framework, SimSwap, aimed at generalized, high-fidelity face swapping. Unlike previous methods that either cannot generalize to arbitrary identities or fail to preserve attributes such as facial expression and gaze direction, SimSwap transfers the identity of an arbitrary source face onto an arbitrary target face while preserving the target's attributes.

  • First, an ID Injection Module (IIM) is proposed, which transfers the source face's identity information into the target face at the feature level. With this module, an identity-specific face-swapping architecture is extended into a framework for arbitrary face swapping.

  • Second, a weak feature matching loss is proposed, which helps the method preserve facial attributes in an implicit manner (a sketch of this loss follows the list). Experiments show that SimSwap achieves competitive identity performance while preserving attributes better than previous state-of-the-art methods.

  • Code: https://github.com/neuralchen/SimSwap
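
A plausible reading of the weak feature matching loss is an L1 match between discriminator features of the swapped result and the target, restricted to the last (deepest) few layers so that attributes are constrained without forcing pixel-level similarity. The sketch below assumes lists of feature maps from a multi-layer discriminator; it is an illustration, not SimSwap's exact implementation.

```python
import torch.nn.functional as F

def weak_feature_matching_loss(fake_feats, target_feats, start_layer: int = 3):
    """L1 feature matching applied only from `start_layer` onward (the deeper layers).

    fake_feats / target_feats: lists of discriminator feature maps for the swapped
    image and the target image, ordered from shallow to deep (assumed convention).
    """
    loss = 0.0
    for f_fake, f_target in zip(fake_feats[start_layer:], target_feats[start_layer:]):
        loss = loss + F.l1_loss(f_fake, f_target.detach())
    return loss
```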


6、FaceDancer: Pose- and Occlusion-Aware High Fidelity Face Swapping

  • This work proposes FaceDancer, a novel single-stage method for subject-agnostic face swapping and identity transfer. It makes two main contributions: Adaptive Feature Fusion Attention (AFFA) and Interpreted Feature Similarity Regularization (IFSR).

  • The AFFA module is embedded in the decoder and adaptively learns to fuse attribute features with identity-conditioned features, without any additional face-segmentation step (a gating sketch follows this list).

  • IFSR leverages intermediate features in the identity encoder to preserve important attributes in the target face, such as head pose, facial expression, illumination, and occlusion, while still transferring the identity of the source face with high fidelity.

  • Extensive quantitative and qualitative experiments on various datasets show that FaceDancer outperforms other state-of-the-art networks in identity transfer, while preserving pose better than most previous methods.
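
The gist of AFFA can be pictured as a learned per-pixel gate over the two feature streams. The layer sizes and the single-convolution gate below are placeholders for illustration, not FaceDancer's actual architecture.

```python
import torch
import torch.nn as nn

class AFFAGate(nn.Module):
    """Learned soft gate deciding, per spatial location, whether to keep the target's
    attribute features or the identity-conditioned features."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, attr_feat: torch.Tensor, id_feat: torch.Tensor) -> torch.Tensor:
        m = self.gate(torch.cat([attr_feat, id_feat], dim=1))  # values in (0, 1)
        return m * id_feat + (1.0 - m) * attr_feat
```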


7、StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment

  • This paper addresses the face reenactment problem: given a pair of source and target facial images, transfer the target's pose (head pose and facial expression) to the source image while preserving the source's identity characteristics (e.g., facial shape, hairstyle), even in the challenging case where the source and target belong to different identities. In doing so, the paper addresses several limitations of state-of-the-art methods, namely that a) they rely on paired training data (i.e., source and target share the same identity), b) they rely on labeled data, and c) they do not preserve identity under large head-pose variations.

  • The method uses unpaired, randomly generated face images to learn to disentangle a face's identity characteristics from its pose by exploiting the recently introduced style space S of StyleGAN2, a latent representation space with useful disentanglement properties. Building on this, a pair of source and target style codes is mixed under the supervision of 3D models (a sketch of the channel-wise mixing follows this list). The resulting latent code, whose units correspond only to the target's facial pose and the source's identity, is then used for reenactment, yielding significantly better reenactment performance than recent state-of-the-art methods.

  • It is shown quantitatively and qualitatively that the proposed method produces higher quality results even under extreme pose variations. Code and pre-trained model: https://github.com/StelaBou/StyleMask
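
Once the pose-related channels of the style space S have been identified, the mixing itself is a channel-wise selection. The boolean `pose_mask` below stands in for whatever (learned) assignment of style units to pose versus identity the method produces; it is an assumption for illustration, not StyleMask's released code.

```python
import torch

def mix_style_codes(s_source: torch.Tensor, s_target: torch.Tensor,
                    pose_mask: torch.Tensor) -> torch.Tensor:
    """Channel-wise mixing in StyleGAN2's style space S: channels flagged as pose- or
    expression-related are taken from the target, all others keep the source identity."""
    return torch.where(pose_mask.bool(), s_target, s_source)
```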


8、StyleSwap: Style-Based Generator Empowers Robust Face Swapping

  • Given its broad applications, face swapping has attracted many attempts. Although most existing methods rely on cumbersome networks and loss designs, they still struggle to balance information from the source and target faces and often produce visible artifacts.

  • This work introduces a concise and effective framework, StyleSwap, whose core idea is to use a style-based generator for high-fidelity and robust face swapping, so the generator's strengths can be exploited to optimize identity similarity. With minimal modifications, the StyleGAN2 architecture can successfully handle the required information from both source and target. In addition, inspired by the ToRGB layers, a swapping-driven mask branch is designed to improve information blending. The method also benefits from StyleGAN inversion: in particular, a swapping-guided ID inversion strategy is proposed to further optimize identity similarity (a latent-refinement sketch follows this list).

  • Extensive experiments verify that the method produces high-quality face-swapping results, outperforming state-of-the-art methods both qualitatively and quantitatively. Videos, code, and models: https://hangz-nju-cuhk.github.io/projects/StyleSwap
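
The identity-optimization idea can be illustrated as a small latent refinement loop: starting from an inverted code, the latent is optimized so that an identity encoder judges the generated face closer to the source, with a regularizer keeping it near the starting point. `generator` and `id_encoder` are assumed callables (e.g., a StyleGAN2 generator and an ArcFace-style embedder), not StyleSwap's released interfaces.

```python
import torch
import torch.nn.functional as F

def refine_latent_for_identity(w_init: torch.Tensor, generator, id_encoder,
                               src_id: torch.Tensor, steps: int = 100,
                               lr: float = 0.01, reg: float = 1.0) -> torch.Tensor:
    """Optimize a latent code toward the source identity while staying close to w_init."""
    w = w_init.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        img = generator(w)
        id_loss = 1.0 - F.cosine_similarity(id_encoder(img), src_id, dim=-1).mean()
        loss = id_loss + reg * F.mse_loss(w, w_init)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()
```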


9、Region-Aware Face Swapping

  • This paper proposes a Region-Aware Face Swapping (RAFSwap) network to achieve identity-consistent, harmonious high-resolution face generation in a local-global manner: 1) a local Facial Region-Aware (FRA) branch augments local identity-relevant features, introducing a Transformer to effectively model misaligned cross-scale semantic interactions (a cross-attention sketch follows this list); 2) a global Source Feature-Adaptive (SFA) branch further complements global identity-relevant cues to generate identity-consistent swapped faces.

  • Furthermore, a Face Mask Predictor (FMP) module coupled with StyleGAN2 is proposed to predict identity-relevant soft facial masks in an unsupervised manner, which is more practical for generating harmonious high-resolution faces. Extensive qualitative and quantitative experiments demonstrate that the method outperforms state-of-the-art methods in generating more identity-consistent, high-resolution swapped faces.
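
The cross-scale interaction in the FRA branch can be pictured as cross-attention in which target-face tokens query source identity tokens. The block below is an illustrative Transformer-style fusion with placeholder dimensions, not RAFSwap's exact design.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Target tokens attend to source tokens so identity-relevant cues flow into the target features."""
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, target_tokens: torch.Tensor, source_tokens: torch.Tensor) -> torch.Tensor:
        fused, _ = self.attn(query=target_tokens, key=source_tokens, value=source_tokens)
        return self.norm(target_tokens + fused)  # residual + layer norm, Transformer-style
```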


10、MobileFaceSwap: A Lightweight Framework for Video Face Swapping

  • Advanced face-swapping methods achieve impressive results, but their parameter counts and computation are large, making them hard to deploy on edge devices such as mobile phones. This work proposes a lightweight Identity-aware Dynamic Network (IDN) that dynamically adjusts model parameters according to the identity information, enabling subject-agnostic face swapping.

  • In particular, an efficient Identity Injection Module (IIM) is designed using two dynamic neural network techniques: weight prediction and weight modulation (a modulation sketch follows this list). Once the IDN has been updated with a source identity, it can be applied to any target image or video. The proposed IDN contains only 0.50M parameters and requires 0.33G FLOPs per frame, enabling real-time video face swapping on mobile phones. In addition, a knowledge-distillation-based training scheme stabilizes training, and a loss reweighting module is adopted for better overall results. The resulting method achieves results comparable to its teacher model and other state-of-the-art methods.
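
One way to picture identity-aware weight modulation is a small linear head that maps the source identity embedding to per-channel scales modulating a shared convolution kernel. The dimensions and the batch-size-1 simplification below are assumptions for illustration, not MobileFaceSwap's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IdentityModulatedConv(nn.Module):
    """Convolution whose kernel is rescaled per input channel by scales predicted
    from a source identity embedding (weight modulation)."""
    def __init__(self, in_ch: int, out_ch: int, id_dim: int = 512, kernel: int = 3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, kernel, kernel) * 0.02)
        self.to_scale = nn.Linear(id_dim, in_ch)  # predicts one scale per input channel

    def forward(self, x: torch.Tensor, id_emb: torch.Tensor) -> torch.Tensor:
        # id_emb: (id_dim,) identity embedding of the source face (single identity for brevity).
        scale = self.to_scale(id_emb).view(1, -1, 1, 1)  # (1, in_ch, 1, 1)
        w = self.weight * scale                          # modulate the shared kernel
        return F.conv2d(x, w, padding=self.weight.shape[-1] // 2)
```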


