Community Contribution | How the FaceChain Open Source Project Works

1. Background

AI portrait apps have recently gone viral on WeChat Moments thanks to their accurate likenesses and polished results. ID photos meet users' practical needs, while period-style and other styled photos meet their desire for attractive portraits.

FaceChain is a deep learning toolchain for creating a personal digital avatar. With as few as three photos, a user can obtain a digital avatar in their own likeness. FaceChain supports model training and inference through a Gradio interface, and experienced developers can also run training and inference with Python scripts. Developers are welcome to build on and contribute to the repository. At the time of writing, the project had been open source for two weeks and had nearly 4K stars. You are welcome to try it through the links below.

GitHub open source address:
https://github.com/modelscope/facechain 
(If you find it useful, please star it~~)

HuggingFace Space experience address:

https://huggingface.co/spaces/modelscope/FaceChain


2. Features

Generate portraits in multiple styles from a single identity with one click:

b9f25ee234d03f10fe0a1fd41b8dcb8d.png

Ready-made style models are plug-and-play: users can select a different style model during training to generate personal digital images in that style. The picture below is an example of the Fengguan Xiapei style (phoenix crown and ceremonial cape, a traditional Chinese bridal costume), produced with the xiapei LoRA model. For more high-quality style LoRA models, see Civitai:

Example address: 
https://www.liblibai.com/modelinfo/f746450340a3a932c99be55c1a82d20c

Civitai URL: 
https://civitai.com/

99749a8c3df2a1eba6a71b0a1d25b79b.png

Personalized prompts: users can add their own prompt words to achieve effects such as changing outfits. The example shown below uses the following clothing prompt words: The lord of the rings, ELF, Arwen Undomiel, beautiful, upper_body, best quality, Professional

723e5a0d5180b73f95bddbe9d646a7e9.png

Other features in progress:

  • Pose specification based on ControlNet or Composer

  • A personalized beautification module

  • Base model upgrade from SD 1.5 to SDXL

  • Compatibility with WebUI

3. Algorithm overview

Basic principle

The capability of the personal portrait model comes from the text-to-image function of the Stable Diffusion model, which takes a piece of text or a series of prompt words as input and outputs the corresponding image. We consider two main factors that affect the quality of generated personal portraits: the style of the photo and the identity of the user. To capture them, we use a style LoRA model trained offline and a face LoRA model trained online. LoRA is a fine-tuning method with few trainable parameters. In Stable Diffusion, the information in a small number of input images can be injected into a LoRA model by running text-to-image training on those images. The personal portrait model therefore works in two stages, training and inference. The training stage produces the image and text label data used to fine-tune the Stable Diffusion model, yielding the face LoRA model; the inference stage generates personal portrait images based on the face LoRA model and the style LoRA model.

be8559306e94d60eb79708d4a0e5a6fb.png
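
To make the "fewer trainable parameters" point concrete, here is a minimal sketch of the standard LoRA formulation, in which a frozen weight matrix W receives a low-rank update B·A. It is not FaceChain's code: the 768 x 768 layer size, the rank, the scale factors, and the helper names `lora_trainable_params` and `merge_lora` are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, adequate for tiny illustrative matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in a low-rank update B (d_out x rank) @ A (rank x d_in)."""
    return rank * (d_out + d_in)


def merge_lora(weight: Matrix, adapters: Sequence[Tuple[Matrix, Matrix, float]]) -> Matrix:
    """Return weight + sum(scale * B @ A) over all adapters; `weight` is not modified."""
    merged = [row[:] for row in weight]
    for b, a, scale in adapters:
        delta = matmul(b, a)
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += scale * value
    return merged


# Assumed sizes: one 768 x 768 projection layer, adapted at rank 4.
full_params = 768 * 768
lora_params = lora_trainable_params(768, 768, rank=4)

# Toy 2 x 2 frozen weight with two rank-1 adapters standing in for
# a "face" LoRA and a "style" LoRA (values are arbitrary).
base = [[1.0, 0.0], [0.0, 1.0]]
face = ([[1.0], [0.0]], [[0.0, 2.0]], 0.5)    # (B, A, scale)
style = ([[0.0], [1.0]], [[4.0, 0.0]], 0.25)  # (B, A, scale)
merged = merge_lora(base, [face, style])

print(full_params, lora_params)
print(merged)
```

Under these assumed sizes the low-rank update trains roughly 1% of the parameters of the full matrix, and several adapters can be added onto the same frozen weight, which is the property that lets a face LoRA and a style LoRA be combined.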

Training stage

Input: User-uploaded images containing a clear face area
Output: Face LoRA model

Description: First, we process the user-uploaded images with an image rotation model based on orientation classification and a fine-grained face rotation method based on face detection and keypoint models, obtaining images that contain forward-facing faces. Next, we apply a human parsing model and a portrait skin retouching model to obtain high-quality face training images. We then use a face attribute model and a text annotation model, combined with label post-processing, to generate refined labels for the training images. Finally, we fine-tune the Stable Diffusion model on these images and labels to obtain the face LoRA model.
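
The order of these preprocessing steps can be sketched as a chain of stages. This is only a structural outline under stated assumptions: every stage name is hypothetical, and each stage is a stub that records its name where a real implementation would call the corresponding model.

```python
from typing import Any, Callable, Dict, List

Sample = Dict[str, Any]
Stage = Callable[[Sample], Sample]


def make_stage(name: str) -> Stage:
    """Build a stub stage that only records its name (a real stage would run a model)."""
    def run(sample: Sample) -> Sample:
        updated = dict(sample)
        updated["steps"] = list(sample.get("steps", [])) + [name]
        return updated
    return run


# Stage order follows the description above; all names are hypothetical.
TRAINING_STAGES: List[Stage] = [
    make_stage("rotate_by_orientation"),
    make_stage("refine_rotation_with_face_keypoints"),
    make_stage("human_parsing"),
    make_stage("skin_retouching"),
    make_stage("face_attribute_tags"),
    make_stage("text_annotation_tags"),
    make_stage("label_post_processing"),
]


def preprocess(sample: Sample, stages: List[Stage]) -> Sample:
    """Apply each stage in order and return the processed sample."""
    for stage in stages:
        sample = stage(sample)
    return sample


result = preprocess({"image": "user_photo_01.jpg"}, TRAINING_STAGES)
print(result["steps"])
```

The processed images and their labels would then be passed to LoRA fine-tuning, which is outside the scope of this sketch.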

Inference stage

Input: The images uploaded by the user during the training stage, and the preset prompt words used to generate personal portraits.
Output: Personal portrait images.

Description: First, we merge the weights of the face LoRA model and the style LoRA model into the Stable Diffusion model. Next, we use the text-to-image function of Stable Diffusion to generate initial personal portrait images from the preset prompt words. Then we use a face fusion model to further refine the facial details of these images; the template face used for fusion is selected from the training images by a face quality assessment model. Finally, we use a face recognition model to compute the similarity between each generated image and the template face, rank the images by that similarity, and output the top-ranked personal portraits as the final result.
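
The final ranking step can be illustrated with a small sketch. The source does not say which similarity measure is used; cosine similarity between face embeddings is assumed here because it is a common choice, and the three-dimensional embeddings and the names `cosine_similarity` and `rank_by_similarity` are made up for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / norm


def rank_by_similarity(
    template: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    top_k: int,
) -> List[Tuple[str, float]]:
    """Score each candidate against the template and keep the top_k most similar."""
    scored = [(name, cosine_similarity(template, emb)) for name, emb in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy embeddings; a real face recognition model would produce much longer vectors.
template_face = [1.0, 0.0, 0.0]
generated = {
    "img_a": [0.0, 1.0, 0.0],  # orthogonal to the template
    "img_b": [2.0, 0.0, 0.0],  # same direction as the template
    "img_c": [1.0, 1.0, 0.0],  # 45 degrees from the template
}
best = rank_by_similarity(template_face, generated, top_k=2)
print(best)
```

Here `img_b` ranks first and `img_c` second, so those two would be returned as the final output when `top_k=2`.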

4. Global Developer Invitation

The Kuwa FaceChain project has been open sourced. We plan to keep working with the open source community to polish the project, unlock more advanced use cases (such as character emoticons, character comic stories, virtual fitting rooms...), pursue deeper algorithmic innovation, and publish the corresponding papers at top conferences. If you are interested in this open source project and believe in its future, you are welcome to sign up.




This article is reproduced from content provided by the community and does not represent the official position. To learn more, please follow Zhihu’s “ModelScope Assistant”.

If you have good articles that you would like to share with more people through our platform, please contact us through this link: 

https://hf.link/tougao


Origin blog.csdn.net/HuggingFace/article/details/132506920