1. Background description
A wave of AI photo apps has gone viral on social media thanks to their accurate likenesses and polished results. ID photos meet users' practical needs, while period-costume photos and other stylized photos satisfy their desire for "beautiful photos".
FaceChain is a deep-learning toolchain for creating a personal digital portrait. Users only need to provide as few as three photos to obtain a digital avatar in their own likeness. FaceChain supports model training and inference through a Gradio interface, and also lets advanced developers run training and inference from Python scripts. Developers are welcome to keep building on and contributing to this repo. The project has been open source for two weeks and already has nearly 4K stars. Everyone is welcome to click the link and try it out.
GitHub open source address:
https://github.com/modelscope/facechain
(If you find it useful, please star it~~)
HuggingFace Space experience address:
https://huggingface.co/spaces/modelscope/FaceChain
2. Functional characteristics
One ID, many styles, generated with one click:
Ready-made style models are plug and play: users can select different style models during training to generate personal portraits in different styles. The picture below shows the fengguan-xiapei (traditional Chinese bridal) style "xiapei" LoRA model. For more high-quality style LoRA models, see Civitai:
Example address:
https://www.liblibai.com/modelinfo/f746450340a3a932c99be55c1a82d20c
Civitai URL:
https://civitai.com/
Personalized prompts let users add their own prompt words to achieve effects such as costume changes. For example, the outfit-selection prompt below: The lord of the rings, ELF, Arwen Undomiel, beautiful, upper_body, best quality, Professional
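In practice a personalized prompt is simply appended to the fixed part of the generation prompt. The following minimal sketch illustrates that composition; the trigger string and function name are hypothetical, not FaceChain's actual internals.

```python
# Hypothetical sketch: combining a fixed identity trigger phrase with
# user-supplied style prompt words into one generation prompt.
# The trigger string below is an illustrative placeholder, not a real
# FaceChain default.

def build_prompt(trigger: str, user_words: list) -> str:
    """Join the identity trigger with user-added prompt words."""
    return ", ".join([trigger] + user_words)

prompt = build_prompt(
    "portrait photo of <face_lora_person>",
    ["The lord of the rings", "ELF", "Arwen Undomiel",
     "beautiful", "upper_body", "best quality", "Professional"],
)
print(prompt)
```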
Other features in progress:
Pose specification based on ControlNet or Composer
A personalized beautification module
Base model upgrade from SD 1.5 to SDXL
Compatibility with Stable Diffusion WebUI
3. Algorithm introduction
Fundamentals
The personal-portrait capability comes from the text-to-image function of the Stable Diffusion model: it takes a piece of text or a series of prompt words as input and outputs a corresponding image. We consider the main factors affecting the quality of generated portraits to be photo style information and user identity information, and we learn them with an offline-trained style LoRA model and an online-trained face LoRA model, respectively. LoRA is a fine-tuning method with few trainable parameters; in Stable Diffusion, the information in a small number of input images can be injected into a LoRA model through text-to-image training on those images. The personal-portrait pipeline is therefore split into two stages: the training stage generates image and text-label data and fine-tunes the Stable Diffusion model to obtain the face LoRA model; the inference stage generates personal portrait images from the face LoRA model and the style LoRA model.
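The core LoRA idea mentioned above can be shown with a toy NumPy sketch: instead of updating the full weight matrix, a low-rank pair of matrices is trained and merged into the frozen weight at inference time. Dimensions here are toy values, not Stable Diffusion's real layer sizes.

```python
import numpy as np

# Minimal sketch of the LoRA idea: keep the base weight W frozen, train a
# low-rank pair (A, B), and merge as W' = W + scale * (B @ A).

rng = np.random.default_rng(0)
d_out, d_in, rank = 8, 8, 2              # rank << d_in => few trainable params

W = rng.standard_normal((d_out, d_in))   # frozen base weight
B = rng.standard_normal((d_out, rank))   # trainable low-rank factors
A = rng.standard_normal((rank, d_in))
scale = 0.5                              # merge strength (alpha / rank)

W_merged = W + scale * (B @ A)           # merged weight used at inference

# Trainable parameters: rank * (d_in + d_out) instead of d_in * d_out.
print(B.size + A.size, "trainable vs", W.size, "full")
```

This is why LoRA training needs only a handful of user photos: the number of trainable parameters grows with the rank, not with the full layer size.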
Training stage
Input: user-uploaded images containing clear face regions
Output: Face LoRA model
Description: First, we process the user-uploaded images with a rotation model based on orientation judgment and a face-refinement rotation method based on face detection and keypoint models, obtaining images with forward-facing faces. Next, we apply a human parsing model and a portrait skin-beautification model to obtain high-quality face training images. We then use a face attribute model and a text annotation model, combined with label post-processing, to generate refined labels for the training images. Finally, we fine-tune the Stable Diffusion model on the above images and labels to obtain the face LoRA model.
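The preprocessing order described above can be sketched as a simple pipeline. Every function below is a stub standing in for a real model (rotation correction, face detection, parsing, retouching, captioning); none of these names are FaceChain's actual APIs.

```python
# Illustrative sketch of the training-stage preprocessing order.
# All functions are stubs for the real models; data is a plain dict.

def correct_orientation(img):
    return {**img, "upright": True}        # orientation judgment + rotation

def crop_face(img):
    return {**img, "face_cropped": True}   # face detection + keypoint refinement

def parse_and_beautify(img):
    return {**img, "clean": True}          # human parsing + skin beautification

def caption(img):
    return "a photo of a person, front face"  # attribute + annotation models

def prepare_training_sample(img):
    img = correct_orientation(img)
    img = crop_face(img)
    img = parse_and_beautify(img)
    return img, caption(img)

sample, label = prepare_training_sample({"path": "user_upload_01.jpg"})
print(sample, "->", label)
```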
Inference stage
Input: the images the user uploaded during the training stage, plus preset input prompt words for generating personal portraits
Output: personal portrait images
Description: First, we fuse the weights of the face LoRA model and the style LoRA model into the Stable Diffusion model. Next, we use the text-to-image function of Stable Diffusion to generate initial personal portrait images from the preset input prompt words. Then we use a face fusion model to further refine the facial details of those images, where the template face used for fusion is selected from the training images by a face quality evaluation model. Finally, we use a face recognition model to compute the similarity between each generated image and the template face, rank the portrait images by similarity, and output the top-ranked images as the final result.
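The final ranking step can be sketched as follows: score each generated image by the cosine similarity between its face embedding and the template face embedding, then keep the top-ranked ones. The embeddings here are random stand-ins for a real face-recognition model's output.

```python
import numpy as np

# Sketch of similarity-based ranking. Random vectors stand in for
# embeddings that a real face-recognition model would produce.

rng = np.random.default_rng(1)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

template = rng.standard_normal(128)        # embedding of the template face
generated = rng.standard_normal((5, 128))  # embeddings of 5 generated images

scores = [cosine(g, template) for g in generated]
top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:3]
print("kept image indices:", top_k)
```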
4. Global Developer Invitation
The FaceChain project has been open sourced. We plan to keep working with the open source community to polish it, unlock more advanced use cases (such as personal emoticons, character comic stories, virtual fitting rooms...), carry out deeper algorithmic innovation, and publish papers at top conferences. If you are interested in this open source project and share its vision, you are welcome to sign up.
Click "Read the original text" to register~
This article is reproduced from community-provided content and does not represent an official position. To learn more, please follow "ModelScope Assistant" on Zhihu.
If you have good articles that you would like to share with more people through our platform, please contact us through this link:
https://hf.link/tougao