Foreword: Google Research and NVIDIA proposed DreamPose, which incorporates pose information by modifying the initial noise input and injects image information by fine-tuning a VAE-CLIP adapter, achieving pose-and-image-to-video generation. It is one of the few image-to-video works built on diffusion models. This blog explains the paper "DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion" in detail.
Table of contents