AI painting, upgraded! Generate infinite-zoom videos with Stable Diffusion in one click

In a previous article, we introduced how to use OpenVINO™ to optimize and accelerate inference of the Stable Diffusion model, so that on an Intel® discrete graphics card we can quickly generate our favorite AI paintings from a text prompt. Today, we upgrade this application scenario once more: thanks to OpenVINO™'s support for and optimization of the Stable Diffusion v2 model, in addition to still images we can also quickly generate videos with an infinite zoom effect on Intel® discrete graphics cards, making AI painting more dynamic and even more striking. Without further ado, let's get to the key points and see how it is done.

Please click here for the complete code for generating infinite zoom videos with Stable Diffusion v2.

The deep learning model we use this time is Stable Diffusion v2. Compared with the previous-generation v1 model, it brings a series of new features, including a new, robust text encoder, OpenCLIP, created by LAION with support from Stability AI; this significantly improves the quality of generated images compared with v1. In addition, the v2 model adds an updated inpainting module. This text-guided inpainting makes swapping out parts of an image easier than ever. It is precisely this new feature that lets us use the stabilityai/stable-diffusion-2-inpainting model to generate videos with an infinite zoom effect.

In image editing, inpainting is the process of restoring missing parts of a picture. It is most commonly used to reconstruct old, degraded photos and to remove cracks, scratches, dust spots, or red eyes. But with the power of AI and the Stable Diffusion model, inpainting can do much more: it can render something completely new in any part of an existing picture, rather than just restoring missing regions. With a little imagination, you can create all kinds of works with cool effects.
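For example, here is a minimal sketch of text-guided inpainting with the Hugging Face diffusers library (this is the standard diffusers API, not the OpenVINO pipeline we build later; the file names and prompt are placeholders):

from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained("stabilityai/stable-diffusion-2-inpainting")

image = Image.open("photo.png").convert("RGB").resize((512, 512))
# White areas of the mask are repainted from the prompt; black areas are kept.
mask = Image.open("mask.png").convert("L").resize((512, 512))

result = pipe(prompt="a red vintage car parked on the street", image=image, mask_image=mask).images[0]
result.save("inpainted.png")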

The following workflow diagram explains how the Stable Diffusion inpainting pipeline works:

In this code example, we will complete the following steps: 

  1. Convert the PyTorch models to ONNX format. 

  2. Use the Model Optimizer tool to convert the ONNX models to OpenVINO IR format (a sketch of these two conversion steps follows this list). 

  3. Run the Stable Diffusion v2 inpainting pipeline to generate videos with an infinite zoom effect. 
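As a rough illustration of steps 1 and 2, exporting one of the sub-models (the text encoder) and converting it could look like the following; the file names and output directory are assumptions, and the actual notebook prepares each sub-model's inputs and outputs more carefully:

import torch
from diffusers import StableDiffusionInpaintPipeline

# Load the original PyTorch pipeline to get at its sub-models.
pipe = StableDiffusionInpaintPipeline.from_pretrained("stabilityai/stable-diffusion-2-inpainting")

# Step 1: export the text encoder to ONNX. The dummy input (one prompt of
# 77 token IDs) matches the CLIP tokenizer's fixed sequence length.
dummy_tokens = torch.ones((1, 77), dtype=torch.long)
torch.onnx.export(pipe.text_encoder, dummy_tokens, "text_encoder.onnx", input_names=["tokens"])

# Step 2: convert the ONNX file to OpenVINO IR with the Model Optimizer CLI:
#   mo --input_model text_encoder.onnx --output_dir sd2_inpainting_ir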

Now, let's focus on how to configure the code for the inference pipeline. 

There are three main steps here: 

  1. Load the models on the inference device. 

  2. Configure the tokenizer and scheduler. 

  3. Create an instance of the OVStableDiffusionInpaintingPipeline class. 

We load the models and run inference on a machine equipped with an Intel® Arc™ discrete graphics card, so we choose "GPU" as the inference device. The default is "AUTO", which automatically switches to a GPU when one is detected. The code is as follows:

from transformers import CLIPTokenizer
from openvino.runtime import Core

core = Core()

# Tokenizer that turns the text prompt into token IDs for the text encoder
tokenizer = CLIPTokenizer.from_pretrained('openai/clip-vit-large-patch14')

# Compile the converted IR models on the Intel discrete GPU
text_enc_inpaint = core.compile_model(TEXT_ENCODER_OV_PATH_INPAINT, "GPU")
unet_model_inpaint = core.compile_model(UNET_OV_PATH_INPAINT, "GPU")
vae_decoder_inpaint = core.compile_model(VAE_DECODER_OV_PATH_INPAINT, "GPU")
vae_encoder_inpaint = core.compile_model(VAE_ENCODER_OV_PATH_INPAINT, "GPU")

# scheduler_inpaint is assumed to have been created earlier in the notebook
ov_pipe_inpaint = OVStableDiffusionInpaintingPipeline(
    tokenizer=tokenizer,
    text_encoder=text_enc_inpaint,
    unet=unet_model_inpaint,
    vae_encoder=vae_encoder_inpaint,
    vae_decoder=vae_decoder_inpaint,
    scheduler=scheduler_inpaint,
)
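Each frame of the infinite zoom is produced by shrinking the previous frame, pasting it into the centre of the canvas, and letting the inpainting pipeline paint in the exposed border. A minimal sketch of one such step follows; the zoom_step helper is our own illustration, and the pipeline's call signature is assumed to mirror the diffusers inpainting API rather than being the notebook's exact code:

from PIL import Image

def zoom_step(pipe, prev_frame, prompt, negative_prompt, mask_width, steps):
    # Shrink the previous frame and paste it into the centre of a square canvas.
    size = prev_frame.width  # assuming a square frame, e.g. 512 x 512
    inner = size - 2 * mask_width
    canvas = Image.new("RGB", (size, size))
    canvas.paste(prev_frame.resize((inner, inner)), (mask_width, mask_width))

    # White border = region for the model to paint; black centre = keep as-is.
    mask = Image.new("L", (size, size), 255)
    mask.paste(0, (mask_width, mask_width, mask_width + inner, mask_width + inner))

    return pipe(prompt=prompt, negative_prompt=negative_prompt,
                image=canvas, mask_image=mask, num_inference_steps=steps)

Repeating this step once per requested frame yields the image sequence that is assembled into the final zoom video.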

  

Next, let's enter the text prompt and run the video-generation code.

import ipywidgets as widgets

zoom_prompt = widgets.Textarea(value="valley in the Alps at sunset, epic vista, beautiful landscape, 4k, 8k", description='positive prompt', layout=widgets.Layout(width="auto"))
zoom_negative_prompt = widgets.Textarea(value="blurry, bad art, blurred, text, watermark", description='negative prompt', layout=widgets.Layout(width="auto"))
zoom_num_steps = widgets.IntSlider(min=1, max=50, value=20, description='steps:')
zoom_num_frames = widgets.IntSlider(min=1, max=50, value=3, description='frames:')
mask_width = widgets.IntSlider(min=32, max=256, value=128, description='edge size:')
zoom_seed = widgets.IntSlider(min=0, max=10000000, description='seed:', value=9999)
zoom_in = widgets.Checkbox(value=False, description='zoom in', disabled=False)

widgets.VBox([zoom_prompt, zoom_negative_prompt, zoom_seed, zoom_num_steps, zoom_num_frames, mask_width, zoom_in])

In this step, I set the number of inference steps to 20; ideally I would use 50, which gives the best-looking results. You can also choose the number of frames to generate, and all the generated frames are combined into the final infinite zoom video. We also export a GIF file so you can view the result in more than one form.
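For example, with Pillow the generated frames can be written out as a looping GIF; frames is assumed here to be the list of PIL images produced by the pipeline, one per generated frame:

# Illustrative only: `frames` is an assumed name for the list of generated PIL images.
frames[0].save(
    "infinite_zoom.gif",
    save_all=True,
    append_images=frames[1:],
    duration=125,  # milliseconds per frame, i.e. 8 fps
    loop=0,        # loop forever
)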

The final result:

[Video: stable_diffusion_video]

Summary


At the moment, if you want to understand how Stable Diffusion works and how Intel hardware accelerates it, the OpenVINO Notebooks are undoubtedly the first choice. If you have any questions, or want to show off some of your best work, please comment below or on our GitHub discussion board! Happy coding, everyone.
