AI video style conversion: Stable Diffusion + EBSynth

The converted video this time is fairly stable. Let me show you the result first.

The video cannot be embedded here, so I have put it on a network drive instead: https://www.aliyundrive.com/s/5mzfjLViyDa

Continuing from where we left off: in the previous article we first used TemporalKit to extract keyframe images from the video, then used Stable Diffusion to redraw those images, and finally used TemporalKit again to fill in the sequence frames between the redrawn keyframes and stitch them back into a video.

In this article we will use a new tool, EBSynth, to generate the sequence frames between the redrawn keyframe images. The other steps stay the same as before, but a few parameters need adjusting. Let me walk through it step by step.

The reason I want to try EBSynth is that many tutorials recommend this method rather than the one in my previous article, so I decided to test it and compare the two.

I won't cover plug-in installation here; please see the previous article. Let's start with extracting keyframes.

Extract keyframes

Why extract keyframes? Keyframe extraction turns the frames where the motion changes most into still images, and the next step redraws those images. If you skip keyframe extraction and redraw every single frame of the video instead, first the workload becomes huge, and second each redrawn frame may differ slightly from its neighbors, so the picture can flicker badly.

Find Temporal-Kit in the top tab bar of SD WebUI and open it. Then click "Pre-Processing" and upload the video to be processed in the video area. Mine is a clip I captured from Douyin (a download link for this video is provided at the end of the article). Don't click "Run" right away; there are still settings to configure, described below.

Below the video you can see the settings for extracting the images:

Sides: how many video frames each side of the generated grid image contains. A value of 2 means 2*2 = 4 video frames per image; 3 means 3*3 = 9; the minimum is 1, i.e. one image contains a single video frame. This should be set together with Height Resolution below.

Height Resolution: the pixel height of the generated image. The recommendation is: height of the video * Sides. For example, my video is 1080*720, so a single video frame is 720 pixels high; with Sides set to 2 that gives 720*2 = 1440. This formula is not absolute, though: you could also enter 720 or 2048. The value needs to take your graphics card into account; if the card is not powerful enough, do not set it too high.

frames per keyframe: how many video frames correspond to one extracted keyframe. The denser the keyframes, the smoother the motion looks when stitched back together; but more keyframes also mean more redrawn images that can differ from each other, which may make the picture flicker.
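To make the arithmetic concrete, here is a minimal Python sketch of how these values relate. The numbers are just the example from my clip; plug in your own video's values, and note that the keyframe count is only a rough estimate, not exactly what TemporalKit will produce:

video_width, video_height = 1080, 720        # resolution of my source clip
sides = 2                                    # 2*2 = 4 video frames per generated grid image
height_resolution = video_height * sides     # recommended value: 720 * 2 = 1440

fps = 30                                     # frames per second of the source video (example value)
duration_seconds = 10                        # example clip length
frames_per_keyframe = 5                      # one keyframe for every 5 video frames (example value)
total_frames = fps * duration_seconds
approx_keyframes = total_frames // frames_per_keyframe

print(f"Height Resolution: {height_resolution}, approx. keyframes: {approx_keyframes}")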

EBSynth Mode: since we will use EBSynth for processing later, check this option here; the generated files will then be named in the format that the later EBSynth steps expect.

fps: how many frames per second the video contains. You can usually find this by viewing the video's details/properties on your computer.
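If you would rather read it programmatically, a few lines of Python will also report the frame rate and resolution, assuming the opencv-python package is installed (the file name below is just a placeholder for your own video):

import cv2

cap = cv2.VideoCapture("dehua2.mp4")                    # path to your source video
fps = cap.get(cv2.CAP_PROP_FPS)                         # frames per second
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))          # frame width in pixels
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))        # frame height in pixels
frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))    # total number of frames
cap.release()

print(f"fps={fps}, resolution={width}x{height}, frames={frame_count}")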

Target Folder: where the keyframe images are written. They are actually output to an input folder created inside this directory. All the intermediate files of the later steps also live under this directory, so it effectively acts as a project directory; it is therefore recommended to use a separate folder for each video you process. Note that if you run in the cloud, this must be a directory on the server.

Batch Settings: because we are processing the entire video here, check Batch Run.

EBSynth Settings: because EBSynth can only handle relatively short clips at a time, the video has to be split into several segments for processing.

After setting the parameters, click "Run" on the right side of the page.

After all the keyframe images have been extracted, the first extracted image should appear in the image area here, but in my case nothing showed up; I am not sure whether that was a network issue. Keep an eye on the processing progress.

  • If an image is displayed here, just click over to img2img to redraw it.
  • If no image is displayed, go to the file directory, find the first keyframe, download it, and then upload it manually in img2img.

Convert style

Now switch to the img2img interface.

The watercolor style last time came out rather blurry, so this time I went straight to a model with cleaner lines: toonyou. It is a comic-style model and does not need to be paired with a LoRA. The download address is at the end of the article.

My prompts are posted here for easy copying.

Prompt: a man, epic scene, a poster, flat color,

Negative prompt: easy_negative, beard

Then there are some parameter settings. Adjust them to your actual situation; if the result is not good, keep tweaking.

A few points to note:

  • Image width and height: generally just match the source video. If the values would be too large, it is recommended to generate at a smaller size first and then enlarge with an upscaler.
  • Denoising strength: I turned this up to the maximum. The previous article said not to set it too high, because a large denoising range lets the images change a lot and the assembled video may flicker. In practice, though, the effect of denoising strength on the toonyou model seems fairly small. Try it for your own case; you do not have to copy my value exactly.
  • Seed: set it to -1 at first, until an image you are satisfied with comes out.

Next, continue to use ControlNet to constrain the generation, so that the redraw does not change too much and the picture stays as stable as possible. I chose the Tile model here, but you can also try edge-based models such as SoftEdge, Canny, and Lineart.

Then it is a matter of rolling the dice: keep generating until you get an image you are satisfied with.

Remember to note down the seed of the satisfactory image; it will be needed in batch generation shortly.

Switch img2img to the "Batch" tab and fill in the two directories:

  • Input directory: the directory where the keyframe images were written in the extraction step, i.e. the input folder under the project directory.
  • Output directory: the directory where the redrawn images will be saved. This is the fixed value output; just fill it in as-is.

Also fill in the seed of the satisfactory image here. Be aware, though, that a single seed can rarely keep every element of the picture stable; you will see this for yourself.

The last step is to click the "Generate" button and wait for the batch processing to complete.

When you see this message below the image output area, processing is basically complete. Sometimes the WebUI progress display does not update in time, so also keep an eye on the console or shell output.
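As a simple sanity check, you can compare the number of extracted keyframes with the number of redrawn images to make sure nothing was skipped. This is just a sketch, assuming PNG output and using my project path as a placeholder; replace it with your own:

from pathlib import Path

project = Path("/root/autodl-tmp/webui_outputs/temporal-kit/dehua2")   # your project directory
keyframes = sorted((project / "input").glob("*.png"))    # keyframes extracted in step 1
redrawn = sorted((project / "output").glob("*.png"))     # redrawn images from the img2img batch

print(f"keyframes: {len(keyframes)}, redrawn: {len(redrawn)}")
if len(keyframes) != len(redrawn):
    print("Counts differ: re-run the batch or check the console for failed images.")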

EBSynth processing

Preprocessing

When we extracted the video keyframes earlier, the original video was split into several sub-videos. Preprocessing here extracts, for each sub-video, its original video frames and its keyframes and puts them into the corresponding folders.

Find Temporal-Kit in the top tab bar of SD WebUI, then click "Ebsynth-Process". Pay attention to the order of operations here; just follow the numbered steps in the screenshot.

Enter the project directory in "Input Folder" and click "read_last_settings"; the video and most of the parameters will be loaded automatically.

Enter the resolution of the video in "output resolution". This will not be loaded automatically.

Finally, click the "prepare ebsynth" button to run the processing.

The folders of the sub-videos are as follows:

Open one of the sub-video folders and you can see its contents. The images in the frames and keys directories were written by this processing step. In the next step we will use EBSynth to continue processing these images.
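For reference, after this step my project directory looked roughly like the sketch below. This is only an illustration from my run; the exact folder names can differ between TemporalKit versions:

  dehua2/            the project directory
    input/           keyframes extracted in the first step
    output/          redrawn keyframes from the img2img batch
    0/               first sub-video
      frames/        original video frames of this segment
      keys/          redrawn keyframes belonging to this segment
    1/               second sub-video, same structure
    ...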

If your SD WebUI runs in the cloud, the project directory needs to be downloaded to your local machine first, because the EBSynth tool runs locally. You can use the following command to package it into a zip file (remember to replace the path with your own project directory):

zip -q -r dehua2.zip /root/autodl-tmp/webui_outputs/temporal-kit/dehua2

Then download it using whatever method your cloud platform supports.

Generate sequence frames

This step requires the EBSynth software. The official site provides Windows and Mac versions; however, when I ran it on Mac it exited immediately. I did not dig into the reason and simply switched to a Windows environment.

Software download address: EbSynth - Transform Video by Painting Over a Single Frame. Click the Download button, then click "No thanks, just start the download" at the bottom of the pop-up window. The website automatically detects your operating system and downloads the corresponding version of the program.

After starting EBSynth, we only need to do the following:

Drag the frames directory of the sub-video to the Video input box;

Drag the sub-video's keys directory to the keyframes input box;

The image tasks to be processed will then appear automatically in the lower part of the window.

Finally, click "Run All" at the bottom of the software to start processing this sub-video.

This process takes quite a while, so be patient. Each sub-video needs to be processed in this way.

After all sub-videos have been processed, you can enter the stage of synthesizing the video.

If your SD WebUI runs in the cloud, you now need to upload the results: package them into a zip file locally, upload it using the method your cloud platform provides, and then unzip it over the project directory. Refer to the following command:

unzip -o /root/autodl-tmp/root.zip -d /

Here -o overwrites existing files without prompting, the path that follows is the zip file, and -d specifies the directory to extract into. Because the top-level folder inside my zip file is root, I extract to / here. Remember to replace the zip file path and the extraction directory with your own.

Synthesize the video

Now we reach the exciting video synthesis stage again. Go back to the Temporal-Kit page in SD WebUI.

This step is very simple: just click the "recombine ebsynth" button and wait for the video to be generated. Normally it is very fast, because all the frames the video needs have already been produced.

When Batch-Warp was used previously, it also had to generate a pile of frame images itself, which is why it was slow back then; in this article's method, that work is handed over to EBSynth.

Download

You can download and install the models, plug-ins and materials mentioned in this article yourself using the methods described above, or use the package I prepared: follow the public account Yinghuo Walk AI (yinghuo6ai) and reply "Convert the video style" to get the download address.

Original article: blog.csdn.net/bossma/article/details/131915719