AI Painting Stable Diffusion Research (14): SD Images + Jianying (CapCut) to Make a Talking-Character Video


Hi everyone, I am Rain or Shine.


In the previous article, we covered in detail how to use SadTalker to create digital human videos. If you are interested, see: AI Painting Stable Diffusion Research (13) SD Digital Human Production Tool SadTalker Usage Tutorial.


If you have not installed the SadTalker plug-in yet, see: AI Painting Stable Diffusion Research (12) SD Digital Human Production Tool SadTalker plug-in installation tutorial.


Anyone who has used SadTalker knows that producing talking digital human videos with the plug-in has two drawbacks:

(1). Generating a video takes a long time. On machines with a weaker graphics card and less memory, making a long video is even slower.

In the author's own test on a 3060 graphics card with 12 GB of VRAM, a video of about 15 seconds took about 10 minutes to generate.

(2). SadTalker currently works only with front-facing portraits, which can make the result look abrupt.


Is there another way to make a digital human video that still gets the character to open its mouth and speak, is more efficient, and also works with non-frontal images?

The answer is yes, and that is today's topic: use SD to generate a picture of the character with its mouth open, then use Jianying (CapCut) to edit the pictures into a talking video.


1. Using SD to generate a picture of the character with its mouth open


1. Switch to the SD img2img -> Inpaint tab and upload a picture of the character.


Insert image description here


2. Interrogate the picture to recover its positive prompt


Insert image description here


3. Rewrite the positive prompt so that the character opens its mouth


If you have not installed the prompt word plug-in, see AI Painting Stable Diffusion Research (6) sd prompt word plug-in for detailed installation steps.


(1) We use the prompt word plug-in. In the prompt box, first type the Chinese phrase for "open your mouth".

The plug-in automatically translates Chinese prompt words into English.


As shown in the picture:

Insert image description here


(2) To make the open mouth more pronounced and keep SD from ignoring it, we increase the weight of the open-mouth prompt word.


Select the open-mouth prompt word and the weight controls pop up. Click the increase-weight button three times; the prompt input box updates automatically to show the higher weight.


As shown in the picture:

Insert image description here


Insert image description here


Insert image description here
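For reference, here is a minimal Python sketch of what raising a prompt word's weight means. In the Stable Diffusion web UI prompt syntax, one pair of parentheses multiplies the attention on a phrase by 1.1, and `(phrase:1.3)` sets the weight explicitly. Which of the two forms the prompt word plug-in writes after three clicks is not stated in this article, so the helper names `nest_weight` and `explicit_weight` are hypothetical and the output is only an illustration.

```python
def nest_weight(phrase: str, clicks: int) -> tuple[str, float]:
    """Wrap `phrase` in `clicks` pairs of parentheses.

    Each pair multiplies the attention weight by 1.1 in the web UI syntax,
    so the effective weight is 1.1 ** clicks.
    """
    text = "(" * clicks + phrase + ")" * clicks
    return text, round(1.1 ** clicks, 3)


def explicit_weight(phrase: str, weight: float) -> str:
    """Write the weight explicitly, e.g. (open mouth:1.3)."""
    return f"({phrase}:{weight:g})"


if __name__ == "__main__":
    print(nest_weight("open mouth", 3))
    print(explicit_weight("open mouth", 1.3))
```

Either form tells SD to pay more attention to the open-mouth phrase than to the rest of the prompt.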


(3) Paint the area to redraw

In the Inpaint interface, select the brush on the right and use Ctrl + mouse wheel to adjust the brush size, then paint over the area to be redrawn (here, the mouth).


Insert image description here


(4) Adjust the redraw size

Insert image description here


(5) Enable ControlNet to keep the character's pose unchanged

  • Enable ControlNet
  • Control type: OpenPose
  • Preprocessor: openpose_full
  • Model: control_v11p_sd15_openpose

Insert image description here
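The same settings can also be expressed as a request to the web UI's HTTP API instead of clicking through the interface. The sketch below only builds the request body and does not send it. The field names follow the `/sdapi/v1/img2img` endpoint and the ControlNet extension's `alwayson_scripts` format, but they vary between versions, so treat them as assumptions and check the `/docs` page of your own installation. The function name `build_inpaint_payload` and the denoising strength of 0.6 are illustrative choices, not values from this article.

```python
import base64
import json


def build_inpaint_payload(image_bytes: bytes, mask_bytes: bytes, prompt: str) -> dict:
    """Build an inpaint request body with one OpenPose ControlNet unit."""

    def b64(data: bytes) -> str:
        return base64.b64encode(data).decode("ascii")

    return {
        "prompt": prompt,
        "init_images": [b64(image_bytes)],  # the original character picture
        "mask": b64(mask_bytes),            # white = area to redraw (the mouth)
        "denoising_strength": 0.6,          # assumption, tune to taste
        "inpainting_fill": 1,               # 1 = start from the original content
        "alwayson_scripts": {
            "controlnet": {
                "args": [
                    {
                        "enabled": True,
                        "module": "openpose_full",              # preprocessor
                        "model": "control_v11p_sd15_openpose",  # may need a hash suffix
                    }
                ]
            }
        },
    }


if __name__ == "__main__":
    payload = build_inpaint_payload(b"<png bytes>", b"<mask bytes>", "(open mouth:1.3)")
    print(json.dumps(payload, indent=2)[:300])
```

With the web UI started with the `--api` flag, a body like this would be sent as a POST request to `http://127.0.0.1:7860/sdapi/v1/img2img`.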


(6) Click Generate to get a picture of the character opening his mouth.

The comparison chart is as follows:

Insert image description here


We now have a picture of the character with its mouth open. Next, we use Jianying (CapCut) to add the voice-over and subtitles and create a video of the character speaking.


2. Using Jianying (CapCut) to create the talking video

1. Preparation

  • Install Jianying (CapCut). Installation is simple, so the steps are not covered here.

  • Prepare two pictures of the character: one with the mouth open and one with the mouth closed

  • Prepare the audio file


2. Open Jianying and click the button to start a new project.

Insert image description here


3. Import audio and pictures

As shown in the picture:


Insert image description here


4. Drag the audio into the audio track below

Insert image description here


5. Generate subtitles

Click the "Text" button on the menu bar, then click the "Smart Subtitles" button on the left, and then click the "Start Recognition" button to generate subtitles


Insert image description here


Subtitles are generated as follows:

Insert image description here
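The later steps work in frames, while subtitle timings are normally given as timestamps. As a small aid, the sketch below converts a timestamp in the standard SRT form `HH:MM:SS,mmm` into a frame number on a 30 fps timeline. It assumes the subtitles are available as SRT-style timestamps, which this article does not cover; the function name `srt_time_to_frame` is hypothetical.

```python
import re

# SRT timestamps look like 00:00:01,500 (hours:minutes:seconds,milliseconds)
_TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def srt_time_to_frame(timestamp: str, fps: int = 30) -> int:
    """Convert an SRT timestamp to a frame index, rounding down."""
    match = _TIMESTAMP.fullmatch(timestamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {timestamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    total_ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
    return total_ms * fps // 1000


if __name__ == "__main__":
    print(srt_time_to_frame("00:00:01,500"))
```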


6. Drag the pictures onto the track, then alternate the mouth shapes

(1) How do we alternate the mouth shapes so that the character appears to speak?


Anyone who has used Jianying knows that one second on its timeline is 30 frames.

A person normally speaks about 5-6 words per second.

So each mouth shape lasts about 5 frames (30 / 6 = 5).

We therefore start with the closed-mouth picture and switch to the open-mouth picture at frame 5.
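The arithmetic above can be sketched in Python. The function names are hypothetical; the 30 fps and 5-6 words per second figures come from the text, and rounding to a whole number of frames is an assumption.

```python
def frames_per_mouth_shape(fps: int = 30, words_per_second: float = 6.0) -> int:
    """Number of frames each mouth shape is held, at least 1."""
    return max(1, round(fps / words_per_second))


def mouth_cycle(n_frames: int, hold: int = 5) -> list[str]:
    """Label each frame: 'closed' for `hold` frames, then 'open', repeating."""
    return ["closed" if (i // hold) % 2 == 0 else "open" for i in range(n_frames)]


if __name__ == "__main__":
    hold = frames_per_mouth_shape()      # 30 / 6
    print(hold, mouth_cycle(12, hold))
```

At 5 words per second the same calculation gives 6 frames per mouth shape, so anything in the 5-6 frame range matches the estimate.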


(2) Production steps


Step 1: Import the open-mouth picture and the closed-mouth picture into two separate picture tracks

Step 2: Drag the track zoom slider on the right all the way to the right so that the frame numbers on the track are clearly visible, such as 1f, 2f, 4f, 6f

As shown in the picture:

Insert image description here


Step 3: Move the playhead to frame 5, then click the split button to split both the open-mouth and closed-mouth pictures.

as follows:


Insert image description here


Step 4: Trim the closed-mouth picture. After the split, delete the first 5 frames and the unused closed-mouth segment.

as follows:


Insert image description here


Step 5: Count another 5 frames and split again

Insert image description here


Step 6: Delete the redundant parts of the open-mouth and closed-mouth pictures


Insert image description here


Step 7: Drag the open-mouth and closed-mouth clips onto the same track and group them

Insert image description here


Select the two clips, right-click, and create a new compound clip to group them


Insert image description here


The result is a single clip in which the mouth is closed and then opens.

After grouping, it looks like this:

Insert image description here


Step 8: Copy and paste the grouped clip repeatedly until the sentence is covered

Then align the end of the last picture clip with the end of the subtitle.

As shown in the picture:

Insert image description here


Step 9: Where the character is not talking, use the closed-mouth picture

Note: align the end of the picture with the end of the silent gap

as follows:

Insert image description here


Step 10: Continue with the following spoken parts, repeating step 8 until every sentence is processed

Insert image description here
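The manual layout in steps 8 to 10 follows a simple rule: alternate closed and open every 5 frames while a sentence is spoken, and hold the closed-mouth picture during pauses. The sketch below plans such a timeline from a list of spoken intervals. It only models the idea and does not control the editor; `Segment`, `build_timeline`, and the example frame numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    image: str  # "closed" or "open"
    start: int  # first frame, inclusive
    end: int    # last frame, exclusive


def build_timeline(speech: list[tuple[int, int]], total_frames: int, hold: int = 5) -> list[Segment]:
    """Plan picture segments for sorted, non-overlapping spoken intervals (in frames)."""
    timeline: list[Segment] = []
    cursor = 0
    for start, end in speech:
        if start > cursor:
            # Silent gap before this sentence: hold the closed-mouth picture.
            timeline.append(Segment("closed", cursor, start))
        frame, image = start, "closed"
        while frame < end:
            # Alternate closed/open every `hold` frames, cut off at the sentence end.
            stop = min(frame + hold, end)
            timeline.append(Segment(image, frame, stop))
            image = "open" if image == "closed" else "closed"
            frame = stop
        cursor = end
    if cursor < total_frames:
        # Trailing silence after the last sentence.
        timeline.append(Segment("closed", cursor, total_frames))
    return timeline


if __name__ == "__main__":
    for segment in build_timeline([(0, 12), (30, 40)], total_frames=50):
        print(segment)
```

Each sentence starts on the closed-mouth picture, matching the order used in the grouped clip from step 7.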


Step 11: Set the video aspect ratio to 9:16, then export the video

Click the export button in the upper right corner to export.

Insert image description here


Insert image description here
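As a quick check of what a 9:16 canvas means for the source pictures, the sketch below computes the canvas size for a given width and the size a picture would have when scaled to fit inside it without cropping. The 1080-pixel canvas width and the 512x768 example picture are assumptions for illustration, and `fit_to_9_16` is a hypothetical helper.

```python
def fit_to_9_16(img_w: int, img_h: int, canvas_w: int = 1080) -> tuple[tuple[int, int], tuple[int, int]]:
    """Return (canvas size, scaled picture size) for a 9:16 portrait canvas."""
    canvas_h = canvas_w * 16 // 9
    # Scale so the whole picture fits inside the canvas (no cropping).
    scale = min(canvas_w / img_w, canvas_h / img_h)
    return (canvas_w, canvas_h), (round(img_w * scale), round(img_h * scale))


if __name__ == "__main__":
    print(fit_to_9_16(512, 768))
```

A picture whose own ratio is not 9:16 leaves empty bands on the canvas unless it is cropped or enlarged further.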


That completes the video. Let's take a look at the result:

Video: talking character made with SD pictures + Jianying (CapCut)

To be honest, this video only shows the mouth opening and closing, with no change in facial expression, so it does look rather stiff.

However, the focus here is on the production approach and the editing method. If you are interested, give it a try.


Origin blog.csdn.net/lizhong2008/article/details/132452218