How to Use ModelScope to Create an "AI Face-Swap" Video

Foreword

        Video content is hot these days, and controversial or before-and-after face-swap videos reliably attract attention. AI face swapping has been popular for a long time, and there is no shortage of related tools and mobile apps, but most of them require a membership or a fee, and the replacement templates they offer are limited. This article takes a hands-on approach, using Alibaba ModelScope's image face fusion model to implement AI video face swapping.

Process

       Given a source video and a replacement face image: use opencv-python to split the video into frame images, and use FFmpeg to extract the video's audio track into a separate file (mp3). Then iterate over every frame image in the directory, pass the new face and the frame image into ModelScope's face fusion model, and obtain a face-swapped frame. Finally, combine the swapped frames back into a new video with opencv-python, and add the extracted audio file back with FFmpeg.

Environment

1. Python 3.7.16

2. ModelScope 1.4.2

3. OpenCV-Python 4.7.0

4. FFmpeg 12.2.0

Environment installation

1. Create a Python virtual environment

conda create -n modelscope python=3.7 && conda activate modelscope

2. Install ModelScope (using a Chinese mirror source)

pip install modelscope --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple

3. Install OpenCV

pip install opencv-python -i https://pypi.tuna.tsinghua.edu.cn/simple

4. Install FFmpeg

FFmpeg is not installed via pip, so its installation steps are covered in the Video Face Swap section below.

Image Face Swap

1. Material preparation

     For test material I prepared images with a frontal face, a side face, and one image containing two faces, then ran the code with a replacement face image to see the result. (Perhaps because of the model, the swapped face does not look very different from the original at first glance; it is a bit as if only a beautification filter had been applied. It may also be that the two actors simply look alike, though the differences are visible on close inspection.)

2. Code section

import cv2
from modelscope.outputs import OutputKeys
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Create the image face fusion pipeline
image_face_fusion = pipeline(Tasks.image_face_fusion,
                             model='damo/cv_unet-image-face-fusion_damo')
template_path = '181.jpg'  # template image whose face will be replaced
user_path = 'face.jpg'     # image providing the new face
result = image_face_fusion(dict(template=template_path, user=user_path))

# Save the fused image
cv2.imwrite('result.png', result[OutputKeys.OUTPUT_IMG])
print('finished!')

Video Face Swap

1. FFmpeg installation

If you are on Windows 10, choose a build by its name: builds labeled "shared" are dynamically linked, while builds without "shared" in the name are static, with all functionality bundled into the executables.

2. FFmpeg environment configuration

After downloading and extracting the archive, add the resulting bin directory to the system PATH environment variable, then verify the installation with ffmpeg -version.
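To confirm the PATH change took effect from Python as well, the standard library's shutil.which can locate the executable. A small sketch (the function name is mine; it returns None when ffmpeg is not on PATH):

```python
import shutil

def ffmpeg_path():
    """Return the full path of the ffmpeg executable found on PATH,
    or None when it is not installed / PATH is not configured."""
    return shutil.which("ffmpeg")

print(ffmpeg_path())
```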

3. FFmpeg usage

3.1. Extract audio from a video (the input video and output audio paths can be relative)

ffmpeg -i videos\11.mp4 -q:a 0 -map a audio\audio.mp3 

3.2. Add an independent audio file to a video (takes an input video and an input audio file, produces a new output video)

ffmpeg -i videos/ldh.mp4 -i audio/audio.mp3 -c:v copy -c:a aac -strict experimental videos/new_ldh.mp4
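The same two commands can also be driven from Python via subprocess; building the argument list explicitly avoids shell-quoting problems with paths. A sketch under the assumption that ffmpeg is on PATH (the helper names are mine, not part of any library):

```python
def extract_audio_cmd(video, audio):
    # Equivalent of: ffmpeg -i <video> -q:a 0 -map a <audio>
    return ["ffmpeg", "-i", video, "-q:a", "0", "-map", "a", audio]

def mux_audio_cmd(video, audio, out):
    # Equivalent of: ffmpeg -i <video> -i <audio> -c:v copy -c:a aac <out>
    return ["ffmpeg", "-i", video, "-i", audio,
            "-c:v", "copy", "-c:a", "aac", out]

print(extract_audio_cmd("videos/11.mp4", "audio/audio.mp3"))
```

Each command can then be run with subprocess.run(cmd, check=True) so that a non-zero FFmpeg exit code raises an exception instead of failing silently.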

4. Start coding

from pathlib import Path
import cv2
import os

def video2mp3_img(video_path, save_path):
    """Split a video into frame images and extract its audio track."""

    def video_split(video_path, save_path):
        if not os.path.exists(save_path):
            os.makedirs(save_path)
        cap = cv2.VideoCapture(video_path)
        i = 0
        while True:
            ret, frame = cap.read()
            if ret:
                cv2.imwrite(save_path + '/' + str(i) + '.jpg', frame)
                i += 1
            else:
                break
        cap.release()

    if not os.path.exists(save_path):
        os.makedirs(save_path)

    # Split the video into frame images
    video_split(video_path, save_path)

    # Extract the audio track from the video
    os.system("ffmpeg -i {} -q:a 0 -map a {}/audio.mp3".format(video_path, save_path))

def face_replace(user_path=""):
    """Run face fusion on every extracted frame in video_img and write
    the results to video_imgout."""
    from pathlib import Path
    import cv2
    from modelscope.outputs import OutputKeys
    from modelscope.pipelines import pipeline
    from modelscope.utils.constant import Tasks
    import os
    os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'

    # Create the pipeline once, not once per frame
    image_face_fusion = pipeline(Tasks.image_face_fusion,
                                 model='damo/cv_unet-image-face-fusion_damo')

    def my_function(img_path):
        template_path = img_path
        filename = os.path.splitext(os.path.basename(img_path))[0]

        # Fuse the replacement face into this frame
        result = image_face_fusion(dict(template=template_path, user=user_path))
        cv2.imwrite(f'video_imgout/{filename}.jpg', result[OutputKeys.OUTPUT_IMG])

    BASE_PATH = os.path.dirname(__file__)
    os.makedirs('video_imgout', exist_ok=True)

    # Walk only the directory of extracted frames
    for dirpath, dirnames, filenames in os.walk(os.path.join(BASE_PATH, "video_img")):
        for filename in filenames:
            if filename.endswith('.jpg'):
                file_path = Path(os.path.join(dirpath, filename))
                print(file_path)
                my_function(str(file_path))

def img2mp4(video_path, save_name):
    """Combine the face-swapped frames into a video and re-attach the audio."""
    # Read one frame to determine the output video dimensions
    img = cv2.imread("video_img/0.jpg")
    imgInfo = img.shape
    size = (imgInfo[1], imgInfo[0])

    # Collect the face-swapped frame images
    files = []
    for dirpath, dirnames, filenames in os.walk("video_imgout"):
        for filename in filenames:
            files.append(os.path.join(dirpath, filename))

    # Sort frames numerically by frame index (0.jpg, 1.jpg, ..., 10.jpg)
    files = [file.replace('\\', '/') for file in files]
    files.sort(key=lambda x: int(x.split('/')[-1].split('.')[0]))

    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    # Output path, codec, frame rate, frame size
    videoWrite = cv2.VideoWriter("videos/ldh.mp4", fourcc, 25, size)
    for i in files:
        print(i)
        img = cv2.imread(str(i))
        videoWrite.write(img)
    videoWrite.release()  # flush the file before FFmpeg reads it

    # Add the audio extracted into video_img back to the new video
    os.system("ffmpeg -i {} -i {} -c:v copy -c:a aac -strict experimental {}".format(
        "videos/ldh.mp4", "video_img/audio.mp3", "videos/newlest_ldh.mp4"))

if __name__ == '__main__':
    BASE = os.path.dirname(__file__)
    video_path = os.path.join(BASE, "videos/demo.mp4")  # input video
    save_path = os.path.join(BASE, "video_img")         # directory for extracted frames

    # Video ==> frame images + audio
    video2mp3_img(video_path, save_path)

    # Face replacement on every frame
    face_replace(user_path='zsy.jpg')

    # Frame images ==> video
    img2mp4(video_path, save_name='zsy')
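img2mp4 has to sort the frame files numerically because a plain string sort puts 10.jpg before 2.jpg. An alternative sketch: zero-padding the frame index when saving makes lexicographic order match frame order, so no custom sort key is needed.

```python
# String sort mis-orders plain integer names
names = ["0.jpg", "2.jpg", "10.jpg"]
print(sorted(names))   # ['0.jpg', '10.jpg', '2.jpg'] -- wrong order

# Zero-padded names sort correctly as strings
padded = [f"{i:06d}.jpg" for i in (0, 2, 10)]
print(sorted(padded))  # ['000000.jpg', '000002.jpg', '000010.jpg']
```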

5. Error summary

When running the code above, an "Output file does not contain any stream" error can be raised at either of the two FFmpeg steps: extracting the audio, or appending it to the video. In most cases the output path or the command parameters are wrong. Another cause is a video with no audio track at all, which makes the extraction step fail. I first tested with a clip recorded at work that happened to have no sound (I could not wear headphones), and spent a long time debugging before switching to a video with audio fixed it. The frustrating part is that the error message never says the video has no audio.
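To catch the silent-video case up front, ffprobe (shipped with FFmpeg) can report whether an audio stream exists before attempting extraction. The ffprobe flags below are standard; the wrapper function name is my own:

```python
def probe_audio_cmd(video):
    """ffprobe command that prints one line per audio stream in the file;
    empty output means the video has no audio track."""
    return ["ffprobe", "-v", "error", "-select_streams", "a",
            "-show_entries", "stream=codec_type", "-of", "csv=p=0", video]

print(" ".join(probe_audio_cmd("videos/demo.mp4")))
```

Run it with subprocess.run(..., capture_output=True) and skip the audio extraction and mux steps whenever its stdout is empty.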

6. Effect demonstration

        Due to time constraints, the Yang Guo video was not used, so the face-swap demonstration was performed with a video without sound. A future improvement is to process the per-frame face replacement with multiple threads.


Origin blog.csdn.net/qq_35704550/article/details/130196528