Making a word cloud video of a dancer with Python

This article uses Python to make a word cloud video. The left half of the video is the original dance video, and the right half is a word cloud video that follows the dancer's movements. Let's take a look at the effect.

(Video: a word cloud video made with Python, showing the dance from another angle)

The production process is divided into the following parts:

1. Video download

First, download a dance video. Here I use the you-get tool, which can be installed with Python's pip command:

pip install you-get

you-get supports many platforms, including YouTube, Bilibili, TED, Tencent, Youku, and iQiyi, which covers most mainstream video sites.

Taking a YouTube video as an example, the you-get download command is:

you-get -o ~/Videos -O zoo.webm 'https://www.youtube.com/watch?v=jNQXAC9IVRw'

Here -o sets the directory in which the video is stored and -O sets the file name.

Here the you-get command is run from Python through the os module. The function takes three parameters: the video link, the directory in which to store the video, and the video file name.

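import os

def download(video_url, save_path, video_name):
    '''
    Download a video with you-get.
    :param video_url: video link
    :param save_path: directory in which to save the video
    :param video_name: file name for the video
    :return: None
    '''

    # Quote each argument so spaces or '&' in paths and URLs do not split the command
    cmd = 'you-get -o "{}" -O "{}" "{}"'.format(save_path, video_name, video_url)
    res = os.popen(cmd)
    print(res.read())  # print you-get's output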
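
As a hedged alternative, the same command can be run through subprocess with an argument list, so that nothing is reinterpreted by the shell. This is only a sketch: `build_you_get_command` and `download_with_subprocess` are names made up for illustration and are not part of you-get. Only the `-o` and `-O` options come from the command shown above.

```python
import subprocess


def build_you_get_command(video_url, save_path, video_name):
    """Return the you-get command as an argument list (hypothetical helper)."""
    return ['you-get', '-o', save_path, '-O', video_name, video_url]


def download_with_subprocess(video_url, save_path, video_name):
    """Run you-get without going through the shell and return its exit code."""
    cmd = build_you_get_command(video_url, save_path, video_name)
    res = subprocess.run(cmd, capture_output=True, text=True)
    print(res.stdout)  # you-get's normal output
    if res.returncode != 0:
        print(res.stderr)  # error messages, if the download failed
    return res.returncode


if __name__ == '__main__':
    # Only print the command here; call download_with_subprocess(...) to download.
    print(build_you_get_command(
        'https://www.youtube.com/watch?v=jNQXAC9IVRw', 'Videos', 'zoo.webm'))
```

Because each argument is a separate list element, a save path containing spaces or a URL containing `&` needs no extra quoting.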

For more on using you-get, see the official website, which documents it in detail:

https://you-get.org/#getting-started

2. Downloading bullet comments from Bilibili

A word cloud needs text data, and here the bullet comments (danmu) of a Bilibili video serve as the material. There is a shortcut for downloading them: use requests to access the comment API of the specified video, which returns all the bullet comments under that video.

http://comment.bilibili.com/{cid}.xml # cid is the cid number of the Bilibili video

Building this URL requires knowing the video's cid number.

To find the cid of a Bilibili video, press F12 to open the developer tools, go to Network -> XHR, and look for a request whose link begins with v2?cid=... The link contains "cid=" followed by a string of digits; the consecutive digits after the equals sign are the video's cid.

Taking the video above as an example, 291424805 is its cid.

With the cid in hand, request the API with requests to get the bullet comment data:

http://comment.bilibili.com/291424805.xml
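
The cid lookup and URL construction described above can be sketched in a few lines. This is an illustration only: `extract_cid` and `danmu_url` are hypothetical helper names, and the sample request link is invented (a real v2?cid=... link copied from the Network panel will have a different host and extra parameters).

```python
import re


def extract_cid(request_url):
    """Return the digits that follow 'cid=' in a request URL, or None."""
    match = re.search(r'[?&]cid=(\d+)', request_url)
    return match.group(1) if match else None


def danmu_url(cid):
    """Build the bullet-comment XML address for a given cid."""
    return 'http://comment.bilibili.com/{}.xml'.format(cid)


# Made-up example of a link copied from the Network panel
sample = 'https://api.example.com/player/v2?cid=291424805&aid=1'
cid = extract_cid(sample)
print(cid)
print(danmu_url(cid))
```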

import jieba
import requests
from bs4 import BeautifulSoup

def download_danmu():
    '''Download the bullet comments and save them, one segmented word per line.'''
    cid = '141367679'  # cid of the target video
    url = 'http://comment.bilibili.com/{}.xml'.format(cid)

    res = requests.get(url)
    res.encoding = 'utf-8'
    soup = BeautifulSoup(res.text, 'lxml')
    items = soup.find_all('d')  # each <d> tag holds one bullet comment

    # The with block closes the txt file even if an error occurs
    with open('danmu.txt', 'w+', encoding='utf-8') as f:
        for item in items:
            text = item.text
            print('-' * 80)
            print(text)

            # Segment the comment into words (full mode) for the word cloud later
            seg_list = jieba.cut(text, cut_all=True)
            for j in seg_list:
                print(j)
                f.write(j)
                f.write('\n')
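
To make the parsing step easier to follow without touching the network, here is a small sketch that pulls the text out of every d tag in a hand-written XML sample, using only the standard library instead of BeautifulSoup. The sample comments and the `p` attribute values are placeholders, and `parse_danmu` is a hypothetical name; the real file carries more information per comment.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the same shape as the downloaded file:
# one <d> element per bullet comment (contents are made up)
SAMPLE_XML = '''<i>
  <d p="placeholder">nice dance</d>
  <d p="placeholder">hahaha</d>
  <d p="placeholder">nice dance</d>
</i>'''


def parse_danmu(xml_text):
    """Return the text of every <d> element, in document order."""
    root = ET.fromstring(xml_text)
    return [d.text for d in root.iter('d') if d.text]


comments = parse_danmu(SAMPLE_XML)
print(len(comments))
print(comments)
```

Repeated comments are kept on purpose, since the word cloud step later relies on how often each word appears.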

3. Splitting the video into frames and segmenting the portrait

After the video is downloaded, first split it into individual frames, saving each one as an image:

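import os
import cv2

vc = cv2.VideoCapture(video_path)
c = 0
if vc.isOpened():
    rval, frame = vc.read()  # read the first frame
else:
    rval = False

while rval:
    # Save the current frame first, so the first frame is not skipped
    cv2.imwrite(os.path.join(Pic_path, '{}.jpg'.format(c)), frame)
    c += 1
    print('Frame {} saved successfully!'.format(c))
    # Read the next frame; rval becomes False at the end of the video,
    # so an empty frame is never passed to imwrite
    rval, frame = vc.read()

vc.release()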

Next, identify and extract the dancer in each frame, that is, perform portrait segmentation. This is done with the help of the Baidu API:

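import base64
import os

import cv2
import numpy as np
from aip import AipBodyAnalysis  # Baidu AI SDK (package name: baidu-aip)

APP_ID = "23633750"
API_KEY = 'uqnHjMZfChbDHvPqWgjeZHCR'
SECRET_KEY = '************************************'

client = AipBodyAnalysis(APP_ID, API_KEY, SECRET_KEY)
# Folder holding the frames extracted from the video
jpg_file = os.listdir(jpg_path)
for i in jpg_file:
    open_file = os.path.join(jpg_path, i)
    save_file = os.path.join(save_path, i)  # where the mask will be saved
    if not os.path.exists(save_file):  # only process frames that have no mask yet
        img = cv2.imread(open_file)
        height, width, _ = img.shape  # size of the original frame
        if crop_path:  # crop only when crop_path is not None
            crop_file = os.path.join(crop_path, i)
            img = img[100:-1, 300:-400]  # frame is too large; set the crop for your own video
            cv2.imwrite(crop_file, img)
            # get_file_content is a helper defined elsewhere; it is expected
            # to return the raw bytes of the file
            image = get_file_content(crop_file)
        else:
            image = get_file_content(open_file)

        res = client.bodySeg(image)  # call the Baidu API to segment the portrait
        labelmap = base64.b64decode(res['labelmap'])
        labelimg = np.frombuffer(labelmap, np.uint8)  # encoded image bytes as a uint8 array
        labelimg = cv2.imdecode(labelimg, 1)
        # Resize the mask to the original frame size
        labelimg = cv2.resize(labelimg, (width, height), interpolation=cv2.INTER_NEAREST)
        img_new = np.where(labelimg == 1, 255, labelimg)  # turn foreground value 1 into 255
        cv2.imwrite(save_file, img_new)
        print(save_file, 'saved successfully')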

This converts each frame containing the portrait into a binary image: the person is the foreground, and everything else is the background.

Before using the API, you need to create a body analysis application on the Baidu Smart Cloud platform with your own account. The application provides the three required parameters: App ID, API Key (AK), and Secret Key (SK).

For details on using the Baidu API, refer to the official documentation.

4. Making word cloud images from the segmented frames

Using the portrait masks obtained in step 3, the wordcloud library, and the collected bullet comments, draw a word cloud image for each binary mask. Before starting, make sure every image is a binary image; frames that are entirely black must be removed.

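import collections
import os
import random
import re

import numpy as np
from PIL import Image
from wordcloud import WordCloud

word_list = []
with open('danmu.txt', encoding='utf-8') as f:
    con = f.read().split('\n')  # one segmented word per line
    for i in con:
        if re.findall('[\u4e00-\u9fa5]+', str(i), re.S):  # drop entries with no Chinese characters
            word_list.append(i)

# Words ranked from most to least frequent; computed once for all frames
ranked = collections.Counter(word_list).most_common()

for i in os.listdir(mask_path):
    open_file = os.path.join(mask_path, i)
    save_file = os.path.join(cloud_path, i)

    if not os.path.exists(save_file):
        # Skip a random number (0 to 15) of the most frequent words
        start = random.randint(0, 15)
        word_counts = dict(ranked[start:])
        # wordcloud leaves pure white mask pixels empty, so invert the mask:
        # the white portrait becomes black and is filled with words
        background = 255 - np.array(Image.open(open_file))

        wc = WordCloud(
            background_color='black',
            max_words=500,
            mask=background,
            mode='RGB',
            font_path="D:/Data/fonts/HGXK_CNKI.ttf",  # path to a font that supports Chinese
        ).generate_from_frequencies(word_counts)
        wc.to_file(save_file)
        print(save_file, 'saved successfully!')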
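
The word filtering and frequency slicing used above can be tried on a tiny list first. This is a sketch with made-up words; `start` is fixed at 1 here so the result is reproducible, whereas the code above draws it at random.

```python
import collections
import re

# Made-up segmented words, including entries without Chinese characters
words = ['哈哈', '哈哈', '哈哈', '好看', '好看', 'lisa', '666', '跳舞']

# Keep only entries containing at least one Chinese character
chinese = [w for w in words if re.search('[\u4e00-\u9fa5]', w)]

counts = collections.Counter(chinese)
ranked = counts.most_common()  # sorted from most to least frequent
start = 1                      # fixed for the demo; random in the code above
kept = dict(ranked[start:])    # drop the `start` most frequent words

print(ranked)
print(kept)
```

Note that if `start` is at least as large as the number of distinct words, `kept` is empty, so a small danmu file may need a smaller upper bound for the random index.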

5. Picture stitching and video synthesis

Once all the word cloud images are generated, looking at them one by one would be boring. It is much cooler to combine them into a video!

To compare the video before and after processing, I added one more step: each original frame is stitched side by side with its word cloud image before being written to the video.

import os

import cv2
import numpy as np

num_list = [int(str(i).split('.')[0]) for i in os.listdir(origin_path)]
fps = 24  # output frame rate; match the source video to keep the playback speed
height, width, _ = cv2.imread(os.path.join(origin_path, '{}.jpg'.format(num_list[0]))).shape  # frame height and width
width = width * 2  # two images side by side
# Create the video writer
video_writer = cv2.VideoWriter(video_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (width, height))

for i in sorted(num_list):
    i = '{}.jpg'.format(i)
    ori_jpg = os.path.join(origin_path, str(i))
    word_jpg = os.path.join(wordart_path, str(i))
    # com_jpg = os.path.join(Composite_path,str(i))
    ori_arr = cv2.imread(ori_jpg)
    word_arr = cv2.imread(word_jpg)

    # Stitch the two images horizontally with NumPy
    com_arr = np.hstack((ori_arr, word_arr))
    # cv2.imwrite(com_jpg,com_arr)  # optionally save the stitched image
    video_writer.write(com_arr)  # write the frame to the video stream
    print("{} saved successfully".format(ori_jpg))

video_writer.release()  # finalize the video file
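
The reason the file names are converted to integers before sorting is easy to miss, so here is a small sketch. The file names and the frame count are made up for illustration.

```python
names = ['10.jpg', '2.jpg', '1.jpg', '0.jpg']

# Plain string sorting compares character by character,
# so '10.jpg' lands before '2.jpg'
by_string = sorted(names)
print(by_string)

# Sorting by the integer before the extension restores the frame order
nums = sorted(int(n.split('.')[0]) for n in names)
ordered = ['{}.jpg'.format(n) for n in nums]
print(ordered)

# At a fixed frame rate, duration = number of frames / fps
fps = 24
frame_count = 240  # hypothetical number of frames
duration_seconds = frame_count / fps
print(duration_seconds)
```

Writing frames in string order would make the video jump back and forth, which is why the numeric sort matters here.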

Add some background music, and the video goes up another level.

Finally

To get all the code used in this article, follow the WeChat official account Xiao Zhang Python and reply with the keyword 210204.

The sources of the material used in the video are as follows:

The bullet comments come from a video by the Bilibili uploader "semi-immortal Buddha": "[Buddha] half tea joining in the end you know how deceptive it?"

The dance video is taken from the YouTube channel Lilifilm Official: "LILI's FILM #3-LISA Dance Performance Video"

Finally, thank you all for reading, see you in the next issue!

Origin blog.csdn.net/weixin_42512684/article/details/113750357