[python+you_get+moviepy] Download guichu videos and extract the audio: don't let the player limit your love for music

1. Introduction

1.1 Motivation

Many people enjoy listening to music, and music now comes in more and more forms. For example, short videos on Douyin and guichu (remix) videos on Bilibili contain many excellent tracks, which are enjoyable even if you only listen and never watch. However, these tracks are hard to find on the major music platforms, and niche audio is harder still.
So how do we get the audio from a video link?

1.2 Preparation

 python: 3.11.3 (≥3.7)
 you-get library: 0.4.1650
 moviepy library: 1.0.3

1.3 Analysis of ideas

you-get is a very popular video download project that supports extracting video files from a large number of websites. moviepy is a commonly used Python library for audio and video processing; it relies on ffmpeg under the hood but is more convenient to install and use, and it can write the audio track of an MP4 file to an MP3 file.
If you want lyrics, you can later add an .lrc file with the same name to the folder.

1.4 Algorithm process design

 1). Write a video download function, and use a process pool to download several videos in parallel
 2). Record the folder contents before and after downloading, identify the new flv files and rename them to MP4
 3). Write a video-to-audio function, and use a process pool to convert several files in parallel

2. Code analysis

2.1 Install you-get

Run inside the script:

os.system("pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ you-get")

You can also run the command in a cmd window:

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ you-get

First-time installation:
[Screenshot: you-get first installation]

Later runs (you can comment out this code after the first installation):
[Screenshot: you-get non-first-time installation]
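
Instead of commenting the install line in and out by hand, the script could check whether the package is already importable. The following is only a sketch of that idea: the helper names `is_installed` and `ensure_installed` are hypothetical, and it assumes the importable module name for you-get is `you_get`.

```python
import importlib.util
import subprocess
import sys


def is_installed(module_name):
    """Return True if the module can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


def ensure_installed(package, module_name, index_url=None):
    """Install `package` with pip only when `module_name` is missing.

    Returns True if an installation was attempted, False otherwise.
    """
    if is_installed(module_name):
        return False
    # Use the running interpreter's pip so the package lands in the same environment.
    cmd = [sys.executable, "-m", "pip", "install", package]
    if index_url:
        cmd += ["-i", index_url]
    subprocess.check_call(cmd)
    return True
```

A call such as `ensure_installed("you-get", "you_get", "https://pypi.tuna.tsinghua.edu.cn/simple/")` would then install the package only on the first run.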

2.2 Download videos with a process pool

Download the video file according to the link and specify the save path:

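def downloadMp4(filePath, url):
    # Quote both arguments so characters such as "&" in the URL are not interpreted by the shell
    os.system(f'you-get -o "{filePath}" "{url}"')
    print(f"{url} download finished")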

Use a process pool so that several videos are downloaded in parallel:

def manyDownload(filePath, mp4Urls):
    p = Pool(4)
    for url in mp4Urls:
        p.apply_async(downloadMp4, args=(filePath, url,))
    p.close()
    p.join()
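
Because each download only waits on an external you-get process, a thread pool would probably serve just as well as a process pool here. The sketch below swaps in `concurrent.futures.ThreadPoolExecutor` and `subprocess.run`, which is not what the article's code uses; the names `build_command` and `download_all` and the `runner` parameter are hypothetical and exist only to make the idea testable.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(file_path, url):
    """Build the you-get call as an argument list, so no shell quoting is needed."""
    return ["you-get", "-o", file_path, url]


def download_all(file_path, urls, runner=subprocess.run, workers=4):
    """Run one you-get process per URL, at most `workers` at a time.

    Returns a dict mapping each URL to the exit code of its download.
    """
    def download_one(url):
        return runner(build_command(file_path, url)).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps the results in the same order as `urls`.
        return dict(zip(urls, pool.map(download_one, urls)))
```

A non-zero exit code in the returned dict points at a download that failed, which the `os.system` version does not report.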

2.3 Identify flv files and rename them to MP4

The folder's file list must be recorded both before and after the download; otherwise previously saved video files might be picked up as well:

files_before = os.listdir(filePath)
...
files_after = os.listdir(filePath)

Define a function that identifies the new files, renames any .flv file to .mp4 (only the extension changes), and returns the list of MP4 file names:

def getMp4List(files_before, files_after):
    flvList = []
    mp4List = []
    # Find the flv/mp4 files that were just downloaded
    for file in files_after:
        if file not in files_before:
            if file.split('.')[-1] in ['flv', 'mp4']:
                flvList.append(file)
    # Change the flv extension to mp4 (uses the module-level filePath)
    for file in flvList:
        if file.split('.')[-1] == 'flv':
            new_name = file[:-3] + "mp4"
            os.rename(f"{filePath}{file}", f"{filePath}{new_name}")
            mp4List.append(new_name)
        elif file.split('.')[-1] == 'mp4':
            mp4List.append(file)
    return mp4List
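
The before/after comparison can also be written as a set difference. This is a minimal sketch under a few assumptions: it uses `pathlib` instead of string concatenation, takes the folder as an explicit argument, and the names `snapshot` and `new_videos_as_mp4` are hypothetical.

```python
from pathlib import Path

VIDEO_SUFFIXES = {".flv", ".mp4"}


def snapshot(folder):
    """Return the set of file names currently in `folder`."""
    return {p.name for p in Path(folder).iterdir()}


def new_videos_as_mp4(folder, before):
    """Rename newly added .flv files to .mp4 and list all new MP4 names."""
    result = []
    for name in sorted(snapshot(folder) - before):
        path = Path(folder) / name
        suffix = path.suffix.lower()
        if suffix not in VIDEO_SUFFIXES:
            continue  # not a video we care about
        if suffix == ".flv":
            target = path.with_suffix(".mp4")
            path.rename(target)  # only the extension changes
            path = target
        result.append(path.name)
    return result
```

Take `before = snapshot(folder)` ahead of the download and call `new_videos_as_mp4(folder, before)` afterwards; because paths are joined with `pathlib`, the folder no longer needs a trailing "/".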

2.4 Convert multiple video files to audio in parallel with a process pool

Video to audio function:

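# Extract the audio from a video file
def getMp3(filePath, mp4File):
    video = VideoFileClip(f"{filePath}{mp4File}")
    audio = video.audio
    # Replace the extension instead of patching the last character of the name
    mp3File = os.path.splitext(mp4File)[0] + ".mp3"
    audio.write_audiofile(f"{filePath}{mp3File}")
    video.close()  # release the ffmpeg reader
    print(f"{mp3File} audio extraction finished!")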
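
Deriving the MP3 name from the video name is worth isolating, since slicing characters off the end only works for one specific extension. The sketch below uses `os.path.splitext` for that; `mp3_name_for` and `extract_audio` are hypothetical names, and the lazy import assumes the moviepy 1.x layout (`moviepy.editor`) listed in the preparation section.

```python
import os


def mp3_name_for(video_name):
    """Swap whatever extension the video has for ".mp3"."""
    stem, _ext = os.path.splitext(video_name)
    return stem + ".mp3"


def extract_audio(folder, video_name):
    """Write the audio track next to the video; return the MP3 path or None."""
    # Imported here so the naming helper works even without moviepy installed.
    from moviepy.editor import VideoFileClip  # assumption: moviepy 1.x

    source = os.path.join(folder, video_name)
    target = os.path.join(folder, mp3_name_for(video_name))
    video = VideoFileClip(source)
    try:
        if video.audio is None:  # the clip has no audio track
            return None
        video.audio.write_audiofile(target)
        return target
    finally:
        video.close()  # always release the underlying ffmpeg reader
```

With this naming helper, a file such as `clip.flv` that was never renamed would still map to `clip.mp3`.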

Use a process pool to run the conversions in parallel:

# Use a process pool to extract audio from several videos at the same time
def manyTransfer(filePath, mp4List):
    p = Pool(4)
    for mp4File in mp4List:
        p.apply_async(getMp3, args=(filePath, mp4File,))
    p.close()
    p.join()
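
One thing to keep in mind with `apply_async` is that an exception raised inside a worker stays hidden unless `.get()` is called on the returned result object. The sketch below shows one way to collect failures. To stay easy to run anywhere it uses `multiprocessing.pool.ThreadPool`, which exposes the same `apply_async` interface as `Pool` but is a swapped-in stand-in, and `run_all` is a hypothetical helper name.

```python
from multiprocessing.pool import ThreadPool


def run_all(worker, jobs, processes=4):
    """Call worker(*args) for every args tuple in `jobs`.

    Returns (results, errors): two dicts keyed by the args tuple.
    """
    results, errors = {}, {}
    with ThreadPool(processes) as pool:
        # Submit everything first so the jobs can overlap.
        pending = {args: pool.apply_async(worker, args=args) for args in jobs}
        for args, handle in pending.items():
            try:
                results[args] = handle.get()  # re-raises the worker's exception
            except Exception as exc:
                errors[args] = exc
    return results, errors
```

Something like `run_all(getMp3, [(filePath, name) for name in mp4List])` would then report which files failed to convert instead of silently skipping them.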

3. Screenshots of a run

Downloading videos in parallel:
[Screenshot: downloading videos in parallel]
Note that the downloads do run in parallel; only the console output is misleading. By the time the second file finished downloading, the first one had finished as well.
Converting to audio in parallel:
[Screenshot: converting to audio in parallel]
The folder afterwards:
[Screenshot: folder contents]

4. Complete code

4.1 Heavyweight

With a process pool; suited to downloading and converting a large number of videos:

import os
from moviepy.editor import VideoFileClip
from multiprocessing import Pool


# Download the video file behind a link
def downloadMp4(filePath, url):
    # Quote both arguments so characters such as "&" in the URL are not interpreted by the shell
    os.system(f'you-get -o "{filePath}" "{url}"')
    print(f"{url} download finished")


# Use a process pool to download several videos in parallel
def manyDownload(filePath, mp4Urls):
    p = Pool(4)
    for url in mp4Urls:
        p.apply_async(downloadMp4, args=(filePath, url,))
    p.close()
    p.join()


# Collect the newly added flv/mp4 files and rename flv files to MP4
def getMp4List(files_before, files_after):
    flvList = []
    mp4List = []
    # Find the flv/mp4 files that were just downloaded
    for file in files_after:
        if file not in files_before:
            if file.split('.')[-1] in ['flv', 'mp4']:
                flvList.append(file)
    # Change the flv extension to mp4 (uses the module-level filePath)
    for file in flvList:
        if file.split('.')[-1] == 'flv':
            new_name = file[:-3] + "mp4"
            os.rename(f"{filePath}{file}", f"{filePath}{new_name}")
            mp4List.append(new_name)
        elif file.split('.')[-1] == 'mp4':
            mp4List.append(file)
    return mp4List


# Extract the audio from a video file
def getMp3(filePath, mp4File):
    video = VideoFileClip(f"{filePath}{mp4File}")
    audio = video.audio
    # Replace the extension instead of patching the last character of the name
    mp3File = os.path.splitext(mp4File)[0] + ".mp3"
    audio.write_audiofile(f"{filePath}{mp3File}")
    video.close()  # release the ffmpeg reader
    print(f"{mp3File} audio extraction finished!")


# Use a process pool to extract audio from several videos at the same time
def manyTransfer(filePath, mp4List):
    p = Pool(4)
    for mp4File in mp4List:
        p.apply_async(getMp3, args=(filePath, mp4File,))
    p.close()
    p.join()


if __name__ == "__main__":
    filePath = "D:/Music/"  # Note: the trailing "/" is required, otherwise the paths built later are wrong
    files_before = os.listdir(filePath)  # File list before the run, compared with the list after downloading
    # Links to the guichu videos 《念诗之王》 and 《敢杀我的马》; a single link also works
    mp4Urls = ["https://www.bilibili.com/video/BV1bW411n7fY/?spm_id_from=333.337.search-card.all.click",
               "https://www.bilibili.com/video/BV1yt4y1Q7SS/?spm_id_from=333.337.search-card.all.click"]

    # Install you-get on the first run; comment this out afterwards. It can also be installed from the command line, PyCharm or Anaconda
    os.system("pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ you-get")

    # Download several videos in parallel with the process pool
    manyDownload(filePath, mp4Urls)

    # Compare the folder contents, find the new flv files and rename them to MP4
    files_after = os.listdir(filePath)
    mp4List = getMp4List(files_before, files_after)

    # Extract audio from the videos in parallel with the process pool
    manyTransfer(filePath, mp4List)

4.2 Lightweight

Without a process pool; suited to downloading and converting a single video:

import os
from moviepy.editor import VideoFileClip


# Download the video file behind a link
def downloadMp4(filePath, url):
    # Quote both arguments so characters such as "&" in the URL are not interpreted by the shell
    os.system(f'you-get -o "{filePath}" "{url}"')
    print(f"{url} download finished")


# Collect the newly added flv/mp4 files and rename flv files to MP4
def getMp4List(files_before, files_after):
    flvList = []
    mp4List = []
    # Find the flv/mp4 files that were just downloaded
    for file in files_after:
        if file not in files_before:
            if file.split('.')[-1] in ['flv', 'mp4']:
                flvList.append(file)
    # Change the flv extension to mp4 (uses the module-level filePath)
    for file in flvList:
        if file.split('.')[-1] == 'flv':
            new_name = file[:-3] + "mp4"
            os.rename(f"{filePath}{file}", f"{filePath}{new_name}")
            mp4List.append(new_name)
        elif file.split('.')[-1] == 'mp4':
            mp4List.append(file)
    return mp4List


# Extract the audio from a video file
def getMp3(filePath, mp4File):
    video = VideoFileClip(f"{filePath}{mp4File}")
    audio = video.audio
    # Replace the extension instead of patching the last character of the name
    mp3File = os.path.splitext(mp4File)[0] + ".mp3"
    audio.write_audiofile(f"{filePath}{mp3File}")
    video.close()  # release the ffmpeg reader
    print(f"{mp3File} audio extraction finished!")


if __name__ == "__main__":
    filePath = "D:/Music/"  # Note: the trailing "/" is required, otherwise the paths built later are wrong
    files_before = os.listdir(filePath)  # File list before the run, compared with the list after downloading
    # Links to the guichu videos 《念诗之王》 and 《敢杀我的马》; a single link also works
    mp4Urls = ["https://www.bilibili.com/video/BV1bW411n7fY/?spm_id_from=333.337.search-card.all.click",
               "https://www.bilibili.com/video/BV1yt4y1Q7SS/?spm_id_from=333.337.search-card.all.click"]

    # Install you-get on the first run; comment this out afterwards. It can also be installed from the command line, PyCharm or Anaconda
    os.system("pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ you-get")

    # Download the videos one after another
    for url in mp4Urls:
        downloadMp4(filePath, url)

    # Compare the folder contents, find the new flv files and rename them to MP4
    files_after = os.listdir(filePath)
    mp4List = getMp4List(files_before, files_after)

    # Extract the audio from each video in turn
    for mp4 in mp4List:
        getMp3(filePath, mp4)

PS: Please do not use this method for commercial purposes, and use Internet resources responsibly within the limits of the law! For more interesting Python applications, stay tuned for future updates~

Origin blog.csdn.net/weixin_44844635/article/details/131345293