1. Introduction
1.1 Sources of demand
Many people enjoy listening to music, and the forms it takes keep multiplying. Short videos on Douyin and guichu (remix) videos on Bilibili, for example, contain plenty of excellent audio that is enjoyable even if you only listen and never watch. These tracks are hard to find on the major music platforms, and niche audio is harder still.
So, how do we get the audio from a video link?
1.2 Preparation
Python: 3.11.3 (≥3.7)
you-get library: 0.4.1650
moviepy library: 1.0.3
1.3 Approach
you-get is a very popular video extraction project that can download video files from links on many websites. moviepy is a commonly used Python library for audio and video processing; it drives ffmpeg under the hood but is more convenient to install and use, and it can convert MP4 files to MP3 files.
If you want lyrics, you can later add an .lrc file with the same name as the MP3 to the same folder.
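As a small illustration of the naming rule, the sketch below writes an .lrc file next to an MP3. The helper names `format_lrc_line` and `write_lrc` are made up for this example, and any lyric text passed in is the caller's own; the `[mm:ss.xx]` prefix is the common LRC timestamp format, and whether a given player picks the file up automatically is an assumption about that player.

```python
from pathlib import Path


def format_lrc_line(seconds, text):
    """Format one lyric line with an LRC timestamp of the form [mm:ss.xx]."""
    minutes, rest = divmod(seconds, 60)
    return f"[{int(minutes):02d}:{rest:05.2f}]{text}"


def write_lrc(mp3_path, timed_lines):
    """Write (seconds, text) pairs to an .lrc file next to the MP3."""
    lrc_path = Path(mp3_path).with_suffix(".lrc")  # same name, different suffix
    body = "\n".join(format_lrc_line(sec, text) for sec, text in timed_lines)
    lrc_path.write_text(body + "\n", encoding="utf-8")
    return lrc_path
```

For a file saved as `D:/Music/song.mp3`, this would create `D:/Music/song.lrc` in the same folder.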
1.4 Algorithm process design
1). Write a video download function and use a process pool to download several videos in parallel
2). Record the folder contents before and after downloading, identify the new flv files, and rename them with an .mp4 extension
3). Write a video-to-audio function and run the conversions in parallel through a process pool
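The three steps can be tied together roughly as in the sketch below. To keep it self-contained, the download and conversion steps are passed in as plain functions; `run_pipeline`, `download`, and `to_audio` are hypothetical names used only for illustration, and both the parallelism and the flv rename are left out here.

```python
import os


def run_pipeline(folder, urls, download, to_audio):
    """Download every URL, then convert only the files that are new in folder."""
    before = set(os.listdir(folder))          # snapshot before downloading
    for url in urls:
        download(folder, url)                 # step 1: fetch the video
    after = set(os.listdir(folder))           # snapshot after downloading
    new_videos = sorted(
        name for name in after - before
        if name.lower().endswith((".flv", ".mp4"))  # step 2: keep new videos only
    )
    for name in new_videos:
        to_audio(folder, name)                # step 3: extract the audio
    return new_videos
```

Comparing the two snapshots is what keeps videos that were already in the folder from being converted again.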
2. Code analysis
2.1 Install you-get
Run from within the script:
os.system("pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ you-get")
Alternatively, run the command in a cmd window:
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ you-get
The installation is only needed once. After the first run, you can comment out this line of code.
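Instead of commenting the line in and out, the script could check whether the you-get command is already available and install it only when it is missing. This is a sketch rather than the article's code: `is_installed` and `ensure_you_get` are hypothetical helpers, and the check assumes that pip places the `you-get` executable in a directory on PATH.

```python
import shutil
import subprocess
import sys


def is_installed(command):
    """Return True if an executable with this name is found on PATH."""
    return shutil.which(command) is not None


def ensure_you_get():
    """Install you-get with pip only if the command is not available yet."""
    if is_installed("you-get"):
        return False  # nothing to do
    subprocess.run(
        [sys.executable, "-m", "pip", "install",
         "-i", "https://pypi.tuna.tsinghua.edu.cn/simple/", "you-get"],
        check=True,  # raise CalledProcessError if pip fails
    )
    return True
```

Calling pip through `sys.executable -m pip` installs into the same interpreter that runs the script, which matters when several Python versions are installed.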
2.2 Downloading videos with a process pool
Download the video file from its link and specify the save path:
def downloadMp4(filePath, url):
    # Call the you-get command-line tool; -o sets the output directory
    os.system(f"you-get -o {filePath} {url}")
    print(f"{url} download complete")
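Because the command is assembled as a single shell string, a save path containing spaces or a link containing characters such as `&` may be misread by the shell. One possible alternative, sketched below, passes the arguments as a list through `subprocess.run`. The names `build_command` and `download_video` are hypothetical, and only the `-o` option already used above is assumed.

```python
import subprocess


def build_command(file_path, url):
    """Return the you-get invocation as an argument list (no shell parsing)."""
    return ["you-get", "-o", file_path, url]


def download_video(file_path, url):
    """Run you-get and report whether it exited successfully."""
    result = subprocess.run(build_command(file_path, url))
    return result.returncode == 0
```

Returning the exit status also lets the caller tell a failed download apart from a finished one, which the printed message alone does not.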
Use a process pool to download several videos in parallel and make better use of the computer's resources:
def manyDownload(filePath, mp4Urls):
    p = Pool(4)  # up to 4 worker processes
    for url in mp4Urls:
        p.apply_async(downloadMp4, args=(filePath, url,))
    p.close()  # no more tasks will be submitted
    p.join()   # wait for all downloads to finish
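One caveat with `apply_async` is that an exception raised inside a worker is not shown unless the result object is queried. The sketch below keeps every result and calls `get()` on it so that failures surface. It swaps in `multiprocessing.pool.ThreadPool`, which offers the same `apply_async` interface as `Pool`; that swap is an assumption of this example (each worker mostly waits on an external process), not what the code above does. `run_all` and `fake_download` are hypothetical names.

```python
from multiprocessing.pool import ThreadPool


def run_all(worker, arg_tuples, workers=4):
    """Run worker(*args) for each tuple; return (results, errors)."""
    results, errors = [], []
    with ThreadPool(workers) as pool:
        pending = [pool.apply_async(worker, args=args) for args in arg_tuples]
        for args, task in zip(arg_tuples, pending):
            try:
                results.append(task.get())      # re-raises the worker's exception
            except Exception as exc:
                errors.append((args, exc))
    return results, errors


def fake_download(file_path, url):
    """Stand-in for the real download function, used only for this demo."""
    if not url.startswith("https://"):
        raise ValueError(f"unsupported link: {url}")
    return f"{file_path}{url.rsplit('/', 1)[-1]}"
```

The same pattern of storing the result objects and calling `get()` on each should carry over to a process pool, as long as the worker function is defined at module level.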
2.3 Identify flv files and rename them to .mp4
The folder's file list must be recorded before and after the download; otherwise, video files that were already in the folder might be picked up as well:
files_before = os.listdir(filePath)
...
files_after = os.listdir(filePath)
Define a function that identifies the new files, renames any flv file to .mp4, and returns the list of MP4 file names:
def getMp4List(files_before, files_after):
    # Note: filePath is the module-level variable set in the main block
    flvList = []
    mp4List = []
    # Find the flv/mp4 files that were just downloaded
    for file in files_after:
        if file not in files_before:
            if file.split('.')[-1] in ['flv', 'mp4']:
                flvList.append(file)
    # Change the .flv suffix to .mp4
    for file in flvList:
        if file.split('.')[-1] == 'flv':
            new_name = file[:-3] + "mp4"
            os.rename(f"{filePath}{file}", f"{filePath}{new_name}")
            mp4List.append(new_name)
        elif file.split('.')[-1] == 'mp4':
            mp4List.append(file)
    return mp4List
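The suffix handling can also be written with `pathlib`, which avoids slicing file names by hand and does not depend on a module-level `filePath`. The following is a sketch of that variant, not the article's function; `collect_new_mp4` is a hypothetical name, and, like the original, it only renames flv files rather than re-encoding them.

```python
from pathlib import Path


def collect_new_mp4(folder, files_before, files_after):
    """Rename newly added .flv files to .mp4 and return the new MP4 names."""
    folder = Path(folder)
    mp4_names = []
    for name in sorted(set(files_after) - set(files_before)):
        path = folder / name
        suffix = path.suffix.lower()
        if suffix == ".flv":
            target = path.with_suffix(".mp4")
            path.rename(target)            # rename only; the data is unchanged
            mp4_names.append(target.name)
        elif suffix == ".mp4":
            mp4_names.append(name)
    return mp4_names
```

Passing the folder in as an argument also makes the function easy to try out on a temporary directory before pointing it at a real music folder.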
2.4 Converting several video files to audio in parallel with a process pool
Video-to-audio function:
# Extract the audio from a video file
def getMp3(filePath, mp4File):
    video = VideoFileClip(f"{filePath}{mp4File}")
    audio = video.audio
    mp3File = mp4File[:-1] + "3"  # "xxx.mp4" -> "xxx.mp3"
    audio.write_audiofile(f"{filePath}{mp3File}")
    print(f"{mp3File} audio extraction complete!")
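Two details of this function are worth guarding against: `video.audio` is `None` when a clip has no audio track, and the clip keeps its file readers open until it is closed. The sketch below handles both, assuming moviepy 1.0.3 (the version listed in the preparation section) is installed; `mp3_name_for` and `extract_audio` are hypothetical names.

```python
import os


def mp3_name_for(video_name):
    """Return the MP3 file name for a video file name (extension swapped)."""
    stem, _ext = os.path.splitext(video_name)
    return stem + ".mp3"


def extract_audio(file_path, video_name):
    """Write the clip's audio track to an MP3; return its name or None."""
    # Imported here so the naming helper above works without moviepy installed.
    from moviepy.editor import VideoFileClip

    video = VideoFileClip(os.path.join(file_path, video_name))
    try:
        if video.audio is None:          # the clip has no audio track
            return None
        mp3_name = mp3_name_for(video_name)
        video.audio.write_audiofile(os.path.join(file_path, mp3_name))
        return mp3_name
    finally:
        video.close()                    # release the underlying file readers
```

Using `os.path.join` here also removes the requirement that the save path end with a slash.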
Run the conversions in parallel with a process pool:
# Use a process pool to extract audio from several videos at once
def manyTransfer(filePath, mp4List):
    p = Pool(4)
    for mp4File in mp4List:
        p.apply_async(getMp3, args=(filePath, mp4File,))
    p.close()
    p.join()
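The pool size of 4 is a fixed choice. If the number of videos or CPU cores varies, one option is to derive the size from both, as sketched below; `pick_pool_size` is a hypothetical helper, and the cap of 4 simply mirrors the value used above.

```python
import os


def pick_pool_size(task_count, cap=4):
    """Choose a worker count: at least 1, at most cap, tasks, or CPU cores."""
    cores = os.cpu_count() or 1          # cpu_count() may return None
    return max(1, min(cap, task_count, cores))
```

It could then be used as `Pool(pick_pool_size(len(mp4List)))`, so that a single video does not start four worker processes.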
3. Run screenshots
Downloading the videos in parallel (screenshot):
Note that the downloads do run in parallel; only the console output is misleading. By the time the second file has finished downloading, the first one has finished as well.
Converting to audio in parallel (screenshot):
The folder contents at this point (screenshot):
4. Complete code
4.1 Heavyweight
Uses a process pool; suitable for downloading and converting a large number of videos:
import os
from multiprocessing import Pool

from moviepy.editor import VideoFileClip


# Download a video file from its link
def downloadMp4(filePath, url):
    os.system(f"you-get -o {filePath} {url}")
    print(f"{url} download complete")


# Use a process pool to download several videos in parallel
def manyDownload(filePath, mp4Urls):
    p = Pool(4)
    for url in mp4Urls:
        p.apply_async(downloadMp4, args=(filePath, url,))
    p.close()
    p.join()


# Find the newly added flv files and rename them to .mp4
def getMp4List(files_before, files_after):
    flvList = []
    mp4List = []
    # Find the flv/mp4 files that were just downloaded
    for file in files_after:
        if file not in files_before:
            if file.split('.')[-1] in ['flv', 'mp4']:
                flvList.append(file)
    # Change the .flv suffix to .mp4
    for file in flvList:
        if file.split('.')[-1] == 'flv':
            new_name = file[:-3] + "mp4"
            os.rename(f"{filePath}{file}", f"{filePath}{new_name}")
            mp4List.append(new_name)
        elif file.split('.')[-1] == 'mp4':
            mp4List.append(file)
    return mp4List


# Extract the audio from a video file
def getMp3(filePath, mp4File):
    video = VideoFileClip(f"{filePath}{mp4File}")
    audio = video.audio
    mp3File = mp4File[:-1] + "3"  # "xxx.mp4" -> "xxx.mp3"
    audio.write_audiofile(f"{filePath}{mp3File}")
    print(f"{mp3File} audio extraction complete!")


# Use a process pool to extract audio from several videos at once
def manyTransfer(filePath, mp4List):
    p = Pool(4)
    for mp4File in mp4List:
        p.apply_async(getMp3, args=(filePath, mp4File,))
    p.close()
    p.join()


if __name__ == "__main__":
    filePath = "D:/Music/"  # Note: the trailing "/" is required, otherwise later paths will be wrong
    files_before = os.listdir(filePath)  # File list before the run, compared with the list after downloading

    # Links to the guichu videos 《念诗之王》 and 《敢杀我的马》; a single link also works
    mp4Urls = ["https://www.bilibili.com/video/BV1bW411n7fY/?spm_id_from=333.337.search-card.all.click",
               "https://www.bilibili.com/video/BV1yt4y1Q7SS/?spm_id_from=333.337.search-card.all.click"]

    # Install you-get on the first run and comment this out afterwards;
    # it can also be installed from the command line, PyCharm, or Anaconda
    os.system("pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ you-get")

    # Download several videos in parallel with a process pool
    manyDownload(filePath, mp4Urls)

    # Compare the folder contents, find the flv files, and rename them to .mp4
    files_after = os.listdir(filePath)
    mp4List = getMp4List(files_before, files_after)

    # Extract audio from the videos in parallel with a process pool
    manyTransfer(filePath, mp4List)
4.2 Lightweight
No process pool; suitable for downloading and converting a single video:
import os

from moviepy.editor import VideoFileClip


# Download a video file from its link
def downloadMp4(filePath, url):
    os.system(f"you-get -o {filePath} {url}")
    print(f"{url} download complete")


# Find the newly added flv files and rename them to .mp4
def getMp4List(files_before, files_after):
    flvList = []
    mp4List = []
    # Find the flv/mp4 files that were just downloaded
    for file in files_after:
        if file not in files_before:
            if file.split('.')[-1] in ['flv', 'mp4']:
                flvList.append(file)
    # Change the .flv suffix to .mp4
    for file in flvList:
        if file.split('.')[-1] == 'flv':
            new_name = file[:-3] + "mp4"
            os.rename(f"{filePath}{file}", f"{filePath}{new_name}")
            mp4List.append(new_name)
        elif file.split('.')[-1] == 'mp4':
            mp4List.append(file)
    return mp4List


# Extract the audio from a video file
def getMp3(filePath, mp4File):
    video = VideoFileClip(f"{filePath}{mp4File}")
    audio = video.audio
    mp3File = mp4File[:-1] + "3"  # "xxx.mp4" -> "xxx.mp3"
    audio.write_audiofile(f"{filePath}{mp3File}")
    print(f"{mp3File} audio extraction complete!")


if __name__ == "__main__":
    filePath = "D:/Music/"  # Note: the trailing "/" is required, otherwise later paths will be wrong
    files_before = os.listdir(filePath)  # File list before the run, compared with the list after downloading

    # Links to the guichu videos 《念诗之王》 and 《敢杀我的马》; a single link also works
    mp4Urls = ["https://www.bilibili.com/video/BV1bW411n7fY/?spm_id_from=333.337.search-card.all.click",
               "https://www.bilibili.com/video/BV1yt4y1Q7SS/?spm_id_from=333.337.search-card.all.click"]

    # Install you-get on the first run and comment this out afterwards;
    # it can also be installed from the command line, PyCharm, or Anaconda
    os.system("pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ you-get")

    # Download the videos one after another (no process pool)
    for url in mp4Urls:
        downloadMp4(filePath, url)

    # Compare the folder contents, find the flv files, and rename them to .mp4
    files_after = os.listdir(filePath)
    mp4List = getMp4List(files_before, files_after)

    # Extract the audio from each video in turn (no process pool)
    for mp4 in mp4List:
        getMp3(filePath, mp4)