Using a Python script to call ffmpeg to download TS-segmented video files

Foreword

Many online videos today are not served as a single MP4 file but as a series of TS files. TS here refers to a video segmentation technique, used partly to deter piracy and to avoid wasting traffic and bandwidth. It relies on an index file (the m3u8 file), which lists every segment of the video together with each segment's duration and TS file name:

image.png
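To make the structure of such an index file concrete, here is a minimal sketch that reads the segment durations and file names from a playlist. The sample playlist text and the segment names in it are made up for illustration, and the parser only handles the `#EXTINF` lines described above, not the full format.

```python
# Hypothetical sample playlist; a real one is fetched from the site's index URL.
SAMPLE_M3U8 = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXTINF:10.0,
seg-0.ts
#EXTINF:10.0,
seg-1.ts
#EXTINF:4.5,
seg-2.ts
#EXT-X-ENDLIST
"""


def parse_segments(playlist_text):
    """Return a list of (duration_seconds, uri) pairs from a playlist."""
    segments = []
    duration = None
    for raw in playlist_text.splitlines():
        line = raw.strip()
        if line.startswith("#EXTINF:"):
            # Format: "#EXTINF:<duration>,<optional title>"
            duration = float(line[len("#EXTINF:"):].split(",", 1)[0])
        elif line and not line.startswith("#") and duration is not None:
            # The first non-comment line after #EXTINF is the segment's URI.
            segments.append((duration, line))
            duration = None
    return segments


segments = parse_segments(SAMPLE_M3U8)
total_seconds = sum(duration for duration, _ in segments)
print(len(segments), "segments,", total_seconds, "seconds in total")
```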

These segments are loaded asynchronously during playback. Seeking is also convenient: the player adds up the segment durations to work out which segment contains the target time and skips loading the segments before it.

image.png
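The seek calculation can be sketched as follows. This is only an illustration of the idea, not how any particular player implements it; the function name `segment_for_time` and the example durations are assumptions.

```python
from bisect import bisect_right
from itertools import accumulate


def segment_for_time(durations, t):
    """Return the index of the segment that contains playback time t (seconds)."""
    if not durations or t < 0:
        raise ValueError("need at least one segment and a non-negative time")
    # Running totals give the end time of each segment.
    ends = list(accumulate(durations))
    if t >= ends[-1]:
        raise ValueError("time is past the end of the video")
    # The first segment whose end time is greater than t contains t.
    return bisect_right(ends, t)


# Hypothetical durations: two 10-second segments and a final 4.5-second one.
durations = [10.0, 10.0, 4.5]
print(segment_for_time(durations, 12.0))
```

A player that jumps to 12 seconds therefore only needs to start loading from the second segment (index 1) and can skip the first.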

A TS segment downloaded on its own is only one fragment of the video, so how do we get the whole thing? There are many ways. The one I find best suited to technical people like us is the well-known ffmpeg: its command line can read the remote index file directly, download the segments, and merge them into a single MP4 file. Here is a brief introduction to using it.

Download and install ffmpeg

ffmpeg is a very easy-to-use software toolkit for processing audio and video. To use ffmpeg on Win10, download it and then add it to the environment variables. Download site: Download FFmpeg

image.png

After the download is complete, unzip it and move the unzipped folder to a suitable location:

image.png

To add the environment variable, you can refer to my previous article, "Fiddler cannot crawl HTTPS links? Move the Fiddler certificate into the system root certificate directory through Root permissions". The steps are the same as for configuring OpenSSL, so only the final configuration is shown here:

image.png

Then check on the command line that the environment variable is set correctly:

ffmpeg -version

image.png

Successfully installed.
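The same check can be made from Python, which is useful because the download script later in this article runs ffmpeg from Python. The sketch below uses the standard library's `shutil.which` to look ffmpeg up on the PATH; the helper name `ffmpeg_version` is an assumption made for this example.

```python
import shutil
import subprocess


def ffmpeg_version(executable="ffmpeg"):
    """Return the first line of `<executable> -version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "-version"], capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


print(ffmpeg_version() or "ffmpeg was not found on PATH")
```

If this prints the "not found" message while `ffmpeg -version` works in a terminal, the Python process is probably running with a different PATH, and passing the full path to ffmpeg.exe is a simple workaround.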

Download the video

First, find the link to the index file of the video you want to download. I found it here in the browser's F12 developer tools:

image.png

Downloading with the native ffmpeg command looks like this:

ffmpeg -i {m3u8 link} -c copy -bsf:a aac_adtstoasc {file name}.mp4
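As a rough guide to what each part of that command does, the sketch below builds the same arguments as a Python list with one comment per option. The function name `build_ffmpeg_args` and the example URL are assumptions for illustration; the options themselves are the ones used in the command above.

```python
def build_ffmpeg_args(m3u8_url, output_path, ffmpeg="ffmpeg"):
    """Build the argument list for the download command used in this article."""
    return [
        ffmpeg,
        "-i", m3u8_url,             # input: the remote m3u8 index file
        "-c", "copy",               # copy the streams as they are, without re-encoding
        "-bsf:a", "aac_adtstoasc",  # audio bitstream filter: repackage ADTS AAC for the MP4 container
        output_path,                # output: a single merged MP4 file
    ]


# Hypothetical URL, shown only to illustrate the resulting command line.
args = build_ffmpeg_args("https://example.com/index.m3u8", "test.mp4")
print(" ".join(args))
```

Because `-c copy` avoids re-encoding, the download speed is usually limited by the network rather than by the CPU.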

You can download directly with the native command. On my Win10 system, however, ffmpeg was not found through the environment variables when called from Python, so the script passes the full path to ffmpeg.exe instead. Here is a simple function showing how to call ffmpeg from Python to download a video:


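import os
import subprocess

def one_video(url, file_name):
    # os.path.join avoids ending a string literal with a backslash
    save_path = os.path.join(r'D:\Download\ts', file_name + '.mp4')
    # Raw string, so \f and \b in the path are not read as escape sequences
    ffmpeg = r'D:\MiniTool\ffmpeg\bin\ffmpeg.exe'
    # An argument list needs no shell quoting, so ? and & in the URL are safe
    subprocess.run([ffmpeg, '-i', url, '-c', 'copy', '-bsf:a', 'aac_adtstoasc', save_path])

url = 'https://*****/videos/augdeduuigdrypzmvseh/index.m3u8?token=eyssddcflgjkmddsds********'
file_name = 'test'
one_video(url, file_name)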

Result:

image.png

image.png

Summary

ffmpeg is a very powerful tool, but calling it from a single-threaded script is slow, especially when the files are large or there are many of them. I plan to use it from Scrapy next.
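Short of moving to Scrapy, one simple way to download several videos at once is a thread pool from the standard library, with each thread waiting on its own ffmpeg process. This is a sketch of that alternative, not the Scrapy approach mentioned above; the names `download_one` and `download_many` and the `runner` parameter are assumptions made for this example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def download_one(job, ffmpeg="ffmpeg"):
    """Run ffmpeg for one (m3u8_url, output_path) pair and return its exit code."""
    url, output_path = job
    args = [ffmpeg, "-i", url, "-c", "copy", "-bsf:a", "aac_adtstoasc", output_path]
    return subprocess.run(args).returncode


def download_many(jobs, max_workers=4, runner=download_one):
    """Run several downloads concurrently and return their exit codes in job order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each worker thread just waits for its ffmpeg process to finish.
        return list(pool.map(runner, jobs))
```

Calling `download_many([(url_1, "a.mp4"), (url_2, "b.mp4")])` would start both downloads together; an exit code of 0 means ffmpeg reported success for that job. The `runner` parameter exists only so the function can be tried out without running ffmpeg.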

In addition, the m3u8 index file is a very important file and is generally protected. In my example the link carries a token for replay protection: it is valid for one access and has expired by the second. Here the token only had to be decrypted in the web page, which caused no problems, but real-world cases are generally more complicated.

Origin juejin.im/post/7136823714410135559