The following libraries can download YouTube videos:
- you-get: https://github.com/soimort/you-get
  Documentation: https://you-get.org
- pytube: https://github.com/pytube/pytube
  Documentation: https://pytube.io/en/latest/index.html
- youtube-dl: https://github.com/ytdl-org/youtube-dl
  Documentation: http://ytdl-org.github.io/youtube-dl/
  https://github.com/ytdl-org/youtube-dl/blob/master/README.md
- yt-dlp: https://github.com/yt-dlp/yt-dlp
  A youtube-dl fork with additional features and fixes
About you-get
- github : https://github.com/soimort/you-get
you-get is a well-known open source video download toolkit, so I won't go into detail about it here.
Calling you-get from code
you-get provides a command-line interface for downloading videos. Here we show how to call its source code from Python in order to handle more customized needs.
Take downloading a YouTube video as an example.
The code logic is as follows, but running it as-is raises errors. See the fixes below:
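# The YouTube extractor class lives in src/you_get/extractors/youtube.py
from you_get.extractors.youtube import YouTube

site = YouTube()
# Download by URL
url = 'https://www.youtube.com/watch?v=mchvUV0iQLg'
site.download_by_url(url)
# Download by vid (video id)
vid = '1c3iQWFEDJI'
site.download_by_vid(vid)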
Error handling
1. When downloading by URL, an error may be reported: [Failed] Unsupported URL pattern.
If the URL is https://www.youtube.com/watch?v=mchvUV0iQLg, then its vid is mchvUV0iQLg.
The error occurs because get_vid_from_url in youtube.py fails to parse the vid from the URL, so the prepare function passes the URL to download_playlist_by_url for downloading. The URL does not match the playlist format either, so an error is reported.
To fix this, change the rules that get_vid_from_url uses to identify vids.
2. When only a vid is passed, the program may crash, because no url has been set by the time the prepare function in youtube.py runs. Change the playlist check in prepare from
if re.search('\Wlist=', self.url)
to the following:
if self.url and re.search('\Wlist=', self.url) and not kwargs.get('playlist'):
    log.w('This video is from a playlist. (use --playlist to download all videos in the playlist.)')
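The two fixes above can be sketched as standalone helpers. This is only an illustration under stated assumptions: extract_vid and from_playlist are hypothetical names, not functions from you-get, and the URL shapes handled here (watch pages, youtu.be short links, /embed/ and /v/ paths) are an assumed set rather than the rules you-get actually applies.

```python
import re
from urllib.parse import parse_qs, urlparse


def extract_vid(url):
    """Return the video id from a YouTube URL, or None if none is found.

    Hypothetical helper for illustration; not you-get's get_vid_from_url.
    """
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    # Short links: https://youtu.be/<vid>
    if host.endswith("youtu.be"):
        return parsed.path.lstrip("/").split("/")[0] or None
    # Watch pages: https://www.youtube.com/watch?v=<vid>
    query = parse_qs(parsed.query)
    if "v" in query:
        return query["v"][0]
    # Embed-style paths: /embed/<vid> or /v/<vid>
    match = re.match(r"^/(?:embed|v)/([^/?#]+)", parsed.path)
    return match.group(1) if match else None


def from_playlist(url):
    """Guarded playlist check: a missing url counts as 'not a playlist'.

    Without the `url and` guard, re.search raises a TypeError when url is None,
    which is the crash described in item 2.
    """
    return bool(url and re.search(r"\Wlist=", url))


print(extract_vid("https://www.youtube.com/watch?v=mchvUV0iQLg"))
print(from_playlist(None))
```

The guard in from_playlist mirrors the `self.url and ...` change: the regular expression is only evaluated once a URL string is known to exist.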
A brief analysis of the source code
By calling the code and stepping through it with breakpoints, we can learn that:
- The source code is mainly in the src folder;
- Each downloader is in the extractors folder, and every downloader inherits from the VideoExtractor class;
- Inside the VideoExtractor class, the subclass's extract method is called to extract the content of the stream; then its own download method is called to download the video. This uses the download_urls method in the file common.py, which relies on ffmpeg internally and merges segmented videos as needed.
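The call flow described above is a template-method pattern, which the following minimal sketch illustrates. Every name in it (VideoExtractorSketch, download_urls_sketch, FakeSite) is hypothetical and stands in for the real you-get classes. It performs no network access and only mimics the order of the calls.

```python
class VideoExtractorSketch:
    """Hypothetical stand-in for the base class; not you-get's VideoExtractor."""

    def __init__(self):
        self.streams = {}

    def extract(self, url):
        # Each site-specific subclass supplies its own stream extraction.
        raise NotImplementedError

    def download_by_url(self, url):
        # The base class drives the flow: extract first, then download.
        self.streams = self.extract(url)
        return self.download()

    def download(self):
        # Pick the first stream and hand its URLs to the shared helper.
        stream_id, urls = next(iter(self.streams.items()))
        return download_urls_sketch(urls, title=stream_id)


def download_urls_sketch(urls, title):
    # Placeholder for the shared download-and-merge step; does no real I/O.
    return f"{title}: merged {len(urls)} segment(s)"


class FakeSite(VideoExtractorSketch):
    """Hypothetical extractor returning two fake segment URLs."""

    def extract(self, url):
        return {"mp4-720p": [url + "#part1", url + "#part2"]}


result = FakeSite().download_by_url("https://example.com/video")
print(result)
```

Under this design, supporting a new site only requires a new subclass with its own extract method, while the download and merge logic stays shared.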
├── src
│   └── you_get
│       ├── __init__.py
│       ├── __main__.py
│       ├── cli_wrapper
│       │   ├── __init__.py
│       │   ├── downloader
│       │   │   └── __init__.py
│       │   ├── openssl
│       │   │   └── __init__.py
│       │   ├── player
│       │   │   ├── __init__.py
│       │   │   ├── __main__.py
│       │   │   ├── dragonplayer.py
│       │   │   ├── gnome_mplayer.py
│       │   │   ├── mplayer.py
│       │   │   ├── vlc.py
│       │   │   └── wmp.py
│       │   └── transcoder
│       │       ├── __init__.py
│       │       ├── ffmpeg.py
│       │       ├── libav.py
│       │       └── mencoder.py
│       ├── common.py
│       ├── extractor.py
│       ├── extractors
│       │   ├── __init__.py
│       │   ├── acfun.py
│       │   ├── alive.py
│       │   ├── ...
│       │   ├── youku.py
│       │   ├── youtube.py
│       │   └── zhihu.py
│       ├── json_output.py
│       ├── processor
│       │   ├── __init__.py
│       │   ├── ffmpeg.py
│       │   ├── join_flv.py
│       │   ├── join_mp4.py
│       │   ├── join_ts.py
│       │   └── rtmpdump.py
│       ├── util
│       │   ├── __init__.py
│       │   ├── fs.py
│       │   ├── git.py
│       │   ├── log.py
│       │   ├── os.py
│       │   ├── strings.py
│       │   └── term.py
│       └── version.py
├── tests
│   ├── __init__.py
│   ├── test.py
│   ├── test_common.py
│   └── test_util.py
├── you-get
├── you-get.json
└── you-get.plugin.zsh
A brief introduction to pytube
Install
python -m pip install pytube
Or install the latest version from source:
python -m pip install git+https://github.com/pytube/pytube
from pytube import YouTube
# Download 1: take the first available stream
YouTube('http://youtube.com/watch?v=2lAe1cqCOXo').streams.first().download()
# Download 2: filter the streams, then take the highest resolution
yt = YouTube('http://youtube.com/watch?v=2lAe1cqCOXo')
yt.streams.filter(progressive=True, file_extension='mp4').order_by('resolution').desc().first().download()
Search for videos
from pytube import Search
s = Search('news')
len(s.results) # 20
s.results
'''
[<pytube.__main__.YouTube object: videoId=aBV52zHr9F8>, <pytube.__main__.YouTube object: videoId=fjW_ryKbEnY>,
...
<pytube.__main__.YouTube object: videoId=iimaGVnvY6M>, <pytube.__main__.YouTube object: videoId=b1Avd5R93nw>]
'''
s.get_next_results()
len(s.results) # 35
Command line download
pytube https://youtube.com/watch?v=2lAe1cqCOXo
Download playlist
pytube https://www.youtube.com/playlist?list=PLS1QulWo1RIaJECMeUT4LFwJ-ghgoSH6n
Iori 2023-11-09 (Thursday)