Crawling video

Crawling the JavaScript course videos on Qianfeng

import requests
from urllib.parse import quote
from lxml import etree

'''
URL
    http://video.mobiletrain.org/course/index/courseId/479
Request method
    GET
Request header
    User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36
'''

# Send the request with a browser-like User-Agent and get the response
response = requests.get(
    url='http://video.mobiletrain.org/course/index/courseId/479',
    headers={
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36'
    }
)
html = response.text

# Collect the video addresses listed on the page
eroot = etree.HTML(html)
# NOTE: the class value is reproduced as it appears in the source text and
# may be garbled; verify it against the page's actual markup.
hrefs = eroot.xpath("//li[@class='J-clearfix-URL List']/a/@data-url")
for href in hrefs:
    print(href)
    # Build the file name: the text after ':' minus the last 4 characters
    start_index = href.find(':') + 1
    end_index = -4
    filename = href[start_index:end_index]
    # Take the Chinese part of href, starting at the character "千"
    start_url = href.find("千")
    uri = href[start_url:end_index]
    # Build the real address of the video
    start_uri = 'http://7xtcwd.com1.z0.glb.clouddn.com/'
    # Percent-encode the Chinese characters so the URL is valid
    end_uri = quote(uri)
    src = start_uri + end_uri + ".mp4"

    with open(filename + '.mp4', 'wb') as f:
        # Download the file with requests, streaming the response body
        video_response = requests.get(
            url=src,
            stream=True
        )
        print("downloading:", src)
        # Write the body to disk 512 bytes at a time
        for chunk in video_response.iter_content(chunk_size=512):
            f.write(chunk)

 

Reproduced from: https://www.cnblogs.com/chaunceyji/p/10995266.html


Origin blog.csdn.net/weixin_34293911/article/details/93683809