Python: Kobayashi's program for crawling QQ space album image links

Foreword

Yesterday I saw that someone's space, XXXX, had uploaded some pictures, and I wanted to download them [so essentially this is a picture downloader]. Saving them one by one is a waste of time, and I could not find a ready-made tool on the Internet. The ones I did find actually require a registration code, sold at 45 apiece. Have they no conscience! And, surprise, there is even a downloader with the same name as mine, which really makes me jealous. So I decided to work this out myself. Most of the material available is Python source code, and much of it is outdated and no longer works [in any case, I did not find anything usable].

Fortunately, plenty of people have posted tutorials online. This was my first contact with Python [about a day's worth], and it really surprised me: the language is so simple that you can follow it easily. So, from those scattered pieces of code and tutorials, I wrote this program.

1. What you need to prepare

  Python 3, plus the following modules:

import sys
import re
import requests
import execjs
import time

  A QQ account (number and password) that has QQ space opened and can log in.

  The QQ number of the account whose space albums you want to crawl.

  The Cookies from a successful QQ space login, including the pskey parameter.

1.2 Capturing the Cookies

  At first I followed the tutorials and looked for the Cookies in the browser, but in the end Chrome [version 76.0.3809.87 (official build) (64-bit)] did not capture them. So I fell back on a classic tool.

Steps: open fiddler, open a browser and log in to your account, find the Cookies containing the pskey parameter in the intercepted list in the right-hand column, right-click and choose View header, copy the content, and use it to replace the Cookies parameter in the .py file.

  

(The picture shows the fiddler to intercept information)

 

 (The picture shows the view header box)
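Before pasting the copied header into the script, it can help to check that it really contains the pskey value. The following is only a minimal sketch: the function name `parse_cookie_header`, the cookie name `p_skey`, and every value in the sample header are placeholders assumed for illustration, not taken from the original program.

```python
def parse_cookie_header(raw: str) -> dict:
    """Split a raw 'name=value; name2=value2' Cookie header into a dict."""
    cookies = {}
    for part in raw.split(";"):
        part = part.strip()
        if not part or "=" not in part:
            continue  # skip empty or malformed pieces
        name, value = part.split("=", 1)  # the value itself may contain '='
        cookies[name.strip()] = value.strip()
    return cookies


# Placeholder header text, not a real session.
raw_header = "uin=o0123456789; skey=@abcdefgh; p_skey=EXAMPLE_VALUE=="
cookies = parse_cookie_header(raw_header)
print("p_skey" in cookies)
```

If the check prints False, the copied header most likely came from a request made before the login completed, and it is worth capturing it again.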

2.0 Replacing values in the code

    Assign the QQ number you prepared [it must correspond to the Cookies] to uin

    Assign the QQ number of the account to crawl to fuin

    Assign the captured Cookies to cookie
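As a rough illustration of these three assignments, the sketch below uses placeholder values and a hypothetical helper, `build_headers`, that puts the copied Cookie text into request headers. The User-Agent value is an assumption, and the album URL used by the actual script is not shown here.

```python
# Placeholder values; replace them with your own before running.
uin = "123456789"    # QQ number that the Cookies belong to
fuin = "987654321"   # QQ number whose albums will be crawled
cookie = "uin=o0123456789; p_skey=EXAMPLE_VALUE"  # header text copied from fiddler


def build_headers(cookie_text: str) -> dict:
    """Return request headers that carry the copied Cookie text."""
    return {
        "Cookie": cookie_text,
        "User-Agent": "Mozilla/5.0",  # assumed; any browser-like value
    }


headers = build_headers(cookie)
# A request would then be sent as requests.get(url, headers=headers),
# where url is the album address built by the script from uin and fuin.
```

Keeping the three values together at the top of the file makes it easy to swap accounts without touching the rest of the code.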

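2.1 Running the code

  After installing the required modules and preparing everything, save the file and start it from the PowerShell or cmd command line. Once the program has finished crawling the album links it calls input: enter a target directory, and the links .txt file is written to that directory. [At this point the text file still contains JS-escaped text; you can write a small tool that unescapes \/ to /, for example the code given below:]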

'\Code for vb6
'\e-mail: [email protected]
Dim url() As String

Public Function url_format(stra As String) As String
    ' Turn every JS-escaped "\/" into "/".
    ' (Splitting on "\" and skipping the pieces equal to "/" would lose one slash from "\/\/".)
    url_format = Replace(stra, "\/", "/")
End Function

Public Sub main()
    ' The number of URL lines to unescape may exceed the Integer range (32...), so Long is safer.
    Dim i As Long
    ' Read every line, unescape it, and keep it in url().
    Open App.Path & "\ling_url.txt" For Input As #1
    Do Until EOF(1)
        ReDim Preserve url(i)
        Line Input #1, url(i)
        url(i) = url_format(url(i))
        i = i + 1
    Loop
    Close #1
    ' Nothing was read, so there is nothing to write.
    If i = 0 Then Exit Sub
    ' Write the unescaped lines to the output file.
    Open App.Path & "\ling_Val_url.txt" For Output As #2
    For i = 0 To UBound(url())
        Print #2, url(i)
    Next
    Close #2
End Sub
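The same unescaping step can also be done in Python, which avoids needing a second tool. This is a minimal sketch rather than part of the original program: the function names `unescape_slashes` and `convert_file` are hypothetical, and it assumes the only escaping present in the file is `\/`.

```python
def unescape_slashes(text: str) -> str:
    """Turn every JS-escaped '\\/' into a plain '/'."""
    return text.replace("\\/", "/")


def convert_file(src_path: str, dst_path: str) -> int:
    """Unescape src_path line by line into dst_path; return the line count."""
    count = 0
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(unescape_slashes(line))
            count += 1
    return count


print(unescape_slashes(r"http:\/\/example.com\/a.jpg"))
# prints: http://example.com/a.jpg
```

To mirror the VB6 helper, `convert_file("ling_url.txt", "ling_Val_url.txt")` would read the exported file and write the cleaned copy next to it.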

   Result of running the code (example):

(The picture shows the export directory being filled in)
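The export step described above (asking for a directory, then writing the links file into it) might look roughly like this. The helper name `write_links` and the default file name `ling_url.txt` are assumptions made for illustration; the file name is chosen only because it is the one the VB6 helper reads.

```python
import os


def write_links(links, out_dir, filename="ling_url.txt"):
    """Write one link per line to out_dir/filename and return the full path."""
    os.makedirs(out_dir, exist_ok=True)  # create the directory if it is missing
    path = os.path.join(out_dir, filename)
    with open(path, "w", encoding="utf-8") as f:
        for link in links:
            f.write(link + "\n")
    return path


# Typical use at the end of a crawl (input() waits for the user):
#     out_dir = input("Export directory: ").strip()
#     print(write_links(links, out_dir))
```

Creating the directory up front means a typo in the path produces a new folder rather than a crash after a long crawl, which may or may not be what you want.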

 

3. Code analysis:

 

Origin www.cnblogs.com/lingqingxue/p/11306575.html