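This post briefly shows how to scrape data from a mobile app with Python, using the Treehole (树洞) board of the 超级课程表 (Super Course Schedule) app as the example.
First, download the packet-capture tool Fiddler.
A Baidu search will find the download. Then open Fiddler and set a few options:
Check "Decrypt HTTPS traffic" so that Fiddler can intercept HTTPS requests.
Check "Allow remote computers to connect" so that other machines can send their HTTP/HTTPS requests through Fiddler. Remember to restart Fiddler afterwards.
Note the port, 8888. On Windows, open cmd and run ipconfig to find your PC's IP address, then open the network proxy settings on the phone: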
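Android:
Make sure the phone and the PC are connected to the same local network.
Go to Settings -> WLAN, long-press the wireless network you are connected to, and open its options.
Set the proxy to manual, enter the PC's IP address and the port from above, and tap Save. The phone is now configured.
In the Android browser, open http://[IP]:8888, tap "FiddlerRoot certificate", and install the certificate.
iPhone: tap the info icon to the right of the wireless network and enter the proxy there. No certificate is needed for the plain-HTTP requests captured in this post (decrypting HTTPS traffic on iOS would require installing and trusting the Fiddler root certificate). Fiddler can now capture the phone's traffic.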
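The IP address and proxy values used in the steps above can also be looked up from Python instead of ipconfig. The following is a minimal sketch, not part of the original post: the helper names `local_ip` and `proxy_settings` and the probe address are illustrative assumptions, and the port defaults to Fiddler's 8888.

```python
import socket


def local_ip(probe_host="10.255.255.255", probe_port=1):
    """Return the IPv4 address this machine would use for outbound traffic."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only selects a route
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        # No usable network route: fall back to loopback
        return "127.0.0.1"
    finally:
        sock.close()


def proxy_settings(port=8888):
    """Build the values to type into the phone's manual proxy form."""
    ip = local_ip()
    return {"host": ip, "port": port, "cert_url": "http://{}:{}".format(ip, port)}


if __name__ == "__main__":
    print(proxy_settings())
```

On a machine with several network adapters the address returned may not be the one on the phone's network, so it is worth checking it against the ipconfig output.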
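The capture gives the login URL of 超级课程表: http://120.55.151.61/V2/StudentSkip/loginCheckV4.action
The form contains the username and password (both already encrypted by the app) plus some device information. Simply POST it as captured.
The headers must be set as well. This matters because they make the request look like it comes from the phone app: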
from urllib import request
import http.cookiejar

# Cookie jar that keeps the session cookies returned by the login request
c = http.cookiejar.CookieJar()
cookie = request.HTTPCookieProcessor(c)
# Bind the cookie handler to an opener and install it globally
opener = request.build_opener(cookie)
request.install_opener(opener)
# Headers copied from the captured request so the server sees the phone app.
# Content-Length is left out because urllib computes it from the body, and
# Accept-Encoding: gzip is left out because urllib does not decompress responses.
opener.addheaders = [
    ('Connection', 'close'),
    ('Content-Type', 'application/x-www-form-urlencoded; charset=utf-8'),
    ('User-Agent', 'Mozilla/5.0 (iPhone; CPU iPhone OS 11_4_1 like Mac OS X) '
                   'AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15G77 -SuperFriday_9.4.1'),
]

def login():
    url = 'http://120.55.151.61:80/V2/StudentSkip/loginCheckV4.action'
    # Form body copied from the capture; account and password are already encrypted
    data = 'password=80ae7800f3602767a22913fd5351858d&account=09a41e3366274f2da3eb0163d161986b&registrationId=&ifa=737D0450-F735-4251-8306-5B03DD079D58&ifv=A4F8E2BF-1657-4C67-B615-2806C49571DE&versionNumber=9.4.1&platform=2&channel=AppStore&phoneVersion=11.4.1&phoneModel=iphone%205s%28Global%29%20%28A1457%2FA1518%2FA1528%2FA1530%29&phoneBrand=Apple'
    req = request.Request(url, data=data.encode())
    response = opener.open(req).read()
    # Any non-empty body is treated as a successful login
    if response:
        print('login success')
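To see which fields a captured form body actually carries, it can be split with the standard library rather than read by eye. This is a small sketch on a shortened stand-in string: the values `ENCRYPTED_PASSWORD` and `ENCRYPTED_ACCOUNT` are placeholders, not data from the capture.

```python
from urllib.parse import parse_qsl, urlencode

# Shortened stand-in for a captured form body (placeholder values, not real credentials)
captured = ("password=ENCRYPTED_PASSWORD&account=ENCRYPTED_ACCOUNT&registrationId="
            "&versionNumber=9.4.1&platform=2&channel=AppStore"
            "&phoneVersion=11.4.1&phoneModel=iphone%205s&phoneBrand=Apple")

# keep_blank_values=True preserves empty fields such as registrationId
fields = dict(parse_qsl(captured, keep_blank_values=True))
for name, value in fields.items():
    print(name, "=", repr(value))

# Re-encode the fields into bytes suitable for Request(url, data=...)
body = urlencode(fields).encode()
```

Note that `urlencode` writes spaces as `+` where the capture used `%20`. Both are valid form encoding, but whether this particular server accepts both is an assumption, so sending the captured string unchanged is the safer choice.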
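The login request returns the login information as a JSON string.
Next, tap a topic on the phone, capture that request, and pick out the information we need from its response.
Here is the complete code: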
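Picking the needed fields out of the response can be sketched as follows. The sample JSON is hypothetical: it only mirrors the keys the script in this post reads (`status`, `data`, `timestampLong`, `listBO`, `messageBOs`, `schoolName`, `content`), and a real response contains more. The helper name `extract_posts` is also an illustrative assumption.

```python
import json

# Hypothetical sample shaped like the fields the script reads; not a real response
sample = '''
{"status": 1,
 "data": {"timestampLong": 1536000000000,
          "listBO": {"messageBOs": [
              {"schoolName": "Example University", "content": "first post"},
              {"content": "post without a school"}]}}}
'''


def extract_posts(raw):
    """Return (next_timestamp, [(school, content), ...]) from one response body."""
    result = json.loads(raw)
    if result.get("status") != 1:
        return None, []
    data = result["data"]
    # schoolName may be missing, so fall back to an empty string
    posts = [(item.get("schoolName", ""), item.get("content"))
             for item in data["listBO"]["messageBOs"]]
    return data["timestampLong"], posts


next_ts, posts = extract_posts(sample)
print(next_ts, posts)
```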
from urllib import request
import http.cookiejar
import json

# Cookie jar that keeps the session cookies returned by the login request
c = http.cookiejar.CookieJar()
cookie = request.HTTPCookieProcessor(c)
# Bind the cookie handler to an opener and install it globally,
# so request.urlopen() also sends the cookies and the headers below
opener = request.build_opener(cookie)
request.install_opener(opener)
# Headers copied from the captured request so the server sees the phone app.
# Content-Length is left out because urllib computes it from the body, and
# Accept-Encoding: gzip is left out because urllib does not decompress responses.
opener.addheaders = [
    ('Connection', 'close'),
    ('Content-Type', 'application/x-www-form-urlencoded; charset=utf-8'),
    ('User-Agent', 'Mozilla/5.0 (iPhone; CPU iPhone OS 11_4_1 like Mac OS X) '
                   'AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15G77 -SuperFriday_9.4.1'),
]

def login():
    url = 'http://120.55.151.61:80/V2/StudentSkip/loginCheckV4.action'
    # Form body copied from the capture; account and password are already encrypted
    data = 'password=80ae7800f3602767a22913fd5351858d&account=09a41e3366274f2da3eb0163d161986b&registrationId=&ifa=737D0450-F735-4251-8306-5B03DD079D58&ifv=A4F8E2BF-1657-4C67-B615-2806C49571DE&versionNumber=9.4.1&platform=2&channel=AppStore&phoneVersion=11.4.1&phoneModel=iphone%205s%28Global%29%20%28A1457%2FA1518%2FA1528%2FA1530%29&phoneBrand=Apple'
    req = request.Request(url, data=data.encode())
    response = opener.open(req).read()
    # Any non-empty body is treated as a successful login
    if response:
        print('login success')

def chat(timestampLong=0, max_pages=10):
    # max_pages is an added safety limit; the original recursed without a stop condition
    url = 'http://120.55.151.61:80/Treehole/V4/Message/getListByType.action'
    for _ in range(max_pages):
        data = 'type=1&timestamp={}&versionNumber=9.4.1&platform=2&channel=AppStore&phoneVersion=11.4.1&phoneModel=iphone%205s%28Global%29%20%28A1457%2FA1518%2FA1528%2FA1530%29&phoneBrand=Apple'.format(timestampLong)
        req = request.Request(url, data=data.encode())
        response = request.urlopen(req).read()
        # -- fetch more: the returned timestamp selects the next page --
        timestampLong = parse_html(response.decode())
        if timestampLong is None:
            break

def parse_html(result):
    # Parse one response; return the timestamp for the next page, or None on failure
    result = json.loads(result)
    print(result)
    if result['status'] != 1:
        return None
    print('success spider')
    parse_one_page(result)
    return result['data']['timestampLong']

def parse_one_page(result):
    # Print the school name and text of every post on the page
    for item in result['data']['listBO']['messageBOs']:
        print(item.get('schoolName', ' '), item.get('content'))

login()
chat()
Screenshot of the output after running the script.
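One detail worth handling: the captured request advertises Accept-Encoding: gzip. If a script sends that header, the server may reply with a compressed body, and urllib returns those bytes without decompressing them, so decoding them as text fails. Whether this server actually compresses its replies is not confirmed by the post. The sketch below, with the hypothetical helper `decode_body`, handles both cases and is exercised on simulated bodies rather than real responses.

```python
import gzip
import json


def decode_body(raw):
    """Decode a response body to text, decompressing it first if it is gzip data."""
    # gzip streams start with the magic bytes 0x1f 0x8b
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    return raw.decode("utf-8")


# Simulated bodies standing in for the bytes returned by response.read()
plain = json.dumps({"status": 1}).encode()
compressed = gzip.compress(plain)
print(json.loads(decode_body(plain)), json.loads(decode_body(compressed)))
```

A helper like this would wrap the bytes read from the response before they are passed to `json.loads`.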