[python] Scraping grades with requests

I'm new to this, so please bear with me if I've gotten some things wrong.

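Watching the experts around me scrape the academic affairs system and write Python programs to grab course slots, I wanted to learn Python too. But a beginner is a beginner: there is a lot I don't understand at all, and I muddled my way to the thing below, which still has plenty of holes to fill.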

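Here is how I think about it. As I understand it, browsing a web page means sending a request from the local machine to the server, then receiving the server's data and displaying it in the browser. So we can use the requests module to construct POST and GET requests and simulate a visit.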

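First comes the simulated login. Using Fiddler, I found that in the request we send, the student ID is unchanged but the password has become a strange string. A search told me this is MD5 (strictly a hash function rather than encryption; I only know the name and none of the details). I use hashlib to hash the password; note that the password must first be converted to bytes.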
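
To see the hashing step on its own, here is a minimal sketch. The helper name `md5_hex` is made up for illustration, and it assumes the site expects the lowercase hex digest of the UTF-8 encoded password, which is what the captured request suggested.

```python
import hashlib


def md5_hex(password: str) -> str:
    """Return the MD5 hex digest of a password string (hypothetical helper)."""
    # hashlib works on bytes, so the str must be encoded first;
    # passing a str directly raises TypeError.
    return hashlib.md5(password.encode("utf-8")).hexdigest()


print(md5_hex("abc"))  # 900150983cd24fb0d6963f7d28e17f72
```

The digest is always 32 hex characters, whatever the length of the password.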

import hashlib
import json

import requests


def login():
    student_id = input('Enter student ID: ')
    password = input('Enter password: ')
    url = 'http://bkjws.sdu.edu.cn/b/ajaxLogin'
    code = password.encode(encoding="utf-8")  # hashlib needs bytes, not str
    # MD5 hash of the password
    m = hashlib.md5()
    m.update(code)
    payload = {'j_username': student_id, 'j_password': m.hexdigest()}  # build the POST body
    headers = {'cookie': 'index=***; JSESSIONID=*******',
               'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; LCTE; rv:11.0) like Gecko'}
    r = requests.post(url, data=payload, headers=headers)
    print(r.text)
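
If r.text reports success, the simulated login worked. Also, the JSESSIONID in the cookie can apparently be changed; at least, I changed it and nothing went wrong.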
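
Rather than copying JSESSIONID out of Fiddler by hand, a session object can carry cookies between requests. The sketch below is an assumption about how that might look, not what the original program does: `login_with_session` is a made-up name, and it assumes the server sets its session cookie in response to the login POST. It accepts any object with a `post` method, such as a `requests.Session()`.

```python
import hashlib

LOGIN_URL = 'http://bkjws.sdu.edu.cn/b/ajaxLogin'


def login_with_session(session, student_id, password):
    """Log in through `session` so later requests can reuse its cookies.

    Hypothetical helper: `session` is expected to behave like
    requests.Session(), i.e. to offer post(url, data=...).
    """
    # Same MD5 step as in login(): encode to bytes, then take the hex digest.
    digest = hashlib.md5(password.encode('utf-8')).hexdigest()
    payload = {'j_username': student_id, 'j_password': digest}
    response = session.post(LOGIN_URL, data=payload)
    # The caller can check this text for the success message.
    return response.text
```

With the real library this might be used by creating `session = requests.Session()`, calling `login_with_session(session, student_id, password)`, and then using `session.get(...)` for the home page, which would send back whatever cookies the server set.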

Next, send a request for the home page.

def next():
    # Fetch the home page, sending the session cookie along with the request
    url1 = 'http://bkjws.sdu.edu.cn/f/common/main'
    headers1 = {'cookie': 'index=*****; JSESSIONID=***********',
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; LCTE; rv:11.0) like Gecko'}
    r1 = requests.get(url1, headers=headers1)
    return r1

Then send a POST request to get the grades. The request body I captured was URL-encoded, so I first decoded it with an online tool.
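
The decoding can also be done locally with the standard library instead of an online tool. This is a minimal sketch: the `captured_body` string is a shortened, made-up stand-in for the body Fiddler showed, keeping only the first `aoData` entry.

```python
import json
from urllib.parse import parse_qs

# Shortened, illustrative stand-in for the URL-encoded body seen in Fiddler.
captured_body = "aoData=%5B%7B%22name%22%3A%22sEcho%22%2C%22value%22%3A1%7D%5D"

# parse_qs splits the form body into fields and percent-decodes each value.
fields = parse_qs(captured_body)
ao_data_text = fields["aoData"][0]
print(ao_data_text)  # [{"name":"sEcho","value":1}]

# The decoded value is itself JSON, so it can be inspected as Python objects.
ao_data = json.loads(ao_data_text)
print(ao_data[0]["name"])  # sEcho
```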

def getscore():
    url3 = 'http://bkjws.sdu.edu.cn/b/cj/cjcx/xs/lscx'
    k = '[{"name":"sEcho","value":1},{"name":"iColumns","value":10},{"name":"sColumns","value":""},{"name":"iDisplayStart","value":0},{"name":"iDisplayLength","value":20},{"name":"mDataProp_0","value":"xnxq"},{"name":"mDataProp_1","value":"kch"},{"name":"mDataProp_2","value":"kcm"},{"name":"mDataProp_3","value":"kxh"},{"name":"mDataProp_4","value":"xf"},{"name":"mDataProp_5","value":"kssj"},{"name":"mDataProp_6","value":"kscjView"},{"name":"mDataProp_7","value":"wfzjd"},{"name":"mDataProp_8","value":"wfzdj"},{"name":"mDataProp_9","value":"kcsx"},{"name":"iSortCol_0","value":5},{"name":"sSortDir_0","value":"desc"},{"name":"iSortingCols","value":1},{"name":"bSortable_0","value":false},{"name":"bSortable_1","value":false},{"name":"bSortable_2","value":false},{"name":"bSortable_3","value":false},{"name":"bSortable_4","value":false},{"name":"bSortable_5","value":true},{"name":"bSortable_6","value":false},{"name":"bSortable_7","value":false},{"name":"bSortable_8","value":false},{"name":"bSortable_9","value":false}]'
    payload = {'aoData': k}
    headers = {'X-Requested-With': 'XMLHttpRequest',
               'cookie': 'index=****; j_username=***********; j_password=*********;JSESSIONID=************',
               'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; LCTE; rv:11.0) like Gecko',
               'Accept': 'text/javascript, application/javascript, application/ecmascript, application/x-ecmascript, */*; q=0.01'}
    r3 = requests.post(url3, data=payload, headers=headers)
    q = json.loads(r3.text)  # parse the JSON response into a dict
    return q  # hand the parsed data to display()
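
The long `aoData` literal can also be generated rather than pasted. The following sketch rebuilds the same list of name/value pairs with `json.dumps`; the function name `build_ao_data` and its parameters are made up, tying the sortable flag to the sort column is my reading of the captured string, and whether the server tolerates any deviation from that string is untested.

```python
import json

# Column keys in the order they appear in the captured request.
COLUMNS = ["xnxq", "kch", "kcm", "kxh", "xf", "kssj",
           "kscjView", "wfzjd", "wfzdj", "kcsx"]


def build_ao_data(start=0, length=20, sort_col=5, sort_dir="desc"):
    """Build the aoData JSON string (hypothetical helper mirroring the capture)."""
    def pair(name, value):
        return {"name": name, "value": value}

    items = [
        pair("sEcho", 1),
        pair("iColumns", len(COLUMNS)),
        pair("sColumns", ""),
        pair("iDisplayStart", start),
        pair("iDisplayLength", length),
    ]
    items += [pair("mDataProp_%d" % i, key) for i, key in enumerate(COLUMNS)]
    items += [
        pair("iSortCol_0", sort_col),
        pair("sSortDir_0", sort_dir),
        pair("iSortingCols", 1),
    ]
    # Assumption: only the sort column is marked sortable, as in the capture.
    items += [pair("bSortable_%d" % i, i == sort_col) for i in range(len(COLUMNS))]
    # Compact separators avoid the spaces json.dumps inserts by default.
    return json.dumps(items, separators=(",", ":"))


payload = {"aoData": build_ao_data()}
```

With the default arguments this yields the same 28 entries as the pasted string. Passing a larger `length` would presumably return more than 20 rows per request, though I have not confirmed how the server treats it.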

The response is a JSON string, so it is first converted to a dict, and then the data is processed.

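def display(q):
    q1 = q['object']
    total = q1['iTotalRecords']  # renamed from sum to avoid shadowing the builtin
    print('%s courses in total' % total)
    a = q1['aaData']
    for v in a:
        print('Course: %s' % v['kcm'])
        # v['sycj'] was written as ['sycj'] before, which printed a list instead of the grade
        print('Exam grade: %s   Final grade: %s  Coursework grade: %s Lab grade: %s' % (v['kscj'], v['qmcj'], v['pscj'], v['sycj']))
        print('Grade level: %s  Grade point: %s' % (v['wfzdj'], v['wfzjd']))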
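
Since the field names come from an undocumented response, a lookup that tolerates missing keys may be safer. The sketch below formats one record into a line of text; `format_record`, the `'-'` placeholder, and the sample record are all made up for illustration, with only the key names taken from the code above.

```python
def format_record(record):
    """Format one course record, using '-' for any missing field (hypothetical helper)."""
    keys = ["kcm", "kscj", "qmcj", "pscj", "sycj", "wfzdj", "wfzjd"]
    # dict.get avoids a KeyError when the server omits a field.
    values = [str(record.get(key, "-")) for key in keys]
    return " | ".join(values)


# Made-up record for demonstration; real data comes from q['object']['aaData'].
sample = {"kcm": "Example Course", "kscj": 90, "wfzjd": 4.0}
print(format_record(sample))  # Example Course | 90 | - | - | - | - | 4.0
```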

That completes the program. It is crude, but I'm still quite pleased with it.

Reposted from blog.csdn.net/Asensio_20/article/details/83446163