Web Scraping in Practice: Fetching 12306 Remaining-Ticket Data with the requests Library

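I have been reading about web scraping lately and tried writing a small scraper of my own that fetches remaining-ticket information from 12306.
The code is short and unoptimized; this post simply records my first scraper.
Approach:
The normal way to check remaining tickets is to open 12306, enter the departure station, destination, and travel date, and click Search. Following those steps one at a time:
1. Go to https://kyfw.12306.cn/otn/leftTicket, enter the departure station and the other details, and click Search. Inspecting the network traffic in the browser's developer tools (F12) shows that the station list is loaded from https://kyfw.12306.cn/otn/resources/js/framework/station_name.js, while the remaining-ticket data comes from a GET request to https://kyfw.12306.cn/otn/leftTicket/queryX?leftTicketDTO.train_date=2019-02-21&leftTicketDTO.from_station=BJP&leftTicketDTO.to_station=SHH&purpose_codes=ADULT (this example queries Beijing to Shanghai on February 21).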
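To make the structure of that query URL concrete, here is a minimal sketch that assembles it from its four parameters using only the standard library. The helper name `build_query_url` is hypothetical, and the endpoint path is simply the one observed in the browser at the time of writing; it may differ today.

```python
from urllib.parse import urlencode

# Endpoint as observed in the browser's network panel (assumed; it may change).
QUERY_URL = 'https://kyfw.12306.cn/otn/leftTicket/queryX'


def build_query_url(date, from_code, to_code):
    """Return the full GET URL for a remaining-ticket query (hypothetical helper)."""
    params = {
        'leftTicketDTO.train_date': date,          # travel date, YYYY-MM-DD
        'leftTicketDTO.from_station': from_code,   # departure station code
        'leftTicketDTO.to_station': to_code,       # destination station code
        'purpose_codes': 'ADULT',                  # adult ticket
    }
    return QUERY_URL + '?' + urlencode(params)


# Beijing (BJP) to Shanghai (SHH) on 2019-02-21, as in the example above.
print(build_query_url('2019-02-21', 'BJP', 'SHH'))
```

Passing the same dictionary to `requests.get(url, params=...)` lets requests do this encoding itself, so the sketch is only meant to show what ends up on the wire.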
2. Now for the code. First fetch the full station list and store each station name together with its code for later use.

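import requests

# Maps each station name (in Chinese, as 12306 lists it) to its station code.
dict_station = {}

def station_name():
    url = 'https://kyfw.12306.cn/otn/resources/js/framework/station_name.js'
    resp = requests.get(url)
    if resp.status_code == 200:
        print('Station list fetched successfully!')
        # The file is a JS assignment; the string after '=' holds '@'-separated
        # records, and in each record field 1 is the name and field 2 the code.
        for each in resp.text.split('=')[1].split('@')[1:]:
            station = each.split('|')
            dict_station[station[1]] = station[2]
    else:
        print('Failed to fetch the station list!')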
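The parsing step can be tried offline. The sketch below applies the same split logic to a short made-up string that imitates the layout of station_name.js; only the names and codes for 北京 (BJP) and 上海 (SHH) come from the query above, the other fields are placeholders, and `parse_station_js` is a hypothetical helper name.

```python
def parse_station_js(text):
    """Build a {station name: station code} dict from station_name.js-style text."""
    # Everything after the first '=' is the quoted string of station records.
    payload = text.split('=', 1)[1]
    stations = {}
    # Records are separated by '@'; the piece before the first '@' is just a quote.
    for record in payload.split('@')[1:]:
        fields = record.split('|')
        if len(fields) >= 3:          # skip malformed records
            stations[fields[1]] = fields[2]
    return stations


# Illustrative sample only; the real file contains thousands of stations.
SAMPLE = "var station_names ='@bjp|北京|BJP|beijing|bj|0@sha|上海|SHH|shanghai|sh|1';"

stations = parse_station_js(SAMPLE)
print(stations)
```

Returning the dictionary instead of filling a module-level variable makes the function easy to test without a network connection.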

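3. Next, query the remaining tickets. The query itself is simple: put the parameters into the URL and send a GET request. The real work is picking out the fields we need from the response, such as whether tickets are available and how many remain.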

def left_ticket(from_s, to_s, date):
    url = 'https://kyfw.12306.cn/otn/leftTicket/queryX'
    # Query parameters; station names are translated to their station codes.
    params = {'leftTicketDTO.train_date': date,
              'leftTicketDTO.from_station': dict_station[from_s],
              'leftTicketDTO.to_station': dict_station[to_s],
              'purpose_codes': 'ADULT'}
    resp = requests.get(url, params=params)
    tickets = []
    if resp.status_code == 200:
        print('Remaining-ticket data fetched successfully!')
        # print(resp.text)
        # Each train record follows the marker '预订' and is '|'-delimited.
        for each in resp.text.split('预订')[1:]:
            fields = each.split('|')[2:-6]
            ticket = {}
            ticket['train'] = fields[0]
            ticket['departure'] = fields[5]
            ticket['arrival'] = fields[6]
            ticket['duration'] = fields[7]
            ticket['available'] = fields[8]
            ticket['business_class'] = fields[29]
            ticket['first_class'] = fields[28]
            ticket['second_class'] = fields[27]
            ticket['soft_sleeper'] = fields[20]
            ticket['hard_sleeper'] = fields[25]
            ticket['hard_seat'] = fields[26]
            ticket['no_seat'] = fields[23]
            tickets.append(ticket)
        for ticket in tickets:
            print(ticket)
    else:
        print('Failed to fetch remaining-ticket data!')
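Splitting the raw text on '预订' works, but it depends on that marker never appearing anywhere else. As a possible alternative, the sketch below decodes the body as JSON first. It assumes the train records sit in a list under `data` then `result`, which the article does not confirm, and it uses a fabricated record for the demonstration. The field positions are the article's offsets shifted by three, because the article's list starts at the train number.

```python
import json

# Position of each field in a raw '|'-delimited record. These are the article's
# offsets plus 3 (its sliced list starts at raw field 3, the train number).
FIELD_INDEX = {
    'train': 3, 'departure': 8, 'arrival': 9, 'duration': 10, 'available': 11,
    'soft_sleeper': 23, 'no_seat': 26, 'hard_sleeper': 28, 'hard_seat': 29,
    'second_class': 30, 'first_class': 31, 'business_class': 32,
}


def parse_rows(body):
    """Turn a response body into a list of ticket dicts (hypothetical helper).

    Assumes the JSON layout {"data": {"result": [record, ...]}}.
    """
    tickets = []
    for row in json.loads(body)['data']['result']:
        fields = row.split('|')
        tickets.append({name: fields[i] for name, i in FIELD_INDEX.items()})
    return tickets


# Fabricated record for illustration; the values are not real 12306 data.
raw = [''] * 36
raw[1] = '预订'
raw[3], raw[8], raw[9], raw[10], raw[11] = 'G101', '06:43', '12:40', '05:57', 'Y'
raw[30], raw[31], raw[32] = '有', '5', '无'
sample_body = json.dumps({'data': {'result': ['|'.join(raw)]}})

tickets = parse_rows(sample_body)
print(tickets[0])
```

With requests, `resp.json()` would replace the `json.loads` call; the standard-library version is used here so the example runs without a network.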

4. Finally, define a main function, and we're done!

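def main():
    # Station names must be typed in Chinese (e.g. 北京), matching the station list.
    from_s = input('Departure station: ')
    to_s = input('Destination station: ')
    date = input('Travel date: ')  # format: 2019-02-21
    station_name()
    left_ticket(from_s, to_s, date)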
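The main function passes user input straight through, so a mistyped station name raises a KeyError and a malformed date is sent to the server unchanged. A small sketch of checking both first might look like this; the helper names `is_valid_date` and `lookup_codes` are hypothetical and not part of the original script.

```python
from datetime import datetime


def is_valid_date(text):
    """Return True if text is a real calendar date in YYYY-MM-DD form."""
    try:
        datetime.strptime(text, '%Y-%m-%d')
        return True
    except ValueError:
        return False


def lookup_codes(stations, from_s, to_s):
    """Return ((from_code, to_code), []) or (None, [unknown station names])."""
    missing = [name for name in (from_s, to_s) if name not in stations]
    if missing:
        return None, missing
    return (stations[from_s], stations[to_s]), []


# Small stand-in for the dictionary filled by station_name().
demo_stations = {'北京': 'BJP', '上海': 'SHH'}

print(is_valid_date('2019-02-21'))
print(lookup_codes(demo_stations, '北京', '上海'))
print(lookup_codes(demo_stations, '北京', '广州'))
```

In main(), these checks would run after station_name() and before left_ticket(), printing a message and returning early when either one fails.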

The complete code:

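import requests

# Maps each station name (in Chinese, as 12306 lists it) to its station code.
dict_station = {}

def station_name():
    url = 'https://kyfw.12306.cn/otn/resources/js/framework/station_name.js'
    resp = requests.get(url)
    if resp.status_code == 200:
        print('Station list fetched successfully!')
        # The file is a JS assignment; the string after '=' holds '@'-separated
        # records, and in each record field 1 is the name and field 2 the code.
        for each in resp.text.split('=')[1].split('@')[1:]:
            station = each.split('|')
            dict_station[station[1]] = station[2]
    else:
        print('Failed to fetch the station list!')

def left_ticket(from_s, to_s, date):
    url = 'https://kyfw.12306.cn/otn/leftTicket/queryX'
    # Query parameters; station names are translated to their station codes.
    params = {'leftTicketDTO.train_date': date,
              'leftTicketDTO.from_station': dict_station[from_s],
              'leftTicketDTO.to_station': dict_station[to_s],
              'purpose_codes': 'ADULT'}
    resp = requests.get(url, params=params)
    tickets = []
    if resp.status_code == 200:
        print('Remaining-ticket data fetched successfully!')
        # Each train record follows the marker '预订' and is '|'-delimited.
        for each in resp.text.split('预订')[1:]:
            fields = each.split('|')[2:-6]
            ticket = {}
            ticket['train'] = fields[0]
            ticket['departure'] = fields[5]
            ticket['arrival'] = fields[6]
            ticket['duration'] = fields[7]
            ticket['available'] = fields[8]
            ticket['business_class'] = fields[29]
            ticket['first_class'] = fields[28]
            ticket['second_class'] = fields[27]
            ticket['soft_sleeper'] = fields[20]
            ticket['hard_sleeper'] = fields[25]
            ticket['hard_seat'] = fields[26]
            ticket['no_seat'] = fields[23]
            tickets.append(ticket)
        for ticket in tickets:
            print(ticket)
    else:
        print('Failed to fetch remaining-ticket data!')

def main():
    # Station names must be typed in Chinese (e.g. 北京), matching the station list.
    from_s = input('Departure station: ')
    to_s = input('Destination station: ')
    date = input('Travel date: ')  # format: 2019-02-21
    station_name()
    left_ticket(from_s, to_s, date)

if __name__ == '__main__':
    main()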

Result:
[Screenshot of the program output]

Reposted from blog.csdn.net/qq_36936510/article/details/87861350