[Python crawler tutorial] Using Python to crawl the KFC store list for a given city (with paging support)

This is a simple crawler written in Python. It fetches the store list published on KFC's official website, with support for keyword search and paging.

Let's take a look at the result first:

Enter the city to query: Beijing
Page 1 crawled successfully!
Page 2 crawled successfully!
Page 3 crawled successfully!
Page 4 crawled successfully!
Page 5 crawled successfully!
Page 6 crawled successfully!
Page 7 crawled successfully!
Page 8 crawled successfully!
Page 9 crawled successfully!
Page 10 crawled successfully!
Crawl finished

After the program starts, it prompts for the city to query. Once a city is entered, the data is fetched page by page and each page is saved to a local file.
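The page-by-page loop boils down to one stop condition: keep requesting pages until a page comes back with fewer records than the page size. This can be sketched in isolation (the `fetch_page` callable below is a stand-in for the real HTTP request, and the record dicts are made-up sample data):

```python
def crawl_all(fetch_page, page_size=10):
    """Collect records page by page until a short page signals the end."""
    records = []
    page = 1
    while True:
        batch = fetch_page(page, page_size)
        records.extend(batch)
        # A page shorter than page_size (possibly empty) means no more data.
        if len(batch) < page_size:
            break
        page += 1
    return records

# Stand-in data source: 23 fake records -> pages of 10, 10, 3.
data = [{'storeName': f'Store {i}'} for i in range(23)]
fake_fetch = lambda p, n: data[(p - 1) * n: p * n]
print(len(crawl_all(fake_fetch)))  # -> 23
```

The real crawler below follows the same pattern, except that it also writes each page to disk before deciding whether to continue.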

The code below depends on the requests module. If it is not installed yet, install it with:

pip3 install requests

import requests
import json

if __name__ == '__main__':
    url = 'http://www.kfc.com.cn/kfccda/ashx/GetStoreList.ashx?op=keyword'
    kw = input('Enter the city to query: ')
    page = 1
    pageSize = 10
    while True:
        params = {
            'cname': '',
            'pid': '',
            'keyword': kw,
            'pageIndex': page,
            'pageSize': pageSize
        }
        header = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
        }
        response = requests.post(url=url, data=params, headers=header)
        res = response.json()
        shopCount = len(res['Table1'])
        if shopCount > 0:
            fileName = kw + str(page) + '.json'
            # 'with' closes the file automatically once the page is written
            with open('./' + fileName, 'w', encoding='utf-8') as fileIndex:
                json.dump(res, fp=fileIndex, ensure_ascii=False)
            print('Page ' + str(page) + ' crawled successfully!')
            page = page + 1
        if shopCount < pageSize:
            print('Crawl finished')
            break
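Each page ends up in its own `<city><page>.json` file. To work with the results afterwards, a file can be loaded back with `json.load`. The snippet below writes a sample page the same way the crawler does and reads it back; the file name, store name, and the `addressDetail` field are made-up examples, though the `Table1` key matches what the crawler reads from the response:

```python
import json

# Write a sample page file the way the crawler does.
sample = {'Table1': [{'storeName': 'Qianmen', 'addressDetail': 'Beijing'}]}
with open('Beijing1.json', 'w', encoding='utf-8') as fp:
    json.dump(sample, fp=fp, ensure_ascii=False)

# Read it back for further processing.
with open('Beijing1.json', encoding='utf-8') as fp:
    page_data = json.load(fp)
print(page_data['Table1'][0]['storeName'])  # -> Qianmen
```

Because the files are written with `ensure_ascii=False`, non-ASCII city and store names are stored as readable UTF-8 rather than `\uXXXX` escapes.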

Origin blog.csdn.net/one_and_only4711/article/details/126086184