Example 1: Crawling dynamically loaded data with a Python crawler (beginner)

Target: the license list published by a drug administration portal (the site at 125.35.6.84:81 below). The page's data is loaded dynamically via Ajax POST requests, so instead of parsing the HTML we post to the JSON endpoints directly.

import requests
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'
}
fp = open('./yao.txt','w',encoding='utf-8')
# fp = open('./company_detail.txt','w',encoding='utf-8')

for page in range(1, 2):  # only page 1 here; widen the range to crawl more pages
    url = 'http://125.35.6.84:81/xk/itownet/portalAction.do?method=getXkzsList'
    data = {
        'on': 'true',
        'page': str(page),
        'pageSize': '15',
        'productName': '',
        'conditionType': '1',
        'applyname': '',
        'applysn': '',
    }
    # Parameters the site requires when requesting this page; it is a POST
    # request, so they are passed through the data argument
    
    data_dic = requests.post(url=url,data=data,headers=headers).json()
#     print(data_dic)
    for dic in data_dic['list']:
        _id = dic['ID']
        post_url = 'http://125.35.6.84:81/xk/itownet/portalAction.do?method=getXkzsById'
        post_data={'id':_id}
        detail_dic = requests.post(url=post_url,data=post_data,headers=headers).json()
               
        company_title = detail_dic['epsName']
        address = detail_dic['epsProductAddress']

        fp.write(company_title + ' : ' + address + '\n')
        print(company_title, 'crawled successfully!')
                                 
fp.close() 

Summary: when an error is raised, the statement reported is not necessarily the one at fault; it is often best to check the code above it as well.
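One common source of such misleading errors is a response JSON missing an expected key. A minimal sketch of the two parsing steps above, using hypothetical response dicts shaped like the site's JSON (field names `list`, `ID`, `epsName`, `epsProductAddress` taken from the code) and `.get()` guards so a missing field fails gracefully instead of raising a `KeyError` later:

```python
def extract_ids(list_response):
    """Pull the detail-request IDs out of a list-page response dict."""
    return [item['ID'] for item in list_response.get('list', [])]

def format_detail(detail_response):
    """Build the 'name : address' line that gets written to the file."""
    name = detail_response.get('epsName', 'unknown')
    address = detail_response.get('epsProductAddress', 'unknown')
    return name + ' : ' + address

# Hypothetical sample responses, for illustration only
sample_list = {'list': [{'ID': 'abc123'}, {'ID': 'def456'}]}
sample_detail = {'epsName': 'Some Co.', 'epsProductAddress': 'Beijing'}

print(extract_ids(sample_list))      # ['abc123', 'def456']
print(format_detail(sample_detail))  # Some Co. : Beijing
```

Keeping the extraction in small functions like this also makes it easy to test the parsing separately from the network requests.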

Origin www.cnblogs.com/doraemon548542/p/11955172.html