Crawling a list of idioms

Disclaimer: This is an original article by the blogger, licensed under CC 4.0 BY-SA. When reproducing it, please attach the original source link and this statement.
Article link: https://blog.csdn.net/lipachong/article/details/101694150

Building a homemade idiom list

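import json
import re

import requests
from fake_useragent import UserAgent

# Name of the JSONP callback that wraps the JSON payload (sent as the cb parameter).
CALLBACK = 'jQuery110204211961015271306_1569725005313'
headers = {'User-Agent': str(UserAgent().chrome)}

# Open the output file once instead of reopening it for every idiom.
with open("chengyu.txt", "a", encoding="utf-8") as f:
    # pn is stepped by 30 to match rn=30 in the query string.
    for page in range(0, 43381, 30):
        url = 'https://sp0.baidu.com/8aQDcjqpAAV3otqbppnN2DJv/api.php?resource_id=28204&from_mid=1&&format=json&ie=utf-8&oe=utf-8&query=成语大全&sort_key=&sort_type=1&stat0=&stat1=&stat2=&stat3=&pn={}&rn=30&cb={}&_=1569725005316'.format(page, CALLBACK)
        print(url)
        # A timeout keeps one stalled request from hanging the whole crawl.
        rq = requests.get(url, headers=headers, timeout=10)
        rq.encoding = 'utf8'
        data1 = rq.text
        # Keep only the text between the first "(" and the last ")", so that
        # parentheses inside the JSON itself are left untouched.
        data = data1[data1.index('(') + 1:data1.rindex(')')]
        jsonobj = json.loads(data)
        # Match against the dict's string form, where Unicode escapes are already decoded.
        str_json = re.findall("{'ename': '(.*?)',", str(jsonobj))
        print(str_json)
        for chengyu in str_json:
            f.write(chengyu + ",")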
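
The endpoint returns JSONP (JSON wrapped in a callback call) rather than plain JSON, so the wrapper has to be removed before parsing. The following is a minimal sketch of one way to do that and to collect the `ename` values from the parsed data instead of from its string form. The helper names `strip_jsonp` and `find_values` are hypothetical, and the sample payload is made up for illustration; the real response contains more fields and its exact layout is not shown in this article.

```python
import json
import re

# Matches callback_name( ... ) with an optional trailing semicolon.
_JSONP_RE = re.compile(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', re.DOTALL)


def strip_jsonp(text):
    """Return the JSON text inside a JSONP wrapper (hypothetical helper)."""
    match = _JSONP_RE.match(text)
    if match is None:
        raise ValueError('not a JSONP response')
    return match.group(1)


def find_values(node, key):
    """Yield every value stored under `key` anywhere in parsed JSON."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            else:
                yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)


# Made-up sample payload; the layout of the real response is an assumption.
sample = 'jQuery123({"data": [{"result": [{"ename": "一心一意 (a)"}, {"ename": "画蛇添足"}]}]});'
names = list(find_values(json.loads(strip_jsonp(sample)), 'ename'))
print(names)
```

Because the pattern is greedy, it stops at the last closing parenthesis, so parentheses inside the payload survive. Walking the parsed structure also avoids relying on `ename` being the first key in each entry.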
