Python crawler parsing multi-level nested json

The personal blog address https://nfreak-man.cn
has just been in contact with the Python crawler recently. It is just recently that pneumonia has spread throughout the country, so I plan to crawl real-time pneumonia data from the website and parse the data I want. The URL for obtaining json data is https://m.look.360.cn/events/feiyan

Crawl URL:

def main():
    url='https://m.look.360.cn/events/feiyan'
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) 		    AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36 QIHU 360EE'}
    parse_page(url)

Parse and save the json format file:

state = json.loads(response.content)
with open("feiyan.json", "w",encoding='utf-8') as f:
	f.write(json.dumps(state,indent=2,ensure_ascii=False))
    print("保存成功")

After obtaining the json file, it is found in a nested format. For the specific format, you can open the URL above to view:

{
 "data":[
 		{
 		 "data":{省份}
         "citys":[
         		  {城市}{}.....
         		 ]		 
 		}
 		{
 		 "data":{省份}
         "citys":[
         		  {城市}{}.....
         		 ]		 
 		}
 "country":[
 			{其他国家}
 		   ]	
 		]
}

json source code

Analytical data

Because I only want to get the names of the cities, the number of infections, the number of cures and the number of deaths, I need a nested loop to get and output each one. code show as below:

Provincial data acquisition:

provincename = state['data'][i]['data']['provinceName'] 
confirmedCount = state['data'][i]['data']['confirmedCount']
curedCount = state['data'][i]['data']['curedCount'] 
deadCount = state['data'][i]['data']['deadCount'] 

City data acquisition:

cityName = state['data'][i]['cities'][j]['cityName']
diagnosed = state['data'][i]['cities'][j]['diagnosed']
cured = state['data'][i]['cities'][j]['cured']
died = state['data'][i]['cities'][j]['died']

Data acquisition in foreign cities:

countryname = state['country'][c]['provinceName']
diagnosed = state['country'][c]['diagnosed']
cured = state['country'][c]['cured']
died = state['country'][c]['died']

The above is the nested json parsing format.

Output effect:
Output effect

Source address: github

Published 28 original articles · praised 0 · visits 722

Guess you like

Origin blog.csdn.net/William_GJIN/article/details/104627664