Python epidemic data acquisition and visual display

Use Python to fetch the epidemic data, use pyecharts to visualize it and draw daily-growth maps for China and abroad, and use matplotlib for additional charts. All of the code runs in a Jupyter notebook.

This post records what I have learned and is shared here for readers' reference. Author: Kitayama

A note before we start: this is not a new topic, so please go easy on it.

A note on CSDN: after the target web page changed, the original scraping code started throwing errors, and even the fixed version kept getting taken down. To get it published at all, I had to change some keywords and split the post into two parts.

Import related modules

import time
import json
import requests
from datetime import datetime
import pandas as pd
import numpy as np

1. Acquisition of epidemic data

The data is obtained from the page published by Tencent News.

For a static web page, we can simply send a GET request to the URL in the address bar and read the response. For a dynamic page, the key is to first analyze how the page loads its data and where it jumps, and then write the code against the underlying API.

Right-click → Inspect, open the Network tab, and press Ctrl+R to reload the page and capture its requests; the JSON endpoints used below show up among the captured requests.

Remember to install the pyecharts third-party library:

pip3 install -i https://pypi.tuna.tsinghua.edu.cn/simple pyecharts
# Define the data-fetching functions
def Domestic():
    url = 'https://api.inews.qq.com/newsqa/v1/query/inner/publish/modules/list?modules=statisGradeCityDetail,diseaseh5Shelf'
    response = requests.get(url=url).text
    data = json.loads(response)['data']['diseaseh5Shelf']
    return data


def Oversea():
    url = 'https://view.inews.qq.com/g2/getOnsInfo?name=disease_foreign'
    response = requests.get(url=url).json()
    data = json.loads(response['data'])
    return data


domestic = Domestic()
oversea = Oversea()

print(domestic.keys())
print(oversea.keys())
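The endpoints above are called with bare GET requests. If they start rejecting those or hanging, a defensive variant is sketched below; the User-Agent header and the 10-second timeout are my assumptions, not part of the original code.

def fetch_json(url):
    # Pretend to be a browser and fail fast instead of hanging
    headers = {'User-Agent': 'Mozilla/5.0'}
    response = requests.get(url=url, headers=headers, timeout=10)
    response.raise_for_status()  # raise on HTTP errors instead of parsing bad data
    return response.json()

Both Domestic() and Oversea() could call this helper instead of using requests.get directly.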

2. Preliminary Analysis

Extract the data details of each region

# Extract per-region data details
areaTree = domestic['areaTree']
# Inspect the data
areaTree

Extract foreign region data details

# Extract foreign-region data details
foreignList = oversea['foreignList']
# Inspect the data
foreignList

From the output you can see how the JSON data is structured.
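Only a few fields matter for what follows. Based on how they are accessed in the next section, the relevant structure looks roughly like this (a sketch of the payload, not the complete response):

# areaTree: list of countries; areaTree[0] is China, and its
# 'children' entries are the provinces:
#   {'name': '...', 'total': {'confirm': ..., 'heal': ..., 'dead': ...}, ...}
#
# foreignList: flat list of countries:
#   {'name': '...', 'nowConfirm': ..., 'confirm': ..., 'dead': ..., 'heal': ...}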

3. Data processing

3.1 Extraction of epidemic data in various provinces in China

# Address: https://beishan.blog.csdn.net/
china_data = areaTree[0]['children']
china_list = []
for a in range(len(china_data)):
    province = china_data[a]['name']             # province name
    confirm = china_data[a]['total']['confirm']  # cumulative confirmed
    heal = china_data[a]['total']['heal']        # cumulative healed
    dead = china_data[a]['total']['dead']        # cumulative deaths
    nowConfirm = confirm - heal - dead           # currently confirmed
    china_dict = {}
    china_dict['province'] = province
    china_dict['nowConfirm'] = nowConfirm
    china_list.append(china_dict)

china_data = pd.DataFrame(china_list)
china_data.to_excel("国内疫情.xlsx", index=False)  # save as an Excel file
china_data.head()
    province  nowConfirm
0   Hongkong         323
1   Shanghai          40
2    Sichuan          34
3     Taiwan          30
4  Guangdong          29
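The post's stated goal is a pyecharts map, so here is a minimal sketch of one built from china_data. It assumes the province names in the data match the region names of pyecharts' built-in 'china' map, and the color-scale ceiling of 400 is arbitrary.

from pyecharts import options as opts
from pyecharts.charts import Map

# Pair each province with its current confirmed count
pairs = [list(z) for z in zip(china_data["province"], china_data["nowConfirm"])]

china_map = (
    Map()
    .add("nowConfirm", pairs, "china")  # series name, data pairs, built-in map type
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Current confirmed cases by province"),
        visualmap_opts=opts.VisualMapOpts(max_=400),  # arbitrary color-scale ceiling
    )
)
china_map.render("china_map.html")  # writes an interactive HTML file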

3.2 Extraction of international epidemic data

world_data = foreignList
world_list = []

for a in range(len(world_data)):
    # Extract the fields
    country = world_data[a]['name']
    nowConfirm = world_data[a]['nowConfirm']
    confirm = world_data[a]['confirm']
    dead = world_data[a]['dead']
    heal = world_data[a]['heal']
    # Store them
    world_dict = {}
    world_dict['country'] = country
    world_dict['nowConfirm'] = nowConfirm
    world_dict['confirm'] = confirm
    world_dict['dead'] = dead
    world_dict['heal'] = heal
    world_list.append(world_dict)

world_data = pd.DataFrame(world_list)
world_data.to_excel("国外疫情.xlsx", index=False)  # save as an Excel file
world_data.head()
  country  nowConfirm   confirm    dead      heal
0    U.S.     7282611  30358880  552470  22523799
1   Spain      193976   3212332   72910   2945446
2  France     2166003   2405255   57671    181581
3    Peru      111940    422183   19408    290835
4    U.K.       90011    104145   13759       375
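The intro promises matplotlib charts for this data; a minimal sketch is a bar chart of the ten countries with the most current confirmed cases (the top-10 cut and the styling are my choices, not from the original post).

import matplotlib.pyplot as plt

# Ten countries with the most current confirmed cases
top10 = world_data.sort_values("nowConfirm", ascending=False).head(10)

plt.figure(figsize=(10, 5))
plt.bar(top10["country"], top10["nowConfirm"], color="steelblue")
plt.xticks(rotation=45, ha="right")  # tilt long country names
plt.ylabel("nowConfirm")
plt.title("Top 10 countries by current confirmed cases")
plt.tight_layout()
plt.show()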
