Use Python to obtain the data, use pyecharts to draw maps of the daily domestic and international growth, and use matplotlib for additional charts. All the code is written in a notebook.
This essay records what I have learned. This blog is where I keep my articles; they are posted here only for readers' reference. Author: Kitayama
Foreword
This is not a new topic, so please go easy on me.
Honestly, after CSDN changed its pages, the original crawling code started throwing errors, and even after I fixed it the post kept getting blocked. To get it published at all, I had to swap out some keywords and split it into two parts.
Import related modules
import time
import json
import requests
from datetime import datetime
import pandas as pd
import numpy as np
1. Acquisition of epidemic data
The data is obtained from the pages published by Tencent News.
For a static web page, simply passing the URL from the browser's address bar to a GET request is enough to retrieve the page's data. For a dynamic page, the key is to first analyze how the page fetches and routes its data, and only then write the crawling code.
Right-click the page and choose Inspect, switch to the Network tab, then press Ctrl+R to reload and capture the requests.
Remember to install the pyecharts third-party library first:
pip3 install -i https://pypi.tuna.tsinghua.edu.cn/simple pyecharts
# Define a function to fetch the domestic data
def Domestic():
    url = 'https://api.inews.qq.com/newsqa/v1/query/inner/publish/modules/list?modules=statisGradeCityDetail,diseaseh5Shelf'
    response = requests.get(url=url).text
    data = json.loads(response)['data']['diseaseh5Shelf']
    return data
# Define a function to fetch the overseas data
def Oversea():
    url = 'https://view.inews.qq.com/g2/getOnsInfo?name=disease_foreign'
    response = requests.get(url=url).json()
    data = json.loads(response['data'])
    return data
domestic = Domestic()
oversea = Oversea()
print(domestic.keys())
print(oversea.keys())
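One subtlety worth noting: the two endpoints wrap their payloads differently. The domestic endpoint returns nested JSON, while the overseas endpoint stores a JSON-encoded *string* inside its `data` field, which is why `Oversea()` needs an extra `json.loads`. A minimal sketch of a helper that handles both shapes (the name `extract_payload` is hypothetical, not part of the original post):

```python
import json

def extract_payload(raw):
    """Return the usable payload from a response dict.

    The domestic endpoint nests a dict under 'data', while the overseas
    endpoint stores a JSON-encoded string there; decode it if needed.
    """
    data = raw['data']
    return json.loads(data) if isinstance(data, str) else data

# Both response shapes yield a plain dict:
print(extract_payload({'data': {'confirm': 1}}))    # {'confirm': 1}
print(extract_payload({'data': '{"confirm": 1}'}))  # {'confirm': 1}
```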
2. Preliminary Analysis
Extract the data details of each region
# Extract the per-region data details
areaTree = domestic['areaTree']
# Inspect and analyze the concrete data
areaTree
Extract foreign region data details
# Extract the foreign-region data details
foreignList = oversea['foreignList']
# Inspect and analyze the concrete data
foreignList
From the output you can see how the JSON data is structured.
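When the structure is not obvious, pretty-printing a single element before writing any extraction code usually makes the nesting clear. A minimal sketch using a made-up fragment shaped like `foreignList` (field names follow the post; the numbers are illustrative only):

```python
import json

# One element shaped like an entry of foreignList (illustrative values)
sample = [
    {"name": "U.S.", "confirm": 30358880, "dead": 552470,
     "heal": 22523799, "nowConfirm": 7282611},
]

# indent=2 makes the nesting visible at a glance
print(json.dumps(sample[0], indent=2))
print(sorted(sample[0].keys()))
```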
3. Data processing
3.1 Extraction of epidemic data in various provinces in China
# Address: https://beishan.blog.csdn.net/
china_data = areaTree[0]['children']
china_list = []
for a in range(len(china_data)):
    province = china_data[a]['name']
    confirm = china_data[a]['total']['confirm']
    heal = china_data[a]['total']['heal']
    dead = china_data[a]['total']['dead']
    nowConfirm = confirm - heal - dead
    china_dict = {}
    china_dict['province'] = province
    china_dict['nowConfirm'] = nowConfirm
    china_list.append(china_dict)

china_data = pd.DataFrame(china_list)
china_data.to_excel("国内疫情.xlsx", index=False)  # save as an Excel file
china_data.head()
|   | province | nowConfirm |
|---|----------|------------|
| 0 | Hongkong | 323 |
| 1 | Shanghai | 40 |
| 2 | sichuan | 34 |
| 3 | Taiwan | 30 |
| 4 | Guangdong | 29 |
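The `nowConfirm` column above is derived in the loop as `confirm - heal - dead`. The same derivation can also be done column-wise on the whole DataFrame instead of row by row; a quick sketch with illustrative numbers:

```python
import pandas as pd

# Illustrative totals for two made-up provinces
df = pd.DataFrame({
    "province": ["A", "B"],
    "confirm": [1200, 800],
    "heal": [1100, 700],
    "dead": [50, 20],
})
# Vectorized version of the per-row subtraction in the loop above
df["nowConfirm"] = df["confirm"] - df["heal"] - df["dead"]
print(df["nowConfirm"].tolist())  # [50, 80]
```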
3.2 Extraction of international epidemic data
world_data = foreignList
world_list = []
for a in range(len(world_data)):
    # Extract the fields
    country = world_data[a]['name']
    nowConfirm = world_data[a]['nowConfirm']
    confirm = world_data[a]['confirm']
    dead = world_data[a]['dead']
    heal = world_data[a]['heal']
    # Store them
    world_dict = {}
    world_dict['country'] = country
    world_dict['nowConfirm'] = nowConfirm
    world_dict['confirm'] = confirm
    world_dict['dead'] = dead
    world_dict['heal'] = heal
    world_list.append(world_dict)

world_data = pd.DataFrame(world_list)
world_data.to_excel("国外疫情.xlsx", index=False)  # save as an Excel file
world_data.head()
|   | country | nowConfirm | confirm | dead | heal |
|---|---------|------------|---------|------|------|
| 0 | U.S. | 7282611 | 30358880 | 552470 | 22523799 |
| 1 | Spain | 193976 | 3212332 | 72910 | 2945446 |
| 2 | France | 2166003 | 2405255 | 57671 | 181581 |
| 3 | Peru | 111940 | 422183 | 19408 | 290835 |
| 4 | U.K. | 90011 | 104145 | 13759 | 375 |
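Unlike the domestic data, the foreign data already carries `nowConfirm`, and the rows are internally consistent with `confirm - heal - dead`. Before plotting, a common next step is to sort by active cases; a sketch using three rows copied from the table above:

```python
import pandas as pd

# Three rows copied from the table above
world_data = pd.DataFrame([
    {"country": "France", "nowConfirm": 2166003, "confirm": 2405255,
     "dead": 57671, "heal": 181581},
    {"country": "U.S.", "nowConfirm": 7282611, "confirm": 30358880,
     "dead": 552470, "heal": 22523799},
    {"country": "Spain", "nowConfirm": 193976, "confirm": 3212332,
     "dead": 72910, "heal": 2945446},
])

# Sanity check: active = confirmed - healed - dead for every row
assert (world_data["nowConfirm"] ==
        world_data["confirm"] - world_data["heal"] - world_data["dead"]).all()

# Sort by active cases, largest first
top = world_data.sort_values("nowConfirm", ascending=False).reset_index(drop=True)
print(top["country"].tolist())  # ['U.S.', 'France', 'Spain']
```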