Fetching and Visualizing Epidemic Data with Python

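This post uses Python to fetch the data and pyecharts to visualize it, drawing maps of the daily increase in cases both within China and worldwide; matplotlib is used to draw a square-block chart (方寸图). All of the code was written in a notebook.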

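These are informal notes on what I have learned. I use this blog to keep my write-ups, and they are published here for readers' reference only. Author: 北山啦

A note up front: this is no longer a fresh topic, so experts, please go easy on me.

I give up on CSDN: the web page changed, so the scraping code raised errors, and after I fixed it the post was killed off. To get it published I had to change some keywords and split it into two parts.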

Import the required modules

import time
import json
import requests
from datetime import datetime
import pandas as pd
import numpy as np

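1. Obtaining the epidemic data

The data comes from the pages published by Tencent News.

For a static page, passing the URL from the browser's address bar to a GET request is enough to retrieve the page's data. For a dynamic page, the key is to first work out how the page fetches its data and navigates, and only then write the code.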
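As a minimal sketch of the dynamic-page case: once the Network tab shows the request a page makes for its data, the same URL can be rebuilt in code and taken apart again for inspection. The helper name `build_query_url` is hypothetical; the base URL and module names are the ones used by the fetch function in this post.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_query_url(base, modules):
    """Join module names into the comma-separated `modules` query parameter."""
    # safe="," keeps the commas readable instead of percent-encoding them
    query = urlencode({"modules": ",".join(modules)}, safe=",")
    return f"{base}?{query}"


base = "https://api.inews.qq.com/newsqa/v1/query/inner/publish/modules/list"
url = build_query_url(base, ["statisGradeCityDetail", "diseaseh5Shelf"])
print(url)

# Going the other way: split a URL copied from the Network tab into its parts
parts = urlsplit(url)
print(parts.netloc, parse_qs(parts.query))
```

Building the URL from parts makes it easier to see which query parameters the page relies on, and to adjust them if the site changes.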

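Right-click the page and choose Inspect, open the Network tab, then press Ctrl+R to reload.

Remember to install the third-party library; using a mirror makes the download faster.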

pip3 install -i https://pypi.tuna.tsinghua.edu.cn/simple pyecharts

# Define the data-fetching functions
def Domestic():
    url = 'https://api.inews.qq.com/newsqa/v1/query/inner/publish/modules/list?modules=statisGradeCityDetail,diseaseh5Shelf'
    response = requests.get(url=url).text
    data = json.loads(response)['data']['diseaseh5Shelf']
    return data


def Oversea():
    url = 'https://view.inews.qq.com/g2/getOnsInfo?name=disease_foreign'
    response = requests.get(url=url).json()
    # The 'data' field is itself a JSON string, so it needs a second decode
    data = json.loads(response['data'])
    return data


domestic = Domestic()
oversea = Oversea()

print(domestic.keys())
print(oversea.keys())
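
The two functions above decode their responses differently: one reads `data` as an already-parsed object, the other has to decode `data` a second time because it arrives as a JSON string. A minimal sketch of a helper that handles both shapes follows. The name `decode_data` is hypothetical, and the payloads, including the `ret` field, are made up to mimic the two shapes; they are not real responses.

```python
import json


def decode_data(payload):
    """Return payload['data'] as a Python object, decoding it if it is a JSON string."""
    data = payload["data"]
    if isinstance(data, str):
        data = json.loads(data)
    return data


# Made-up payloads that only mimic the two response shapes
nested_object = {"ret": 0, "data": {"diseaseh5Shelf": {"areaTree": []}}}
nested_string = {"ret": 0, "data": json.dumps({"foreignList": []})}

print(decode_data(nested_object)["diseaseh5Shelf"].keys())
print(decode_data(nested_string).keys())
```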

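2. Preliminary analysis

Extract the per-region details for China

# Extract the per-region details
areaTree = domestic['areaTree']
# Inspect the data to understand its layout
areaTree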

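Extract the details for regions outside China

# Extract the details for regions outside China
foreignList = oversea['foreignList']
# Inspect the data to understand its layout
foreignList

The output shows how the data is structured inside the JSON.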
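
Printing the whole object can be hard to read when the data is large. As a sketch, a small recursive helper can list only the keys and value types. The function name `outline` is hypothetical, and the sample below is made up: it only mirrors the fields this post reads (`children`, `name`, `total`, `confirm`, `heal`, `dead`), with placeholder values.

```python
def outline(obj, depth=0, max_depth=3):
    """Return lines describing the keys and value types of nested JSON-like data."""
    lines = []
    if depth >= max_depth:
        return lines
    indent = "  " * depth
    if isinstance(obj, dict):
        for key, value in obj.items():
            lines.append(f"{indent}{key}: {type(value).__name__}")
            lines.extend(outline(value, depth + 1, max_depth))
    elif isinstance(obj, list) and obj:
        # Only the first element is inspected, assuming the list is homogeneous
        lines.append(f"{indent}[0]: {type(obj[0]).__name__} (of {len(obj)})")
        lines.extend(outline(obj[0], depth + 1, max_depth))
    return lines


# Made-up sample with placeholder values
sample = [{
    "name": "country",
    "children": [
        {"name": "province", "total": {"confirm": 0, "heal": 0, "dead": 0}},
    ],
}]
print("\n".join(outline(sample, max_depth=5)))
```

In the notebook, the same call could be made on `areaTree` or `foreignList` to get a compact view before deciding which fields to extract.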

3. Data processing

3.1 Extracting the data for each province of China

# Address: https://beishan.blog.csdn.net/
china_data = areaTree[0]['children']  # province-level entries under the first node
china_list = []
for item in china_data:
    province = item['name']
    confirm = item['total']['confirm']
    heal = item['total']['heal']
    dead = item['total']['dead']
    # Active cases = cumulative confirmed - recovered - deaths
    nowConfirm = confirm - heal - dead
    china_list.append({'province': province, 'nowConfirm': nowConfirm})

china_data = pd.DataFrame(china_list)
china_data.to_excel("国内疫情.xlsx", index=False)  # save as an Excel file
china_data.head()
  province  nowConfirm
0     香港         323
1     上海          40
2     四川          34
3     台湾          30
4     广东          29
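
The active-case figure here is derived rather than read from the data: cumulative confirmed minus recovered minus deaths. As a sketch, the same computation can be wrapped in a function and checked on hand-made input without calling the API. The function name `active_cases` is hypothetical and the figures are made up.

```python
def active_cases(records):
    """Build one {'province', 'nowConfirm'} row per province record."""
    rows = []
    for record in records:
        total = record["total"]
        rows.append({
            "province": record["name"],
            # Active cases = cumulative confirmed - recovered - deaths
            "nowConfirm": total["confirm"] - total["heal"] - total["dead"],
        })
    return rows


# Made-up figures, used only to exercise the function
sample = [
    {"name": "B", "total": {"confirm": 40, "heal": 40, "dead": 0}},
    {"name": "A", "total": {"confirm": 100, "heal": 60, "dead": 5}},
]
rows = sorted(active_cases(sample), key=lambda r: r["nowConfirm"], reverse=True)
print(rows)
```

The resulting list of dictionaries can be passed to `pd.DataFrame` exactly as `china_list` is.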

3.2 Extracting the international data

world_data = foreignList
world_list = []

for item in world_data:
    # Extract the fields of interest and store them as one row
    world_list.append({
        'country': item['name'],
        'nowConfirm': item['nowConfirm'],
        'confirm': item['confirm'],
        'dead': item['dead'],
        'heal': item['heal'],
    })

world_data = pd.DataFrame(world_list)
world_data.to_excel("国外疫情.xlsx", index=False)
world_data.head()
  country  nowConfirm   confirm    dead      heal
0      美国     7282611  30358880  552470  22523799
1     西班牙      193976   3212332   72910   2945446
2      法国     2166003   2405255   57671    181581
3      秘鲁      111940    422183   19408    290835
4      英国       90011    104145   13759       375
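
Unlike the domestic data, the international records already carry a `nowConfirm` field. A quick sanity check, sketched below, is to confirm that it agrees with the formula used for the provinces, confirmed minus deaths minus recovered. The rows are copied from the output above; the function name `inconsistent` is hypothetical.

```python
# (country, nowConfirm, confirm, dead, heal), copied from the output above
rows = [
    ("美国", 7282611, 30358880, 552470, 22523799),
    ("西班牙", 193976, 3212332, 72910, 2945446),
    ("法国", 2166003, 2405255, 57671, 181581),
    ("秘鲁", 111940, 422183, 19408, 290835),
    ("英国", 90011, 104145, 13759, 375),
]


def inconsistent(rows):
    """Return the countries whose nowConfirm differs from confirm - dead - heal."""
    return [
        country
        for country, now, confirm, dead, heal in rows
        if now != confirm - dead - heal
    ]


print(inconsistent(rows))  # prints []
```

All five rows satisfy the identity, so for these records the two data sets use the same definition of active cases.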


Reposted from blog.csdn.net/qq_45176548/article/details/127728522