COVID-19 Epidemic Visualization with PaddlePaddle

Job description:
Job 1: Install PaddlePaddle locally

Submit a screenshot showing that PaddlePaddle was installed locally and runs successfully, as shown below. If you hit installation problems, you can ask in the group at any time; refer to the PaddlePaddle official documentation: https://www.paddlepaddle.org.cn/documentation/docs/zh/install/index_cn.html

Job 2: COVID-19 epidemic visualization

Following what was covered in class, crawl the statistics published openly by DXY (丁香园) on March 31, then use pyecharts to draw the epidemic distribution according to the cumulative number of confirmed cases, as shown below, and submit a screenshot.

For the pyecharts API, refer to: https://pyecharts.org/#/zh-cn/

Important:

Be sure to upload the pictures correctly and submit a generated Notebook version of the job. Submission example: https://aistudio.baidu.com/aistudio/projectDetail/296022

Visualization uses computer graphics and image-processing techniques to convert data into images displayed on screen, together with the theory, methods, and techniques of interactive processing.

This exercise implements COVID-19 epidemic visualization based on the public statistics from DXY (丁香园), including epidemic maps, the growth trend of the epidemic, and epidemic distribution maps.

The national epidemic map looks like this:

The growth trend of the epidemic looks like this:

1. Data Preparation
The overall flow of a web request:

Normal user:
open a browser -> send a request to the target site -> receive the response data -> render the page.

Crawler:
simulate a browser -> send a request to the target site -> receive the response data -> extract the useful data -> save it to a local file / database.

Crawler workflow:

1. Send the request (requests module)

2. Get the response data (returned by the server)

3. Parse and extract the data (re regular expressions)

4. Save the data

requests module:

requests is an easy-to-use HTTP library implemented in Python; official documentation: http://cn.python-requests.org/zh_CN/latest/

re module:

Python's re module is used for string matching; many of its features are implemented on top of regular expressions.
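To see what the re.S flag changes in practice, here is a small self-contained sketch that mimics the kind of page the crawler parses; the HTML snippet is made up for illustration:

```python
import json
import re

# A made-up snippet shaped like the target page: a JS assignment followed by a
# catch block, with the JSON payload spanning multiple lines.
sample = """<script>
try { window.getAreaStat = [{"provinceShortName": "湖北",
"confirmedCount": 67802}]}catch(e){}
</script>"""

# Without re.S, '.' does not match newlines, so a payload that spans lines is missed.
print(re.search(r'window.getAreaStat = (.*?)}]}catch', sample) is None)   # True

# With re.S the regex treats the whole string as one block and matches across lines.
match = re.search(r'window.getAreaStat = (.*?)}]}catch', sample, re.S)
content = match.group().replace('window.getAreaStat = ', '').replace('}catch', '')
data = json.loads(content)
print(data[0]['provinceShortName'], data[0]['confirmedCount'])   # 湖北 67802
```

This is exactly the extract-then-strip pattern the crawl function uses on the real page.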

In [ ]
import json
import re
import requests
import datetime

today = datetime.date.today().strftime('%Y%m%d')   # e.g. '20200315'


def crawl_dxy_data():
    """
    Crawl the DXY (丁香园) real-time statistics and save them under the data
    directory as a JSON file named after the current date.
    """
    response = requests.get('https://ncov.dxy.cn/ncovh5/view/pneumonia')   # requests.get() requests the target site
    print(response.status_code)                                            # print the status code

    try:
        url_text = response.content.decode()                  # response.content.decode() is the recommended way to get the HTML page
        # re.search() scans the string for the first position where the pattern matches
        # and returns the corresponding match object. The string contains newlines (\n);
        # without re.S the match is attempted within each line separately, moving on when
        # a line fails, while with re.S the regex treats the whole string as a single
        # block and matches across lines.
        url_content = re.search(r'window.getAreaStat = (.*?)}]}catch',
                                url_text, re.S)
        texts = url_content.group()                           # the full text matched by the pattern
        content = texts.replace('window.getAreaStat = ', '').replace('}catch', '')   # strip the extra characters
        json_data = json.loads(content)
        with open('data/' + today + '.json', 'w', encoding='UTF-8') as f:
            json.dump(json_data, f, ensure_ascii=False)
    except:
        print('<Response [%s]>' % response.status_code)


def crawl_statistics_data():
    """
    Fetch the historical statistics of each province and save them under the
    data directory as a JSON file.
    """
    with open('data/' + today + '.json', 'r', encoding='UTF-8') as file:
        json_array = json.loads(file.read())

    statistics_data = {}
    for province in json_array:
        response = requests.get(province['statisticsData'])
        try:
            statistics_data[province['provinceShortName']] = json.loads(response.content.decode())['data']
        except:
            print('<Response [%s]> for url: [%s]' % (response.status_code, province['statisticsData']))

    with open("data/statistics_data.json", "w", encoding='UTF-8') as f:
        json.dump(statistics_data, f, ensure_ascii=False)


if __name__ == '__main__':
    crawl_dxy_data()
    crawl_statistics_data()
In [ ]
'''
Install the third-party library pyecharts; if the download breaks off, is slow, or fails, try the Tsinghua mirror.
'''
#!pip install pyecharts
!pip install -i https://pypi.tuna.tsinghua.edu.cn/simple pyecharts
2. Epidemic Maps
ECharts, an open-source data-visualization tool from Baidu, has won over many developers with its good interactivity and polished chart design. Python, in turn, is an expressive language well suited to data processing. When data analysis meets data visualization, pyecharts is born. The pyecharts API reference: https://pyecharts.org/#/zh-cn/chart_api

In pyecharts, everything is configured through options: "everything is Options".

Configuration items are divided into series configuration items and global configuration items.

(1) Series configuration items, set_series_opts(), configure the symbol style, text style, label style, line style, and so on;

(2) Global configuration items, set_global_opts(), configure the title, animation, axes, legend, and the like.

Let's start with the global configuration items.

2.1 National epidemic map
In [ ]
import json
import datetime
from pyecharts.charts import Map
from pyecharts import options as opts

# Read the raw data file
today = datetime.date.today().strftime('%Y%m%d')   # e.g. '20200315'
datafile = 'data/' + today + '.json'
with open(datafile, 'r', encoding='UTF-8') as file:
    json_array = json.loads(file.read())

# Analyse the national real-time confirmed data: the 'confirmedCount' field
china_data = []
for province in json_array:
    china_data.append((province['provinceShortName'], province['confirmedCount']))
china_data = sorted(china_data, key=lambda x: x[1], reverse=True)   # reverse=True sorts in descending order

print(china_data)

# National epidemic map
# Customise the range of each segment and the style of each segment.
pieces = [
    {'min': 10000, 'color': '#540d0d'},
    {'max': 9999, 'min': 1000, 'color': '#9c1414'},
    {'max': 999, 'min': 500, 'color': '#d92727'},
    {'max': 499, 'min': 100, 'color': '#ed3232'},
    {'max': 99, 'min': 10, 'color': '#f27777'},
    {'max': 9, 'min': 1, 'color': '#f7adad'},
    {'max': 0, 'color': '#f7e4e4'},
]
labels = [data[0] for data in china_data]
counts = [data[1] for data in china_data]

m = Map()
m.add("累计确诊", [list(z) for z in zip(labels, counts)], 'china')

# Series configuration items: symbol style, text style, label style, line style, etc.
m.set_series_opts(label_opts=opts.LabelOpts(font_size=12),
                  is_show=False)
# Global configuration items: title, animation, axes, legend, etc.
m.set_global_opts(title_opts=opts.TitleOpts(title='全国实时确诊数据',
                                            subtitle='数据来源：丁香园'),
                  legend_opts=opts.LegendOpts(is_show=False),
                  visualmap_opts=opts.VisualMapOpts(pieces=pieces,
                                                    is_piecewise=True,   # piecewise (segmented) mapping
                                                    is_show=True))       # show the visual-map control
# render() generates a local HTML file, by default render.html in the current directory;
# a path can also be passed, e.g. m.render("mycharts.html")
m.render(path='/home/aistudio/data/全国实时确诊数据.html')
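The pieces list drives the piecewise visual map: each province is coloured by the segment its count falls into. The bucketing itself can be sketched in plain Python; piece_color here is a hypothetical helper written for illustration, not a pyecharts API:

```python
# Same segment definitions as in the map cell above.
pieces = [
    {'min': 10000, 'color': '#540d0d'},
    {'max': 9999, 'min': 1000, 'color': '#9c1414'},
    {'max': 999, 'min': 500, 'color': '#d92727'},
    {'max': 499, 'min': 100, 'color': '#ed3232'},
    {'max': 99, 'min': 10, 'color': '#f27777'},
    {'max': 9, 'min': 1, 'color': '#f7adad'},
    {'max': 0, 'color': '#f7e4e4'},
]


def piece_color(count, pieces):
    """Return the colour of the first piece whose [min, max] range contains count."""
    for piece in pieces:
        if piece.get('min', float('-inf')) <= count <= piece.get('max', float('inf')):
            return piece['color']
    return None


print(piece_color(50003, pieces))   # '#540d0d' (counts of 10000+ get the darkest shade)
print(piece_color(672, pieces))     # '#d92727'
print(piece_color(0, pieces))       # '#f7e4e4'
```

A missing 'min' or 'max' key means the segment is open-ended on that side, which is why the first and last pieces omit one bound.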
2.2 Hubei epidemic map
In [ ]
import json
import datetime
from pyecharts.charts import Map
from pyecharts import options as opts

# Read the raw data file
today = datetime.date.today().strftime('%Y%m%d')   # e.g. '20200315'
datafile = 'data/' + today + '.json'
with open(datafile, 'r', encoding='UTF-8') as file:
    json_array = json.loads(file.read())

# Real-time confirmed data for Hubei Province
# Read the standard city names, used to normalise the abbreviated city names in the DXY data
with open('/home/aistudio/data/data24815/pycharts_city.txt', 'r', encoding='UTF-8') as f:
    defined_cities = [line.strip() for line in f.readlines()]


def format_city_name(name, defined_cities):
    for defined_city in defined_cities:
        if len((set(defined_city) & set(name))) == len(name):
            name = defined_city
            if name.endswith('市') or name.endswith('区') or name.endswith('县') or name.endswith('自治州'):
                return name
            return name + '市'
    return None


province_name = '湖北'
for province in json_array:
    if province['provinceName'] == province_name or province['provinceShortName'] == province_name:
        json_array_province = province['cities']
        hubei_data = [(format_city_name(city['cityName'], defined_cities), city['confirmedCount'])
                      for city in json_array_province]
        hubei_data = sorted(hubei_data, key=lambda x: x[1], reverse=True)

        print(hubei_data)

labels = [data[0] for data in hubei_data]
counts = [data[1] for data in hubei_data]
pieces = [
    {'min': 10000, 'color': '#540d0d'},
    {'max': 9999, 'min': 1000, 'color': '#9c1414'},
    {'max': 999, 'min': 500, 'color': '#d92727'},
    {'max': 499, 'min': 100, 'color': '#ed3232'},
    {'max': 99, 'min': 10, 'color': '#f27777'},
    {'max': 9, 'min': 1, 'color': '#f7adad'},
    {'max': 0, 'color': '#f7e4e4'},
]

m = Map()
m.add("累计确诊", [list(z) for z in zip(labels, counts)], '湖北')
m.set_series_opts(label_opts=opts.LabelOpts(font_size=12),
                  is_show=False)
m.set_global_opts(title_opts=opts.TitleOpts(title='湖北省实时确诊数据',
                                            subtitle='数据来源：丁香园'),
                  legend_opts=opts.LegendOpts(is_show=False),
                  visualmap_opts=opts.VisualMapOpts(pieces=pieces,
                                                    is_piecewise=True,
                                                    is_show=True))
m.render(path='/home/aistudio/data/湖北省实时确诊数据.html')
[('Wuhan', 50003), ('Xiaogan', 3518), ('Huanggang', 2907), ('Jingzhou', 1580), ('Ezhou', 1394), ('Suizhou', 1307), ('Xiangyang', 1175), ('Huangshi', 1015), ('Yichang', 931), ('Jingmen', 928), ('Xianning', 836), ('Shiyan', 672), ('Xiantao', 575), ('Tianmen', 496), ('Enshi Prefecture', 252), ('Qianjiang', 198), ('Shennongjia', 11)]
'/home/aistudio/data/湖北省实时确诊数据.html'
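The city-name normalisation relies on a character-set intersection: an abbreviated name such as 武汉 matches the standard entry 武汉市 because every character of the short name occurs in the standard one. A self-contained sketch, with a made-up standard city list standing in for pycharts_city.txt:

```python
def format_city_name(name, defined_cities):
    # If every character of the short name appears in a standard name, adopt the
    # standard name; append '市' when it lacks an administrative suffix.
    for defined_city in defined_cities:
        if len(set(defined_city) & set(name)) == len(name):
            name = defined_city
            if name.endswith('市') or name.endswith('区') or name.endswith('县') or name.endswith('自治州'):
                return name
            return name + '市'
    return None


# A made-up standard city list, for illustration only.
defined_cities = ['武汉市', '孝感市', '恩施土家族苗族自治州']

print(format_city_name('武汉', defined_cities))   # '武汉市'
print(format_city_name('恩施', defined_cities))   # '恩施土家族苗族自治州'
print(format_city_name('黄石', defined_cities))   # None (no matching standard name)
```

Note the length comparison only works as intended when the short name has no repeated characters, which holds for Chinese city abbreviations.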
3. Epidemic Growth Trend
In [ ]
import numpy as np
import json
from pyecharts.charts import Line
from pyecharts import options as opts

# Read the raw data file
datafile = 'data/statistics_data.json'
with open(datafile, 'r', encoding='UTF-8') as file:
    json_dict = json.loads(file.read())

# Analyse each province's newly confirmed cases since February 1: the 'confirmedIncr' field
statistics__data = {}
for province in json_dict:
    statistics__data[province] = []
    for da in json_dict[province]:
        if da['dateId'] >= 20200201:
            statistics__data[province].append(da['confirmedIncr'])

# Get the list of dates
dateId = [str(da['dateId'])[4:6] + '-' + str(da['dateId'])[6:8] for da in json_dict['湖北'] if
          da['dateId'] >= 20200201]

# National trend of new cases
all_statis = np.array([0] * len(dateId))
for province in statistics__data:
    all_statis = all_statis + np.array(statistics__data[province])
all_statis = all_statis.tolist()

# Hubei trend of new cases
hubei_statis = statistics__data['湖北']

# Trend of new cases outside Hubei
other_statis = [all_statis[i] - hubei_statis[i] for i in range(len(dateId))]

line = Line()
line.add_xaxis(dateId)
line.add_yaxis("全国新增确诊病例",                                            # legend
               all_statis,                                                    # data
               is_smooth=True,                                                # smooth curve
               linestyle_opts=opts.LineStyleOpts(width=4, color='#B44038'),   # line style options
               itemstyle_opts=opts.ItemStyleOpts(color='#B44038',             # symbol style options
                                                 border_color="#B44038",      # symbol border colour
                                                 border_width=10))            # symbol size
line.add_yaxis("湖北新增确诊病例", hubei_statis, is_smooth=True,
               linestyle_opts=opts.LineStyleOpts(width=2, color='#4E87ED'),
               label_opts=opts.LabelOpts(position='bottom'),                  # labels at the bottom of the line
               itemstyle_opts=opts.ItemStyleOpts(color='#4E87ED',
                                                 border_color="#4E87ED",
                                                 border_width=3))
line.add_yaxis("其他省份新增病例", other_statis, is_smooth=True,
               linestyle_opts=opts.LineStyleOpts(width=2, color='#F1A846'),
               label_opts=opts.LabelOpts(position='bottom'),                  # labels at the bottom of the line
               itemstyle_opts=opts.ItemStyleOpts(color='#F1A846',
                                                 border_color="#F1A846",
                                                 border_width=3))
line.set_global_opts(title_opts=opts.TitleOpts(title="新增确诊病例", subtitle='数据来源：丁香园'),
                     yaxis_opts=opts.AxisOpts(max_=16000, min_=1, type_="log",                  # axis options
                                              splitline_opts=opts.SplitLineOpts(is_show=True),  # split-line options
                                              axisline_opts=opts.AxisLineOpts(is_show=True)))   # axis-line options
line.render(path='/home/aistudio/data/新增确诊趋势图.html')
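The national and "other provinces" series are simple element-wise combinations of the per-province lists: a sum across provinces, then a difference against Hubei. With made-up numbers (the real cell does the sum with numpy arrays):

```python
# Made-up per-province daily new-case series (three days each), for illustration only.
statistics_data = {
    '湖北': [2345, 3156, 2987],
    '广东': [13, 10, 9],
    '河南': [11, 9, 8],
}

# National series: sum the provinces element-wise for each day.
all_statis = [sum(day) for day in zip(*statistics_data.values())]
print(all_statis)    # [2369, 3175, 3004]

# New cases outside Hubei: element-wise difference, as in other_statis above.
hubei_statis = statistics_data['湖北']
other_statis = [all_statis[i] - hubei_statis[i] for i in range(len(all_statis))]
print(other_statis)  # [24, 19, 17]
```

Plotting all three series on one log-scale axis, as the cell above does, keeps the much smaller non-Hubei numbers readable next to the national totals.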
Complete the job in the cells below
In [ ]
Please upload screenshots of your job here
In [ ]
Origin: blog.csdn.net/qq_35622355/article/details/105357640