Creating dynamic charts in Python to see global epidemic trends

Author | Liu to get up early to get up early

Editor | Tu Min


Foreword

The outbreak at home has recently eased, but the situation abroad is not optimistic. So how can we use Python to build dynamic charts of global epidemic trends? For example, the chart below shows how the epidemic has developed at home and abroad:

Or the global epidemic trend ⬇️

In fact, this is not difficult to implement in Python. In short, it takes three steps:

  • Get Data (requests)

  • Data cleaning (pandas)

  • Data visualization (pyecharts)

Let's walk through each of them!

Data acquisition and processing

Epidemic data is not hard to obtain: many websites currently provide it, such as Lilac Garden, Tencent News, and Baidu News. To save effort, we first looked on GitHub for a ready-made interface, and it is easy to find an API for the Lilac Garden data:

The next two lines fetch all of the historical data:

import requests

data = requests.get('https://lab.isaaclin.cn/nCoV/api/area?latest=0')
data = data.json()
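
A slightly more defensive version of the same request might look like the sketch below. It assumes the response is a JSON object with a `results` list (the key used later in this article). The names `http_get_json`, `extract_results`, and `fetch_records`, as well as the 30-second timeout, are illustrative choices and not part of the API.

```python
from typing import Any, Callable, Dict, List

API_URL = "https://lab.isaaclin.cn/nCoV/api/area?latest=0"


def http_get_json(url: str, timeout: float = 30.0) -> Dict[str, Any]:
    """Fetch a URL with requests and decode the JSON body."""
    import requests  # imported here so the parsing helper works without it

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()


def extract_results(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the list of records under the 'results' key, or raise."""
    results = payload.get("results")
    if not isinstance(results, list):
        raise ValueError("unexpected payload: no 'results' list")
    return results


def fetch_records(
    get_json: Callable[[str], Dict[str, Any]] = http_get_json,
) -> List[Dict[str, Any]]:
    """Download the payload with get_json and return its records."""
    return extract_results(get_json(API_URL))
```

Passing the fetch function in as a parameter makes it possible to try the parsing step on a saved or hand-written payload without touching the network.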

Look at the data:

Obviously this cannot be analyzed directly, so the next focus is cleaning this pile of data, which mainly involves two tasks:

  • Data wrangling: convert the raw data from JSON into a DataFrame that is convenient to analyze

  • Data cleaning: because of how the API collects data, the raw data contains many duplicate, invalid, and missing records, all of which need to be handled

First, look at the amount of data:

We can see that 7,584 records were collected in total, and there is a lot of dirty data, so this part involves a fair amount of work. We won't spend much space here explaining step by step how to extract the data we want; a separate article will cover the data processing. For now, let's go through the process with the code!

First, we extract all the data from the dictionary, convert the timestamps, and load the data into pandas:

import time

import pandas as pd
import requests

# Fetch the full history and load the records into a DataFrame
data = requests.get('https://lab.isaaclin.cn/nCoV/api/area?latest=0')
data = data.json()
res = data['results']
df = pd.DataFrame(res)


def time_c(timeNum):
    # Convert a millisecond timestamp to a local-time string
    timeTemp = float(timeNum / 1000)
    tupTime = time.localtime(timeTemp)
    standardTime = time.strftime("%Y-%m-%d %H:%M:%S", tupTime)
    return standardTime


# Column 16 is the timestamp column: convert it, then keep only "MM-DD"
for i in range(len(df)):
    df.iloc[i, 16] = time_c(df.iloc[i, 16])[5:10]
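
One caveat: time.localtime uses the time zone of the machine running the script, so the same timestamp can land on a different date on a server set to UTC. Below is a minimal sketch of a conversion that pins the zone explicitly. It assumes the timestamps are milliseconds since the epoch (as the division by 1000 above suggests) and that dates should be read in China Standard Time (UTC+8); `to_month_day` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone

# Assumption: dates are interpreted in China Standard Time (UTC+8, no DST)
CST = timezone(timedelta(hours=8))


def to_month_day(ms: int) -> str:
    """Convert a millisecond epoch timestamp to an 'MM-DD' string in UTC+8."""
    return datetime.fromtimestamp(ms / 1000, tz=CST).strftime("%m-%d")


# 1583107200000 ms is 2020-03-02 00:00:00 UTC, i.e. 08:00 on 2 March in UTC+8
print(to_month_day(1583107200000))  # prints 03-02
```

With a DataFrame, the same function could be applied to the whole timestamp column at once through the column's apply method instead of a row-by-row loop.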

Now the data looks like this:

This looks much more comfortable, but it still cannot be used: the API collects data several times a day, so there are many duplicate and abnormal records, which are the focus of the next step. For duplicates we keep only the latest record, and for empty values we chose to fill in the previous day's data.

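# Deduplication code (partial); df1 and date are defined in steps not shown here
tem = df1[df1['updateTime'] == '03-02']
tem = tem.drop_duplicates(['provinceShortName'], keep='last')
for i in date[1:41]:
    # Keep only the last record per province for each date
    tem1 = df1[df1['updateTime'] == i]
    tem1 = tem1.drop_duplicates(['provinceName'], keep='last')
    # DataFrame.append was removed in pandas 2.0; pd.concat replaces it
    tem = pd.concat([tem, tem1])

tem = tem.reset_index(drop=True)
tem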
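
The same idea can be written without an explicit loop. The sketch below uses a small made-up table, not the real data, and assumes the columns are named `updateTime`, `provinceName`, and `confirmedCount` (the last one is a guess at the case-count column). It keeps the last record per province per day, then fills days that have no record with the previous day's value.

```python
import pandas as pd

# Made-up sample: A has two records on 03-02, B has no record on 03-03
df1 = pd.DataFrame(
    {
        "updateTime": ["03-02", "03-02", "03-02", "03-03"],
        "provinceName": ["A", "A", "B", "A"],
        "confirmedCount": [10, 12, 5, 15],
    }
)

# Keep only the last record for each (date, province) pair
dedup = df1.drop_duplicates(subset=["updateTime", "provinceName"], keep="last")

# One row per date, one column per province; missing days become NaN
table = dedup.pivot(
    index="updateTime", columns="provinceName", values="confirmedCount"
)

# Sort by date, then fill a missing day with the previous day's value
table = table.sort_index().ffill()
print(table)
```

Sorting "MM-DD" strings only gives chronological order within a single year, which is enough for the date range used here.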

For reasons of space we won't post more code; here is the final processed data:

Data visualization

For data visualization we again use pyecharts, which we have covered many times before. The main component here is Timeline, a timeline carousel of charts: it simply generates one chart for each point in time and then plays them in sequence, a bit like the flip-book comics children draw by hand. The data therefore needs to be a time series. For details on usage and parameter tuning, see the follow-up article dedicated to visualization; here we go straight to the code. First, the epidemic trend at home and abroad:

from pyecharts.faker import Faker
from pyecharts import options as opts
from pyecharts.charts import Bar, Page, Pie, Timeline, Grid, Line


def timeline_bar() -> Timeline:
    # date, hs, c1 and c come from the data-processing steps not shown here
    tl = Timeline()
    tl.add_schema(is_auto_play=True, play_interval=500, is_loop_play=False)
    # Build one line chart per date; k is the index of the current frame
    for k, i in enumerate(date):
        line = (
            Line()
            .add_xaxis(date)
            .add_yaxis("Domestic", hs(c1, k))
            .add_yaxis("Overseas", hs(c, k))
            .extend_axis(yaxis=opts.AxisOpts())
            .set_series_opts(
                areastyle_opts=opts.AreaStyleOpts(opacity=0.5),
                label_opts=opts.LabelOpts(is_show=False),
            )
            .set_global_opts(
                title_opts=opts.TitleOpts(
                    title="{} epidemic trend at home and abroad".format(i)
                )
            )
        )
        tl.add(line, "{}".format(i))
    return tl


timeline_bar().render_notebook()
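
The helper `hs` and the variables `c1`, `c`, and `date` are defined in code the article does not show. One plausible reading, given that every frame is plotted against the full date axis, is that `hs(values, k)` returns the values up to frame `k` and leaves the later dates empty, so the line grows as the timeline plays. The following is a hypothetical stand-in under that assumption, with made-up numbers; the author's actual helper may differ.

```python
from typing import List, Optional, Sequence


def hs(values: Sequence[int], k: int) -> List[Optional[int]]:
    """Hypothetical helper: keep values[0..k], pad the rest with None."""
    shown = list(values[: k + 1])
    hidden = [None] * (len(values) - len(shown))
    return shown + hidden


# Made-up series for three dates; each call yields the data for one frame
series = [100, 150, 210]
frames = [hs(series, k) for k in range(len(series))]
print(frames)
```

Each inner list has the same length as the date axis, so every frame lines up with the same x positions and only the visible portion changes.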

As can be seen, domestic growth has leveled off, while cases abroad broke out suddenly at the end of February and are still rising, which is why the priority now is preventing imported cases. Let's look at the share of cases by country abroad (WeChat only allows GIF uploads of up to 5 MB, so only a short clip is shown):

As can be seen, with the recent sudden outbreaks in South Korea, Japan, and Italy, these three countries account for about 75% of the cases. Finally, let's look at the global trend of the epidemic!

Conclusion

At this point we have built a complete dynamic visualization of the epidemic data in Python. Looking back, the process involves no overly complicated steps; it mostly relies on basic pandas and pyecharts functions. If you want to see the full data-cleaning process, follow the author's upcoming technical articles; readers who just need the data to go straight to visualization can get it from the author's public account. Finally, the epidemic has not yet dissipated, so we must keep protecting ourselves! Go China!

【End】


Origin blog.csdn.net/csdnnews/article/details/104935652