Visualization of epidemic data based on Python (matplotlib, pyecharts dynamic map, large-screen visualization)

Epidemic data visualization based on Python


1. Project demand analysis

1.1 Background

In 2020, COVID-19 broke out worldwide and posed a serious threat to people's health and lives. The situation in different countries and regions attracted widespread attention, and monitoring and analyzing epidemic data became crucial to prevention and control. This report visualizes epidemic data to present how the global and domestic (China) situation developed and changed, helping readers understand the state of the epidemic more intuitively and comprehensively, and providing a reference for society, the government, and the public in epidemic prevention, control, and response.

1.2 Data sources

The data come from the World Health Organization (WHO) website and contain the number of COVID-19 cases and deaths worldwide as of a given date.
Field descriptions:
Date_reported: Reporting date
Country_code: Country/region abbreviation code
Country: Country/region name
WHO_region: WHO region
New_cases: Number of new confirmed cases
Cumulative_cases: Cumulative number of confirmed cases
New_deaths: Number of new deaths
Cumulative_deaths: Cumulative number of deaths
These variables provide basic statistics on the global COVID-19 outbreak: for each country and reporting date, the daily new and cumulative numbers of cases and deaths.
A sample of the data is shown in the figure:
[Figure: sample rows of the dataset]
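
As a minimal sketch of how a table with these fields could be loaded, the snippet below reads a tiny inline sample with pandas. The sample rows are placeholders rather than real figures, the helper name load_who_data is hypothetical, and filling missing counts with 0 is an assumption rather than something the original states.

```python
import io
import pandas as pd

# Tiny inline sample in the column layout described above.
# The numbers are placeholders, not real figures. To use real data,
# pass the path of the CSV downloaded from the WHO website instead.
SAMPLE = """Date_reported,Country_code,Country,WHO_region,New_cases,Cumulative_cases,New_deaths,Cumulative_deaths
2020-03-01,AA,Country A,EURO,10,10,0,0
2020-03-02,AA,Country A,EURO,15,25,1,1
2020-03-01,BB,Country B,WPRO,5,5,0,0
2020-03-02,BB,Country B,WPRO,,5,0,0
"""

def load_who_data(source):
    """Read the table, parse the dates, and fill missing counts with 0."""
    df = pd.read_csv(source, parse_dates=["Date_reported"])
    count_cols = ["New_cases", "Cumulative_cases", "New_deaths", "Cumulative_deaths"]
    # Assumption: a missing count means "nothing reported", so it becomes 0.
    df[count_cols] = df[count_cols].fillna(0).astype(int)
    return df

df = load_who_data(io.StringIO(SAMPLE))
print(df.dtypes)
print(df.groupby("Country")["New_cases"].sum())
```

With a real file, note that pandas treats the text NA as a missing value by default, which can affect a Country_code column that contains that code.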

1.3 Specific objectives

[Task 1] Use basic charts to show the epidemic situation in China
[Task 2] Use a line chart to show the trend of new confirmed cases in the five permanent members of the UN Security Council
[Task 3] Use an animation to show the top 10 countries by new confirmed cases each day
[Task 4] Use ring charts to show the epidemic situation on the seven continents
[Task 5] Use a dynamic map to show the number of new cases worldwide each month
[Task 6] Build a large-screen visualization dashboard for analysis

2. Overall design

2.1 The environment used in this experiment

(1) Windows 10
(2) PyCharm Community Edition 2022.2.1
(3) Python version: 3.8
(4) Anaconda and Jupyter Notebook

2.2 Visualization scheme

1. For tasks 1, 2, and 3, use the matplotlib library to display the epidemic data, mainly with column charts, line charts, bar charts, and ring charts.
2. For tasks 4, 5, and 6, use the pyecharts library, mainly with dynamic maps, pie charts, bar charts, and other basic charts, for the large-screen display.

3. Detailed design

Task 1: Use basic charts to show the epidemic situation in China

(1) The technology to be adopted
The bar-chart functions of matplotlib are used to show the epidemic situation in China. To make the data easier to read, two charts are drawn: a bar chart of the number of infections in each province, and a stacked column chart of the number of infections and deaths in China as a whole.
(2) Source program
[Images: source code screenshots]
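
Since the source program survives only as screenshots, the following is a rough sketch of the two charts rather than the author's code. It assumes a per-province table and a per-day national table; the column names and all numbers are placeholders chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless-safe backend; remove this line for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values for illustration only, not real figures.
provinces = pd.DataFrame({
    "province": ["Province A", "Province B", "Province C", "Province D"],
    "confirmed": [1000, 150, 120, 80],
})
daily = pd.DataFrame({
    "date": pd.date_range("2020-02-01", periods=5, freq="D"),
    "confirmed": [100, 140, 190, 250, 320],
    "deaths": [3, 4, 5, 6, 8],
})

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))

# Chart 1: one bar per province, largest value first.
provinces = provinces.sort_values("confirmed", ascending=False)
ax1.bar(provinces["province"], provinces["confirmed"], color="tab:blue")
ax1.set_title("Confirmed cases by province")
ax1.set_ylabel("Cases")

# Chart 2: stacked columns; `bottom` places deaths on top of confirmed cases.
labels = daily["date"].dt.strftime("%m-%d")
ax2.bar(labels, daily["confirmed"], label="Confirmed", color="tab:blue")
ax2.bar(labels, daily["deaths"], bottom=daily["confirmed"],
        label="Deaths", color="tab:orange")
ax2.set_title("Confirmed cases and deaths per day")
ax2.legend()

fig.tight_layout()
# To view the result, call fig.savefig("task1.png") or plt.show().
```

The stacking is done entirely by the bottom argument, so further series can be added by passing the running total of the series below them.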

Task 2: Use a line chart to show the trend of new confirmed cases in the five permanent members of the UN Security Council

(1) The technology to be adopted
The line-plotting functions of matplotlib are used to draw the trend of the epidemic in the five countries. To make the data easier to read, each country is drawn as a polyline in its own color, five colors in total.
(2) Source program
[Image: source code screenshot]
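
A possible shape for this step is sketched below: long-format data is pivoted so that each of the five countries becomes a column, and each column is plotted in its own color. The country spellings, the color choices, and the numbers are assumptions for illustration; the original code is not available as text.

```python
import matplotlib
matplotlib.use("Agg")  # headless-safe backend; remove this line for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Assumed spellings; the names in the real data file may differ.
COUNTRIES = ["China", "France", "Russia", "United Kingdom", "United States"]
COLORS = ["tab:red", "tab:blue", "tab:green", "tab:purple", "tab:orange"]

# Long-format placeholder data (one row per country and day), not real figures.
dates = pd.date_range("2020-03-01", periods=4, freq="D")
rows = [
    {"Date_reported": d, "Country": c, "New_cases": 10 * (i + 1) + 5 * j}
    for i, c in enumerate(COUNTRIES)
    for j, d in enumerate(dates)
]
long_df = pd.DataFrame(rows)

# Reshape to wide format: one column per country, indexed by date.
wide = long_df.pivot(index="Date_reported", columns="Country", values="New_cases")

fig, ax = plt.subplots(figsize=(9, 4))
for country, color in zip(COUNTRIES, COLORS):
    ax.plot(wide.index, wide[country], color=color, marker="o", label=country)
ax.set_xlabel("Date")
ax.set_ylabel("New confirmed cases")
ax.set_title("New confirmed cases in five countries")
ax.legend()
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```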

Task 3: Use an animation to show the top 10 countries by new confirmed cases each day

(1) The technology to be adopted
The code uses the pandas library to process and analyze the data, the matplotlib library to draw the charts, matplotlib's animation module to show how the data change over time, and the datetime library to handle dates and times. String formatting is used to render the values in the required text format.
(2) Source program
[Images: source code screenshots]
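
One way to build such an animation is matplotlib's FuncAnimation with one frame per day, redrawing a horizontal bar chart each time. The sketch below is an assumption about the approach, not the original code; the data are generated placeholders and the helper names top10_on and update are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless-safe backend; remove this line for an interactive window
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.animation import FuncAnimation

# Placeholder data: 12 hypothetical countries over 3 days, not real figures.
dates = pd.date_range("2020-03-20", periods=3, freq="D")
rows = [
    {"Date_reported": d, "Country": f"Country {k:02d}",
     "New_cases": (k * 37 + j * 53) % 200}
    for j, d in enumerate(dates)
    for k in range(12)
]
df = pd.DataFrame(rows)

def top10_on(day):
    """Return the ten countries with the most new cases on `day`, ascending."""
    one_day = df[df["Date_reported"] == day]
    return one_day.nlargest(10, "New_cases").sort_values("New_cases")

fig, ax = plt.subplots(figsize=(8, 5))

def update(day):
    """Redraw the horizontal bar chart for one frame (one day)."""
    ax.clear()
    top = top10_on(day)
    ax.barh(top["Country"], top["New_cases"], color="tab:blue")
    # Write each value next to its bar using string formatting.
    for y, value in enumerate(top["New_cases"]):
        ax.text(value, y, " {:d}".format(int(value)), va="center")
    ax.set_title("Top 10 by new confirmed cases, {:%Y-%m-%d}".format(day))
    ax.set_xlabel("New confirmed cases")

anim = FuncAnimation(fig, update, frames=list(dates), interval=800, repeat=False)
# To export, for example: anim.save("top10.gif", writer="pillow")
```

Sorting in ascending order before calling barh puts the largest bar at the top, because barh draws the first category at the bottom of the axis.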

Task 4: Use ring charts to show the epidemic situation on the seven continents

Subtask 1: Use ring charts to show the epidemic situation on the seven continents

(1) The technology to be adopted
The code uses the pandas library to process and analyze the data and the matplotlib library to draw the charts. Matplotlib subplots are combined with ring-shaped pie charts to show several ring charts side by side; dictionaries and lambda functions are used for data processing, and string formatting renders the values in the required text format.
(2) Source program
[Images: source code screenshots]
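
The combination of a dictionary lookup, a lambda, subplots, and ring-shaped pies might look roughly like the sketch below. The country-to-continent dictionary, the column names, the derived "active" share, and all numbers are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless-safe backend; remove this line for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical lookup table and placeholder counts, not real figures.
CONTINENT = {"Country A": "Asia", "Country B": "Asia",
             "Country C": "Europe", "Country D": "Africa"}
df = pd.DataFrame({
    "Country": ["Country A", "Country B", "Country C", "Country D"],
    "confirmed": [500, 300, 400, 100],
    "recovered": [300, 200, 100, 40],
    "deaths": [20, 12, 30, 5],
})

# Dictionary lookup through a lambda, then one row of totals per continent.
df["continent"] = df["Country"].map(lambda name: CONTINENT.get(name, "Other"))
totals = df.groupby("continent")[["confirmed", "recovered", "deaths"]].sum()
# Assumption: cases that are neither recovered nor dead are still active.
totals["active"] = totals["confirmed"] - totals["recovered"] - totals["deaths"]

fig, axes = plt.subplots(1, len(totals), figsize=(4 * len(totals), 4))
for ax, (continent, row) in zip(axes, totals.iterrows()):
    parts = row[["active", "recovered", "deaths"]]
    ax.pie(
        parts,
        labels=parts.index,
        autopct="%.1f%%",        # percentage labels via string formatting
        pctdistance=0.8,
        startangle=90,
        # A wedge width below 1 leaves a hole in the middle, giving a ring.
        wedgeprops={"width": 0.4, "edgecolor": "white"},
    )
    ax.set_title(continent)
fig.tight_layout()
```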

Subtask 2: Use a rose chart to show the ten countries with the highest death rates in the world

(1) The technology to be adopted
The pandas library is used to process and analyze the data, and the pyecharts library is used to draw a Nightingale rose chart. The sector properties are set through the set_series_opts method, and options provided by pyecharts control the ratio of the inner and outer radii, the center position of the chart, and the type of rose chart.
(2) Source program
[Images: source code screenshots]
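
A sketch of this subtask under stated assumptions: the death rate is taken to be cumulative deaths divided by cumulative cases, which the original does not spell out, and the helper names top_death_rates and build_rose_chart are hypothetical. The sample rows are placeholders, not real figures.

```python
import pandas as pd

def top_death_rates(latest, n=10, min_cases=1):
    """Return the `n` countries with the highest deaths/cases ratio, in percent."""
    valid = latest[latest["Cumulative_cases"] >= min_cases].copy()
    valid["death_rate"] = (
        valid["Cumulative_deaths"] / valid["Cumulative_cases"] * 100
    ).round(2)
    return valid.nlargest(n, "death_rate")[["Country", "death_rate"]]

def build_rose_chart(rates):
    """Build a Nightingale rose chart from the table returned above."""
    # Imported here so the data step also runs where pyecharts is not installed.
    from pyecharts import options as opts
    from pyecharts.charts import Pie

    # pyecharts serializes to JSON, so pass plain Python types.
    data_pair = [[str(c), float(r)]
                 for c, r in zip(rates["Country"], rates["death_rate"])]
    return (
        Pie()
        .add(
            "Death rate",
            data_pair,
            radius=["20%", "75%"],   # inner and outer radius
            center=["50%", "55%"],   # position of the chart center
            rosetype="radius",       # "radius" or "area"
        )
        .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}%"))
        .set_global_opts(title_opts=opts.TitleOpts(title="Top 10 countries by death rate"))
    )

# Placeholder snapshot (one latest row per country), not real figures.
latest = pd.DataFrame({
    "Country": ["Country A", "Country B", "Country C"],
    "Cumulative_cases": [1000, 200, 0],
    "Cumulative_deaths": [40, 10, 0],
})
rates = top_death_rates(latest)
print(rates)
# To produce the HTML file: build_rose_chart(rates).render("death_rate_rose.html")
```

The min_cases filter avoids dividing by zero; raising it would also keep countries with only a handful of cases from dominating the ranking.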

Task 5: Use a dynamic map to show the number of new cases worldwide each month

(1) The technology to be adopted
Two pyecharts chart types are used: Map, which shows the number of new cases in each country on a world map, and Timeline, which animates the data along a time axis.
In the code, the pandas library reads and processes the CSV file; the numpy, datetime, collections, and matplotlib libraries are also used for data processing and plotting; and the pyecharts library creates and configures the Map and Timeline charts and finally generates an interactive HTML page.
The implementation proceeds as follows: first read the CSV file and delete the columns that are not needed, then clean and process the data to obtain the number of new confirmed cases in each country for each month. Next, create the map and timeline charts with the Map and Timeline classes of pyecharts, add the data series with the add() method, and set the global parameters with the add_schema() and set_global_opts() methods. Finally, combine the monthly maps into one Timeline object to produce the dynamic chart.
(2) Source program
[Image: source code screenshot]
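
The steps described for this task can be sketched as follows. This is an illustration under assumptions, not the original program: the helper names monthly_new_cases and build_timeline are hypothetical, the sample rows are placeholders, and the play interval and color-scale maximum are arbitrary choices.

```python
import pandas as pd

def monthly_new_cases(df):
    """Sum New_cases per country and calendar month (long table)."""
    slim = df[["Date_reported", "Country", "New_cases"]].copy()  # drop unused columns
    slim["New_cases"] = slim["New_cases"].fillna(0)              # treat gaps as 0
    slim["month"] = slim["Date_reported"].dt.strftime("%Y-%m")
    return slim.groupby(["month", "Country"], as_index=False)["New_cases"].sum()

def build_timeline(monthly):
    """Create one world Map per month and combine them in a Timeline."""
    # Imported here so the data step also runs where pyecharts is not installed.
    from pyecharts import options as opts
    from pyecharts.charts import Map, Timeline

    timeline = Timeline()
    timeline.add_schema(is_auto_play=True, play_interval=1000)
    vmax = int(monthly["New_cases"].max())
    for month, group in monthly.groupby("month"):
        # pyecharts serializes to JSON, so pass plain Python types.
        data_pair = [[str(c), int(v)]
                     for c, v in zip(group["Country"], group["New_cases"])]
        world = (
            Map()
            .add("New cases", data_pair, maptype="world", is_map_symbol_show=False)
            .set_series_opts(label_opts=opts.LabelOpts(is_show=False))
            .set_global_opts(
                title_opts=opts.TitleOpts(title=f"New confirmed cases, {month}"),
                visualmap_opts=opts.VisualMapOpts(max_=vmax),
            )
        )
        timeline.add(world, month)
    return timeline

# Placeholder rows, not real figures.
sample = pd.DataFrame({
    "Date_reported": pd.to_datetime(["2020-01-30", "2020-01-31", "2020-02-01"] * 2),
    "Country": ["Country A"] * 3 + ["Country B"] * 3,
    "WHO_region": ["EURO"] * 3 + ["WPRO"] * 3,
    "New_cases": [1, 2, 3, 4, None, 6],
})
monthly = monthly_new_cases(sample)
print(monthly)
# To produce the HTML page: build_timeline(monthly).render("monthly_map.html")
```

Country names in the data may not match the names used by the world map, in which case a country stays blank until its name is mapped to the spelling the map expects.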

Task 6: Build a large-screen visualization dashboard for analysis

(1) The technology to be adopted
Different dimensions and features of the epidemic data are shown with several chart types, including bar charts, maps, pie charts, line charts, and funnel charts. First, three CSV files are read with the pandas library: the daily number of new confirmed cases, the cumulative number of confirmed cases, and the global epidemic data released by WHO. Then six charts are built with the pyecharts library.
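
The original does not show how the charts are placed on one screen. One option in pyecharts is the Page container, sketched below with only two panels. The helper names dashboard_data and build_dashboard, the choice of panels, and the sample rows are assumptions for illustration.

```python
import pandas as pd

def dashboard_data(df, top_n=5):
    """Prepare plain-Python inputs for two dashboard panels."""
    # Latest row per country gives the current cumulative totals.
    latest = df.sort_values("Date_reported").groupby("Country").tail(1)
    top = latest.nlargest(top_n, "Cumulative_cases")
    by_region = latest.groupby("WHO_region")["Cumulative_cases"].sum()
    return {
        "bar_x": top["Country"].tolist(),
        "bar_y": [int(v) for v in top["Cumulative_cases"]],
        "pie": [[str(r), int(v)] for r, v in by_region.items()],
    }

def build_dashboard(data):
    """Put a bar chart and a ring-shaped pie chart on one page."""
    # Imported here so the data step also runs where pyecharts is not installed.
    from pyecharts import options as opts
    from pyecharts.charts import Bar, Page, Pie

    bar = (
        Bar()
        .add_xaxis(data["bar_x"])
        .add_yaxis("Cumulative cases", data["bar_y"])
        .set_global_opts(title_opts=opts.TitleOpts(title="Cumulative confirmed cases"))
    )
    pie = (
        Pie()
        .add("", data["pie"], radius=["40%", "70%"])
        .set_global_opts(title_opts=opts.TitleOpts(title="Share by WHO region"))
    )
    page = Page(layout=Page.SimplePageLayout)
    page.add(bar, pie)
    return page

# Placeholder rows, not real figures.
sample = pd.DataFrame({
    "Date_reported": pd.to_datetime(
        ["2020-03-01", "2020-03-02", "2020-03-01", "2020-03-02", "2020-03-02"]),
    "Country": ["Country A", "Country A", "Country B", "Country B", "Country C"],
    "WHO_region": ["EURO", "EURO", "WPRO", "WPRO", "EURO"],
    "Cumulative_cases": [10, 25, 5, 9, 7],
})
data = dashboard_data(sample, top_n=2)
print(data)
# To produce the HTML page: build_dashboard(data).render("dashboard.html")
```

Keeping the data preparation separate from the chart code means each panel's input can be checked on its own before anything is rendered.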

4. Running results and result analysis

Task 1: Use basic charts to show the epidemic situation in China

(1) Running results, chart 1
[Figure: bar chart of infections by province]

(2) Result analysis, chart 1
The bar chart shows the number of infected people in each province on January 26. Hubei had a large number of infections that day, while all other provinces were below 200.
(1) Running results, chart 2
[Figure: stacked column chart of infections and deaths in February]

(2) Result analysis, chart 2
A stacked column chart shows the epidemic situation in China in February. The blue part is the number of infections and the orange part is the number of deaths. The number of infections grows day by day with a relatively large increase, while the number of deaths grows only slightly. This suggests that the spread of the epidemic in China was not well contained in February, but that the number of deaths was kept under control.

Task 2: Use a line chart to show the trend of new confirmed cases in the five permanent members of the UN Security Council

(1) Running results
[Figure: line chart of new confirmed cases in the five countries]

(2) Result analysis
The figure shows the trend of new confirmed cases in the five countries from January 22, 2020 to March 22, 2020. Over this period China had the largest increase in new cases. The other four countries grew more slowly at first, but their numbers began to rise rapidly after March 8, 2020.

Task 3: Use an animation to show the top 10 countries by new confirmed cases each day

(1) Running results
[Figures: two frames of the top 10 animation]

(2) Result analysis
The figures show the ten countries with the most new infections in the world on March 20, 2020 and on March 22, 2020. Comparing the two frames shows that the number of infections is gradually increasing.

Task 4: Use ring charts to show the epidemic situation on the seven continents

Subtask 1: Use ring charts to show the epidemic situation on the seven continents
(1) Running results
[Figure: ring charts for the seven continents]

(2) Result analysis
The figure shows ring charts of the epidemic situation for countries on the seven continents, with the new infection rate, recovery rate, and death rate. The recovery rate in Asian countries is higher than on the other continents.
Subtask 2: Use a rose chart to show the ten countries with the highest death rates
(1) Running results

(2) Result analysis
The figure shows the ten countries with the highest mortality rates in the world. The data are as of March 2020, and at that point the mortality rate in China is relatively high.

Task 5: Use a dynamic map to show the number of new cases worldwide each month

(1) Running results
[Figure: dynamic world map of monthly new cases]

(2) Result analysis
The dynamic map makes it easy to follow the trend of new confirmed COVID-19 cases in countries around the world throughout 2020. Because the map is displayed by month, it shows which countries had more or fewer new cases in each month, and whether the epidemic eased in countries with severe outbreaks. Hovering the mouse over a country shows its exact number of new confirmed cases.

Task 6: Build a large-screen visualization dashboard for analysis

(1) Running results
[Figure: large-screen dashboard]

(2) Result analysis
Bar chart of new confirmed cases (time axis): This chart shows the new confirmed cases from January 22, 2020 to late March 2020, with the daily figures summed over every three days. It shows how fast the virus was spreading and when transmission peaked.
Bar chart of cumulative confirmed cases, deaths, and recoveries: This chart shows the cumulative number of confirmed cases, deaths, and recoveries in different countries. Comparing the countries shows each one's share of the global epidemic.
Map of new cases by country (time axis): This chart shows the number of new cases worldwide each month since January 2, 2020. The map shows that new cases kept spreading around the world after the outbreak began, with a clear upward trend, and that the number of new cases in Europe and the United States was significantly higher than in Asia.
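
The three-day accumulation mentioned for the first chart can be done with pandas resampling. The snippet below is a small sketch with placeholder values; whether the original used resample or another grouping method is not known.

```python
import pandas as pd

# Placeholder daily counts indexed by date, not real figures.
daily = pd.Series(
    [1, 2, 3, 4, 5, 6, 7],
    index=pd.date_range("2020-01-22", periods=7, freq="D"),
    name="New_cases",
)

# Sum the daily values in consecutive three-day bins starting at the first date.
three_day = daily.resample("3D").sum()
print(three_day)
```

Each bin is labeled with its first day, and a final bin with fewer than three days still appears, so the last bar of such a chart can cover a shorter period than the others.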

5. Project conclusions and suggestions

Processing and analyzing the COVID-19 epidemic data leads to the following conclusions:
The COVID-19 data set was visualized with the Python visualization libraries matplotlib and pyecharts. Basic charts, line charts, animated charts, ring charts, and maps show intuitively the number of new confirmed cases in China, in the five permanent members of the UN Security Council, on the seven continents, and worldwide each month.
A visualization method should be chosen to fit the task. For example, a bar chart or stacked chart suits the epidemic situation in China, and a line chart clearly shows the trend of new confirmed cases in the five permanent members. An animated chart of the top 10 countries by daily new confirmed cases brings out the changes and differences in the numbers; ring charts make it possible to compare the seven continents at a glance; and a dynamic map of monthly new cases worldwide makes the spatial distribution of new confirmed cases across countries easy to see.

6. Reflections on the course project

When dealing with large-scale data sets, it is necessary to carefully think about what information to express, and choose a visualization method that meets the needs according to the type of information.
Pyecharts can be used to quickly create many types of graphics. Its graphics are beautiful and easy to operate. It is not only convenient for amateurs to analyze data, but also helps data scientists quickly demonstrate insights.
In the production process, pay attention to the principle of combining beauty and simplicity. If there is too much information on a chart, it can become visually confusing. So make sure the visual design is clear and easy to understand, paying attention to details like scaling, label size, etc.


Origin blog.csdn.net/weixin_48676558/article/details/131144715