Python data visualization summary

1. Introduction

Raw data can look dull to most people, but the right visualization tools can make it compelling. In this article, we work through practical examples that use data visualization tools to explore one dataset from several angles.

Without further ado, let's get started!

2. An example

Let's start by creating a dataset. Assume it contains the air quality index, carbon emissions, green space ratio, rainfall, and average temperature of Newport, a fictional city.

Year,AQI,Carbon_Emissions,Green_Space_Ratio,Rainfall,Temperature
2010,70,7.3,25.0,50,55
2011,72,7.5,25.5,47,57
2012,75,7.7,26.0,45,58
2013,77,7.9,26.5,44,58
2014,79,8.1,27.0,43,59
2015,80,8.3,27.5,42,60
2016,82,8.5,28.0,41,61
2017,85,8.7,28.5,40,62
2018,87,8.9,29.0,39,63
2019,90,9.1,29.5,38,64
2020,92,9.3,30.0,37,65

This dataset holds raw data on how various environmental factors in Newport change over time. From it we can chart the trends in the city's air quality, carbon emissions, green space ratio, rainfall, and average temperature over a decade, and finally display all of these charts together in a single dashboard.
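The later snippets read the table from a file named environment_data.csv. A minimal sketch of one way to create that file, using only the standard library; the helper name write_dataset is an assumption introduced here for illustration, and the values are copied from the table above:

```python
import csv
import io
from pathlib import Path

# The table from this section, copied verbatim.
CSV_TEXT = """\
Year,AQI,Carbon_Emissions,Green_Space_Ratio,Rainfall,Temperature
2010,70,7.3,25.0,50,55
2011,72,7.5,25.5,47,57
2012,75,7.7,26.0,45,58
2013,77,7.9,26.5,44,58
2014,79,8.1,27.0,43,59
2015,80,8.3,27.5,42,60
2016,82,8.5,28.0,41,61
2017,85,8.7,28.5,40,62
2018,87,8.9,29.0,39,63
2019,90,9.1,29.5,38,64
2020,92,9.3,30.0,37,65
"""


def write_dataset(path="environment_data.csv"):
    """Write the table to disk and return the path (hypothetical helper)."""
    target = Path(path)
    target.write_text(CSV_TEXT, encoding="utf-8")
    return target


# Parse the text in memory as a sanity check before saving it.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
print(len(rows), "rows, columns:", list(rows[0].keys()))

if __name__ == "__main__":
    write_dataset()
```

Running the script once in the working directory should leave a file that pd.read_csv('environment_data.csv') can load.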

3. Know your audience

Understanding your audience is key to presenting your data effectively. Let's imagine that our audience is a group of environmental policy makers. They are interested in how environmental factors change over time, so we need to present our data in a way that highlights these trends.

For our first visualization, let's create a line chart showing how the air quality index (AQI) of Newport (the fictional city in this example) has changed over the years. Line charts are an excellent choice for showing trends over time and are easily understood by a wide audience.

import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Load the data
df = pd.read_csv('environment_data.csv')

# Create a line chart of AQI over the years
fig1 = go.Figure()
fig1.add_trace(go.Scatter(x=df['Year'], y=df['AQI'], mode='lines', name='AQI', line=dict(color='red')))
fig1.update_layout(title='Newport - Air Quality Index Over Time', xaxis_title='Year', yaxis_title='Air Quality Index (AQI)')
fig1.show()

Running the code produces the following chart:
[Figure: line chart of Newport's AQI by year]
The chart plots the air quality index for each year and highlights how it has changed over time. It is a simple graph, but an effective one: the AQI rises every year, from 70 in 2010 to 92 in 2020, which indicates that pollution is steadily getting worse.

4. Use the right visualization

Different types of visualizations serve different purposes. For our second visualization, using the same libraries and CSV file loaded in section 3, let's create a scatterplot to show the relationship between carbon emissions and the Air Quality Index (AQI).

# Create a scatter plot of Carbon Emissions vs AQI
fig2 = go.Figure()
fig2.add_trace(go.Scatter(x=df['Carbon_Emissions'], y=df['AQI'], mode='markers', name='Carbon Emissions vs AQI', marker=dict(color='red')))
fig2.update_layout(title='Newport - Carbon Emissions vs Air Quality Index', xaxis_title='Carbon Emissions (million metric tons)', yaxis_title='Air Quality Index (AQI)')
fig2.show()

A scatterplot lets us check whether carbon emissions and air quality may be correlated, which gives policy makers valuable input for their decisions. Running the code produces the following chart:
[Figure: scatterplot of carbon emissions against AQI]
In this example, the points fall close to a rising straight line, so the two variables are strongly positively correlated. Correlation alone does not prove that one causes the other, but the pattern is consistent with a causal link and is worth investigating further.

5. Highlight important points

Our third visualization is a bar chart showing how the green space ratio has changed over the years. This can highlight the effect of urban planning and development policies on greening. Using the same libraries and CSV file loaded in section 3, the bar chart code is as follows:

# Create a bar chart of Green Space Ratio over the years
fig3 = go.Figure()
fig3.add_trace(go.Bar(x=df['Year'], y=df['Green_Space_Ratio'], name='Green Space Ratio', marker=dict(color='green')))
fig3.update_layout(title='Newport - Green Space Ratio Over Time', xaxis_title='Year', yaxis_title='Green Space Ratio (%)')
fig3.show()

Here's the result:
[Figure: bar chart of green space ratio by year]
This bar chart highlights the steady rise in the green space ratio over the years, an important point for policy makers interested in urban sustainability.
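If one particular bar should stand out, a common approach is to give it an accent color and mute the others. The sketch below only builds the color list; the color names and the choice of the latest year as the point of interest are assumptions made for illustration:

```python
# Values copied from the Year and Green_Space_Ratio columns of the dataset.
years = list(range(2010, 2021))
green = [25.0, 25.5, 26.0, 26.5, 27.0, 27.5, 28.0, 28.5, 29.0, 29.5, 30.0]

# Hypothetical palette: one accent color for the bar to emphasize,
# a muted color for every other bar.
ACCENT, MUTED = "green", "lightgray"
highlight_year = max(years)  # assumption: the latest year is the key point

colors = [ACCENT if year == highlight_year else MUTED for year in years]
print(list(zip(years, colors)))
```

Passing marker=dict(color=colors) to go.Bar in place of the single 'green' value would apply one color per bar.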

6. Tell stories with data

For our final visualization, we'll create an area chart showing how average temperature and rainfall have changed over the years. This provides insight into how climate change may be affecting Newport. Using the same libraries and CSV file loaded in section 3, the area chart code is as follows:

# Create an overlaid area chart of average temperature and rainfall over the years,
# with temperature on the left y-axis and rainfall on the right y-axis
fig4 = make_subplots(specs=[[{"secondary_y": True}]])
fig4.add_trace(go.Scatter(x=df['Year'], y=df['Temperature'], mode='lines', name='Temperature', stackgroup='one'), secondary_y=False)
fig4.add_trace(go.Scatter(x=df['Year'], y=df['Rainfall'], mode='lines', name='Rainfall', stackgroup='one'), secondary_y=True)
fig4.update_layout(title='Newport - Temperature and Rainfall Over Time', xaxis_title='Year')
# Label each y-axis separately so the units are not mixed on one axis
fig4.update_yaxes(title_text='Temperature (°F)', secondary_y=False)
fig4.update_yaxes(title_text='Rainfall (inches)', secondary_y=True)
fig4.show()

This overlaid area chart shows how the two factors vary over time, which lets us spot underlying correlations and trends.
[Figure: area chart of temperature and rainfall by year]
In this case, the data tell a very clear story of how temperature and rainfall changed together over the years: temperature rises over time while rainfall falls.

7. Graphical visualization dashboard

Now that we have all the visualizations, let's combine them into a single dashboard using Dash, a Python library.

import dash
import pandas as pd
from dash import dcc
from dash import html

# Load the data
df = pd.read_csv('environment_data.csv')

# Put all of the chart code here (fig1, fig2, fig3, fig4) and remove ALL show() calls

app = dash.Dash(__name__)
app.layout = html.Div([
    html.H1('Newport Environmental Data Dashboard'),

    # First row: AQI line chart and carbon emissions scatterplot
    html.Div([
        dcc.Graph(figure=fig1),
        dcc.Graph(figure=fig2),
    ], style={'display': 'flex'}),

    # Second row: green space bar chart and temperature/rainfall area chart
    html.Div([
        dcc.Graph(figure=fig3),
        dcc.Graph(figure=fig4),
    ], style={'display': 'flex'}),
])

if __name__ == '__main__':
    # Recent Dash releases use app.run(); older releases used app.run_server()
    app.run(debug=True)

This code uses Dash, a Python web framework for building analytics applications, to create a dashboard with four visualizations, arranged in two rows with two charts in each row. The result looks like this:
[Figure: dashboard showing the four charts in a two-by-two grid]

With very little code, we have built four useful, eye-catching visualizations from a simple dataset and gathered them in one easy-to-access dashboard.

8. Summary

Data visualization tools can produce compelling results, but the tools are only part of it. Good visualization also depends on understanding what lies behind the data, understanding the audience, and continually collecting and acting on feedback to improve the result. This article walked step by step through a concrete dataset to show how to build eye-catching visualizations, with code for each step.

Origin blog.csdn.net/sgzqc/article/details/130784147