Pandas Data Visualization


Joyful Pandas

Datawhale community: Joyful Pandas

Basic plotting

One-dimensional data

  • Numeric
    • Histogram plt.hist()
    • Box plot plt.boxplot()
    • Line chart plt.plot() # ordered numeric data
  • Categorical
    • Bar chart plt.bar()
    • Pie chart plt.pie()
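
As a rough illustration of the five calls listed above, here is a minimal sketch. The arrays `values`, `labels`, and `counts` are made-up sample data, not part of the Titanic dataset used later.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical sample data, made up purely for illustration.
rng = np.random.default_rng(0)
values = rng.normal(loc=30, scale=10, size=200)  # numeric data
labels = ["A", "B", "C"]                         # category names
counts = [12, 7, 5]                              # frequency of each category

fig, axes = plt.subplots(1, 5, figsize=(18, 3))
axes[0].hist(values, bins=20)        # histogram: distribution of numeric data
axes[0].set_title("hist")
axes[1].boxplot(values)              # box plot: median, quartiles, outliers
axes[1].set_title("boxplot")
axes[2].plot(np.sort(values))        # line chart: ordered numeric data
axes[2].set_title("plot")
axes[3].bar(labels, counts)          # bar chart: one bar per category
axes[3].set_title("bar")
axes[4].pie(counts, labels=labels)   # pie chart: share of each category
axes[4].set_title("pie")
fig.tight_layout()
```

The numeric plots take the raw values, while the categorical plots take one count per category.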

Hands-on Data Analysis

Datawhale community: Hands-on Data Analysis

2 Chapter 2: Data Visualization

Before starting, import the numpy, pandas, and matplotlib packages and load the data.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Load the data from result.csv
df = pd.read_csv('./result.csv')
df.head()
   Unnamed: 0  PassengerId  Survived  Pclass  Name                                               Sex     Age   SibSp  Parch  Ticket            Fare     Cabin  Embarked
0  0           1            0         3       Braund, Mr. Owen Harris                            male    22.0  1.0    0.0    A/5 21171         7.2500   NaN    S
1  1           2            1         1       Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0  1.0    0.0    PC 17599          71.2833  C85    C
2  2           3            1         3       Heikkinen, Miss. Laina                             female  26.0  0.0    0.0    STON/O2. 3101282  7.9250   NaN    S
3  3           4            1         1       Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0  1.0    0.0    113803            53.1000  C123   S
4  4           5            0         3       Allen, Mr. William Henry                           male    35.0  0.0    0.0    373450            8.0500   NaN    S

2.7 How can you make your data understandable at a glance?

Reference: "Python for Data Analysis", Chapter 9

2.7.1 Task 1

Work through Chapter 9 of the book to get familiar with matplotlib, then create a small dataset of your own and produce some basic visualizations of it.

[Thinking] What are the most basic chart types, and which scenarios is each one suited to? (For example, a line chart is suitable for showing the trend of an attribute's value over time.)

The reference material for this part comes from fantastic-matplotlib, an open-source tutorial from Datawhale.

matplotlib provides two commonly used plotting interfaces:

  • Explicitly create a figure and axes and call plotting methods on them, also known as the OO (object-oriented) style

  • Rely on pyplot to create and manage the figure and axes automatically, and plot through pyplot functions

fig, ax = plt.subplots()
ax.plot([1,2,3,4], [1,4,2,3])
plt.show()


plt.plot([1,2,3,4], [1,4,2,3]);


When you use matplotlib in a Jupyter notebook, a line such as [<matplotlib.lines.Line2D at 0x23155916dc0>] is often printed after the code runs. This happens because the notebook displays the value of the last expression in a cell, and plotting calls such as plt.plot() return the objects they create. If you don't want this line displayed, there are three options:

  • Add a semicolon at the end of the last line of the code block;

  • Add plt.show() as the last line of the code block;

  • Explicitly assign the return value of the plotting call to a variable, for example change plt.plot([1, 2, 3, 4]) to line = plt.plot([1, 2, 3, 4])
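
The third option can be sketched as follows. The variable name `lines` is just an illustrative choice; the point is that plt.plot() returns a list of Line2D objects, and that list is what the notebook would otherwise echo.

```python
import matplotlib.pyplot as plt

# Assigning the return value means the cell no longer ends in a bare
# expression, so the notebook has nothing to echo.
lines = plt.plot([1, 2, 3, 4])

# plt.plot() returns a list with one Line2D object per plotted line.
first_line = lines[0]
print(type(first_line).__name__)
```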


2.7.2 Task 2

Visualize the distribution of survivors among men and women in the Titanic dataset (try a bar chart).

# Code
sex_survived = df.groupby('Sex')['Survived'].sum()
_ = plt.bar(sex_survived.index, sex_survived.values)


[Thinking] Count the number of deaths among men and women in the Titanic dataset and visualize it. How would you combine this with the bar chart of survivors by sex? Look at your visualization and describe your first impression (for example: "you can see at a glance that more men survived, so sex may affect the survival rate").

# Answer to the thinking question
Far more women survived than men.
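
One possible way to approach the thinking question is to compute deaths as group size minus survivors and draw both counts side by side. This is only a sketch: the small `df` below is a made-up stand-in so the snippet runs on its own, and with the real data you would use the `df` loaded from result.csv.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for df; replace with the DataFrame read from result.csv.
df = pd.DataFrame({
    "Sex":      ["male", "male", "male", "female", "female", "female", "male", "female"],
    "Survived": [0, 0, 1, 1, 1, 0, 0, 1],
})

# Survivors: sum of the 0/1 flag. Deaths: group size minus survivors.
survived = df.groupby("Sex")["Survived"].sum()
died = df.groupby("Sex")["Survived"].count() - survived

# One column per outcome, one row per sex.
counts = pd.DataFrame({"died": died, "survived": survived})
ax = counts.plot(kind="bar", rot=0)  # grouped bars: one pair per sex
ax.set_ylabel("Number of passengers")
```

Putting both outcomes in one DataFrame lets pandas draw them as grouped bars, which makes the two sexes directly comparable.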

2.7.3 Task 3

Visualize the proportion of survivors and deaths among men and women in the Titanic dataset (try a bar chart).

# Code
# Hint: count survivors and deaths by sex; 1 means survived, 0 means died
radio_ss = df.groupby(['Sex','Survived'])['Survived'].count().unstack()
radio_ss
Survived    0    1
Sex
female     81  233
male      468  109

Index pivoting: converting a row index level into a column index level.

  • unstack(): by default, moves the innermost row index level so that it becomes the innermost column index level

radio_ss.plot(kind = 'bar', stacked = True);


[Tip] With men and women as the two groups on the horizontal axis, show the numbers of survivors and deaths as proportions within each bar.
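
The stacked chart above shows raw counts. To show true proportions, one option is to divide each row by its total before plotting. The sketch below rebuilds the count table from the numbers printed for radio_ss so that it runs on its own; the names `counts` and `proportions` are illustrative.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Counts copied from the radio_ss table above (columns: 0 = died, 1 = survived).
counts = pd.DataFrame(
    {0: [81, 468], 1: [233, 109]},
    index=pd.Index(["female", "male"], name="Sex"),
)
counts.columns.name = "Survived"

# Divide every row by its row total so each bar sums to 1.
proportions = counts.div(counts.sum(axis=1), axis=0)

ax = proportions.plot(kind="bar", stacked=True, rot=0)
ax.set_ylabel("Share of passengers")
```

With the real data, the same `div` call can be applied directly to radio_ss.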

2.7.4 Task 4

Visualize the distribution of the numbers of survivors and deaths across different fares in the Titanic dataset. (Try a line chart, with the fare on the horizontal axis and the number of survivors on the vertical axis.)

[Tip] For count data like this drawn as a line, consider plotting it both sorted and unsorted, and see what you can find.

# Code
# Count survivors and deaths for each fare; 1 means survived, 0 means died
df.groupby(['Fare','Survived'])['Survived'].count().unstack().plot(kind = 'line');
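
The tip about sorted and unsorted data can be read in more than one way. One interpretation, sketched below, is to compare the line ordered by fare with the same counts reordered by number of survivors. The small `df` is a made-up stand-in, and `fare_counts` and `ranked` are illustrative names.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for df; replace with the DataFrame read from result.csv.
df = pd.DataFrame({
    "Fare":     [7.25, 7.25, 8.05, 8.05, 8.05, 26.0, 53.1, 53.1, 71.28, 7.25],
    "Survived": [0, 0, 0, 1, 0, 1, 1, 1, 1, 1],
})

# Fare/outcome combinations that never occur become NaN after unstack(); use 0.
fare_counts = (
    df.groupby(["Fare", "Survived"])["Survived"].count().unstack().fillna(0)
)

fig, (ax_fare, ax_rank) = plt.subplots(1, 2, figsize=(10, 3))

# Left: rows in fare order, so the x axis is the fare itself.
fare_counts.plot(kind="line", ax=ax_fare, title="Ordered by fare")

# Right: rows reordered by survivor count, so the x axis is a rank position.
ranked = fare_counts.sort_values(by=1, ascending=False).reset_index(drop=True)
ranked.plot(kind="line", ax=ax_rank, title="Ordered by survivor count")
fig.tight_layout()
```

The fare-ordered view shows where along the price range survivors concentrate, while the ranked view shows how unevenly the counts are spread.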


2.7.5 Task 5

Visualize the distribution of survivors and deaths across the different cabin classes (Pclass) in the Titanic dataset. (Try a bar chart)

# Code
# 1 means survived, 0 means died
df.groupby(['Pclass', 'Survived'])['Survived'].count().unstack().plot(kind = 'bar', stacked = True);


[Thinking] After seeing the previous visualizations, describe your first impression and summarize what you found.

# Answer to the thinking question

  1. Passengers who paid higher fares and travelled first class had a higher probability of survival
  2. Women were more likely to survive
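
Impressions like these can be checked numerically. Because Survived is coded as 0/1, the mean within a group is that group's survival rate. The sketch below uses a made-up stand-in `df`, so its numbers say nothing about the real dataset.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for df; replace with the DataFrame read from result.csv.
df = pd.DataFrame({
    "Pclass":   [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "Sex":      ["female", "male", "male", "female", "male",
                 "male", "male", "female", "male"],
    "Survived": [1, 1, 0, 1, 0, 0, 0, 1, 0],
})

# Mean of a 0/1 column = share of ones = survival rate of the group.
rate_by_class = df.groupby("Pclass")["Survived"].mean()
rate_by_sex = df.groupby("Sex")["Survived"].mean()

ax = rate_by_class.plot(kind="bar", rot=0)
ax.set_ylabel("Survival rate")
```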

2.7.6 Task 6

Visualize the distribution of the numbers of survivors and deaths across different ages in the Titanic dataset. (Any chart type)

# Code
df.groupby(['Age','Survived'])['Survived'].count().unstack().plot(kind = 'line');
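
Counting every distinct age separately tends to give a jagged line. Since any chart type is allowed here, an alternative is a histogram with ages grouped into bins. This is a sketch on a made-up stand-in `df`; the 10-year bin width is an arbitrary choice.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for df; replace with the DataFrame read from result.csv.
df = pd.DataFrame({
    "Age":      [2, 4, 22, 24, 26, 35, 35, 38, 54, 65, None],
    "Survived": [1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1],
})

# Split ages by outcome and drop rows where the age is missing.
ages_died = df.loc[df["Survived"] == 0, "Age"].dropna()
ages_survived = df.loc[df["Survived"] == 1, "Age"].dropna()

fig, ax = plt.subplots()
bins = list(range(0, 81, 10))  # 10-year bins from 0 to 80 (arbitrary width)
counts, edges, _ = ax.hist(
    [ages_died.to_numpy(), ages_survived.to_numpy()],
    bins=bins,
    label=["died", "survived"],
)
ax.set_xlabel("Age")
ax.set_ylabel("Number of passengers")
ax.legend()
```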


2.7.7 Task 7

Visualize the age distribution of passengers in the different cabin classes (Pclass) in the Titanic dataset. (Try a line chart)

# Code
df.groupby(['Age','Pclass'])['Age'].count().unstack().plot(kind = 'line');
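
A variation that may be easier to read groups ages into bands before counting, so each class gets a smoother line. The sketch below assumes 20-year bands and uses a made-up stand-in `df`; the band labels are illustrative.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for df; replace with the DataFrame read from result.csv.
df = pd.DataFrame({
    "Age":    [4, 19, 22, 25, 28, 31, 35, 38, 42, 47, 54, 58, 62, 71],
    "Pclass": [3, 3, 3, 3, 2, 3, 2, 1, 2, 1, 1, 1, 1, 1],
})

# Assign each age to a 20-year band; right=False makes the bands [0, 20), [20, 40), ...
age_band = pd.cut(
    df["Age"],
    bins=[0, 20, 40, 60, 80],
    right=False,
    labels=["0-19", "20-39", "40-59", "60-79"],
)

# Rows: age band, columns: cabin class, values: number of passengers.
by_class = pd.crosstab(age_band, df["Pclass"])
by_class.index = by_class.index.astype(str)  # plain string labels for the x axis

ax = by_class.plot(kind="line")
ax.set_xlabel("Age band")
ax.set_ylabel("Number of passengers")
```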


[Thinking] Analyze all the visualizations above as a whole and see what you can discover on your own.

# Answer to the thinking question

  1. Younger passengers had a higher chance of survival
  2. Middle-aged passengers also had a reasonable survival rate. Judging from the relationship between cabin class and age, these passengers may have been wealthier and of higher social status, which could explain their higher survival rate
  3. Within these groups, women were generally more likely to survive than men

[Summary]

This chapter gives a basic understanding of data visualization and shows how to draw basic charts.

Origin blog.csdn.net/qq_38869560/article/details/128753648