Python data analysis and visualization training: analyzing and visualizing the tips data set from Excel

1. Experimental purpose

This exercise analyzes and visualizes the tips data set with pandas and matplotlib.

2. Experimental data

The tips data set ships with the Python library Seaborn; for this exercise it has been converted to an Excel (.xls) file.

3. Experimental operation

1. Import module

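# Import the packages needed for this exercise
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render Chinese labels
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
# Jupyter magic: draw figures inside the notebook
%matplotlib inline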

2. Get the data.
Import the data and display the first 5 rows.

fdata = pd.read_excel('C:/Users/leglon/Desktop/ch4/tips.xls')  # read the data; reading .xls files requires the xlrd package
fdata.head()  # show the first five rows


Reading an .xls file requires the xlrd package, so install it beforehand. Without it, pandas raises: ImportError: Missing optional dependency 'xlrd'. Install xlrd >= 1.0.0 for Excel support Use pip or conda to install xlrd. To fix this, open cmd, enter pip install xlrd, and wait for the installation to finish. Alternatively, install xlrd through Anaconda.
Steps: Anaconda -> Environments -> tensorflow (the environment used here) -> Not installed. Search for xlrd, check the package that appears, and click Apply. Then reopen the notebook.
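To find out whether xlrd is available before calling read_excel, a small check like the one below can help. This is only a sketch using the standard library; the helper name has_module is made up for illustration and is not part of pandas.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(name) is not None


# has_module is a hypothetical helper, not a pandas function
if not has_module("xlrd"):
    print("xlrd is missing; install it with: pip install xlrd")
```

If the message is printed, install the package in the same environment that runs the notebook.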

3. View data information

fdata.describe()  # summary statistics for the numeric columns
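By default describe() only summarizes numeric columns. The sketch below shows one way to also look at the text columns and at missing values. It uses a tiny made-up sample with the original English column names rather than the real file, so the numbers are for illustration only.

```python
import pandas as pd

# Small illustrative sample with the same column names as the tips data
sample = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01],
    "tip": [1.01, 1.66, 3.50],
    "sex": ["Female", "Male", "Male"],
    "smoker": ["No", "No", "No"],
    "day": ["Sun", "Sun", "Sun"],
    "time": ["Dinner", "Dinner", "Dinner"],
    "size": [2, 3, 3],
})

numeric_summary = sample.describe()            # numeric columns only
full_summary = sample.describe(include="all")  # adds the text columns
missing = sample.isna().sum()                  # missing values per column

print(full_summary)
print(missing)
```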

4. Rename the columns to Chinese

# Rename the columns to Chinese and show the first five rows
fdata.rename(columns={
    'total_bill': '消费总额', 'tip': '小费', 'sex': '性别', 'smoker': '是否吸烟',
    'day': '星期', 'time': '聚餐时间段', 'size': '人数'}, inplace=True)
fdata.head()

5. Compute per capita consumption and view the first 5 rows

# Per capita consumption (total bill / party size), rounded to 2 decimals; show the first five rows
fdata['人均消费'] = round(fdata['消费总额'] / fdata['人数'], 2)
fdata.head()

6. Find the rows where a male smoker's per capita consumption is greater than 15

# Select male smokers whose per capita consumption is greater than 15
fdata.query('是否吸烟 == "Yes" & 性别 == "Male" & 人均消费 > 15')
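The same filter can also be written as a boolean mask, which some readers find easier to debug. The sketch below compares both forms on a small made-up sample; the values are invented and only the column names follow the renamed data.

```python
import pandas as pd

# Invented sample rows using the renamed columns
sample = pd.DataFrame({
    "性别": ["Male", "Male", "Female", "Male"],
    "是否吸烟": ["Yes", "No", "Yes", "Yes"],
    "人均消费": [19.2, 16.0, 17.5, 9.6],
})

# Filter with a query string
by_query = sample.query('是否吸烟 == "Yes" and 性别 == "Male" and 人均消费 > 15')

# Equivalent filter with a boolean mask; each comparison needs parentheses
mask = (
    (sample["是否吸烟"] == "Yes")
    & (sample["性别"] == "Male")
    & (sample["人均消费"] > 15)
)
by_mask = sample[mask]

print(by_mask)
```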

7. Check the relationship between total consumption and tips

fdata.plot(kind='scatter', x='消费总额', y='小费')  # scatter plot of total bill against tip


The scatter plot shows a positive correlation: tips tend to rise with the total bill.
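A correlation coefficient puts a number on what the scatter plot shows. The sketch below uses Series.corr, which computes the Pearson coefficient by default. The five rows are invented, so the value says nothing about the real data; on the real data the same call would be fdata['消费总额'].corr(fdata['小费']).

```python
import pandas as pd

# Invented sample; not taken from the tips file
sample = pd.DataFrame({
    "消费总额": [10.0, 15.0, 20.0, 30.0, 40.0],
    "小费": [1.5, 2.0, 3.0, 4.0, 5.5],
})

# Pearson correlation between total bill and tip (1.0 means perfectly linear)
r = sample["消费总额"].corr(sample["小费"])
print(round(r, 3))
```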

8. Check the relationship between smoking and tipping

fdata.plot(kind='scatter', x='是否吸烟', y='小费')  # scatter plot of smoking status against tip

The figure suggests that smoking status has little effect on the size of the tip.
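A scatter plot over a two-valued column is hard to read, so a per-group summary can back up the visual impression. The sketch below is one possible check using groupby and agg on an invented sample, not the result for the real data.

```python
import pandas as pd

# Invented sample rows
sample = pd.DataFrame({
    "是否吸烟": ["Yes", "Yes", "Yes", "No", "No", "No"],
    "小费": [2.0, 3.0, 4.0, 2.5, 3.0, 3.5],
})

# Mean and median tip for smokers and non-smokers
summary = sample.groupby("是否吸烟")["小费"].agg(["mean", "median"])
print(summary)
```

If the two rows of the summary are close, that supports reading the plot as "little effect".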

9. Compare the average total consumption of men and women

fdata.groupby('性别')['消费总额'].mean()


The result shows that, on average, men's total bill is higher than women's.

10. Compare generosity between genders

# Compare the average tip by gender
fdata.groupby('性别')['小费'].mean()

On average, men tip more than women.
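The average tip amount also depends on the size of the bill, so generosity can be measured as the tip relative to the bill as well. The sketch below shows that alternative on an invented sample. The column name 小费比例 (tip rate) is introduced here for illustration and does not appear in the original analysis.

```python
import pandas as pd

# Invented sample; it is built so that the two measures disagree
sample = pd.DataFrame({
    "性别": ["Male", "Male", "Female", "Female"],
    "消费总额": [40.0, 20.0, 10.0, 20.0],
    "小费": [4.0, 3.0, 2.0, 3.0],
})

# Tip as a fraction of the total bill (hypothetical extra column)
sample["小费比例"] = sample["小费"] / sample["消费总额"]

mean_tip = sample.groupby("性别")["小费"].mean()
mean_rate = sample.groupby("性别")["小费比例"].mean()

print(mean_tip)
print(mean_rate)
```

In this made-up sample the group with the larger average tip has the lower tip rate, which is why it can be worth checking both measures on the real data.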
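11. Analyze the relationship between day of the week and tip

# Analyze the relationship between day of the week and tip
print(fdata['星期'].unique())  # show the distinct day values
r = fdata.groupby('星期')['小费'].mean()
# The grouped Series already has the day as index and the mean tip as values
ax = r.plot(kind='bar', fontsize=12, rot=30)  # returns a matplotlib Axes
ax.title.set_size(16)  # font size for the title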

The figure shows that average tips are larger on Saturdays and Sundays than on Thursdays and Fridays.
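groupby sorts the day labels alphabetically, so the bars may not appear in calendar order. The sketch below reorders the result with reindex, assuming the day labels are Thur, Fri, Sat and Sun as in Seaborn's copy of the data; check the output of unique() first. The sample rows are invented.

```python
import pandas as pd

# Invented sample rows
sample = pd.DataFrame({
    "星期": ["Sun", "Sat", "Thur", "Fri", "Sun", "Sat"],
    "小费": [3.0, 2.5, 2.0, 2.5, 3.5, 3.5],
})

# Assumed day labels, listed in calendar order
day_order = ["Thur", "Fri", "Sat", "Sun"]

# Mean tip per day, rearranged into calendar order
r = sample.groupby("星期")["小费"].mean().reindex(day_order)
print(r)
```

Calling r.plot(kind='bar', rot=30) on the reordered Series then draws the bars from Thursday to Sunday.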

12. Analyze the generosity of gender and smoking combinations

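# Analyze generosity for each combination of gender and smoking status
r = fdata.groupby(['性别', '是否吸烟'])['小费'].mean()
# The grouped Series is indexed by (gender, smoker); its values are the mean tips
ax = r.plot(kind='bar', fontsize=12, rot=30)  # returns a matplotlib Axes
ax.title.set_size(16)  # font size for the title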

The figure shows that non-smoking men are the most generous group, and that non-smoking women are more generous than smoking women.
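The four group means are easier to compare side by side as a small table. The sketch below uses unstack to move the smoking status into columns. The sample rows are invented, so the values do not reflect the real data.

```python
import pandas as pd

# Invented sample rows
sample = pd.DataFrame({
    "性别": ["Male", "Male", "Female", "Female", "Male"],
    "是否吸烟": ["Yes", "No", "Yes", "No", "No"],
    "小费": [3.0, 3.5, 2.5, 3.0, 2.5],
})

# Rows: gender, columns: smoking status, cells: mean tip
table = sample.groupby(["性别", "是否吸烟"])["小费"].mean().unstack()
print(table)
```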

13. Analyze the relationship between meal time and tipping

# Analyze the relationship between meal time and tip
r = fdata.groupby(['聚餐时间段'])['小费'].mean()
ax = r.plot(kind='bar', fontsize=15, rot=30)  # returns a matplotlib Axes
ax.title.set_size(16)  # font size for the title


The figure shows that average tips are larger at dinner than at lunch.

14. Analyze the relationship between the number of people and tips

# Analyze the relationship between party size and tip
r = fdata.groupby(['人数'])['小费'].mean()
ax = r.plot(kind='bar', fontsize=15, rot=30)  # returns a matplotlib Axes
ax.title.set_size(16)  # font size for the title


The figure shows that the average tip generally increases with the number of people at the table.
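A group mean can be misleading when a group contains only a few bills, so it helps to show the group size next to the mean. The sketch below does this with named aggregation on an invented sample. The output column names mean_tip and n_bills are chosen here for illustration.

```python
import pandas as pd

# Invented sample rows
sample = pd.DataFrame({
    "人数": [1, 2, 2, 2, 4, 6],
    "小费": [1.0, 2.0, 3.0, 2.5, 4.0, 5.0],
})

# Mean tip and number of bills for each party size
summary = sample.groupby("人数")["小费"].agg(mean_tip="mean", n_bills="count")
print(summary)
```

Party sizes with a very small n_bills deserve less weight when reading the bar chart.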

4. Summary

Learning data visualization is very useful. By analyzing data we can extract important information, understand events more clearly, and find more ways to respond to them.


Origin blog.csdn.net/qq_62127918/article/details/130512822