Python case | Data analysis with the Matplotlib library


Data visualization is an important step in data analysis and mining: presenting data graphically makes its underlying patterns clear and intuitive.

This article uses the data table produced by the previous case, stored in the file newbj_lianJia.csv. The setup code is as follows.

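import pandas as pd  # import the libraries
import matplotlib.pyplot as plt
plt.rcParams['font.sans-serif'] = ['SimHei']  # use a font that can render Chinese labels
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly with this font
df = pd.read_csv('newbj_lianJia.csv', encoding='gbk')  # read the file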
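If newbj_lianJia.csv is not at hand, the later steps can still be followed with a small stand-in table. The sketch below builds one in memory. The column names (ID, floor, rent, district, street, model) are the ones the code in this article reads; every value is made up for illustration and is not taken from the real dataset.

```python
import pandas as pd

# Hypothetical stand-in rows: the values are invented, only the column
# names match what the article's code reads from newbj_lianJia.csv.
sample = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "floor": ["low", "middle", "high", "low", "middle", "basement"],
    "rent": [5200.0, 6100.0, 5800.0, 4800.0, 6500.0, 2500.0],
    "district": ["district_a", "district_a", "district_b",
                 "district_c", "district_b", "district_c"],
    "street": ["street_1", "street_1", "street_2",
               "street_3", "street_2", "street_3"],
    "model": ["type_x", "type_y", "type_x", "type_z", "type_x", "type_y"],
})

# Check that every column used later in the article is present.
required = ["ID", "floor", "rent", "district", "street", "model"]
missing = [c for c in required if c not in sample.columns]
print(missing)
```

Running the same column check against the real df right after reading the file is a cheap way to catch a wrong encoding or a renamed column before any plotting starts.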

The main task of this article is to display and analyze how the data is distributed across each attribute, covering the following aspects.

(1) Draw a bar graph of the average rent distribution for each floor.

(2) Draw a line chart of the average rent of houses in each urban area.

(3) Draw a bar chart of the number of houses on the 20 streets with the highest average rent, together with a line chart of their average rent.

(4) Draw a pie chart of the proportions of the top 10 house types.

01. Case implementation

(1) Draw a bar graph of the average rent distribution for each floor. The code is as follows.

# Group by floor
g = df.groupby('floor')
# Count the number of houses on each floor
df_floor = g.count()['ID']
floor = df_floor.index.tolist()
# Compute the average rent for each floor
df_floor_rent = g['rent'].mean()
rent = df_floor_rent.values.tolist()
rent = [round(x, 2) for x in rent]
# Draw the horizontal bar chart
plt.barh(y=floor, width=rent)
plt.ylabel('楼层')  # "Floor"
plt.xlabel('租金/元')  # "Rent (yuan)"
plt.title('各楼层平均租金条形图', fontproperties='stkaiti', fontsize=14)  # "Average rent by floor"
plt.tight_layout(pad=2)
plt.show()

The running result is shown in Figure 1.


■Figure 1 Bar graph of average rent for each floor

The chart shows that the floor affects the rent. Ranked by their effect on rent, the floor categories are, in order: basement, low floor, middle floor, and high floor.
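To read such a ranking off the numbers rather than the bars, the per-floor count and mean can be computed in a single grouped aggregation and then sorted. The following is a minimal sketch on made-up rows, not the article's data; the names toy, summary, and mean_rent are chosen here for illustration.

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "floor": ["low", "low", "middle", "high", "high"],
    "rent": [4000.0, 5000.0, 6000.0, 7000.0, 8000.0],
})

# One pass over the groups: number of listings and mean rent per floor.
summary = (
    toy.groupby("floor")
       .agg(count=("ID", "count"), mean_rent=("rent", "mean"))
       .round(2)
)

# Sort so the ranking by average rent can be read off directly.
summary = summary.sort_values("mean_rent", ascending=False)
print(summary)
```

Keeping the count next to the mean also shows how many listings each average rests on, which matters when a category such as basement has only a few rows.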

(2) Draw a line chart of the average rent of houses in each urban area. The code is as follows.

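# Group by district and compute the average rent
df1 = df.groupby('district')['rent'].mean()
# Get the district names
region = df1.index.tolist()
# Get the average rent of each district
rent = [round(x, 2) for x in df1.values.tolist()]
# Draw the line chart of average rent by district
plt.figure(figsize=(12, 6))
plt.plot(region, rent, c='r', marker='o', linestyle='--')
# Label each point with its value
for x, y in zip(region, rent):
    plt.text(x, y, '%.0f' % y, ha='center', fontsize=11)
# Set the axis labels
plt.ylabel('租金/元', fontproperties='simhei')  # "Rent (yuan)"
plt.xlabel('城区', fontproperties='simhei')  # "District"
# Set the chart title ("Average rent by district")
plt.title('各城区房屋平均租金折线图', fontproperties='stkaiti', fontsize=14)
# Rotate the x-axis tick labels
plt.xticks(rotation=15)
# Show the chart
plt.show()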

The running result is shown in Figure 1.


■Figure 1 The line chart of the average rent of houses in each urban area

The chart shows that the urban area has a large effect on rent. Chaoyang District has the highest average rent, 13,976 yuan, and Miyun District the lowest, 3,480 yuan. This is most likely tied to geographical location: the closer an area is to the city center, the higher the rent, and the farther away, the lower.
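The highest and lowest districts can also be pulled out of the grouped result directly instead of being read from the chart. A minimal sketch, using invented districts A, B, and C rather than the real data:

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "district": ["A", "A", "B", "B", "C"],
    "rent": [9000.0, 11000.0, 3000.0, 4000.0, 6000.0],
})

# Average rent per district, as in the article's groupby step.
mean_rent = toy.groupby("district")["rent"].mean().round(2)

# idxmax / idxmin return the index label (the district name) of the extreme value.
highest = mean_rent.idxmax()
lowest = mean_rent.idxmin()
print(highest, mean_rent[highest])
print(lowest, mean_rent[lowest])
```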

(3) Draw a bar chart of the number of houses on the 20 streets with the highest average rent, together with a line chart of their average rent. The code is as follows.

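# Group by street
g = df.groupby('street')
# Sort the streets by average rent in descending order and take the top 20
df_region = g['rent'].mean()
top_street_rent = df_region.sort_values(axis=0, ascending=False)[:20]
# Get the names of the top 20 streets
region = top_street_rent.index.tolist()
# Count the rental listings on each of those streets
count = g['ID'].count().loc[region].tolist()
# Get the average rent of the top 20 streets
rent = [round(x, 2) for x in top_street_rent.values.tolist()]
# Plot: bars for the listing counts on the left axis
fig, axs = plt.subplots(1, 1, figsize=(12, 6))
axs.bar(region, height=count)
axs.set_ylabel("数量")  # "Count"
axs.set_xlabel("街道")  # "Street"
# Line for the average rent on a second y-axis sharing the same x-axis
axs1 = axs.twinx()
axs1.plot(region, rent, c='r', marker='o', linestyle='--')
for x, y in zip(region, count):
    axs.text(x, y, '%.0f' % y, ha='center', fontsize=12)
for x, y in zip(region, rent):
    axs1.text(x, y, '%.0f' % y, ha='center', fontsize=12)
# Title: "Number of rental listings and rent on the top 20 streets by rent"
axs.set_title('租金前 20 名的街道出租房屋数量及其租金分布图', fontsize=14)
axs1.set_ylabel("租金/元")  # "Rent (yuan)"
fig.autofmt_xdate(rotation=15)
plt.tight_layout(pad=1)
plt.show()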

The running result is shown in Figure 2.


■Figure 2 The number of rented houses and the distribution of rent in the top 20 streets

The streets with the highest average rents are Guanyuan, Andingmen, Xuanwumen, Xishan, and Baishiqiao. Apart from Guanyuan, which has the highest rent, the rents of the other streets differ little, which suggests that among these top 20 streets the rent is not strongly tied to the street itself.
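The combined chart relies on two y-axes that share one x-axis, so bar heights (listing counts) and line heights (rents) are on separate scales and should not be compared with each other. The sketch below reproduces that layout on made-up rows. It assumes the non-interactive Agg backend so it runs without a display, takes the top 2 streets instead of 20, and uses Series.nlargest as a swapped-in shortcut for sorting and slicing; the street names s1 to s3 are invented.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "street": ["s1", "s1", "s1", "s2", "s2", "s3"],
    "rent": [9000.0, 9500.0, 10000.0, 7000.0, 8000.0, 12000.0],
})

g = toy.groupby("street")
top = g["rent"].mean().nlargest(2)            # top 2 here; the article takes the top 20
names = top.index.tolist()
counts = g["ID"].count().loc[names].tolist()  # listings on each selected street
rents = top.round(2).tolist()

fig, ax_count = plt.subplots(figsize=(6, 3))
bars = ax_count.bar(names, counts)            # left axis: number of listings
ax_count.set_ylabel("listings")

ax_rent = ax_count.twinx()                    # right axis: shares x, has its own y scale
ax_rent.plot(names, rents, c="r", marker="o", linestyle="--")
ax_rent.set_ylabel("average rent")
```

Calling set_ylabel on each Axes object, rather than plt.ylabel, makes it explicit which of the two y-axes a label belongs to.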

(4) Draw a pie chart of the proportions of the top 10 house types. The code is as follows.

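# Group by house type
df1 = df.groupby('model')
# Count the houses of each type, sort in descending order, and take the top 10
df_model = df1.count()['ID'].sort_values(axis=0, ascending=False)[:10]
model = df_model.index.tolist()
# Get the number of houses of each type
count = df_model.values.tolist()
# Draw the pie chart of house-type proportions
plt.pie(count, labels=model, autopct='%1.2f%%')
# Set the chart title ("Proportions of the top 10 house types")
plt.title('房屋户型前 10 名的占比情况', fontproperties='stkaiti', fontsize=14)
plt.show()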

The running result is shown in Figure 3.


The most common house types are 2 bedrooms, 1 living room, 1 bathroom; 1 bedroom, 1 living room, 1 bathroom; 3 bedrooms, 1 living room, 1 bathroom; and 3 bedrooms, 2 living rooms, 2 bathrooms.
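One detail is worth keeping in mind when reading the pie chart: plt.pie scales the values it is given to sum to 100%, so each slice is a share of the top 10 house types only, not of all listings. The sketch below shows the difference on made-up labels A, B, and C, using value_counts as a swapped-in shortcut for the groupby count, and the same '%1.2f%%' format string, where the doubled %% yields a literal percent sign.

```python
import pandas as pd

# Made-up house-type labels for illustration only.
models = pd.Series(["A", "A", "A", "B", "B", "C"], name="model")

# Count each type and keep the most frequent ones (top 2 here, top 10 in the article).
top = models.value_counts().head(2)

# Share within the selected types: this is what the pie slices represent.
share_of_top = top / top.sum() * 100
# Share of all listings: smaller, because type C is still in the denominator.
share_of_all = top / len(models) * 100

labels = ['%1.2f%%' % s for s in share_of_top]
print(labels)
print(share_of_all.round(2).tolist())
```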


Origin blog.csdn.net/qq_41640218/article/details/132487714