Visual Analysis of Starbucks Store Distribution
Project introduction: A visual analysis of the distribution of Starbucks stores using Python
Data background: The data comes from the Kaggle dataset Starbucks Locations Worldwide, which records basic information on Starbucks stores around the world as of February 2017, including brand name, street address, country, latitude, and longitude.
Data introduction:
Field Name | Description |
---|---|
Brand | Brand name |
Store Number | Store number |
Store Name | Store name |
Ownership Type | Store ownership type |
Street Address | Street address of the store |
City | The city where the store is located |
State/Province | The state or province where the store is located |
Country | The country where the store is located |
Postcode | The postal code of the store |
Phone Number | Store contact number |
Timezone | The time zone where the store is located |
Longitude | The longitude of the store address |
Latitude | The latitude of the store address |
Task overview
- How many brands does Starbucks own?
- Count how many countries have Starbucks stores, and show the five countries with the most stores and the ten with the fewest
- Show the ten cities with the most Starbucks stores
- Show the ten cities in China with the most Starbucks stores
- Use a pie chart to show the ownership types under which Starbucks stores operate
Import the necessary packages
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['font.sans-serif'] = ['SimHei'] # use a font that can render Chinese labels
plt.rcParams['axes.unicode_minus'] = False # render the minus sign correctly
Read the data and preview it
data = pd.read_csv(r'./Desktop/directory.csv.csv')
data.head()
View missing values
data.isnull().sum()
Calling isnull() and then sum() counts the missing values in each column. In this dataset the City, Postcode, and Phone Number fields contain missing values, but the analysis below does not depend on those missing entries, so we leave them unprocessed.
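To make the missing-value count concrete, here is a minimal sketch on a hypothetical three-row sample. The `sample` frame and its values are invented for illustration and are not taken from the Kaggle file.

```python
import pandas as pd

# Hypothetical three-row sample; the values are invented for illustration only.
sample = pd.DataFrame({
    "City": ["Seattle", None, "Shanghai"],
    "Postcode": ["98101", None, None],
    "Country": ["US", "US", "CN"],
})

# isnull() marks each missing cell as True; sum() counts the True values per column.
missing_per_column = sample.isnull().sum()
print(missing_per_column)

# Keep only the columns that actually contain missing values.
columns_with_gaps = missing_per_column[missing_per_column > 0]
print(columns_with_gaps)
```

Filtering the result this way is handy on wide tables, where most columns are complete and only a few need attention.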
Count how many brands Starbucks owns
num = len(data['Brand'].unique())
print('Starbucks owns %d brands' % num)
data['Brand'].value_counts()
Use unique() to de-duplicate the "Brand" field and get the number of brands Starbucks owns, then use value_counts() to count the stores of each brand. Starbucks has 4 brands in total, and the Starbucks brand itself has the most stores, 25,249.
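The relationship between unique() and value_counts() can be sketched on a small invented series. The brand labels below are placeholders, not the real brand names in the dataset. As a side note, nunique() gives the same count directly when the column has no missing values.

```python
import pandas as pd

# Placeholder brand labels, invented for illustration only.
brands = pd.Series(["Brand A", "Brand B", "Brand A", "Brand A", "Brand C", "Brand B"])

# unique() returns the distinct values; its length is the number of brands.
num_brands = len(brands.unique())

# nunique() counts distinct values directly (it ignores NaN by default).
num_brands_direct = brands.nunique()

# value_counts() returns the number of rows per brand, largest first.
stores_per_brand = brands.value_counts()

print(num_brands, num_brands_direct)
print(stores_per_brand)
```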
Count how many countries have Starbucks stores, and show the five countries with the most stores and the ten with the fewest.
country_num = len(data['Country'].unique())
print('Starbucks has opened stores in %d countries worldwide' % country_num)
Then use groupby to group the rows by 'Country', count the stores in each country, and sort in descending order to get the top five countries
dff = data.groupby(["Country"]).size().reset_index()
dff.columns = ['country','number']
dff.sort_values(by = ['number'],ascending = False).head()
As expected, the United States has the most Starbucks stores, 13,608, about five times as many as second-ranked China. Plotting the top five as a bar chart makes this clearer.
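# store counts per country, sorted in descending order; keep the top five
top_5 = data['Country'].value_counts().head(5)
plt.figure(figsize=(20,8))
plt.bar(top_5.index,top_5,width=0.5,color=['r','g','b','y','m'])
plt.xlabel('Country',fontsize = 20)
plt.ylabel('Number of stores',fontsize = 20)
plt.xticks(fontsize = 20)
plt.yticks(fontsize = 20)
plt.grid(linestyle = '--',alpha = 0.5)
plt.title('Top five countries by number of Starbucks stores',fontsize = 20)
plt.show()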
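The per-country counts behind this chart can be obtained either with groupby(...).size() or with value_counts(). The sketch below, on an invented six-row frame, suggests that the two approaches give the same ranking. The `stores` data and the column name `number` are illustrative assumptions.

```python
import pandas as pd

# Invented sample: three US rows, two CN rows, one JP row.
stores = pd.DataFrame({"Country": ["US", "US", "US", "CN", "CN", "JP"]})

# Approach 1: group by country, count rows per group, then sort descending.
by_group = (
    stores.groupby("Country").size()
    .reset_index(name="number")
    .sort_values(by="number", ascending=False)
)

# Approach 2: value_counts() counts and sorts in a single call.
by_value_counts = stores["Country"].value_counts()

print(by_group)
print(by_value_counts)
```

The groupby form returns a DataFrame with named columns, which is convenient for further joins; value_counts() returns a Series indexed by country, which can be passed straight to plt.bar.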
- The ten countries with the fewest stores in the world
Having looked at the countries with the most stores, let's see which countries have the fewest Starbucks stores
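# store counts per country, sorted in descending order; keep the last ten
tail_10 = data['Country'].value_counts().tail(10)
plt.figure(figsize=(20,8))
plt.bar(tail_10.index,tail_10,color = 'brown')
plt.xlabel('Country',fontsize = 20)
plt.ylabel('Number of stores',fontsize = 20)
plt.xticks(fontsize = 20)
plt.yticks(fontsize = 20)
plt.title('Ten countries with the fewest Starbucks stores',fontsize = 20)
plt.show()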
As the figure shows, AD, LU, and MC have the fewest stores, and AD has only one
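As an alternative to slicing a sorted result, pandas also provides nsmallest() and nlargest() on a Series. The sketch below uses invented counts; apart from AD having a single store, the numbers are placeholders and not figures from the dataset.

```python
import pandas as pd

# Invented store counts per country code, for illustration only.
counts = pd.Series({"US": 5, "CN": 3, "AD": 1, "LU": 2, "MC": 2})

# nsmallest(n) returns the n smallest values in ascending order;
# ties keep their original order of appearance (keep='first' is the default).
bottom_3 = counts.nsmallest(3)

# nlargest(n) is the counterpart for the largest values.
top_2 = counts.nlargest(2)

print(bottom_3)
print(top_2)
```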
Show the ten cities with the most Starbucks stores
Having looked at countries, we now find the ten cities with the most Starbucks stores in the world and plot them
len(data['City'].unique())
city_count = data['City'].value_counts().head(10)
city_count
plt.figure(figsize=(20,8))
plt.bar(city_count.index,city_count)
plt.xlabel('City',fontsize = 15)
plt.ylabel('Number of stores',fontsize = 15)
plt.xticks(fontsize = 15)
plt.yticks(fontsize = 15)
plt.title('Top ten cities by number of Starbucks stores',fontsize = 20)
plt.show()
The city with the most stores in the world turns out to be Shanghai, with 542 stores, which fits its nickname, the "Magic City". Seoul and Beijing take second and third place with similar store counts. Seattle, in tenth place, is where Starbucks is headquartered. Beyond that, perhaps Seattle's programmers also contribute a good share of the sales.
Show the ten cities in China with the most Starbucks stores
df = data[data['Country'] =='CN'] # extract the rows for stores in China
df2 = df.groupby(["City"]).size().reset_index() # group by city and count the stores
df2.columns = ['city','number']
df2.sort_values(by = ['number'],ascending = False).head(10) # sort by store count in descending order and keep the top ten
Unsurprisingly, the Chinese cities with the most Starbucks stores are economically developed ones, mainly in the Pearl River Delta, the Yangtze River Delta, and Beijing. These developed cities are the ones best able to support Starbucks' relatively high prices. Next, let's visualize the result:
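# store counts per city in China, sorted in descending order; keep the top ten
china_city = data[data['Country'] == 'CN']['City'].value_counts().head(10)
plt.figure(figsize=(20,8))
plt.bar(china_city.index,china_city,color = 'c')
plt.xlabel('City',fontsize = 15)
plt.ylabel('Number of stores',fontsize = 15)
plt.xticks(fontsize = 15)
plt.yticks(fontsize = 15)
plt.title('Top ten cities in China by number of Starbucks stores',fontsize=20)
plt.show()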
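The filter-then-count pattern used for the China ranking can be sketched on a small invented frame. The rows below are placeholders rather than real store records, and `is_china` and `china_top` are illustrative names.

```python
import pandas as pd

# Invented sample rows, for illustration only.
stores = pd.DataFrame({
    "Country": ["CN", "CN", "CN", "US", "US", "CN"],
    "City": ["Shanghai", "Beijing", "Shanghai", "Seattle", "Seattle", "Shanghai"],
})

# Step 1: a boolean mask selects only the rows whose country code is CN.
is_china = stores["Country"] == "CN"

# Step 2: count stores per city within that subset and keep the top two.
china_top = stores.loc[is_china, "City"].value_counts().head(2)

print(china_top)
```

Filtering before counting keeps cities from other countries out of the ranking, which matters if a city name is shared across countries.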
Use a pie chart to show the ownership types under which Starbucks stores operate
# count the stores under each ownership type
work_style = data['Ownership Type'].value_counts()
# draw the pie chart
plt.figure()
plt.pie(work_style,labels=work_style.index,autopct='%1.2f%%')
plt.axis('equal')
plt.legend()
plt.title('Starbucks store ownership types')
plt.show()
Starbucks stores operate under four ownership types. Company-owned stores account for 46.6%, nearly half, followed by licensed stores at 36.6%; the rest are joint ventures and franchises.
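The percentages drawn on the pie chart are each ownership type's share of all stores. One way to cross-check them numerically is value_counts(normalize=True), sketched here on an invented ten-row series whose proportions are placeholders, not the dataset's real shares.

```python
import pandas as pd

# Invented sample of ten stores, for illustration only.
ownership = pd.Series(
    ["Company Owned"] * 5 + ["Licensed"] * 3 + ["Joint Venture"] * 2
)

# normalize=True returns fractions of the total instead of raw counts;
# multiplying by 100 converts them to percentages.
shares = ownership.value_counts(normalize=True) * 100

print(shares.round(1))
```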
Summary
This article ranks countries worldwide and cities in China by the number of Starbucks stores. It mainly uses the DataFrame groupby method in pandas for grouping and aggregation, value_counts() for frequency counts, DataFrame.reset_index() to rebuild the index, sort_values() for sorting, and the matplotlib library to draw the bar charts and the pie chart.