Analysis of tourist attractions in Yunnan based on Python

As a Yunnan native and a data analyst, I will use Python to introduce you to Yunnan's scenic spots!

Everyone is welcome to travel to Yunnan! The dataset covers scenic spots across the province. I hope you will work through the exercises yourself after reading, and I wish you success in your studies!

I have uploaded the required files here; you can download them yourself:

Link: https://pan.baidu.com/s/16ziypbHZL-ZNNxnVQ2-iXg 
Extraction code: yunn

Tool used: Jupyter Notebook. To download and set it up on your own, I recommend this article: Introduction to the installation and use of Jupyter Notebooks (LarsCheng's blog on CSDN).

 Analysis of "South of Colorful Clouds" Tourist Attractions

1. Import the required packages

pandas: data processing

pyecharts: data visualization

jieba: Chinese word segmentation (jieba and wordcloud are two packages often used together in Jupyter for text analysis)

collections: counting and statistics

!pip install --upgrade pyecharts
# Upgrade pyecharts; the map charts need pyecharts 1.9.0 or later

import jieba  # Chinese word segmentation
import pandas as pd  # data processing
from collections import Counter  # counting
from pyecharts.charts import Line, Pie, Scatter, Bar, Map, Grid  # pyecharts chart types
from pyecharts.charts import WordCloud
from pyecharts import options as opts
from pyecharts.globals import ThemeType
from pyecharts.globals import SymbolType
from pyecharts.commons.utils import JsCode

2. Data processing

2.1 Read the Yunnan tourist attractions dataset and display part of it

Yunnan = pd.read_excel('云南旅游景点.xlsx')
Yunnan.head()

2.2 View the index, data type and memory information of the Yunnan tourist attractions dataset

Yunnan.info()

Yunnan.shape  # 75 rows and 11 columns before processing


 

Conclusion: info() shows that the dataset has 11 feature columns.

2.3 View the summary statistics of the numerical columns of the Yunnan tourist attractions dataset

Yunnan.describe()

Conclusion: describe() shows that the dataset has three numeric columns: score, price and sales.

2.4 Find the row with 0 sales

Yunnan.loc[Yunnan['销量']==0,:].head()

Conclusion: No scenic spot has zero sales, a sign that tourism in Yunnan is thriving.

Yunnan.loc[Yunnan['销量']>0,:].head(75)

Conclusion: All 75 attractions in the dataset have positive sales, that is, every one of them receives visitors, and Yunnan Province will keep developing tourism vigorously.
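
The zero-sales check above can also be condensed into a single count. Here is a minimal sketch on a made-up three-row frame. The attraction names and numbers are invented for illustration; only the column names 名称 and 销量 come from the dataset.

```python
import pandas as pd

# Toy stand-in for the Yunnan dataset; all values are invented.
toy = pd.DataFrame({
    '名称': ['A', 'B', 'C'],
    '销量': [120, 0, 35],
})

# A boolean mask marks the rows whose sales are exactly zero.
zero_mask = toy['销量'] == 0

# Summing the mask counts its True values, i.e. the zero-sales rows.
zero_count = int(zero_mask.sum())
zero_names = toy.loc[zero_mask, '名称'].tolist()
print(zero_count, zero_names)
```

On the real data the same two lines would report a count of 0 if, as stated above, no attraction has zero sales.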

2.5 Count the null values of each feature column

Yunnan.isnull().sum()

Conclusion: The data is very complete; only the star rating column has null values. Scenic spots in the People's Republic of China are rated on five quality levels, from high to low: AAAAA, AAAA, AAA, AA and A. Spots that do not yet meet the national requirements, and new spots that have not been rated yet, have an empty star rating.

This dataset covers only the 75 best-known scenic spots in Yunnan. Of these, 37 hold a national rating (mostly 4A and 5A, and all 3A or above), and 38 have not been rated yet but are still well known within Yunnan. This shows that Yunnan is one of China's major tourism provinces.

2.6 Fill the missing star ratings with '待定' (pending)

Yunnan['星级'] = Yunnan['星级'].fillna('待定')  # assign back instead of using inplace=True on a column selection
Yunnan.isnull().sum()

Conclusion: After this step no column contains null values, so the data is complete and the analysis can proceed.
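
To see the count-then-fill pattern in isolation, here is a small sketch on invented data. The four rows are hypothetical; the column name 星级 and the placeholder '待定' are the ones used above.

```python
import pandas as pd

# Toy frame; values invented. None marks an attraction with no star rating yet.
toy = pd.DataFrame({
    '名称': ['A', 'B', 'C', 'D'],
    '星级': ['5A', None, '4A', None],
})

missing_before = int(toy['星级'].isnull().sum())

# Assigning the filled column back to the frame makes the update explicit.
toy['星级'] = toy['星级'].fillna('待定')

missing_after = int(toy['星级'].isnull().sum())
rating_counts = toy['星级'].value_counts().to_dict()
print(missing_before, missing_after, rating_counts)
```

Assigning the result back is one common way to write this step; it does not depend on how a given pandas version treats in-place changes to a selected column.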

2.7 Sort the dataset by sales, from high to low

Yunnan.sort_values('销量', ascending=False).head(75)

Conclusion: The highest sales belong to Colorful Yunnan Happy World, and the lowest to City of Flowers.
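
Reading the first and last row of the sorted frame can be done programmatically as well. The following is only a sketch on invented rows, not the real ranking:

```python
import pandas as pd

# Toy frame; names and sales figures are invented.
toy = pd.DataFrame({
    '名称': ['A', 'B', 'C', 'D'],
    '销量': [300, 50, 900, 10],
})

# sort_values returns a new frame; `toy` itself keeps its original order.
ranked = toy.sort_values('销量', ascending=False).reset_index(drop=True)

best = ranked.loc[0, '名称']                 # first row after a descending sort
worst = ranked.loc[len(ranked) - 1, '名称']  # last row
print(best, worst)
```

Because sort_values does not modify the frame it is called on, the result has to be stored in a variable if the sorted order is needed later.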

3. Data Analysis and Visualization

3.1 Data of Top 20 Popular Attractions in Yunnan

# Linear gradient
color_js = """new echarts.graphic.LinearGradient(0, 0, 1, 0,
    [{offset: 0, color: '#009ad6'}, {offset: 1, color: '#ed1941'}], false)"""


sort_info = Yunnan.sort_values(by='销量', ascending=True)
b1 = (
    Bar()
    .add_xaxis(list(sort_info['名称'])[-20:])
    .add_yaxis('云南热门景点销量', sort_info['销量'].values.tolist()[-20:], itemstyle_opts=opts.ItemStyleOpts(color=JsCode(color_js)))
    .reversal_axis()
    .set_global_opts(
        title_opts=opts.TitleOpts(title='云南热门景点销量数据'),
        yaxis_opts=opts.AxisOpts(name='景点名称'),
        xaxis_opts=opts.AxisOpts(name='销量'),
    )
    .set_series_opts(label_opts=opts.LabelOpts(position="right"))
)
# Shift the whole chart to the right
g1 = (
    Grid()
    .add(b1, grid_opts=opts.GridOpts(pos_left='20%', pos_right='5%'))
)
g1.render_notebook()

Conclusion: These popular attractions are good choices for a holiday trip to Yunnan.
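
The chart code sorts in ascending order and then takes the last 20 entries. With the axes reversed, the category axis is drawn from bottom to top, so the largest value ends up at the top of the chart. The data preparation alone can be sketched like this, with invented values and a hypothetical top_n of 3 instead of 20:

```python
import pandas as pd

# Toy frame; names and sales figures are invented.
toy = pd.DataFrame({
    '名称': ['A', 'B', 'C', 'D', 'E'],
    '销量': [300, 50, 900, 10, 420],
})

top_n = 3  # the article uses 20

# Ascending sort, then keep the tail: the top_n largest, smallest first.
sort_info = toy.sort_values(by='销量', ascending=True)
names = sort_info['名称'].tolist()[-top_n:]
sales = sort_info['销量'].tolist()[-top_n:]
print(names, sales)
```

The two lists stay aligned because both are sliced from the same sorted frame.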

3.2 Map distribution of holiday travel data in Yunnan

# Total sales per city, as a Series so that each value is a single number
Yunnan_counts = Yunnan.groupby('城市')['销量'].sum()
m1 = (
    Map()
    .add('云南假期出行分布', [list(z) for z in zip(Yunnan_counts.index.values.tolist(), Yunnan_counts.values.tolist())], '云南')
    .set_global_opts(
        title_opts=opts.TitleOpts(title='云南假期出行数据地图分布'),
        visualmap_opts=opts.VisualMapOpts(max_=100000, is_piecewise=False, range_color=["white", "#fa8072", "#ed1941"]),
    )
)
m1.render_notebook()

Conclusion: Kunming City and Xishuangbanna Dai Autonomous Prefecture account for the bulk of tourist sales in Yunnan.
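
The map takes a list of [region, value] pairs. Below is a sketch of how such pairs can be built from per-city totals. The sales figures are invented, and the short city names are only placeholders for whatever the 城市 column actually contains.

```python
import pandas as pd

# Toy frame; sales figures are invented.
toy = pd.DataFrame({
    '城市': ['昆明', '丽江', '昆明', '大理', '丽江'],
    '销量': [100, 40, 60, 25, 10],
})

# Selecting the column before summing yields a Series of scalars,
# so each pair is [city, number] rather than [city, [number]].
city_sales = toy.groupby('城市')['销量'].sum()

# int() turns NumPy integers into plain Python ints.
data_pair = [[city, int(total)] for city, total in city_sales.items()]
print(data_pair)
```

Summing a one-column DataFrame instead of a Series would wrap every total in its own list once `.values.tolist()` is called, which is easy to overlook.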

3.3 Bar chart of the number of 4A-5A scenic spots in each city of Yunnan Province

# Linear gradient
color_js = """new echarts.graphic.LinearGradient(0, 1, 0, 0,
    [{offset: 0, color: '#009ad6'}, {offset: 1, color: '#ed1941'}], false)"""

Yunnan_tmp2 = Yunnan[Yunnan['星级'].isin(['4A', '5A'])]
Yunnan_counts = Yunnan_tmp2.groupby('城市').count()['星级']
b2 = (
        Bar()
            .add_xaxis(Yunnan_counts.index.values.tolist())
            .add_yaxis('4A-5A景区数量', Yunnan_counts.values.tolist(),itemstyle_opts=opts.ItemStyleOpts(color=JsCode(color_js)))
            .set_global_opts(
            title_opts=opts.TitleOpts(title='云南省各城市4A-5A景区数量'),
            datazoom_opts=[opts.DataZoomOpts(), opts.DataZoomOpts(type_='inside')],
        )
    )
b2.render_notebook()

Conclusion: Xishuangbanna Dai Autonomous Prefecture has the most 4A and 5A scenic spots, followed by the provincial capital Kunming, and then Lijiang City and Dali Bai Autonomous Prefecture. You can use this ranking as a data-based suggestion when choosing where to go in Yunnan.
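
The filter-then-count step behind this chart can be tried on a few invented rows. In this sketch the ratings are made up; the column names 城市 and 星级 are the ones from the dataset.

```python
import pandas as pd

# Toy frame; ratings are invented.
toy = pd.DataFrame({
    '城市': ['昆明', '昆明', '丽江', '大理', '丽江'],
    '星级': ['5A', '待定', '4A', '3A', '5A'],
})

# Keep only the rows rated 4A or 5A, then count them per city.
high_rated = toy[toy['星级'].isin(['4A', '5A'])]
counts = high_rated.groupby('城市')['星级'].count()

counts_dict = {city: int(n) for city, n in counts.items()}
print(counts_dict)
```

A city with no 4A or 5A spot (大理 in this toy data) disappears from the result entirely, so it will not appear on the chart at all rather than showing up with a count of 0.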

3.4 Rose chart of the number of 4A-5A scenic spots in the "South of Colorful Clouds"

Yunnan0 = Yunnan_counts.copy()
Yunnan0.sort_values(ascending=False, inplace=True)
c1 = (
    Pie()
    .add('', [list(z) for z in zip(Yunnan0.index.values.tolist(), Yunnan0.values.tolist())],
         radius=['30%', '100%'],
         center=['50%', '60%'],
         rosetype='area',
         )
    .set_global_opts(title_opts=opts.TitleOpts(title='地区景点数量'),
                     legend_opts=opts.LegendOpts(is_show=False),
                     toolbox_opts=opts.ToolboxOpts())
    .set_series_opts(label_opts=opts.LabelOpts(is_show=True, position='inside', font_size=12,
                                               formatter='{b}: {c}', font_style='italic',
                                               font_weight='bold', font_family='Microsoft YaHei'
                                               ))
)
c1.render_notebook()

3.5 Scatter plot, with drop shadows, of the number of 4A-5A scenic spots in Yunnan Province

item_style = {'normal': {'shadowColor': '#000000', 
                         'shadowBlur': 20,
                         'shadowOffsetX':5, 
                         'shadowOffsetY':15
                         }
              }
s1 = (
        Scatter()
        .add_xaxis(Yunnan_counts.index.values.tolist())
        .add_yaxis('4A-5A景区数量', Yunnan_counts.values.tolist(),symbol_size=50,itemstyle_opts=item_style)
        .set_global_opts(visualmap_opts=opts.VisualMapOpts(is_show=False, 
                                              type_='size',
                                              range_size=[5,50]))
)
s1.render_notebook()

3.6 Map distribution of 4A-5A scenic spots in Yunnan Province

Yunnan_tmp3 = Yunnan[Yunnan['星级'].isin(['4A', '5A'])]
Yunnan_counts = Yunnan_tmp3.groupby('城市').count()['星级']
m2 = (
    Map()
    .add('云南省4A-5A景区分布', [list(z) for z in zip(Yunnan_counts.index.values.tolist(), Yunnan_counts.values.tolist())], '云南')
    .set_global_opts(
    title_opts=opts.TitleOpts(title='云南省地图数据分布'),
    visualmap_opts=opts.VisualMapOpts(max_=12, is_piecewise=True),
    )
)
m2.render_notebook()

3.7 Rose Chart of the Proportion of Ticket Price Range in Yunnan Province

price_level = [0, 50, 100, 150, 200, 250, 300, 350, 400, 500]
label_level = ['0-50', '50-100', '100-150', '150-200', '200-250', '250-300', '300-350', '350-400', '400-500']
price_cut = pd.cut(Yunnan['价格'], price_level, labels=label_level)
Yunnan_price = price_cut.value_counts()
Yunnan_price  # number of attractions in each ticket price range

Conclusion: Yunnan's attractions offer good value for money: star ratings are high, while ticket prices remain affordable.
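
One detail of pd.cut is worth checking on a small example: by default the bins are open on the left, so a price of exactly 0 falls outside the first bin. Whether the real dataset contains free attractions is not stated here, so the sketch below uses invented prices and a shortened, hypothetical set of bins.

```python
import pandas as pd

prices = pd.Series([0, 30, 50, 51, 120, 480])  # invented ticket prices in yuan
bins = [0, 50, 100, 150, 500]
labels = ['0-50', '50-100', '100-150', '150-500']

# Default: intervals are (0, 50], (50, 100], ... so the price 0 gets no bin.
default_cut = pd.cut(prices, bins, labels=labels)
unbinned = int(default_cut.isna().sum())

# include_lowest=True closes the first interval on the left as well.
inclusive_cut = pd.cut(prices, bins, labels=labels, include_lowest=True)
counts = inclusive_cut.value_counts().to_dict()
print(unbinned, counts)
```

If free attractions should be counted in the '0-50' range, passing include_lowest=True is one way to do it.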

p1 = (
    Pie(init_opts=opts.InitOpts(width='800px', height='600px'))
    .add(
        '',
        [list(z) for z in zip(Yunnan_price.index.tolist(), Yunnan_price.values.tolist())],
        radius=['20%', '60%'],
        center=['40%', '50%'],
        rosetype='radius',
        label_opts=opts.LabelOpts(is_show=True),
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title='门票价格占比', pos_left='33%', pos_top="5%"),
        legend_opts=opts.LegendOpts(type_='scroll', pos_left="80%", pos_top="25%", orient="vertical"),
    )
    # position belongs inside LabelOpts, not on set_series_opts itself
    .set_series_opts(label_opts=opts.LabelOpts(formatter='{b}: {c} ({d}%)', position='outside'))
)
p1.render_notebook()  # rose chart of ticket price shares

3.8 Scatter plot of the number of attractions in each ticket price range in Yunnan

color_js = """new echarts.graphic.RadialGradient(
                    0.5, 0.5, 1,
                    [{offset: 0,
                      color: '#009ad6'},
                     {offset: 1,
                      color: '#ed1941'}
                      ])"""
 
s2 = (
    Scatter()
    .add_xaxis(Yunnan_price.index.tolist())
    .add_yaxis('门票价格区间', Yunnan_price.values.tolist(), symbol_size=50, itemstyle_opts=opts.ItemStyleOpts(color=JsCode(color_js)))
    .set_global_opts(
        yaxis_opts=opts.AxisOpts(name='数量'),
        xaxis_opts=opts.AxisOpts(name='价格区间(元)'),
        visualmap_opts=opts.VisualMapOpts(
            is_show=False,
            # Represent the data through symbol size
            type_='size',
            # Range that the symbol size is mapped to
            range_size=[5, 50],
        ),
    )
)
s2.render_notebook()

3.9 Word cloud of attraction introductions in the "South of Colorful Clouds"

contents = "".join('%s' % i for i in Yunnan['简介'].values.tolist())
contents_list = jieba.cut(contents)
ac = Counter(contents_list)
 
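stopwords = []
with open('stopwords.txt', "r", encoding='utf-8') as f:  # open the stop-word file
    data = f.read()  # read the whole file
    stopwords = data.split('\n')

# Drop stop words from the counts (Counter ignores missing keys on del)
for i in stopwords:
    del ac[i]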
 
w1 = (
    WordCloud()
    .add("", 
         ac.most_common(150), 
         word_size_range=[5, 100], 
         textstyle_opts=opts.TextStyleOpts(font_family="cursive"),
        shape='star')
    .set_global_opts(title_opts=opts.TitleOpts(title="景点简介词云"))
)
w1.render_notebook()
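
The counting step behind the word cloud can be tried without jieba or a stop-word file. In this sketch the token list stands in for the output of jieba.cut(), and both the tokens and the stop-word set are invented for illustration.

```python
from collections import Counter

# Pretend these tokens came from jieba.cut(); they are invented.
tokens = ['古城', '的', '雪山', '古城', ',', '的', '湖泊', '雪山', '古城']
stopwords = {'的', ',', '了'}

# Filtering before counting keeps stop words and blank tokens out of the Counter.
freq = Counter(t for t in tokens if t not in stopwords and t.strip())

# most_common(n) returns (word, count) tuples, the shape the word cloud expects.
top_words = freq.most_common(2)
print(top_words)
```

Holding the stop words in a set makes each membership test cheap, which helps when the stop-word list is long.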

Everyone is welcome to visit Yunnan while learning to code!

Origin blog.csdn.net/qq_58012062/article/details/128084811