Use Plotly to draw various charts

Plotly section

Installing Plotly

Plotly is relatively new and may not be included in your Anaconda environment, so it needs to be installed separately.
In PyCharm or Anaconda Navigator, search for plotly and click Install.

From a terminal: pip3 install plotly
From the command line: pip install plotly
In an Anaconda environment: conda install plotly
If the download is slow, you can use the Tsinghua mirror: pip install -i https://pypi.tuna.tsinghua.edu.cn/simple plotly

Check if the installation is successful

import plotly
from plotly import __version__
print(__version__)

If a version number is printed, the installation succeeded. I am using version 4.14.3.
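
If you would rather check for the package without triggering an ImportError, the standard library can look up the installed version. This is a minimal sketch; installed_version is a hypothetical helper name, not part of Plotly.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string such as 4.14.3, or None if plotly is not installed
print(installed_version("plotly"))
```

A None result means pip or conda did not install the package into the interpreter you are running.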

Use Plotly to draw the first graph

Use the offline module: plotly.offline
The easiest way to draw is iplot([data]). The biggest difference between plot and iplot is whether a new web page is created to display the chart, which will be demonstrated later.
First, look at how iplot is used.

from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
# A trace can be a plain dict holding the x and y values
dic1 = {
    'x': [1, 2, 3, 4],
    'y': [1, 3, 5, 8]
}
iplot([dic1])
# Try plot([dic1]) to see the difference


Compared with a chart drawn by matplotlib, the chart drawn by Plotly differs in two obvious ways:
1. Hovering over the chart displays the specific data values (interactivity);
2. The upper right corner offers many tools for inspecting the figure in detail (extensibility).

Attempt to plot large amounts of data

Use go.Scatter to store data

import plotly.graph_objects as go
import numpy as np
import random
x = np.random.randn(30)
y = np.random.randn(30)
go.Scatter(x=x,y=y)

When there is a lot of data, it is usually stored in a go.Scatter object first and then passed to the plotting call.

Call go.Scatter data to draw a scatter plot

iplot([go.Scatter(x=x,y=y)])
# iplot([data]): note that the data goes inside square brackets

The result is messy because the points are joined by lines. We only need a scatter plot, so set the mode: mode='markers'

iplot([go.Scatter(x=x,y=y,mode='markers')])
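
Since iplot also accepts plain dictionaries (as with dic1 earlier), the same kind of trace can be described without any Plotly class. The sketch below assumes a hypothetical helper, make_trace; 'type' and 'mode' are ordinary trace attributes.

```python
def make_trace(x, y, mode="lines"):
    """Build a scatter trace as a plain dict, the same shape as dic1 plus type and mode."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return {"type": "scatter", "x": list(x), "y": list(y), "mode": mode}


trace = make_trace([1, 2, 3, 4], [1, 3, 5, 8], mode="markers")
print(trace)
# In a notebook, iplot([trace]) would draw it (not called here).
```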


Scatterplot

Standard way to write a go object

import plotly
import plotly.graph_objs as go
import numpy as np
from plotly.offline import download_plotlyjs , init_notebook_mode,plot ,iplot
n =1000
x = np.random.randn(n)
y = np.random.randn(n)
trace = go.Scatter(x=x, y=y, mode='markers', marker=dict(color='red',size=3,opacity=0.5))
data=[trace]
iplot(data)

The standard way of writing a go.Scatter statement:
Step 1: generate the data.
Step 2: put the data into a go object. Assign the result of go.Scatter() to a variable (trace in this example) and adjust the details inside go.Scatter: marker=dict() sets the color, size, and opacity of the scatter points.
Step 3 (optional): create a variable, data, to store the go object. data is a list, so it can hold more than one go object.
Step 4: draw with iplot(data).

Pie chart

groups=['Meals','Bills','Entertainment','Other']
amount=[1000,500,1500,300]
colors=['#d32c58','#f9b1ee','#b7f9b1','#b1f5f9']
trace=go.Pie(labels=groups, values=amount)
data=[trace]
iplot(data)

Enrich the details:

trace=go.Pie(labels=groups, values=amount, hoverinfo='label+percent', textinfo='value',
             textfont=dict(size=25), marker=dict(colors=colors,line=dict(color='#000000',width=3)))
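# hoverinfo='label+percent': show the label and percentage on hover
# textinfo='value': show the value as text on the pie
# textfont=dict(size=25): text size 25
# marker=dict(colors=colors,line=dict(color='#000000',width=3)): use the colors in colors; outline the slices in black with width 3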
data=[trace]
iplot(data)

You can also display only the slices you need, for example by clicking legend entries to hide the others; the percentages are then recalculated.
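
The recalculation is plain arithmetic: each visible slice is divided by the sum of the visible slices only. A small sketch, using English labels as stand-ins for the pie's groups; percentages is a hypothetical helper, not a Plotly function.

```python
def percentages(labels, values, hidden=()):
    """Return each visible label's share of the visible total, in percent."""
    visible = [(label, value) for label, value in zip(labels, values) if label not in hidden]
    total = sum(value for _, value in visible)
    return {label: round(100 * value / total, 1) for label, value in visible}


groups = ["Meals", "Bills", "Entertainment", "Other"]
amount = [1000, 500, 1500, 300]

all_slices = percentages(groups, amount)
without_other = percentages(groups, amount, hidden={"Other"})
print(all_slices)
print(without_other)
```

With all four slices, Entertainment is 1500 of 3300; with "Other" hidden it becomes 1500 of 3000, which is why its share grows.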

Draw using custom data (PUBG game data)

Data Sources

Website: kaggle.com

Use Pandas together with Plotly

Use Pandas to read csv file data:

from plotly.offline import download_plotlyjs , init_notebook_mode,plot ,iplot
import plotly.graph_objs as go
import pandas as pd
pubg = pd.read_csv("PUBG.csv")
pubg.head()

Visualization requires numeric data; you can check the column types with info():

pubg.info()

View fields:

pubg.columns


Working with data structures

df_pubg = pubg.apply(pd.to_numeric,errors = 'ignore')
# Convert all columns to numeric types where possible; columns that cannot be converted are left unchanged
df_new_pubg = df_pubg.head(100)

Draw a scatterplot

trace = go.Scatter(x=df_new_pubg.solo_RoundsPlayed, y=df_new_pubg.solo_Wins, name='Rounds Won', mode='markers')
# Configure the output:
# plot_bgcolor: background color
# showlegend=True: show the legend
layout = go.Layout(title="PUBG wins vs rounds played", plot_bgcolor='rgb(230,230,230)', showlegend=True)
# Combine trace and layout into one figure
fig = go.Figure(data=[trace], layout=layout)
iplot(fig)


Draw a grouped bar chart from two sets of data

trace1 = go.Bar(x=df_new_pubg.player_name, y=df_new_pubg.solo_RoundsPlayed, name='Rounds Play')
trace2 = go.Bar(x=df_new_pubg.player_name, y=df_new_pubg.solo_Wins, name='Wins')
layout = go.Layout(barmode='group')
fig = go.Figure(data=[trace1,trace2] , layout=layout)
iplot(fig)


2D density map

This section reuses the PUBG data file from the previous section.

from plotly.offline import download_plotlyjs , init_notebook_mode,plot ,iplot
import plotly.graph_objs as go
import pandas as pd
pubg = pd.read_csv("PUBG.csv")
df_pubg = pubg.apply(pd.to_numeric,errors = 'ignore')
df_new_pubg = df_pubg.head(100)
import plotly.figure_factory as ff

A 2D chart requires two sets of data:

x = df_new_pubg.solo_Wins
y = df_new_pubg.solo_TimeSurvived

Setting parameters:

colorscale = ['#7A4579','#D56073','rgb(236,158,105)',(1,1,0.2),(0.98,0.98,0.98)]

First draw without extra parameters to see the default result:

fig=ff.create_2d_density(x,y)
iplot(fig ,filename='histgram_subplot')

Color optimization via palette:

fig = ff.create_2d_density(x,y , colorscale= colorscale)

The color of the density map and the color of the histograms are now inconsistent, so adjust the histogram color as well:

fig = ff.create_2d_density(x,y , colorscale= colorscale ,hist_color='rgb(255,237,222)' , point_size= 5)


3D scatter plot

A 3D plot requires three sets of data:

x = df_new_pubg.solo_Wins
y = df_new_pubg.solo_TimeSurvived
z = df_new_pubg.solo_RoundsPlayed

Use a go object and assign it to trace1:

trace1 = go.Scatter3d(
    x=x,
    y=y,
    z=z,
    mode='markers'
)
data=[trace1]
fig=go.Figure(data=data)
iplot(fig)

Refine the marker parameters:

trace1 = go.Scatter3d(
    x=x,
    y=y,
    z=z,
    mode='markers',
    marker=dict(
        size=12,
        color=z,
        colorscale='Viridis',  # use the Viridis palette
        opacity=0.8,
        showscale=True  # show the color bar
    )
)
data = [trace1]  # rebuild data so that later figures use the new trace

The lighter the color, the more rounds were played.
Then add a layout:

layout = go.Layout(margin=dict(
    l=0,
    r=0,
    t=0,
    b=0
))
fig = go.Figure(data=data , layout=layout)
iplot(fig,filename='3d')


Online plotting

Interactive visualization on the web is one of Plotly's most powerful features.
First, click Sign Up on the official Plotly website to register an account. After logging in, click Settings, find the API section, and generate a temporary API key. Then install the chart_studio library and start uploading figures:

import chart_studio
import chart_studio.plotly as py
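chart_studio.tools.set_credentials_file(username='your_username',api_key='your_api_key')
# Enter the username registered on the website and the API key generated there
init_notebook_mode(connected=True)
# Connect the notebook to the online resources
# Reuse the earlier code, but draw with py.iplot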
fig = go.Figure(data=data , layout=layout)
py.iplot(fig,filename='3d')
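
Hardcoding an API key in a notebook makes it easy to leak when the notebook is shared. One possible alternative is to read the credentials from environment variables. This is a sketch under assumptions: the variable names PLOTLY_USERNAME and PLOTLY_API_KEY and the helper load_credentials are made up for illustration.

```python
import os


def load_credentials(env=os.environ):
    """Read the Chart Studio username and API key from a mapping of environment variables."""
    username = env.get("PLOTLY_USERNAME")  # hypothetical variable name
    api_key = env.get("PLOTLY_API_KEY")    # hypothetical variable name
    if not username or not api_key:
        raise RuntimeError("Set PLOTLY_USERNAME and PLOTLY_API_KEY first")
    return {"username": username, "api_key": api_key}


# A dict stands in for the real environment so the example runs anywhere
creds = load_credentials({"PLOTLY_USERNAME": "demo_user", "PLOTLY_API_KEY": "demo_key"})
print(creds["username"])
# chart_studio.tools.set_credentials_file(**creds) would then store them.
```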

Click the EDIT button in the lower right corner to open the web page, where the figure has already been uploaded. Click Save to store it.
After confirming the content, click Save again.
The figure is now stored in your personal Files.
Click Viewer to browse it, or click Editor to edit it.
Similarly, you can browse figures made by others to learn how they were built.

Real-time financial data plotting

This case uses Apple and Tesla stocks as examples.

Import data using Pandas

from plotly.offline import download_plotlyjs , init_notebook_mode,plot ,iplot
import plotly.graph_objects as go
import pandas as pd
df = pd.read_csv('APPL.csv')
df.head()

You can see that the data includes the date, opening and closing prices, volume, and other fields.

Draw a chart

Use a go object to store the data:

trace1=go.Scatter(
    x=df['Date'],
    y=df['AAPL.Close']
)
iplot([trace1])

Use the same method to draw Tesla. Here we plot the highest and lowest prices for Tesla stock.

df2 = pd.read_csv('Tesla.csv')
trace_a = go.Scatter(
         x = df2.date,
         y = df2.high,
         name = "Tesla High",
         line = dict(color = '#17BECF'),
         opacity =0.8

)
trace_b = go.Scatter(
         x = df2.date,
         y = df2.low,
         name = "Tesla Low",
         line = dict(color = '#7f7f7f'),
         opacity =0.8

)
data=[trace_a, trace_b]
iplot(data)

Make some adjustments to the layout:

layout = dict(title = "Tesla stock High vs Low")
fig = dict(data = data,layout = layout)
iplot(fig)

Then add a close price line:

trace_c = go.Scatter(
         x = df2.date,
         y = df2.close,
         name = "Tesla Close",
         line = dict(color = '#7f7f7f'),
         opacity =0.8
)
data =[trace_a,trace_b,trace_c]


Introducing financial features - range slider

import plotly.express as px
fig = px.line(df2 , x='date',y='close') # plot df2 with date on the x-axis and close on the y-axis
fig.update_xaxes(rangeslider_visible=True) # show the range slider
fig.show()

Look at Apple's data:

fig = px.line(df, x='Date', y='AAPL.High', title='Time Series with Rangeslider')
fig.update_xaxes(rangeslider_visible=True)
fig.show()


Introducing financial features - shortcut buttons for 1 day, 5 days, and other ranges

fig = px.line(df, x='Date', y='AAPL.High', title='Time Series with Rangeslider')

fig.update_xaxes(
    rangeslider_visible=True,
    rangeselector=dict(
        buttons=list([
            dict(count=1, label="1d", step="day", stepmode="backward"),
            dict(count=5, label="5d", step="day", stepmode="backward"),
            dict(count=1, label="1m", step="month", stepmode="backward"),
            dict(count=3, label="3m", step="month", stepmode="backward"),
            dict(count=6, label="6m", step="month", stepmode="backward"),
            dict(count=1, label="1y", step="year", stepmode="backward"),
            dict(step="all")  # reset to the full range
        ])
    )
)
fig.show()
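
With stepmode="backward", each button shows a window that ends at the latest date and reaches back count steps. The sketch below illustrates that idea with the standard library only. window_start is a hypothetical helper, and clamping to the end of a shorter month is this sketch's choice; Plotly's own handling of such edge cases may differ.

```python
import calendar
from datetime import date, timedelta


def window_start(end, count, step):
    """Return the date that lies `count` steps (day, month, or year) before `end`."""
    if step == "day":
        return end - timedelta(days=count)
    if step == "year":
        count, step = 12 * count, "month"
    if step == "month":
        months = end.year * 12 + (end.month - 1) - count
        year, month_index = divmod(months, 12)
        last_day = calendar.monthrange(year, month_index + 1)[1]
        return date(year, month_index + 1, min(end.day, last_day))
    raise ValueError(f"unsupported step: {step}")


end = date(2017, 2, 16)  # an example end date, not read from the csv file
print(window_start(end, 5, "day"))
print(window_start(end, 6, "month"))
print(window_start(end, 1, "year"))
```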


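Introducing financial features - candlestick charts

The general form of a candlestick trace:

go.Candlestick(
    x=...,      # dates
    open=...,   # opening prices
    high=...,   # highest prices
    low=...,    # lowest prices
    close=...   # closing prices
)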
fig = go.Figure(data=[go.Candlestick(
                        x=df['Date'],
                        open=df['AAPL.Open'],
                        high=df['AAPL.High'],
                        low=df['AAPL.Low'],
                        close=df['AAPL.Close']
                        )
                     ]
                )
fig.update_xaxes(rangeslider_visible=True,
                 rangeselector = dict(
                     buttons=list([
                         dict(count=1,label="1d",step="day",stepmode="backward"),
                         dict(count=5,label="5d",step="day",stepmode="backward"),
                         dict(count=1,label="1m",step="month",stepmode="backward"),
                         dict(count=3,label="3m",step="month",stepmode="backward"),
                         dict(count=6,label="6m",step="month",stepmode="backward"),
                         dict(count=1,label="1y",step="year",stepmode="backward"),
                         dict(step="all")                    
                 ])
                 )               
                )
fig.show()
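
A candlestick summarizes four prices per period, and its color shows whether the close ended above or below the open. A minimal sketch of that classification; candle_direction is a hypothetical helper, the rows are made-up numbers rather than values from APPL.csv, and treating equal prices as increasing is an assumption of this sketch.

```python
def candle_direction(open_price, close_price):
    """Classify a candle by comparing the close with the open."""
    return "increasing" if close_price >= open_price else "decreasing"


# Hypothetical (date, open, close) rows for illustration only
rows = [
    ("2015-02-17", 127.49, 127.83),
    ("2015-02-18", 127.63, 128.72),
    ("2015-02-19", 128.48, 128.45),
]
directions = [candle_direction(open_price, close_price) for _, open_price, close_price in rows]
print(directions)
```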


Introducing Financial Characteristics - Indicators

import cufflinks as cf
cf.set_config_file(offline=True,world_readable=True) # offline=True: plot locally from Python instead of online

Generate candlestick (K-line) data with the built-in data generator of cf:

df= cf.datagen.ohlc()
df.head()

qf=cf.QuantFig(df) # wrap the data in df as a financial figure

Use qf.iplot() to draw the K-line chart:

qf.iplot()


Add the MACD indicator

Add the MACD indicator with qf.add_macd():

qf.add_macd()
qf.iplot()
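
For readers unfamiliar with the indicator, MACD is the difference between a fast and a slow exponential moving average of the closing price, plus a signal line that smooths that difference. The sketch below uses the conventional 12/26/9 periods and a simple EMA seeded with the first value; it is not the cufflinks implementation, whose defaults and seeding may differ.

```python
def ema(values, span):
    """Exponential moving average seeded with the first value."""
    alpha = 2 / (span + 1)
    out = [values[0]]
    for value in values[1:]:
        out.append(alpha * value + (1 - alpha) * out[-1])
    return out


def macd(closes, fast=12, slow=26, signal=9):
    """Return the MACD line, the signal line, and the histogram as lists."""
    macd_line = [f - s for f, s in zip(ema(closes, fast), ema(closes, slow))]
    signal_line = ema(macd_line, signal)
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram


closes = [float(price) for price in range(1, 41)]  # a steadily rising toy series
macd_line, signal_line, histogram = macd(closes)
print(macd_line[-1] > 0)  # a rising series keeps the fast average above the slow one
```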


Add the RSI indicator

Add the RSI indicator with qf.add_rsi(). Choose the parameter values to suit your needs:

qf.add_rsi(6,80) # 6-day period, upper trigger level of 80
qf.iplot()
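
RSI compares average gains with average losses over the chosen period and maps the ratio onto a 0 to 100 scale, which is why a trigger value such as 80 marks an "overbought" level. A minimal sketch of the simple-average variant over the last periods changes; the cufflinks calculation may use a different smoothing.

```python
def rsi(closes, periods=6):
    """Relative Strength Index over the last `periods` price changes (simple averages)."""
    if len(closes) <= periods:
        raise ValueError("need more closing prices than periods")
    changes = [later - earlier for earlier, later in zip(closes, closes[1:])]
    window = changes[-periods:]
    avg_gain = sum(change for change in window if change > 0) / periods
    avg_loss = sum(-change for change in window if change < 0) / periods
    if avg_loss == 0:
        return 100.0  # no losses in the window
    relative_strength = avg_gain / avg_loss
    return 100 - 100 / (1 + relative_strength)


print(rsi([10, 11, 10, 11, 10, 11, 10]))  # equal gains and losses
print(rsi([1, 2, 3, 4, 5, 6, 7]))         # gains only
```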


Add Bollinger Bands

Add Bollinger Bands with qf.add_bollinger_bands():

qf.add_bollinger_bands()
qf.iplot()
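
Bollinger Bands are a moving average with an upper and a lower band placed a number of standard deviations away. The sketch below uses the common 20-period, 2-deviation convention with the population standard deviation; these defaults are assumptions and may not match what cufflinks uses.

```python
from statistics import mean, pstdev


def bollinger(closes, periods=20, k=2):
    """Return (lower, middle, upper) for each full window of `periods` closing prices."""
    bands = []
    for end in range(periods - 1, len(closes)):
        window = closes[end - periods + 1 : end + 1]
        middle = mean(window)
        spread = k * pstdev(window)
        bands.append((middle - spread, middle, middle + spread))
    return bands


bands = bollinger([1, 2, 3, 4, 5], periods=5, k=2)
print(bands)
```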

You can also click an entry in the upper right corner to turn off an indicator.

Use heatmap to draw a heat map

Import the libraries:

import pandas as pd
import numpy as np
import chart_studio.plotly as py
import seaborn as sns
import plotly.express as px
# Embed all matplotlib charts in the notebook page
%matplotlib inline
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)

Import data using Pandas

flights = sns.load_dataset("flights")
flights.head()

This uses a data set that ships with seaborn, and loading it may fail with an IncompleteRead error. Workaround:
1. Download the csv file manually from its download address.
2. Put the downloaded flights.csv into the seaborn-data folder.

View the data types: flights.info()

Draw a heat map

A heat map requires three pieces of data.

fig = px.density_heatmap(flights, x ='year' , y ='month' , z= 'passengers')
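
The three pieces of data are the two axes and the value that colors each cell. Conceptually, the rows are pivoted into a grid keyed by (x, y), and the z values that fall into the same cell are aggregated. The sketch below sums them, which to my understanding is the default when z is given. pivot_sum is a hypothetical helper, and the rows only mimic the shape of the flights data.

```python
from collections import defaultdict


def pivot_sum(rows, x_key, y_key, z_key):
    """Sum z for every (x, y) cell, giving one value per cell of the heat map."""
    grid = defaultdict(float)
    for row in rows:
        grid[(row[x_key], row[y_key])] += row[z_key]
    return dict(grid)


# A few sample rows shaped like the flights data set
rows = [
    {"year": 1949, "month": "Jan", "passengers": 112},
    {"year": 1949, "month": "Feb", "passengers": 118},
    {"year": 1950, "month": "Jan", "passengers": 115},
]
grid = pivot_sum(rows, "year", "month", "passengers")
print(grid)
```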

Change the color with color_continuous_scale='<colorscale name>'
The available options are the following named colorscales:
['aggrnyl', 'agsunset', 'algae', 'amp', 'armyrose', 'balance',
'blackbody', 'bluered', 'blues', 'blugrn', 'bluyl', 'brbg',
'brwnyl', 'bugn', 'bupu', 'burg', 'burgyl', 'cividis', 'curl',
'darkmint', 'deep', 'delta', 'dense', 'earth', 'edge', 'electric',
'emrld', 'fall', 'geyser', 'gnbu', 'gray', 'greens', 'greys',
'haline', 'hot', 'hsv', 'ice', 'icefire', 'inferno', 'jet',
'magenta', 'magma', 'matter', 'mint', 'mrybm', 'mygbm', 'oranges',
'orrd', 'oryel', 'oxy', 'peach', 'phase', 'picnic', 'pinkyl',
'piyg', 'plasma', 'plotly3', 'portland', 'prgn', 'pubu', 'pubugn',
'puor', 'purd', 'purp', 'purples', 'purpor', 'rainbow', 'rdbu',
'rdgy', 'rdpu', 'rdylbu', 'rdylgn', 'redor', 'reds', 'solar',
'spectral', 'speed', 'sunset', 'sunsetdark', 'teal', 'tealgrn',
'tealrose', 'tempo', 'temps', 'thermal', 'tropic', 'turbid',
'turbo', 'twilight', 'viridis', 'ylgn', 'ylgnbu', 'ylorbr',
'ylorrd']
Try viridis:

fig = px.density_heatmap(flights, x ='year' , y ='month' , z= 'passengers' , color_continuous_scale='viridis')


Statistics with histogram

To show the totals along the x and y axes, add marginal_x="histogram" and marginal_y="histogram":

fig = px.density_heatmap(flights, x ='year' , y ='month' , z= 'passengers' ,marginal_x="histogram"   ,marginal_y="histogram")


Reflect the heat map with a 3D line chart

fig = px.line_3d(flights , x ='year' , y ='month' , z= 'passengers' ,color='year') # color='year': each year's data gets a different color

This is consistent with what the heat map shows: as the years go by, the number of passengers increases, and the July values are generally the largest.
If you look at only a few of the years, the pattern of change becomes even more intuitive.

Use scatter_3d to draw a 3D scatter plot

fig = px.scatter_3d(flights , x ='year' , y ='month' , z= 'passengers' ,color='year')

This is similar to the result shown in the line chart.

Use scatter_matrix to draw a scatter matrix

Sometimes, however, we want to show the information of a 3D graph using 2D graphs. The 3D graph above involves three variables x, y, and z. To show it in 2D, the pairwise relationships xy, xz, and yz must each be displayed, which takes three graphs. The scatter_matrix function does this.

fig = px.scatter_matrix(flights,color="month")
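
The xy, xz, and yz pairs are simply every combination of two columns. A quick way to enumerate them, using the three flights column names:

```python
from itertools import combinations

columns = ["year", "month", "passengers"]
pairs = list(combinations(columns, 2))
print(pairs)
# [('year', 'month'), ('year', 'passengers'), ('month', 'passengers')]
```

With n columns there are n * (n - 1) / 2 such pairs, so the matrix grows quickly as columns are added.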

Next, we use the classic machine-learning Iris data set for another visual analysis.
The Iris data contains four measurements for each of three species of iris. We want to separate the three species using the data set.

df = px.data.iris()
df.head()

Analyze it with the same scatter matrix:

fig = px.scatter_matrix(df,color="species")

It can be seen that plotting petal_length against petal_width spreads the points apart relatively well.
So draw a scatter plot using just these two features:

fig  = px.scatter(df , x ='petal_length' , y='petal_width' , color='species' )

This plot does not convey enough information, so add another distinguishing dimension: size='petal_length'

fig  = px.scatter(df , x ='petal_length' , y='petal_width' , color='species', size='petal_length')

You will find that setosa is relatively small and virginica is relatively large. This plot separates the blue points cleanly, while the red and green points overlap and are harder to separate. So try 3D.

fig  = px.scatter_3d(df , x ='petal_length' , y='petal_width' ,z='sepal_width' , color='species' ,size='petal_length')

The 3D plot suggests that there is a plane in space that can separate the three species.

Use scatter_geo to draw geographic scatter plots

Use a data set bundled with px

Use the 2007 rows of the built-in gapminder data set:

df = px.data.gapminder().query("year == 2007")
df.head()


Mapping geographic information

fig = px.scatter_geo(df,locations="iso_alpha")
# locations='iso_alpha': match each row to a country automatically by its ISO alpha-3 code

To make the display more readable, add other parameters:

# color="continent": a different color for each continent
# hover_name="lifeExp": show the lifeExp value from the data set on hover
# size='pop': scale the markers by the pop column of the data set
# projection='orthographic': use the globe (orthographic) projection
fig = px.scatter_geo(df,locations="iso_alpha",color="continent",hover_name="lifeExp",size='pop',projection='orthographic')

The available projection modes are the following enumeration values:
['equirectangular', 'mercator', 'orthographic', 'natural earth',
'kavrayskiy7', 'miller', 'robinson', 'eckert4',
'azimuthal equal area', 'azimuthal equidistant', 'conic equal area',
'conic conformal', 'conic equidistant',
'gnomonic', 'stereographic', 'mollweide', 'hammer',
'transverse mercator', 'albers usa', 'winkel tripel',
'aitoff', 'sinusoidal']

Use choropleth function to draw map information

import pandas as pd
import numpy as np
import plotly
import plotly.graph_objects as go
import chart_studio.plotly as py
import plotly.express as px

Drawing a map requires data with location information (latitude and longitude coordinates); if the data has none, GeoJSON information must be supplied.
For example, Chengdu in a csv file does not represent Chengdu on the map. It is just a character string. For the program to recognize it as Chengdu on the map, the data must include Chengdu's extent (its span of latitude and longitude, and so on).
As a specific example, to define the extent of a polygon, we use 5 points to delineate it, which forms the geographic information of a polygon in space. All area information can be described in the same way.
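
In GeoJSON, a polygon is a ring of [longitude, latitude] positions whose first and last positions are identical, which is why a four-cornered area is written with 5 points. A minimal sketch that builds such a feature; polygon_feature is a hypothetical helper and the coordinates are rough illustrative values, not a surveyed outline.

```python
import json


def polygon_feature(feature_id, corners):
    """Build a GeoJSON Feature whose polygon ring is closed (first point repeated last)."""
    ring = [list(point) for point in corners]
    if ring[0] != ring[-1]:
        ring.append(list(ring[0]))
    return {
        "type": "Feature",
        "id": feature_id,
        "properties": {},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }


# Illustrative (longitude, latitude) corners only
corners = [(104.06, 30.66), (104.07, 30.66), (104.07, 30.65), (104.06, 30.65)]
feature = polygon_feature("tianfu_square", corners)
print(json.dumps(feature, indent=2))
```

The id field is what a choropleth later matches against the locations column, so it should use the same spelling as the data.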

Draw geographic areas using geojson functions

Outline an area on the geojson website

To obtain GeoJSON data, you can outline an area on the geojson website.
Find the place you want (take Tianfu Square in the center of Chengdu as an example) and draw a frame around it. The framed area produces data in GeoJSON format on the right; click Save to store it in geojson format.

Use the folium library to outline an area in a Jupyter notebook

First, install the third-party library folium; it is installed the same way as other third-party libraries.

from folium.plugins import Draw
import folium

m = folium.Map()
draw = Draw(export=True,filename="tianfu_square.geojson")
draw.add_to(m)
m  # display the map in the notebook

Click Export to save the outline as a local geojson file.

Use choroplethmapbox to draw beautiful maps

A data set is prepared here:

geo = pd.read_csv("Geography.csv")

There are only two ways to locate something on the map: 1. give explicit (x, y) coordinates; 2. use a file in GeoJSON format. Giving coordinates for every city is clearly too laborious, so the second method is used here.
Someone has published a geojson file of city-level regions across China online, so we load it directly.

import json
with open('china_geojson.json') as file:
    china = json.load(file)

Draw the figure:

# geojson: the GeoJSON data
# locations: the ids that match features on the map
# z: the values to color by
fig = go.Figure(go.Choroplethmapbox(geojson=china,locations=geo.Regions,z=geo.followerPercentage
                                   ,colorscale='Cividis'))
# The map does not render as-is; fig.update_layout() must set the map style first
fig.update_layout(mapbox_style="carto-positron",mapbox_zoom=3,mapbox_center={"lat": 35.9, "lon": 104.2})
fig.update_layout(margin={"t": 0, "b": 0, "l": 0, "r": 0})
fig

This map looks fine, but the place names in Geography.csv are all in English, and we would prefer to display them in Chinese:

geo2 = pd.read_csv('Geography2.csv')
geo2.head(5)

Beijing and Shanghai from the previous data are now written in Chinese; run it again:

fig = go.Figure(go.Choroplethmapbox(geojson=china,locations=geo2.Regions,z=geo2.followerPercentage
                                   ,colorscale='Cividis'))
fig.update_layout(mapbox_style="carto-positron",mapbox_zoom=3,mapbox_center={"lat": 35.9, "lon": 104.2})
fig.update_layout(margin={"t": 0, "b": 0, "l": 0, "r": 0})
fig

Neither Beijing nor Shanghai is displayed.
This is because the ids in our geojson file are also in English. Once Beijing and Shanghai are changed to Chinese in the data, they no longer match the geojson ids, so they cannot be displayed.
If you change the ids in the geojson file to Chinese as well, they display correctly.
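
A quick way to catch this kind of mismatch before plotting is to compare the location names with the feature ids. A minimal sketch; unmatched_locations is a hypothetical helper, and the tiny FeatureCollection stands in for china_geojson.json, whose real structure is assumed to store the region name in each feature's id.

```python
def unmatched_locations(locations, geojson):
    """Return the locations that have no feature with the same id in the GeoJSON data."""
    feature_ids = {feature.get("id") for feature in geojson["features"]}
    return [location for location in locations if location not in feature_ids]


# A stand-in for the real file, with English ids
china_sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "Beijing", "properties": {}, "geometry": None},
        {"type": "Feature", "id": "Shanghai", "properties": {}, "geometry": None},
    ],
}
missing = unmatched_locations(["Beijing", "上海"], china_sample)
print(missing)
```

Any name in the result will be silently left blank on the map.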

Draw an (x, y) location map with folium

We want to draw a map with markers positioned by coordinates. First import a data set:

df = pd.read_csv('geo_pandas.txt')


We take the first 100 rows of the dataset.

limit=100
df = df.iloc[:limit,:]

We can center the map on a location with folium:

lat = 37.77
long = -122.42
m2 = folium.Map(location=[lat,long],zoom_start=12)

To plot the data points on the map:

# Create a feature group to hold the markers
incidents = folium.map.FeatureGroup()
# Pair up latitude (Y) and longitude (X)
for point_lat, point_long in zip(df.Y, df.X):
    incidents.add_child(
        folium.CircleMarker(  # draw a small circle
            [point_lat, point_long],
            radius=5
        )
    )
m2.add_child(incidents)

It can be styled further:

for point_lat, point_long in zip(df.Y, df.X):
    incidents.add_child(
        folium.CircleMarker(  # draw a small circle
            [point_lat, point_long],
            radius=5,
            fill=True,  # fill the inside of the circle
            fill_color='blue',  # fill color
            color='yellow',  # outline color
            fill_opacity=0.6  # opacity of the fill
        )
    )
m2.add_child(incidents)

We would also like labeled markers (pins) like those in the demo.

for point_lat, point_long in zip(df.Y, df.X):
    incidents.add_child(
        folium.CircleMarker(  # draw a small circle
            [point_lat, point_long],
            radius=5,
            fill=True,  # fill the inside of the circle
            fill_color='blue',  # fill color
            color='yellow',  # outline color
            fill_opacity=0.6  # opacity of the fill
        )
    )
# Add a pin with a popup label for each point
for point_lat, point_long, label in zip(df.Y, df.X, df.Category):
    folium.Marker([point_lat, point_long], popup=label).add_to(m2)
m2.add_child(incidents)

Here we display only the first 100 rows. Displaying all 150,000 rows at once would pack the map so densely that it becomes unreadable. So how can all of them be shown? We can use a clustering approach.

from folium.plugins import MarkerCluster
# Create a new map centered on the same location as m2
m3 = folium.Map(location=[37.77, -122.42],zoom_start=12)
marker_cluster = MarkerCluster().add_to(m3)
# Add the markers to the cluster group rather than to m3 directly
for point_lat, point_long, label in zip(df.Y, df.X, df.Category):
    folium.Marker([point_lat, point_long],popup=label).add_to(marker_cluster)
m3  # display the map in the notebook

As you zoom in and out with the scroll wheel, the map switches between individual markers and clustered counts.
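
To give a feel for what clustering does, the sketch below groups points into square grid cells and reports one centroid and a count per cell. This is only an illustration of the idea: it is not the algorithm MarkerCluster actually uses, grid_clusters is a hypothetical helper, and the points are made-up coordinates.

```python
from collections import defaultdict


def grid_clusters(points, cell_size):
    """Group (lat, lon) points by grid cell and return (mean_lat, mean_lon, count) per cell."""
    cells = defaultdict(list)
    for lat, lon in points:
        key = (int(lat // cell_size), int(lon // cell_size))
        cells[key].append((lat, lon))
    return [
        (
            sum(p[0] for p in members) / len(members),
            sum(p[1] for p in members) / len(members),
            len(members),
        )
        for members in cells.values()
    ]


# Made-up points: two close together and one farther away
points = [(37.771, -122.421), (37.772, -122.422), (37.801, -122.401)]
clusters = grid_clusters(points, cell_size=0.01)
print(clusters)
```

A smaller cell_size plays the role of zooming in: nearby points stop sharing a cell and appear individually.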

Use of dynamic data graphs

Use a normal scatterplot

First import the libraries:

import pandas as pd
import numpy as np
import chart_studio.plotly as py
import cufflinks as cf
import seaborn as sns
import plotly.express as px
%matplotlib inline
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)

Load a data set bundled with px:

df_cnt = px.data.gapminder()
df_cnt.head()

Draw a scatter plot to see the correlation between gdpPercap and lifeExp:

px.scatter(df_cnt,x='gdpPercap' , y = "lifeExp" )

The points are stacked together, and it is impossible to tell the countries apart, so use colors to distinguish them:

px.scatter(df_cnt,x='gdpPercap' , y = "lifeExp" ,color='continent')

It is still hard to tell them apart, so use a logarithmic scale on the x-axis:

px.scatter(df_cnt,x='gdpPercap' , y = "lifeExp" ,color='continent' ,log_x =True)

We also want to hover over a point to see the specific country:

px.scatter(df_cnt,x='gdpPercap' , y = "lifeExp" ,color='continent' ,log_x =True , hover_name="country")

The plot is still quite messy because it holds decades of data. It would be better to show each year separately, which calls for an animated plot.

Use dynamic scatterplots

px.scatter(df_cnt,x='gdpPercap' , y = "lifeExp" ,color='continent' ,log_x =True , hover_name="country",
          animation_frame="year")
# animation_frame="year": play one frame per year

Drag the slider below or click the play button to play through the years. As time advances, however, points move outside the plot area because fixed axis ranges are not set.
Set fixed ranges for both the x-axis and the y-axis at once:

px.scatter(df_cnt,x='gdpPercap' , y = "lifeExp" ,color='continent' ,
           log_x =True , hover_name="country",animation_frame="year",range_x=[100,100000],
           range_y=[25,90])
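
Fixed ranges such as [25, 90] work because they cover the values of every frame, not only the first one. One way to derive such a range is to take the minimum and maximum over all years and add a little padding. A sketch with illustrative numbers; padded_range is a hypothetical helper and the values are not read from the gapminder data.

```python
def padded_range(values, pad=0.05):
    """Return [low, high] covering all values, widened by `pad` times the span on each side."""
    low, high = min(values), max(values)
    margin = (high - low) * pad
    return [low - margin, high + margin]


# Illustrative (min, max) life expectancy for two years
life_exp_by_year = {1952: [28.8, 72.7], 2007: [39.6, 82.6]}
all_values = [value for values in life_exp_by_year.values() for value in values]
range_y = padded_range(all_values)
print(range_y)
```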

One problem remains: the scale behind each point is unclear, for example for the purple points compared with the other colors. So set the size parameter and use population to distinguish them.

px.scatter(df_cnt,x='gdpPercap' , y = "lifeExp" ,color='continent' ,
           log_x =True , hover_name="country",animation_frame="year",range_x=[100,100000],
           range_y=[25,90],size='pop',size_max=60)


Use a normal bar chart

px.bar(df_cnt,x='continent' , y='pop')

Hovering over a segment of a bar does not show which country it represents, so add hover_name="country":

px.bar(df_cnt,x='continent' , y='pop',hover_name="country")

Use a dynamic bar chart

px.bar(df_cnt,x='continent' , y='pop' , hover_name='country' ,color='continent' ,
       animation_frame='year')

This also runs into the y-axis range problem, so set the y-axis range:

px.bar(df_cnt,x='continent' , y='pop' , hover_name='country' ,color='continent' ,
       animation_frame='year',range_y=[0,4000000000],animation_group='country')
# animation_group='country' is loosely similar to GROUP BY in MySQL: rows are grouped by country so each country is tracked across frames

Dynamic density map

fig = px.density_contour(df_cnt, x="gdpPercap", y="lifeExp", color="continent", marginal_y="histogram",
                        animation_frame='year', animation_group='country', range_y=[25,100])


Dynamic heat map

fig = px.density_heatmap(df_cnt, x="gdpPercap", y="lifeExp", marginal_y="histogram",
                        animation_frame='year', animation_group='country', range_y=[25,100])


Dynamic geographic information map

gapminder = px.data.gapminder()
px.choropleth(gapminder,               
              locations="iso_alpha",               
              color="lifeExp",
              hover_name="country",  
              animation_frame="year",    
              color_continuous_scale='Plasma',  
              height=600             
)

Then use a map to explore the trend of the crime rate by US state:

df = pd.read_csv('CrimeStatebyState_1960-2014.csv')
df.head()


px.choropleth(df, 
              locations = 'State_code',
              color="Murder_per100000", # color by murders per 100,000 people
              animation_frame="Year",
              color_continuous_scale="oranges",
              locationmode='USA-states', # use the built-in state boundary geometry
              scope="usa",
              range_color=(0, 20),
              title='Crime by State',
              height=600
             )


Origin blog.csdn.net/D_Ddd0701/article/details/114093346