Use Matplotlib to draw various charts

Matplotlib part

This introduction to Python visualization mainly covers Matplotlib.
Pyecharts and Seaborn may be covered systematically in a future post.

Matplotlib installation

Method 1: Install from the command line. On Windows, run pip install matplotlib; on macOS, run pip3 install matplotlib.
Method 2: Use an Anaconda environment, where Matplotlib is typically preinstalled.

first drawing

import matplotlib.pyplot as plt

x=[0,1,2,3,4]
y=[0,1,2,3,4]

plt.plot(x,y)

These lists correspond to the points (x,y) = (0,0), (1,1), (2,2), (3,3), (4,4).
However, plt.plot(x,y) only issues the drawing command; to display the figure, you need to add a plt.show() call.

import matplotlib
import matplotlib.pyplot as plt

x=[0,1,2,3,4]
y=[0,1,2,3,4]

plt.plot(x,y)
plt.show()

The result is a straight line through the five points.

Title and Axis Names

Title: plt.title('title text')
x-axis label: plt.xlabel('x-axis name')
y-axis label: plt.ylabel('y-axis name')

Note:
Here plt is the alias declared with import matplotlib.pyplot as plt or from matplotlib import pyplot as plt; the import must appear before plt is used.
If you write from matplotlib import pyplot instead, the name must be written out in full, for example pyplot.xlabel('x-axis name').
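As a quick check of the note above, the following sketch (an addition, not part of the original post) shows that both import forms refer to the same module, so plt.xlabel and pyplot.xlabel are the same function. The Agg backend is an assumption made here so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed

import matplotlib.pyplot as plt
from matplotlib import pyplot

# Both names are bound to the very same module object.
same_module = plt is pyplot

# Calls through either name act on the same current axes.
pyplot.xlabel("x-axis name")
plt.title("title text")
ax = plt.gca()
print(same_module, ax.get_xlabel(), ax.get_title())
```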

Writing the code in Jupyter is recommended: in the interactive notebook, figures are displayed without calling plt.show(). Here is a demo:

import matplotlib.pyplot as plt
x=[-1,1,2,3,4]
y=[-1,1,2,3,4]

plt.xlabel('x-axis data')
plt.ylabel('y-axis data')
plt.title('Example 1')
plt.plot(x,y)


Add more detail to the line chart

marker: data point markers

from matplotlib import pyplot as plt

x=[-1,1,2,3,4]
y=[-1,1,2,3,4]

plt.xlabel('x-axis data')
plt.ylabel('y-axis data')
plt.title('Example 1')
plt.plot(x,y)

When drawing line charts in practice, you often need to highlight the individual data points. Set the marker parameter to do so:

plt.plot(x,y,marker='.')

After adding the marker parameter to the plot call, other properties can be adjusted in the same way:
Use the markersize parameter to adjust the point size: plt.plot(x,y,marker='.',markersize=10)
Use the color parameter to adjust the color of the line and points: plt.plot(x,y,marker='.',color='red'); the color can also be given as a HEX code, such as plt.plot(x,y,marker='.',color='#2614e8')
Use the linewidth parameter to adjust the line thickness: plt.plot(x,y,marker='.',linewidth=3)
Use the markeredgecolor parameter to adjust the color of the point border: plt.plot(x,y,marker='.',markeredgecolor='blue')
Use the linestyle parameter to adjust the line style: plt.plot(x,y,marker='.',linestyle='dashed')

Overall effect:
plt.plot(x,y,marker='.',markersize=10,color='red',linewidth=3,markeredgecolor='blue')

draw multiple polylines

from matplotlib import pyplot as plt
dev_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]
dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]
py_dev_y = [45372, 48876, 53850, 57287, 63016,65998, 70003, 70000, 71496, 75370, 83640]
plt.plot(dev_x,dev_y)
plt.plot(dev_x,py_dev_y)

Two lines can be drawn on one chart with two plot calls.
To make it clear which line corresponds to which data, add a legend entry with the label parameter: plt.plot(x_data, y_data, label='name')
The code above then becomes:

from matplotlib import pyplot as plt
dev_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]
dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]
py_dev_y = [45372, 48876, 53850, 57287, 63016,65998, 70003, 70000, 71496, 75370, 83640]
plt.plot(dev_x,dev_y,label='All developers')
plt.plot(dev_x,py_dev_y,label='Python developers')
plt.legend()

Note: To display the legend after setting the label parameter, you need to add a plt.legend() call. Since I am writing in a Jupyter notebook, plt.show() can be omitted. Outside an interactive notebook, add plt.show() at the end so that the chart is displayed when the program runs.
Now add a third data series and use markers to improve the chart:

from matplotlib import pyplot as plt
dev_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]
dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]
py_dev_y = [45372, 48876, 53850, 57287, 63016,65998, 70003, 70000, 71496, 75370, 83640]
js_dev_y = [37810, 43515, 46823, 49293, 53437,56373, 62375, 66674, 68745, 68746, 74583]
plt.plot(dev_x,dev_y,'r--',label='All developers')
plt.plot(dev_x,py_dev_y,'b^--',label='Python developers')
plt.plot(dev_x,js_dev_y,'go--',label='JS developers')
plt.legend()
plt.title('Income of developers by language and age')
plt.xlabel('Age')
plt.ylabel('Income')

A shorthand is used here (the fmt format string):

plt.plot(dev_x,dev_y,[fmt],label='All developers')
#  fmt = '[color][marker][linestyle]'
#  'go--' means color='green', marker='o', linestyle='dashed'
#  linewidth and markersize are not part of fmt; pass them as keyword arguments, e.g. linewidth=2, markersize=12

For details, refer to the official documentation of plot in matplotlib.pyplot for your own Matplotlib version (3.3.2 was used here).
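To make the shorthand concrete, here is a small sketch (an addition to the original post) that draws the same data once with the format string and once with keyword arguments, then compares the marker and line style of the two lines. The data values and the Agg backend are assumptions chosen only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed

import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

# Shorthand: color, marker and line style packed into one format string.
(short_line,) = plt.plot(x, y, "go--")
# Long form: the same three properties as keyword arguments.
(long_line,) = plt.plot(x, y, color="green", marker="o", linestyle="dashed")

same_marker = short_line.get_marker() == long_line.get_marker()
same_style = short_line.get_linestyle() == long_line.get_linestyle()
print(same_marker, same_style)
```

Line width and marker size keep their defaults in both calls unless linewidth= and markersize= are passed explicitly.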

Turn on the grid function

To make values easier to read off the chart, enable the grid with plt.grid():

from matplotlib import pyplot as plt
dev_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]
dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]
py_dev_y = [45372, 48876, 53850, 57287, 63016,65998, 70003, 70000, 71496, 75370, 83640]
js_dev_y = [37810, 43515, 46823, 49293, 53437,56373, 62375, 66674, 68745, 68746, 74583]
plt.plot(dev_x,dev_y,'r--',label='All developers')
plt.plot(dev_x,py_dev_y,'b^--',label='Python developers')
plt.plot(dev_x,js_dev_y,'go--',label='JS developers')
plt.legend()
plt.title('Income of developers by language and age')
plt.xlabel('Age')
plt.ylabel('Income')
plt.grid()


Beautify charts with style files

First, check which styles are available: print(plt.style.available)
['Solarize_Light2', '_classic_test_patch', 'bmh', 'classic', 'dark_background', 'fast', 'fivethirtyeight', 'ggplot', 'grayscale', 'seaborn', 'seaborn-bright', 'seaborn-colorblind', 'seaborn-dark', 'seaborn-dark-palette', 'seaborn-darkgrid', 'seaborn-deep', 'seaborn-muted', 'seaborn-notebook', 'seaborn-paper', 'seaborn-pastel', 'seaborn-poster', 'seaborn-talk', 'seaborn-ticks', 'seaborn-white', 'seaborn-whitegrid', 'tableau-colorblind10']

Now use a style for comparison:

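plt.style.use('tableau-colorblind10')  # apply the style before plotting so it takes effect
plt.rcParams['font.sans-serif'] = ['SimHei']  # font with Chinese glyphs; only needed for Chinese labels
plt.plot(dev_x,dev_y,'r--',label='All developers')
plt.plot(dev_x,py_dev_y,'b^--',label='Python developers')
plt.plot(dev_x,js_dev_y,'go--',label='JS developers')
plt.legend()
plt.title('Income of developers by language and age')
plt.xlabel('Age')
plt.ylabel('Income')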

You can also use the hand-drawn comic style with plt.xkcd(). Note that the fonts used by plt.xkcd() do not include Chinese glyphs, so it is only suitable for charts with English text.

from matplotlib import pyplot as plt
dev_x = [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]
dev_y = [38496, 42000, 46752, 49320, 53200, 56000, 62316, 64928, 67317, 68748, 73752]
py_dev_y = [45372, 48876, 53850, 57287, 63016,65998, 70003, 70000, 71496, 75370, 83640]
js_dev_y = [37810, 43515, 46823, 49293, 53437,56373, 62375, 66674, 68745, 68746, 74583]
plt.xkcd()
plt.plot(dev_x,dev_y,'r--',label='All')
plt.plot(dev_x,py_dev_y,'b^--',label='python')
plt.plot(dev_x,js_dev_y,'go--',label='Js')
plt.grid()
plt.legend()
plt.title('Title')
plt.xlabel('Age')
plt.ylabel('Income')
plt.show()


line chart with shade

Import data using Pandas

import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('data.csv')

The columns used below are Age, All_Devs, and Python.

Draw a line chart

plt.plot(data['Age'],data['All_Devs'],label='All')
plt.plot(data['Age'],data['Python'],label='Python')
plt.legend()


add shadows

Shade the area under a line with plt.fill_between():

plt.fill_between(data['Age'],data['Python'])

With the default opacity, the shading makes the line chart hard to read. Adjust the transparency with alpha=0.2:

plt.fill_between(data['Age'],data['Python'],alpha=0.2)


set threshold

Set the threshold to 60000 with overall_mid=60000; the area between the line and the threshold is then shaded:

overall_mid=60000
plt.fill_between(data['Age'],data['Python'],overall_mid,alpha=0.2)


Conditional statement to filter shadow position

plt.fill_between(data['Age'],data['Python'],overall_mid,where=(data['Python'] > overall_mid),alpha = 0.2)

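The shaded region looks slightly off here because the fill starts and stops at data points rather than where the line actually crosses the threshold. Setting interpolate=True extends the fill to the exact crossing points: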

plt.fill_between(data['Age'],data['Python'],overall_mid,where=(data['Python'] > overall_mid),interpolate=True,alpha = 0.2)
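Because data.csv is not included here, the following self-contained sketch reproduces the same call on a few made-up values. The numbers, the variable names, and the Agg backend are assumptions for illustration only; the where mask is simply a boolean array marking the points above the threshold.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed

import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-ins for data['Age'] and data['Python'].
age = np.array([25, 26, 27, 28, 29, 30])
salary = np.array([45000, 52000, 58000, 63000, 59000, 66000])
threshold = 60000

fig, ax = plt.subplots()
ax.plot(age, salary, label="salary")

# True where the line lies above the threshold.
above = salary > threshold

# Shade only the masked region; interpolate=True extends the fill
# to the points where the line crosses the threshold.
ax.fill_between(age, salary, threshold, where=above,
                interpolate=True, alpha=0.2, label="above threshold")
ax.legend()
print(above.tolist())
```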


add more details

Use color='color name' to control the color of the shaded area, and label to add a legend entry.

plt.fill_between(data['Age'],data['Python'],data['All_Devs'],where=(data['Python'] > data['All_Devs']),interpolate=True,alpha = 0.2,label='Python > All')
plt.fill_between(data['Age'],data['Python'],data['All_Devs'],where=(data['Python'] <= data['All_Devs']),interpolate=True,alpha = 0.2,color='red',label='Python <= All')


Bar chart

Read data using pandas

Use pandas to import data from csv files:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
plt.xkcd()
data = pd.read_csv('data.csv')
data.head()

Each entry in the LanguagesWorkedWith column is a semicolon-separated list of languages.
Count how often each language appears in the LanguagesWorkedWith column:

from collections import Counter
language_responses=data['LanguagesWorkedWith']
cnt = Counter()
for l in language_responses:
    cnt.update(l.split(';'))

Take the 15 most common with cnt.most_common(15):

lang=[]
popularity=[]
for c in cnt.most_common(15):
    lang.append(c[0])
    popularity.append(c[1])
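The counting and extraction steps above can be tried on a tiny sample without the survey file. In this sketch the three response strings are made up for illustration, and zip(*...) is used as a compact alternative to the explicit loop that fills lang and popularity.

```python
from collections import Counter

# Hypothetical stand-in for data['LanguagesWorkedWith'].
responses = ["Python;SQL", "JavaScript;SQL", "Python;SQL;HTML/CSS"]

cnt = Counter()
for response in responses:
    # Each response is one string; split it into individual languages.
    cnt.update(response.split(";"))

# most_common(n) returns (language, count) pairs, highest count first.
top = cnt.most_common(2)

# Unzip the pairs into two parallel sequences for plotting.
lang, popularity = zip(*top)
print(lang, popularity)
```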

Draw a bar chart of the extracted data

Draw a bar chart with plt.bar(x,y):

plt.bar(lang,popularity)
plt.title('Top 15 Languages')
plt.xlabel('Language')
plt.ylabel('Popularity')

The x-axis labels overlap and cannot be read in full. Here are three solutions:
Solution 1: Enlarge the figure with plt.figure(figsize=(10,8)), called before plt.bar().
Solution 2: Rotate the x-axis labels by 60 degrees with plt.xticks(rotation=60).
Solution 3: Swap the x and y axes and draw horizontal bars with plt.barh(lang,popularity).
With horizontal bars, the first item is drawn at the bottom, so the largest value ends up at the bottom. To show the largest value at the top, reverse the data before plotting:

lang.reverse()
popularity.reverse()


stacked column chart

Import Data

minutes = [1, 2, 3, 4, 5, 6, 7, 8, 9]

player1 = [1, 2, 3, 3, 4, 4, 4, 4, 5]
player2 = [1, 1, 1, 1, 2, 2, 2, 3, 4]
player3 = [1, 5, 6, 2, 2, 2, 3, 3, 3]

Draw a simple stacked graph

plt.bar(minutes, player1)
plt.bar(minutes, player2)
plt.bar(minutes, player3)

There is clearly a problem here: the three sets of bars are drawn on top of each other, so some data is hidden. One fix is to offset each set along the x-axis using an index array, which gives grouped bars:

index_x = np.arange(len(minutes))
w= 0.15
plt.bar(index_x-w,player1,width=w)
plt.bar(index_x,player2,width=w)
plt.bar(index_x+w,player3,width=w)

This approach requires setting the bar width and the offsets by hand. A simpler option for stacked data is stackplot, which draws a stacked area chart:

plt.stackplot(minutes, player1, player2, player3)

Enrich the details:

labels=['class1','class2','class3']
colors = ['Blue','Red','Green']
plt.stackplot(minutes,player1,player2,player3,labels=labels,colors=colors)
plt.legend()

To display the labels, add plt.legend(). The position of the legend can be changed with plt.legend(loc=(x, y)) so that it does not overlap the chart content:

plt.legend(loc=(0.1,0.8))
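Going back to the bar version from earlier in this section: the offset bars sit side by side rather than on top of each other. If literally stacked bars are wanted, one possible sketch (an addition here, not the original author's code) passes a running total to the bottom parameter of bar, so that each layer starts where the previous one ends. The Agg backend is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed

import matplotlib.pyplot as plt

minutes = [1, 2, 3, 4, 5, 6, 7, 8, 9]
player1 = [1, 2, 3, 3, 4, 4, 4, 4, 5]
player2 = [1, 1, 1, 1, 2, 2, 2, 3, 4]
player3 = [1, 5, 6, 2, 2, 2, 3, 3, 3]

# Baseline of the third layer: everything stacked below it.
bottom3 = [a + b for a, b in zip(player1, player2)]

fig, ax = plt.subplots()
ax.bar(minutes, player1, label="player1")
ax.bar(minutes, player2, bottom=player1, label="player2")
ax.bar(minutes, player3, bottom=bottom3, label="player3")
ax.legend(loc="upper left")

# Total bar height per minute.
totals = [b + c for b, c in zip(bottom3, player3)]
print(totals)
```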


More ways to use

ages = [18, 19, 21, 25, 26, 26, 30, 32, 38, 45, 55]

If each value in this data set were drawn as its own bar, almost every value would occur only once, so nearly all bars would have the same height. A histogram groups the values into ranges instead: plt.hist(data, bins=number_of_bins)

plt.hist(ages,bins=4)

This divides the data range into four equal-width intervals.
The four intervals here are 18-27.25, 27.25-36.5, 36.5-45.75, and 45.75-55.
The boundaries are easier to see after adding an outline to the bars with edgecolor='color':

plt.hist(ages,bins=4,edgecolor='black')
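The four ranges quoted above can be checked by hand. The sketch below (an illustration added here, written in plain Python rather than calling Matplotlib) computes equal-width bin edges between the minimum and maximum and counts the values per bin, treating the last bin as closed on the right so that the maximum is included.

```python
ages = [18, 19, 21, 25, 26, 26, 30, 32, 38, 45, 55]
n_bins = 4

lowest, highest = min(ages), max(ages)
width = (highest - lowest) / n_bins

# n_bins equal-width bins need n_bins + 1 edges.
edges = [lowest + i * width for i in range(n_bins + 1)]

counts = [0] * n_bins
for age in ages:
    # Clamp the index so the maximum value lands in the last bin.
    index = min(int((age - lowest) // width), n_bins - 1)
    counts[index] += 1

print(edges)
print(counts)
```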

The bin edges can also be given manually. Values outside the given edges (here 18 and 19) are not counted:

bins=[20,30,40,50,60]
plt.hist(ages,bins,edgecolor='black')


Practical case

Import the data with pandas:

data=pd.read_csv('data.csv')
data.head()

Split the ages into five bins:

plt.hist(data.Age,bins=5,edgecolor='black')

Custom bins:

bins=[10,20,30,40,50,60,70,80,90,100]
plt.hist(data.Age,bins,edgecolor='black')

Since the counts on the y-axis span a wide range, a logarithmic y-axis can be used with log=True:

bins=[10,20,30,40,50,60,70,80,90,100]
plt.hist(data.Age,bins,edgecolor='black',log=True)

Here you can clearly see the difference between the two charts. With the logarithmic scale you can see that the 80-90 age group is smaller than the 90-100 group, whereas on the linear scale the bars in the 80-100 range are too small to compare.

Add a median reference line

Draw a vertical reference line at the median with plt.axvline(median):

median_age=data.Age.median()  # .median(), to match the variable name and the 'Median' label
plt.axvline(median_age,color='red',label='Median')
plt.legend()
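The mean and the median can differ noticeably on skewed data, so the statistic should match the label on the reference line. As a small illustration (added here, using the ages list from the earlier histogram example and the standard-library statistics module rather than pandas):

```python
from statistics import mean, median

ages = [18, 19, 21, 25, 26, 26, 30, 32, 38, 45, 55]

mean_age = mean(ages)      # pulled upward by the few older ages
median_age = median(ages)  # middle value of the sorted list

print(round(mean_age, 2), median_age)
```

Either value can be passed to plt.axvline; the label should simply name the statistic that was actually computed.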


pie chart

Draw the first pie chart

Enter the data first:

import matplotlib.pyplot as plt
list1 =['JavaScript','HTML/CSS','SQL','Python','Java']
list2 = [59219,55466,47544,36443,35917]

Generate the pie chart with plt.pie(values, labels=names):

plt.pie(list2,labels=list1)


add explosion effect

Pull slices out of the pie with the explode parameter: explode=explo

explo = [0,0,0,0.1,0]
# pull out the 4th slice
plt.pie(list2,labels=list1,explode=explo)

explo = [0.1,0,0,0.1,0]
# pull out the 1st and 4th slices
plt.pie(list2,labels=list1,explode=explo)


add shadow

Add a drop shadow with shadow=True:

explo = [0.1,0,0,0,0]
plt.pie(list2,labels=list1,explode=explo,shadow=True)


Modify the position of the first block

Use the startangle parameter to set where the first slice begins. The default is startangle=0, and the angle is measured counterclockwise from the x-axis.
When startangle=90, the first slice is in the upper left.
When startangle=180, the first slice is in the lower left.
When startangle=270, the first slice is in the lower right.

show percentage

Show percentages with autopct='%1.2f%%'.
Here %1.2f formats the value with 2 digits after the decimal point, and %% prints a literal percent sign (the first percent sign acts as the escape character).
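The format string can be tried outside of a chart with Python's % operator. The sketch below is an illustration added here: it applies the same pattern to the share of each value in list2, which is roughly what the labels on the slices will show.

```python
list2 = [59219, 55466, 47544, 36443, 35917]
total = sum(list2)

pattern = '%1.2f%%'

# Share of each value in percent, formatted with two decimals and a % sign.
percent_labels = [pattern % (100 * value / total) for value in list2]
print(percent_labels)

# A round number makes the two-decimal formatting easy to see.
quarter = pattern % 25
print(quarter)
```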

explo = [0.1,0,0,0,0]
plt.pie(list2,labels=list1,explode=explo,shadow=True,startangle=0,autopct='%1.2f%%')


change image border

Control the slice borders with wedgeprops={'edgecolor':'black'}.
This outlines each slice in black.

explo = [0.1,0,0,0,0]
plt.pie(list2,labels=list1,explode=explo,shadow=True,startangle=0,autopct='%1.2f%%',wedgeprops={'edgecolor':'black'})


add title

The title is set the same way as for other charts: plt.title('title')
To make the layout fit better on small screens such as mobile devices, add plt.tight_layout() to enable the compact layout.

explo = [0.1,0,0,0,0]
plt.pie(list2,labels=list1,explode=explo,shadow=True,startangle=0,autopct='%1.2f%%',wedgeprops={'edgecolor':'black'})
plt.title('Share of the most popular languages')
plt.tight_layout()


Drawing a pie chart starting from data import

Import data using Pandas

import pandas as pd
import numpy as np
fifa = pd.read_csv('fifa_data.csv')
fifa.head()

Count the players who prefer their left or right foot:

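# .count() gives the non-null count per column; .iloc[0] takes the first column by position
left = fifa.loc[fifa['Preferred Foot']=='Left'].count().iloc[0]
right = fifa.loc[fifa['Preferred Foot']=='Right'].count().iloc[0]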


draw pie chart

plt.pie([left,right])

Then refine the appearance:

labels = ['Left','Right']
explo=[0.1,0]
plt.pie([left,right],labels=labels,explode=explo,shadow=True,startangle=0,autopct='%1.2f%%',wedgeprops={'edgecolor':'black'})


Plotting Weight data with strings

Let's look at the data first: the values in the Weight column are strings that end in 'lbs'.
A pie chart cannot be drawn directly from values that contain the string 'lbs'. Here are two approaches:
1. .strip('lbs')
2. .replace('lbs','')
Approach 1 in detail:

def func1(d1):
    # Convert strings ending in 'lbs' to int; other values (such as NaN) yield None
    if isinstance(d1, str):
        return int(d1.strip('lbs'))
fifa['Weight2']=fifa.Weight.apply(func1)
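The second approach, .replace('lbs',''), works the same way on well-formed values. The sketch below shows both on made-up sample values (the weights and the helper name parse_weight are hypothetical, used only for illustration). One difference worth knowing: strip('lbs') removes any of the characters l, b, s from both ends of the string, while replace removes only the exact substring 'lbs'.

```python
def parse_weight(value):
    """Return the weight as an int, or None when the value is not a string."""
    if isinstance(value, str):
        return int(value.replace("lbs", ""))
    return None

# Hypothetical sample values; a missing entry is represented by NaN.
samples = ["159lbs", "183lbs", float("nan")]
parsed = [parse_weight(value) for value in samples]
print(parsed)

# Both approaches agree on a well-formed value.
stripped = "159lbs".strip("lbs")
replaced = "159lbs".replace("lbs", "")
print(stripped, replaced)
```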


Classify different Weight

class1 = fifa.loc[fifa.Weight2 < 125].count().iloc[0]
class2 = fifa.loc[(fifa.Weight2 >= 125) & (fifa.Weight2 < 150)].count().iloc[0]
class3 = fifa.loc[(fifa.Weight2 >= 150) & (fifa.Weight2 < 175)].count().iloc[0]
class4 = fifa.loc[(fifa.Weight2 >= 175) & (fifa.Weight2 < 200)].count().iloc[0]
class5 = fifa.loc[fifa.Weight2 >= 200].count().iloc[0]  # >= so that a weight of exactly 200 is counted

Put the counts into a list: list= [class1,class2,class3,class4,class5] (this name shadows Python's built-in list, so a different name is safer in real code)

Draw a pie chart on the processed data

labels = ['< 125','125-150','150-175','175-200','>= 200']
explo=[0.4,0.2,0,0,0.4]
plt.pie(list,labels=labels,explode=explo,shadow=True,startangle=0,autopct='%1.2f%%',wedgeprops={'edgecolor':'black'})
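The five weight classes used above can also be assigned with a single lookup against the list of boundaries. The following sketch is one possible alternative, not the original author's method: it uses bisect_right from the standard library on made-up weights, so that a value equal to a boundary falls into the class that starts at that boundary.

```python
from bisect import bisect_right

edges = [125, 150, 175, 200]
labels = ['< 125', '125-150', '150-175', '175-200', '>= 200']

# Hypothetical weights, including values exactly on a boundary.
weights = [110, 125, 149, 150, 180, 200, 243]

counts = [0] * len(labels)
for weight in weights:
    # bisect_right returns how many edges are <= weight,
    # which is exactly the index of the matching class.
    counts[bisect_right(edges, weight)] += 1

print(dict(zip(labels, counts)))
```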

The smallest slices are too small to read clearly. Enlarging the figure helps:

plt.figure(figsize=(8,5),dpi = 100)

Then use pctdistance=0.8 to control how far the percentage labels sit from the center of the pie:

plt.pie(list,labels=labels,explode=explo,pctdistance=0.8,shadow=True,startangle=0,autopct='%1.2f%%',wedgeprops={'edgecolor':'black'})


Scatterplot

Draw a scatter plot with plt.scatter(x_data, y_data):

plt.scatter(x,y,s=100,color='red',edgecolor='black',alpha=0.8)
# s is the marker size
plt.grid()

You can also distinguish groups of points with different colors:

x = [5, 7, 8, 5, 6, 7, 9, 2, 3, 4, 4, 4, 2, 6, 3, 6, 8, 6, 4, 1]
y = [7, 4, 3, 9, 1, 3, 2, 5, 2, 4, 8, 7, 1, 6, 4, 9, 7, 7, 5, 1]
colors = [447, 445, 449, 447, 445, 447, 442, 5, 3, 7, 1, 2, 8, 1, 9, 2, 5, 6, 7, 5]
plt.scatter(x,y,s=100,c=colors,edgecolor='black',alpha=0.8)
plt.grid()

If the colors alone are not intuitive or are confusing, add a colorbar with a label:

x = [5, 7, 8, 5, 6, 7, 9, 2, 3, 4, 4, 4, 2, 6, 3, 6, 8, 6, 4, 1]
y = [7, 4, 3, 9, 1, 3, 2, 5, 2, 4, 8, 7, 1, 6, 4, 9, 7, 7, 5, 1]
colors = [7, 5, 9, 7, 5, 7, 2, 5, 3, 7, 1, 2, 8, 1, 9, 2, 5, 6, 7, 5]
plt.scatter(x,y,s=100,c=colors,edgecolor='black',alpha=0.8)
plt.grid()
cbar = plt.colorbar()
cbar.set_label('Label')


Import data from Pandas

First look at the data structure:

df = pd.read_csv('2019-05-31-data.csv')
df.head()

Draw a scatter plot: plt.scatter(df.view_count,df.likes)

Optimization of details

plt.figure(figsize=(10,6))
plt.scatter(df.view_count,df.likes,c='red',edgecolors='black',linewidths=1,alpha=0.9)
plt.xscale('log')
# the points are bunched together, so logarithmic axes make the pattern clearer
plt.yscale('log')

Next, add df.ratio to the scatter plot as the point color:

plt.figure(figsize=(10,6))
plt.scatter(df.view_count,df.likes,c=df.ratio,edgecolors='black',linewidths=1,alpha=0.9)
plt.xscale('log')
plt.yscale('log')
cbar = plt.colorbar()
cbar.set_label('Like&Dislike')


Time Series Data Processing

Plotting dates as plain strings

import matplotlib.pyplot as plt
from datetime import datetime,timedelta
# this is a time series, so datetime is needed
x = ['2019-5-24','2019-5-25','2019-5-26','2019-5-27','2019-5-28','2019-5-29','2019-5-30','2019-6-30']
y = [0,1,3,4,6,5,7,3]
plt.plot(x,y)

Drawn this way with plt.plot, the dates along the x-axis look cluttered because the values are plain strings (str), which plt.plot treats as categories. More importantly, 2019-5-30 and 2019-6-30 are actually a month apart, yet they appear as two adjacent, evenly spaced points on the line chart.
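The spacing problem is easy to see once the strings are parsed. This short sketch (an illustration added here, using only the standard library) converts three of the date strings to datetime objects with strptime and measures the gaps between neighbours in days.

```python
from datetime import datetime

raw = ['2019-5-29', '2019-5-30', '2019-6-30']

# Parse each string into a datetime object.
dates = [datetime.strptime(text, '%Y-%m-%d') for text in raw]

# Gap between each pair of neighbouring dates, in days.
gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
print(gaps)
```

As strings, the three values would be drawn one step apart; as datetime values, the second gap is 31 times as wide as the first.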

Use plt.plot_date method

x = [
    datetime(2019,5,24),
    datetime(2019,5,25),
    datetime(2019,5,26),
    datetime(2019,5,27),
    datetime(2019,5,28),
    datetime(2019,5,29),
    datetime(2019,5,30),
    ]
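y = [0,1,3,4,6,5,7]  # x and y must have the same length
# plot_date is deprecated in newer Matplotlib releases; plt.plot also accepts datetime values
plt.plot_date(x,y)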

At first glance there seems to be no difference, but the x-axis values are now datetime objects rather than str. Now connect the points with lines:

plt.style.use('seaborn')  # named 'seaborn-v0_8' in Matplotlib 3.6 and later
plt.plot_date(x,y,linestyle='solid')

However, as the amount of data grows, the x-axis labels still become hard to read, for example:

x2 = [
    datetime(2019,5,24),
    datetime(2019,5,25),
    datetime(2019,5,26),
    datetime(2019,5,27),
    datetime(2019,5,28),
    datetime(2019,5,29),
    datetime(2019,5,30),
    datetime(2019,6,24),
    datetime(2019,6,25),
    datetime(2019,6,26),
    datetime(2019,6,27),
    datetime(2019,6,28),
    datetime(2019,6,29),
    datetime(2019,6,30),
    ]
y2 = [0,1,3,4,6,5,7,0,1,3,4,6,5,7]
plt.plot_date(x2,y2,linestyle='solid')

The one-month gap is now shown correctly on the line chart, but the x-axis labels clearly overlap. The next section covers solutions:

Fixing overlapping x-axis labels

plt.plot_date(x2,y2,linestyle='solid')
plt.gcf().autofmt_xdate()
# gcf gets the current figure; gca gets the current axes
# plt.gcf().autofmt_xdate() adjusts the x-axis date labels automatically

Of course, you can also set the date format yourself:

from matplotlib import dates as mpl_dates
plt.plot_date(x2,y2,linestyle='solid')
plt.gcf().autofmt_xdate()
date_format=mpl_dates.DateFormatter('%b,%d %Y')
# use a month, day, year format
plt.gca().xaxis.set_major_formatter(date_format)
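DateFormatter takes a strftime-style pattern, so a pattern can be previewed with datetime.strftime before it is attached to an axis. The sketch below is an illustration added here; note that %b produces the abbreviated month name in the current locale, so the exact text may differ between systems.

```python
from datetime import datetime

sample = datetime(2019, 5, 24)

# Same pattern as passed to DateFormatter above:
# abbreviated month name, day of month, four-digit year.
formatted = sample.strftime('%b,%d %Y')

# A locale-independent alternative: ISO-style year-month-day.
iso_style = sample.strftime('%Y-%m-%d')

print(formatted, iso_style)
```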


Using Pandas to import financial data analysis

import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime,timedelta
from matplotlib import dates as mpl_dates
df = pd.read_csv('data.csv')
df.head()

Note that the dates are not necessarily stored in datetime format, so check first:

df.info()

The Date column is indeed stored as strings and needs to be converted to datetime:

df.Date = pd.to_datetime(df.Date)
df.info()

Sort by date to make sure the time series is in order:

df.sort_values('Date',inplace=True)
df.head()

Next, start drawing the trend chart:

plt.plot_date(df.Date,df.Close,linestyle='solid')
plt.gcf().autofmt_xdate()

Enrich with more details:

plt.plot_date(df.Date,df.Close, linestyle='solid')
plt.gcf().autofmt_xdate()
date_format = mpl_dates.DateFormatter('%b,%d %Y')
plt.gca().xaxis.set_major_formatter(date_format)
plt.title('Bitcoin Price')
plt.xlabel('Date')
plt.ylabel('Price USD')


real-time data processing

traditional drawing

import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
x = [0,1,2,3,4,5]
y = [0,1,2,3,4,5]
plt.plot(x,y)

Static data like this can be drawn in the usual way, but how should real-time data, such as stock prices or sensor readings, be handled?

Simulate real-time data with an iterator

import random
from itertools import count
index = count()
x1=[]
y1=[]
def animate(i):
    x1.append(next(index))  # next(index) acts as a counter: 0, 1, 2, ...
    y1.append(random.randint(0,50))
    plt.plot(x1,y1)
for i in range(50):
    animate(i)   


Let the program run automatically

import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import HTML
from itertools import count
import random
from matplotlib.animation  import FuncAnimation
plt.style.use('fivethirtyeight')
index = count()
x1=[]
y1=[]
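def animate(i):
    x1.append(next(index))  # next(index) acts as a counter: 0, 1, 2, ...
    y1.append(random.randint(0,50))
    plt.cla()  # clear the axes so each frame redraws one line and the color stays the same
    plt.plot(x1,y1)
ani = FuncAnimation(plt.gcf(),animate,interval=1000)
# plt.gcf() gets the current figure
# animate is called once per frame
# interval=1000: 1000 milliseconds (1 second) between frames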
HTML(ani.to_jshtml())


Get real-time data and save it to a file and then load it into the notebook

The data in the example above comes from the random and count functions. What if the data comes from an external interface or is collected in real time?

Design an external file to obtain data in real time

import csv
import random
import time
x_value = 0
y1 = 1000
y2 = 1000
fieldname=["x","y1","y2"]
with open('data.txt','w',newline='') as csvfile:
    csv_w = csv.DictWriter(csvfile,fieldnames=fieldname)
    csv_w.writeheader()  # write the header row
while True:
    with open('data.txt','a',newline='') as csvfile:
        csv_w = csv.DictWriter(csvfile,fieldnames=fieldname)
        info = {
            "x" : x_value,
            "y1" : y1 ,
            "y2" : y2 ,
        }
        x_value += 1
        y1 = y1 + random.randint(-6,10)
        y2 = y2 + random.randint(-4,5)
        csv_w.writerow(info)
    time.sleep(1)  # wait 1 s before writing the next row

As long as this program is running, a new row of data is generated every second and appended to the data.txt file.
Next, read the file to draw a continuously updating chart:

import pandas as pd
import matplotlib.pyplot as plt
from itertools import count
import random
from matplotlib.animation  import FuncAnimation
plt.style.use('fivethirtyeight')
def animate2(i):
    dfa = pd.read_csv('data.txt')
    x = dfa.x
    y1 = dfa.y1
    y2 = dfa.y2
    plt.cla()
    plt.plot(x,y1,label='Stock1')
    plt.plot(x,y2,label='Stock2')
    plt.legend()
ani2 = FuncAnimation(plt.gcf(),animate2,interval=1000)
# plt.gcf() gets the current figure
# animate2 is called once per frame
# interval=1000: 1000 milliseconds (1 second) between frames
plt.show()

Running this as a script, for example in PyCharm, is recommended; a Jupyter notebook can only display about 100 data points.

Chart Multiplot

In some cases you need to draw charts a, b, and c in a single fig, which calls for multiple subplots.

Traditional method of drawing

import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('seaborn')  # named 'seaborn-v0_8' in Matplotlib 3.6 and later
data = pd.read_csv('data.csv')
data.head()

Extract the data and plot it:

ages = data.Age
all_dev = data.All_Devs
py= data.Python
js = data.JavaScript
plt.plot(ages,all_dev,label='All')
plt.plot(ages,py,label='Python')
plt.plot(ages,js,label='JS')
plt.legend()
plt.xlabel('Age')
plt.ylabel('Sal')

So how can the three lines in this chart be split into separate charts within one figure?

Enable multiple charts

fig,ax=plt.subplots(nrows=2,ncols=1)
# one fig containing a grid of subplots with 2 rows and 1 column

To tell the two subplots apart, name them ax1 and ax2:

fig,(ax1,ax2)=plt.subplots(nrows=2,ncols=1)

Import data into multiple charts

To plot into a specific subplot, replace plt with the name of that subplot's axes:

fig,(ax1,ax2)=plt.subplots(nrows=2,ncols=1)
ax1.plot(ages,all_dev,label='All')
ax2.plot(ages,py,label='Python')
ax2.plot(ages,js,label='JS')
ax1.legend()
ax2.legend()

Similarly, with three subplots:

fig,(ax1,ax2,ax3)=plt.subplots(nrows=3,ncols=1)
ax1.plot(ages,all_dev,label='All')
ax2.plot(ages,py,label='Python',color='g')
ax3.plot(ages,js,label='JS',color='r')
ax1.legend()
ax2.legend()
ax3.legend()


shared x-axis

Since the three charts use the same x-axis data, the x-axis can be shared with sharex=True to make the figure more concise:

fig,(ax1,ax2,ax3)=plt.subplots(nrows=3,ncols=1,sharex=True)
ax1.plot(ages,all_dev,label='All')
ax2.plot(ages,py,label='Python',color='g')
ax3.plot(ages,js,label='JS',color='r')
ax1.legend()
ax2.legend()
ax3.legend()
ax3.set_xlabel('Age')


shared y-axis

fig , (ax1,ax2,ax3) = plt.subplots(nrows=1,ncols=3,sharey=True)
ax1.plot(ages,all_dev,label='All')
ax2.plot(ages,py,label='Python',color='g')
ax3.plot(ages,js,label='JS',color='r')
ax1.legend()
ax2.legend()
ax3.legend()
ax3.set_xlabel('Age')
ax1.set_ylabel('Salary')


dynamic loading

If you don't know how many rows and columns you will need before drawing, fixing nrows and ncols up front is not practical. In that case, subplots can be added dynamically:

fig = plt.figure()
ax1 = fig.add_subplot(311)
# 311 means 3 rows, 1 column, 1st subplot
ax2 = fig.add_subplot(312)
# 312 means 3 rows, 1 column, 2nd subplot
ax3 = fig.add_subplot(313)
# 313 means 3 rows, 1 column, 3rd subplot
ax1.plot(ages,all_dev,label='All')
ax2.plot(ages,py,label='Python',color='g')
ax3.plot(ages,js,label='JS',color='r')
ax1.legend()
ax2.legend()
ax3.legend()
ax3.set_xlabel('Age')
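When the number of charts is only known at run time, the grid size can be computed from the data and passed to add_subplot as three separate integers (rows, columns, position) instead of a fixed code such as 311. The sketch below is one way to do this; the series names and values are made up for illustration, and the Agg backend is an assumption so the snippet runs without a display.

```python
import math

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed

import matplotlib.pyplot as plt

# Hypothetical data: one list of values per chart.
series = {
    "All": [38, 42, 46, 49, 53],
    "Python": [45, 48, 53, 57, 63],
    "JS": [37, 43, 46, 49, 53],
}

ncols = 2
nrows = math.ceil(len(series) / ncols)  # enough rows for every series

fig = plt.figure()
for position, (name, values) in enumerate(series.items(), start=1):
    # Subplot positions are numbered from 1, row by row.
    ax = fig.add_subplot(nrows, ncols, position)
    ax.plot(values, label=name)
    ax.legend()

print(nrows, len(fig.axes))
```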

Change the layout code to use a 2x2 grid:

ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(223)

You can keep adjusting the layout as needed; here the third chart spans the whole bottom row:

ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(212)


Grid mode for drawing more complex layouts

ax1 = plt.subplot2grid((6,1),(0,0),rowspan=2,colspan=1)
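# set up a layout with 6 rows and 1 column
# ax1 starts at row 0, column 0 and spans 2 rows and 1 column
ax2 = plt.subplot2grid((6,1),(2,0),rowspan=2,colspan=1)
# ax2 starts at row 2, column 0 and spans 2 rows and 1 column
ax3 = plt.subplot2grid((6,1),(4,0),rowspan=2,colspan=1)
# ax3 starts at row 4, column 0 and spans 2 rows and 1 column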
ax1.plot(ages,all_dev,label='All')
ax2.plot(ages,py,label='Python',color='g')
ax3.plot(ages,js,label='JS',color='r')
ax1.legend()
ax2.legend()
ax3.legend()
ax3.set_xlabel('Age')

More custom layouts are possible, for example with subplots of different heights:

ax1 = plt.subplot2grid((6,1),(0,0),rowspan=1,colspan=1)
ax2 = plt.subplot2grid((6,1),(1,0),rowspan=3,colspan=1)
ax3 = plt.subplot2grid((6,1),(4,0),rowspan=2,colspan=1)

Recreate the earlier layout, with two charts on top and one spanning the bottom:

ax1 = plt.subplot2grid((4,2),(0,0),rowspan=2,colspan=1)
ax2 = plt.subplot2grid((4,2),(0,1),rowspan=2,colspan=1)
ax3 = plt.subplot2grid((4,2),(2,0),rowspan=2,colspan=2)


Origin blog.csdn.net/D_Ddd0701/article/details/113917362