Matplotlib Data Visualization Basics (Python Data Analysis and Applications)

Brief description

The name Matplotlib has three parts: "Mat" reflects that its function design was modeled on MATLAB, "plot" indicates its purpose of drawing, and "lib" indicates that it is a library, that is, a collection. In recent years, driven by the open source community, Matplotlib has been widely adopted in scientific computing and has become one of the most widely used plotting tools in Python. Its most commonly used module is matplotlib.pyplot.

matplotlib.pyplot is a collection of command-style functions that make Matplotlib work like MATLAB. Each plotting function makes some change to a figure, such as creating a figure, creating a plotting area in the figure, drawing lines in the plotting area, or decorating the plot with labels. pyplot preserves state across function calls, so it keeps track of things like the current figure and plotting area, and plotting functions are always directed to the current axes. This chapter uses pyplot as its basis.
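The state tracking described above can be observed directly. The following is a minimal sketch rather than one of the original listings: the Agg backend is selected only so that it runs without a display, and the variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
import matplotlib.pyplot as plt

fig = plt.figure()                    # this figure becomes the current figure
ax_top = fig.add_subplot(2, 1, 1)     # this subplot becomes the current axes
plt.plot([0, 1], [0, 1])              # drawn on ax_top
ax_bottom = fig.add_subplot(2, 1, 2)  # the current axes switch to the new subplot
plt.plot([0, 1], [1, 0])              # drawn on ax_bottom

# pyplot exposes the state it tracks through gcf() and gca().
current_figure_matches = plt.gcf() is fig
current_axes_matches = plt.gca() is ax_bottom
lines_per_axes = (len(ax_top.lines), len(ax_bottom.lines))
print(current_figure_matches, current_axes_matches, lines_per_axes)
plt.close(fig)
```

Each plt.plot call lands on whichever subplot was created most recently, which is why the listings in this chapter call add_subplot before drawing each panel.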

Learning objectives

  • Master how to adjust commonly used drawing parameters in pyplot
  • Master how to draw subplots
  • Master how to save and display figures
  • Understand the purpose of scatter plots and line charts and how to draw them
  • Understand the purpose of histograms, pie charts and box plots and how to draw them

Master the basic syntax and basic parameters of drawing

Master the basic syntax of pyplot

Most pyplot figures follow the same basic drawing process, which has three main parts.

  • Create the canvas and subplots
    Create a blank canvas, optionally dividing it into several parts so that multiple plots can be drawn on the same figure. When you only need one simple plot, the division is unnecessary.
    Common pyplot functions for creating a canvas and selecting subplots
    Function name         Effect
    plt.figure            Creates a blank canvas; the canvas size and resolution (dpi) can be specified
    figure.add_subplot    Creates and selects a subplot; the number of rows and columns and the index of the selected subplot can be specified
  • Add the canvas content
    The second part is the body of the drawing.
    Adding the title, adding the axis names, drawing the curves and so on can be done in any order, but the legend must be added after the curves have been drawn.
    Common pyplot functions for adding labels and legends
    Function name    Effect
    plt.title        Adds a title; the title text, position, color, font size and other parameters can be specified
    plt.xlabel       Adds the x-axis name; the position, color, font size and other parameters can be specified
    plt.ylabel       Adds the y-axis name; the position, color, font size and other parameters can be specified
    plt.xlim         Sets the x-axis range; only a numerical interval can be given, not string identifiers
    plt.ylim         Sets the y-axis range; only a numerical interval can be given, not string identifiers
    plt.xticks       Sets the number and values of the x-axis ticks
    plt.yticks       Sets the number and values of the y-axis ticks
    plt.legend       Adds a legend; the size, position and labels of the legend can be specified
  • Save and display the figure
    The third part saves and displays the figure. It usually involves only two functions, each with few parameters.
    Common pyplot functions for saving and displaying figures
    Function name    Effect
    plt.savefig      Saves the figure; the resolution, edge color and other parameters can be specified
    plt.show         Displays the figure locally
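The three parts above can also be written against explicit figure and axes objects. The sketch below is an illustration under assumptions, not the original author's listing: it uses the Agg backend and an in-memory buffer so that it runs without a display or a file on disk, and it passes label= to each curve so that the legend can be built from what has already been drawn.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumed non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 1, 101)

# Part 1: create the canvas and a single subplot.
fig = plt.figure(figsize=(6, 4), dpi=80)
ax = fig.add_subplot(1, 1, 1)

# Part 2: add content. Title, axis names and curves can come in any order.
ax.set_title('lines')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.plot(x, x ** 2, label='y=x^2')
ax.plot(x, x ** 4, label='y=x^4')
# The legend is created last because it is built from the curves drawn so far.
legend = ax.legend()
legend_labels = [text.get_text() for text in legend.get_texts()]

# Part 3: save the figure (to memory here; a file name works the same way).
buffer = io.BytesIO()
fig.savefig(buffer, format='png')
png_size = buffer.getbuffer().nbytes
plt.close(fig)
print(legend_labels, png_size > 0)
```

Calling legend() before the two plot calls would find no labeled curves, which is the reason for the ordering rule stated above.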
3-1 Basic drawing syntax in pyplot
import numpy as np
import matplotlib.pyplot as plt
# In a notebook, %matplotlib inline displays figures inline; it raises an error when run from the command line
data = np.arange(0, 1.1, 0.01)
plt.title('lines')  # add a title
plt.xlabel('x')  # add the x-axis name
plt.ylabel('y')  # add the y-axis name
plt.xlim((0, 1))  # set the x-axis range
plt.ylim((0, 1))  # set the y-axis range
plt.xticks([0, 0.2, 0.4, 0.6, 0.8, 1])  # set the x-axis ticks
plt.yticks([0, 0.2, 0.4, 0.6, 0.8, 1])  # set the y-axis ticks
plt.plot(data, data ** 2)  # add the curve y=x^2
plt.plot(data, data ** 4)  # add the curve y=x^4
plt.legend(['y=x^2', 'y=x^4'])
plt.savefig('3-1.png')
plt.show()


3-2 Basic syntax with subplots
import numpy as np
import matplotlib.pyplot as plt

rad = np.arange(0, np.pi * 2, 0.01)
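# first subplot
p1 = plt.figure(figsize=(8, 6), dpi=80)  # set the canvas size
ax1 = p1.add_subplot(2, 1, 1)  # create a 2-row, 1-column grid and select the first subplot
plt.title('lines')  # add a title
plt.xlabel('x')  # add the x-axis name
plt.ylabel('y')  # add the y-axis name
plt.xlim((0, 1))  # set the x-axis range
plt.ylim((0, 1))  # set the y-axis range
plt.xticks([0, 0.2, 0.4, 0.6, 0.8, 1])  # set the x-axis ticks
plt.yticks([0, 0.2, 0.4, 0.6, 0.8, 1])  # set the y-axis ticks
plt.plot(rad, rad ** 2)  # add the curve y=x^2
plt.plot(rad, rad ** 4)  # add the curve y=x^4
plt.legend(['y=x^2', 'y=x^4'])  # one list with a label per curve
# second subplot
ax2 = p1.add_subplot(2, 1, 2)  # select the second subplot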
plt.title('sin/cos')
plt.xlabel('rad')
plt.ylabel('value')
plt.xlim((0, np.pi * 2))
plt.ylim((-1, 1))
plt.xticks([0, np.pi / 2, np.pi, np.pi * 1.5, np.pi * 2])
plt.yticks([-1, -0.5, 0, 0.5, 1])
plt.plot(rad, np.sin(rad))
plt.plot(rad, np.cos(rad))
plt.legend(['sin', 'cos'])  # one list with a label per curve
plt.savefig('sincos.png')
plt.show()


Set dynamic rc parameters for pyplot

pyplot uses rc configuration files to customize the default properties of figures; these settings are called rc configurations or rc parameters.

The default rc parameters can be changed dynamically in an interactive Python environment. All rc parameters are stored in a dictionary-like variable called rcParams. Once an rc parameter is modified, the defaults used for subsequent drawing change accordingly.
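Because a change to rcParams affects every later plot, it can be useful to limit a change to one block of code. The sketch below uses plt.rc_context for that purpose; this is a variation on the approach in this section rather than something the original listings do, and the Agg backend is an assumption made so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumed non-interactive backend
import matplotlib.pyplot as plt

default_width = plt.rcParams['lines.linewidth']

# Inside the with block the overridden defaults apply to every new line.
with plt.rc_context({'lines.linewidth': 3, 'lines.linestyle': '-.'}):
    fig, ax = plt.subplots()
    (line,) = ax.plot([0, 1, 2], [0, 1, 0])
    width_inside = line.get_linewidth()
    style_inside = line.get_linestyle()
    plt.close(fig)

# Outside the block the previous defaults are restored.
width_after = plt.rcParams['lines.linewidth']
print(width_inside, style_inside, width_after == default_width)
```

Assigning to plt.rcParams directly, as in listing 3-3, keeps the new defaults for the rest of the session instead.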

3-3 Adjust the rc parameters of the line
import numpy as np
import matplotlib.pyplot as plt

# original figure
x = np.linspace(0, 4 * np.pi)
y = np.sin(x)
plt.plot(x, y, label="$sin(x)$")
plt.title('sin')
plt.savefig('默认sin曲线.png')
plt.show()


import numpy as np
import matplotlib.pyplot as plt

# figure after modifying the rc parameters
x = np.linspace(0, 4 * np.pi)
y = np.sin(x)
plt.rcParams['lines.linestyle'] = '-.'
plt.rcParams['lines.linewidth'] = 3
plt.plot(x, y, label="$sin(x)$")
plt.title('sin')
plt.savefig('修改rc参数后sin曲线.png')
plt.show()

Commonly used rc parameters for lines, with explanations and values

rc parameter name    Explanation                        Value
lines.linewidth      Line width                         A value from 0 to 10; the default is 1.5
lines.linestyle      Line style                         One of "-", "--", "-." and ":"; the default is "-"
lines.marker         Shape of the points on the line    One of about 20 types such as "o", "D", "h", "." and "s"; the default is None
lines.markersize     Size of the points                 A value from 0 to 10

lines.linestyle values and their meanings

lines.linestyle value    Meaning
'-'                      Solid line
'-.'                     Dash-dot line
'--'                     Dashed line
':'                      Dotted line

lines.marker values and their meanings

lines.marker value    Meaning
'o'                   Circle
'D'                   Diamond
'h'                   Hexagon 1
'H'                   Hexagon 2
'_'                   Horizontal line
'8'                   Octagon
'p'                   Pentagon
','                   Pixel
'+'                   Plus
'None'                No marker
'.'                   Point
's'                   Square
'*'                   Star
'd'                   Thin diamond
'v'                   Triangle pointing down
'<'                   Triangle pointing left
'>'                   Triangle pointing right
'^'                   Triangle pointing up
'|'                   Vertical line
'x'                   X
3-4 Adjust the rc parameters of the font
import numpy as np
import matplotlib.pyplot as plt

# the Chinese title cannot be displayed with the default font
x = np.linspace(0, 4 * np.pi)
y = np.sin(x)
plt.plot(x, y, label="$sin(x)$")
plt.title('sin曲线')
plt.savefig('无法显示中文标题sin曲线.png')
plt.show()


import numpy as np
import matplotlib.pyplot as plt

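plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display

# figure after modifying the rc parameters
x = np.linspace(0, 4 * np.pi)
y = np.sin(x)
plt.plot(x, y, label='$sin(x)$')
plt.title('sin曲线')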
plt.savefig('修改rc参数后的sin曲线.png')
plt.show()


Analyze relationships between features

Draw a scatter plot

A scatter plot uses the distribution of points on a coordinate plane to show the statistical relationship between features. Values are represented by the positions of the points, and categories by different markers, so scatter plots are often used to compare data across categories.

Scatter plots provide two kinds of key information:

  1. Whether there is a correlation trend between features, and whether that trend is linear or nonlinear.
  2. Whether one or more points deviate from the majority. Such points are outliers; a scatter plot makes them easy to spot, so that their effect on modeling can be analyzed further.
    [Table: common parameters and descriptions of the scatter function]
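The two questions a scatter plot answers by eye can also be checked numerically. The sketch below is an illustration on synthetic data, not the chapter's data set: the data, the injected outlier at index 10 and the threshold of five scaled median absolute deviations are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + rng.normal(0, 0.5, size=x.size)  # roughly linear relationship
y[10] += 8.0                                   # inject one artificial outlier

# Question 1: is there a (linear) correlation trend?
r = np.corrcoef(x, y)[0, 1]

# Question 2: which points deviate from the majority?
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)
center = np.median(residuals)
mad = np.median(np.abs(residuals - center))   # median absolute deviation
threshold = 5 * 1.4826 * mad                  # assumed cut-off for this sketch
outlier_idx = np.flatnonzero(np.abs(residuals - center) > threshold)

print(round(r, 3), outlier_idx)
```

Plotting the same arrays with plt.scatter(x, y) would show the injected point standing well above the line formed by the other points.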
3-5 Draw a scatter plot of gross national product by quarter from 2000 to 2017
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'
plt.rcParams['axes.unicode_minus'] = False

data = np.load('35data.npz/国民经济核算季度数据.npz', allow_pickle=True)
name = data['columns']
values = data['values']
plt.figure(figsize=(8, 7))
plt.scatter(values[:, 0], values[:, 2], marker='o')
plt.xlabel('年份')
plt.ylabel('生产总值(亿元)')
plt.ylim((0, 225000))
plt.xticks(range(0, 70, 4), values[range(0, 70, 4), 1], rotation=45)
plt.title('2000-2017年各季度国民生产总值散点图')
plt.savefig('2000-2017年各季度国民生产总值散点图.png')
plt.show()


3-6 Draw a scatter chart of gross national product in each quarter from 2000 to 2017
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display

plt.figure(figsize=(8, 7))
data = np.load('35data.npz/国民经济核算季度数据.npz', allow_pickle=True)
values = data['values']
# scatter series 1 (primary industry)
plt.scatter(values[:, 0], values[:, 3], marker='o', c='red')
# scatter series 2 (secondary industry)
plt.scatter(values[:, 0], values[:, 4], marker='D', c='blue')
# scatter series 3 (tertiary industry)
plt.scatter(values[:, 0], values[:, 5], marker='v', c='yellow')
plt.xlabel('年份')
plt.ylabel('生产总值(亿元)')
plt.xticks(range(0, 70, 4), values[range(0, 70, 4), 1], rotation=45)
plt.title('2000-2017年各季度国民生产总值散点图')
plt.legend(['第一产业', '第二产业', '第三产业'])
plt.savefig('2000-2017年各季度国民生产总值散点图.png')
plt.show()


Draw a line chart

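A line chart connects data points in order. It is suited to showing continuous data that changes over time, and it also reveals differences in quantity and changes in the growth trend.

The pyplot function for drawing line charts is plot. Its basic syntax is: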
matplotlib.pyplot.plot(*args,**kwargs)
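The listings that follow pass style information to plot in two ways: as keyword arguments such as color, linestyle and marker, and as a compact format string such as 'bs-'. The sketch below, which assumes the Agg backend so that it runs without a display, checks that the two spellings produce the same line style.

```python
import matplotlib
matplotlib.use("Agg")  # assumed non-interactive backend
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import to_rgba

x = np.arange(5)
fig, ax = plt.subplots()

# Format string: color 'r', marker 'o', line style '--' in one argument.
(short_form,) = ax.plot(x, x, 'ro--')
# The same style spelled out with keyword arguments.
(long_form,) = ax.plot(x, x + 1, color='r', marker='o', linestyle='--')


def style_of(line):
    """Return (RGBA color, marker, line style) of a Line2D object."""
    return (to_rgba(line.get_color()), line.get_marker(), line.get_linestyle())


same_style = style_of(short_form) == style_of(long_form)
print(same_style)
plt.close(fig)
```

The format string is convenient when several curves are passed to one plot call, as in listing 3-9, while keyword arguments are easier to read for a single curve.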


3-7 Draw a line chart of gross national product by quarter from 2000 to 2017
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display

plt.figure(figsize=(8, 7))
data = np.load('35data.npz/国民经济核算季度数据.npz', allow_pickle=True)
values = data['values']
plt.plot(values[:, 0], values[:, 2], color='r', linestyle='--')
plt.xlabel('年份')
plt.ylabel('生产总值(亿元)')
plt.ylim((0, 225000))
plt.xticks(range(0, 70, 4), values[range(0, 70, 4), 1], rotation=45)
plt.title('2000~2017年各季度国民生产总值折线图')
plt.savefig('2000~2017年各季度国民生产总值折线图.png')
plt.show()


3-8 Point-and-line chart of gross national product by quarter, 2000-2017
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display


data = np.load('35data.npz/国民经济核算季度数据.npz',allow_pickle=True)
values = data['values']
plt.figure(figsize=(8, 7))
plt.plot(values[:,0],values[:,2],color='r',linestyle='--',marker='o')
plt.xlabel('年份')
plt.ylabel('生产总值(亿元)')
plt.ylim((0, 225000))
plt.xticks(range(0, 70, 4), values[range(0, 70, 4), 1], rotation=45)
plt.title('2000~ 2017年各季度国民生产总值点线图')
plt.savefig('2000~ 2017年各季度国民生产总值点线图.png')
plt.show()


3-9 Line chart with scatter markers of gross national product by quarter, 2000-2017
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display

data = np.load('35data.npz/国民经济核算季度数据.npz', allow_pickle=True)
values = data['values']
plt.figure(figsize=(8, 7))
plt.plot(values[:, 0], values[:, 3], 'bs-',
         values[:, 0], values[:, 4], 'ro-',
         values[:, 0], values[:, 5], 'gH--')
plt.xlabel('年份')
plt.ylabel('生产总值(亿元)')
plt.ylim((0, 100000))
plt.xticks(range(0, 70, 4), values[range(0, 70, 4), 1], rotation=45)
plt.title('2000~ 2017年各季度国民生产总值折线')
plt.legend(['第一产业','第二产业', '第三产业'])
plt.savefig('2000~ 2017年各季度国民生产总值折线散点图.png')
plt.show()


Task implementation

Task 1

Draw scatter plots of gross national product for each sector and industry, 2000-2017

import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display

data = np.load('35data.npz/国民经济核算季度数据.npz', allow_pickle=True)
name = data['columns']
values = data['values']
p = plt.figure(figsize=(12, 12))
# subplot 1: the three sectors
ax1 = p.add_subplot(2, 1, 1)
plt.scatter(values[:, 0], values[:, 3], marker='o', c='r')
plt.scatter(values[:, 0], values[:, 4], marker='D', c='b')
plt.scatter(values[:, 0], values[:, 5], marker='v', c='y')
plt.ylabel('生产总值(亿元)')
plt.title('2000-2017年各产业与行业国民生产总值散点图')
plt.legend(['第一产业', '第二产业', '第三产业'])

# subplot 2: the nine industries
ax2 = p.add_subplot(2, 1, 2)
plt.scatter(values[:, 0], values[:, 6], marker='o', c='r')
plt.scatter(values[:, 0], values[:, 7], marker='D', c='b')
plt.scatter(values[:, 0], values[:, 8], marker='v', c='y')
plt.scatter(values[:, 0], values[:, 9], marker='8', c='g')
plt.scatter(values[:, 0], values[:, 10], marker='p', c='c')
plt.scatter(values[:, 0], values[:, 11], marker='+', c='m')
plt.scatter(values[:, 0], values[:, 12], marker='s', c='k')
plt.scatter(values[:, 0], values[:, 13], marker='*', c='purple')
plt.scatter(values[:, 0], values[:, 14], marker='d', c='brown')
plt.legend(['农业', '工业', '建筑', '批发', '交通', '餐饮', '金融', '房地产', '其他'])
plt.xlabel('年份')
plt.ylabel('生产总值(亿元)')
plt.xticks(range(0, 70, 4), values[range(0, 70, 4), 1], rotation=45)
plt.savefig('2000~ 2017年各产业与行业各季度国民生产总值散点子图.png')
plt.show()


Task 2

Draw line charts of gross national product for each sector and industry, 2000-2017

import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display

data = np.load('35data.npz/国民经济核算季度数据.npz', allow_pickle=True)
name = data['columns']
values = data['values']
p1 = plt.figure(figsize=(8, 7))
# subplot 1: the three sectors
ax3 = p1.add_subplot(2, 1, 1)
plt.plot(values[:, 0], values[:, 3], 'b-',
         values[:, 0], values[:, 4], 'r--',
         values[:, 0], values[:, 5], 'g--')
plt.ylabel('生产总值(亿元)')
plt.title('2000-2017年各产业与行业国民生产总值折线图')
plt.legend(['第一产业', '第二产业', '第三产业'])

# subplot 2: the nine industries
ax4 = p1.add_subplot(2, 1, 2)
plt.plot(values[:, 0], values[:, 6], 'r--',
         values[:, 0], values[:, 7], 'b.',
         values[:, 0], values[:, 8], 'y--',
         values[:, 0], values[:, 9], 'g:',
         values[:, 0], values[:, 10], 'c-',
         values[:, 0], values[:, 11], 'm-',
         values[:, 0], values[:, 12], 'k--',
         values[:, 0], values[:, 13], 'r:',
         values[:, 0], values[:, 14], 'b-')
plt.legend(['农业', '工业', '建筑', '批发', '交通', '餐饮', '金融', '房地产', '其他'])
plt.xlabel('年份')
plt.ylabel('生产总值(亿元)')
plt.xticks(range(0, 70, 4), values[range(0, 70, 4), 1], rotation=45)
plt.savefig('2000~ 2017年各产业与行业各季度国民生产总值折线子图.png')
plt.show()


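Analyze the internal distribution and dispersion of feature data

Histograms, pie charts and box plots are three other chart types commonly used in data analysis, mainly for analyzing how data is distributed and dispersed within a feature.

  • Histograms are mainly used to view the distribution of counts across groups and to compare counts between groups.
  • Pie charts are suited to viewing each group's share of the total.
  • Box plots are mainly used to reveal the overall distribution and dispersion of the data.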
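Each of the three chart types visualizes a quantity that can be computed directly. The sketch below works through them on a small made-up sample; the sample values, the bin edges and the 1.5 × IQR rule for flagging outliers (the usual box plot convention) are assumptions chosen for illustration.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 50], dtype=float)

# Histogram view: how many values fall into each group (bin).
counts, edges = np.histogram(values, bins=[0, 5, 10, 100])

# Pie chart view: each value's share of the total.
shares = values / values.sum()

# Box plot view: quartiles, interquartile range and the 1.5 * IQR fences.
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]

print(counts, q1, median, q3, outliers)
```

In this sample the value 50 lies far above the upper fence, so a box plot would draw it as a separate point rather than extending the whisker to it.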

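Draw a histogram

A histogram can reveal data patterns, the sample frequency distribution and the population distribution that a distribution table cannot.

The pyplot function used here to draw histograms is bar. Its basic syntax is:
matplotlib.pyplot.bar(left,height,width=0.8,bottom=None,hold=None,data=None,**kwargs)
This signature comes from older Matplotlib releases; in current releases the first parameter is named x and the hold parameter has been removed.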
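Note that bar draws one bar per height that it is given, so the grouping has to be done beforehand, whereas pyplot's hist function bins raw samples itself. The sketch below shows, on a made-up sample and with the Agg backend assumed, that counting with np.histogram and passing the counts to bar gives the same bar heights as calling hist on the raw samples.

```python
import matplotlib
matplotlib.use("Agg")  # assumed non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

samples = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5])
edges = np.arange(0.5, 6.0, 1.0)        # bin edges 0.5, 1.5, ..., 5.5
centers = (edges[:-1] + edges[1:]) / 2  # bin centers 1, 2, 3, 4, 5

fig, (ax_bar, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

# bar: count first, then pass one height per bar.
counts, _ = np.histogram(samples, bins=edges)
bars = ax_bar.bar(centers, counts, width=1.0)
bar_heights = [float(rect.get_height()) for rect in bars]

# hist: pass the raw samples and let Matplotlib do the binning.
hist_counts, _, _ = ax_hist.hist(samples, bins=edges)

print(bar_heights, hist_counts.tolist())
plt.close(fig)
```

The listings in this section pass already aggregated values (one number per sector or industry) to bar, which is why no binning step appears in them.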

import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display

data = np.load('35data.npz/国民经济核算季度数据.npz', allow_pickle=True)
name = data['columns']
values = data['values']
label = ['第一产业', '第二产业', '第三产业']
plt.figure(figsize=(6, 5))
plt.bar(range(3), values[-2, 3:6], width=0.5)
plt.xlabel('产业')  # the x axis shows sectors, not years
plt.ylabel('生产总值(亿元)')
plt.title('2000~ 2017年各产业与行业各季度国民生产总值直方图')
plt.xticks(range(3), label)
plt.savefig('2000~ 2017年各产业与行业各季度国民生产总值直方图.png')
plt.show()


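Draw a pie chart

A pie chart shows the proportion of each item to the total in a single "pie", with the size of each slice representing the item's share. It shows clearly the proportional relationships between parts and between each part and the whole, makes it easy to see the size of each group relative to the total, and is intuitive to read.

The pyplot function for drawing pie charts is pie. Its basic syntax is:
matplotlib.pyplot.pie(x, explode=None, labels=None, colors=None, autopct=None, pctdistance=0.6, shadow=False, labeldistance=1.1, startangle=None, radius=None, counterclock=True, wedgeprops=None, textprops=None, center=(0, 0), frame=False, hold=None, data=None)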

import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display

data = np.load('35data.npz/国民经济核算季度数据.npz', allow_pickle=True)
name = data['columns']
values = data['values']
label = ['第一产业', '第二产业', '第三产业']
explode = [0.01, 0.01, 0.01]
plt.figure(figsize=(6, 6))  # create the canvas first so that the pie, title and saved file share one figure
plt.pie(values[-1, 3:6], explode=explode, labels=label, autopct='%1.1f%%')
plt.title('2000~ 2017年各产业与行业各季度国民生产总值饼图')
plt.savefig('2000~ 2017年各产业与行业各季度国民生产总值占比饼图.png')
plt.show()


Draw a box plot

import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display

plt.figure(figsize=(6, 4))
data = np.load('35data.npz/国民经济核算季度数据.npz', allow_pickle=True)
name = data['columns']
values = data['values']
label = ['第一产业', '第二产业', '第三产业']
gdp = (list(values[:, 3]), list(values[:, 4]), list(values[:, 5]))
plt.boxplot(gdp, notch=True, labels=label, meanline=True)
plt.title('2000-2017年各产业国民生产总值箱线图')
plt.savefig('2000-2017年各产业国民生产总值箱线图.png')
plt.show()


Task implementation

Task 1
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display

data = np.load('35data.npz/国民经济核算季度数据.npz', allow_pickle=True)
name = data['columns']
values = data['values']
label1 = ['第一产业', '第二产业', '第三产业']
label2 = ['农业', '工业', '建筑', '批发', '交通', '餐饮', '金融', '房地产', '其他']
p = plt.figure(figsize=(12, 12))
# subplot 1: sector composition, first row of the data (2000 Q1)
ax1 = p.add_subplot(2, 2, 1)
plt.bar(range(3), values[0, 3:6], width=0.5)
plt.xlabel('产业')
plt.ylabel('生产总值(亿元)')
plt.title('2000年第一季度国民生产总值产业构成分布直方图')
plt.xticks(range(3), label1)

# subplot 2: sector composition, last row of the data (assumed to be 2017 Q1)
ax2 = p.add_subplot(2, 2, 2)
plt.bar(range(3), values[-1, 3:6], width=0.5)
plt.xlabel('产业')
plt.ylabel('生产总值(亿元)')
plt.title('2017年第一季度国民生产总值产业构成分布直方图')
plt.xticks(range(3), label1)
# subplot 3: industry composition, first row of the data
ax3 = p.add_subplot(2, 2, 3)
plt.bar(range(9), values[0, 6:], width=0.5)
plt.xlabel('行业')
plt.ylabel('生产总值(亿元)')
plt.title('2000年第一季度国民生产总值行业构成分布直方图')
plt.xticks(range(9), label2)
# subplot 4: industry composition, last row of the data
ax4 = p.add_subplot(2, 2, 4)
plt.bar(range(9), values[-1, 6:], width=0.5)
plt.xlabel('行业')
plt.ylabel('生产总值(亿元)')
plt.title('2017年第一季度国民生产总值行业构成分布直方图')
plt.xticks(range(9), label2)
plt.savefig('2000~ 2017年各产业与行业各季度国民生产总值构成分布直方图.png')
plt.show()


Task 2

Draw pie charts of the composition of gross national product

import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display

data = np.load('35data.npz/国民经济核算季度数据.npz', allow_pickle=True)
name = data['columns']
values = data['values']
label1 = ['第一产业', '第二产业', '第三产业']
label2 = ['农业', '工业', '建筑', '批发', '交通', '餐饮', '金融', '房地产', '其他']

explode1 = [0.01, 0.01, 0.01]
explode2 = [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]
p = plt.figure(figsize=(12, 12))
# subplot 1: sector composition, first row of the data (2000 Q1)
ax1 = p.add_subplot(2, 2, 1)
plt.pie(values[0, 3:6], explode=explode1, labels=label1, autopct='%1.1f%%')
plt.title('2000年第一季度国民生产总值产业构成分布饼图')
# subplot 2: sector composition, last row of the data (assumed to be 2017 Q1)
ax2 = p.add_subplot(2, 2, 2)
plt.pie(values[-1, 3:6], explode=explode1, labels=label1, autopct='%1.1f%%')
plt.title('2017年第一季度国民生产总值产业构成分布饼图')
# subplot 3: industry composition, first row of the data
ax3 = p.add_subplot(2, 2, 3)
plt.pie(values[0, 6:], explode=explode2, labels=label2, autopct='%1.1f%%')
plt.title('2000年第一季度国民生产总值行业构成分布饼图')
# subplot 4: industry composition, last row of the data
ax4 = p.add_subplot(2, 2, 4)
plt.pie(values[-1, 6:], explode=explode2, labels=label2, autopct='%1.1f%%')
plt.title('2017年第一季度国民生产总值行业构成分布饼图')
# save and display the figure
plt.savefig('国民生产总值产业构成分布饼图.png')
plt.show()


Task 3
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display

data = np.load('35data.npz/国民经济核算季度数据.npz', allow_pickle=True)
name = data['columns']
values = data['values']

label1 = ['第一产业', '第二产业', '第三产业']
label2 = ['农业', '工业', '建筑', '批发', '交通', '餐饮', '金融', '房地产', '其他']

gdp1 = (list(values[:, 3]), list(values[:, 4]), list(values[:, 5]))
gdp2 = ([list(values[:, i]) for i in range(6, 15)])
p = plt.figure(figsize=(8, 8))

# subplot 1: the three sectors
ax1 = p.add_subplot(2, 1, 1)
plt.boxplot(gdp1, notch=True, labels=label1, meanline=True)
plt.title('2000-2017年各产业国民生产总值箱线图')
plt.ylabel('生产总值(亿元)')
# subplot 2: the nine industries
ax2 = p.add_subplot(2, 1, 2)
plt.boxplot(gdp2, notch=True, labels=label2, meanline=True)
plt.title('2000-2017年各行业国民生产总值箱线图')
plt.xlabel('行业')
plt.ylabel('生产总值(亿元)')
plt.savefig('2000-2017年各产业国民生产总值箱线图.png')
plt.show()


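Exercises

Exercise 1

Requirements
The population data has six features: year-end total population, male population, female population, urban population, rural population and year. Examining how each feature changes over time makes it possible to analyze the future direction of the male-to-female ratio and of urban and rural population change.
Steps
(1) Use the NumPy library to read the population data.
(2) Create a canvas and add subplots.
(3) Draw a scatter plot and a line chart on the two subplots respectively.
(4) Save and display the figure.
(5) Analyze the future trend of population change.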

import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display

data = np.load('Data/populations.npz', allow_pickle=True)
feature_names = data['feature_names']
data = data['data']
# for i in data:
#     print(i)
p = plt.figure(figsize=(10, 9))
# subplot 1: scatter plot
ax1 = p.add_subplot(2, 1, 1)
plt.scatter(range(data.shape[0] - 2), data[:-2, 1], marker='o', c='r')
plt.scatter(range(data.shape[0] - 2), data[:-2, 2], marker='D', c='b')
plt.scatter(range(data.shape[0] - 2), data[:-2, 3], marker='v', c='y')
plt.scatter(range(data.shape[0] - 2), data[:-2, 4], marker='+', c='c')
plt.scatter(range(data.shape[0] - 2), data[:-2, 5], marker='p', c='g')
plt.xlabel('时间-年份')
plt.ylabel('人口数(万人)')
plt.xticks(range(data.shape[0] - 2), data[:-2, 0], rotation=45)
plt.title('1996~2015年各特征人口变化散点图')
plt.legend(['年末总人口', '男性人口', '女性人口', '城镇人口', '乡村人口'])  # one label per plotted series
# subplot 2: line chart
ax2 = p.add_subplot(2, 1, 2)
plt.plot(range(data.shape[0] - 2), data[:-2, 1], c='r', linestyle='--')
plt.plot(range(data.shape[0] - 2), data[:-2, 2], c='b', linestyle='--')
plt.plot(range(data.shape[0] - 2), data[:-2, 3], c='y', linestyle='--')
plt.plot(range(data.shape[0] - 2), data[:-2, 4], c='g', linestyle='--')
plt.plot(range(data.shape[0] - 2), data[:-2, 5], c='c', linestyle='--')
plt.legend(['年末总人口', '男性人口', '女性人口', '城镇人口', '乡村人口'])
plt.xlabel('时间-年份')
plt.ylabel('人口数(万人)')
plt.xticks(range(data.shape[0] - 2), data[:-2, 0], rotation=45)
plt.title('1996-2015年各特征人口数折线图')
plt.show()


Exercise 2

Requirements
Histograms of the male, female, urban and rural populations for each year, together with pie charts of the male-to-female and urban-to-rural ratios, reveal changes in the population structure. Box plots of each feature show whether the growth or decline of each feature has slowed.
Steps
(1) Create 3 canvases and add the corresponding number of subplots.
(2) Draw the corresponding chart on each subplot.
(3) Save and display the figures.
(4) Based on the figures, analyze the changes in the country's population structure and whether the rate of change is increasing or decreasing.

import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display

data = np.load('Data/populations.npz', allow_pickle=True)
feature_names = data['feature_names']
data = data['data']
pt = plt.figure(figsize=(12, 11))

# create subplot 1 (male population) in the same 2 x 2 grid as the other three
ax1 = pt.add_subplot(2, 2, 1)
plt.bar(range(data.shape[0] - 2), data[:-2, 2], width=0.5)
plt.xticks(range(data.shape[0] - 2), data[:-2, 0], rotation=45)
plt.xlabel('1996-2015年男性人口数目')
plt.ylabel('人口数目(万人)')
plt.title('1996~2015年人口数据特征间的关系的直方图')


# create subplot 2 (female population)
ax2 = pt.add_subplot(2, 2, 2)
plt.bar(range(data.shape[0] - 2), data[:-2, 3], width=0.5)
plt.xticks(range(data.shape[0] - 2), data[:-2, 0], rotation=45)
plt.xlabel('1996-2015年女性人口数目')
plt.ylabel('人口数目(万人)')

# create subplot 3 (urban population)
ax3 = pt.add_subplot(2, 2, 3)
plt.bar(range(data.shape[0] - 2), data[:-2, 4], width=0.5)
plt.xticks(range(data.shape[0] - 2), data[:-2, 0], rotation=45)
plt.xlabel('1996-2015年城市人口数目')
plt.ylabel('人口数目(万人)')

# create subplot 4 (rural population)
ax4 = pt.add_subplot(2, 2, 4)
plt.bar(range(data.shape[0] - 2), data[:-2, 5], width=0.5)
plt.xticks(range(data.shape[0] - 2), data[:-2, 0], rotation=45)
plt.xlabel('1996-2015年乡村人口数目')
plt.ylabel('人口数目(万人)')
plt.show()


# draw the pie charts
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display

data = np.load('Data/populations.npz', allow_pickle=True)
feature_names = data['feature_names']
data = data['data']

pt2 = plt.figure(figsize=(12, 10))
# create subplot 1 (male population)
ax1 = pt2.add_subplot(2, 2, 1)
plt.pie(data[:-2, 2], labels=data[:-2, 0], autopct='%1.1f%%')
plt.title('1996-2015年男性人口比例')

# create subplot 2 (female population)
ax2 = pt2.add_subplot(2, 2, 2)
plt.pie(data[:-2, 3], labels=data[:-2, 0], autopct='%1.1f%%')
plt.title('1996-2015年女性人口比例')
# create subplot 3 (urban population)
ax3 = pt2.add_subplot(2, 2, 3)
plt.pie(data[:-2, 4], labels=data[:-2, 0], autopct='%1.1f%%')
plt.title('1996-2015年城市人口比例')

# create subplot 4 (rural population)
ax4 = pt2.add_subplot(2, 2, 4)
plt.pie(data[:-2, 5], labels=data[:-2, 0], autopct='%1.1f%%')
plt.title('1996-2015年乡村人口比例')
plt.show()


# draw the box plot
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'  # use the SimHei font so that Chinese characters are displayed
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for the minus sign, which this font can display

data = np.load('Data/populations.npz', allow_pickle=True)
feature_names = data['feature_names']
data = data['data']

pt3 = plt.figure(figsize=(12, 10))
label = ['年末总人口', '男性人口', '女性人口', '城镇人口', '乡村人口']
plt.boxplot(([list(data[:-2, i]) for i in range(1, 6)]), labels=label, meanline=True)
plt.title('1996-2015年各特征人口数箱线图')
plt.ylabel('人口数(万人)')
plt.show()


Origin blog.csdn.net/m0_49265034/article/details/125177347