Python data visualization: the matplotlib library

0. Foreword

As the saying goes, "a picture is worth a thousand words." In data analysis, a chart with a sensible layout and well-chosen colors helps us understand and explain the data, and it conveys the information and the story behind it more effectively, so the importance of data visualization goes without saying.

In Python, commonly used data visualization libraries include matplotlib and seaborn. This article introduces the main plotting functions of matplotlib.

Official documentation: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.scatter.html

Before use, import the library and set related display parameters.

import matplotlib.pyplot as plt

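# Use a font that can render Chinese text, and display the minus sign correctly
plt.rcParams["font.sans-serif"] = ["SimHei"]
plt.rcParams["axes.unicode_minus"] = False
# Figure size (in inches) and resolution (dots per inch)
plt.rcParams['figure.figsize'] = [8, 6]
plt.rcParams['figure.dpi'] = 300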

1. Pie chart

related functions

matplotlib.pyplot.pie(x, explode=None, labels=None, colors=None, autopct=None, pctdistance=0.6,
 shadow=False, labeldistance=1.1, startangle=0, radius=1, counterclock=True, wedgeprops=None,
 textprops=None, center=(0, 0), frame=False, rotatelabels=False, *, normalize=True, hatch=None, data=None)

Detailed Explanation of Common Parameters

  • x: the wedge sizes, one number per slice of the pie
  • labels: the label text for each wedge
  • colors: the color of each wedge
  • autopct: a format string (or function) for the percentage text drawn inside each wedge, e.g. '%1.1f%%'
  • explode: how far each wedge is offset from the center, as a fraction of the radius, given as a list with one entry per wedge
  • shadow: whether to draw a shadow beneath the pie
  • startangle: the angle, in degrees, by which the start of the pie is rotated counterclockwise from the x-axis; the default is 0
  • radius: the radius of the pie (default 1)
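
To make the effect of autopct and startangle concrete, here is a minimal sketch. The labels 'A' and 'B' and the variable names are made up for illustration; the sketch relies on plt.pie returning the wedge patches, the label texts, and (when autopct is given) the percentage texts.

```python
import matplotlib.pyplot as plt

x = [46, 57]  # wedge sizes; pie() normalizes them to fractions of the total

plt.figure()
# startangle=90 with counterclock=False starts at 12 o'clock and goes clockwise
patches, texts, autotexts = plt.pie(
    x, labels=['A', 'B'], autopct='%1.1f%%',
    startangle=90, counterclock=False)

# The percentage strings produced by autopct
percent_labels = [t.get_text() for t in autotexts]
print(percent_labels)
# plt.show()  # uncomment to display the figure
```

Reading the text back from autotexts is only a convenient way to check what autopct rendered; in normal use you would simply look at the figure.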

code example

# Create the data
x = [46, 57]

# Draw the pie chart
plt.pie(x, explode=[0.1, 0], colors=['skyblue', 'orange'],
        labels=['Label 1', 'Label 2'], shadow=True,
        autopct='%1.1f%%', radius=0.8)
plt.legend()
plt.title('Pie chart')
plt.tight_layout()
# plt.savefig('pie.png')

# Show the figure
plt.show()

2. Histogram

plt.hist takes the raw data, counts how many values fall into each bin, and plots the resulting distribution.

related functions

matplotlib.pyplot.hist(x, bins=None, range=None, density=False, 
weights=None, cumulative=False, bottom=None, histtype='bar', 
align='mid', orientation='vertical', rwidth=None, log=False, 
color=None, label=None, stacked=False, *, data=None, **kwargs)

Detailed Explanation of Common Parameters

  • x: input data;
  • bins: the number of bins (bars) in the histogram, or a sequence of bin edges;
  • range: the lower and upper limits of the bins, as a tuple (min, max); values outside it are ignored;
  • density: whether to normalize the histogram so that its total area is 1;
  • cumulative: whether to plot the cumulative distribution;
  • histtype: histogram type, one of 'bar', 'barstacked', 'step', 'stepfilled';
  • color: bar color;
  • label: label used in the legend;
  • orientation: bar orientation, 'horizontal' or 'vertical';
  • alpha: transparency (passed through **kwargs);
  • edgecolor: border color (passed through **kwargs);
  • linewidth: border width (passed through **kwargs).
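
As a rough illustration of how bins, range and density work together, the sketch below uses the same small data set and inspects the values plt.hist returns (the counts or densities, the bin edges, and the bar patches). The choice of five bins over (0.5, 5.5) is an assumption made so that each integer lands in its own bin.

```python
import matplotlib.pyplot as plt

data = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# Five equal-width bins over [0.5, 5.5]: one bin per integer value
plt.figure()
counts, edges, patches = plt.hist(data, bins=5, range=(0.5, 5.5),
                                  color='skyblue', edgecolor='black')
print(counts)  # number of values in each bin
print(edges)   # the six bin edges

# With density=True the bar heights are rescaled so the total area is 1
plt.figure()
density, _, _ = plt.hist(data, bins=5, range=(0.5, 5.5), density=True)
bin_width = edges[1] - edges[0]
print(density.sum() * bin_width)  # total area under the histogram
# plt.show()  # uncomment to display the figures
```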

code example

# Create the data
data = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# Draw the histogram
plt.hist(data, density=False, alpha=1)
plt.xlabel('x')
plt.ylabel('count')
plt.title('Histogram')
plt.grid(True)
# plt.tight_layout()
# plt.savefig('hist.png')

# Show the figure
plt.show()

3. Line chart

related functions

matplotlib.pyplot.plot(*args, scalex=True, scaley=True,
 data=None, **kwargs)

Detailed Explanation of Common Parameters

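  • x: abscissa data (optional; if omitted, the indices 0, 1, 2, … are used)
  • y: ordinate data
  • color: line color (by default the next color in the color cycle, which starts with blue)
  • linestyle: line style (default is a solid line)
  • linewidth: line width (default is 1.5)
  • marker: data point style (default none)
  • markersize: data point size (default 6)
  • label: label, used to distinguish different data series in the legend
  • The remaining items are not arguments of plot; they are separate pyplot functions called after it:
  • plt.xlabel: abscissa label
  • plt.ylabel: ordinate label
  • plt.title: chart title
  • plt.xlim: abscissa range
  • plt.ylim: ordinate range
  • plt.legend: draws the legend (no legend is shown unless it is called)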
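
The sketch below separates the two kinds of settings: line properties go to plt.plot as keyword arguments, while labels, title, axis limits and the legend are set by separate pyplot calls. The axis limits and label texts are arbitrary values chosen for illustration.

```python
import matplotlib.pyplot as plt

data = [1, 2, 2, 3, 3, 3, 4, 4, 5]

plt.figure()
# Line properties are keyword arguments of plot(); it returns a list of Line2D objects
lines = plt.plot(data, color='red', linestyle='--', linewidth=1.2,
                 marker='o', markersize=6, label='data')

# Axis labels, title, limits and legend are set with their own functions
plt.xlabel('index')
plt.ylabel('value')
plt.title('Line chart')
plt.xlim(0, 8)
plt.ylim(0, 6)
plt.legend()
# plt.show()  # uncomment to display the figure
```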

code example

# Create the data
data = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# Draw the line chart
plt.plot(data, color='red', linestyle='-',
         linewidth=1.2, marker='o', label='data')
plt.legend()
plt.xlabel('x')
plt.ylabel('y')
plt.title('Line chart')
plt.grid(True)
# plt.tight_layout()
# plt.savefig('plot.png')

# Show the figure
plt.show()

4. Scatter plot

related functions

matplotlib.pyplot.scatter(x, y, s=None, c=None, marker=None, 
cmap=None, norm=None, vmin=None, vmax=None, alpha=None, 
linewidths=None, *, edgecolors=None, plotnonfinite=False, 
data=None, **kwargs)

Detailed Explanation of Common Parameters

  • x, y: array or scalar, the horizontal and vertical coordinates of the points
  • s: scalar or array, the marker size (in points squared)
  • c: a color, a sequence of colors, or a sequence of numbers to be mapped to colors with cmap. By default the next color in the color cycle is used
  • marker: the marker shape, the default is a circle ('o')
  • alpha: transparency, in the range [0, 1]; by default the points are opaque
  • cmap: colormap; it is only used when c is a sequence of numbers, which are then mapped onto a continuous color scale
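
To show how c and cmap work together, here is a small sketch in which c is a numeric array rather than a color name. The data, the 'viridis' colormap and the size formula are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.random(50)
y = rng.random(50)
values = x + y  # one number per point, mapped to a color by cmap

plt.figure()
# c is numeric here, so cmap decides the colors; s scales marker area per point
points = plt.scatter(x, y, c=values, s=40 + 100 * values,
                     cmap='viridis', alpha=0.8)
plt.colorbar(points, label='x + y')
# plt.show()  # uncomment to display the figure
```

If c were a single color name such as 'orange', cmap would have no effect.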

code example

import numpy as np
# Create the data
x1 = np.random.random(size=50)
y1 = np.random.random(size=50)
x2 = np.random.random(size=50)
y2 = np.random.random(size=50)

# Draw the scatter plot
plt.scatter(x1, y1, label='type1', alpha=0.8, c='orange')
plt.scatter(x2, y2, label='type2', alpha=1, c='skyblue')
plt.legend()
plt.xlabel('x')
plt.ylabel('y')
plt.title('Scatter plot')
plt.grid(True)
# plt.tight_layout()
# plt.savefig('scatter.png')

# Show the figure
plt.show()

5. Bar chart

related functions

# Vertical bar chart
matplotlib.pyplot.bar(x, height, width=0.8, bottom=None,
*, align='center', data=None, **kwargs)
# Horizontal bar chart
matplotlib.pyplot.barh(y, width, height=0.8, left=None,
 *, align='center', data=None, **kwargs)

Detailed Explanation of Common Parameters

  • x: the abscissa value of each column
  • height: the height of each column
  • width: the width of each column (default 0.8)
  • bottom: position of the bottom of each bar (default 0)
  • align: Column alignment, options include 'center', 'edge' (default is 'center')
  • color: column color, can be a string, RGB tuple or RGBA tuple
  • edgecolor: Column edge color
  • linewidth: column boundary line width
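
One common use of bottom is stacking a second series on top of the first. The sketch below is only an illustration with made-up numbers; it reads the rectangle geometry back from the returned bar container to show where each stacked bar ends.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3]
first = [4, 4, 5]
second = [2, 3, 1]

plt.figure()
lower = plt.bar(x, first, width=0.8, color='orange', label='first')
# bottom=first makes each bar of the second series start where the first ends
upper = plt.bar(x, second, width=0.8, bottom=first, color='skyblue', label='second')
plt.legend()

# Top edge of each stacked bar: start position plus height
tops = [rect.get_y() + rect.get_height() for rect in upper]
print(tops)
# plt.show()  # uncomment to display the figure
```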

code example

# Create the data
x = [1, 2, 3, 4, 5, 6]
y = [4, 4, 5, 9, 2, 3]

# Draw the vertical bar chart in the left subplot
plt.subplot(1, 2, 1)
plt.bar(x, height=y, width=0.8, bottom=0, label='vertical', align='center', color='orange')
plt.legend()
plt.xlabel('x')
plt.ylabel('y')
plt.title('Vertical bar chart')
plt.grid(True)

# Draw the horizontal bar chart in the right subplot
plt.subplot(1, 2, 2)
plt.barh(x, width=y, height=0.6, left=0, label='horizontal', align='center', color='red')
plt.legend()
plt.xlabel('y')
plt.ylabel('x')
plt.title('Horizontal bar chart')
plt.grid(True)

# plt.tight_layout()
# plt.savefig('bar.png')

# Show the figure
plt.show()

6. Box plot

related functions

matplotlib.pyplot.boxplot(x, notch=None, sym=None, 
vert=None, whis=None, positions=None, widths=None, 
patch_artist=None, meanline=None, showmeans=None, 
showcaps=None, showbox=None, showfliers=None, boxprops=None, 
labels=None, flierprops=None, medianprops=None, meanprops=None, 
capprops=None, whiskerprops=None)

Detailed Explanation of Common Parameters

  • x: the data to draw the box plot from;
  • notch: whether to draw a notched box plot; by default there is no notch;
  • sym: the marker used for outlier points, e.g. 'o'; an empty string hides them;
  • vert: whether to draw the boxes vertically; the default is vertical;
  • whis: how far the whiskers may extend beyond the upper and lower quartiles, as a multiple of the interquartile range; the default is 1.5;
  • positions: the positions of the boxes; the default is 1, 2, …, N for N boxes;
  • widths: the width of the boxes, the default is 0.5;
  • patch_artist: whether to fill the boxes with color;
  • meanline: whether to draw the mean as a line across the box; by default it is drawn as a point;
  • showmeans: whether to display the mean; not displayed by default;
  • showcaps: whether to display the caps at the ends of the whiskers; displayed by default;
  • showbox: whether to display the box itself; displayed by default;
  • showfliers: whether to display outliers; displayed by default;
  • boxprops: properties of the box, such as border color and fill color;
    • boxprops = {'color': 'g', 'facecolor': 'yellow'}
    • 'color': 'g' sets the color of the box border
    • 'facecolor': 'yellow' sets the box fill color (requires patch_artist=True)
  • labels: tick labels for the boxes (in Matplotlib 3.9 and later this parameter is named tick_labels);
  • flierprops: properties of the outliers, such as marker shape, size and fill color;
  • medianprops: properties of the median line, such as line style and thickness;
  • meanprops: properties of the mean, such as marker size and color;
  • capprops: properties of the whisker caps, such as color and thickness;
  • whiskerprops: properties of the whiskers, such as color, thickness and line style.
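
The whis rule can be checked by hand. The sketch below computes the quartiles and the 1.5 × IQR fences with numpy for a small made-up data set and compares the resulting outliers with the flier points that plt.boxplot draws. It assumes the quartiles are computed with numpy's default (linear) percentile interpolation, which matches the result here.

```python
import numpy as np
import matplotlib.pyplot as plt

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])

# Quartiles and the 1.5 * IQR fences, computed by hand
q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = data[(data < lower_fence) | (data > upper_fence)]
print(q1, median, q3, outliers)

# boxplot returns a dict of the artists it drew; 'fliers' holds the outlier points
plt.figure()
result = plt.boxplot(data, whis=1.5)
fliers = result['fliers'][0].get_ydata()
print(fliers)
# plt.show()  # uncomment to display the figure
```

Values outside the fences are drawn as individual points, and the whiskers stop at the most extreme data values that are still inside the fences.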

code example

import numpy as np

# Create the data:
# three groups of normally distributed random numbers generated with numpy
x = [np.random.normal(0, std, 100) for std in range(1, 4)]

# Draw the box plot
plt.boxplot(x,
            patch_artist=True, sym='o',
            labels=['Group 1', 'Group 2', 'Group 3'],
            showmeans=True,
            boxprops={'color': 'black', 'facecolor': '#9999ff'},
            flierprops={'marker': 'o', 'markerfacecolor': 'red', 'color': 'black'},
            meanprops={'marker': 'D', 'markerfacecolor': 'indianred', 'color': 'y'},
            medianprops={'linestyle': '--', 'color': 'orange'})
plt.xlabel('x')
plt.ylabel('y')
plt.title('Box plot')
plt.grid(True)
# plt.tight_layout()
# plt.savefig('boxplot.png')

# Show the figure
plt.show()

7. Polar coordinate diagram

related functions

matplotlib.pyplot.polar(theta, r, *args, **kwargs)

Detailed Explanation of Common Parameters

  • theta: sequence of polar angles, in radians.
  • r: sequence of radii, with the same length as theta.
  • *args: optional positional arguments passed on to plot, typically a format string such as 'ro-' that combines color, marker and line style.
  • **kwargs: keyword arguments that set line properties, such as color, linestyle, linewidth, label, alpha and visible.
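
Because theta must be in radians, angles given in degrees need converting first. The following sketch uses made-up radii, converts with np.deg2rad, and repeats the first point at the end so the outline closes.

```python
import numpy as np
import matplotlib.pyplot as plt

angles_deg = np.array([0, 45, 90, 135, 180, 225, 270, 315])
r = np.array([3, 5, 2, 6, 4, 7, 3, 5])

# polar() expects radians
theta = np.deg2rad(angles_deg)

# Repeat the first point at the end so the outline is closed
theta_closed = np.append(theta, theta[0])
r_closed = np.append(r, r[0])

plt.figure()
# 'o-' is a format string (circle markers joined by a solid line)
lines = plt.polar(theta_closed, r_closed, 'o-', color='black')
# plt.show()  # uncomment to display the figure
```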

code example

import numpy as np
# Create the data:
# build the sequences of angles and radii
theta = np.linspace(0, 2 * np.pi, 8, endpoint=False)
r = 10 * np.random.rand(len(theta))

# Draw the polar plot
plt.polar(theta, r, color='black')
plt.title('Polar plot')
# plt.tight_layout()
# plt.savefig('polar.png')

# Show the figure
plt.show()

8. Step diagram

related functions

matplotlib.pyplot.step(x, y, *args, where='pre', 
data=None, **kwargs)

Detailed Explanation of Common Parameters

  • x: array_like, the values on the x-axis
  • y: array_like, the values on the y-axis
  • where: {'pre', 'post', 'mid'}, where the step is placed relative to each x position. With 'pre' the y value extends to the left of its x position, with 'post' it extends to the right, and with 'mid' the step occurs halfway between neighboring x positions. Defaults to 'pre'
  • data: optional object with labelled data (for example a DataFrame); if it is given, x and y can be passed as the names of its columns
  • label: str, optional, the name of the line shown in the legend
  • color: the line color, e.g. 'red'
  • linestyle: {'-', '--', '-.', ':', '', (offset, on-off-seq), …}, optional, the style of the line
  • linewidth: float, optional, the line width
  • alpha: float, optional, the transparency of the line
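
The difference between the three where settings is easiest to see side by side. This sketch draws the same five made-up points three times, shifted vertically so the curves do not overlap, with markers showing the actual data points.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(5)
y = np.array([1, 3, 2, 4, 3])

fig, ax = plt.subplots()
for offset, where in enumerate(['pre', 'post', 'mid']):
    # Shift each curve up by 5 so the three step styles can be compared
    ax.step(x, y + 5 * offset, where=where, marker='o',
            label=f"where='{where}'")
ax.legend()

# Each line records its step style as a draw style
draw_styles = [line.get_drawstyle() for line in ax.get_lines()]
print(draw_styles)
# plt.show()  # uncomment to display the figure
```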

code example

import numpy as np
# Generate the data
x = np.arange(0, 5, 0.1)
y = np.sin(x)

# Draw the step plot
fig, ax = plt.subplots()
ax.step(x, y, label='sin', color='r', linestyle='-')

# Add the labels and title
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_title('Step plot')
plt.legend()
plt.grid(True)
# plt.tight_layout()
# plt.savefig('step.png')

# Show the figure
plt.show()

9. Spectrogram

The plt.specgram function is used to draw a spectrogram, which divides an audio signal into small time windows and calculates its spectrum within each time window. This function is commonly used in audio signal processing and analysis.

related functions

matplotlib.pyplot.specgram(x, NFFT=None, Fs=None, Fc=None, 
detrend=None, window=None, noverlap=None, cmap=None, xextent=None, 
pad_to=None, sides=None, scale_by_freq=None, mode=None, scale=None, 
vmin=None, vmax=None, *, data=None, **kwargs)

Detailed Explanation of Common Parameters

  • x: one-dimensional array containing the signal.
  • Fs: sampling rate in Hz (default is 2).
  • NFFT: the number of data points in each FFT segment (default is 256), which determines the frequency resolution.
  • noverlap: number of overlapping samples between neighboring segments (default is 128).
  • detrend: detrending method applied to each segment, one of 'none', 'mean', 'linear' or a function; the default is 'none'.
  • window: the window function (default is a Hanning window).
  • mode: which spectrum to compute, one of 'psd', 'magnitude', 'angle' and 'phase' (default is 'psd').
  • scale: scaling of the values, 'linear' or 'dB' (default is 'dB').
  • cmap: the colormap to use for the spectrogram (default is None).
  • xextent: the range of the image along the x-axis, as (xmin, xmax) (default is None).
  • vmin: the minimum value of the spectrogram colormap (default is None).
  • vmax: the maximum value of the spectrogram colormap (default is None).
  • **kwargs: other keyword arguments, passed on to imshow, which draws the image (for example alpha).
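
plt.specgram also returns the computed data, which helps when checking what NFFT and noverlap do. The sketch below uses a made-up test signal (a 10 Hz tone that switches to 30 Hz halfway through, plus a little noise) and looks at the shape of the returned spectrum and at the strongest frequency in the first and last time slice.

```python
import numpy as np
import matplotlib.pyplot as plt

fs = 100                      # sampling rate in Hz
n = 3000                      # 30 seconds of samples
t = np.arange(n) / fs
rng = np.random.default_rng(0)

# 10 Hz for the first 15 s, 30 Hz afterwards, with a little noise
tone = np.where(t < 15, np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 30 * t))
signal = tone + 0.1 * rng.normal(size=n)

plt.figure()
spectrum, freqs, times, im = plt.specgram(signal, NFFT=200, Fs=fs, noverlap=100)

# One row per frequency bin, one column per time segment
print(spectrum.shape, freqs.size, times.size)

# Strongest frequency in the first and in the last time segment
first_peak = freqs[np.argmax(spectrum[:, 0])]
last_peak = freqs[np.argmax(spectrum[:, -1])]
print(first_peak, last_peak)
# plt.show()  # uncomment to display the figure
```

With NFFT=200 and Fs=100 the frequency bins are 0.5 Hz apart; a larger NFFT gives finer frequency resolution at the cost of coarser time resolution.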

code example

import numpy as np
# Generate the data
x = np.random.randn(3000)

# Draw the spectrogram
plt.specgram(x, NFFT=200, Fs=100, noverlap=100)

plt.ylabel('Frequency [Hz]')
plt.xlabel('x')
plt.title('Spectrogram')
plt.grid(True)
# plt.tight_layout()
# plt.savefig('specgram.png')

# Show the figure
plt.show()

10. Power spectral density plot

plt.psd() is used to calculate and draw the power spectral density (Power Spectral Density, PSD) of the signal. It returns frequencies and corresponding power spectral density values, which can be used to plot PSD charts to visualize the spectral information of a signal.

related functions

matplotlib.pyplot.psd(x, NFFT=None, Fs=None, Fc=None, detrend=None,
 window=None, noverlap=None, pad_to=None, sides=None, 
 scale_by_freq=None, return_line=None, *, data=None, **kwargs)

Detailed Explanation of Common Parameters

  • x: a one-dimensional array containing the signal samples
  • NFFT: the number of data points in each FFT segment, usually a power of 2. If not specified, 256 is used.
  • Fs: sampling frequency, in Hz. If not specified, it defaults to 2.
  • detrend: the detrending applied to each segment before the FFT. Possible values are 'none' (no detrending, the default), 'mean' (subtract the mean), 'linear' (remove a linear trend), or a function.
  • window: the window applied to each segment, given as a function or as an array of length NFFT. If not specified, a Hanning window (matplotlib.mlab.window_hanning) is used.
  • noverlap: number of overlapping samples between segments. Defaults to 0.
  • pad_to: the number of points each segment is padded to before the FFT. If not specified, it equals NFFT.
  • sides: which sides of the spectrum to return, one of 'default', 'onesided' and 'twosided'. The default gives a one-sided spectrum for real data and a two-sided spectrum for complex data.
  • scale_by_freq: whether to scale the density values by the sampling frequency, which gives a density in units of 1/Hz (default is True).
  • return_line: whether to include the plotted Line2D object in the returned values (default is False).
  • kwargs: additional keyword arguments for the line, such as color, linestyle, label, etc.
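
The arrays returned by plt.psd (the PSD values first, then the frequencies) can be used to locate peaks numerically. The sketch below builds a sum of a 10 Hz and a 20 Hz sine wave; peak_in_band is a hypothetical helper written for this illustration, not a matplotlib function.

```python
import numpy as np
import matplotlib.pyplot as plt

fs = 100                      # sampling frequency in Hz (a sampling interval of 0.01 s)
t = np.arange(1024) / fs
s = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 20 * t)

plt.figure()
# psd returns the power spectral density values and their frequencies
Pxx, freqs = plt.psd(s, NFFT=256, Fs=fs)


def peak_in_band(low, high):
    """Return the frequency with the largest PSD value between low and high (Hz)."""
    mask = (freqs >= low) & (freqs <= high)
    return freqs[mask][np.argmax(Pxx[mask])]


peak_low = peak_in_band(5, 15)
peak_high = peak_in_band(15, 25)
print(peak_low, peak_high)
# plt.show()  # uncomment to display the figure
```

The peaks land on the frequency bins nearest to 10 Hz and 20 Hz; with NFFT=256 and Fs=100 the bins are about 0.39 Hz apart, so the values are close to, but not exactly, the true frequencies.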

code example

The code below plots the power spectral density of the sum of two sine waves with frequencies 10 Hz and 20 Hz. The first argument of plt.psd() is the signal, and the sampling frequency is passed with the Fs keyword (computed here as 1/dt). The function has other optional parameters that set the window, the amount of overlap, and so on.

import numpy as np
# Generate the signal data
N = 1024  # number of samples
dt = 0.01  # sampling interval in seconds
t = np.arange(0, N*dt, dt)  # time array
f1 = 10  # frequency of signal 1 (Hz)
f2 = 20  # frequency of signal 2 (Hz)
s1 = np.sin(2*np.pi*f1*t)  # signal 1
s2 = np.sin(2*np.pi*f2*t)  # signal 2
s = s1 + s2  # sum of the two signals

# Draw the power spectral density plot
plt.psd(s, Fs=1/dt)
plt.xlim([0, 50])  # limit the x-axis range
plt.xlabel('Frequency (Hz)')  # x-axis label
plt.ylabel('Power Spectral Density')  # y-axis label
plt.title('Power spectral density')
plt.grid(True)
# plt.tight_layout()
# plt.savefig('psd.png')

# Show the figure
plt.show()

11. Coherence Spectrum

The plt.cohere function is used to calculate and plot the coherence spectrum of two signals.

related functions

matplotlib.pyplot.cohere(x, y, NFFT=256, Fs=2, Fc=0, 
detrend=<function detrend_none>, window=<function window_hanning>, 
noverlap=0, pad_to=None, sides='default', scale_by_freq=None,
 *, data=None, **kwargs)

Detailed Explanation of Common Parameters

  • x, y: the two input signals.
  • Fs: sampling frequency (default is 2).
  • NFFT: the number of data points in each FFT segment (default is 256).
  • detrend: the detrending method, one of 'none' (the default), 'mean', 'linear', or a function.
  • noverlap: the number of overlapping points between segments (default is 0); half of NFFT is a common choice.
  • sides: which sides of the spectrum to use, one of 'default', 'onesided' and 'twosided'.
  • scale_by_freq: whether to scale the spectral densities by the sampling frequency, True or False.
  • **kwargs: keyword arguments for the plotted line, such as color, linestyle and label.
  • cohere draws a line, so colormap settings such as cmap, vmin and vmax do not apply, and the axis labels and title are set separately with plt.xlabel, plt.ylabel and plt.title.
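
Coherence is most informative for noisy signals that share a common component. In the sketch below (all signal parameters are made up for illustration), both signals contain the same 50 Hz sine wave plus independent noise, so the coherence should be high near 50 Hz and low elsewhere. Note that Fs is passed by keyword and that plt.cohere returns the coherence values first and the frequencies second.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
fs = 1000
t = np.arange(0, 10, 1 / fs)

# A shared 50 Hz component plus independent noise in each signal
shared = np.sin(2 * np.pi * 50 * t)
x1 = shared + rng.normal(size=t.size)
x2 = shared + rng.normal(size=t.size)

plt.figure()
Cxy, freqs = plt.cohere(x1, x2, NFFT=256, Fs=fs)

# Frequency at which the two signals are most coherent
peak_freq = freqs[np.argmax(Cxy)]
print(peak_freq, Cxy.max())
# plt.show()  # uncomment to display the figure
```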

code example

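import numpy as np
# Generate two signals
fs = 1000
t = np.arange(0, 10, 1/fs)
x1 = np.sin(2*np.pi*50*t)
x2 = np.sin(2*np.pi*120*t)

# Compute and draw the coherence plot.
# Fs is passed by keyword because the third positional argument is NFFT;
# cohere returns the coherence values first, then the frequencies.
Cxy, f = plt.cohere(x1, x2, Fs=fs)
plt.xlabel('frequency [Hz]')
plt.ylabel('Coherence')
plt.title('Coherence plot')
plt.grid(True)
# plt.tight_layout()
# plt.savefig('cohere.png')

# Show the figure
plt.show()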

Origin blog.csdn.net/weixin_49588575/article/details/131046727