From Getting Started to Giving Up: Python Data Analysis Series - matplotlib

Before starting, set up your Python environment and check that the third-party library matplotlib is installed.
These are the blogger's study notes; please point out any shortcomings.

1. Introduction to matplotlib library

1.1 Introduction to matplotlib library

matplotlib is a third-party data visualization library that provides more than a hundred kinds of charts. It borrows from the plotting design of MATLAB and builds on it. Internally, matplotlib is made up of many visualization classes and has a complex structure, so in practice we call its basic plotting functions through the command-style sub-library matplotlib.pyplot.
The pyplot sub-library is conventionally imported as follows:

    import matplotlib.pyplot as plt

1.2 Small test

We first draw a simple image:

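    import matplotlib.pyplot as plt
    
    
    plt.plot([1,2,3,7,4,2])      # a single list is used as the y values; x defaults to 0, 1, 2, ...
    plt.ylabel("Grade")          # set the y-axis label
    plt.savefig("test",dpi=600)  # save the figure (PNG by default); dpi sets the resolution
    plt.show()                   # display the figure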

[Figure: line chart of the six values, with the y axis labelled "Grade"]

1.3 The drawing area of pyplot

pyplot uses plt.subplot(nrows,ncols,plot_number) to draw several subplots in the same figure. The parameters are as follows:

  • nrows: the number of rows in the grid
  • ncols: the number of columns in the grid
  • plot_number: the position of the current subplot, counted row by row starting from 1.
    For example, plt.subplot(2,2,3) divides the whole plotting area into 4 sub-areas and selects the third one (bottom left). When all three numbers are single digits, the commas can be dropped and the call written as subplot(223).
    [Figure: a 2x2 plotting area with the third sub-area selected]
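
As a minimal sketch of the numbering rule, the code below places two subplots in a 2x2 grid, once with separate arguments and once with the three-digit shorthand. The Agg backend line and the sample data are assumptions added only so the sketch runs without a display; they are not part of the original notes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

fig = plt.figure()

# Positions are numbered row by row, starting at 1 in the top-left corner.
ax_a = plt.subplot(2, 2, 3)      # 2 rows, 2 columns, third area (bottom left)
ax_a.plot([1, 2, 3], [1, 4, 9])  # illustrative data

ax_b = plt.subplot(222)          # shorthand for subplot(2, 2, 2): top right
ax_b.plot([1, 2, 3], [3, 2, 1])  # illustrative data
```

To look at the result, call plt.savefig() or plt.show() as in the earlier example.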

1.4 pyplot's plot() drawing function

plot() is the most important drawing function in matplotlib. Its parameters are as follows:
plt.plot(x,y,format_string,**kwargs)

  • x: the x-axis data, a list or an array; optional
  • y: the y-axis data, a list or an array; required
  • format_string: a string that controls the appearance of the curve; optional (explained below)
  • **kwargs: keyword properties of the curve, such as color, linestyle and marker. Further curves are drawn by passing additional x, y, format_string groups in the same call; in that case give x explicitly for each curve, because two consecutive sequences are read as one x, y pair.

The format_string parameter is explained in detail below.
format_string is a string that controls the format of the curve. Although it is optional, it is very useful. It is made up of color characters, style characters and marker characters, which can be combined.

(1) Color characters control the color of the curve. There are many of them, so only the commonly used ones are listed here:

  • b: blue
  • g: green
  • r: red
  • k: black
  • w: white
  • y: yellow
  • c: cyan
  • m: magenta
  • #000000: hex RGB strings also work, either through the color keyword or as a format string that contains nothing but the color

(2) Style characters control the line style of the curve:

  • "-": solid line
  • "--": dashed line
  • "-.": dash-dot line
  • ":": dotted line
  • "" or " ": no line

(3) Marker characters set the mark drawn at each data point:

  • ".": point mark
  • ",": pixel mark
  • "v": downward triangle mark
  • "^": upward triangle mark
  • ">": right-pointing triangle mark
  • "<": left-pointing triangle mark
  • "1": downward tri mark (three thin spokes)
  • "2": upward tri mark
  • "3": left-pointing tri mark
  • "4": right-pointing tri mark
  • "s": square mark
  • "p": pentagon mark
  • "*": star mark
  • "o": circle mark
  • "+": plus mark
  • "x": x mark
  • "D": diamond mark

Here is an example that combines the three kinds of characters:

    import matplotlib.pyplot as plt
    import numpy as np
    
    
    a = np.arange(20)
    
    # three curves in one call: red dashed line with circle marks, blue dotted line, yellow solid line
    plt.plot(a,a,"r--o",a,a*2,"b:",a,a*3,"y-")
    plt.savefig("01",dpi=600)
    plt.show()

[Figure: the three lines drawn by the code above]
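
The format string is a shorthand; the same settings can be spelled out with the color, linestyle and marker keywords mentioned under **kwargs. The sketch below is only an illustration under assumptions: the hex color value and the Agg backend line are choices made here so the example is concrete and runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np

a = np.arange(20)

# Shorthand: color, line style and marker packed into one format string.
(short,) = plt.plot(a, a, "r--o")

# The same appearance written out with keyword arguments.
(spelled,) = plt.plot(a, a + 5, color="r", linestyle="--", marker="o")

# A hex RGB color (illustrative value) passed through the color keyword.
(hexline,) = plt.plot(a, a * 2, color="#1f77b4", linestyle=":")
```

The keyword form is longer but easier to read when a curve has several properties, and it is the natural place for colors that have no one-letter code.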

1.5 Chinese display of pyplot

pyplot does not display Chinese correctly by default (without extra settings, Chinese text is rendered as hollow squares), so some configuration is needed.
There are two ways to change the font. Let's look at the first one:

(1) Change the global font through rcParams.
First, here is what happens when no Chinese font is set:

    import matplotlib.pyplot as plt
    import numpy as np
    
    
    a = np.arange(20)
    
    plt.plot(a,np.cos(0.02*a))
    plt.xlabel("这是中文")
    plt.savefig("03",dpi=600)
    plt.show()

[Figure: the cosine curve with the x-axis label rendered as hollow squares]

The label on the x axis is rendered as squares. Next we use matplotlib's rcParams to set the global font attributes:

    import matplotlib.pyplot as plt
    import numpy as np
    import matplotlib
    
    
    matplotlib.rcParams["font.family"] = "SimHei"   # set the font to SimHei (a Chinese sans-serif typeface)
    a = np.arange(20)
    
    plt.plot(a,np.cos(0.02*a))
    plt.xlabel("这是中文")
    plt.savefig("03",dpi=600)
    plt.show()

[Figure: the same curve with the Chinese x-axis label displayed correctly]

Chinese is now displayed correctly. Besides font.family, rcParams also has the following font attributes:

  • font.style: font style (normal or italic)
  • font.size: font size
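
These two attributes are set through rcParams in the same way as font.family. The sketch below is one possible way to try them out: it uses plt.rc_context(), which is a choice made here (not something the notes above use) so that the settings apply only inside the with-block and do not leak into later plots. The values 16 and "italic" and the Agg backend line are likewise illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

default_size = matplotlib.rcParams["font.size"]

# The settings below are active only inside the with-block.
with plt.rc_context({"font.size": 16, "font.style": "italic"}):
    inside_size = matplotlib.rcParams["font.size"]
    inside_style = matplotlib.rcParams["font.style"]
    plt.plot([0, 1, 2], [0, 1, 4])
    plt.xlabel("x")  # drawn with the larger italic font

# After the block the previous value is back in place.
restored_size = matplotlib.rcParams["font.size"]
```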

However, if you look closely, the x-axis label and the tick labels on both axes have all switched to SimHei. A global setting like this affects every piece of text in the figure, so it is often better to set the font only where Chinese is displayed. The second method does exactly that:

(2) Set the font individually through the fontproperties parameter:


    import matplotlib.pyplot as plt
    import numpy as np
    import matplotlib
    
    
    # matplotlib.rcParams["font.family"] = "SimHei"   # set the font to SimHei (disabled here)
    a = np.arange(20)
    
    plt.plot(a,np.cos(0.02*a))
    plt.xlabel("这是中文",fontproperties="SimHei",fontsize=20)
    plt.savefig("03",dpi=600)
    plt.show()

[Figure: the same curve with only the x-axis label set in SimHei]

With the fontproperties parameter we can set the font for exactly the text that needs it, which makes the drawing more flexible.

1.6 pyplot text display

Text display means adding content such as axis labels, legends, titles and text annotations to the image to make it more informative. The text functions also support LaTeX syntax for mathematical notation. Earlier we used plt.xlabel() to add a text label to the x axis, which is one of the text display functions. The full set is as follows:

  • plt.xlabel(): Add a text label to the x axis
  • plt.ylabel(): Add a text label to the y axis
  • plt.title(): Add a title to the image, centered at the top
  • plt.text(): Add text at any position
  • plt.annotate(): Add an annotation with an arrow to the image

For example:

    import matplotlib.pyplot as plt
    import numpy as np
    
    
    a = np.arange(0, 5, 0.02)
    plt.plot(a,np.cos(2*np.pi*a), "r--")
    
    plt.xlabel("时间",fontproperties="SimHei",fontsize=18,color="yellow")      # x-axis label: "time"
    plt.ylabel("振幅",fontproperties="SimHei",fontsize=18,color="green")       # y-axis label: "amplitude"
    plt.title(r"正弦波图 $y=cos(2\pi x)$",fontproperties="SimHei",fontsize=26)  # title: "sine wave plot"
    plt.text(2,1,r"$\mu=100$",fontsize=15)                                     # text at data coordinates (2, 1)
    
    plt.axis([-1,6,-2,2])  # axis limits: [xmin, xmax, ymin, ymax]
    plt.grid(True)         # show the grid
    plt.savefig("04",dpi=600)

[Figure: the cosine curve with axis labels, title, text and grid]

Now let's look at plt.annotate(). Its parameters are as follows:

  • text: the annotation text (this parameter was named s in older versions of matplotlib)
  • xy=arrow_crd: the point the arrow points to
  • xytext=text_crd: the position of the text
  • arrowprops=dict: the properties of the arrow

Let's modify the above example:

    import matplotlib.pyplot as plt
    import numpy as np
    
    
    a = np.arange(0, 5, 0.02)
    plt.plot(a,np.cos(2*np.pi*a), "r--")
    
    plt.xlabel("时间",fontproperties="SimHei",fontsize=18,color="yellow")
    plt.ylabel("振幅",fontproperties="SimHei",fontsize=18,color="green")
    plt.title(r"正弦波图 $y=cos(2\pi x)$",fontproperties="SimHei",fontsize=22)
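    # plt.text(2,1,r"$\mu=100$",fontsize=15)
    # text at (3, 1.5) with an arrow pointing to the data point (2, 1)
    plt.annotate(r"$\mu=100$",xy=(2,1),xytext=(3,1.5),arrowprops=dict(facecolor="black",shrink=0.1,width=2))
    plt.axis([-1,6,-2,2])  # axis limits: [xmin, xmax, ymin, ymax]
    plt.grid(True)         # show the grid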
    plt.savefig("04",dpi=600)
    plt.show()

[Figure: the same plot with an arrow annotation pointing at (2, 1)]

1.7 Sub-drawing area of pyplot

Earlier we used subplot() to lay out a regular grid of subplots. But what if the subplots should have irregular sizes? For this, pyplot provides plt.subplot2grid(), which builds more complex layouts: define a grid, pick a starting cell, and extend it over a number of rows and columns to form the area you want. The parameters are as follows:

  • shape: the grid structure, as (rows, columns)
  • loc: the (row, column) position of the starting cell, counted from 0
  • colspan: the number of columns the subplot spans, starting from loc
  • rowspan: the number of rows the subplot spans, starting from loc

For example:
plt.subplot2grid((4,4),(2,0),colspan=3,rowspan=2) divides the drawing area into a 4*4 grid, takes the cell in the third row and first column as the starting point, and extends it to cover 3 columns and 2 rows in total, that is, 2 more columns and 1 more row. The subplot occupies that whole block of cells:
[Figure: a 4x4 grid with the selected block highlighted]
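
The call discussed above can be tried out with the short sketch below, which also reads back which grid rows and columns the subplot ended up covering. The second, single-cell subplot and the Agg backend line are assumptions added for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

fig = plt.figure()

# 4x4 grid; start at row index 2, column index 0 (third row, first column),
# then cover 2 rows and 3 columns in total.
ax_big = plt.subplot2grid((4, 4), (2, 0), colspan=3, rowspan=2)

# A single-cell subplot in the top-right corner of the same grid (illustrative).
ax_small = plt.subplot2grid((4, 4), (0, 3))

spec = ax_big.get_subplotspec()
rows = list(spec.rowspan)  # grid rows covered by ax_big
cols = list(spec.colspan)  # grid columns covered by ax_big
print(rows, cols)          # prints: [2, 3] [0, 1, 2]
```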

Layouts written with subplot2grid() can become verbose. matplotlib therefore provides a simpler tool for designing complex subplot layouts: the GridSpec class in the gridspec module. To select the same block of cells as in the figure above as a subplot area, we can write:

    import matplotlib.pyplot as plt
    from matplotlib import gridspec
    
    gs = gridspec.GridSpec(4,4)    # a 4x4 grid of cells
    
    plt.subplot(gs[2:,0:-1])       # rows 2 and 3, columns 0 to 2
    plt.show()

This works much like slicing a two-dimensional array in numpy: in the two-dimensional grid gs, the first index 2: selects the rows from index 2 to the end, and the second index 0:-1 selects the columns from index 0 up to, but not including, the last one.
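
To make the analogy concrete, the sketch below applies the same slice to an ordinary 4x4 numpy array and to a 4x4 GridSpec. The array grid is a hypothetical stand-in for the grid cells, numbered 0 to 15 row by row.

```python
import numpy as np
from matplotlib import gridspec

# Stand-in for the grid: cells numbered 0..15, row by row.
grid = np.arange(16).reshape(4, 4)
selected = grid[2:, 0:-1]   # rows 2-3, columns 0-2
print(selected)
# [[ 8  9 10]
#  [12 13 14]]

# The same slice on a GridSpec selects the same rows and columns.
gs = gridspec.GridSpec(4, 4)
spec = gs[2:, 0:-1]
print(list(spec.rowspan), list(spec.colspan))   # prints: [2, 3] [0, 1, 2]
```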

1.8 About pyplot basic chart functions

pyplot has many basic chart functions. Some are used very often and some only rarely (rarely used does not mean unimportant). The rest of this article briefly introduces only a few common charts. To learn about the others, see the official matplotlib website. Here is a link: matplotlib official website

1.9 Drawing of pyplot pie chart

Pie charts are simple and intuitive, and are often used to show proportions. A pie chart is drawn with plt.pie():

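    import matplotlib.pyplot as plt
    
    
    labels = "a","b","c","d","e","f","g"  # label of each wedge
    sizes = [10,20,30,15,15,10,20]        # size of each wedge
    explode = (0,0.1,0,0.5,0,0.2,0)       # how far each wedge is pulled out from the center (0 = not at all)
    
    # autopct: format of the percentage labels; shadow: draw a shadow; startangle: angle at which the first wedge starts
    plt.pie(sizes,explode=explode,labels=labels,autopct="%1.1f%%",shadow=True,startangle=90)
    plt.axis("equal")  # equal axis scaling so the pie is drawn as a circle
    plt.show()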

[Figure: the resulting pie chart]

For more information, please see the official pie() documentation.
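
Note that the sizes in the example add up to 120, not 100; pie() shows each wedge as its share of the total. The sketch below is a small check of that, computing the shares by hand with the same "%1.1f%%" format and comparing them with the labels pie() produces. The variable names and the Agg backend line are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

sizes = [10, 20, 30, 15, 15, 10, 20]   # same data as above; the total is 120

# Share of each wedge, formatted the same way as autopct.
expected = ["%1.1f%%" % (100 * s / sum(sizes)) for s in sizes]

# With autopct set, pie() returns the wedges, the label texts and the percentage texts.
wedges, texts, autotexts = plt.pie(sizes, autopct="%1.1f%%")
shown = [t.get_text() for t in autotexts]

print(shown[0])   # prints: 8.3%  (10 out of 120)
```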

1.10 pyplot histogram drawing

Most readers will be familiar with histograms. To see how to draw one, let's start with a piece of code:

    import numpy as np
    import matplotlib.pyplot as plt
    
    
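    np.random.seed(0)                        # fix the seed so the result is reproducible
    mu = 100                                 # mean
    sigma = 200                              # standard deviation
    a = np.random.normal(mu,sigma,size=100)  # 100 samples from a normal distribution
    
    # 20 bins, drawn as a filled outline in semi-transparent blue
    plt.hist(a,20,histtype="stepfilled",facecolor="b",alpha=0.75)
    plt.title("直方图",fontproperties="SimHei")  # title: "histogram"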
    plt.show()

[Figure: the resulting histogram]

We use numpy to generate an array a of normally distributed values and draw its histogram with plt.hist(). The first argument a is the input data, and the second parameter, bins, sets the number of bars (20 here). The bins are formed by taking the smallest and largest values in the array and dividing that range into bins intervals of equal width; the number of elements that fall in each interval is the height of its bar.
For more information, please see the official hist() documentation.
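
The binning rule can be checked without drawing anything. The sketch below uses np.histogram() on the same data as a stand-in for the counting that plt.hist() performs; treating the two as equivalent here is an assumption made for illustration, and the variable names are hypothetical.

```python
import numpy as np

np.random.seed(0)
a = np.random.normal(100, 200, size=100)   # same data as in the example above

counts, edges = np.histogram(a, bins=20)

# 20 bins need 21 edges, running from the smallest to the largest value.
width = (a.max() - a.min()) / 20
print(len(counts), len(edges))             # prints: 20 21
print(edges[0] == a.min(), edges[-1] == a.max())
print(counts.sum())                        # every one of the 100 values lands in some bin
```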

1.11 Drawing of pyplot scatter plot

Scatter plots can be drawn with either plot() or scatter(), and scatter() is used in much the same way as plot(), so it is not covered further here. See the official scatter() documentation.
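
For completeness, here is a minimal sketch of both routes side by side. The random sample data, the marker size and color, and the Agg backend line are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)   # illustrative data
y = rng.normal(size=50)

plt.subplot(121)
(points,) = plt.plot(x, y, "o")            # marker only, so no connecting line is drawn

plt.subplot(122)
collection = plt.scatter(x, y, s=20, c="r")  # s: marker size, c: marker color
```

plot() with a marker-only format string is enough when all points look the same; scatter() additionally accepts a separate size or color for each point.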


Origin blog.csdn.net/qq_45807032/article/details/107506802