Visualization in Python: Matplotlib Basics Tutorial

1. Introducing the Matplotlib visualization library

1.1 Features of Matplotlib

Matplotlib is an open-source Python library for creating static, animated, and interactive visualizations. It has the following characteristics:

  • It can create 2D and 3D graphics, including line charts, bar charts, scatter plots, pie charts, histograms, and more;
  • Many properties of a plot can be customized, including its size, colors, line styles, labels, and annotations;
  • It supports a variety of GUI backends, including Tkinter, wxPython, Qt, and GTK.

1.2 Application fields of Matplotlib

Matplotlib is commonly used in data analysis, scientific research, engineering design, and teaching. It helps present data clearly and intuitively and makes patterns in the data easier to spot.

1.3 Matplotlib installation and plug-in management

To install Matplotlib you can use the pip command as follows:

pip install matplotlib

Several companion libraries build on or work alongside Matplotlib to improve the look, interactivity, or convenience of plots, including Seaborn, mpld3, and ggplot. These are separate packages rather than plug-ins in the strict sense, and they can also be installed with pip. For example, to install Seaborn, run the following command:

pip install seaborn

2. Basic use of Matplotlib visualization library

2.1 Graphics infrastructure

2.1.1 Figure object

Matplotlib's plotting infrastructure is built on Figure objects and Axes objects. A Figure represents the whole window or page and can contain multiple subplots (Axes objects). A convenient way to create a Figure together with an Axes is to call the subplots() function:

import matplotlib.pyplot as plt

fig, ax = plt.subplots()

2.1.2 Axes object

An Axes object represents a single subplot and contains elements such as the x- and y-axes, the plotted data, and text. To create several Axes at once, pass the number of rows and columns to subplots(), which returns an array of Axes that can be indexed by row and column:

import matplotlib.pyplot as plt

fig, axs = plt.subplots(2, 3)  # Create a 2x3 grid of subplots
ax1 = axs[0, 0]  # Row 0, column 0 (top left)
ax2 = axs[0, 1]  # Row 0, column 1
ax3 = axs[1, 0]  # Row 1, column 0 (bottom left)
ax4 = axs[1, 1]  # Row 1, column 1
ax5 = axs[1, 2]  # Row 1, column 2 (bottom right)
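
When every subplot needs the same treatment, indexing each Axes by hand becomes repetitive. A minimal sketch, assuming the same 2x3 grid as above, loops over the array with its flat iterator; the panel titles are made up for illustration.

```python
import matplotlib.pyplot as plt

fig, axs = plt.subplots(2, 3)  # axs is a 2x3 NumPy array of Axes

# axs.flat walks the grid row by row: (0, 0), (0, 1), (0, 2), (1, 0), ...
for number, ax in enumerate(axs.flat, start=1):
    ax.set_title(f'Panel {number}')

print(axs.shape)      # grid shape as (rows, columns)
print(len(fig.axes))  # number of Axes attached to the Figure
```

Call plt.show() afterwards to display the grid.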

2.2 Draw curves and scatter plots

2.2.1 Point markers and line styles

Different markers and line styles can be combined in Matplotlib when drawing curves and scatter-style plots, as follows:

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
y = np.sin(x)

fig, ax = plt.subplots()
ax.plot(x, y, 'ro--', label='sin(x)')
ax.legend()

The above code uses NumPy to generate a set of x and y values, and then uses plot() to draw the curve. The format string 'ro--' requests red color, circle markers, and a dashed line, and the label argument sets the legend entry. The legend() method then displays the legend.
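
Format strings are compact but easy to misread. The sketch below, which is an illustration rather than part of the original example, draws one line with the format string and one with the equivalent keyword arguments, then checks that both end up with the same style.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import to_rgba

x = np.linspace(0, 10, 100)
fig, ax = plt.subplots()

# Compact format string: r = red, o = circle markers, -- = dashed line
(line_fmt,) = ax.plot(x, np.sin(x), 'ro--', label='format string')

# The same style written out with keyword arguments
(line_kw,) = ax.plot(x, np.cos(x), color='red', marker='o',
                     linestyle='--', label='keywords')
ax.legend()

# Compare the resolved color, marker, and line style of both lines
same_style = (
    to_rgba(line_fmt.get_color()) == to_rgba(line_kw.get_color())
    and line_fmt.get_marker() == line_kw.get_marker()
    and line_fmt.get_linestyle() == line_kw.get_linestyle()
)
print(same_style)
```

The keyword form is longer but tends to be easier to maintain once more properties are involved.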

2.2.2 Axes and Tick Marks

In Matplotlib, the xlabel(), ylabel(), and title() functions set the axis labels and the plot title, and xticks() and yticks() set the tick positions and labels. When working with an Axes object, the equivalent methods are set_xlabel(), set_ylabel(), set_title(), set_xticks(), and set_yticks(). For example:

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = np.cos(x)

fig, ax = plt.subplots()
ax.plot(x, y1, 'r-', label='sin(x)')
ax.plot(x, y2, 'b-', label='cos(x)')

ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_title('Trig Functions')
ax.legend()

In the above code, the values of x, y1, and y2 are generated first, and plot() draws two curves in different colors. set_xlabel(), set_ylabel(), and set_title() then set the axis labels and the plot title, and legend() displays the legend.

The above covers the basic use of Matplotlib. These functions and methods are the building blocks for the 2D and 3D graphics in the following sections.
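
The example above does not set any ticks, so here is a minimal sketch of the tick methods on an Axes object. The tick positions and labels are chosen for illustration only.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 200)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x))

# Put ticks at multiples of pi/2 and give them readable labels
tick_positions = [0, np.pi / 2, np.pi, 3 * np.pi / 2, 2 * np.pi]
tick_labels = ['0', 'π/2', 'π', '3π/2', '2π']
ax.set_xticks(tick_positions)
ax.set_xticklabels(tick_labels)

# On the y axis only the positions are set; default labels are used
ax.set_yticks([-1, 0, 1])
```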

3. Advanced use of Matplotlib visualization library

In this part, we will introduce the advanced usage of Matplotlib visualization library, including how to draw histograms, bar charts, pie charts, radar charts and 3D graphics, and explain the meaning of various available parameters.

3.1 Drawing histograms and bar charts

3.1.1 Horizontal histogram and bar chart

Vertical histograms and bar charts are easy to draw with the hist() and bar() functions. For horizontal versions, pass orientation='horizontal' to hist() for a histogram, or use barh() for a bar chart.

import matplotlib.pyplot as plt
import numpy as np

# Generate a set of random data
x = np.random.randn(1000)

# Create a figure and an axes
fig, ax = plt.subplots()

# Draw the horizontal histogram
ax.hist(x, bins=20, orientation='horizontal', color='b')

# Show the figure
plt.show()

In the above code, NumPy's random number generator first produces 1000 normally distributed values. We then call subplots() to create a figure and axes, and call hist() with the bins and orientation='horizontal' arguments to draw the horizontal histogram; the color argument sets the color of the bars.
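
The text mentions barh() for horizontal bar charts, which the histogram example does not show. A minimal sketch, using made-up category names and counts:

```python
import matplotlib.pyplot as plt

categories = ['A', 'B', 'C']  # made-up category names
counts = [12, 7, 19]          # made-up values

fig, ax = plt.subplots()
# barh() takes the categories on the y axis and the bar lengths as widths
bars = ax.barh(categories, counts, color='b')
ax.set_xlabel('Count')
```

Each bar's length is stored as its width, which is why the values appear along the x axis.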

3.1.2 Stacked histogram and bar chart

The hist() and bar() functions in Matplotlib draw simple histograms and bar charts. When you need to show the distribution of several data sets together, stacked histograms and bar charts make them easier to compare. Setting the alpha parameter controls the transparency of the bars, which can make the chart easier to read.

import matplotlib.pyplot as plt
import numpy as np

# Generate two sets of random data
x = np.random.randn(1000)
y = np.random.randn(1000)

# Create a subplot
fig, ax = plt.subplots()

# Draw the stacked histogram
ax.hist([x, y], bins=20, stacked=True, alpha=0.5, label=['x', 'y'])

# Show the legend
ax.legend()

# Show the figure
plt.show()

In the above code, two sets of 1000 random numbers are generated and subplots() creates a subplot. hist() is then called with the bins, stacked=True, alpha, and label arguments to draw the stacked histogram, and legend() displays the legend.
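
For bar charts there is no stacked argument; a common approach is to pass the heights of the lower bars as the bottom of the upper ones. A minimal sketch with made-up group names and values:

```python
import matplotlib.pyplot as plt
import numpy as np

groups = ['G1', 'G2', 'G3']    # made-up group names
first = np.array([3, 5, 2])    # made-up values for the lower series
second = np.array([4, 1, 6])   # made-up values for the upper series

fig, ax = plt.subplots()
lower = ax.bar(groups, first, alpha=0.5, label='first')
# bottom=first makes each upper bar start where the lower bar ends
upper = ax.bar(groups, second, bottom=first, alpha=0.5, label='second')
ax.legend()
```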

3.2 Drawing pie charts and radar charts

3.2.1 Drawing method and parameters of pie chart

Pie charts are drawn with the pie() function. pie() accepts an array giving the size of each wedge, and the explode parameter offsets one or more wedges from the center to highlight them.

import matplotlib.pyplot as plt
import numpy as np

# Sizes of the wedges
sizes = np.array([50, 25, 15, 10])

# Labels of the wedges
labels = ['A', 'B', 'C', 'D']

# Offset the second wedge to highlight it
explode = [0, 0.1, 0, 0]

# Draw the pie chart
plt.pie(sizes, labels=labels, explode=explode, autopct='%1.1f%%', shadow=True, startangle=90)

# Show the figure
plt.show()

In the above code, an array of four sizes and a matching list of labels are created, and the explode list highlights wedge B. The pie chart is then drawn by calling pie() with the labels, explode, autopct, shadow, and startangle arguments.

3.2.2 Drawing method and parameters of radar chart

In Matplotlib, a radar chart can be drawn on polar axes, which are created by passing projection='polar' when the subplot is created. The fill() and plot() methods then fill the polygon and draw its outline.

import matplotlib.pyplot as plt
import numpy as np

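# Values to plot, one per category
values = [3, 2, 5, 4, 1]

# Category labels
labels = ['A', 'B', 'C', 'D', 'E']

# Compute the angle of each category
angles = np.linspace(0, 2 * np.pi, len(values), endpoint=False)

# Close the polygon by repeating the first angle and the first value
angles = np.concatenate((angles, [angles[0]]))
values = np.concatenate((values, [values[0]]))

# Create a subplot with polar axes
fig, ax = plt.subplots(nrows=1, ncols=1, subplot_kw=dict(projection='polar'))

# Fill the polygon
ax.fill(angles, values, 'blue', alpha=0.1)

# Draw the outline
ax.plot(angles, values, 'blue', linewidth=2)

# Place one angular tick per category and label it
ax.set_xticks(angles[:-1])
ax.set_xticklabels(labels)

# Set the radial range
ax.set_ylim(0, 6)

# Show the figure
plt.show()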

In the above code, an array of five values and a list of labels are created first. linspace() divides the full circle into five equal parts of 72° each in the polar coordinate system, and concatenate() repeats the first point at the end so that the outline closes. fill() and plot() then draw the filled polygon and its outline with the given color and transparency. Finally, set_xticks() and set_xticklabels() place and label the angular ticks, and set_ylim() sets the radial range.
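
The angle arithmetic can be checked without drawing anything. The sketch below uses only the standard library to reproduce the 72° spacing and the closing step; it mirrors the idea of the example rather than its exact NumPy calls.

```python
import math

values = [3, 2, 5, 4, 1]
n = len(values)

# Evenly spaced angles that stop before 2*pi (the effect of endpoint=False)
angles = [2 * math.pi * i / n for i in range(n)]
degrees = [round(math.degrees(a)) for a in angles]
print(degrees)  # → [0, 72, 144, 216, 288]

# Repeat the first point so the outline returns to where it started
closed_angles = angles + angles[:1]
closed_values = values + values[:1]
print(len(closed_angles), len(closed_values))  # → 6 6
```

Both lists must be closed together; the plotting methods need the same number of angles and values.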

3.3 Drawing 3D graphics

3.3.1 Simple 3D graphics

In Matplotlib, 3D graphics are drawn with the Axes3D class from the mpl_toolkits.mplot3d module. The scatter() method draws a scatter plot and plot_surface() draws a surface. Let's look at a simple example that uses scatter() to draw a random 3D scatter plot.

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np

# Generate random data
n = 100
x = np.random.rand(n)
y = np.random.rand(n)
z = np.random.rand(n)

# Create a 3D subplot
fig = plt.figure()
ax = fig.add_subplot(projection='3d')

# Draw the 3D scatter plot
ax.scatter(x, y, z, c='r', marker='o')

# Show the figure
plt.show()

In the above code, random.rand() generates three sets of 100 random numbers. add_subplot() then creates a 3D subplot with the projection='3d' argument, and scatter() draws the 3D scatter plot, with the c and marker arguments setting the color and marker shape. Finally, show() displays the figure.

3.3.2 Roaming, interaction and customization

3D plots in Matplotlib can be rotated, zoomed, and customized to make them easier to read. The following example uses plot_surface() to draw a polynomial surface.

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np

# Create a 3D subplot
fig = plt.figure()
ax = fig.add_subplot(projection='3d')

# Define the grid of x and y values
x = np.outer(np.linspace(-2, 2, 100), np.ones(100))
y = x.copy().T

# Compute z on the grid
z = x**2 + y**3

# Draw the 3D polynomial surface and keep the returned object for the colorbar
surf = ax.plot_surface(x, y, z, cmap='coolwarm', edgecolor='none')

# Set the axis labels
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')

# Add a colorbar for the surface
fig.colorbar(surf)

# Show the figure
plt.show()

In the above code, add_subplot() first creates a 3D subplot. The grids of x and y values are built with outer() and copy(), and z is computed from them. plot_surface() then draws the polynomial surface with the given colormap and edge color. Finally, set_xlabel(), set_ylabel(), and set_zlabel() add the axis labels, and colorbar() adds the colorbar.
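
Two customizations come up often with surfaces. The sketch below, offered as an illustration rather than the tutorial's own method, builds the same kind of grid with np.meshgrid() and sets the camera angle in code with view_init(); the grid size and angles are arbitrary choices.

```python
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401  (registers '3d' on older versions)

# indexing='ij' gives the same layout as the np.outer() construction
grid = np.linspace(-2, 2, 50)
x, y = np.meshgrid(grid, grid, indexing='ij')
z = x**2 + y**3

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.plot_surface(x, y, z, cmap='coolwarm', edgecolor='none')

# Set the camera: 30 degrees above the x-y plane, rotated 45 degrees around z
ax.view_init(elev=30, azim=45)
```

In an interactive window the same view can be changed by dragging with the mouse.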

4. Advanced application and optimization skills

In this part, we will introduce the advanced application of Matplotlib and some optimization techniques, aiming to help you use Matplotlib more flexibly and efficiently.

4.1 Advanced applications of Matplotlib

4.1.1 Draw the exact position and size of the figure

In Matplotlib, the Figure object and the Axes (subplot) object together control the exact position and size of a plot.

  • The Figure object controls the overall size of the graphic, in inches, through the figsize parameter.
  • An Axes object can be positioned and sized with the set_position() method. It takes a rectangle [left, bottom, width, height], where each value is a fraction of the figure's width or height rather than a pixel count.

import matplotlib.pyplot as plt

# Create a Figure object
fig = plt.figure(figsize=(6, 4))

# Create an Axes object
ax = fig.add_subplot(111)
ax.plot([1, 2, 3], [4, 5, 6])

# Set the position and size of the Axes
ax.set_position([0.1, 0.1, 0.8, 0.8])

# Show the figure
plt.show()

In the above code, figure() first creates a Figure object with figsize set to (6, 4), meaning the graphic is 6 inches wide and 4 inches high. Next, add_subplot() creates an Axes object, and set_position() sets its position and size. The four values passed to set_position() are the x coordinate of the lower-left corner, the y coordinate of the lower-left corner, the width, and the height, all as fractions of the figure size. Finally, show() displays the figure.
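
To see what those fractions mean in physical units, the following sketch converts a position rectangle into inches for a given figsize. The helper name axes_rect_in_inches is hypothetical and not part of Matplotlib.

```python
def axes_rect_in_inches(figsize, position):
    """Convert [left, bottom, width, height] fractions into inches."""
    fig_width, fig_height = figsize
    left, bottom, width, height = position
    # Horizontal values scale with the figure width, vertical with its height
    return (
        round(left * fig_width, 6),
        round(bottom * fig_height, 6),
        round(width * fig_width, 6),
        round(height * fig_height, 6),
    )


print(axes_rect_in_inches((6, 4), [0.1, 0.1, 0.8, 0.8]))  # → (0.6, 0.4, 4.8, 3.2)
```

So the axes in the example occupy 4.8 by 3.2 inches of the 6 by 4 inch figure.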

4.1.2 Processing and plotting big data

When processing and plotting large amounts of data, Matplotlib can become very slow or unresponsive. The following techniques can help:

  1. Process data in chunks. Process and plot the data in small chunks instead of loading the entire dataset into memory.
  2. Reduce the number of data points. Use a function such as sample() to reduce the size of the dataset, or use a density plot (for example a 2D histogram) instead of a scatter plot.
  3. Use parallel computing. Use multiple threads or a process pool to speed up the data processing.

import numpy as np
import matplotlib.pyplot as plt
from multiprocessing import Pool


def compute(chunk):
    # Illustrative placeholder for an expensive per-chunk computation
    return chunk.mean()


if __name__ == '__main__':
    # Create a large dataset
    n = 1000000
    x = np.random.normal(size=n)
    y = np.random.normal(size=n)

    # 1. Process the data in chunks
    k = 1000
    fig1, ax1 = plt.subplots()
    for start in range(0, n, k):
        ax1.scatter(x[start:start + k], y[start:start + k], s=1)

    # 2. Reduce the number of drawn elements with a 2D histogram
    fig2, ax2 = plt.subplots()
    ax2.hist2d(x, y, bins=100, cmap=plt.cm.jet)

    # 3. Use parallel computing (data is an illustrative list of chunks)
    data = np.array_split(x, 8)
    with Pool() as p:
        result = p.map(compute, data)

In the above code, numpy.random.normal() generates two arrays, x and y, of one million random numbers each. The first method splits the data into chunks of 1000 points and draws the scatter plot chunk by chunk with scatter(); the second method uses hist2d() to draw a 2D histogram instead of individual points; the third method uses a pool of worker processes to process the dataset in parallel and speed up the program.
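
Reducing the number of points can also be done by random subsampling before plotting. A minimal sketch, assuming NumPy's Generator API and an arbitrary target of 10,000 points:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.normal(size=n)
y = rng.normal(size=n)

# Pick 10,000 distinct indices at random and keep only those points
keep = rng.choice(n, size=10_000, replace=False)
x_small, y_small = x[keep], y[keep]

print(x_small.shape)  # → (10000,)
```

The smaller arrays can then be passed to scatter() in a single call. Using the same index array for x and y keeps each point's coordinates paired.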

4.2 Matplotlib optimization techniques

4.2.1 Use subplots() to arrange the layout of plots

In Matplotlib, the subplots() function creates multiple subplots and arranges them in a grid.

import numpy as np
import matplotlib.pyplot as plt

# Create a Figure of size (6, 8) and a 2x2 grid of Axes
fig, axs = plt.subplots(2, 2, figsize=(6, 8))

# Subplot 1
x1 = np.arange(0, 10, 0.1)
y1 = np.sin(x1)
axs[0, 0].plot(x1, y1)
axs[0, 0].set_title('Subplot 1')

# Subplot 2
x2 = np.arange(0, 10, 0.1)
y2 = np.cos(x2)
axs[0, 1].plot(x2, y2)
axs[0, 1].set_title('Subplot 2')

# Subplot 3
x3 = np.random.normal(size=1000)
y3 = np.random.normal(size=1000)
axs[1, 0].scatter(x3, y3, s=1)
axs[1, 0].set_title('Subplot 3')

# Subplot 4
x4 = np.random.gamma(shape=2, size=1000)
axs[1, 1].hist(x4, bins=50)
axs[1, 1].set_title('Subplot 4')

# Adjust the spacing between the subplots
plt.subplots_adjust(hspace=0.3)

# Show the figure
plt.show()

The above code first uses subplots() to create a Figure of size (6, 8) and a 2x2 grid of four Axes. Next, plot(), scatter(), and hist() draw different types of plots, and set_title() adds a title to each subplot. Finally, subplots_adjust() adjusts the spacing between the subplots and show() displays the figure.
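
Instead of choosing spacing values by hand, the margins can be computed automatically. The sketch below is one possible variation: it shares the x axis between two stacked subplots and calls tight_layout() in place of subplots_adjust().

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, 0.1)

# sharex=True links the x limits of the two subplots
fig, axs = plt.subplots(2, 1, figsize=(6, 6), sharex=True)
axs[0].plot(x, np.sin(x))
axs[0].set_title('sin(x)')
axs[1].plot(x, np.cos(x))
axs[1].set_title('cos(x)')
axs[1].set_xlabel('x')

# Let Matplotlib compute margins so titles and labels do not overlap
fig.tight_layout()
```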

4.2.2 Using color bars to display data

In Matplotlib, a colorbar can be used to show the magnitude or trend of the values in a dataset.

import numpy as np
import matplotlib.pyplot as plt

# Create a Figure of size (6, 4) with one Axes
fig, ax = plt.subplots(figsize=(6, 4))

# Generate a dataset of 1000 two-dimensional points
x, y = np.random.normal(size=(2, 1000))

# Draw the scatter plot and add a colorbar for the color values
sc = ax.scatter(x, y, s=20, c=x, cmap=plt.cm.jet)
fig.colorbar(sc)

# Add the title and axis labels
ax.set_title('Scatter Plot with Colorbar')
ax.set_xlabel('X')
ax.set_ylabel('Y')

# Show the figure
plt.show()

In the above code, subplots() creates a Figure of size (6, 4) and one Axes. Next, numpy.random.normal() generates 1000 two-dimensional points with coordinates x and y, scatter() draws the scatter plot, and the c parameter specifies the values that are mapped to colors. Then colorbar() adds the colorbar, and set_title(), set_xlabel(), and set_ylabel() add the title and axis labels. Finally, show() displays the figure.
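
A colorbar is easier to read when it has a label and a fixed value range. A minimal sketch, where the colormap, the range of -3 to 3, and the label text are arbitrary choices for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
x, y = rng.normal(size=(2, 200))

fig, ax = plt.subplots()
# vmin and vmax pin the color scale so it does not depend on the sample
sc = ax.scatter(x, y, c=x, cmap='viridis', vmin=-3, vmax=3)

# Attach the colorbar to this Axes and describe what the colors mean
cbar = fig.colorbar(sc, ax=ax)
cbar.set_label('x value')
```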

4.2.3 Use grid lines to better display data

Grid lines can be used in Matplotlib to better show the data distribution and trends in the dataset.

import numpy as np
import matplotlib.pyplot as plt

# Create a Figure of size (6, 4) with one Axes
fig, ax = plt.subplots(figsize=(6, 4))

# Generate 1000 normally distributed values
data = np.random.normal(size=1000)

# Draw the histogram and turn on grid lines
ax.hist(data, bins=50, alpha=0.5, edgecolor='black', linewidth=1.2)
ax.grid(True)

# Add the title and axis labels
ax.set_title('Histogram with Grid Lines')
ax.set_xlabel('Value')
ax.set_ylabel('Frequency')

# Show the figure
plt.show()

In the above code, subplots() creates a Figure of size (6, 4) and one Axes. Next, a dataset named data with 1000 normally distributed values is generated, and hist() draws the histogram, with the alpha parameter setting the transparency and the edgecolor and linewidth parameters setting the border color and width. Then grid() adds the grid lines, and set_title(), set_xlabel(), and set_ylabel() add the title and axis labels. Finally, show() displays the figure.
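
Grid lines can be restricted to one axis and styled so they stay in the background. The following sketch shows one possible configuration; the line style and transparency values are arbitrary.

```python
import matplotlib.pyplot as plt
import numpy as np

data = np.random.default_rng(2).normal(size=1000)

fig, ax = plt.subplots()
ax.hist(data, bins=30, edgecolor='black')

# Only horizontal grid lines, dashed and faint
ax.grid(True, axis='y', linestyle='--', alpha=0.5)

# Draw the grid lines underneath the bars instead of on top of them
ax.set_axisbelow(True)
```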

5. Practical case studies

In this section, we will learn how to use Python's visualization library Matplotlib to create weather forecast charts and data analysis charts.

5.1 Making a Weather Forecast Map

5.1.1 Import weather data

Weather forecast charts are usually based on historical weather data. Here we can use the pandas read_csv() function to import a weather dataset in CSV format.

import pandas as pd

# Read the weather dataset
weather_data = pd.read_csv('weather_data.csv')

# Print the first five rows of the dataset
print(weather_data.head())

The above code reads a weather dataset named weather_data.csv with read_csv() and uses head() to show the first five rows of the dataset.
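
The file weather_data.csv is not included here, so the sketch below builds a small stand-in from an in-memory string. The rows are made up; the column names match the ones used in the plotting code of the next section, and parse_dates converts the date column while reading.

```python
import io

import pandas as pd

# A few made-up rows standing in for weather_data.csv
csv_text = """date,high_temp,low_temp
2023-05-01,24,13
2023-05-02,26,15
2023-05-03,22,12
"""

# parse_dates turns the date strings into datetime values
weather_data = pd.read_csv(io.StringIO(csv_text), parse_dates=['date'])
print(weather_data.dtypes)
```

With real dates in the column, Matplotlib can place and format the ticks on a time axis instead of treating each date as a text category.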

5.1.2 Drawing weather forecast maps

Next, the plot() and fill_between() functions can be used to draw a simple weather forecast chart.

import matplotlib.pyplot as plt

# Set the figure size and title
plt.figure(figsize=(10, 6))
plt.title('Weather Forecast')

# Plot the high and low temperature curves
plt.plot(weather_data['date'], weather_data['high_temp'], color='red', label='High Temp')
plt.plot(weather_data['date'], weather_data['low_temp'], color='blue', label='Low Temp')

# Fill the area between the high and low temperature curves
plt.fill_between(weather_data['date'], weather_data['high_temp'], weather_data['low_temp'], color='grey', alpha=0.2)

# Set the x-axis and y-axis labels
plt.xlabel('Date')
plt.ylabel('Temperature')

# Show the legend
plt.legend()

# Show the figure
plt.show()

In the above code, figure() sets the size of the figure and title() sets its title. Next, plot() draws the curves for the highest and lowest temperatures, and fill_between() fills the area between the two curves with gray. Then xlabel() and ylabel() set the labels of the horizontal and vertical axes, and legend() displays the legend. Finally, show() displays the figure.

5.2 Making data analysis charts

5.2.1 Import the analysis data

Data analysis charts are usually built from a dataset that holds the analysis results. Here we can use the pandas read_excel() function to import a dataset in Excel format.

import pandas as pd

# Read the analysis dataset
analysis_data = pd.read_excel('analysis_data.xlsx')

# Print the first five rows of the dataset
print(analysis_data.head())

The above code uses read_excel() to read a dataset named analysis_data.xlsx, and uses head() to show the first five rows of the dataset.

5.2.2 Drawing bar charts, line charts and pie charts

Next, the bar(), plot(), and pie() functions in Matplotlib can be used to draw bar charts, line charts, and pie charts.

import matplotlib.pyplot as plt

# Set the figure size and overall title
plt.figure(figsize=(10, 6))
plt.suptitle('Data Analysis')

# Draw the bar chart in the left half
plt.subplot(1, 2, 1)
plt.bar(analysis_data['name'], analysis_data['value'])
plt.xticks(rotation=45)

# Draw the line chart in the top-right quarter
plt.subplot(2, 2, 2)
plt.plot(analysis_data['name'], analysis_data['value'], marker='o')

# Draw the pie chart in the bottom-right quarter
plt.subplot(2, 2, 4)
plt.pie(analysis_data['value'], labels=analysis_data['name'], autopct='%1.1f%%', shadow=True, startangle=90)

# Show the figure
plt.show()

The above code uses figure() to set the size of the figure and suptitle() to set its overall title. subplot() then selects the position for the bar chart, the line chart, and the pie chart in turn. The bar chart is drawn with bar(), and xticks() rotates the labels on the horizontal axis. The line chart is drawn with plot(), with the marker parameter setting the style of the data points. The pie chart is drawn with pie(), where the labels parameter sets the wedge labels, autopct sets the percentage format, shadow adds a shadow, and startangle sets the starting angle. Finally, show() displays the figure.

6. Practical application scenarios

Python visualization is useful not only for general-purpose data display but also in specific industries and fields such as finance and the life sciences.

6.1 Application of Python visualization in the financial industry

6.1.1 Draw candlestick chart

A candlestick chart (also called a K-line chart) is commonly used in the financial industry to show the price trend of financial products such as stocks and futures. In Python, candlestick charts can be drawn with the mpl_finance package, which was split out of Matplotlib and has since been deprecated in favor of mplfinance.

import matplotlib.pyplot as plt
from mpl_finance import candlestick_ohlc
import pandas as pd
import matplotlib.dates as mpl_dates

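# Read the stock data
data = pd.read_csv('stock_data.csv')

# Convert the dates from strings to datetimes, then to Matplotlib date numbers
data['date'] = pd.to_datetime(data['date'])
data['date'] = data['date'].apply(mpl_dates.date2num)

# Draw the candlestick chart
# candlestick_ohlc reads each row as (date, open, high, low, close), so the
# CSV columns are assumed to be in that order
fig, ax = plt.subplots()
candlestick_ohlc(ax, data.values, width=0.6, colorup='green', colordown='red', alpha=0.8)
ax.xaxis_date()  # show the x values as dates rather than raw numbers
plt.xlabel('Date')
plt.ylabel('Price')
plt.title('Stock Price Trend')
plt.xticks(rotation=45)
plt.show()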

In the above code, the Pandas library first reads a stock dataset named stock_data.csv and converts its date column from strings to dates. The candlestick chart is then drawn with the candlestick_ohlc function. Its first argument is the Axes to draw on and its second is the data; the width, colorup, colordown, and alpha arguments set the width of each candlestick, the color for rising prices, the color for falling prices, and the transparency. After that, xlabel() and ylabel() set the labels of the horizontal and vertical axes, title() sets the chart title, xticks() rotates the date labels, and show() displays the figure.
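
Because candlestick_ohlc is imported from a separate package, it can help to see what a candlestick consists of. The sketch below is not that function's implementation; it draws the same kind of chart with plain Matplotlib calls, using vlines() for the high-low wick and bar() for the open-close body. The OHLC rows are made up.

```python
import matplotlib.pyplot as plt

# Made-up rows: (day index, open, high, low, close)
ohlc = [
    (0, 10.0, 11.5, 9.5, 11.0),
    (1, 11.0, 11.2, 10.1, 10.4),
    (2, 10.4, 12.0, 10.3, 11.8),
]

fig, ax = plt.subplots()
candle_colors = []
for day, open_, high, low, close in ohlc:
    # Green when the price closed at or above the open, red otherwise
    color = 'green' if close >= open_ else 'red'
    candle_colors.append(color)
    # Wick: thin vertical line from the low to the high
    ax.vlines(day, low, high, color=color, linewidth=1)
    # Body: bar spanning the open and close prices
    ax.bar(day, abs(close - open_), bottom=min(open_, close),
           width=0.6, color=color, alpha=0.8)

ax.set_xlabel('Day')
ax.set_ylabel('Price')
print(candle_colors)  # → ['green', 'red', 'green']
```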

6.1.2 Plotting time series graphs

Time series charts are another chart type commonly used in finance to show trends in financial data over time. In Python we can use the Pandas library and Matplotlib library to draw time series graphs.

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Read the stock data
data = pd.read_csv('stock_data.csv')

# Convert the dates from strings to datetimes
data['date'] = pd.to_datetime(data['date'])

# Use the date column as the index
data.set_index('date', inplace=True)

# Draw the time series chart
fig, ax = plt.subplots()
ax.plot(data.index, data['price'], label='Price')
ax.plot(data.index, data['volume'], label='Volume')
plt.xlabel('Date')
plt.ylabel('Value')
plt.title('Stock Data')
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))
plt.legend()
plt.show()

In the above code, the Pandas library first reads a stock dataset named stock_data.csv and converts its date column from strings to dates. The date column is then used as the index. Next, plot() draws the stock price and the trading volume over time. xlabel() and ylabel() set the labels of the horizontal and vertical axes, title() sets the chart title, and the set_major_locator() and set_major_formatter() methods of the xaxis object set the position and format of the ticks on the horizontal axis. Finally, legend() displays the legend and show() displays the figure.
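
Price and volume usually differ by several orders of magnitude, so drawing them on one y axis can flatten the price curve. One common remedy, sketched here with made-up data rather than stock_data.csv, is a second y axis created with twinx().

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up daily data for January 2023
dates = np.arange('2023-01-01', '2023-02-01', dtype='datetime64[D]')
price = np.linspace(100, 110, len(dates))
volume = np.linspace(1_000_000, 2_000_000, len(dates))

fig, ax_price = plt.subplots()
ax_price.plot(dates, price, color='tab:blue')
ax_price.set_xlabel('Date')
ax_price.set_ylabel('Price')

# twinx() adds a second Axes that shares the x axis but has its own y axis
ax_volume = ax_price.twinx()
ax_volume.plot(dates, volume, color='tab:orange')
ax_volume.set_ylabel('Volume')
```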

6.2 Application of Python visualization in life sciences

6.2.1 Drawing bioinformatics graphs

Python is also widely used in the life sciences and can draw various bioinformatics graphics, such as gene structure diagrams, sequence logos, and gene expression profiles. Here we take the gene structure diagram as an example and show how to draw one with the Biopython and Matplotlib libraries.

from Bio import SeqIO
import matplotlib.pyplot as plt

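# Read the gene record
# Note: a plain FASTA record carries no feature annotations, so gene.features
# is only populated for annotated formats such as GenBank
gene = SeqIO.read('gene.fasta', 'fasta')

# Draw the gene structure
fig, ax = plt.subplots(figsize=(10, 5))
for index, feature in enumerate(gene.features):
    # Only draw the CDS regions
    if feature.type == 'CDS':
        # Draw a rectangle for the CDS
        rect = plt.Rectangle((feature.location.start, -index - 1), len(feature), 1, alpha=0.5, color='blue')
        ax.add_patch(rect)

# Rescale the view so that the rectangles are visible
ax.autoscale_view()

# Set the axis labels and title
plt.xlabel('Position')
plt.ylabel('Feature')
plt.title('Gene Structure')

# Hide the y-axis ticks
plt.yticks([])

# Show the figure
plt.show()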

In the above code, SeqIO.read() first reads a gene sequence file named gene.fasta, and enumerate() iterates over the features attribute of the resulting record to draw the gene structure. Since we only care about the CDS (coding sequence) regions, a rectangle is drawn with Rectangle() only for features whose feature.type is CDS. Note that features is only populated when the file format carries annotations (for example GenBank); a plain FASTA record has none. Next, xlabel() and ylabel() set the labels of the horizontal and vertical axes, and title() sets the chart title. Finally, yticks() hides the ticks of the y axis and show() displays the figure.
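
The drawing step can be tried on its own. The sketch below uses hypothetical CDS intervals instead of a parsed record, so it runs without Biopython or an input file; the coordinates and axis limits are made up.

```python
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

# Hypothetical CDS intervals as (start, end) positions
cds_regions = [(100, 450), (600, 900), (1200, 1500)]

fig, ax = plt.subplots(figsize=(10, 2))
for start, end in cds_regions:
    # Each CDS becomes a rectangle of height 1 starting at y = 0
    ax.add_patch(Rectangle((start, 0), end - start, 1, alpha=0.5, color='blue'))

# Set the limits explicitly so that all rectangles are inside the view
ax.set_xlim(0, 1600)
ax.set_ylim(-0.5, 1.5)
ax.set_yticks([])
ax.set_xlabel('Position')
```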

6.2.2 Mapping molecular structures and biomarkers

Python can also be used to draw molecular structures and biomarkers, which has important applications in drug development. Here we use the py3Dmol library as an example.

import py3Dmol

def view_molecule(smiles):
    # Create a py3Dmol view object
    view = py3Dmol.view(width=400, height=400)

    # Load the molecular structure
    view.addModel(smiles, 'smi')

    # Add residue labels
    view.addResLabels()

    # Fit the camera to the structure
    view.zoomTo()

    # Render the view
    return view.show()

# Draw the molecular structure and labels
view_molecule('O=C1C(=O)CCC1C(=O)O')

In the above code, py3Dmol.view() first creates a view object for displaying molecular structures and labels. addModel() then loads the organic compound, which is passed in as a SMILES string (this assumes the viewer can parse the 'smi' format; if it cannot, convert the SMILES string to a format such as SDF first). addResLabels() adds residue labels to the structure, and zoomTo() adjusts the camera so that the structure fills the view. Finally, view.show() displays the result.

7. Summary and review

Python visualization has become widely used in recent years across many fields, including finance, the life sciences, and engineering. Its strengths are ease of use, a wide range of chart types, and strong community support. It is suitable both for quickly presenting results and for professional scientific data analysis and charting. As the visualization libraries continue to develop, they should give data scientists and engineers even more room to explore and present their data.

Origin blog.csdn.net/u010349629/article/details/130663630