Drawing a scatter plot in Python | Point size and color determined by numerical values

Python plotting series: article directory

Previous articles in the Python plotting series:
Python draws a simple line chart
Python reads data from Excel and draws multiple subplots and multiple groups of graphs on one canvas
Python draws a histogram with error bars
Python draws multiple subplots and displays them separately
Python reads Excel data and draws a chart with multiple y-axes
Python draws a histogram and beautifies it | Filling the columns with different colors
Python randomly generates data and uses dual y-axes to draw two line charts with error bars
Python draws a histogram with error bars, gradient color filling, and data annotation (advanced)


1. Introduction

Python is an extremely popular programming language that is used in various fields such as data processing, machine learning, and visualization. Matplotlib is one of the most popular data visualization libraries in Python. This article will introduce how to use Python and Matplotlib to draw scatter plots. We'll start with data generation and cover all aspects of plotting, including setting axis ranges, tick and label font styles, and adjusting the font and size of the legend.
First, make sure you have the following installed:

  1. Python 3.x
  2. NumPy
  3. Matplotlib

If NumPy or Matplotlib is not installed yet, you can install both from the command line:

pip install numpy matplotlib

2. Generate data

Before we can draw a scatter plot, we need some data. In this post, we generate three sets of random data representing the Chinese, math, and English scores of a whole class. Specifically, we use NumPy's random module to generate three arrays of 1000 integers each, one score per student in each subject. The code is as follows:

import numpy as np

nscores = 1000  # generate score data for 1000 students
chinese_scores = np.random.randint(50, 100, size=nscores)
math_scores = np.random.randint(40, 90, size=nscores)
english_scores = np.random.randint(60, 100, size=nscores)

This generates three sets of random data with NumPy's random module, representing the Chinese, math, and English scores of the whole class. The randint() function returns random integers from a half-open range: the lower bound is included and the upper bound is excluded. Chinese scores therefore fall between 50 and 99, math scores between 40 and 89, and English scores between 60 and 99.
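Because the data is random, every run produces a different plot. The sketch below shows one way to make the scores reproducible, assuming NumPy's newer Generator interface is acceptable in place of np.random.randint(); the seed value 42 is an arbitrary choice, not something from the original code.

```python
import numpy as np

# Assumption: a fixed seed (42 is arbitrary) so repeated runs give the same scores.
rng = np.random.default_rng(42)

nscores = 1000  # number of students

# Like np.random.randint, Generator.integers excludes the upper bound by default.
chinese_scores = rng.integers(50, 100, size=nscores)
math_scores = rng.integers(40, 90, size=nscores)
english_scores = rng.integers(60, 100, size=nscores)

# Print the observed extremes to confirm the half-open ranges.
for name, scores in [("Chinese", chinese_scores),
                     ("Math", math_scores),
                     ("English", english_scores)]:
    print(name, scores.min(), scores.max())
```

Removing the seed argument restores the original behavior of drawing fresh scores on every run.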

3. Draw a scatter plot

We use Matplotlib's scatter() function to draw the scatter plot. Specifically, we call scatter() three times on the same axes to plot Chinese scores against math scores, Chinese scores against English scores, and math scores against English scores. The code is as follows:

import matplotlib.pyplot as plt

# Create the figure and axes
fig, ax = plt.subplots()

# Draw the scatter plots
ax.scatter(chinese_scores, math_scores, s=20, alpha=0.8, label='Chinese and Math')
ax.scatter(chinese_scores, english_scores, s=20, alpha=0.8, label='Chinese and English')
ax.scatter(math_scores, english_scores, s=20, alpha=0.8, label='Math and English')

# Show the figure
plt.show()

Explanation: the subplots() function of the matplotlib.pyplot module creates a figure (the canvas) and one axes object. We then call scatter() three times with the corresponding data. The s parameter sets the marker size, the alpha parameter sets the marker transparency, and the label parameter sets the name under which each data set appears in a legend.
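The label values only become visible once a legend is drawn. A minimal sketch of that step follows, assuming smaller sample data than above; the Agg backend line is only there so the sketch runs without a display and can be dropped in an interactive session.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove to use an interactive window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for a repeatable example
chinese_scores = rng.integers(50, 100, size=100)
math_scores = rng.integers(40, 90, size=100)
english_scores = rng.integers(60, 100, size=100)

fig, ax = plt.subplots()
ax.scatter(chinese_scores, math_scores, s=20, alpha=0.8, label='Chinese and Math')
ax.scatter(chinese_scores, english_scores, s=20, alpha=0.8, label='Chinese and English')

# legend() collects every artist that was given a label and draws the legend box.
legend = ax.legend()
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)

fig.canvas.draw()  # render the figure; call plt.show() instead when working interactively
```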

4. Set the axis range, ticks and label font style

Next, we polish the chart further by setting the axis range, the tick positions, and the label font style.

4.1 Set the coordinate axis range

In a scatter plot, you often need to set the axis range yourself so that all of the data is visible and the axes start and end at sensible values. In Matplotlib, the set_xlim() and set_ylim() methods set the range of the x-axis and y-axis. For example, to make the x-axis run from 0 to 100 and the y-axis from 20 to 100, add the following:

ax.set_xlim(0, 100)
ax.set_ylim(20, 100)

4.2 Set axis ticks

You may also want to change the tick spacing to make the data easier to read. In Matplotlib, the set_xticks() and set_yticks() methods set the tick positions of the x-axis and y-axis. For example, to show a tick every 10 units on both axes, add the following:

ax.set_xticks(range(0, 101, 10))
ax.set_yticks(range(20, 101, 10))
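The limits and ticks from sections 4.1 and 4.2 can also be set in a single call. The sketch below uses the Axes.set() convenience method, which forwards each keyword to the matching set_* method; it is an alternative spelling rather than the approach used in this article, and the Agg backend line is only an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove to use an interactive window
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()

# Each keyword maps to a setter: xlim -> set_xlim, xticks -> set_xticks, and so on.
ax.set(
    xlim=(0, 100),
    ylim=(20, 100),
    xticks=np.arange(0, 101, 10),   # 0, 10, ..., 100
    yticks=np.arange(20, 101, 10),  # 20, 30, ..., 100
)

# Read the settings back to confirm them.
print(ax.get_xlim(), ax.get_ylim())
print(ax.get_xticks())
```

Note that the stop value of 101 is needed because, like range(), np.arange() excludes its upper bound.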

4.3 Set axis label font style

The x-axis and y-axis need labels that explain what the data means. In Matplotlib, the set_xlabel() and set_ylabel() methods add the labels, and the fontname parameter selects the label font. For example, to set the x-axis label to "Chinese Scores", the y-axis label to "Math Scores / English Scores", and the font of the labels and tick labels to Times New Roman, add the following:

ax.set_xlabel('Chinese Scores', fontname='Times New Roman')
ax.set_ylabel('Math Scores / English Scores', fontname='Times New Roman')
for tick in ax.get_xticklabels():
    tick.set_fontname("Times New Roman")
for tick in ax.get_yticklabels():
    tick.set_fontname("Times New Roman")
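Times New Roman is not installed on every system, and Matplotlib falls back to a default font with a warning when a requested font is missing. The following sketch checks for the font first; the fallback choice of DejaVu Serif (a font bundled with Matplotlib) and the variable names are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove to use an interactive window
import matplotlib.pyplot as plt
from matplotlib import font_manager

# Names of all fonts that Matplotlib has found on this machine.
available = {f.name for f in font_manager.fontManager.ttflist}

preferred = 'Times New Roman'
fallback = 'DejaVu Serif'  # assumption: bundled with Matplotlib, so it should be present
font_name = preferred if preferred in available else fallback

fig, ax = plt.subplots()
ax.set_xlabel('Chinese Scores', fontname=font_name)
ax.set_ylabel('Math Scores / English Scores', fontname=font_name)

# Apply the same font to the tick labels on both axes.
for tick in ax.get_xticklabels() + ax.get_yticklabels():
    tick.set_fontname(font_name)

print('Using font:', font_name)
```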

5. Complete code

Putting the steps together gives the complete code below. It covers data generation, drawing, setting the axis range, ticks and label font style, and adjusting the font and size of the legend. Here the size and color of each point are also driven by the scores.

import matplotlib.pyplot as plt
import numpy as np

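# Font settings used for the legend
font = {'family': 'Times New Roman', 'size': 14}

# Generate three sets of random data: the Chinese, math and English scores of the class
nscores = 60  # generate score data for 60 students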
chinese_scores = np.random.randint(50, 100, size=nscores)
math_scores = np.random.randint(40, 90, size=nscores)
english_scores = np.random.randint(60, 100, size=nscores)

# Create the figure and axes
fig, ax = plt.subplots(figsize=(8,5),dpi = 600)

# Draw the scatter plots; marker size and color both depend on the scores
s1 = 3*chinese_scores   # marker size scales with the score
s2 = math_scores*2
s3 = english_scores *1
c1 = chinese_scores  # color follows the score
c2 = math_scores
c3 = english_scores
# Normalize over the score range (40 to 100). With vmin=0, vmax=1 every score
# would exceed vmax, so all points in a group would get the same color.
norm = plt.Normalize(vmin=40, vmax=100)
ax.scatter(chinese_scores, math_scores, s=s1, c=c1, cmap='coolwarm', norm=norm, alpha=0.8, label='Chinese')
ax.scatter(chinese_scores, english_scores, s=s2, c=c2, cmap='PiYG', norm=norm, alpha=0.8, label='Math')
ax.scatter(math_scores, english_scores, s=s3, c=c3, cmap='rainbow', norm=norm, alpha=0.8, label='English')



# Set the axis range, ticks and label font style
ax.set_xlim(40, 100)
ax.set_ylim(40, 110)
ax.set_xticks(range(40, 110, 10))
ax.set_yticks(range(40, 110, 10))
plt.rcParams['font.family'] = 'serif'
plt.rcParams['font.size'] = 20
ax.set_xlabel('Chinese Scores', fontname='Times New Roman')
ax.set_ylabel('Math Scores / English Scores', fontname='Times New Roman')
for tick in ax.get_xticklabels():
    tick.set_fontname("Times New Roman")
for tick in ax.get_yticklabels():
    tick.set_fontname("Times New Roman")

ax.legend(ncol=3,loc=1,prop=font)
plt.tight_layout()
# plt.savefig('scatter_plot.jpg')  # uncomment to save the figure to a file
# Show the figure
plt.show()
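Since size and color both encode a score, readers of the chart need a key for each. The sketch below is one possible way to add them for a single group of points, using a colorbar for the color scale and PathCollection.legend_elements() for the marker sizes. The seed, the 40 to 100 normalization range, and the legend titles are assumptions for illustration, and the Agg backend line is only there for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove to use an interactive window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed for a repeatable example
chinese_scores = rng.integers(50, 100, size=60)
math_scores = rng.integers(40, 90, size=60)

# Map the score range 40..100 linearly onto 0..1 before the colormap lookup.
norm = plt.Normalize(vmin=40, vmax=100)

fig, ax = plt.subplots(figsize=(8, 5))
sc = ax.scatter(chinese_scores, math_scores,
                s=3 * chinese_scores,  # marker area grows with the score
                c=chinese_scores, cmap='coolwarm', norm=norm, alpha=0.8)

# Colorbar: shows which color corresponds to which score.
cbar = fig.colorbar(sc, ax=ax, label='Chinese score')

# Size legend: func inverts the "3 * score" scaling so labels show scores, not areas.
handles, labels = sc.legend_elements(prop='sizes', num=4,
                                     func=lambda s: s / 3, alpha=0.6)
ax.legend(handles, labels, title='Chinese score', loc='upper right')

ax.set_xlabel('Chinese Scores')
ax.set_ylabel('Math Scores')
fig.canvas.draw()  # render the figure; call plt.show() instead when working interactively
```

With three overlapping groups and three different colormaps, as in the complete code above, a single colorbar can only describe one group, so this kind of key works best when each group is drawn on its own axes.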


6. Results

[Figure: the scatter plot produced by the complete code above]


Origin blog.csdn.net/m0_58857684/article/details/130736666