Use Python to draw cool scatter plots

Preface

The text and pictures in this article come from the Internet and are intended for learning and exchange only, not for commercial use. If you have any questions, please contact us.


1. Overview of scatter plots

1. What is a scatter chart?

In regression analysis, a scatter plot shows how data points are distributed on a rectangular (Cartesian) coordinate plane. It shows the general trend of the dependent variable as the independent variable changes. From that trend you can choose a suitable function to fit the data and estimate the functional relationship between the variables.
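
One common way to put a number on the "general trend" between two variables is the Pearson correlation coefficient. The following is a minimal sketch in plain Python, not part of the original article: the function name `pearson_r` is an assumption made for this example, and the sample points are the ones used in the simple scatter plot later in this article.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-deviations and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0]
ys = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
r = pearson_r(xs, ys)
# A value near +1 or -1 suggests a strong linear trend; near 0 suggests none.
print(f"r = {r:.3f}")
```

Note that this coefficient only measures linear association, so a value near zero does not rule out a curved relationship. Looking at the scatter plot itself is still needed.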

2. What is the use of scatter plots?

1. Presenting data graphically is more intuitive. In work reports and similar settings, it makes it much easier for the audience to accept and understand the data you are working with.

2. Scatter charts lean toward research-oriented charts: they let us discover hidden relationships between variables and provide important guidance for decision-making.

3. The core value of the scatter chart is discovering the relationship between variables. Don't assume that this relationship is always linear. Variables can be related in many ways, such as linearly, exponentially, or logarithmically. Finding that there is no relationship is also an important result.

4. Once regression analysis has been applied to the data in a scatter plot, the fitted model can be used to predict and analyze related objects, so that decisions rest on evidence rather than guesswork. For example, white blood cell scatter charts in medical testing help assess a patient's health and support the doctor's follow-up judgment.
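
To make the "fit, then predict" idea in point 4 concrete, here is a minimal sketch of an ordinary least-squares line fit in plain Python. It is an illustration only: the function name `fit_line` and the sample observations are hypothetical and do not come from the original article.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations that roughly follow y = 2x + 1 with some noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(xs, ys)
# Use the fitted line to predict y for an x value that was not observed.
predicted = slope * 6.0 + intercept
print(f"y = {slope:.2f} * x + {intercept:.2f}; prediction at x = 6: {predicted:.2f}")
```

If the scatter plot suggests an exponential rather than a linear relationship, one option is to apply the same fit to the pairs (x, log y), since y = a * exp(b * x) becomes a straight line after taking logarithms.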

3. The basic elements of the scatter chart

The main elements of a scatter chart are the data source, the horizontal and vertical axes, the variable names, and the objects being studied. The most basic element is the point, that is, each observation we record. Only from the distribution of these points can we observe the relationship between the variables.

A scatter plot generally shows the relationship between two variables, which is often not enough in practice. The bubble chart extends the scatter chart by adding variables to convey richer information: the size or color of each point can encode a third variable. Because the resulting points look like bubbles, it is called a bubble chart.
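
As a sketch of how a third variable might be turned into a point size, the helper below maps values to symbol diameters. It interpolates on the square root of each value so that large values do not visually dominate. The function name `bubble_sizes` and the default size range are assumptions made for this illustration, not part of any library.

```python
import math


def bubble_sizes(values, min_size=10.0, max_size=50.0):
    """Map non-negative values to symbol diameters between min_size and max_size.

    Sizes are interpolated linearly on sqrt(value), a common choice for bubbles
    because perceived size tracks area rather than diameter.
    """
    roots = [math.sqrt(v) for v in values]
    lo, hi = min(roots), max(roots)
    if hi == lo:
        # All values are equal: use the middle of the size range for every point.
        return [(min_size + max_size) / 2.0 for _ in roots]
    scale = (max_size - min_size) / (hi - lo)
    return [min_size + (r - lo) * scale for r in roots]


# Hypothetical third variable, e.g. a count attached to each (x, y) point.
third_variable = [0, 25, 100]
print(bubble_sizes(third_variable))
```

In pyecharts this mapping is usually delegated to the chart itself, for example through `VisualMapOpts(type_="size")` as used in a later example in this article. The helper only shows the idea behind it.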

2. Scatter plot drawing

1. Simple scatter plot

The more data a scatter chart presents, the clearer the pattern becomes. This is the same principle behind regression fitting in modeling: if the data follows some functional relationship, a model can be trained iteratively until it approximates that relationship well.

import pyecharts.options as opts
from pyecharts.charts import Scatter

data = [
    [10.0, 8.04],
    [8.0, 6.95],
    [13.0, 7.58],
    [9.0, 8.81],
    [11.0, 8.33],
    [14.0, 9.96],
    [6.0, 7.24],
    [4.0, 4.26],
    [12.0, 10.84],
    [7.0, 4.82],
    [5.0, 5.68],
]
# Sort the points by x, then split them into separate x and y lists.
data.sort(key=lambda x: x[0])
x_data = [d[0] for d in data]
y_data = [d[1] for d in data]

(
    Scatter(init_opts=opts.InitOpts(width="1200px", height="600px"))
    .add_xaxis(xaxis_data=x_data)
    .add_yaxis(
        series_name="",
        y_axis=y_data,
        symbol_size=20,
        label_opts=opts.LabelOpts(is_show=False),
    )
    .set_series_opts()
    .set_global_opts(
        xaxis_opts=opts.AxisOpts(
            type_="value", splitline_opts=opts.SplitLineOpts(is_show=True)
        ),
        yaxis_opts=opts.AxisOpts(
            type_="value",
            axistick_opts=opts.AxisTickOpts(is_show=True),
            splitline_opts=opts.SplitLineOpts(is_show=True),
        ),
        tooltip_opts=opts.TooltipOpts(is_show=False),
    )
    .render("simple_scatter.html")
)

2. Multi-dimensional data scatter plot

In everyday use, a scatter plot with too many points becomes cluttered. Often we only need to know how many points fall in a given region. A histogram could answer that question, but this kind of scatter plot reflects the regional distribution better and makes the trend in the quantities easy to see. Use it according to your own business needs.
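
The idea of "how many points fall in a given region" can be sketched by binning points into square grid cells and counting them. This is an illustration in plain Python: the function name `count_by_cell`, the cell size, and the sample points are all hypothetical.

```python
from collections import Counter


def count_by_cell(points, cell_size):
    """Count how many (x, y) points fall into each square grid cell."""
    counts = Counter()
    for x, y in points:
        # Floor division assigns each point to the cell that contains it.
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts


# Hypothetical points in the range 0..10 on both axes.
points = [
    (1.0, 2.0), (1.5, 2.5), (3.2, 0.4),
    (6.0, 1.0),
    (7.9, 7.1), (8.3, 9.0), (9.9, 9.9),
]
counts = count_by_cell(points, cell_size=5.0)
for cell, n in sorted(counts.items()):
    print(cell, n)
```

The resulting counts could then drive the color or size of one marker per cell, which is one way to keep a dense chart readable.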

from pyecharts import options as opts
from pyecharts.charts import Scatter
from pyecharts.commons.utils import JsCode
from pyecharts.faker import Faker

# Each y item is [value, category]. pyecharts puts the x value in front, so
# inside the JavaScript callbacks params.value is [x, value, category].
data = [list(z) for z in zip(Faker.values(), Faker.choose())]

c = (
    Scatter()
    .add_xaxis(Faker.choose())
    .add_yaxis(
        "Category 1",
        data,
        label_opts=opts.LabelOpts(
            formatter=JsCode(
                "function(params){return params.value[1] +' : '+ params.value[2];}"
            )
        ),
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Multi-dimensional data"),
        tooltip_opts=opts.TooltipOpts(
            formatter=JsCode(
                "function (params) {return params.name + ' : ' + params.value[2];}"
            )
        ),
        # Color each point by dimension 1 (the numeric value).
        visualmap_opts=opts.VisualMapOpts(
            type_="color", max_=150, min_=20, dimension=1
        ),
    )
    .render("multi_dimensional_scatter.html")
)
# Print the same data that was plotted.
print(data)

3. Scatter plot with split lines

This chart is essentially the same as the previous ones, except that the split lines (grid lines) are shown on both axes.

from pyecharts import options as opts
from pyecharts.charts import Scatter
from pyecharts.faker import Faker

c = (
    Scatter()
    .add_xaxis(Faker.choose())
    .add_yaxis("A", Faker.values())
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Title"),
        # Show the split lines (grid lines) on both axes.
        xaxis_opts=opts.AxisOpts(splitline_opts=opts.SplitLineOpts(is_show=True)),
        yaxis_opts=opts.AxisOpts(splitline_opts=opts.SplitLineOpts(is_show=True)),
    )
    .render("split_lines.html")
)

4. Scatter plot with size-mapped points (two-dimensional)

Two-dimensional data is used to show the distribution of each category, and one chart can display several categories (series) at once, which makes the chart much easier to interpret.

from pyecharts import options as opts
from pyecharts.charts import Scatter
from pyecharts.faker import Faker

c = (
    Scatter()
    .add_xaxis(Faker.choose())
    .add_yaxis("1", Faker.values())
    .add_yaxis("2", Faker.values())
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Title"),
        # Map each value in the range 20..150 to the size of its point.
        visualmap_opts=opts.VisualMapOpts(type_="size", max_=150, min_=20),
    )
    .render("size_mapped_scatter.html")
)

5. 3D scatter chart display
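
The original example for this section is not available. The following is a minimal sketch of a 3D scatter chart, assuming pyecharts' `Scatter3D` chart with value-type axes. The helper names `helix_points` and `render_scatter3d`, the sample data, and the output file name are made up for this illustration.

```python
import math


def helix_points(n=60):
    """Generate [x, y, z] points along a helix (made-up sample data)."""
    points = []
    for i in range(n):
        t = i / 5.0
        points.append([round(math.cos(t), 3), round(math.sin(t), 3), round(t, 3)])
    return points


def render_scatter3d(points, path="scatter3d.html"):
    """Render the points as a 3D scatter chart and return the output path."""
    # Imported here so that the data helper above works without pyecharts.
    from pyecharts import options as opts
    from pyecharts.charts import Scatter3D

    return (
        Scatter3D()
        .add(
            "",
            points,
            xaxis3d_opts=opts.Axis3DOpts(type_="value"),
            yaxis3d_opts=opts.Axis3DOpts(type_="value"),
            zaxis3d_opts=opts.Axis3DOpts(type_="value"),
        )
        .set_global_opts(title_opts=opts.TitleOpts(title="3D scatter"))
        .render(path)
    )


points = helix_points()
print(len(points), points[0])
```

With pyecharts installed, calling `render_scatter3d(points)` should write an HTML file that can be opened in a browser and rotated with the mouse.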

6. Dynamic ripple scatter plot

The previous scatter plots are all static. Now let's take a look at a dynamic one.

from pyecharts import options as opts
from pyecharts.charts import EffectScatter
from pyecharts.faker import Faker

c = (
    EffectScatter()
    .add_xaxis(Faker.choose())
    .add_yaxis("", Faker.values())
    .set_global_opts(title_opts=opts.TitleOpts(title="Scatter plot"))
    .render("dynamic_scatter.html")
)

7. Arrow symbol scatter chart

from pyecharts import options as opts
from pyecharts.charts import EffectScatter
from pyecharts.faker import Faker
from pyecharts.globals import SymbolType

c = (
    EffectScatter()
    .add_xaxis(Faker.choose())
    .add_yaxis("", Faker.values(), symbol=SymbolType.ARROW)
    .set_global_opts(title_opts=opts.TitleOpts(title="Title"))
    .render("arrow_dynamic_scatter.html")
)

That covers most of the common scatter plots. The most practical one is the first; use the others as your scenario requires.

Origin blog.csdn.net/pythonxuexi123/article/details/114744385