Seaborn summary

Seaborn data visualization basics
Introduction
Matplotlib is an open source plotting library for Python. Because it supports a rich set of chart types, has a simple plotting interface and offers thorough documentation, it is popular with Python engineers, researchers and data engineers. Seaborn is a higher-level plotting library built on top of Matplotlib. It produces more attractive figures without complex customization, which makes it well suited to exploratory data visualization.

Key topics
Relational plots
Categorical plots
Distribution plots
Regression plots
Matrix plots
Combination plots
Introducing Seaborn
Matplotlib is arguably the best-established plotting library for Python, but it has one very frustrating problem: it is too complicated. With more than 3,000 pages of official documentation and tens of thousands of method parameters, it is a typical case of a tool that can do anything but gives you no obvious place to start. In particular, getting a really polished result out of Matplotlib is often a tedious headache.

Seaborn is a higher-level API built on the Matplotlib core library that lets you draw more attractive figures with little effort. The improvement comes mainly from more pleasing colors and styles and from more refined rendering of graphical elements, as the example figures in the official Seaborn documentation show.

Seaborn has the following characteristics:

Several built-in, optimized style themes.
Color palette tools that make it easy to choose colors suited to the data.
Simpler plotting of univariate and bivariate distributions, which can be used to compare subsets of the data.
Easier regression fitting and visualization for independent and dependent variables.
Visualization of data matrices, with clustering algorithms available for analysis.
Plotting and statistics for time series data, with more flexible estimation of uncertainty.
Grid-based construction of more complex sets of plots.
In addition, Seaborn is highly compatible with Matplotlib and with Pandas data structures, which makes it a very suitable visualization tool for data mining work.

Quick style improvements
When we plot with Matplotlib, the default figure style is not especially attractive. Seaborn can improve it quickly. First, let's draw a simple figure using Matplotlib alone.

Example code:

import matplotlib.pyplot as plt
%matplotlib inline

x = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
y_bar = [3, 4, 6, 8, 9, 10, 9, 11, 7, 8]
y_line = [2, 3, 5, 7, 8, 9, 8, 10, 6, 7]

plt.bar(x, y_bar)
plt.plot(x, y_line, '-o', color='y')

Using Seaborn to improve the figure quickly is very simple: call sns.set() to declare the Seaborn style before the plotting code.

import seaborn as sns

sns.set()  # Declare that the Seaborn style should be used

plt.bar(x, y_bar)
plt.plot(x, y_line, '-o', color='y')

Compared with Matplotlib's default plain white background, Seaborn's default light gray grid background does look more refined and comfortable. The bar colors and the axis font sizes have also changed slightly.

The default parameters of sns.set() are:

sns.set(context='notebook', style='darkgrid', palette='deep', font='sans-serif', font_scale=1, color_codes=False, rc=None)
Where:

context='' controls the default scale of plot elements and takes one of four values: {paper, notebook, talk, poster}, where poster > talk > notebook > paper.
style='' controls the default style and takes one of {darkgrid, whitegrid, dark, white, ticks}. Try changing it to see the differences between them.
palette='' selects a preset color palette, such as {deep, muted, bright, pastel, dark, colorblind}. Try changing it to see the differences between them.
Of the rest, font='' sets the font, font_scale= sets the font size scaling, and color_codes= controls whether shorthand color codes such as 'r' are remapped to colors from the palette.
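
As a small sketch of how these parameters combine, the snippet below redraws the earlier bar-and-line figure with a larger context, a white grid style and the muted palette. The particular values are illustrative assumptions, not recommendations from the original article.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
import matplotlib.pyplot as plt
import seaborn as sns

# Override a few defaults of sns.set(); the values are arbitrary examples
sns.set(context="talk", style="whitegrid", palette="muted", font_scale=1.2)

x = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
y_bar = [3, 4, 6, 8, 9, 10, 9, 11, 7, 8]
y_line = [2, 3, 5, 7, 8, 9, 8, 10, 6, 7]

fig, ax = plt.subplots()
ax.bar(x, y_bar)                     # bars take their color from the palette
ax.plot(x, y_line, "-o", color="y")  # line drawn on the same Axes
```

Because sns.set() changes Matplotlib's global settings, it affects every figure drawn afterwards in the same session.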
Seaborn plotting API
Seaborn has just over 50 API functions and classes in total, which is short and concise compared with the thousands in Matplotlib. Grouped by the scenarios they suit, Seaborn's plotting methods fall roughly into six categories: relational plots, categorical plots, distribution plots, regression plots, matrix plots and combination plots. Each of the six categories contains a different number of plotting functions.

Next, we use real data to demonstrate the Seaborn plots suited to different scenarios.

Relational plots
When we need to analyze relationships in the data, we can use the following APIs provided by Seaborn.

API            Description
relplot        Draws relational plots
scatterplot    Multi-dimensional scatter plot
lineplot       Multi-dimensional line plot
relplot is short for relational plots. It presents relationships in the data and comes in two styles: scatter plots and line plots. In this walkthrough we use the Iris dataset for exploratory plotting.

Before plotting, get familiar with the iris dataset. It has 150 rows and 5 columns: sepal length, sepal width, petal length, petal width and species. The first four columns are numeric, and the last column assigns each flower to one of three species: Iris Setosa, Iris Versicolour and Iris Virginica.

iris = sns.load_dataset("iris")
iris.head()


At this point, specifying the features for x and y plots a scatter plot by default.

sns.relplot(x="sepal_length", y="sepal_width", data=iris)

However, this figure does not show how the data relates to the species. Coloring the points by species with the hue parameter makes the figure more informative.

sns.relplot(x="sepal_length", y="sepal_width", hue="species", data=iris)

Seaborn functions have many useful parameters. For example, specifying the style parameter gives each species of point a different marker shape. For the other parameters, please read the official documentation.

sns.relplot(x="sepal_length", y="sepal_width",
            hue="species", style="species", data=iris)

The method draws line plots as well as scatter plots: just specify the kind="line" parameter. Line plots and scatter plots suit different kinds of data. A line plot also automatically draws a 95% confidence interval band.

sns.relplot(x="sepal_length", y="petal_length",
            hue="species", style="species", kind="line", data=iris)

You will notice that we mentioned three APIs above: relplot, scatterplot and lineplot. In practice, relplot can be seen as a combined version of scatterplot and lineplot.

This brings us to the API hierarchy in Seaborn. Seaborn's plotting APIs are divided into two kinds: Figure-level and Axes-level. relplot is a Figure-level interface, while scatterplot and lineplot are Axes-level interfaces.

The difference is that Axes-level functions integrate more flexibly and tightly with Matplotlib, while Figure-level functions are more like "lazy" convenience functions suited to quick use.

For example, the figure above can also be drawn with lineplot. Just drop the kind parameter that relplot needed.

sns.lineplot(x="sepal_length", y="petal_length",
             hue="species", style="species", data=iris)

Categorical plots
As with relational plots, the Figure-level interface for categorical plots is catplot, short for categorical plots. catplot is in fact a collection of the following Axes-level plotting APIs:

Categorical scatter plots:

stripplot() (kind="strip")
swarmplot() (kind="swarm")

Categorical distribution plots:

boxplot() (kind="box")
violinplot() (kind="violin")
boxenplot() (kind="boxen")

Categorical estimate plots:

pointplot() (kind="point")
barplot() (kind="bar")
countplot() (kind="count")

Below, we look at what catplot draws. By default it draws a kind="strip" scatter plot.

sns.catplot(x="sepal_length", y="species", data=iris)
kind="swarm" arranges the points in a beeswarm layout so that they do not overlap, which makes the distribution of the data easier to see.

sns.catplot(x="sepal_length", y="species", kind="swarm", data=iris)

Similarly, the hue= parameter can add another dimension to the figure. Because the iris dataset has only one categorical column, we do not add hue= here. If a dataset has more than one categorical column, hue= can distinguish the data points further.
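
A minimal sketch of that case, assuming a made-up data frame with two categorical columns (group and batch are hypothetical names, and the values are invented for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
import pandas as pd
import seaborn as sns

# Hypothetical data with two categorical columns: "group" and "batch"
toy = pd.DataFrame({
    "group": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "batch": ["first", "first", "second", "second"] * 2,
    "value": [1.0, 1.4, 2.1, 2.5, 3.0, 3.2, 4.1, 4.6],
})

# x splits the observations by "group"; hue splits each group again by "batch"
grid = sns.catplot(x="group", y="value", hue="batch", kind="box", data=toy)
print(grid.ax.get_xlabel(), grid.ax.get_ylabel())
```

Each position on the x axis then shows one box per batch, drawn in different colors.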

Next, we try several other kinds of plot in turn. Draw a box plot:

sns.catplot(x="sepal_length", y="species", kind="box", data=iris)

Draw a violin plot:

sns.catplot(x="sepal_length", y="species", kind="violin", data=iris)

Draw an enhanced box plot (boxen plot):

sns.catplot(x="species", y="sepal_length", kind="boxen", data=iris)

Draw a point plot:

sns.catplot(x="sepal_length", y="species", kind="point", data=iris)

Draw a bar plot:

sns.catplot(x="sepal_length", y="species", kind="bar", data=iris)

Draw a count plot:

sns.catplot(x="species", kind="count", data=iris)

Distribution plots
Distribution plots are mainly used to visualize how variables are distributed, and are generally divided into univariate and multivariate distributions. Here, multivariate mainly means bivariate, since plots of more variables than that are hard to read intuitively.

Seaborn provides several methods for distribution plots: jointplot, pairplot, distplot and kdeplot. We look at each of them in turn.

The quick way to view a univariate distribution in Seaborn is distplot. By default, it draws a histogram together with a fitted kernel density estimate.

sns.distplot(iris["sepal_length"])

distplot provides parameters to adjust the histogram and the kernel density estimate. For example, kde=False draws only the histogram, and hist=False draws only the kernel density estimate. kdeplot is dedicated to drawing kernel density estimates; its result is the same as distplot(hist=False), but kdeplot offers more customization.

sns.kdeplot(iris["sepal_length"])

jointplot is mainly used to draw bivariate distributions. For example, we can explore the relationship between the two variables sepal_length and sepal_width.

sns.jointplot(x="sepal_length", y="sepal_width", data=iris)

jointplot is not a Figure-level interface in the same sense as relplot or catplot, but it supports a kind= parameter that selects different styles of plot. For example, draw a kernel density estimate plot:

sns.jointplot(x="sepal_length", y="sepal_width", data=iris, kind="kde")

Hexbin plot:

sns.jointplot(x="sepal_length", y="sepal_width", data=iris, kind="hex")

Regression plot:

sns.jointplot(x="sepal_length", y="sepal_width", data=iris, kind="reg")

Finally, pairplot is even more powerful: it plots pairwise comparisons of all the feature variables in a dataset in a single call. By default, the diagonal shows univariate distributions and the other cells show bivariate distributions.

sns.pairplot(iris)

Here, introducing a third dimension with hue="species" makes the figure more intuitive.

sns.pairplot(iris, hue="species")

Regression plots
Next, we turn to regression plots. The main functions for drawing them are lmplot and regplot.

To draw a regression plot with regplot, you only need to specify the independent and dependent variables, and regplot fits a linear regression automatically.

sns.regplot(x="sepal_length", y="sepal_width", data=iris)
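
To check what the automatic fit amounts to, the sketch below compares the line drawn by regplot with an ordinary least-squares straight line from np.polyfit. The data are synthetic (a noisy line with made-up slope and intercept), and reading the fit back from ax.lines[0] is an assumption about how regplot draws, not a documented interface.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 40)
toy = pd.DataFrame({"x": x, "y": 1.5 * x + 2.0 + rng.normal(0, 1.0, x.size)})

fig, ax = plt.subplots()
sns.regplot(x="x", y="y", data=toy, ax=ax)

# Ordinary least-squares fit of a straight line, for comparison
slope, intercept = np.polyfit(toy["x"], toy["y"], deg=1)

fitted = ax.lines[0]  # the regression line drawn by regplot
print(slope, intercept)
```

The shaded band around the line is a bootstrapped confidence interval for the fit.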

lmplot also draws regression plots, but it additionally supports a third dimension for comparison. For example, we can set hue="species".

sns.lmplot(x="sepal_length", y="sepal_width", hue="species", data=iris)

Matrix plots
The two most commonly used matrix plots are heatmap and clustermap.

As the name suggests, heatmap is mainly used to draw heat maps.

import numpy as np

sns.heatmap(np.random.rand(10, 10))

Heat maps are very useful in some scenarios, for example for plotting the correlation coefficients between variables.
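
A correlation heat map can be sketched as follows. The three columns are synthetic and their names are hypothetical: b is constructed to correlate with a, and c is independent noise.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(3)
a = rng.normal(size=200)
toy = pd.DataFrame({
    "a": a,
    "b": 0.8 * a + rng.normal(scale=0.5, size=200),  # correlated with a
    "c": rng.normal(size=200),                       # independent noise
})

corr = toy.corr()  # pairwise Pearson correlation coefficients

fig, ax = plt.subplots()
# Fix the color scale to [-1, 1] and print each coefficient in its cell
sns.heatmap(corr, vmin=-1, vmax=1, cmap="coolwarm", annot=True, ax=ax)
```

Fixing vmin and vmax keeps the colors comparable between figures, since a correlation coefficient always lies between -1 and 1.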

In addition, clustermap supports drawing hierarchically clustered heat maps. As shown below, we first remove the last column of the original dataset, the target column, and then pass in the feature data. You do need to understand hierarchical clustering, otherwise it is hard to see what the figure means.

iris.pop("species")
sns.clustermap(iris)
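
Note that pop removes the column from the data frame in place. A variant that leaves the original frame untouched, and reads back the row order chosen by the clustering, might look like the sketch below. The data are synthetic, the column names are hypothetical, SciPy is assumed to be installed, and dendrogram_row.reordered_ind is an attribute of the returned grid as found in current seaborn releases.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(4)
# Hypothetical data: two well-separated groups of rows plus a label column
toy = pd.DataFrame(
    np.vstack([rng.normal(0, 0.3, (5, 3)), rng.normal(5, 0.3, (5, 3))]),
    columns=["f1", "f2", "f3"],
)
toy["label"] = ["low"] * 5 + ["high"] * 5

numeric = toy.drop(columns="label")  # leaves `toy` itself unchanged
grid = sns.clustermap(numeric)       # clustering relies on SciPy

# Row order after hierarchical clustering (positions in the original frame)
row_order = list(grid.dendrogram_row.reordered_ind)
print(row_order)
```

Rows that end up next to each other in the reordered heat map are the ones the clustering judged most similar.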

If you browse the official documentation, you will find that Seaborn also has a number of classes whose names begin with an uppercase letter, such as JointGrid and PairGrid. The corresponding lowercase functions, jointplot and pairplot, are convenience wrappers built on these classes. The two forms differ slightly in use, but not in essence.

In addition, the official Seaborn documentation covers style and color customization and other auxiliary components. These APIs are not difficult to apply; what matters is regular practice.

Origin www.cnblogs.com/hannahzhao/p/11959465.html