Distribution data visualization

Univariate

1. Histogram: distplot

seaborn.distplot(a, bins=None, hist=True, kde=True, rug=False, fit=None, hist_kws=None, kde_kws=None, rug_kws=None, fit_kws=None, color=None, vertical=False, norm_hist=False, axlabel=None, label=None, ax=None)

bins → number of histogram bins

hist, kde, rug → bool; whether to draw the histogram / the density curve / the rug (one tick per observation)

norm_hist → bool; if True, the histogram shows density; if False, it shows counts

{hist, kde, rug, fit}_kws → dict; keyword arguments passed to the corresponding part of the plot

vertical → bool; whether to draw the plot horizontally, with the observed values on the y axis

fit → a distribution from scipy (for example scipy.stats.norm) to fit to the data and draw on the plot

label → legend label

axlabel → x-axis label

2. Kernel density estimate plot: kdeplot

Steps of kernel density estimation:

  • Approximate each observation with a normal distribution curve centered on it

  • Sum the normal distribution curves of all observations

  • Normalize the result so that the area under the curve is 1
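
The three steps above can be sketched in plain Python. This is a minimal illustration rather than seaborn's actual implementation; the function name gaussian_kde_manual, the sample values, and the bandwidth of 0.5 are all hypothetical choices for the example.

```python
import math


def gaussian_kde_manual(observations, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at every point of `grid`."""
    n = len(observations)
    norm_const = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    density = []
    for g in grid:
        # Steps 1 and 2: one normal curve per observation, summed at this grid point
        total = sum(
            norm_const * math.exp(-0.5 * ((g - x) / bandwidth) ** 2)
            for x in observations
        )
        # Step 3: divide by the number of observations so the total area is 1
        density.append(total / n)
    return density


observations = [1.0, 1.5, 2.0, 4.0, 4.2]        # made-up sample
step = 0.01
grid = [-5.0 + i * step for i in range(1501)]   # evaluation grid from -5 to 10
density = gaussian_kde_manual(observations, grid, bandwidth=0.5)

# Riemann-sum approximation of the area under the estimated curve
area = sum(density) * step
```

The bandwidth plays the role of the bw parameter of kdeplot: it is the standard deviation of each small normal curve.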

seaborn.kdeplot(data, data2=None, shade=False, vertical=False, kernel='gau', bw='scott', gridsize=100, cut=3, clip=None, legend=True, cumulative=False, shade_lowest=True, cbar=False, cbar_ax=None, cbar_kws=None, ax=None, **kwargs)

shade: bool; if True, fill the area under the KDE curve with color (for bivariate data, fill the contours)

kernel: {'gau' | 'cos' | 'biw' | 'epa' | 'tri' | 'triw'}; the kernel used for the fit. Bivariate KDE can only use the Gaussian kernel ('gau')

bw: {'scott' | 'silverman' | scalar | pair of scalars}; determines the kernel size (bandwidth), which can be understood as how closely the fit follows the data: the larger bw, the smoother the curve

gridsize: int; number of discrete points in the evaluation grid

cumulative: bool; whether to draw the cumulative distribution

cbar: bool; if True, add a colorbar (bivariate KDE plots only)

A kernel density plot is not limited to a single variable: two variables can be plotted as well!

Bivariate

1. jointplot

seaborn.jointplot(x, y, data=None, kind='scatter', color=None, size=6, ratio=5, space=0.2, dropna=True, xlim=None, ylim=None, joint_kws=None, marginal_kws=None, annot_kws=None, **kwargs)

This function is a lightweight interface to the JointGrid class; for more flexible plotting, use JointGrid directly.

kind: plot type, one of "scatter", "reg", "resid", "kde", "hex"

size: int; size of the figure (the figure is always square)

ratio: int; ratio of the height of the main plot to the height of the marginal plots

space: spacing between the main plot and the marginal plots

{x, y}lim: axis limits to set before plotting

{joint, marginal, annot}_kws: dicts of additional keyword arguments for the corresponding plot components

seaborn directly reports the Pearson correlation coefficient and the p-value for the two variables.

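Pearson correlation coefficient calculation:

$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$$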

p: the probability that a correlation at least this strong would appear through sampling error alone if the two variables were in fact uncorrelated.
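
The Pearson coefficient is the covariance of the two variables divided by the product of their standard deviations. A minimal sketch of that calculation in plain Python is shown below; the helper name pearson_r and the sample values are hypothetical. The p-value is not computed here; scipy.stats.pearsonr returns both the coefficient and the p-value.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (denominator)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys_up = [2.0, 4.0, 6.0, 8.0, 10.0]    # rises exactly linearly with xs
ys_down = [5.0, 4.0, 3.0, 2.0, 1.0]   # falls exactly linearly with xs

r_up = pearson_r(xs, ys_up)
r_down = pearson_r(xs, ys_down)
```

A perfectly increasing linear relationship gives a coefficient of 1 and a perfectly decreasing one gives -1; values near 0 indicate no linear relationship.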

2. JointGrid

As noted above, jointplot is actually a wrapper around JointGrid; for more flexible configuration, use the JointGrid class directly.

__init__(x, y, data=None, size=6, ratio=5, space=0.2, dropna=True, xlim=None, ylim=None)

Methods:

plot(joint_func, marginal_func, annot_func) → draw the complete figure

plot_joint(func, **kwargs) → draw the bivariate plot

plot_marginals(func, **kwargs) → draw the univariate plots in the margins

savefig(*args, **kwargs) → save the figure

set_axis_labels([xlabel, ylabel]) → set the axis labels of the bivariate axes

Exploring pairwise relationships between variables

In general, a dataset contains more than one or two variables. With many variables, we often need to explore the distribution of each variable and the relationship between every pair of them; for this we use the pairplot function or the PairGrid class.

3. pairplot

seaborn.pairplot(data, hue=None, hue_order=None, palette=None, vars=None, x_vars=None, y_vars=None, kind='scatter', diag_kind='auto', markers=None, size=2.5, aspect=1, dropna=True, plot_kws=None, diag_kws=None, grid_kws=None)

hue: string (variable name); the points are colored according to the categories of this variable

hue_order: list; order of the levels of the hue variable in the palette

palette: color palette

vars: list of variable names to plot; if omitted, all numeric columns are used

markers: marker style of the points

sepal_length  sepal_width  petal_length  petal_width  species
5.1           3.5          1.4           0.2          setosa
4.9           3.0          1.4           0.2          setosa
4.7           3.2          1.3           0.2          setosa
4.6           3.1          1.5           0.2          setosa
5.0           3.6          1.4           0.2          setosa

4. PairGrid

PairGrid relates to pairplot the way JointGrid relates to jointplot: it gives more flexible control over the scatterplot matrix.

__init__(data, hue=None, hue_order=None, palette=None, hue_kws=None, vars=None, x_vars=None, y_vars=None, diag_sharey=True, size=2.5, aspect=1, despine=True, dropna=True)

Methods:

add_legend([legend_data, title, label_order]): draw a legend, which may be placed outside the axes, resizing the figure to fit.

map_diag(func, **kwargs): plot with a univariate function on each diagonal subplot.

map_lower(func, **kwargs): plot with a bivariate function on the subplots below the diagonal.

map_upper(func, **kwargs): plot with a bivariate function on the subplots above the diagonal.

map_offdiag(func, **kwargs): plot with a bivariate function on all off-diagonal subplots.

set(**kwargs): set attributes on the Axes of each subplot.

Origin blog.csdn.net/wulishinian/article/details/104916886