How to draw a categorical scatter plot using the stripplot() function

A data set usually contains several types of data. Besides continuous feature variables, the most common type is categorical data, such as gender, education level, or hobbies. These values cannot be represented by continuous variables, so they are treated as categorical data. Seaborn provides dedicated visualization functions for categorical data, which can be roughly divided into the following three types:

Categorical data scatter plot: swarmplot() and stripplot().

Distribution plots for categorical data: boxplot() and violinplot().

Statistical estimation plots for categorical data: barplot() and pointplot().

This article uses stripplot() to draw categorical scatter plots. The syntax of the stripplot() function is as follows.

seaborn.stripplot(x=None, y=None, hue=None, data=None, order=None, hue_order=None, jitter=False)

The commonly used parameters of this function have the following meanings:

(1) x, y, hue: the variables to plot when the data is in long format, usually given as column names in data.

(2) data: the data set used for drawing. If x and y are not given, it is interpreted as wide format; otherwise it is interpreted as long format.

(3) jitter: the amount of jitter to apply (only along the categorical axis). When many data points overlap, you can specify the amount of jitter as a number, or set it to True to use a default amount.

To make this more concrete, the following example draws a scatter plot with the stripplot() function.

import seaborn as sns
import matplotlib.pyplot as plt

# Load the tips example dataset
tips = sns.load_dataset("tips")
sns.stripplot(x="day", y="total_bill", data=tips)
plt.show()

The running results are shown in the figure below.

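As the figure above shows, the x-axis of the chart holds the categorical data, and some data points overlap each other, which makes them hard to tell apart. To solve this problem, you can pass the jitter parameter when calling the stripplot() function so that the points are shifted slightly along the x-axis. Note that the signature shown earlier reflects an older seaborn release; newer releases enable jitter by default, so there you may need to pass jitter=False to reproduce the overlapping plot. The modified sample code is as follows.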

sns.stripplot(x="day", y="total_bill", data=tips, jitter=True)

The running results are shown in the figure below.
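The signature shown earlier also lists hue, order, and hue_order, which the examples so far do not use. The following is a minimal sketch of how they might be combined with jitter. It assumes a synthetic data frame named demo with made-up values and the same column names as the tips data, so that it runs without downloading anything.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(1)
days = ["Thur", "Fri", "Sat", "Sun"]

# Synthetic stand-in for the tips data (values are invented for illustration)
demo = pd.DataFrame({
    "day": rng.choice(days, size=80),
    "total_bill": rng.uniform(5, 50, size=80).round(2),
    "sex": rng.choice(["Male", "Female"], size=80),
})

fig, ax = plt.subplots(figsize=(6, 3))
sns.stripplot(
    x="day", y="total_bill", hue="sex", data=demo,
    order=days,                      # fixes the left-to-right order of the categories
    hue_order=["Female", "Male"],    # fixes the order of the color groups
    jitter=True, ax=ax,
)
fig.savefig("stripplot_hue.png")
```

Here hue colors each point by a second categorical column, while order and hue_order keep the category and legend order fixed regardless of the order in which values appear in the data.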

In addition, the swarmplot() function can also be called to draw a categorical scatter plot. Its advantage is that the points are arranged so that they do not overlap, so the distribution of the data can be observed clearly. The sample code is as follows.

sns.swarmplot(x="day", y="total_bill", data=tips)

The running results are shown in the figure.
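To see the difference between the two functions directly, they can be drawn side by side on the same data. The sketch below uses a made-up data frame df with many tied values (an assumption chosen to force overlaps), not the tips data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(2)

# Hypothetical data with many repeated values, so points coincide without jitter
df = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], 20),
    "value": rng.integers(0, 5, size=60),
})

fig, (ax_strip, ax_swarm) = plt.subplots(1, 2, figsize=(8, 3), sharey=True)

# Without jitter, every point of a category sits on the same x position
sns.stripplot(x="group", y="value", data=df, jitter=False, ax=ax_strip)
ax_strip.set_title("stripplot, jitter=False")

# swarmplot moves points sideways so that tied values sit next to each other
sns.swarmplot(x="group", y="value", data=df, ax=ax_swarm)
ax_swarm.set_title("swarmplot")

fig.savefig("strip_vs_swarm.png")
```

In the left panel the tied values collapse onto single markers, while the right panel shows every observation. For large data sets swarmplot() may run out of room along the categorical axis, in which case stripplot() with jitter is usually the more practical choice.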

Origin blog.csdn.net/zy1992As/article/details/132581003