Learning about box plots

Box plot

A box plot is a standardized way of displaying the distribution of data based on a five-number summary.

UXKV4e.png

As the figure shows:

  • Median: The middle value of the data (50% quantile)
  • Q1: The first quartile (25% quantile)
  • Q3: The third quartile (75% quantile)
  • IQR: The interquartile range, defined as the distance between Q1 and Q3 (Q3 - Q1)
  • Minimum: Q1 - 1.5 * IQR
  • Maximum: Q3 + 1.5 * IQR
  • Outliers: Data points below Minimum or above Maximum
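The definitions above can be checked by hand on a small sample. The following is a minimal sketch using only the Python standard library; the helper name `box_stats` and the sample values are made up for illustration, and `method="inclusive"` is one of several quartile conventions (other tools may give slightly different Q1 and Q3).

```python
import statistics


def box_stats(values):
    """Compute the quantities a box plot is built from (illustrative helper)."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr  # "Minimum" in the list above
    upper = q3 + 1.5 * iqr  # "Maximum" in the list above
    outliers = [v for v in values if v < lower or v > upper]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "minimum": lower,
        "maximum": upper,
        "outliers": outliers,
    }


# Hypothetical sample: nine small values and one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
stats = box_stats(data)
for name, value in stats.items():
    print(f"{name}: {value}")
```

Here only the value 50 lies beyond the Maximum of 14.5, so it is the single outlier. Note that matplotlib draws each whisker to the most extreme data point that still lies inside these limits, not to the limits themselves.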

If the data are normally distributed, the corresponding probability distribution is shown in the figure below: only about 0.7% of the data fall below Minimum or above Maximum and count as outliers.

UXMfzj.png
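The 0.7% figure can be reproduced from the standard normal distribution. This is a small sketch using `statistics.NormalDist` from the standard library; the variable names are chosen for illustration only.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Quartiles of the standard normal distribution.
q1 = std_normal.inv_cdf(0.25)
q3 = std_normal.inv_cdf(0.75)
iqr = q3 - q1

# Box plot limits as defined earlier.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Probability mass in the two tails beyond the limits.
outlier_fraction = std_normal.cdf(lower) + (1 - std_normal.cdf(upper))
print(f"limits: {lower:.3f} to {upper:.3f}")
print(f"expected share of outliers: {outlier_fraction:.2%}")
```

The limits come out at roughly 2.7 standard deviations on either side of the mean, which is why so little of a normal sample is flagged.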

Usage

  • Box plots are for continuous variables.
  • If the data contain many outliers, consider transforming them first (for example, taking the logarithm).
  • Box plots are most effective for comparison: draw grouped box plots that split the data by one or more qualitative (categorical) variables.
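To illustrate the point about transformations, the sketch below counts outliers before and after taking the logarithm. The data are synthetic (log-normally distributed, generated with a fixed seed), and `count_outliers` is a hypothetical helper, so this shows the idea rather than a general recipe.

```python
import math
import random
import statistics


def count_outliers(values):
    """Count values outside Q1 - 1.5*IQR and Q3 + 1.5*IQR (illustrative helper)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return sum(1 for v in values if v < lower or v > upper)


# Synthetic, strongly right-skewed data; all values are positive,
# so the logarithm is defined for every point.
rng = random.Random(0)
raw = [rng.lognormvariate(0, 1) for _ in range(1000)]
logged = [math.log(v) for v in raw]

raw_outliers = count_outliers(raw)
log_outliers = count_outliers(logged)
print(f"outliers before log transform: {raw_outliers}")
print(f"outliers after log transform:  {log_outliers}")
```

For skewed data like this, the long right tail produces many points above the upper limit; after the log transform the distribution is roughly symmetric and far fewer points are flagged. The logarithm only applies to strictly positive data.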

Use matplotlib.pyplot to draw box plots

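import matplotlib.pyplot as plt

# Two groups side by side; on Matplotlib 3.9+ the labels argument is named tick_labels
plt.boxplot([range(20), range(15)], labels=['a', 'b'])
plt.show()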

Ux8el8.png
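The same call extends to grouped box plots, where a continuous variable is split by a qualitative one. The following is a sketch under assumptions: the records, the `group_values` and `plot_groups` helpers, and the axis label are invented for illustration, and matplotlib is imported inside the plotting function so that the grouping step can be run on its own.

```python
from collections import defaultdict


def group_values(records):
    """Collect the numeric values of each category, in first-seen order."""
    groups = defaultdict(list)
    for category, value in records:
        groups[category].append(value)
    return dict(groups)


def plot_groups(groups):
    """Draw one box per category (requires matplotlib)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.boxplot(list(groups.values()))
    # Boxes are drawn at positions 1..n by default; label them by category.
    ax.set_xticks(range(1, len(groups) + 1))
    ax.set_xticklabels(list(groups.keys()))
    ax.set_ylabel("value")
    plt.show()


# Hypothetical (category, measurement) pairs.
records = [
    ("a", 4.1), ("b", 6.3), ("a", 3.8), ("c", 9.0),
    ("b", 5.9), ("a", 4.6), ("c", 8.2), ("b", 6.8),
    ("c", 8.7), ("a", 12.5),
]
groups = group_values(records)
print(groups)
```

Calling `plot_groups(groups)` then draws the three boxes side by side, which makes differences in median and spread between categories easy to compare.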

Reference articles:

https://blog.csdn.net/dujiahei/article/details/82056283

https://zhuanlan.zhihu.com/p/110580568?from_voters_page=true

Origin blog.csdn.net/weixin_44338712/article/details/107572110