Use Python to test whether the variances of several sets of data are equal

Sometimes we need to determine whether the means or the variances of two sets of data are equal. For the means we can use a t-test (see how to perform a t-test). For the variances there is also a test, and it is even simpler to use than the t-test for means.

We use scipy.stats.levene(): just pass in the data sets we want to test, each of which must be one-dimensional. The result contains the test statistic and the p-value. If the p-value is less than our chosen threshold α, we reject the null hypothesis of equal variances and conclude that the variances of these data sets are not equal.

import numpy as np
from scipy import stats

# First generate 50 samples from the standard normal distribution and 50 samples with mean 0 and variance 4 (standard deviation 2)
np.random.seed(2020)
data_ran = np.random.normal(0, 1, 50)
data_ran2 = np.random.normal(0, 2, 50)
# Test whether the two groups have equal variances (they do not, so the test should reject the null hypothesis)
r1 = stats.levene(data_ran, data_ran2)
print(r1)

Output: LeveneResult(statistic=14.941411312615362, pvalue=0.00019943084704952306)

It can be seen from the output that the p-value is very small: even if we set α to 0.001, we can still reject the null hypothesis that the two data sets have equal variances. This matches how the data were generated, since one set comes from a normal distribution with variance 1 and the other from a normal distribution with variance 4.
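The same call also accepts more than two groups, which is the "several sets of data" case in the title. Below is a minimal sketch of that usage with the decision rule from above made explicit. The helper name variances_differ, the group names, the sample size of 200, and the threshold α = 0.05 are all assumptions chosen for illustration, not part of the original example.

```python
import numpy as np
from scipy import stats


def variances_differ(samples, alpha=0.05):
    """Hypothetical helper: run Levene's test on several 1-D samples.

    Returns (reject, result), where reject is True when the p-value
    is below alpha, i.e. when equal variances are rejected.
    """
    result = stats.levene(*samples)
    return bool(result.pvalue < alpha), result


# Illustrative data (assumed): two groups with standard deviation 1
# and one group with standard deviation 3 (variance 9).
rng = np.random.default_rng(2020)
group_a = rng.normal(0, 1, 200)
group_b = rng.normal(0, 1, 200)
group_c = rng.normal(0, 3, 200)

reject, result = variances_differ([group_a, group_b, group_c], alpha=0.05)
print(result)
print("Reject equal variances:", reject)
```

Note that a rejection here only says that at least one group's variance differs from the others. It does not say which group is responsible.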

Origin blog.csdn.net/TSzero/article/details/111877496