Post hoc multiple comparison methods

1. Case introduction

In a one-way analysis of variance example, the effect of turmeric on the survival time of mice with hypotonic hypoxia was studied. Thirty-six mice were randomly assigned to three groups, A, B and C, with 12 mice in each group (half male and half female), and the groups were given three different oral doses of turmeric: 10 g/kg, 20 g/kg and 40 g/kg, respectively. The mice in each group were placed at the same time in 250 ml airtight jars containing soda lime, and their survival time was observed and recorded. The question is whether the survival time of the mice differs between doses. The analysis of variance is significant, which indicates that the population means of the three groups are not all equal. Whether the means of groups A, B and C all differ from one another, or only some pairs differ, needs further study (case data source: Yan Hong, Xu Yongyong. Medical Statistics [J]. People's Health Publishing House, 2015.).

2. Problem Analysis

The one-way analysis of variance is significant, which indicates that the population means are not all equal. It does not show whether the means of groups A, B and C all differ from one another or whether only some pairs differ. Answering that requires pairwise comparisons of the means using a multiple comparison method.
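The post hoc methods described later all reuse the error mean square from the analysis of variance. As a rough illustration of where that quantity comes from, here is a minimal sketch of the one-way ANOVA decomposition. The function name `one_way_anova` and the toy data are assumptions made for this example; they are not the mouse data from the case and not SPSSAU output.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, MS_between, MS_error, df_between, df_error) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group sum of squares: group size times squared deviation of the group mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Error (within-group) sum of squares: squared deviations from each group's own mean.
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_error = n_total - k
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    return ms_between / ms_error, ms_between, ms_error, df_between, df_error


# Hypothetical toy data, NOT the survival times from the case.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, ms_b, ms_e, df_b, df_e = one_way_anova(toy_groups)
print(f_stat, ms_e, df_b, df_e)
```

The F statistic is then compared with an F critical value at (df_between, df_error) degrees of freedom; only a significant result motivates the pairwise comparisons.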

3. Software operation and result interpretation

(1) Data import

1. Data format

First, organize the data into the correct format. Generally, X occupies one column and Y occupies another. If the data use coded labels, add a second table that explains them. The data format is as follows:

2. Import data

Upload the organized data to the SPSSAU system as follows:

The upload results are as follows:

(2) Post-hoc multiple comparison analysis

For the one-way analysis of variance procedure itself, refer to the previous article; it is not repeated here.

1. Software operation

After the one-way analysis of variance, carry out pairwise comparisons for further study. The analysis path is to click [General Method] → [Analysis of Variance] and then run the analysis:

2. Description of post hoc multiple comparison methods

There are many multiple comparison methods. They can generally be grouped by the probability distribution on which they are based and by how they were developed. The two most widely used families are the methods derived from the t test and those derived from the q test; there are also some methods based on the F distribution. They are described in detail below:

(1) LSD method

LSD (least significant difference) is the earliest multiple comparison method. It is very similar to the independent samples t test. The main differences are that the LSD method is applied only after the F test has reached significance, and that it uses the error mean square from the F test as the pooled variance. Because the calculation is simple and the power is high, the LSD method is widely used. Some researchers compared different post hoc multiple comparison methods by simulation and found that, when the F test in the analysis of variance is significant, the LSD method has the highest power. The LSD method also has an obvious shortcoming: when many pairs of means are compared, the probability of making a Type I error is high. Its calculation formula is as follows:

$$\mathrm{LSD} = t_{\alpha/2}\sqrt{MSE\left(\frac{1}{n_i}+\frac{1}{n_j}\right)}$$

$t_{\alpha/2}$ is the critical value of the t distribution, obtained from a t distribution table, with $n-k$ degrees of freedom, where n is the total sample size and k is the number of levels of the factor; MSE is the within-group mean square; $n_i$ and $n_j$ are the sample sizes of the i-th and j-th groups, respectively.

Make the decision at the significance level α: if the absolute value of the mean difference is greater than LSD, reject H0; otherwise do not reject H0.
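The LSD decision rule can be sketched in a few lines, assuming the usual form LSD = t critical value × sqrt(MSE × (1/ni + 1/nj)). The function names and all numeric inputs below are hypothetical; in particular `t_crit` stands for a value read from a t table at n − k degrees of freedom, and `mse` is made up rather than taken from the case.

```python
import math


def lsd_threshold(t_crit, mse, n_i, n_j):
    """Least significant difference for comparing groups i and j."""
    return t_crit * math.sqrt(mse * (1.0 / n_i + 1.0 / n_j))


def lsd_reject(mean_i, mean_j, t_crit, mse, n_i, n_j):
    """True if |mean_i - mean_j| exceeds the LSD, i.e. H0 is rejected."""
    return abs(mean_i - mean_j) > lsd_threshold(t_crit, mse, n_i, n_j)


# Hypothetical inputs: t_crit would come from a t table, mse from the ANOVA table.
print(lsd_threshold(t_crit=2.0, mse=24.0, n_i=12, n_j=12))
print(lsd_reject(40.0, 53.0, t_crit=2.0, mse=24.0, n_i=12, n_j=12))
```

Because the same threshold is applied to every pair without any adjustment, the chance of at least one false rejection grows with the number of pairs, which is the shortcoming noted above.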

(2) Scheffe method

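The Scheffe test covers all possible linear contrasts of the means. Its critical value is the between-group degrees of freedom multiplied by the critical value of the F test. The method is well suited to unbalanced designs, but its calculation is more complicated than that of the other methods. In symbols, the critical value is:

$$S = (k-1)\,F_{\alpha}(k-1,\;n-k)$$

where k is the number of groups, n is the total sample size, and $F_{\alpha}(k-1,\;n-k)$ is the critical value of the F distribution at significance level α.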
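For a simple pairwise comparison, the Scheffe rule described above can be sketched as follows. This assumes the common form in which the squared mean difference divided by MSE × (1/ni + 1/nj) is compared with (k − 1) times the F critical value. The function names are hypothetical, and `f_crit` stands for a value looked up in an F table.

```python
def scheffe_statistic(mean_i, mean_j, mse, n_i, n_j):
    """F-type statistic for the pairwise contrast mean_i - mean_j."""
    return (mean_i - mean_j) ** 2 / (mse * (1.0 / n_i + 1.0 / n_j))


def scheffe_critical(k, f_crit):
    """Between-group degrees of freedom (k - 1) times the F critical value."""
    return (k - 1) * f_crit


def scheffe_reject(mean_i, mean_j, mse, n_i, n_j, k, f_crit):
    """True if the contrast statistic exceeds the Scheffe critical value."""
    return scheffe_statistic(mean_i, mean_j, mse, n_i, n_j) > scheffe_critical(k, f_crit)


# Hypothetical inputs: three groups of 12, made-up means and MSE, table value f_crit.
print(scheffe_statistic(40.0, 52.0, mse=24.0, n_i=12, n_j=12))
print(scheffe_critical(k=3, f_crit=3.0))
print(scheffe_reject(40.0, 52.0, mse=24.0, n_i=12, n_j=12, k=3, f_crit=3.0))
```

The group sizes enter only through 1/ni + 1/nj, so the same code handles unequal group sizes without change.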

(3) Bonferroni correction

The Bonferroni inequality states that the probability that at least one of several events occurs is no greater than the sum of their individual probabilities. The Bonferroni correction therefore tests each of the m comparisons at the level α/m, which keeps the overall Type I error rate at or below α. However, this method is generally considered too conservative (it is harder to reject the null hypothesis), so it is suitable when the number of comparisons is not too large (the more comparisons, the more conservative it becomes). It is calculated as follows:

$$\alpha' = \frac{\alpha}{m}$$
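As a small illustration of the Bonferroni correction, the sketch below computes the per-comparison significance level α/m and the equivalent adjusted p value. The function names are hypothetical, and the raw p value is made up for the example.

```python
def n_pairwise(k):
    """Number of pairwise comparisons among k groups."""
    return k * (k - 1) // 2


def bonferroni_alpha(alpha, m):
    """Per-comparison significance level for m comparisons."""
    return alpha / m


def bonferroni_p(p_value, m):
    """Adjusted p value, capped at 1."""
    return min(1.0, p_value * m)


m = n_pairwise(3)  # three groups A, B, C give 3 pairs
print(m, bonferroni_alpha(0.05, m))
print(bonferroni_p(0.02, m))  # 0.02 is a hypothetical raw p value
```

With three groups there are only three pairs, so the correction is mild; with ten groups there are 45 pairs and each test must reach roughly the 0.0011 level.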

(4) Sidak

The basic idea of the Sidak method is close to that of the Bonferroni method, and it is generally slightly less conservative. It treats the comparisons as independent when deriving the common significance level applied to each hypothesis test.
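A minimal sketch of the Sidak adjustment, assuming the standard form 1 − (1 − α)^(1/m) for m comparisons, shown next to the Bonferroni level α/m for reference. The function name is hypothetical.

```python
def sidak_alpha(alpha, m):
    """Per-comparison significance level under the Sidak adjustment."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


alpha, m = 0.05, 3  # three pairwise comparisons among groups A, B and C
print(sidak_alpha(alpha, m))  # Sidak per-comparison level
print(alpha / m)              # Bonferroni per-comparison level, for reference
```

The Sidak level is always at least as large as the Bonferroni level, which is why it rejects slightly more often for the same data.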

(5) Tamhane T2 (unequal variances)

If the variances are not homogeneous but multiple comparisons are still wanted, use this method. It requires the data to be normally distributed, but it does not require homogeneity of variance.

(6) SNK Q test

The basic goal of the SNK (Student-Newman-Keuls) method is to divide the treatment means into subsets that are homogeneous internally but differ from one another. SNK is therefore also called a stratified test: the critical value used to judge significance depends on the range spanned by the two means being compared. Because its statistic is q, it is also called the q test. The statistic is calculated as follows:

$$q = \frac{\bar{X}_A - \bar{X}_B}{\sqrt{\dfrac{MS_{\text{error}}}{2}\left(\dfrac{1}{n_A}+\dfrac{1}{n_B}\right)}}$$

In the formula, the numerator is the difference between the sample means of any two comparison groups A and B, and the denominator is the standard error of the difference. $n_A$ and $n_B$ are the numbers of cases in samples A and B, respectively, and $MS_{\text{error}}$ is the error mean square calculated in the preceding analysis of variance.
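The q statistic can be sketched as below, assuming the usual textbook form in which the standard error is sqrt(MS_error / 2 × (1/nA + 1/nB)). The function names and the numeric inputs are hypothetical, and the absolute difference is used so that the order of the two groups does not matter.

```python
import math


def snk_standard_error(ms_error, n_a, n_b):
    """Standard error of the difference used in the q statistic."""
    return math.sqrt(ms_error / 2.0 * (1.0 / n_a + 1.0 / n_b))


def snk_q(mean_a, mean_b, ms_error, n_a, n_b):
    """q statistic for two group means (absolute difference over standard error)."""
    return abs(mean_a - mean_b) / snk_standard_error(ms_error, n_a, n_b)


# Hypothetical inputs; ms_error would come from the ANOVA table.
print(snk_standard_error(ms_error=24.0, n_a=12, n_b=12))
print(snk_q(40.0, 52.0, ms_error=24.0, n_a=12, n_b=12))
```

With equal group sizes n the standard error reduces to sqrt(MS_error / n), so every pair shares the same value.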

(7) Duncan test

Duncan's new multiple range method. Much of the current statistical literature does not recommend this method, because the derivation of the error rate of Duncan's multiple comparison test relies on a monotonicity condition (monotonically increasing or monotonically decreasing, without oscillation). This is also why Duncan's new multiple range method is mostly seen in the results of animal experiments.

3. Make pairwise comparisons

In this example, the SNK Q test is used for pairwise comparisons. The result is as follows:

ANOVA table:

The SNK Q test is as follows:

1) Average value

When group A is compared with group B, the two average values are the means of the group A data and the group B data, respectively. The same applies to the other pairs.

2) Absolute value of difference

|(I) average value - (J) average value| = absolute value of difference, for example |40.083-52.958|=12.875, and so on.

3) K

The critical value of the q test depends not only on the degrees of freedom but also on the difference in rank between the two means being compared, which is denoted K here.

The ranks of Group A, Group B and Group C are as follows:

For example, Group A and Group B: K=|1-2|+1=2; and so on.
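The rule K = |rank difference| + 1 can be sketched as follows. The function name and the group means are hypothetical; the means are chosen only so that A < B < C, and ties between means are not handled.

```python
from itertools import combinations


def step_counts(means):
    """Map each pair of group names to K = |rank_i - rank_j| + 1."""
    ordered = sorted(means, key=means.get)  # group names from smallest to largest mean
    rank = {name: position + 1 for position, name in enumerate(ordered)}
    return {(a, b): abs(rank[a] - rank[b]) + 1 for a, b in combinations(means, 2)}


# Hypothetical means, used only to produce the ranks A=1, B=2, C=3.
example_means = {"A": 40.0, "B": 53.0, "C": 60.0}
print(step_counts(example_means))
```

Adjacent means give K=2, while the smallest and largest of three means give K=3, so the extreme pair is judged against a larger critical value.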

4) df

df is the degrees of freedom, here the error degrees of freedom. There are 3 groups of 12 samples each, so the total degrees of freedom are 12×3-1=35, the degrees of freedom of the independent variable are 3-1=2, and the error degrees of freedom are 35-2=33.
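The degrees of freedom for a one-way design follow directly from the group sizes, and a short check like the one below helps avoid slips in the arithmetic. The function name is hypothetical.

```python
def anova_degrees_of_freedom(group_sizes):
    """Total, between-group and error degrees of freedom for a one-way design."""
    n_total = sum(group_sizes)
    k = len(group_sizes)
    return {"total": n_total - 1, "between": k - 1, "error": n_total - k}


# Three groups of 12 mice, as in the case.
print(anova_degrees_of_freedom([12, 12, 12]))
# {'total': 35, 'between': 2, 'error': 33}
```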

5) 5% and 1%

According to K and the degrees of freedom, look up the critical values in the q table at the 5% and 1% levels to judge significance.

6) se

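se is the standard error of the difference, that is, the denominator of the q statistic. It is calculated as follows:

$$se = \sqrt{\frac{MS_{\text{error}}}{2}\left(\frac{1}{n_A}+\frac{1}{n_B}\right)}$$

where $MS_{\text{error}}$ is the error mean square from the analysis of variance and $n_A$ and $n_B$ are the sample sizes of the two groups being compared.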

7) LSR value

LSR value=Q critical value*se, such as 2.887*1.665=4.792; and so on;
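The final decision compares the absolute mean difference with the LSR value. A minimal sketch follows, with hypothetical function names and made-up inputs: `q_crit` stands for a value read from the q table for the given K and error degrees of freedom, and `se` is the standard error from step 6.

```python
def lsr(q_crit, se):
    """Least significant range: q critical value times the standard error."""
    return q_crit * se


def snk_significant(mean_i, mean_j, q_crit, se):
    """True if the absolute mean difference exceeds the LSR value."""
    return abs(mean_i - mean_j) > lsr(q_crit, se)


# Hypothetical inputs, not the values from the result table.
print(lsr(q_crit=3.0, se=1.5))
print(snk_significant(40.0, 53.0, q_crit=3.0, se=1.5))
```

Running the check once with the 5% critical value and once with the 1% critical value reproduces the two significance columns of the result table.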

4. Conclusion

The one-way analysis of variance found a significant difference, indicating that the population means of the three groups are not all equal. Post hoc multiple comparisons with the SNK q test showed that the difference between group A and group B is significant at the 0.01 level, and that the differences between group A and group C and between group B and group C are also significant at the 0.01 level, indicating that all three groups differ significantly from one another.

5. Knowledge Tips

1. How to do Dunnett's post hoc multiple comparisons?

If Dunnett's post hoc test is required, you can use the non-parametric test in the SPSSAU general method, and select the multiple comparison method as Dunn's t method.

2. How to solve the prompt 'Data Quality Abnormal'?

This warning appears when some group of X contains fewer than 2 values of Y. It is recommended to check the group sizes with the classification and summary function. After confirming the problem, use the sample filter function to handle it and run the analysis again.

Origin blog.csdn.net/m0_37228052/article/details/131978278