Statistics (Jia Junping): Thinking Questions, Chapter 10, Analysis of Variance

1. What is ANOVA? What does it study?

Answer: Analysis of variance (ANOVA) tests whether the means of several populations are equal in order to judge whether a categorical independent variable has a significant effect on a numerical dependent variable.

On the surface, ANOVA is a statistical method for testing whether several population means are equal. In essence, it studies the effect of categorical independent variables on a numerical dependent variable: for example, whether the variables are related and how strong the relationship is.

2. When you want to test whether multiple population means are equal, why use analysis of variance rather than pairwise comparisons?

Answer: Testing the equality of several population means by pairwise comparison requires many separate t tests. This is cumbersome, and as the number of individual significance tests grows, so does the probability that chance alone produces at least one apparently significant difference (a Type I error, not a real difference in means). ANOVA considers all samples in a single test, so the Type I error does not accumulate across tests and the risk of wrongly rejecting a true null hypothesis stays at the chosen significance level. Because it pools the information from all samples, ANOVA also improves both the efficiency of the test and the reliability of the analysis. For these reasons, ANOVA is the usual choice for testing the equality of several population means.
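The accumulation of error can be made concrete with a small sketch. It assumes, as a simplification, that the pairwise tests are independent (in practice they share samples, so the figures are only approximate); the function name `familywise_error_rate` is illustrative, not taken from the textbook.

```python
from math import comb


def familywise_error_rate(k: int, alpha: float = 0.05) -> float:
    """Probability of at least one Type I error when all pairs of k means
    are tested separately at level alpha (assumes independent tests)."""
    m = comb(k, 2)  # number of pairwise comparisons among k groups
    return 1 - (1 - alpha) ** m


for k in (3, 4, 5):
    print(k, comb(k, 2), round(familywise_error_rate(k), 4))
# 3 3 0.1426
# 4 6 0.2649
# 5 10 0.4013
```

With only four groups, six t tests at α = 0.05 already give roughly a 26% chance of at least one false rejection under this simplification.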

3. What are the types of analysis of variance? How do they differ?

Answer: (1) According to the number of categorical independent variables analyzed, ANOVA is divided into one-way ANOVA and two-way ANOVA.

(2) Difference:

① One-way ANOVA studies the effect of one categorical independent variable on a numerical dependent variable;

② Two-way ANOVA studies the effect of two categorical independent variables on a numerical dependent variable.

4. What are the basic assumptions in ANOVA?

Answer: There are three basic assumptions in ANOVA:

(1) Each population follows a normal distribution. That is, for each level of a factor, the observations are a simple random sample from a normally distributed population.

(2) The variance $\sigma^2$ of each population is the same. That is, every group of observations is drawn from a normal population with the same variance.

(3) The observations are independent.

5. Briefly describe the basic idea of analysis of variance.

Answer: The basic idea of analysis of variance is to decompose the total variation in the data by source and to compare the contribution of each source to the total; this shows how much influence the controllable factors have on the result under study.
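As a minimal sketch of this decomposition for the one-way case, the snippet below splits the total sum of squares (SST) into a between-group part (SSA) and a within-group part (SSE). The data are invented purely for illustration, and the group names only echo the industry example used in these notes.

```python
# Hypothetical complaint counts for three levels of one factor (invented data).
groups = {
    "retail": [2, 4, 6],
    "tourism": [5, 7, 9],
    "airline": [8, 10, 12],
}


def mean(values):
    return sum(values) / len(values)


all_values = [x for g in groups.values() for x in g]
grand_mean = mean(all_values)

# Total variation: every observation against the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_values)
# Between-group variation: each group mean against the grand mean,
# weighted by the group size.
ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
# Within-group variation: every observation against its own group mean.
sse = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

print(sst, ssa, sse)  # 78.0 54.0 24.0
```

Here SST = SSA + SSE (78 = 54 + 24), so most of the variation in this invented data comes from differences between the groups.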

6. Explain the meaning of factors and treatments.

Answer: In ANOVA, the object being tested is called a factor, and the different states a factor can take are called levels or treatments. For example, when analyzing whether the industry (retail, tourism, airlines, home appliance manufacturing) has a significant effect on the number of complaints, "industry" is the object being tested and is therefore the factor; retail, tourism, airlines, and home appliance manufacturing are the different states of the factor "industry" and are its levels or treatments.

7. Explain the meaning of within-group error and between-group error.

Answer: (1) Within-group error is the data error that arises within a single level. It comes from the randomness of sampling, so it contains only random error.

(2) Between-group error is the data error between different levels. It may come from the random error of sampling itself or from systematic error caused by systematic differences between the levels (in the complaints example, differences between the industries themselves). The between-group error is therefore the sum of random error and systematic error.

8. Explain the meaning of the within-group variance and the between-group variance.

Answer: The within-group sum of squares SSE divided by its degrees of freedom is called the within-group mean square or within-group variance, denoted MSE:

MSE = within-group sum of squares / degrees of freedom = SSE / (n-k)

The between-group sum of squares SSA divided by its degrees of freedom is called the between-group mean square or between-group variance, denoted MSA:

MSA = between-group sum of squares / degrees of freedom = SSA / (k-1)

Here n is the total number of observations and k is the number of levels of the factor.
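The two formulas can be sketched in a few lines of Python. The helper name `mean_squares` and the data are hypothetical and serve only to illustrate the calculation.

```python
def mean(values):
    return sum(values) / len(values)


def mean_squares(groups):
    """Return (MSA, MSE) for a list of groups, one group per factor level."""
    n = sum(len(g) for g in groups)  # total number of observations
    k = len(groups)                  # number of levels
    grand_mean = sum(sum(g) for g in groups) / n
    ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return ssa / (k - 1), sse / (n - k)


# Invented data: three levels with three observations each.
msa, mse = mean_squares([[2, 4, 6], [5, 7, 9], [8, 10, 12]])
print(msa, mse)  # 27.0 4.0
```

With n = 9 and k = 3, the degrees of freedom are k-1 = 2 and n-k = 6, giving MSA = 54/2 and MSE = 24/6.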

9. Briefly describe the basic steps of analysis of variance.

Answer: (1) The basic steps of one-way ANOVA include:

① State the hypotheses

H0: μ1=μ2=…=μi=…=μk, the independent variable has no significant effect on the dependent variable

H1: μi (i=1, 2, ..., k) are not all equal, the independent variable has a significant impact on the dependent variable

② Construct the test statistic

F = between-group variance MSA / within-group variance MSE ~ F(k-1, n-k)

③ Make the statistical decision

If F>Fα, reject the null hypothesis H0: μ1=μ2=…=μk; the differences among μi (i=1, 2, …, k) are significant.

If F<Fα, do not reject the null hypothesis H0; there is no evidence of a significant difference among μi (i=1, 2, …, k).
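The three steps of one-way ANOVA can be sketched as follows. The data are invented, `one_way_f` is a hypothetical helper, and the critical value is assumed to have been looked up in an F table (F0.05(2, 6) ≈ 5.14) rather than computed.

```python
def mean(values):
    return sum(values) / len(values)


def one_way_f(groups):
    """Return the F statistic and its degrees of freedom (k-1, n-k)."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    msa, mse = ssa / (k - 1), sse / (n - k)
    return msa / mse, (k - 1, n - k)


# Assumption: critical value read from an F table for alpha = 0.05, df = (2, 6).
F_CRIT = 5.14

f_stat, df = one_way_f([[2, 4, 6], [5, 7, 9], [8, 10, 12]])
decision = "reject H0" if f_stat > F_CRIT else "do not reject H0"
print(f_stat, df, decision)  # 6.75 (2, 6) reject H0
```

Passing the critical value in as a constant keeps the sketch free of third-party dependencies; a statistics library could supply it instead.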

(2) The basic steps of two-factor ANOVA without interaction include:

① State the hypotheses

The hypotheses for the row factor are:

H0: μ1=μ2=…=μi=…=μk, the row factor (independent variable) has no significant effect on the dependent variable

H1: μi (i=1, 2, …, k) are not all equal, the row factor (independent variable) has a significant effect on the dependent variable

The hypotheses for the column factor are:

H0: μ1=μ2=…=μj=…=μr, the column factor (independent variable) has no significant effect on the dependent variable

H1: μj (j=1, 2, …, r) are not all equal, the column factor (independent variable) has a significant effect on the dependent variable

② Construct the test statistics

Statistic for testing whether the effect of the row factor on the dependent variable is significant:

FR = mean square of the row factor MSR / mean square of the random error MSE ~ F(k-1, (k-1)(r-1))

Statistic for testing whether the effect of the column factor on the dependent variable is significant:

FC = mean square of the column factor MSC / mean square of the random error MSE ~ F(r-1, (k-1)(r-1))

③ Make the statistical decision

If FR>Fα, reject the null hypothesis H0: μ1=μ2=…=μi=…=μk; the differences among μi (i=1, 2, …, k) are significant. That is, the row factor being tested has a significant effect on the observed values.

If FC>Fα, reject the null hypothesis H0: μ1=μ2=…=μj=…=μr; the differences among μj (j=1, 2, …, r) are significant. That is, the column factor being tested has a significant effect on the observed values.
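A minimal sketch of the two-factor case without interaction, assuming one observation per row-column cell. The table values are invented, the function name is hypothetical, and the critical value is assumed to come from an F table (F0.05(2, 4) ≈ 6.94).

```python
def two_way_no_interaction(table):
    """table[i][j] is the single observation for row level i, column level j.
    Returns (FR, FC)."""
    k, r = len(table), len(table[0])
    grand = sum(map(sum, table)) / (k * r)
    row_means = [sum(row) / r for row in table]
    col_means = [sum(table[i][j] for i in range(k)) / k for j in range(r)]

    ssr = r * sum((m - grand) ** 2 for m in row_means)   # row factor
    ssc = k * sum((m - grand) ** 2 for m in col_means)   # column factor
    sst = sum((x - grand) ** 2 for row in table for x in row)
    sse = sst - ssr - ssc                                # random error

    msr = ssr / (k - 1)
    msc = ssc / (r - 1)
    mse = sse / ((k - 1) * (r - 1))
    return msr / mse, msc / mse


# Invented 3 x 3 table: k = 3 row levels, r = 3 column levels.
table = [[1, 2, 3], [4, 6, 8], [7, 10, 13]]
F_CRIT = 6.94  # assumption: F table value for alpha = 0.05, df = (2, 4)

fr, fc = two_way_no_interaction(table)
print(fr, fc)                    # 48.0 12.0
print(fr > F_CRIT, fc > F_CRIT)  # True True
```

Because k = r = 3 here, both tests share the same degrees of freedom and therefore the same critical value; in general each statistic needs its own.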

(3) The basic steps of two-factor ANOVA with interaction include:

① State the hypotheses

The hypotheses for the row factor are:

H0: μ1=μ2=…=μi=…=μk, the row factor (independent variable) has no significant effect on the dependent variable

H1: μi (i=1, 2, …, k) are not all equal, the row factor (independent variable) has a significant effect on the dependent variable

The hypotheses for the column factor are:

H0: μ1=μ2=…=μj=…=μr, the column factor (independent variable) has no significant effect on the dependent variable

H1: μj (j=1, 2, …, r) are not all equal, the column factor (independent variable) has a significant effect on the dependent variable

The hypotheses for the interaction are:

H0: μ1=μ2=…=μt=…=μm, the interaction has no significant effect on the dependent variable

H1: μt (t=1, 2, …, m) are not all equal, the interaction has a significant effect on the dependent variable

② Construct the test statistics

Statistic for testing whether the effect of the row factor on the dependent variable is significant:

FR = mean square of the row factor MSR / mean square of the error MSE ~ F(k-1, kr(m-1))

Statistic for testing whether the effect of the column factor on the dependent variable is significant:

FC = mean square of the column factor MSC / mean square of the error MSE ~ F(r-1, kr(m-1))

Statistic for testing whether the effect of the interaction is significant:

FRC = mean square of the interaction MSRC / mean square of the error MSE ~ F((k-1)(r-1), kr(m-1))

③ Make the statistical decision

If FR>Fα, reject the null hypothesis H0: μ1=μ2=…=μi=…=μk; the differences among μi (i=1, 2, …, k) are significant. That is, the row factor being tested has a significant effect on the observed values.

If FC>Fα, reject the null hypothesis H0: μ1=μ2=…=μj=…=μr; the differences among μj (j=1, 2, …, r) are significant. That is, the column factor being tested has a significant effect on the observed values.

If FRC>Fα, reject the null hypothesis H0: μ1=μ2=…=μt=…=μm; the differences among μt (t=1, 2, …, m) are significant. That is, the interaction being tested has a significant effect on the observed values.
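A sketch of the calculation with interaction, assuming a balanced design in which every row-column cell holds the same number m of replicate observations (this reading of m follows the error degrees of freedom kr(m-1) above). The data and the function name are invented, and the critical value is assumed to come from an F table (F0.05(1, 4) ≈ 7.71).

```python
def mean(values):
    return sum(values) / len(values)


def two_way_with_interaction(cells):
    """cells[i][j] is the list of m replicates for row i, column j (balanced).
    Returns (FR, FC, FRC)."""
    k, r, m = len(cells), len(cells[0]), len(cells[0][0])
    cell_means = [[mean(c) for c in row] for row in cells]
    row_means = [mean(row) for row in cell_means]
    col_means = [mean([cell_means[i][j] for i in range(k)]) for j in range(r)]
    grand = mean(row_means)

    ssr = r * m * sum((a - grand) ** 2 for a in row_means)
    ssc = k * m * sum((b - grand) ** 2 for b in col_means)
    ssrc = m * sum(
        (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(k) for j in range(r)
    )
    sse = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(k) for j in range(r) for x in cells[i][j]
    )

    mse = sse / (k * r * (m - 1))
    fr = ssr / (k - 1) / mse
    fc = ssc / (r - 1) / mse
    frc = ssrc / ((k - 1) * (r - 1)) / mse
    return fr, fc, frc


# Invented 2 x 2 design with m = 2 replicates per cell.
cells = [[[1, 3], [5, 7]],
         [[9, 11], [5, 7]]]
F_CRIT = 7.71  # assumption: F table value for alpha = 0.05, df = (1, 4)

fr, fc, frc = two_way_with_interaction(cells)
print(fr, fc, frc)  # 16.0 0.0 16.0
```

In this invented data the column factor has no main effect at all (FC = 0), yet the interaction is significant: the column level changes the response in opposite directions in the two rows.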

10. What is the role of multiple comparisons in ANOVA?

Answer: Multiple comparison methods follow up on ANOVA by comparing the population means pairwise to find out which means differ from one another. Many such methods exist; the least significant difference (LSD) method proposed by Fisher is commonly used.
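As a rough sketch of the LSD method: two means are declared different when the absolute difference of their sample means exceeds LSD = t(α/2, n-k) · sqrt(MSE · (1/ni + 1/nj)). The data and the helper name `lsd_pairs` are invented, and the t value is assumed to come from a t table (t0.025(6) ≈ 2.447).

```python
from itertools import combinations
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def lsd_pairs(groups, t_crit):
    """Map each pair of group names to True if their means differ by more
    than the least significant difference."""
    n = sum(len(g) for g in groups.values())
    k = len(groups)
    mse = sum((x - mean(g)) ** 2 for g in groups.values() for x in g) / (n - k)
    results = {}
    for (name_a, a), (name_b, b) in combinations(groups.items(), 2):
        lsd = t_crit * sqrt(mse * (1 / len(a) + 1 / len(b)))
        results[(name_a, name_b)] = abs(mean(a) - mean(b)) > lsd
    return results


# Invented data; group names echo the industry example.
groups = {"retail": [2, 4, 6], "tourism": [5, 7, 9], "airline": [8, 10, 12]}
T_CRIT = 2.447  # assumption: t table value for alpha/2 = 0.025, df = n-k = 6

for pair, differs in lsd_pairs(groups, T_CRIT).items():
    print(pair, differs)
# ('retail', 'tourism') False
# ('retail', 'airline') True
# ('tourism', 'airline') False
```

Here the LSD is about 4.0, so only the retail and airline means (difference 6) are judged different, even though the overall F test for the same data rejects H0.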

11. What is an interaction?

Answer: Interaction means that the effect of one factor on the dependent variable depends on the level of another factor. For example, in two-factor ANOVA an interaction exists when the combination of the two factors produces an additional effect on the dependent variable beyond their separate effects.

12. Explain two-way ANOVA without interaction and with interaction.

Answer: If the two factors affect the dependent variable independently of each other, only the separate effect of each factor on the dependent variable needs to be tested. This is called two-factor ANOVA without interaction.

If, in addition to their separate effects, the combination of the two factors produces a new effect on the dependent variable, the analysis is called two-factor ANOVA with interaction.

13. Explain the meaning and function of $R^2$.

Answer: (1) In one-way ANOVA, $R^2$ is the proportion of the total sum of squares (SST) accounted for by the between-group sum of squares (SSA), and its square root R reflects the strength of the relationship between the two variables. It is calculated as:

$R^2$ = between-group sum of squares SSA / total sum of squares SST = SSA/(SSA+SSE)

(2) In two-factor ANOVA without interaction, the row sum of squares and the column sum of squares together measure the joint effect of the two independent variables on the dependent variable. The ratio of this joint effect to the total sum of squares is defined as $R^2$, and its square root R reflects the strength of the relationship between the two independent variables and the dependent variable. That is,

$R^2$ = joint effect / total effect = (SSR+SSC)/SST = (SSR+SSC)/(SSR+SSC+SSE)

(3) In two-factor ANOVA with interaction, $R^2$ is defined as:

$R^2$ = (SSR+SSC+SSRC)/SST = (SSR+SSC+SSRC)/(SSR+SSC+SSRC+SSE)

where SSRC is the sum of squares of the interaction.
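All three definitions share one pattern: the sums of squares attributed to the effects divided by the total. A small sketch, with a hypothetical helper `r_squared` and invented sums of squares:

```python
from math import sqrt


def r_squared(effect_sums_of_squares, sse):
    """R^2 = explained sums of squares / (explained + error sum of squares)."""
    explained = sum(effect_sums_of_squares)
    return explained / (explained + sse)


# Invented sums of squares for each of the three cases.
r2_one_way = r_squared([54.0], 24.0)               # SSA; SSE
r2_no_interaction = r_squared([96.0, 24.0], 4.0)   # SSR, SSC; SSE
r2_interaction = r_squared([32.0, 0.0, 32.0], 8.0) # SSR, SSC, SSRC; SSE

print(round(r2_one_way, 4), round(sqrt(r2_one_way), 4))  # 0.6923 0.8321
print(round(r2_no_interaction, 4))                       # 0.9677
print(round(r2_interaction, 4))                          # 0.8889
```

In the one-way case, for instance, about 69% of the total variation is attributed to the factor, and R ≈ 0.83 indicates a fairly strong relationship in this invented data.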

Origin blog.csdn.net/J__aries/article/details/130858279