ANOVA Notes

 

Analysis of variance (ANOVA) is used to test whether the means of several populations differ, based on samples drawn from them. The method was first proposed by R. A. Fisher and later refined by G. W. Snedecor. In honor of Fisher, the test used in analysis of variance is called the F test.

 
The basic idea of ANOVA is to first split the total variation into between-group and within-group variation, and then compute the F value as the ratio of the between-group mean square to the within-group mean square. The larger the F value, the larger the differences between groups relative to random error, which suggests that the treatment has an effect. If F is small, the observed differences can be attributed to random error.
 
Sum of squared deviations: the sum of the squares of the differences between each observation and the mean.
Let x be a random variable and let η = x - Ex; then η is called the deviation of x. It reflects how far x departs from its mathematical expectation Ex.
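
As a small illustration of the two ideas above, the following Python sketch computes sums of squared deviations and checks that the total variation splits into a between-group part and a within-group part. The data and the helper name sum_sq_dev are made up for this example and do not come from the original notes.

```python
def sum_sq_dev(values, center=None):
    """Sum of squared deviations of values from center (default: their own mean)."""
    if center is None:
        center = sum(values) / len(values)
    return sum((v - center) ** 2 for v in values)


# Hypothetical data: three treatment groups with five observations each.
groups = [[4, 5, 6, 5, 5], [6, 7, 8, 7, 7], [8, 9, 10, 9, 9]]

all_values = [v for g in groups for v in g]
grand_mean = sum(all_values) / len(all_values)

# Total variation: every observation against the grand mean.
ss_total = sum_sq_dev(all_values)
# Within-group variation: every observation against its own group mean.
ss_within = sum(sum_sq_dev(g) for g in groups)
# Between-group variation: each group mean against the grand mean, weighted by group size.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

print(ss_total, ss_between, ss_within)  # 46.0 40.0 6.0
```

For this data the total (46) equals the between-group part (40) plus the within-group part (6).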


DF - degrees of freedom for each source. Within-group DF = n - m, between-group DF = m - 1, where n is the total number of observations and m is the number of groups
SSb - Between-group sum of squared deviations (factor)
SSw - Within-group sum of squared deviations (error)
MS - Mean square: the sum of squares divided by its degrees of freedom.
F - MS between/MS within; this ratio can be compared to the critical F found in the table, or the p-value can be used to determine if a factor is significant.
P - Used to determine whether a factor is significant; usually compared to an alpha value of 0.05. The factor is significant if the p-value is below 0.05.
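
The quantities listed above can be put together in a few lines of Python. This is only a sketch under stated assumptions: the function name one_way_anova and the data are hypothetical, the within-group degrees of freedom are taken as n - m, and the p-value is left out because it needs the F distribution.

```python
def one_way_anova(groups):
    """Return DF, SS, MS and F for a one-way ANOVA on a list of groups."""
    m = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Between-group and within-group sums of squared deviations.
    ss_b = sum(len(g) * (mean - grand_mean) ** 2 for g, mean in zip(groups, means))
    ss_w = sum((v - mean) ** 2 for g, mean in zip(groups, means) for v in g)

    df_b, df_w = m - 1, n - m            # degrees of freedom for each source
    ms_b, ms_w = ss_b / df_b, ss_w / df_w
    return {"df_b": df_b, "df_w": df_w, "ss_b": ss_b, "ss_w": ss_w,
            "ms_b": ms_b, "ms_w": ms_w, "F": ms_b / ms_w}


# Hypothetical data: three groups, five observations each.
table = one_way_anova([[4, 5, 6, 5, 5], [6, 7, 8, 7, 7], [8, 9, 10, 9, 9]])
for name, value in table.items():
    print(name, value)
```

Returning a dictionary keeps the output close to the rows of an ANOVA table, so each entry can be checked against a spreadsheet result.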
 
Each F table is labeled with a significance level α, such as 0.05 or 0.01.
α: the significance level, usually 0.05, which corresponds to a 95% confidence level.
 
The calculated F value is then compared with the critical value from the F table (at the chosen significance level, with the between-group and within-group degrees of freedom):
F < F table indicates that there is no significant difference between the groups;
F ≥ F table indicates that there is a significant difference between the groups.
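
When no printed table is at hand, the critical value can be approximated numerically. The sketch below uses only the Python standard library: it evaluates the upper tail of the F distribution through the regularized incomplete beta function (continued-fraction method) and finds the critical value by bisection. All function names are hypothetical, and the F value of 40 with (2, 12) degrees of freedom is a made-up example rather than a result from these notes.

```python
import math


def _betacf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def _reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def f_upper_tail(f_value, df_b, df_w):
    """P(F >= f_value) for an F distribution with (df_b, df_w) degrees of freedom."""
    if f_value <= 0.0:
        return 1.0
    x = df_w / (df_w + df_b * f_value)
    return _reg_inc_beta(df_w / 2.0, df_b / 2.0, x)


def f_critical(alpha, df_b, df_w):
    """Critical F value whose upper-tail probability equals alpha (bisection)."""
    lo, hi = 0.0, 1.0
    while f_upper_tail(hi, df_b, df_w) > alpha:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if f_upper_tail(mid, df_b, df_w) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical example: F = 40 with 2 and 12 degrees of freedom, alpha = 0.05.
f_value, df_b, df_w, alpha = 40.0, 2, 12, 0.05
f_table = f_critical(alpha, df_b, df_w)
significant = f_value >= f_table
print(f_table, significant)
```

Standard F tables list a critical value of about 3.89 for α = 0.05 with (2, 12) degrees of freedom, which gives a quick sanity check. The same f_upper_tail function also returns the p-value for a calculated F.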
 
In addition, a P value can be used to decide whether a hypothesis holds. For example, when testing a population proportion p against a hypothesized value p0 with a z test, the P value is:
P = 2[1-Φ(|z0|)] when the alternative hypothesis H1 is that p is not equal to p0;
P = 1-Φ(z0) when H1 is that p is greater than p0;
P = Φ(z0) when H1 is that p is less than p0.
Here Φ is the standard normal distribution function, whose values are obtained by looking up a table, and
z0 = (x - n*p0) / sqrt(n*p0*(1-p0))
where x is the observed number of successes and n is the sample size.
Finally, when the P value is less than the significance level (usually 0.05, denoted α; problem statements sometimes confuse the P value with α), we reject the null hypothesis. Otherwise, the null hypothesis cannot be rejected.
Note that p0 here is the hypothesized proportion (for example, a hypothesized satisfaction rate), not the P value being computed.
Without p0 there is no hypothesis to test, and therefore no P value.
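
The P value formulas for a proportion can be sketched in Python as follows. The function names, the alternative argument, and the sample numbers (60 successes out of 100 against p0 = 0.5) are assumptions made for illustration. Φ is computed with math.erf instead of a table, and the two-sided case uses the absolute value of z0.

```python
import math


def phi(z):
    """Standard normal distribution function, computed with math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def proportion_p_value(x, n, p0, alternative="two-sided"):
    """P value of a z test for a proportion (normal approximation).

    x: observed number of successes, n: sample size, p0: hypothesized proportion.
    """
    z0 = (x - n * p0) / math.sqrt(n * p0 * (1.0 - p0))
    if alternative == "two-sided":   # H1: p is not equal to p0
        return 2.0 * (1.0 - phi(abs(z0)))
    if alternative == "greater":     # H1: p is greater than p0
        return 1.0 - phi(z0)
    if alternative == "less":        # H1: p is less than p0
        return phi(z0)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


# Hypothetical example: 60 successes in 100 trials, tested against p0 = 0.5.
p_two_sided = proportion_p_value(60, 100, 0.5)
p_greater = proportion_p_value(60, 100, 0.5, alternative="greater")
alpha = 0.05
print(p_two_sided, p_two_sided < alpha)
```

Here z0 = (60 - 50) / 5 = 2, so the two-sided P value is roughly 0.046, slightly below α = 0.05.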
 
An example of using Excel to perform ANOVA

 
After the data has been entered, go to Tools > Data Analysis and select one-way ANOVA (Anova: Single Factor) in the Data Analysis dialog box.
The dialog box shown in Figure 3-2 appears; its contents are as follows:
1. Input range: select the range that contains the data to analyze; the level labels may be included. For Figure 3-1, select the green and yellow areas for analysis.
2. Grouped by: choose columns or rows. Select rows when the data of the same level are in the same row, and select columns when they are in the same column; rows are selected in this example.
3. Labels: if the level labels were included when selecting the data, check the option indicating that the labels are in the first row; it is checked in this example.
4. α: the significance level, usually 0.05, which corresponds to a 95% confidence level.
5. Output options: choose where to store the analysis results, as required.
After the options are entered as in Figure 3-2, the analysis result for the data in Figure 3-1 is shown in Figure 3-3.

 
