At its simplest, analysis of variance (ANOVA) is a method to test the hypothesis that two or more sample means could have been obtained from identical populations - against an alternate hypothesis that these samples represent populations that have different means, but are otherwise identical. Units 12 to 14 show how ANOVA goes much further than this, by providing a means to model the effects of one or more factors each at a number of levels on the dependent variable. Here we focus on one-way fixed effects ANOVA. This is where we have a single 'treatment' factor (= group) with several levels, and replicated observations at each level. We are interested in comparing the means of the observations between the different levels. The levels being compared are fixed by the researcher, rather than being chosen at random.
Analysis of variance compares means by looking at variances. It compares two estimates of the null population's variance:
- One is derived from the variance of the sample means about their overall (grand) mean, and is known as the between groups (or treatment) mean square.
- The other is obtained by combining the variances of each group of observations about their respective group mean, and is known as the within groups (or error or residual) mean square.
When the null hypothesis is true (that is, the sample means are obtained from identical populations), and provided those populations are normal, random selection causes the ratio of these two estimates of the population variance to be F-distributed. Under the alternate hypothesis the between groups mean square is inflated, so the distribution of this F-ratio statistic is shifted to the right, towards larger values, compared to its null distribution.
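The behaviour of the F-ratio under the two hypotheses can be illustrated by simulation. The following is a minimal sketch, not a definitive implementation: the population means, group size, standard deviation and number of repetitions are all arbitrary values chosen for illustration, and `f_ratio` and `simulate_f` are hypothetical helper names.

```python
import random
import statistics


def f_ratio(groups):
    """Return the between-groups mean square divided by the within-groups mean square."""
    a = len(groups)
    N = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / N
    means = [statistics.fmean(g) for g in groups]
    ss_groups = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return (ss_groups / (a - 1)) / (ss_within / (N - a))


def simulate_f(population_means, n=10, sigma=1.0, reps=2000, seed=1):
    """Draw normal samples repeatedly and collect the resulting F-ratios."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(reps):
        groups = [[rng.gauss(m, sigma) for _ in range(n)] for m in population_means]
        ratios.append(f_ratio(groups))
    return ratios


# Null hypothesis: identical populations (assumed illustrative values).
null_f = simulate_f([5.0, 5.0, 5.0])
# Alternate hypothesis: population means differ, variances do not.
alt_f = simulate_f([5.0, 5.5, 6.0])

print("median F, null:", round(statistics.median(null_f), 2))
print("median F, alternative:", round(statistics.median(alt_f), 2))
```

With these assumed settings the F-ratios simulated under the alternate hypothesis tend to be larger than those simulated under the null.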
One-way fixed-effects ANOVA is based on a mathematical model that describes the effects that are assumed to determine the value of any given observation:
$$Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$$
where:
- Yij is the observed value of the jth individual of group i,
- Here i is the treatment group, and there are assumed to be a groups in all,
- Where groups are of equal size, j denotes which replicate it is - otherwise group i contains ni observations.
- μ is the combined population mean,
- αi is the fixed deviation of the mean of group i from the grand mean μ; we will refer to this in future as the fixed effect of factor A at level i ,
- Under the null hypothesis all αi equal zero.
- Under the alternate hypothesis some or all αi are nonzero, and their value does not vary.
- εij is a random deviation of the jth individual of group i from μ + αi; we will refer to this in future as the random error effect.
- In the common, parametric, ANOVA model εij is assumed to be a random normal variate - whose mean value is zero.
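To make the model concrete, observations can be generated from it directly. This is a minimal sketch under stated assumptions: the values of μ, the αi, the group sizes and the error standard deviation are invented for illustration, and `simulate_one_way` is a hypothetical helper name.

```python
import random


def simulate_one_way(mu, alphas, group_sizes, sigma, seed=0):
    """Generate Y_ij = mu + alpha_i + e_ij, with e_ij drawn from Normal(0, sigma)."""
    rng = random.Random(seed)
    return [
        [mu + alpha_i + rng.gauss(0.0, sigma) for _ in range(n_i)]
        for alpha_i, n_i in zip(alphas, group_sizes)
    ]


# Assumed illustrative values: three groups whose fixed effects sum to zero.
data = simulate_one_way(mu=20.0, alphas=[-2.0, 0.0, 2.0], group_sizes=[4, 5, 6], sigma=1.5)
for i, group in enumerate(data, start=1):
    print("group", i, [round(y, 2) for y in group])
```

Setting every αi to zero gives data generated under the null hypothesis.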
Expected mean squares
Whenever we give an ANOVA model in this and subsequent units, we will specify what components of variation each mean square describes. In other words we will specify the expected mean squares for the model. At this stage with one-way fixed effects ANOVA, this is straightforward.
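| Source of variation | df | Expected mean square |
|---|---|---|
| Between groups | a − 1 | $\sigma^2 + n\sum\alpha_i^2/(a-1)$ |
| Within groups | N − a | $\sigma^2$ |

where: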
- a is the number of groups, and n is the number of observations per group,
- N is the total number of observations (= an),
- $\sigma^2$ is the error variance,
- $\sum\alpha_i^2/(a-1)$ is the added component due to treatment, and the $\alpha_i$ are the deviations of the treatment means from the grand mean.
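The expected mean squares can be worked through numerically. The sketch below assumes a balanced design with n observations per group, for which the expected between groups mean square is the error variance plus n times the added component; the function name and input values are hypothetical.

```python
def expected_mean_squares(alphas, n, sigma2):
    """Expected mean squares for a balanced one-way fixed-effects design.

    alphas : fixed effects (deviations of treatment means from the grand mean)
    n      : number of observations per group
    sigma2 : error variance
    """
    a = len(alphas)
    added_component = sum(alpha ** 2 for alpha in alphas) / (a - 1)
    return {
        "groups": sigma2 + n * added_component,
        "within": sigma2,
    }


# Under the null hypothesis all effects are zero, so both mean squares
# estimate the same error variance.
print(expected_mean_squares([0.0, 0.0, 0.0], n=5, sigma2=2.0))

# With non-zero effects the between groups mean square is inflated.
print(expected_mean_squares([-1.0, 0.0, 1.0], n=4, sigma2=1.0))
```

This is why the ratio of the two mean squares is expected to be close to one when there is no treatment effect, and larger otherwise.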
The traditional way to obtain the required variance estimates and their degrees of freedom is by partitioning sums of squares. An important property of sums of squares (unlike variances) is that they are additive. This makes it easier to calculate them, as the residual sum of squares can be obtained by subtracting the group sum of squares from the total sum of squares.
If we have 'a' treatment levels, and the same number of independent replicates 'n' of each treatment, we can use the notation shown below:
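| Replicate | Treatment 1 | Treatment 2 | ... | Treatment a | Totals |
|---|---|---|---|---|---|
| 1 | $Y_{1,1}$ | $Y_{1,2}$ | ... | $Y_{1,a}$ | |
| 2 | $Y_{2,1}$ | $Y_{2,2}$ | ... | $Y_{2,a}$ | |
| ... | ... | ... | ... | ... | |
| n | $Y_{n,1}$ | $Y_{n,2}$ | ... | $Y_{n,a}$ | |
| Totals (n per treatment) | $T_1$ | $T_2$ | ... | $T_a$ | G |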
Observations are shown as Y1,1 to Yn,a, treatment totals as T1 to Ta, and the grand total as G.
Each of the treatment totals is the sum of n observations, and the grand total is the sum of all N (= Σni) observations.
There are no totals for replicates as replicate 1 of treatment 1 has nothing in common with replicate 1 of treatment 2.
Above we have assumed that there are the same number of observations (n) for each treatment. But for one-way ANOVA the calculations are just as straightforward if we have unequal numbers of replicates (ni) for each treatment. Hence for the calculations below we do not assume group sizes are equal.
Algebraically speaking:

$$SS_{total} = \sum_{i}\sum_{j} (Y_{ij} - \bar{Y})^2$$

- $SS_{total}$ is the total sum of squares, summed for all N observations,
- $Y_{ij}$ is the value of the jth observation in group i, and $\bar{Y}$ is the overall mean (G/N),
- G is the overall total ($\sum Y_{ij}$) and N is the total number of observations ($\sum n_i$).

$$SS_{groups} = \sum_{i} n_i (\bar{Y}_i - \bar{Y})^2$$

- $SS_{groups}$ is the sum of squares between groups, summed for all a groups,
- $\bar{Y}_i$ is the mean of group i, or $T_i/n_i$,
- $T_i$ is the total of group i, or $\sum_{j} Y_{ij}$,
- $n_i$ is the number of observations in group i.

$$SS_{within} = \sum_{i}\sum_{j} (Y_{ij} - \bar{Y}_i)^2 = SS_{total} - SS_{groups}$$

- $SS_{within}$ is the sum of squares within groups, or residual sum of squares.
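The partition of the sums of squares can be checked numerically. The following is a minimal sketch for groups of unequal size; the data values are invented for illustration and `sums_of_squares` is a hypothetical helper name.

```python
def sums_of_squares(groups):
    """Return (SS_total, SS_groups, SS_within) for a list of groups of observations."""
    N = sum(len(g) for g in groups)          # total number of observations
    G = sum(sum(g) for g in groups)          # grand total
    grand_mean = G / N
    group_means = [sum(g) / len(g) for g in groups]

    ss_total = sum((y - grand_mean) ** 2 for g in groups for y in g)
    ss_groups = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
    return ss_total, ss_groups, ss_within


# Invented example data with unequal group sizes (3, 2 and 4 observations).
groups = [[4.0, 5.0, 6.0], [7.0, 9.0], [1.0, 2.0, 3.0, 6.0]]
ss_total, ss_groups, ss_within = sums_of_squares(groups)

print("SS total :", round(ss_total, 4))
print("SS groups:", round(ss_groups, 4))
print("SS within:", round(ss_within, 4))
print("additive :", abs(ss_total - (ss_groups + ss_within)) < 1e-9)
```

Because sums of squares are additive, the within groups value can equally be obtained by subtracting the groups sum of squares from the total.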
When group sizes are equal:
- ni equals n for all i=1 to a,
- a is the number of treatment groups,
- N equals an,
- $SS_{groups}$ equals $\sum T_i^2/n - G^2/N$, or $n\sum(\bar{Y}_i - \bar{Y})^2$, summed for all a groups. If this sum of squares formula looks rather familiar, compare it with the calculator variance formula - without its degrees of freedom.
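For equal group sizes, the two forms of the between groups sum of squares can be compared directly. This is a small sketch with invented data; both function names are hypothetical.

```python
def ss_groups_from_totals(groups):
    """Computational form: sum of squared treatment totals over n, minus G^2 / N."""
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this form assumes equal group sizes")
    N = n * len(groups)
    totals = [sum(g) for g in groups]
    G = sum(totals)
    return sum(t ** 2 for t in totals) / n - G ** 2 / N


def ss_groups_from_means(groups):
    """Definitional form: n times the squared deviations of group means from the grand mean."""
    n = len(groups[0])
    N = n * len(groups)
    grand_mean = sum(sum(g) for g in groups) / N
    return n * sum((sum(g) / n - grand_mean) ** 2 for g in groups)


# Invented example data: three treatments, three replicates each.
groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.0, 2.0, 3.0]]
print(ss_groups_from_totals(groups), ss_groups_from_means(groups))
```

The two forms agree up to floating-point rounding; the form based on totals avoids computing the means first.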
The values of these sums of squares are then inserted into the ANOVA table (see below), along with their respective degrees of freedom. The mean squares are obtained by dividing the sums of squares by their respective degrees of freedom.
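| Source of variation | df | SS | MS | F-ratio |
|---|---|---|---|---|
| Between groups | a − 1 | SSgroups | SSgroups / (a − 1) | MSgroups / MSwithin |
| Within groups | N − a | SSwithin | SSwithin / (N − a) | |
| Total | N − 1 | SStotal | | |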
The F-ratio for the 'treatment effect' is obtained by dividing MSgroups by MSwithin. The P-value of this F-ratio is obtained for a − 1 and N − a degrees of freedom. Here P is the proportion of that F-distribution which exceeds the observed value of F.
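The table entries and the meaning of P can be sketched in code. Note that this example does not use F tables or an exact F-distribution function: as a stand-in, it approximates P by simulating the F-ratio many times under a normal null hypothesis and counting how often it reaches the observed value. The data, function names, number of repetitions and seed are all assumptions made for illustration.

```python
import random


def anova_table(groups):
    """Return degrees of freedom, sums of squares, mean squares and the F-ratio."""
    a = len(groups)
    N = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / N
    means = [sum(g) / len(g) for g in groups]
    ss_groups = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ms_groups = ss_groups / (a - 1)
    ms_within = ss_within / (N - a)
    return {
        "df": (a - 1, N - a, N - 1),
        "SS": (ss_groups, ss_within, ss_groups + ss_within),
        "MS": (ms_groups, ms_within),
        "F": ms_groups / ms_within,
    }


def monte_carlo_p(f_observed, group_sizes, reps=5000, seed=0):
    """Approximate P as the proportion of null F-ratios at least as large as f_observed."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(reps):
        # Under the null hypothesis every group comes from the same normal population.
        sim = [[rng.gauss(0.0, 1.0) for _ in range(n_i)] for n_i in group_sizes]
        if anova_table(sim)["F"] >= f_observed:
            exceed += 1
    return exceed / reps


# Invented example data: three treatments, three replicates each.
groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.0, 2.0, 3.0]]
table = anova_table(groups)
p_value = monte_carlo_p(table["F"], [len(g) for g in groups])

print("df (groups, within, total):", table["df"])
print("F-ratio:", table["F"])
print("approximate P:", p_value)
```

In practice the P-value would normally be taken from the F-distribution itself, using statistical software or tables; the simulation is only meant to show what that proportion represents.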
Estimating effect sizes
If the treatment effect is significant, we conclude that we cannot consider the treatments as a homogeneous set. In that case one moves to the next step of estimating the treatment means, and standard errors, and establishing which means are significantly different from which other means. This can be done either by continued partitioning of sums of squares, each of which is assessed by ANOVA and F-test, or by various techniques of multiple comparison of means - techniques which we cover in a separate More Information page on multiple comparison of means. Note that the comparison of means and estimation of standard errors (or confidence intervals) should not be considered as an afterthought - it is actually the most important part of an analysis of variance.
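As a first step towards estimating effects, the treatment means and their standard errors can be computed. The sketch below assumes the common approach of using the pooled within groups mean square as the estimate of error variance, which relies on the variances being homogeneous; the data and the function name are invented for illustration.

```python
import math


def group_means_and_se(groups):
    """Return (mean, standard error) per group, using the pooled within-groups mean square."""
    a = len(groups)
    N = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ms_within = ss_within / (N - a)  # pooled estimate of the error variance
    return [(m, math.sqrt(ms_within / len(g))) for g, m in zip(groups, means)]


# Invented example data: three treatments, three replicates each.
groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.0, 2.0, 3.0]]
estimates = group_means_and_se(groups)
for i, (mean, se) in enumerate(estimates, start=1):
    print("treatment", i, "mean", mean, "SE", round(se, 4))
```

Which means differ from which others is a separate question, addressed by the multiple comparison techniques mentioned above.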
- Random sampling
Random sampling from populations or random allocation to treatments is an essential precondition for analysis of variance (as it is for most other statistical analysis). For observational studies, strictly random sampling can only be achieved if sampling units can be listed. Unfortunately this is simply not feasible in most ecological studies, although every effort should still be made to avoid bias, so that as representative a sample as possible is obtained. For experimental studies, allocation of experimental units to treatment should always be random - again with every effort made to avoid bias through (for example) blinding.
- Independent errors
The important thing here is to select the appropriate unit of analysis or (in ecological parlance) to avoid pseudoreplication. There is now greater awareness of the need for independent replication in experimental work, but pseudoreplication is still rife in observational studies. Unit 7 considers this topic in depth.
- Same error distributions
ANOVA assumes that the distributions of errors in each group represent the same population. This assumption is often summarized together with previous assumptions as IIED (independent and identical error distributions), and applies to all analysis of variance, whether parametric or non-parametric. For a parametric (normal) model, if the 'within group' variance differs from group to group, then the logic of ANOVA breaks down - and you can no longer assess whether there is a treatment effect by comparing the 'between group' variance estimate with the 'within group' variance estimate. Fortunately, providing sample sizes are equal, parametric ANOVA is reasonably robust to non-homogeneity of variances - which is why some authorities recommend homogeneity of variance tests that have rather low power. However, when sample sizes differ and/or variances vary considerably, then your ANOVA P-value may be wildly misleading. A number of tests, each with its own strengths and weaknesses, are available for testing homogeneity of variances. These include Hartley's and Cochran's tests, Bartlett's test and Levene's test.
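Two of the simpler statistics can be computed by hand. The sketch below calculates only the test statistics for Hartley's test (largest sample variance divided by the smallest) and Cochran's test (largest sample variance divided by the sum of the variances); it assumes equal group sizes, does not look up critical values, and uses invented data and hypothetical function names.

```python
import statistics


def hartley_fmax(groups):
    """Hartley's F-max statistic: largest sample variance over smallest."""
    variances = [statistics.variance(g) for g in groups]
    return max(variances) / min(variances)


def cochran_c(groups):
    """Cochran's C statistic: largest sample variance over the sum of all variances."""
    variances = [statistics.variance(g) for g in groups]
    return max(variances) / sum(variances)


# Invented example data: the second group is more variable than the others.
groups = [[4.0, 5.0, 6.0], [7.0, 9.0, 11.0], [1.0, 2.0, 3.0]]
print("F-max:", hartley_fmax(groups))
print("Cochran's C:", round(cochran_c(groups), 4))
```

Each statistic would then be compared with its tabulated critical value for the relevant number of groups and degrees of freedom.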
- Normal distribution of errors
Parametric ANOVA assumes errors in each of the groups are normally distributed. This is best assessed on the raw data (prior to analysis) using box and whisker plots and/or other graphical techniques. This must be done for each group separately - pooling all observations and testing the pooled data is meaningless. A transformation may be applied to normalize distributions, although if a transformation is used, it should have some a priori justification (for example because effects tend to be multiplicative rather than additive). The distribution of residuals should also be checked for normality after fitting the model. There is some dispute as to how robust ANOVA is to non-normality.
- Additive Effects
The ANOVA model assumes that effects are additive - which is why the model is $Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$. What this means is that we are assuming (for example) that yield at treatment level (i=) 1 tends to be a certain number of units greater than yield at treatment level (i=) 2. This may not be the case, and we may find instead that yield at one treatment level tends to be (say) double the yield at another treatment level. In this case the biological model is multiplicative. Fortunately a log transform may render such effects additive, as well as (sometimes) homogenizing variances and normalizing distributions.
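The effect of a log transform on multiplicative effects can be seen with a small numerical sketch. The treatment means below are invented so that each level doubles the previous one.

```python
import math

# Invented treatment means: each level is double the one before (multiplicative effect).
level_means = [10.0, 20.0, 40.0]

# On the raw scale the steps between successive levels are unequal.
raw_steps = [b - a for a, b in zip(level_means, level_means[1:])]

# On the log scale each doubling becomes the same additive step, log(2).
log_means = [math.log(m) for m in level_means]
log_steps = [b - a for a, b in zip(log_means, log_means[1:])]

print("raw steps:", raw_steps)
print("log steps:", [round(s, 4) for s in log_steps])
```

After the transform the difference between successive levels is constant, which is what the additive model requires.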