 
Principles
At its simplest, analysis of variance (ANOVA) is a method to test the hypothesis that two or more sample means could have been obtained from identical populations, against an alternate hypothesis that these samples represent populations that have different means but are otherwise identical. Units 12 to 14 show how ANOVA goes much further than this, by providing a means to model the effects of one or more factors, each at a number of levels, on the dependent variable. Here we focus on one-way fixed-effects ANOVA. This is where we have a single 'treatment' factor (= group) with several levels, and replicated observations at each level. We are interested in comparing the means of the observations between the different levels. The levels being compared are fixed by the researcher, rather than being chosen at random.
Analysis of variance compares means by looking at variances. It compares two estimates of the null population's variance:
 (a)  one is derived from the variance of the sample means about their overall (grand) mean, and is known as the between groups (or treatment) mean square
 (b)  the other is obtained by combining the variances of each group of observations about their respective group mean, and is known as the within groups (or error or residual) mean square
 When the null hypothesis is true (that is, the sample means are obtained from identical populations), and provided those populations are normal, random selection causes the ratio of these two estimates of the population variance (between groups divided by within groups) to be F-distributed. Under the alternate hypothesis the distribution of this F-ratio statistic is shifted to the right (towards larger values) compared to its null distribution.
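To illustrate the two variance estimates described above, here is a minimal sketch in Python that simulates data under the null hypothesis and averages each estimate over many runs. The function names, group count, group size, mean and standard deviation are our own illustrative assumptions, and equal group sizes are assumed throughout.

```python
import random
from statistics import mean, variance

def mean_squares(groups):
    """Return (between-groups MS, within-groups MS) for equal-sized groups."""
    n = len(groups[0])
    group_means = [mean(g) for g in groups]
    # (a) n times the variance of the group means about their grand mean
    ms_between = n * variance(group_means)
    # (b) average of the within-group variances (valid because sizes are equal)
    ms_within = mean(variance(g) for g in groups)
    return ms_between, ms_within

def simulate_null(a=4, n=10, mu=50.0, sigma=2.0, runs=2000, seed=1):
    """Average both mean squares over many samples from identical populations."""
    rng = random.Random(seed)
    between, within = [], []
    for _ in range(runs):
        groups = [[rng.gauss(mu, sigma) for _ in range(n)] for _ in range(a)]
        msb, msw = mean_squares(groups)
        between.append(msb)
        within.append(msw)
    return mean(between), mean(within)

avg_between, avg_within = simulate_null()
# Under the null hypothesis both averages should be close to sigma**2 = 4
print(avg_between, avg_within)
```

Because both mean squares estimate the same variance when the null hypothesis holds, their ratio tends to lie near one; a treatment effect inflates only the between-groups estimate.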
Model
One-way fixed-effects ANOVA is based on a mathematical model that describes the effects that are assumed to determine the value of any given observation:
Fixed effects
Y_{ij} = μ + α_{i} + ε_{ij}
where:
 Y_{ij} is the observed value of the jth individual of group i,
 Here i is the treatment group, and there are assumed to be a groups in all,
 Where groups are of equal size, j denotes which of the n replicates it is; otherwise group i contains n_{i} observations.
 μ is the combined population mean,
 α_{i} is the fixed deviation of the mean of group i from the grand mean μ; we will refer to this in future as the fixed effect of factor A at level i,
 Under the null hypothesis all α_{i} equal zero.
 Under the alternate hypothesis some or all α_{i} are non-zero, and their values do not vary.
 ε_{ij} is a random deviation of the jth individual of group i from μ + α_{i}; we will refer to this in future as the random error effect.
 In the common, parametric, ANOVA model ε_{ij} is assumed to be a random normal variate whose mean value is zero and whose variance is σ^{2}.
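The model can be made concrete by generating data from it. The following is a minimal sketch, assuming normal errors with a common standard deviation; the function name `simulate_one_way` and all numerical values are hypothetical and chosen only for illustration.

```python
import random

def simulate_one_way(mu, alphas, n, sigma, seed=0):
    """Draw n observations per group from Y_ij = mu + alpha_i + eps_ij."""
    rng = random.Random(seed)
    data = []
    for alpha_i in alphas:
        # eps_ij is a normal random deviation with mean zero
        group = [mu + alpha_i + rng.gauss(0.0, sigma) for _ in range(n)]
        data.append(group)
    return data

# Hypothetical fixed effects for three groups (they sum to zero)
alphas = [-2.0, 0.0, 2.0]
data = simulate_one_way(mu=20.0, alphas=alphas, n=5, sigma=1.5)
for i, group in enumerate(data, start=1):
    print(i, [round(y, 2) for y in group])
```

Setting every entry of `alphas` to zero gives data that satisfy the null hypothesis.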

Expected mean squares
Whenever we give an ANOVA model in this and subsequent units, we will specify what components of variation each mean square describes. In other words we will specify the expected mean squares for the model. At this stage, with one-way fixed-effects ANOVA, this is straightforward.
Fixed effects
Source of variation  df     Expected MS              Variance ratio
1. Groups            a − 1  σ^{2} + nΣα^{2}/(a − 1)  MS_{1}/MS_{2}
2. Error             N − a  σ^{2}
Total variation      N − 1

where
 a is the number of groups, and n is the number of observations per group,
 N is the total number of observations (= an),
 σ^{2} is the error variance,
 nΣα^{2}/(a − 1) is the added component due to treatment, where the α are the deviations of the treatment means from the grand mean.
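The expected mean square for groups can be checked by simulation. The sketch below, under the assumptions of equal group sizes and normal errors, compares the formula σ^{2} + nΣα^{2}/(a − 1) with the average groups mean square over many simulated data sets; the function names and parameter values are hypothetical.

```python
import random
from statistics import mean, variance

def expected_ms_groups(alphas, n, sigma):
    """Expected groups mean square: sigma^2 + n * sum(alpha^2) / (a - 1)."""
    a = len(alphas)
    return sigma**2 + n * sum(al**2 for al in alphas) / (a - 1)

def average_ms_groups(mu, alphas, n, sigma, runs=4000, seed=2):
    """Average the groups mean square over many simulated data sets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        group_means = [
            mean(mu + al + rng.gauss(0.0, sigma) for _ in range(n))
            for al in alphas
        ]
        # With equal n, MS_groups is n times the variance of the group means
        total += n * variance(group_means)
    return total / runs

alphas = [-2.0, 0.0, 2.0]  # hypothetical treatment effects
theory = expected_ms_groups(alphas, n=5, sigma=1.5)
simulated = average_ms_groups(20.0, alphas, n=5, sigma=1.5)
print(theory, simulated)  # the two values should be similar
```

With all α equal to zero the formula reduces to σ^{2}, the same expectation as the error mean square, which is why the variance ratio MS_{1}/MS_{2} tests the treatment effect.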

Computational formulae
The traditional way to obtain the required variance estimates and their degrees of freedom is by partitioning sums of squares. An important property of sums of squares (unlike variances) is that they are additive. This makes it easier to calculate them, as the residual sum of squares can be obtained by subtracting the group sum of squares from the total sum of squares.
If we have 'a' treatment levels, and the same number of independent replicates 'n' of each treatment, we can use the notation shown below:
                               Treatment 1  Treatment 2  ....  Treatment a  Totals
Replicates (n per treatment):  Y_{1,1}      Y_{1,2}      ....  Y_{1,a}
                               Y_{2,1}      Y_{2,2}      ....  Y_{2,a}
                               ....         ....         ....  ....
                               Y_{n,1}      Y_{n,2}      ....  Y_{n,a}
Totals                         T_{1}        T_{2}        ....  T_{a}        G

Observations are shown as Y_{1,1} to Y_{n,a} (in this layout the first subscript is the replicate and the second is the treatment), treatment totals as T_{1} to T_{a}, and the grand total as G. Each treatment total is the sum of n observations, and the grand total is the sum of all N (= Σn_{i}) observations.
There are no totals for replicates as replicate 1 of treatment 1 has nothing in common with replicate 1 of treatment 2.
Above we have assumed that there are the same number of observations (n) for each treatment. But for one-way ANOVA the calculations are just as straightforward if we have unequal numbers of replicates (n_{i}) for each treatment. Hence for the calculations below we do not assume group sizes are equal.
Algebraically speaking:
SS_{total} = Σ(Y_{ij}^{2}) − G^{2}/N
where:
 SS_{total} is the total sum of squares, or Σ(Y_{ij} − Ȳ)^{2}, summed for all N observations,
 Y_{ij} is the value of the jth observation in group i, and Ȳ is the overall mean (G/N),
 G is the overall total (ΣY_{ij}) and N is the total number of observations (Σn_{i}).
SS_{groups} = Σ(T_{i}^{2}/n_{i}) − G^{2}/N
SS_{within} = SS_{total} − SS_{groups}
where:
 SS_{groups} is the sum of squares between groups, or Σ(n_{i}[Ȳ_{i} − Ȳ]^{2}), summed for all groups,
 Ȳ_{i} is the mean of group i, or T_{i}/n_{i},
 T_{i} is the total of group i, or ΣY_{ij} summed over the observations j in that group,
 n_{i} is the number of observations in group i,
 SS_{within} is the sum of squares within groups (the residual sum of squares), or Σ(Y_{ij} − Ȳ_{i})^{2}, summed for all N observations.
When group sizes are equal:
 n_{i} equals n for all i=1 to a,
 a is the number of treatment groups,
 N equals an,
 SS_{groups} equals ΣT_{i}^{2}/n − G^{2}/N, or nΣ(Ȳ_{i} − Ȳ)^{2}, summed for all groups. If this sum of squares formula looks rather familiar, compare it with the calculator variance formula, without the division by its degrees of freedom.
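The computational formulae above translate directly into code. This is a minimal sketch that allows unequal group sizes; the function name `sums_of_squares` and the small data set are hypothetical.

```python
def sums_of_squares(groups):
    """Return (SS_total, SS_groups, SS_within); group sizes may differ."""
    N = sum(len(g) for g in groups)      # total number of observations
    G = sum(sum(g) for g in groups)      # grand total
    correction = G**2 / N                # the G^2/N term shared by both formulae
    ss_total = sum(y**2 for g in groups for y in g) - correction
    # sum(g) is the group total T_i and len(g) is n_i
    ss_groups = sum(sum(g)**2 / len(g) for g in groups) - correction
    ss_within = ss_total - ss_groups     # obtained by subtraction
    return ss_total, ss_groups, ss_within

# Hypothetical observations for three treatments with 3, 2 and 4 replicates
groups = [[4.0, 5.0, 6.0], [7.0, 9.0], [1.0, 2.0, 3.0, 2.0]]
print(sums_of_squares(groups))  # (56.0, 50.0, 6.0)
```

The within-groups value can be verified from its definition: the squared deviations about the group means 5, 8 and 2 sum to 2 in each group, giving 6 in all.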

The values of these sums of squares are then inserted into the ANOVA table (see below), along with their respective degrees of freedom. The mean squares are obtained by dividing the sums of squares by their respective degrees of freedom.
Source of variation  df     SS           MS                   F-ratio                  P-value
Between groups       a − 1  SS_{groups}  SS_{groups}/(a − 1)  MS_{groups}/MS_{within}
Within groups        N − a  SS_{within}  SS_{within}/(N − a)
Total                N − 1  SS_{total}

The F-ratio for the 'treatment effect' is obtained by dividing MS_{groups} by MS_{within}. The P-value of this F-ratio is obtained for a − 1 and N − a degrees of freedom. Here P is the proportion of that F-distribution which exceeds the observed value of F.
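The table entries can be assembled as follows. This is a sketch only: the function name `anova_table` and the data are hypothetical, and the P-value is deliberately left out because it requires the upper tail area of the F-distribution, which is not in the Python standard library (a statistics library would be needed for that step).

```python
def anova_table(groups):
    """One-way ANOVA table entries; group sizes may differ."""
    a = len(groups)
    N = sum(len(g) for g in groups)
    G = sum(sum(g) for g in groups)
    ss_total = sum(y**2 for g in groups for y in g) - G**2 / N
    ss_groups = sum(sum(g)**2 / len(g) for g in groups) - G**2 / N
    ss_within = ss_total - ss_groups
    df_groups, df_within = a - 1, N - a
    # Mean squares are sums of squares divided by their degrees of freedom
    ms_groups = ss_groups / df_groups
    ms_within = ss_within / df_within
    return {
        "df": (df_groups, df_within, N - 1),
        "SS": (ss_groups, ss_within, ss_total),
        "MS": (ms_groups, ms_within),
        "F": ms_groups / ms_within,
    }

# Hypothetical data: three groups of unequal size
table = anova_table([[4.0, 5.0, 6.0], [7.0, 9.0], [1.0, 2.0, 3.0, 2.0]])
print(table["df"], table["MS"], table["F"])
```

For these data the F-ratio would be referred to the F-distribution with 2 and 6 degrees of freedom.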
Estimating effect sizes
If the treatment effect is significant, we conclude that we cannot consider the treatments as a homogeneous set. In that case one moves to the next step of estimating the treatment means and their standard errors, and establishing which means are significantly different from which other means. This can be done either by continued partitioning of sums of squares, each of which is assessed by ANOVA and F-test, or by various techniques of multiple comparison of means, which we cover in a separate More Information page on multiple comparison of means. Note that the comparison of means and estimation of standard errors (or confidence intervals) should not be considered as an afterthought: it is actually the most important part of an analysis of variance.
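As a first step towards estimating effects, the group means and their standard errors can be computed from the within-groups mean square. The sketch below assumes the common approach of using the pooled MS_{within} as the variance estimate for every group (which relies on equal error variances); the function names and data are hypothetical, and no multiple-comparison adjustment is made.

```python
import math
from statistics import mean

def group_means_and_se(groups):
    """Group means with standard errors based on the pooled within-groups MS."""
    a = len(groups)
    N = sum(len(g) for g in groups)
    means = [mean(g) for g in groups]
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ms_within = ss_within / (N - a)
    # Standard error of each mean: sqrt(MS_within / n_i)
    ses = [math.sqrt(ms_within / len(g)) for g in groups]
    return means, ses, ms_within

def se_difference(ms_within, n_i, n_j):
    """Standard error of the difference between two group means."""
    return math.sqrt(ms_within * (1 / n_i + 1 / n_j))

groups = [[4.0, 5.0, 6.0], [7.0, 9.0], [1.0, 2.0, 3.0, 2.0]]  # hypothetical data
means, ses, ms_within = group_means_and_se(groups)
print(means, ses, se_difference(ms_within, 3, 2))
```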
Assumptions
 Random sampling
Random sampling from populations, or random allocation to treatments, is an essential precondition for analysis of variance (as it is for most other statistical analyses). For observational studies, strictly random sampling can only be achieved if sampling units can be listed. Unfortunately this is simply not feasible in most ecological studies, although every effort should still be made to avoid bias, so that as representative a sample as possible is obtained. For experimental studies, allocation of experimental units to treatments should always be random, again with every effort made to avoid bias through (for example) blinding.
 Independent errors
The important thing here is to select the appropriate unit of analysis or (in ecological parlance) to avoid pseudoreplication. There is now greater awareness of the need for independent replication in experimental work, but pseudoreplication is still rife in observational studies. Unit 7 considers this topic in depth.
 Same error distributions
ANOVA assumes that the distributions of errors in each group represent the same population. This assumption is often summarized together with the previous assumptions as IIED (independent and identical error distributions), and applies to all analysis of variance, whether parametric or nonparametric. For a parametric (normal) model, if the 'within group' variance differs from group to group, then the logic of ANOVA breaks down, and you can no longer assess whether there is a treatment effect by comparing the 'between group' variance estimate with the 'within group' variance estimate. Fortunately, providing sample sizes are equal, parametric ANOVA is reasonably robust to non-homogeneity of variances, which is why some authorities recommend homogeneity of variance tests that have rather low power. However, when sample sizes differ and/or variances vary considerably, your ANOVA P-value may be wildly misleading. A number of tests, each with its own strengths and weaknesses, are available for testing homogeneity of variances. These include Hartley's and Cochran's tests, Bartlett's test and Levene's test.
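A quick descriptive check of this assumption is to compare the group variances directly. The sketch below computes the ratio of the largest to the smallest group variance, which is the statistic used in Hartley's test; it assumes equal group sizes, the function name and data are hypothetical, and no critical value is computed (those come from special tables).

```python
from statistics import variance

def max_variance_ratio(groups):
    """Return (largest variance / smallest variance, list of group variances)."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances), variances

# Hypothetical data: three groups of equal size
groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 4.0]]
ratio, variances = max_variance_ratio(groups)
print(variances, ratio)
```

A ratio close to one is consistent with homogeneous variances; how large a ratio is tolerable depends on the number of groups and the group size.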
 Normal distribution of errors
Parametric ANOVA assumes errors in each of the groups are normally distributed. This is best assessed on the raw data (prior to analysis) using box and whisker plots and/or other graphical techniques. This must be done for each group separately: pooling all observations and testing the pooled data is meaningless. A transformation may be applied to normalize distributions, although if a transformation is used, it should have some a priori justification (for example because effects tend to be multiplicative rather than additive). The distribution of residuals should also be checked for normality after fitting the model. There is some dispute as to how robust ANOVA is to non-normality.
 Additive Effects
The ANOVA model assumes that effects are additive, which is why the model is Y_{ij} = μ + α_{i} + ε_{ij}. What this means is that we are assuming (for example) that yield at treatment level (i =) 1 tends to be a certain number of units greater than yield at treatment level (i =) 2. This may not be the case: we may find instead that yield at one treatment level tends to be (say) double the yield at another treatment level. In this case the biological model is multiplicative. Fortunately a log transform may render such effects additive, as well as (sometimes) homogenizing variances and normalizing distributions.
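The effect of a log transform on multiplicative effects can be seen with a few numbers. In this sketch the treatment means are hypothetical, chosen so that each level doubles the previous one.

```python
import math

# Hypothetical treatment means where each level doubles the previous one
means = [10.0, 20.0, 40.0]

# On the raw scale the steps between successive levels are unequal
raw_diffs = [b - a for a, b in zip(means, means[1:])]

# On the log scale each doubling becomes the same additive step, log(2)
log_diffs = [math.log(b) - math.log(a) for a, b in zip(means, means[1:])]

print(raw_diffs)
print(log_diffs)
```

On the log scale the model μ + α_{i} + ε_{ij} can therefore describe effects that are multiplicative on the original scale.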
Related topics:
Hartley's & Cochran's tests
Bartlett's test
Levene's test
Unequal variance ANOVA