Note that if exposure could be assessed on a measurement scale, then the group (treatment) factor could be assessed directly using a regression-type design. As it is with the available information, and using ANOVA, we can only compare sites (rather than wave exposure).
Draw boxplots and assess normality
Plot out data to assess how appropriate (parametric) ANOVA is for each level of the data set. The transect means are the replicates for comparing sites so we need an aggregated dataset on which to base the boxplots. Unfortunately in this example we only have (b = 2) transects in each site so the boxplots are uninformative - neverthless do it below for the sake of completeness.
We then assess the distribution of observations within each transect.
The boxplots suggest that variances are fairly similar for each transect. We assess normality of observations using normal QQ plots for each subgroup.
These data seem remarkably 'well behaved' although we have single unusually high values in each of transects 3 and 4. At least we can be reasonably certain that the site means follow a normal distribution given central limit theorem.
Some authors (notably Logan (2010)) suggest that the distribution of observations within subgroups is unimportant given that the comparison of fixed factor levels is based on subgroup means. Whilst there is an element of truth in this, it is still extremely important to assess within subgroup distributions for two reasons:|
- In many cases (as with this one) the number of observations of subgroup means will be too small to adequately assess the distribution of means. If you also have evidence of normality within subgroups, then one has more grounds for believing that the subgroup means are normally distributed. Just relying on central limit theorem with small sample sizes is very unwise!
- If you are a 'pooler' (we are not, but have to accept that some people have still not seen the light), then you will use the P-value for subgroups within groups to decide whether or not to pool. But the validity of this test depends critically on the distribution of observations within subgroups.
Check homogeneity of variances
Since P = 0.450, there is no evidence for heterogeneity of variances.
Carry out analysis of variance
Sums of squares can be calculated manually as follows:
|| 35634.6 - 30470.4
|| = 5164.2
|SStransects within sites
|| + .....
||35799.6 − 35634.6|| = 165
||6243.6 − 5164.2 − 165 = 914.4
These values are inserted into the ANOVA table and the mean squares calculated.
|Transects w'in sites
The difference between sites was highly significant (P = 0.002). The difference between transects within sites was not significant (P = 0.244).
There are two ways to carry out a nested ANOVA in R. The most straightforward approach is to use the aov function with multiple error strata.
This gives the P-value for sites as 0.0018 correctly using the MStransects w'in sites as the denominator for the F-ratio. The P-value for transects within sites is not given but can be readily calculated as MSresiduals:tran / MSresiduals:within.
Use of aov() is only appropriate for balanced designs and a lme () (a linear mixed-effects model) should be used for unbalanced designs. For a balanced design it will give the same mean square and F-ratio as the aov function.
Both methods give an F-ratio between sites of 41.73 between sites with a P-value of 0.0018 as we obtained in the manual calculations.
Should we pool denominators?
Reference to the full ANOVA table above shows that transects within sites were not a significant source of variation. The provider of these data used this fact to justify pooling MStransects within sites and MSwithin transects to provide a new estimate of the Mean square errorwithin. We would regard this as pseudoreplication, and would not pool. In fact, since the P- value is < 0.25, even well known advocates of pooling (for example Quinn & Keough (2002) and Doncaster & Davey (2007)) would not pool in this case. The key point is that using a mean square error with 36 degrees of freedom implies that you have taken 10 random independent samples from each type of site - which you simply haven't! You have only sampled from two randomly located transects in each type of site - your analysis should reflect this fact!
Examine model diagnostics
The next step is to examine the fitted model diagnostics focusing on the first error stratum. The figure below shows residuals versus fitted values, and a qq plot to assess normality:
There is no evidence of any increase in the spread of residuals as fitted values increase so we accept that the model is adequate.
Comparison of means
Lastly we compare means. Kerr only made one comparison (site A versus Site B) but we will assume a planned full set of orthogonal comparisons, namely site A versus Site B, site C versus site D, and mean of site A & B versus mean of site C+D (whether this is a logical set of comparisons depends entirely on the particular study - we do it here only as an example). We do not give the manual calculations for this but only the R-code and output:
In the first table the values above or below the cross-product matrix diagonal are all zero. This confirms that the set of planned contrasts is orthogonal. The next table decomposes the SSsites into the three planned contrasts - all prove to be significant.