 InfluentialPoints.com
"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)

#### Principles

A common extension of one way ANOVA is to have additional nominal variables (factors) nested within the main factor of interest. By nested we mean that each level of the 'lower' nominal variable occurs in only one level of the 'higher' nominal variable.

For example in a trial of fertilizers, we might have 15 plots with 5 plots randomly allocated to each of three treatments. We assess yield on just one plant in each plot. In this case our plots are nested in treatment because any particular plot only gets one of the three treatment levels. But since we only have one observation for each plot, we use these observations as replicates in a simple one-way ANOVA.

However, we may consider that there is too much variation between plants within plots to rely on the yield of just one plant. Hence we measure yield on (say) 20 different plants in each plot. Plot becomes a second (nested) factor in the experiment, but plots still constitute the replicates for the treatment factor (they were the units randomly allocated to treatment). Since any particular plant only occurs in one plot, the variable 'plant' is nested within plot. Plants therefore become 'replicates' for the plot factor and are properly termed evaluation units. They do not constitute replicates for the treatment factor.

If the top level nominal variable (in this case treatment) is a fixed factor, and the lower level nominal variable is a random factor, then we are dealing with a mixed effects nested ANOVA. If the top level nominal variable is a random factor, and the lower level nominal variable is also a random factor, then we have a random effects (or pure Model II) nested ANOVA. In nested ANOVA all lower level nominal variables are usually random factors.

As in one-way ANOVA, in nested ANOVA we compare two estimates of the null population's variance: one derived from the variance of the sample means about their overall (grand) mean, and the other obtained by combining the variances of each group of observations about their respective group mean. But there is a crucial difference between a nested ANOVA and a simple one-way ANOVA. In a simple one way ANOVA you have a single error term. In a nested ANOVA you have several different error terms reflecting each level of the hierarchy. Hence in a two level nested ANOVA one calculates a separate F-ratio for each level.

#### Model & expected mean squares

The mixed effects nested ANOVA with the top level nominal variable a fixed factor is based on the following mathematical model:

#### Factor A fixed, factor B random

$$Y_{ijk} = \mu + \alpha_i + B_{j\{i\}} + \varepsilon_{ijk}$$
where:
• $Y_{ijk}$ is the kth observation in subgroup j of group i,
• $\mu$ is the population (grand) mean,
• $\alpha_i$ is the fixed effect for the ith level of factor A,
• $B_{j\{i\}}$ is the random effect for the jth subgroup of the ith group. The notation $B_{j\{i\}}$ indicates that the effect of level $B_j$ is nested within A,
• $\varepsilon_{ijk}$ is the random error effect.

| Source of variation | df | Expected MS | Variance ratio |
|---|---|---|---|
| 1. Groups | $a-1$ | $\sigma^2 + n\sigma^2_{B\{\alpha\}} + nb\Sigma\alpha^2/(a-1)$ | MS1/MS2 |
| 2. Subgroups within groups | $(b-1)a$ | $\sigma^2 + n\sigma^2_{B\{\alpha\}}$ | MS2/MS3 |
| 3. Residual | $(n-1)ab$ | $\sigma^2$ | |
| Total variation | $N-1$ | | |

where
• a is the number of groups, and b is the number of subgroups in each group,
• n is the number of observations per subgroup, and N is the total number of observations (= abn),
• $\sigma^2$ is the error variance,
• $\sigma^2_{B\{\alpha\}}$ is the subgroups within group variance component,
• $nb\Sigma\alpha^2/(a-1)$ is the added group component.
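As a quick check on the degrees of freedom column, the following minimal Python sketch applies the formulae to the fertilizer trial described earlier (3 treatments, 5 plots per treatment, 20 plants per plot). The function name `nested_df` and the dictionary keys are illustrative choices, not part of any library.

```python
def nested_df(a, b, n):
    """Degrees of freedom for a balanced two-level nested ANOVA.

    a: number of groups, b: subgroups per group, n: observations per subgroup.
    """
    total_obs = a * b * n  # N = abn
    df = {
        "groups": a - 1,
        "subgroups_within_groups": a * (b - 1),
        "residual": a * b * (n - 1),
        "total": total_obs - 1,
    }
    # The three component df must add up to the total df.
    assert (df["groups"] + df["subgroups_within_groups"]
            + df["residual"]) == df["total"]
    return df


# Fertilizer trial: 3 treatments, 5 plots each, 20 plants per plot.
print(nested_df(a=3, b=5, n=20))
# {'groups': 2, 'subgroups_within_groups': 12, 'residual': 285, 'total': 299}
```

Note that the treatment effect is tested on only 2 and 12 degrees of freedom, however many plants are measured per plot.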

Random effects nested ANOVA (both factors random) is based on the following mathematical model:

#### Factors A and B both random

$$Y_{ijk} = \mu + A_i + B_{j\{i\}} + \varepsilon_{ijk}$$
where:
• $Y_{ijk}$ is the kth observation in subgroup j of group i,
• $\mu$ is the population (grand) mean,
• $A_i$ is the random effect for the ith level of factor A,
• $B_{j\{i\}}$ is the random effect for the jth subgroup of the ith group. The notation $B_{j\{i\}}$ indicates that the effect of level $B_j$ is nested within A,
• $\varepsilon_{ijk}$ is the error term.

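| Source of variation | df | Expected MS | Variance ratio |
|---|---|---|---|
| 1. Groups | $a-1$ | $\sigma^2 + n\sigma^2_{B\{\alpha\}} + nb\sigma^2_A$ | MS1/MS2 |
| 2. Subgroups within groups | $(b-1)a$ | $\sigma^2 + n\sigma^2_{B\{\alpha\}}$ | MS2/MS3 |
| 3. Residual | $(n-1)ab$ | $\sigma^2$ | |
| Total variation | $N-1$ | | |

where
• a is the number of groups, and b is the number of subgroups in each group,
• n is the number of observations per subgroup, and N is the total number of observations (= abn),
• $\sigma^2$ is the error variance,
• $\sigma^2_{B\{\alpha\}}$ is the subgroups within group variance component,
• $\sigma^2_A$ is the groups added variance component; each group mean is based on nb observations, so it enters the expected mean square with coefficient nb.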
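Equating the observed mean squares to their expectations gives simple estimates of the variance components. The sketch below assumes a balanced design and that the group component enters the groups expected mean square with coefficient nb; the function name and the mean square values are hypothetical, chosen only to illustrate the arithmetic.

```python
def variance_components(ms_groups, ms_subgroups, ms_within, b, n):
    """Method-of-moments estimates for a balanced random effects nested ANOVA.

    Solves the expected mean square equations for the three components.
    Estimates can come out negative; they are returned as-is here.
    """
    var_error = ms_within                             # sigma^2
    var_subgroups = (ms_subgroups - ms_within) / n    # sigma^2 of B within A
    var_groups = (ms_groups - ms_subgroups) / (n * b)  # sigma^2 of A
    return var_error, var_subgroups, var_groups


# Hypothetical mean squares, with b = 5 subgroups and n = 20 observations each.
print(variance_components(120.0, 30.0, 10.0, b=5, n=20))
# (10.0, 1.0, 0.9)
```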

Note that for both the mixed and random effects models the denominator of the variance ratio for testing an effect is the mean square from the level immediately below it in the hierarchy. Hence the F-ratio for the 'group effect' is obtained by dividing MSA (the groups mean square, MS1) by MSB (the subgroups within groups mean square, MS2). The F-ratio for the 'subgroup effect' is then obtained by dividing MSB by MSW (the within subgroups, or residual, mean square, MS3).
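The choice of denominators can be sketched in a few lines of Python. The function name and the mean square values are hypothetical; p-values would then be read from the F distribution with (a-1, a(b-1)) and (a(b-1), ab(n-1)) degrees of freedom respectively, which is left out here to keep the sketch free of external libraries.

```python
def nested_f_ratios(ms_groups, ms_subgroups, ms_within):
    """F-ratios for a two-level nested ANOVA (mixed or random effects model).

    Each effect is tested against the mean square one level below it.
    """
    f_groups = ms_groups / ms_subgroups     # MSA / MSB, not MSA / MSW
    f_subgroups = ms_subgroups / ms_within  # MSB / MSW
    return f_groups, f_subgroups


# Hypothetical mean squares for groups, subgroups within groups, and residual.
print(nested_f_ratios(120.0, 30.0, 10.0))
# (4.0, 3.0)
```

Dividing the groups mean square by the residual mean square instead would give 12.0 for the same figures, which shows how treating evaluation units as replicates inflates the apparent treatment effect.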

Some authorities advocate pooling the subgroups within groups mean square and the error mean square if the subgroups within groups effect is not significant at some specified level. We have discussed this issue in the core text. Suffice it to say that we do not recommend the practice, and expect that (sooner or later) it will be regarded as unacceptable pseudoreplication.

#### Computational formulae

As designs get more complicated we have to modify our notation somewhat to cope. We take a balanced experiment with 'a' group (= treatment) levels (1...i...a), each replicated in 'b' experimental units (1...j...b) and with 'n' (1...k...n) evaluation units in each. For an observational study, there would be 'a' group levels (1...i...a), with 'b' sampling units in each (1...j...b) and 'n' (1...k...n) evaluation units per sampling unit. This notation is shown diagrammatically below:

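| | Group 1 | | | Group i | | | Group a | | | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| Subgroups | 1 | j | b | 1 | j | b | 1 | j | b | |
| Evaluation unit 1 | $Y_{1,1,1}$ | $Y_{1,j,1}$ | $Y_{1,b,1}$ | $Y_{i,1,1}$ | $Y_{i,j,1}$ | $Y_{i,b,1}$ | $Y_{a,1,1}$ | $Y_{a,j,1}$ | $Y_{a,b,1}$ | |
| Evaluation unit k | $Y_{1,1,k}$ | $Y_{1,j,k}$ | $Y_{1,b,k}$ | $Y_{i,1,k}$ | $Y_{i,j,k}$ | $Y_{i,b,k}$ | $Y_{a,1,k}$ | $Y_{a,j,k}$ | $Y_{a,b,k}$ | |
| Evaluation unit n | $Y_{1,1,n}$ | $Y_{1,j,n}$ | $Y_{1,b,n}$ | $Y_{i,1,n}$ | $Y_{i,j,n}$ | $Y_{i,b,n}$ | $Y_{a,1,n}$ | $Y_{a,j,n}$ | $Y_{a,b,n}$ | |
| Subgroup totals | $S_1$ | $S_j$ | $S_b$ | $S_1$ | $S_j$ | $S_b$ | $S_1$ | $S_j$ | $S_b$ | |
| Group totals | $T_1$ | | | $T_i$ | | | $T_a$ | | | $G$ |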

Observations are shown as $Y_{1,1,1}$ to $Y_{a,b,n}$, group totals as $T_1$ to $T_a$, subgroup totals as $S_1$ to $S_b$ within each group, and the grand total as G. Each subgroup total comprises n observations, each group total comprises b × n observations, and the grand total comprises N (= abn) observations. There are no totals for evaluation units because evaluation unit 1 of treatment 1 has nothing in common with evaluation unit 1 of treatment 2.

The group, subgroup within group, within subgroup and total sums of squares are calculated as follows:

Algebraically speaking -
$$SS_{Total} = \Sigma Y_{ijk}^2 - \frac{G^2}{N}$$
where:
• $SS_{Total}$ is the total sums of squares (or $\Sigma(Y_{ijk} - \bar{Y})^2$, where $\bar{Y}$ is the overall mean),
• $Y_{ijk}$ is the value of the kth observation in subgroup j and treatment group i,
• G is the overall total (or $\Sigma Y_{ijk}$) and N is the total number of observations (or abn).

$$SS_A \text{ (Groups)} = \frac{\Sigma T_i^2}{bn} - \frac{G^2}{N}$$
where:
• $SS_A$ is the groups (treatment) sums of squares (or $nb\Sigma(\bar{Y}_i - \bar{Y})^2$),
• $T_i$ is the sum of the observations in treatment group i,
• b is the number of subgroups and n is the number of observations in each subgroup.

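$$SS_{B(A)} \text{ (Subgroups within groups)} = SS_{subgroups} - SS_{groups} = \frac{\Sigma S_j^2}{n} - \frac{\Sigma T_i^2}{bn}$$
where:
• $SS_{B(A)}$ is the subgroups within groups sums of squares (or $n\Sigma(\bar{Y}_{ij} - \bar{Y}_i)^2$),
• $S_j$ is the sum of the observations in subgroup j, with the summation of $S_j^2$ taken over all ab subgroups.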
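The formulae above can be sketched in plain Python. The within subgroups sum of squares is taken as what remains of the total once the other two components are removed. The data layout (a nested list indexed as `data[i][j][k]`), the function name and the tiny data set are assumptions made purely for illustration, and a balanced design is assumed.

```python
def nested_sums_of_squares(data):
    """Sums of squares for a balanced two-level nested design.

    data[i][j][k] is observation k in subgroup j of group i.
    """
    a, b, n = len(data), len(data[0]), len(data[0][0])
    big_n = a * b * n
    observations = [y for group in data for sub in group for y in sub]
    grand_total = sum(observations)                            # G
    correction = grand_total ** 2 / big_n                      # G^2 / N
    group_totals = [sum(sum(sub) for sub in group) for group in data]  # T_i
    subgroup_totals = [sum(sub) for group in data for sub in group]    # S_j

    groups_term = sum(t ** 2 for t in group_totals) / (b * n)
    subgroups_term = sum(s ** 2 for s in subgroup_totals) / n

    ss_total = sum(y ** 2 for y in observations) - correction
    ss_groups = groups_term - correction
    ss_subgroups = subgroups_term - groups_term
    ss_within = ss_total - ss_groups - ss_subgroups
    return {"total": ss_total, "groups": ss_groups,
            "subgroups_within_groups": ss_subgroups, "within": ss_within}


# Made-up data: 2 groups, 2 subgroups per group, 2 observations per subgroup.
example = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
print(nested_sums_of_squares(example))
# {'total': 42.0, 'groups': 32.0, 'subgroups_within_groups': 8.0, 'within': 2.0}
```

Dividing each sum of squares by its degrees of freedom (here 1, 2 and 4) gives the mean squares used in the variance ratios.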

$$SS_W \text{ (Within subgroups)} = SS_{Total} - SS_A - SS_{B(A)}$$
where:
• $SS_W$ is the within subgroups sums of squares (or $\Sigma(Y_{ijk} - \bar{Y}_{ij})^2$).

### Assumptions

The same assumptions made for simple one way ANOVA also apply to nested ANOVA. We briefly repeat these assumptions here:

1. Random sampling
Random sampling from populations or random allocation to treatments is an essential precondition for analysis of variance (as it is for most other statistical analyses). For nested ANOVA this applies both to the groups and to the subgroups. If strictly random sampling is not possible, then every effort should still be made to avoid bias, so that as representative a sample as possible is obtained.

2. Independent and identical error distribution
The distribution of errors in each (sub)group must represent the same population.

3. Normal errors
Parametric nested ANOVA assumes that the distribution of errors in each of the subgroups is normal. This is often difficult to assess if there are few observations in each subgroup, although QQ plots are still useful. The distribution of residuals should also be checked for normality after fitting the model.