Principles
We introduced the principles of blocked designs in Unit 7
but will recap here. In blocked designs the experimental units are first divided into
(relatively) homogeneous groups which constitute the blocks or strata. The aim is to
minimize the variance among units within blocks relative to the variance among blocks.
Treatment levels are then assigned randomly to experimental units within each block.
The commonest design, known as the randomized complete block design (RCBD), is to have one unit assigned to each treatment level per block. Provided block is a truly random factor, and there really is no interest in comparing blocks, this can be the most efficient design. The alternative is to have several replicates of each treatment per block (sometimes termed a generalized randomized block design). The advantage of having replicated treatments in each block is that any interaction between blocks and treatments can be evaluated (see below); this is strongly recommended if the blocks represent a clear environmental gradient (for example soil moisture content). However, precision usually decreases as the number of experimental units (or size of units) per block increases. We deal with analysis of the generalized randomized block design in the More Information page on Factorial ANOVA.
If there are two blocking factors, then the Latin square design may be appropriate. However, Latin squares are much less used than randomized block designs and make additional (sometimes highly questionable) assumptions. If there is a very large number of treatment levels (often the case in agricultural variety trials), it may not even be possible to have every treatment level within each block. Instead a carefully selected subset of treatment levels is put in each block, giving a randomized incomplete block design. Such designs are not recommended unless unavoidable.
Randomized complete block ANOVA
Model & expected mean squares
We will assume a mixed model for the randomized block design, with the treatment effect fixed and the block effect random (but see the discussion on this issue in the core text). We have included two confounded (unmeasurable) effects in the model: firstly the interaction between block and treatment, and secondly the restriction error component generated by the restricted randomization.
Factor A fixed, factor S random

Y_{ij} = μ + α_{i} + S_{j} + [δ_{j}] + [(αS)_{ij}] + ε_{ij}
where:
 Y_{ij} is the observation for treatment i in block j,
 μ is the population (grand) mean,
 α_{i} is the fixed effect for the ith treatment level,
 S_{j} is the random effect for the jth block,
 [δ_{j}] is the confounded restriction error effect,
 [(αS)_{ij}] is the confounded interaction effect between treatments and blocks,
 ε_{ij} is the random error effect.

Source of variation   df           Expected MS                                VC estimate or F-ratio
1. Blocks (S)         s−1          σ^{2} + [aσ^{2}_{δ}] + aσ^{2}_{S}          VC = (MS_{1} − MS_{3})/a
2. Treatment (A)      a−1          σ^{2} + [σ^{2}_{αS}] + sΣα^{2}/(a−1)       MS_{2}/MS_{3}
3. Residual           (a−1)(s−1)   σ^{2} + [σ^{2}_{αS}]
Total variation       N−1
where
 a is the number of levels of the treatment factor (A),
 s is the number of blocks and N is the total number of observations (= as),
 σ^{2} is the error variance,
 [aσ^{2}_{δ}] is the confounded restriction error component,
 [σ^{2}_{αS}] is the confounded treatment × blocks interaction component,
 sΣα^{2}/(a−1) is the added treatment component,
 aσ^{2}_{S} is the block variance component.
Examination of the expected mean squares shows that we can obtain an unbiased test of the treatment effect using the residual mean square as the denominator in the F-ratio. The F-ratio for the treatment effect is therefore obtained by dividing MS_{A} by MS_{Res}. The P-value for this F-ratio is obtained for a − 1 and (s − 1)(a − 1) degrees of freedom.
There is no unbiased test of the block effect unless we assume there is no restriction error and no treatment × block interaction. If we make those assumptions, an approximate F-ratio for the block effect is obtained by dividing MS_{S} by MS_{Res}. The P-value for this F-ratio is obtained for s − 1 and (s − 1)(a − 1) degrees of freedom. This value can be used to assess whether it was worthwhile using a blocked versus a completely randomized design. If block is assumed to be a random factor, one may instead wish to estimate the added variance component.
Great care must be taken when analyzing randomized block designs with statistical packages. The widely used general linear model cannot accommodate random factors: it assumes all factors are fixed. This produces what are called narrow-sense estimates of the standard errors. These represent variation over repetitions of the experiment only if one uses exactly the same blocks and simply re-randomizes the assignment of treatments to the experimental units.
Fortunately the standard error of a difference between two least squares means is the same whichever model is used, because the difference between two means does not involve the blocking factor. Thus, inferences for pairwise differences are unaffected. But estimated confidence intervals of means are much too narrow. If blocks are random, we really need broad-sense estimates of the standard error, which would correspond to repetitions of the experiment with another sample of blocks. In recent years some statistical packages (including SAS and R) have become able to analyze mixed-model ANOVAs by fitting the random effects using maximum likelihood techniques.
Computational formulae
We will take a balanced experiment with 'a' group (= treatment) levels, each replicated
once in 's' blocks.
Group (treatment) totals are denoted as TA_{1} to TA_{a}, block
totals as TS_{1} to TS_{s} and the grand total as G.
The total, block, group and residual sums of squares are calculated as follows:
Algebraically speaking -

SS_{Total} = Σ(Y_{ij}^{2}) − G^{2}/N
where:
 SS_{Total} is the total sums of squares (or Σ(Y_{ij} − Ȳ)^{2}, where Ȳ is the overall mean),
 Y_{ij} is the value of the observation in block j and treatment group i,
 G is the overall total (or ΣY_{ij}) and N is the total number of observations.
SS_{S (Blocks)} = Σ(TS_{j}^{2})/a − G^{2}/N
where:
 SS_{S} is the blocks sums of squares (or aΣ(Ȳ_{S} − Ȳ)^{2}, where Ȳ_{S} is a block mean),
 TS_{j} is the sum of the observations in block j,
 a is the number of treatment levels.
SS_{A (Treatment)} = Σ(TA_{i}^{2})/s − G^{2}/N
where:
 SS_{A} is the treatment sums of squares (or sΣ(Ȳ_{A} − Ȳ)^{2}, where Ȳ_{A} is a treatment mean),
 TA_{i} is the sum of the observations in treatment group i,
 s is the number of blocks.
SS_{residual} = SS_{Total} − SS_{A} − SS_{S}
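As a check on these formulae, the sums of squares and the treatment F-ratio can be computed directly. The data below are invented purely for illustration (a = 3 treatments in s = 4 blocks); this is a sketch of the arithmetic, not a worked example from the text.

```python
# Randomized complete block ANOVA from the computational formulae above.
# y[i][j] is the (invented) observation for treatment i in block j.
y = [
    [12.0, 15.0, 14.0, 13.0],   # treatment 1 across the 4 blocks
    [15.0, 18.0, 17.0, 16.0],   # treatment 2
    [20.0, 22.0, 23.0, 21.0],   # treatment 3
]
a, s = len(y), len(y[0])
N = a * s
G = sum(sum(row) for row in y)                            # grand total
TA = [sum(row) for row in y]                              # treatment totals
TS = [sum(y[i][j] for i in range(a)) for j in range(s)]   # block totals

cf = G**2 / N                                             # correction term G^2/N
ss_total = sum(v**2 for row in y for v in row) - cf
ss_treat = sum(t**2 for t in TA) / s - cf
ss_block = sum(t**2 for t in TS) / a - cf
ss_resid = ss_total - ss_treat - ss_block

df_treat, df_resid = a - 1, (a - 1) * (s - 1)
f_treat = (ss_treat / df_treat) / (ss_resid / df_resid)   # MS_A / MS_Res

print(ss_total, ss_treat, ss_block, ss_resid, f_treat)
```

For these invented data the sums of squares add up exactly (SS_Total = SS_A + SS_S + SS_residual), and the F-ratio is tested on a − 1 = 2 and (a − 1)(s − 1) = 6 degrees of freedom.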
If blocks are taken as a fixed factor, the standard error of a treatment mean is given by:

SE_{Treatment mean} = √(MS_{residual}/s)
If blocks are taken as a random factor, the standard error of a treatment mean is given by:

SE_{Treatment mean} = √((MS_{residual} + MS_{blocks})/s)
Whether blocks are fixed or random, the standard error of the difference between two treatment means is given by:

SE_{Diff between means} = √(2MS_{residual}/s)
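These three formulae are easily evaluated; the mean squares below are invented values, used only to illustrate that the broad-sense (random blocks) standard error of a mean is larger than the narrow-sense (fixed blocks) one, while the standard error of a difference is the same under either model.

```python
from math import sqrt

# Standard errors from the formulae above, using invented mean squares.
ms_resid, ms_blocks = 0.9, 4.5   # illustrative values only
s = 4                            # number of blocks

se_mean_fixed = sqrt(ms_resid / s)                  # blocks fixed (narrow sense)
se_mean_random = sqrt((ms_resid + ms_blocks) / s)   # blocks random (broad sense)
se_diff = sqrt(2 * ms_resid / s)                    # difference between two means

print(se_mean_fixed, se_mean_random, se_diff)
```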

Pooling
After using a randomized block design, it is not unusual to find that the block effect is not only non-significant, but so small that it would have been better not to have blocked in the first place. It might then be tempting to reanalyze the data as a completely randomized design in order to gain degrees of freedom. In fact this approach is specifically recommended by some statisticians when analyzing matched-pairs cluster randomized trials. Statisticians (as usual) do not agree on this issue, but the predominant view is that pooling would represent another case of pseudoreplication. Treatments have clearly not been allocated at random overall, but only within blocks, so it would be incorrect to ignore the blocks in the analysis of the experiment. Moreover, if blocks are left in the model, the resulting P-values closely approximate randomization-test P-values. Conceptually, therefore, the P-values are tied directly to the chance mechanism involved in randomization.
We have said the aim is to minimize the variance among units within blocks relative to
the variance among blocks. But that does not necessarily mean we should try to maximize
differences between blocks. If there is a strong interaction between treatment and
blocks, then maximizing differences between blocks may make the situation worse. We
consider this again below in relation to the Latin square ANOVA.
Very small F-ratios
Since the main interest is in whether F-ratios are significantly large, it is not surprising that little attention is usually paid to an F-ratio that is unusually small. If the model is correct and all assumptions are satisfied, then the ratios of the block : residual and treatment : residual mean squares should be either near 1.0 or greater than 1.0. If the value is near 0.0, Meek et al. (2007) have argued that it may indicate potential problems with the design or analysis, and should therefore be treated as a red flag and investigated. Possible causes for values near zero include non-additivity in the model (for example multiplicative effects), violations of distributional assumptions, omitted factors in the model and/or lack of fit. It may of course simply be a chance occurrence, but all other possibilities should be eliminated before that conclusion is reached.
Latin square ANOVA
Model & expected mean squares
We will assume for the Latin square design that the treatment effect is fixed, whilst
the row and column effects are random.
Factor A fixed, factors R & C random

Y_{ijk} = μ + α_{i} + R_{j} + C_{k} + [αR_{ij}] + [αC_{ik}] + [RC_{jk}] + [αRC_{ijk}] + ε_{ijk}
where:
 Y_{ijk} is the observation for treatment i in row j and column k,
 μ is the population (grand) mean,
 α_{i} is the fixed effect for the ith level of factor A,
 R_{j} is the random effect for the jth row,
 C_{k} is the random effect for the kth column,
 all interaction effects between treatments, rows and columns (shown in brackets) are assumed to be zero,
 ε_{ijk} is the random error effect.

Source of variation   df           Expected MS                      Variance ratio
1. Rows               a−1          σ^{2} + aσ^{2}_{S}               MS_{1}/MS_{4}
2. Columns            a−1          σ^{2} + aσ^{2}_{C}               MS_{2}/MS_{4}
3. Treatment          a−1          σ^{2} + aΣα^{2}/(a−1)            MS_{3}/MS_{4}
4. Residual           (a−1)(a−2)   σ^{2} + [σ^{2}_{int}]
5. Total              N−1
where
 a is the number of treatments, rows and columns,
 N is the total number of observations,
 σ^{2} is the error variance,
 aΣα^{2}/(a−1) is the added treatment component,
 aσ^{2}_{S} is the row variance component,
 aσ^{2}_{C} is the column variance component,
 [σ^{2}_{int}] is the sum of all interaction components assumed to be zero.
The F-ratio for the treatment effect (assuming no interaction effects) is obtained by dividing MS_{A} by MS_{Res}. The P-value for this F-ratio is obtained for a − 1 and (a − 1)(a − 2) degrees of freedom.
Approximate F-ratios for the row and column effects (assuming no restriction error and no interaction effects) are obtained by dividing MS_{R} and MS_{C} by MS_{Res}. The P-values for these F-ratios are obtained for a − 1 and (a − 1)(a − 2) degrees of freedom. These values can be used to assess whether it was worthwhile using a Latin square versus a blocked or completely randomized design. If rows and columns are assumed to be random factors, one may instead wish to estimate the added variance components.
Computational formulae
In a Latin square design the number of group (= treatment) levels (a) will be the same
as the number of rows and the number of columns. Group (treatment) totals are denoted
as TA_{1} to TA_{a}, row totals as TR_{1} to
TR_{a}, column totals as TC_{1} to TC_{a} and the grand
total as G.
The total, group, row, column, and residual sums of squares are calculated as follows:
Algebraically speaking -

SS_{Total} = Σ(Y_{ijk}^{2}) − G^{2}/N
where:
 SS_{Total} is the total sums of squares (or Σ(Y_{ijk} − Ȳ)^{2}, where Ȳ is the overall mean),
 Y_{ijk} is the value of the observation in row j, column k and treatment group i,
 G is the overall total (or ΣY_{ijk}) and N is the total number of observations.
SS_{R (Rows)} = Σ(TR_{j}^{2})/a − G^{2}/N
where:
 SS_{R} is the rows sums of squares (or aΣ(Ȳ_{R} − Ȳ)^{2}, where Ȳ_{R} is a row mean),
 TR_{j} is the sum of the observations in row j,
 a is the number of treatment levels (= number of rows = number of columns).
SS_{C (Columns)} = Σ(TC_{k}^{2})/a − G^{2}/N
where:
 SS_{C} is the columns sums of squares (or aΣ(Ȳ_{C} − Ȳ)^{2}, where Ȳ_{C} is a column mean),
 TC_{k} is the sum of the observations in column k,
 a is the number of treatment levels (= number of rows = number of columns).
SS_{A (Treatment)} = Σ(TA_{i}^{2})/a − G^{2}/N
where:
 SS_{A} is the treatment sums of squares (or aΣ(Ȳ_{A} − Ȳ)^{2}, where Ȳ_{A} is a treatment mean),
 TA_{i} is the sum of the observations in treatment group i,
 a is the number of treatment levels (= number of rows = number of columns).
SS_{residual} = SS_{Total} − SS_{A} − SS_{R} − SS_{C}
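As with the randomized block design, these formulae can be verified numerically. The 4 × 4 square below is invented for illustration, with a cyclic treatment layout (treatment (j + k) mod 4 in row j, column k); it is a sketch of the arithmetic only.

```python
# Latin square sums of squares from the computational formulae above.
# y[j][k] is the (invented) observation in row j, column k;
# trt[j][k] is the treatment (0..3) applied to that plot.
a = 4
trt = [[(j + k) % a for k in range(a)] for j in range(a)]   # cyclic Latin square
y = [
    [10.0, 14.0, 12.0, 16.0],
    [11.0, 15.0, 14.0, 17.0],
    [13.0, 16.0, 15.0, 20.0],
    [12.0, 17.0, 16.0, 19.0],
]
N = a * a
G = sum(sum(row) for row in y)                               # grand total
TR = [sum(row) for row in y]                                 # row totals
TC = [sum(y[j][k] for j in range(a)) for k in range(a)]      # column totals
TA = [0.0] * a                                               # treatment totals
for j in range(a):
    for k in range(a):
        TA[trt[j][k]] += y[j][k]

cf = G**2 / N
ss_total = sum(v**2 for row in y for v in row) - cf
ss_rows = sum(t**2 for t in TR) / a - cf
ss_cols = sum(t**2 for t in TC) / a - cf
ss_treat = sum(t**2 for t in TA) / a - cf
ss_resid = ss_total - ss_treat - ss_rows - ss_cols

df_resid = (a - 1) * (a - 2)
f_treat = (ss_treat / (a - 1)) / (ss_resid / df_resid)       # MS_A / MS_Res

print(ss_rows, ss_cols, ss_treat, ss_resid, f_treat)
```

Note that the residual is left with (a − 1)(a − 2) = 6 degrees of freedom, the denominator used for all three F-ratios above.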
If rows and columns are both taken as fixed factors, the standard error of a treatment mean is given by:

SE_{Treatment mean} = √(MS_{residual}/a)
If rows and columns are taken as random factors, then presumably one would use [MS_{residual} + MS_{rows} + MS_{columns}] as the numerator, although we have never come across this being done.
Whether rows and columns are fixed or random, the standard error of the difference between two treatment means is given by:

SE_{Diff between means} = √(2MS_{residual}/a)
Assumptions
The same assumptions as for a one-factor ANOVA must also hold for blocked ANOVA, namely:
 Random sampling (equal probability)
 Independence of errors (within the constraint of restricted randomization)
 Homogeneity of variances
 Normal distribution of errors
 Effects are additive.
But in addition, if there are more than two treatment levels, the restricted allocation of treatments to plots within a block introduces a further assumption, namely:
 Homogeneity of covariances.
This assumption is introduced because units within the same block are correlated with each other. This is not a problem if the degree of correlation within each block is the same. However, if it differs (in other words if covariances differ) then the Type I error rate will be increased above the nominal level. For the randomized block design, where treatment is allocated randomly within each block, this assumption will generally be met, which is why it is seldom mentioned in introductory texts. But the assumption must also be met in repeated measures designs, where treatment may not be randomly allocated; in that situation covariances are unlikely to be homogeneous. We therefore wait until the section on repeated measures designs to discuss this assumption, although you will encounter the test for homogeneity of covariances (or to be more precise a test for sphericity) in the worked example.
There is one further assumption that must be made for an unbiased assessment of the
treatment effect if 'blocks' is a fixed rather than a random factor:
 There is no interaction between the block factor and the treatment factor.
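One way of checking this additivity assumption is Tukey's one-degree-of-freedom test of non-additivity (listed under related topics below). The following is a generic sketch of that test for a treatments × blocks table with one observation per cell; the data are invented, with the last cell inflated to mimic non-additivity, and the implementation follows the standard textbook form of the test rather than any worked example in this text.

```python
# Sketch of Tukey's one-degree-of-freedom test of non-additivity.
# y[i][j] is the (invented) observation for treatment i in block j.
y = [
    [8.0, 10.0, 12.0],
    [9.0, 11.0, 13.0],
    [11.0, 13.0, 18.0],   # last cell bumped up to mimic non-additivity
]
a, s = len(y), len(y[0])
grand = sum(sum(row) for row in y) / (a * s)
ri = [sum(y[i]) / s - grand for i in range(a)]                        # treatment effects
cj = [sum(y[i][j] for i in range(a)) / a - grand for j in range(s)]   # block effects

# Tukey's single-df sum of squares for non-additivity
num = sum(ri[i] * cj[j] * y[i][j] for i in range(a) for j in range(s))
ss_nonadd = num**2 / (sum(r**2 for r in ri) * sum(c**2 for c in cj))

# Residual SS after removing additive main effects
ss_resid = sum((y[i][j] - grand - ri[i] - cj[j])**2
               for i in range(a) for j in range(s))

# F on 1 and (a-1)(s-1)-1 degrees of freedom
df_rem = (a - 1) * (s - 1) - 1
f_nonadd = ss_nonadd / ((ss_resid - ss_nonadd) / df_rem)
print(ss_nonadd, f_nonadd)
```

A large F here suggests a multiplicative block × treatment structure, in which case a transformation (for example logarithmic) may restore additivity.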
Related topics:
 Friedman's two-way ANOVA
 Tukey's test of non-additivity

