InfluentialPoints.com
Biology, images, analysis, design...
"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)

 

 

Principles

We introduced the principles of blocked designs in Unit 7 but will recap here. In blocked designs the experimental units are first divided into (relatively) homogeneous groups which constitute the blocks or strata. The aim is to minimize the variance among units within blocks relative to the variance among blocks. Treatment levels are then assigned randomly to experimental units within each block.

The commonest design, known as the randomized complete block design (RCBD), has one unit assigned to each treatment level per block. Provided block is a truly random factor, and there really is no interest in comparing blocks, this can be the most efficient design. The alternative is to have several replicates of each treatment per block (sometimes termed a generalized randomized block design). The advantage of replicating treatments within each block is that any interaction between blocks and treatments can be evaluated (see below); replication is strongly recommended if the blocks represent a clear environmental gradient (for example soil moisture content). However, precision usually decreases as the number of experimental units (or size of units) per block increases. We deal with analysis of the generalized randomized block design in the More Information page on Factorial ANOVA.
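The allocation step for a randomized complete block design can be sketched in a few lines of Python. This is only an illustration: the function name `randomize_rcbd`, the treatment labels and the seed are our own assumptions, not part of any package.

```python
import random

def randomize_rcbd(treatments, n_blocks, seed=None):
    """Return one random treatment order per block (one unit per treatment)."""
    rng = random.Random(seed)          # seed is only there to make the sketch repeatable
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)       # every treatment level appears once in each block
        rng.shuffle(order)             # independent randomization within the block
        layout.append(order)
    return layout

# Hypothetical example: four treatment levels in three blocks.
layout = randomize_rcbd(["A", "B", "C", "D"], n_blocks=3, seed=1)
for block_number, order in enumerate(layout, start=1):
    print(f"Block {block_number}: {order}")
```

Each inner list gives the treatment assigned to units 1 to a within that block. Randomization is restricted: treatments are shuffled within blocks, never across them.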

If there are two blocking factors, then the Latin square design may be appropriate. However, Latin squares are much less used than randomized block designs and make additional (sometimes highly questionable) assumptions. If there is a very large number of treatment levels (often the case in agricultural variety trials), it may not even be possible to have every treatment level within each block. Instead a carefully selected set of treatment levels is put in each block, giving a randomized incomplete block design. Such designs are not recommended unless unavoidable.


 

Randomized complete block ANOVA

Model & expected mean squares

We will assume a mixed model for the randomized block design, with the treatment effect fixed and the block effect random (but see the discussion on this issue in the core text). We have included two confounded (unmeasurable) effects in the model: firstly the interaction between block and treatment, and secondly the restriction error component generated by the restricted randomization.

Factor A fixed, factor S random

$$Y_{ij} = \mu + \alpha_i + S_j + [\delta_j] + [(\alpha S)_{ij}] + \varepsilon_{ij}$$
where:
  • $Y_{ij}$ is the observation for treatment i in block j,
  • $\mu$ is the population (grand) mean,
  • $\alpha_i$ is the fixed effect for the ith level of factor A,
  • $S_j$ is the random effect for the jth block,
  • $[\delta_j]$ is the confounded restriction error effect,
  • $[(\alpha S)_{ij}]$ is the confounded interaction effect between treatments and blocks,
  • $\varepsilon_{ij}$ is the random error effect.

| Source of variation | df | Expected MS | VC estimate or F-ratio |
|---|---|---|---|
| 1. Blocks (S) | $s-1$ | $\sigma^2 + [a\sigma^2_{\delta}] + a\sigma^2_S$ | $VC = (MS_1 - MS_3)/a$ |
| 2. Treatment (A) | $a-1$ | $\sigma^2 + [\sigma^2_{\alpha S}] + s\sum\alpha_i^2/(a-1)$ | $MS_2/MS_3$ |
| 3. Residual | $(a-1)(s-1)$ | $\sigma^2 + [\sigma^2_{\alpha S}]$ | |
| Total variation | $N-1$ | | |

where
  • a is the number of levels of the treatment factor (A),
  • s is the number of blocks and N is the total number of observations (= as),
  • $\sigma^2$ is the error variance,
  • $[\sigma^2_{\delta}]$ is the confounded restriction error component,
  • $[\sigma^2_{\alpha S}]$ is the confounded treatment x blocks interaction component,
  • $s\sum\alpha_i^2/(a-1)$ is the added treatment component,
  • $\sigma^2_S$ is the block variance component.

Examination of the expected mean squares shows that we can obtain an unbiased test of the treatment effect using the residual mean square as the denominator in the F-ratio. The F-ratio for the treatment effect is therefore obtained by dividing $MS_A$ by $MS_{Res}$. The P-value for this F-ratio is obtained for a − 1 and (s − 1)(a − 1) degrees of freedom.

There is no unbiased test of the block effect unless we assume there is no restriction error and no treatment × block interaction. If we make those assumptions, an approximate F-ratio for the block effect is obtained by dividing $MS_S$ by $MS_{Res}$. The P-value for this F-ratio is obtained for s − 1 and (s − 1)(a − 1) degrees of freedom. This value can be used to assess whether it was worthwhile using a blocked versus a completely randomized design. If block is assumed to be a random factor, one may instead wish to estimate the added variance component.
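As a minimal sketch of how the table translates into arithmetic, the function below takes the three mean squares and returns the F-ratios, their degrees of freedom, and the block variance component estimate. The function name and the mean square values are hypothetical; a P-value would additionally need the upper tail of the F distribution (for example `scipy.stats.f.sf`, which is not used here).

```python
def rcbd_tests(ms_blocks, ms_treat, ms_resid, a, s):
    """F-ratios and block variance component for a randomized complete block design."""
    df_resid = (a - 1) * (s - 1)
    return {
        # Unbiased test of the treatment effect: MS_A / MS_Res
        "F_treatment": ms_treat / ms_resid,
        "df_treatment": (a - 1, df_resid),
        # Approximate test of blocks (assumes no restriction error, no interaction)
        "F_blocks": ms_blocks / ms_resid,
        "df_blocks": (s - 1, df_resid),
        # Added variance component for random blocks: (MS_S - MS_Res) / a
        "vc_blocks": (ms_blocks - ms_resid) / a,
    }

# Hypothetical mean squares for a = 4 treatments in s = 5 blocks.
result = rcbd_tests(ms_blocks=12.0, ms_treat=30.0, ms_resid=2.0, a=4, s=5)
for name, value in result.items():
    print(name, value)
```

A negative variance component estimate can occur when the block mean square is smaller than the residual mean square; the sketch returns it as is rather than truncating it at zero.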

Great care must be taken when analyzing randomized block designs with statistical packages. The widely used general linear model cannot accommodate random factors - it assumes all factors are fixed. This produces what are called narrow sense estimates of the standard errors. These represent variation over repetitions of the experiment only if one uses exactly the same blocks and simply re-randomizes the assignment of treatments to the experimental units.

Fortunately the standard error of a difference between two least squares means is the same whichever model is used, because the difference between two means does not involve the blocking factor. Thus, inferences for pairwise differences are unaffected. But estimated confidence intervals of means are much too narrow. If blocks are random, we really need broad sense estimates of the standard error, which would correspond to repetitions of the experiment with another sample of blocks. Some statistical packages (including SAS and R) can now analyze mixed model ANOVAs by fitting the random effects using maximum likelihood techniques.

 

Computational formulae

We will take a balanced experiment with 'a' group (= treatment) levels, each replicated once in 's' blocks.

Group (treatment) totals are denoted as TA1 to TAa, block totals as TS1 to TSs and the grand total as G.

The total, block, group and residual sums of squares are calculated as follows:

Algebraically speaking:

$$SS_{Total} = \sum Y_{ij}^2 - \frac{G^2}{N}$$
where:
  • $SS_{Total}$ is the total sums of squares (or $\sum(Y_{ij} - \bar{Y})^2$, where $\bar{Y}$ is the overall mean),
  • $Y_{ij}$ is the value of the ijth observation in block j and treatment group i,
  • G is the overall total (or $\sum Y_{ij}$) and N is the total number of observations.

$$SS_S\ (\text{Blocks}) = \frac{\sum T_{Sj}^2}{a} - \frac{G^2}{N}$$
where:
  • $SS_S$ is the blocks sums of squares (or $a\sum(\bar{Y}_{Sj} - \bar{Y})^2$),
  • $T_{Sj}$ is the sum of the observations in block j,
  • a is the number of treatment levels.

$$SS_A\ (\text{Treatment}) = \frac{\sum T_{Ai}^2}{s} - \frac{G^2}{N}$$
where:
  • $SS_A$ is the treatment sums of squares (or $s\sum(\bar{Y}_{Ai} - \bar{Y})^2$),
  • $T_{Ai}$ is the sum of the observations in treatment group i,
  • s is the number of blocks.

$$SS_{residual} = SS_{Total} - SS_A - SS_S$$
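The computational formulae can be checked with a short Python sketch. The data are invented for illustration (three treatments in three blocks), and the function name and the layout of `data` (one row per block, one column per treatment) are our own assumptions.

```python
def rcbd_sums_of_squares(data):
    """data[j][i] is the observation for block j and treatment i (one per cell)."""
    s = len(data)                      # number of blocks
    a = len(data[0])                   # number of treatment levels
    n = a * s                          # total number of observations
    grand_total = sum(sum(row) for row in data)
    correction = grand_total ** 2 / n  # G^2 / N
    ss_total = sum(y ** 2 for row in data for y in row) - correction
    block_totals = [sum(row) for row in data]
    treat_totals = [sum(row[i] for row in data) for i in range(a)]
    ss_blocks = sum(t ** 2 for t in block_totals) / a - correction
    ss_treat = sum(t ** 2 for t in treat_totals) / s - correction
    ss_resid = ss_total - ss_treat - ss_blocks
    return {
        "ss_total": ss_total,
        "ss_blocks": ss_blocks,
        "ss_treat": ss_treat,
        "ss_resid": ss_resid,
        "ms_blocks": ss_blocks / (s - 1),
        "ms_treat": ss_treat / (a - 1),
        "ms_resid": ss_resid / ((a - 1) * (s - 1)),
    }

# Hypothetical data: rows are blocks, columns are treatments.
data = [
    [10, 12, 14],
    [11, 14, 17],
    [9, 13, 17],
]
ss = rcbd_sums_of_squares(data)
for name, value in ss.items():
    print(name, value)
```

Dividing each sum of squares by its degrees of freedom gives the mean squares used in the F-ratios described earlier.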

If blocks are taken as a fixed factor, the standard error of a treatment mean is given by

$$SE_{\text{Treatment mean}} = \sqrt{\frac{MS_{residual}}{s}}$$

If blocks are taken as a random factor, the standard error of a treatment mean is given by

$$SE_{\text{Treatment mean}} = \sqrt{\frac{MS_{residual} + MS_{blocks}}{s}}$$

Whether blocks are fixed or random, the standard error of the difference between means is given by:

$$SE_{\text{Diff between means}} = \sqrt{\frac{2\,MS_{residual}}{s}}$$
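A small sketch of the narrow sense (fixed blocks) standard error of a mean and the standard error of a difference, both of which depend only on the residual mean square and the number of blocks. The function names and numerical values are hypothetical.

```python
import math

def se_treatment_mean_fixed_blocks(ms_resid, s):
    """Narrow sense standard error of a treatment mean (blocks treated as fixed)."""
    return math.sqrt(ms_resid / s)

def se_difference(ms_resid, s):
    """Standard error of the difference between two treatment means."""
    return math.sqrt(2 * ms_resid / s)

# Hypothetical values: residual mean square of 2.0 from s = 8 blocks.
se_mean = se_treatment_mean_fixed_blocks(ms_resid=2.0, s=8)
se_diff = se_difference(ms_resid=2.0, s=8)
print(f"SE of a treatment mean (fixed blocks): {se_mean:.4f}")
print(f"SE of a difference between means:      {se_diff:.4f}")
```

The standard error of a difference is always the square root of 2 times the fixed-block standard error of a mean, which is why pairwise comparisons are unaffected by how blocks are treated.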

 

Pooling

After using a randomized block design, it is not unusual to find that the block effect is not only not significant, but so small that it would have been better to have not blocked in the first place. It might then be tempting to reanalyze the data using a completely randomized design in order to gain degrees of freedom. In fact this approach is specifically recommended by some statisticians when analyzing matched pairs cluster randomized trials. Statisticians (as usual) do not agree on this issue, but the predominant view is that pooling would represent another case of pseudoreplication. Treatments have clearly not been allocated at random overall, but only within blocks. Hence it would be incorrect to ignore the blocks in the analysis of the experiment. Moreover, if blocks are left in the model, the resulting P-values closely approximate randomization test P-values. Conceptually, therefore, the P-values are tied directly to the chance mechanism involved in randomization.
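The link between the blocked analysis and the randomization mechanism can be illustrated with a small exact randomization test, in which observations are permuted only within blocks. This is a sketch under stated assumptions: the data are invented, the function names are our own, and using the treatment F-ratio as the test statistic is one reasonable choice rather than the only one. Full enumeration is only feasible for very small designs.

```python
from itertools import permutations, product

def treatment_f(data):
    """Treatment F-ratio for data[j][i] = observation in block j, treatment i."""
    s, a = len(data), len(data[0])
    correction = sum(sum(row) for row in data) ** 2 / (a * s)
    ss_total = sum(y * y for row in data for y in row) - correction
    ss_blocks = sum(sum(row) ** 2 for row in data) / a - correction
    ss_treat = sum(sum(row[i] for row in data) ** 2 for i in range(a)) / s - correction
    ss_resid = ss_total - ss_blocks - ss_treat
    if ss_resid <= 1e-12:              # perfectly additive arrangement
        return float("inf")
    return (ss_treat / (a - 1)) / (ss_resid / ((a - 1) * (s - 1)))

def randomization_p_value(data):
    """Proportion of within-block rearrangements with F at least as large as observed."""
    observed = treatment_f(data)
    count = total = 0
    # Restricted randomization: permute observations within each block only.
    for arrangement in product(*(permutations(row) for row in data)):
        total += 1
        if treatment_f(arrangement) >= observed - 1e-9:
            count += 1
    return count / total

# Hypothetical data: rows are blocks, columns are treatments.
data = [
    [10, 12, 14],
    [11, 14, 17],
    [9, 13, 17],
]
p_value = randomization_p_value(data)
print(f"Exact randomization P-value: {p_value:.4f}")
```

Permuting across blocks instead would correspond to a completely randomized design, which is not how the treatments were actually allocated.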

We have said the aim is to minimize the variance among units within blocks relative to the variance among blocks. But that does not necessarily mean we should try to maximize differences between blocks. If there is a strong interaction between treatment and blocks, then maximizing differences between blocks may make the situation worse. We consider this again below in relation to the Latin square ANOVA.

 

Very small F-ratios

Since the main interest is in whether F-ratios are significantly large, it is not surprising that little attention is usually paid to an F-ratio that is unusually small. If the model is correct and all assumptions are satisfied, then the ratios of the block : residual and treatment : residual mean squares should be either near 1.0 or greater than 1.0. If the value is near 0.0, Meek et al. (2007) have argued that it may indicate potential problems with the design or analysis, and should therefore be treated as a red flag and investigated. Possible causes for values near zero include non-additivity in the model (for example multiplicative effects), violations of distributional assumptions, one or more factors omitted from the model, and/or lack of fit. It may of course simply be a chance occurrence, but all other possibilities should be eliminated before that conclusion is reached.

 


Latin square ANOVA

Model & expected mean squares

We will assume for the Latin square design that the treatment effect is fixed, whilst the row and column effects are random.

Factor A fixed, factors R (rows) & C (columns) random

$$Y_{ijk} = \mu + \alpha_i + R_j + C_k + [(\alpha R)_{ij}] + [(\alpha C)_{ik}] + [(RC)_{jk}] + [(\alpha RC)_{ijk}] + \varepsilon_{ijk}$$
where:

  • Yijk is the observation for treatment i in row j and column k,
  • μ is the population (grand) mean,
  • αi is the fixed effect for the ith level of factor A,
  • Rj is the random effect for the jth row,
  • Ck is the random effect for the kth column,
  • all interaction effects between treatments, rows and columns shown in [] are assumed to be zero
  • εijk is the random error effect.

| Source of variation | df | Expected MS | Variance ratio |
|---|---|---|---|
| 1. Rows | $a-1$ | $\sigma^2 + a\sigma^2_R$ | $MS_1/MS_4$ |
| 2. Columns | $a-1$ | $\sigma^2 + a\sigma^2_C$ | $MS_2/MS_4$ |
| 3. Treatment | $a-1$ | $\sigma^2 + a\sum\alpha_i^2/(a-1)$ | $MS_3/MS_4$ |
| 4. Residual | $(a-1)(a-2)$ | $\sigma^2 + [\sigma^2_{int}]$ | |
| 5. Total | $N-1$ | | |

where
  • a is the number of treatments, rows and columns,
  • N is the total number of observations (= a²),
  • $\sigma^2$ is the error variance,
  • $a\sum\alpha_i^2/(a-1)$ is the added treatment component,
  • $\sigma^2_R$ is the row variance component,
  • $\sigma^2_C$ is the column variance component,
  • $[\sigma^2_{int}]$ is the sum of all interaction components assumed to be zero.

The F-ratio for the treatment effect (assuming no interaction effects) is obtained by dividing $MS_A$ by $MS_{Res}$. The P-value for this F-ratio is obtained for a − 1 and (a − 1)(a − 2) degrees of freedom.

Approximate F-ratios for the row and column effects (assuming no restriction error and no interaction effects) are obtained by dividing $MS_R$ and $MS_C$ by $MS_{Res}$. The P-values for these F-ratios are obtained for a − 1 and (a − 1)(a − 2) degrees of freedom. These values can be used to assess whether it was worthwhile using a Latin square versus a blocked or completely randomized design. If row and column are assumed to be random factors, one may instead wish to estimate the added variance components.
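A brief sketch of the Latin square tests, following the expected mean squares table: every F-ratio uses the residual mean square as denominator, with a − 1 and (a − 1)(a − 2) degrees of freedom. The function name and mean square values are hypothetical, and the variance component estimates assume rows and columns are random.

```python
def latin_square_tests(ms_rows, ms_cols, ms_treat, ms_resid, a):
    """F-ratios and variance components for an a x a Latin square."""
    df = (a - 1, (a - 1) * (a - 2))    # same df for rows, columns and treatment
    return {
        "F_treatment": ms_treat / ms_resid,
        "F_rows": ms_rows / ms_resid,
        "F_columns": ms_cols / ms_resid,
        "df": df,
        # Added variance components: (MS - MS_Res) / a
        "vc_rows": (ms_rows - ms_resid) / a,
        "vc_columns": (ms_cols - ms_resid) / a,
    }

# Hypothetical mean squares for a 4 x 4 Latin square.
tests = latin_square_tests(ms_rows=9.0, ms_cols=5.0, ms_treat=24.0, ms_resid=1.0, a=4)
for name, value in tests.items():
    print(name, value)
```

Note how few residual degrees of freedom a small square leaves: a 3 x 3 square has only 2, which is one practical reason Latin squares are often replicated.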

 

Computational formulae

In a Latin square design the number of group (= treatment) levels (a) will be the same as the number of rows and the number of columns. Group (treatment) totals are denoted as TA1 to TAa, row totals as TR1 to TRa, column totals as TC1 to TCa and the grand total as G.

The total, group, row, column, and residual sums of squares are calculated as follows:

Algebraically speaking:

$$SS_{Total} = \sum Y_{ijk}^2 - \frac{G^2}{N}$$
where:
  • $SS_{Total}$ is the total sums of squares (or $\sum(Y_{ijk} - \bar{Y})^2$, where $\bar{Y}$ is the overall mean),
  • $Y_{ijk}$ is the value of the observation in row j, column k and treatment group i,
  • G is the overall total (or $\sum Y_{ijk}$) and N is the total number of observations.

$$SS_R\ (\text{Rows}) = \frac{\sum T_{Rj}^2}{a} - \frac{G^2}{N}$$
where:
  • $SS_R$ is the rows sums of squares (or $a\sum(\bar{Y}_{Rj} - \bar{Y})^2$),
  • $T_{Rj}$ is the sum of the observations in row j,
  • a is the number of treatment levels (= number of rows = number of columns).

$$SS_C\ (\text{Columns}) = \frac{\sum T_{Ck}^2}{a} - \frac{G^2}{N}$$
where:
  • $SS_C$ is the columns sums of squares (or $a\sum(\bar{Y}_{Ck} - \bar{Y})^2$),
  • $T_{Ck}$ is the sum of the observations in column k,
  • a is the number of treatment levels (= number of rows = number of columns).

$$SS_A\ (\text{Treatment}) = \frac{\sum T_{Ai}^2}{a} - \frac{G^2}{N}$$
where:
  • $SS_A$ is the treatment sums of squares (or $a\sum(\bar{Y}_{Ai} - \bar{Y})^2$),
  • $T_{Ai}$ is the sum of the observations in treatment group i,
  • a is the number of treatment levels (= number of rows = number of columns).

$$SS_{residual} = SS_{Total} - SS_A - SS_R - SS_C$$
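The Latin square formulae can be sketched in the same way. Here `values` holds the observations by row and column, and `layout` holds the treatment label applied to each cell; both the data and the names are invented for illustration.

```python
def latin_square_ss(values, layout):
    """Sums of squares for an a x a Latin square.

    values[j][k] is the observation in row j, column k;
    layout[j][k] is the treatment label applied to that cell.
    """
    a = len(values)
    n = a * a
    grand_total = sum(sum(row) for row in values)
    correction = grand_total ** 2 / n  # G^2 / N
    ss_total = sum(y ** 2 for row in values for y in row) - correction
    row_totals = [sum(row) for row in values]
    col_totals = [sum(row[k] for row in values) for k in range(a)]
    treat_totals = {}
    for value_row, label_row in zip(values, layout):
        for y, label in zip(value_row, label_row):
            treat_totals[label] = treat_totals.get(label, 0) + y
    ss_rows = sum(t ** 2 for t in row_totals) / a - correction
    ss_cols = sum(t ** 2 for t in col_totals) / a - correction
    ss_treat = sum(t ** 2 for t in treat_totals.values()) / a - correction
    ss_resid = ss_total - ss_treat - ss_rows - ss_cols
    return {"ss_total": ss_total, "ss_rows": ss_rows, "ss_cols": ss_cols,
            "ss_treat": ss_treat, "ss_resid": ss_resid}

# Hypothetical 3 x 3 square: each treatment occurs once per row and per column.
layout = [["A", "B", "C"],
          ["B", "C", "A"],
          ["C", "A", "B"]]
values = [[9, 11, 13],
          [12, 15, 9],
          [16, 10, 13]]
ls = latin_square_ss(values, layout)
for name, value in ls.items():
    print(name, round(value, 4))
```

The sketch does not check that `layout` is a valid Latin square; in real use that check would be worth adding before trusting the treatment totals.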

If rows and columns are both taken as fixed factors, the standard error of a treatment mean is given by

$$SE_{\text{Treatment mean}} = \sqrt{\frac{MS_{residual}}{a}}$$

If rows and columns are taken as random factors, then presumably one would use [MSresidual + MScolumns + MSrows] as the numerator, although we have never come across this being done.

Whether rows and columns are fixed or random, the standard error of the difference between means is given by:

$$SE_{\text{Diff between means}} = \sqrt{\frac{2\,MS_{residual}}{a}}$$

 

 

Assumptions

The same assumptions as for a one-factor ANOVA must also hold for blocked ANOVA, namely:

  1. Random sampling (equal probability)
  2. Independence of errors (within the constraint of restricted randomization)
  3. Homogeneity of variances
  4. Normal distribution of errors
  5. Effects are additive.

But in addition, if there are more than two treatment levels, the restricted allocation of treatments to plots within a block introduces a further assumption - namely

  6. Homogeneity of covariances.
    This assumption is introduced because units within the same block are correlated with each other. This is not a problem if the degree of correlation within each block is the same. However, if it differs (in other words if covariances differ) then the Type I error rate will be increased above the nominal level. For the randomized block design where treatment is allocated randomly within each block, this assumption will generally be met, which is why it is seldom mentioned in introductory texts. But the assumption must also be met in repeated measures designs where treatment may not be randomly allocated; in that situation covariances are unlikely to be homogeneous. We therefore wait until the section on repeated measures designs to discuss this assumption, although you will encounter the test for homogeneity of covariances (or to be more precise a test for sphericity) in the worked example.

There is one further assumption that must be made for an unbiased assessment of the treatment effect if 'blocks' is a fixed rather than a random factor:

  7. There is no interaction between the block factor and the treatment factor.

Related topics:

Friedman's two way ANOVA

Tukey's test of non-additivity