Biology, images, analysis, design...
"It has long been an axiom of mine that the little things are infinitely the most important"
Randomized experiments
On this page: Characteristics; Global randomization, completely randomized designs; Nested designs; Stratified randomization, blocked designs; Stratified clinical trials; Randomized blocks; Latin squares; Factorial designs; Partially nested, split-plot designs; Repeated measures
In an experimental study, the experimenter manipulates or controls the level of the explanatory variable(s). This is in contrast to observational studies where the level of the explanatory variable is either self-selected by the unit concerned or has been imposed haphazardly. In an experiment, the two (or more) levels of the explanatory variable(s) are randomly allocated as treatments usually to a number of independent replicated experimental units. This is again in contrast to an observational study where there is no random allocation of treatments to the (sampling) units. In medical research such experiments are termed randomized controlled trials. Strictly speaking, the term 'controlled' refers to the allocation of treatment to individuals being under the control of the experimenter, although it can also be taken to refer to the presence of a control group for comparison with the treated group. Most randomized experiments are parallel trials - in other words treatments are allocated to two (or more) parallel groups of experimental units, and treatment remains the same throughout the course of the experiment.
The term experiment is sometimes used where there is manipulation, but no random allocation. However, the element of random
If only a small number of experimental units are available (for example, plots of land in agricultural trials), then it can no longer be assumed that treatment groups will be balanced as regards potentially confounding variables. Most plots allocated to one treatment level may (by chance) fall in a more fertile area, whilst plots allocated to another treatment level fall in less fertile areas. The same applies if experimental units are cattle which vary in age - treated cattle may by chance be the older cattle, whilst untreated are younger cattle. In this situation it is important that there is adequate interspersion of treatments through the process of stratification - in other words that treatments are assigned randomly within particular strata or blocks of relatively homogeneous units.
We have noted above the need for replication of each level of the explanatory variable. This is certainly the case if the explanatory variable is a nominal variable. But there is an exception to this general rule if the explanatory variable is an ordinal or (especially) a measurement variable. In this situation, a regression design may be appropriate in which (many) different levels of the explanatory variable are randomly allocated to individual experimental units. Such an experiment would then be best analyzed using a regression model rather than the more commonly used analysis of variance. In this case only one replicate would be required for each level, although in practice it is more common to use a replicated regression design with n replicates for each level. Choice of appropriate levels for a regression design depends on the form of the relationship between the response and explanatory variables. If the response is multiplicative rather than linear, values of the explanatory variable should be evenly spaced on a logarithmic rather than arithmetic scale.
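As a rough illustration of choosing levels for a regression design, the sketch below spaces the levels evenly on a logarithmic scale and then allocates them at random, one level per unit. The function names, the dose range and the unit labels are all hypothetical, chosen only for the example.

```python
import random

def log_spaced_levels(lowest, highest, n_levels):
    """Return n_levels values evenly spaced on a logarithmic scale."""
    ratio = (highest / lowest) ** (1 / (n_levels - 1))
    return [lowest * ratio ** i for i in range(n_levels)]

def allocate_levels(units, levels, rng):
    """Randomly assign one level to each unit (unreplicated regression design)."""
    shuffled = list(levels)
    rng.shuffle(shuffled)
    return dict(zip(units, shuffled))

rng = random.Random(1)  # fixed seed only so that the sketch is repeatable
doses = log_spaced_levels(1.0, 1000.0, 4)  # successive doses differ by a constant ratio
units = ["unit%d" % i for i in range(1, 5)]
plan = allocate_levels(units, doses, rng)
```

For a replicated regression design one could repeat each level n times before shuffling.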
We also noted above that the replicates in an experiment should be independent. The principle of independent replication is extremely important and applies to both observational designs and randomized experiments. The issue is also controversial because it can be very difficult to obtain independent replication. Hence we devote a separate More Information page to the topic of
Global randomization (completely randomized designs)
In global randomization each unit has an equal probability of receiving any treatment, at least at the start of the randomization process. In other words, treatments are allocated to units irrespective of any differences between them. This applies whether those differences be in age or sex (in clinical trials) or in position in a field (in agricultural trials).
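A minimal sketch of this idea, assuming two treatments and hypothetical unit labels: each unit is given a treatment independently and with equal probability, ignoring every characteristic of the unit.

```python
import random

def simple_randomization(units, treatments, rng):
    """Assign each unit a treatment chosen independently with equal probability."""
    return {unit: rng.choice(treatments) for unit in units}

rng = random.Random(42)  # fixed seed only so that the sketch is repeatable
treatments = ["control", "treated"]
units = ["unit%02d" % i for i in range(1, 13)]
allocation = simple_randomization(units, treatments, rng)

# Nothing in this scheme forces the groups to be the same size
group_sizes = {t: list(allocation.values()).count(t) for t in treatments}
```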
If you are to use a completely randomized design, you must assume either:
In laboratory experiments it is sometimes possible to control all potentially confounding factors, so one can use global randomization even with only moderate numbers in each treatment group. However, outside the laboratory, large (or very large) numbers of experimental units must be used in each treatment group to balance confounding factors. Hence the use of large numbers of participants in most clinical trials.
A further complication is that there are two types of global randomization.
Variation in treatment group size does not matter too much if your groups are very large. But for small or medium group sizes, variation in group sizes can seriously reduce the power of a study. Hence the widespread use of restricted randomization. For many types of experiment the total number of units is known in advance, so it is straightforward to assign treatments to identical numbers of units. In clinical trials (where participants are recruited over a period of time) other methods are used. One such is random permuted blocks where randomizations are blocked over time. As well as tending to minimize size discrepancies between groups, this procedure also protects against unknown time trends in the characteristics of arriving patients. Another method is the biased coin method where at each allocation the probability of assigning to the smaller group is made greater than 0.5.
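The two methods just described can be sketched as follows. This is an illustration under stated assumptions rather than a recommended procedure: the block size is simply the number of treatments, and the probability of 2/3 used for the biased coin is an arbitrary illustrative value greater than 0.5.

```python
import random

def permuted_block_sequence(treatments, n_blocks, rng):
    """Concatenate n_blocks blocks, each a random permutation of the treatments."""
    sequence = []
    for _ in range(n_blocks):
        block = list(treatments)
        rng.shuffle(block)
        sequence.extend(block)
    return sequence

def biased_coin_sequence(n_units, rng, p_smaller=2 / 3):
    """Allocate to A or B, favouring whichever group is currently smaller."""
    counts = {"A": 0, "B": 0}
    sequence = []
    for _ in range(n_units):
        if counts["A"] == counts["B"]:
            choice = rng.choice(["A", "B"])  # groups tied: use a fair coin
        else:
            smaller = min(counts, key=counts.get)
            larger = "B" if smaller == "A" else "A"
            choice = smaller if rng.random() < p_smaller else larger
        counts[choice] += 1
        sequence.append(choice)
    return sequence

rng = random.Random(7)  # fixed seed only so that the sketch is repeatable
blocked = permuted_block_sequence(["A", "B"], n_blocks=5, rng=rng)
coin = biased_coin_sequence(20, rng)
```

With permuted blocks the groups are exactly equal at the end of every block; with the biased coin they only tend towards equality.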
Do not confuse the various methods used to make treatment group sizes similar (such as random permuted blocks), with blocking in (for example) agricultural trials which is equivalent to stratification. The former is not usually taken account of in the analysis whilst the latter is.
Nested designs
Nearly all experiments (other than regression designs) have one level of nesting (replicates are nested in treatment), but by convention it is only termed a nested design if there are at least two levels of nesting. Nested designs may result in pseudoreplication if the evaluation units are wrongly treated as experimental units in the
Stratified randomization (blocked designs)
In stratified randomization the randomization process is restricted by grouping the experimental units into more or less homogeneous strata before the process of random allocation. This balances the contribution of such factors to each treatment group, and is often essential if there are only a small number of experimental units available. Designs using stratified randomization include the randomized block and Latin square designs. We deal with clinical trials separately below because the terminology used varies from that in other disciplines.
Stratified clinical trials
Strata are formed of patients with similar characteristics. For clinical trials age and gender are commonly used factors for stratification. If one has two categories for each, one would then have a total of four strata. The number of patients within each stratum may vary widely. A separate randomization list is drawn up for each stratum of patients, thus balancing the composition of such characteristics in each treatment group.
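As a sketch of drawing up a separate randomization list for each stratum, the code below forms strata from age group and sex and then allocates within each stratum. Building each list from permuted blocks is an assumption made for the example (the text does not specify how the lists are generated), and the patient records and treatment names are hypothetical.

```python
import random
from collections import defaultdict

def stratified_allocation(patients, treatments, rng):
    """patients: list of (patient_id, age_group, sex). Returns {patient_id: treatment}."""
    strata = defaultdict(list)
    for patient_id, age_group, sex in patients:
        strata[(age_group, sex)].append(patient_id)
    allocation = {}
    for members in strata.values():
        # Separate randomization list for this stratum (assumed: permuted blocks)
        stratum_list = []
        while len(stratum_list) < len(members):
            block = list(treatments)
            rng.shuffle(block)
            stratum_list.extend(block)
        for patient_id, treatment in zip(members, stratum_list):
            allocation[patient_id] = treatment
    return allocation

rng = random.Random(3)  # fixed seed only so that the sketch is repeatable
patients = [
    ("p%02d" % i, "under50" if i % 2 else "50plus", "F" if i % 3 else "M")
    for i in range(1, 21)
]
allocation = stratified_allocation(patients, ["new", "standard"], rng)
```

Two categories for each of two factors give four strata here, and the strata are of unequal size.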
Stratification in clinical trials should only be used if the researcher is reasonably certain that the factor will affect the value of the response variable. In fact there is still some dispute in medical journals about the need for, or desirability of, stratification. Certainly if the trial is large and random allocation is properly carried out, stratification may be an unnecessary complication. The stratification should always be explicitly recognised in the analysis.
The disadvantages of stratified randomization - especially in relation to dealing with large numbers of factors with small numbers of experimental units - can be overcome by using a different approach known as minimization. Treatment allocation is only done randomly for the first unit; after that it depends on the characteristics of those units already allocated. This equalizes the category totals between treatment groups rather than doing it for each individual stratum. The process ensures balance between groups for several factors, even in small samples - but is still not very popular amongst statisticians despite its strong advocacy by some. It is important to take account of the factors used for minimizing in the analysis - otherwise P-values will be invalid.
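One simple way to picture minimization is sketched below. It is an assumption-laden illustration, not a definitive algorithm: the imbalance score is just the sum of category totals over the factors, all factors are weighted equally, and ties (including the first unit) are broken at random. Published variants differ in all of these choices.

```python
import random

def minimization(patients, treatments, rng):
    """patients: list of {factor: category} dicts. Returns a list of treatments."""
    # totals[treatment][(factor, category)] = number of units already allocated
    totals = {t: {} for t in treatments}
    allocation = []
    for patient in patients:
        # Score each group by how many earlier units share this unit's categories
        scores = {
            t: sum(totals[t].get(item, 0) for item in patient.items())
            for t in treatments
        }
        lowest = min(scores.values())
        candidates = [t for t in treatments if scores[t] == lowest]
        choice = rng.choice(candidates)  # random only when groups are tied
        for item in patient.items():
            totals[choice][item] = totals[choice].get(item, 0) + 1
        allocation.append(choice)
    return allocation

rng = random.Random(5)  # fixed seed only so that the sketch is repeatable
patients = [
    {"age": "under50", "sex": "F"},
    {"age": "under50", "sex": "M"},
    {"age": "50plus", "sex": "F"},
    {"age": "50plus", "sex": "M"},
    {"age": "under50", "sex": "F"},
    {"age": "50plus", "sex": "M"},
]
allocation = minimization(patients, ["A", "B"], rng)
```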
The randomized block design
Stratification has always been used extensively in the agricultural context where the experimental units are plots of land, and the strata are described as blocks. In the classical randomized complete block design, plots that are similar are grouped together into blocks each with the same number of plots as the number of treatment levels. Treatments are then assigned at random within blocks with the restriction that each treatment occurs only once within each block. Because there is only one replicate per block, one has to assume that there is no interaction between the blocking factor and the treatment. Use of multiple replicates per block allows one to test whether there is any interaction between treatment and blocking factor - and is a much stronger design.
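The classical randomized complete block design can be sketched as an independent shuffle of the treatments within each block. The block, plot and treatment names below are hypothetical.

```python
import random

def randomized_complete_block(blocks, treatments, rng):
    """blocks: {block_name: [plot, ...]}, one plot per treatment in every block."""
    layout = {}
    for block_name, plots in blocks.items():
        if len(plots) != len(treatments):
            raise ValueError("each block needs exactly one plot per treatment")
        order = list(treatments)
        rng.shuffle(order)  # independent randomization within each block
        layout[block_name] = dict(zip(plots, order))
    return layout

rng = random.Random(11)  # fixed seed only so that the sketch is repeatable
treatments = ["control", "low", "high"]
blocks = {
    "block1": ["plot1", "plot2", "plot3"],
    "block2": ["plot4", "plot5", "plot6"],
    "block3": ["plot7", "plot8", "plot9"],
}
layout = randomized_complete_block(blocks, treatments, rng)
```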
The simplest form of the randomized block design is the matched pairs design. Pairs of units are matched for some characteristic (so the pairs are equivalent to blocks), and one or other treatment level is randomly allocated to each member of the pair. Because there are only two units in each block, the design can only deal with two levels of the treatment factor.
Matched pairs are commonly used in cluster randomized trials where groups of individuals (for example schools or villages) are randomized to receive different public health interventions. Clusters are often matched by geographic area or size of the cluster. Providing the number of clusters is moderate or there is a very high correlation between the matching factor and the response variable, matching will increase the power of the study. But for low correlation or very small numbers of clusters, matching can actually reduce power because of loss of degrees of freedom. The moral of the story is that matching may be appropriate, but this should not just be assumed.
The Latin square design
Whilst the randomized block design can only deal with one systematic source of variation, the Latin square design can deal with two such sources. It was first used widely in agricultural experiments where for example the study site may have a fertility gradient running East-West, and a moisture gradient running North-South. In a Latin square design we divide up the field into 'rows' and 'columns'. Each row and each column then contains one replicate of each treatment level. Hence the number of rows, the number of columns, and the number of treatment levels are all the same.
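One common way to produce such a layout, sketched here under stated assumptions, is to start from a cyclic square and then randomly permute its rows, columns and treatment labels. This does not sample uniformly from all possible Latin squares, but every square it produces has each treatment once in each row and once in each column.

```python
import random

def random_latin_square(treatments, rng):
    """Return an n x n list of lists with each treatment once per row and column."""
    n = len(treatments)
    # Cyclic starting square: cell (r, c) holds treatment index (r + c) mod n
    square = [[(r + c) % n for c in range(n)] for r in range(n)]
    rng.shuffle(square)  # permute rows
    col_order = list(range(n))
    rng.shuffle(col_order)  # permute columns
    square = [[row[c] for c in col_order] for row in square]
    labels = list(treatments)
    rng.shuffle(labels)  # permute treatment labels
    return [[labels[i] for i in row] for row in square]

rng = random.Random(13)  # fixed seed only so that the sketch is repeatable
treatments = ["A", "B", "C", "D"]
square = random_latin_square(treatments, rng)
```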
The Latin square design is what is known as a confounded design because the main effects of the treatment and blocking factors are confounded with the interactions between factors. Hence in order to analyse a single Latin square one has to assume that there are no interactions between any of the factors.
Factorial designs
Up till now any interaction between treatment and blocking factors has been considered a problem - and we have often just assumed that such interaction is not present. But of course in reality any treatment will be applied along with all other factors. Hence it makes sense to investigate how different factors interact.
In a complete factorial design combinations of the various levels of two or more treatment factors are randomly allocated to each unit. With two treatment factors each at two levels there will be 4 combinations; with two treatment factors each at three levels there will be 9 combinations. Since each level of one factor is present with each level of the other factor we can refer to factor A being fully crossed with (or orthogonal to) factor B. Each combination must be replicated at least twice if one is to assess the interaction between the two factors. The combinations are then used in one of the designs already considered such as the completely
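To make the crossing of factors concrete, the sketch below lists every combination of levels (their number is the product of the numbers of levels) and allocates the replicated combinations to units completely at random. The factor names, levels and unit labels are hypothetical.

```python
import itertools
import random

def factorial_allocation(factor_levels, n_replicates, units, rng):
    """factor_levels: {factor: [levels]}. Returns {unit: tuple of levels}."""
    combinations = list(itertools.product(*factor_levels.values()))
    plan = combinations * n_replicates
    if len(plan) != len(units):
        raise ValueError("need one unit per combination per replicate")
    rng.shuffle(plan)
    return dict(zip(units, plan))

rng = random.Random(17)  # fixed seed only so that the sketch is repeatable
factors = {"fertilizer": ["none", "added"], "water": ["low", "high"]}
units = ["unit%d" % i for i in range(1, 9)]
# Two replicates of each of the 2 x 2 = 4 combinations
plan = factorial_allocation(factors, n_replicates=2, units=units, rng=rng)
```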
Partially nested designs
It is often not possible to randomly allocate experimental units to treatment combinations. For example in agricultural experiments some treatments, such as irrigation or ploughing, can only feasibly be done over a large area. One way to alleviate this problem is to use a split-plot design where one has blocks (originally termed main plots), which are then divided into plots (originally termed split-plots). Different levels of treatment factor A (for example irrigated or not) are first randomly allocated to the blocks. Different levels of treatment factor B (for example insecticide application) are then randomly allocated within each block to the different plots.
In this design the blocks must be regarded as a factor which is nested in factor A. However, factor A is still crossed with factor B - hence it is a partially nested design. The split-plot design is a good example of where we have randomization at two levels: first to the main plots, then to the subplots.
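The two levels of randomization can be sketched as below. The sketch assumes equal numbers of blocks per level of factor A and one plot per level of factor B in each block; the names are hypothetical.

```python
import random

def split_plot(blocks, factor_a, factor_b, rng):
    """blocks: {block: [plot, ...]}. Returns {block: {"A": level, "plots": {...}}}."""
    names = list(blocks)
    if len(names) % len(factor_a) != 0:
        raise ValueError("number of blocks must be a multiple of the levels of A")
    a_levels = list(factor_a) * (len(names) // len(factor_a))
    rng.shuffle(a_levels)  # first randomization: factor A to whole blocks
    layout = {}
    for name, a_level in zip(names, a_levels):
        plots = blocks[name]
        if len(plots) != len(factor_b):
            raise ValueError("each block needs one plot per level of B")
        b_levels = list(factor_b)
        rng.shuffle(b_levels)  # second randomization: factor B within the block
        layout[name] = {"A": a_level, "plots": dict(zip(plots, b_levels))}
    return layout

rng = random.Random(19)  # fixed seed only so that the sketch is repeatable
blocks = {"block%d" % i: ["plot%da" % i, "plot%db" % i] for i in range(1, 5)}
factor_a = ["irrigated", "not irrigated"]
factor_b = ["insecticide", "no insecticide"]
layout = split_plot(blocks, factor_a, factor_b, rng)
```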
Repeated measures designs
A repeated measures design is one where repeated measurements are made on the same experimental unit (either an individual or a plot) over time. This design can be very useful to eliminate the variability between different units. But it can have major limitations depending on the type of repeated measures design.
In one type of repeated measures design, units are randomly allocated different treatment levels at the start of the experiment, and then continue to receive the same treatment level throughout the time period. Time itself is a factor in the design, and the order of this factor obviously cannot be randomized. This is sometimes termed a subjects × trial design. Since the observations over time are all on the same unit, they cannot be treated as independent replicates. We can describe the subjects as being nested within treatments. This is analogous to the split-plot design where the blocks (= subjects) are nested in treatments. There are of course differences between the split plot and subjects × trial
The other type of repeated measures design is very different in that treatments are randomly allocated within units rather than between units. If only two treatments are to be compared, then it is termed a crossover design. Experimental units are randomly assigned to one or other of two sequence groups. Units in sequence group I receive treatment A1 followed by A2. Units in sequence group II receive treatment A2 followed by A1. If treatments are repeatedly alternated, it may be described as a multiple crossover design.
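A two-treatment crossover allocation might be sketched as follows. Splitting the units into two sequence groups of equal size is an assumption made for the example, and the unit labels are hypothetical.

```python
import random

def crossover_allocation(units, rng):
    """Randomly split units into sequence groups A1 then A2, and A2 then A1."""
    if len(units) % 2 != 0:
        raise ValueError("this sketch assumes an even number of units")
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    plan = {}
    for unit in shuffled[:half]:
        plan[unit] = ("A1", "A2")  # sequence group I
    for unit in shuffled[half:]:
        plan[unit] = ("A2", "A1")  # sequence group II
    return plan

rng = random.Random(23)  # fixed seed only so that the sketch is repeatable
units = ["subject%d" % i for i in range(1, 11)]
plan = crossover_allocation(units, rng)
```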
A crossover design comprising three or more treatments is best done using a multiperiod Latin square design (also known as a round-robin design). In recent years it has been used widely for testing of trap designs and odour attractants for insects. The main sources of variation are trap site, day and trap/odour type. Within a square, one must have the same number of sites, days and trap types. The different traps or odour attractants are rotated round to their next designated positions at the end of each day.
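The daily rotation can be sketched as a cyclic shift from a random starting arrangement, so that over one square every trap type visits every site once. The simple one-step rotation is an assumption: actual trials may use a different designated order, for example one balanced for carry-over between days. The trap and site names are hypothetical.

```python
import random

def rotation_schedule(trap_types, sites, rng):
    """Return one {site: trap} dict per day for a square of len(sites) days."""
    n = len(trap_types)
    if len(sites) != n:
        raise ValueError("need the same number of sites as trap types")
    start = list(trap_types)
    rng.shuffle(start)  # random starting arrangement on the first day
    schedule = []
    for day in range(n):
        # Each trap moves on by one site per day
        schedule.append(
            {site: start[(i - day) % n] for i, site in enumerate(sites)}
        )
    return schedule

rng = random.Random(29)  # fixed seed only so that the sketch is repeatable
traps = ["trap A", "trap B", "trap C"]
sites = ["site 1", "site 2", "site 3"]
schedule = rotation_schedule(traps, sites, rng)
```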