InfluentialPoints.com
Biology, images, analysis, design...
"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)

 

 

The square root transformation

Where counts are large, the usual transformation is the log transformation. However, you should never assume that a particular transformation is suitable: always examine the data first! Our worked example, an experiment on weed control in cereals, has been quoted in many statistical texts.

Worked example

Number of weeds in cereal plots

Treatment    Blocks
               I     II    III     IV    Mean       s
A            438    442    319    380     395    57.9
B            538    422    377    315     413    94.2
C             77     61    157     52    86.8    48.0
D             17     31     87     16    37.8    33.5
E             18     26     77     20    35.3    28.0
F            115     57    100     45    79.3    33.5

The experiment was arranged in a randomised block design with one of each treatment per block. Treatment A was the control. Examination of the standard deviation for each treatment shows that the standard deviation tends to increase with the mean. Hence we may be able to use Taylor's Power Law to select the best transformation. The figure below shows a plot of log variance against log mean for each of the treatments.

{Fig. 4}
Stabat01.gif
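
The log variance against log mean relationship can be checked directly from the plot counts. The following is a minimal Python sketch rather than the authors' own calculation: it assumes base-10 logarithms, sample variances and an ordinary least-squares fit, and the names `weeds` and `ols_slope` are purely illustrative.

```python
import math
import statistics

# Weed counts per treatment (blocks I to IV), copied from the table above.
weeds = {
    "A": [438, 442, 319, 380],
    "B": [538, 422, 377, 315],
    "C": [77, 61, 157, 52],
    "D": [17, 31, 87, 16],
    "E": [18, 26, 77, 20],
    "F": [115, 57, 100, 45],
}


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# One point per treatment: log mean (x) and log sample variance (y).
log_mean = [math.log10(statistics.mean(v)) for v in weeds.values()]
log_var = [math.log10(statistics.variance(v)) for v in weeds.values()]

b = ols_slope(log_mean, log_var)   # Taylor's Power Law slope
power = 1 - b / 2                  # suggested power transformation
print(f"slope b = {b:.4f}, suggested power = {power:.4f}")
```

The base of the logarithm does not affect the slope, since both axes are rescaled by the same factor. The fitted slope should agree closely with the value of b quoted in the text.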

Number of weeds in cereal plots
(square root transformed)

Treatment    Mean (Ȳ')      s'       D
A                 19.8    1.49     392
B                 20.2    2.29     408
C                  9.1    2.39    82.8
D                  5.8    2.49    33.6
E                  5.6    2.12    31.4
F                  8.7    1.92    75.7

The slope (b) is 0.7466. The most appropriate power transformation is then obtained from:

Y'   =   Y^{1 − (b/2)}   =   Y^{0.6267}

This is close to a square root transformation (Y' = Y^{0.5}), so in the adjacent table we have square root transformed the data and recalculated the means (Ȳ') and standard deviations (s'). The standard deviations are now more similar between the treatments. The detransformed means (D), obtained by squaring the transformed means, are also given.
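
As a rough check on the square root transformed summary, the means, standard deviations and detransformed means can be recomputed from the raw counts. This is a hedged Python sketch using only the standard library; the dictionary `weeds` and the result layout are illustrative choices, not part of the original analysis.

```python
import math
import statistics

# Weed counts per treatment (blocks I to IV), copied from the worked example.
weeds = {
    "A": [438, 442, 319, 380],
    "B": [538, 422, 377, 315],
    "C": [77, 61, 157, 52],
    "D": [17, 31, 87, 16],
    "E": [18, 26, 77, 20],
    "F": [115, 57, 100, 45],
}

summary = {}
for treatment, counts in weeds.items():
    roots = [math.sqrt(y) for y in counts]   # Y' = Y^0.5
    mean_t = statistics.mean(roots)          # mean on the transformed scale
    sd_t = statistics.stdev(roots)           # sample SD on the transformed scale
    detrans = mean_t ** 2                    # back-transform by squaring
    summary[treatment] = (mean_t, sd_t, detrans)
    print(f"{treatment}: mean' = {mean_t:.1f}, s' = {sd_t:.2f}, D = {detrans:.1f}")

# Ratio of largest to smallest SD, before and after transformation.
raw_sds = [statistics.stdev(v) for v in weeds.values()]
sqrt_sds = [s for _, s, _ in summary.values()]
print(f"raw SD ratio = {max(raw_sds) / min(raw_sds):.1f}, "
      f"sqrt SD ratio = {max(sqrt_sds) / min(sqrt_sds):.1f}")
```

The detransformed means computed here from unrounded values may differ slightly from the tabulated D values, which appear to be the squares of the rounded transformed means (for example 19.8 squared is about 392).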

You may wonder what transformation would be suggested by a Box-Cox transformation. When there is a strong treatment effect (as here), you cannot just pool all the data to find the appropriate power transform. Instead you need to first model the relationship using the general linear model as shown in Unit 11. You can then use R to find the optimal Box-Cox transformation.

{Fig. 5}
Stabat01.gif

This is a log-likelihood plot for the Box-Cox transformation of the weed data. In this case the maximum likelihood estimate of λ is 0.4646, but the confidence interval encloses 0.5 so a square root transformation would be perfectly acceptable.
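
The idea of profiling the Box-Cox log-likelihood over a fitted model, rather than over pooled data, can be sketched in Python. This is an illustration under stated assumptions, not the R analysis referred to above: it assumes an additive treatment + block model for the randomised block design, a simple grid search over λ, and an approximate 95% interval from the chi-square cut-off. All function names are hypothetical, and the estimate need not reproduce the quoted value of λ exactly, because the model used for the original plot is not stated here.

```python
import math

# Weed counts per treatment (blocks I to IV), copied from the worked example.
weeds = {
    "A": [438, 442, 319, 380],
    "B": [538, 422, 377, 315],
    "C": [77, 61, 157, 52],
    "D": [17, 31, 87, 16],
    "E": [18, 26, 77, 20],
    "F": [115, 57, 100, 45],
}


def boxcox(y, lam):
    """Box-Cox power transform; the limit as lam -> 0 is log(y)."""
    return math.log(y) if abs(lam) < 1e-9 else (y ** lam - 1.0) / lam


def additive_rss(rows):
    """Residual sum of squares of a treatment + block model (balanced layout)."""
    n_t, n_b = len(rows), len(rows[0])
    grand = sum(sum(r) for r in rows) / (n_t * n_b)
    t_means = [sum(r) / n_b for r in rows]
    b_means = [sum(r[j] for r in rows) / n_t for j in range(n_b)]
    return sum(
        (rows[i][j] - t_means[i] - b_means[j] + grand) ** 2
        for i in range(n_t)
        for j in range(n_b)
    )


def profile_loglik(table, lam):
    """Profile log-likelihood of lam, with additive constants dropped."""
    rows = [[boxcox(y, lam) for y in counts] for counts in table.values()]
    n = sum(len(r) for r in rows)
    sum_log_y = sum(math.log(y) for counts in table.values() for y in counts)
    return -0.5 * n * math.log(additive_rss(rows) / n) + (lam - 1.0) * sum_log_y


grid = [i / 100 for i in range(-100, 201)]   # lam from -1 to 2 in steps of 0.01
loglik = {lam: profile_loglik(weeds, lam) for lam in grid}
lam_hat = max(loglik, key=loglik.get)

# Approximate 95% interval: lam values whose log-likelihood lies within 1.92
# (half of 3.84, the chi-square 95% point on 1 df) of the maximum.
cutoff = loglik[lam_hat] - 1.92
interval = [lam for lam in grid if loglik[lam] >= cutoff]
print(f"lambda_hat = {lam_hat:.2f}, "
      f"interval = {min(interval):.2f} to {max(interval):.2f}")
print("square root (0.5) inside interval:", loglik[0.5] >= cutoff)
```

The residuals are taken from the fitted model at each λ, so the strong treatment effect does not distort the choice of power, which is the point made in the paragraph above.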

 

 

The log transformation

Our worked example for a log transformation is taken from some of our own research on optimizing trap design for tsetse flies.

Worked example

Number of tsetse flies
caught in 4 different trap types

Areas (G)   Positions   Periods (B)
                          I     II    III     IV
1           I             5      9      0      4
            II            0      4      0      1
            III           0      1      6      3
            IV            5      4      4      6
2           V             3      5     10      5
            VI            4      2      8      6
            VII          17     11     15     29
            VIII         14      5     20      4
3           IX           10     29     61     26
            X            17     12     17     13
            XI           14      9      6      7
            XII          10     11     14      8

We used a Latin square design to control for the effects of environmental factors, namely site and day. For each replicate, each of the four different designs was rotated around four sites over four days to give the balanced design shown in the table. The different trap designs are colour coded. Pink denotes the control (a) (the standard NGU trap) and green (b), yellow (c) and blue (d) denote three different modifications to the basic design.

A brief look at the data (consisting of small whole numbers) might suggest that a square root transform would be best. Clearly the standard deviation is not independent of the mean:
 
Arithmetic means:        14.3     9.4    10.3     5.4
Standard deviations:     16.5     7.8     7.8     4.9

The figure below shows a plot of log variance against log mean for each of the treatment/area combinations.

{Fig. 6}
tsetse.gif

The slope (b) is 1.5873. The most appropriate power transformation is then obtained from:

Y'   =   Y^{1 − (b/2)}   =   Y^{0.2064}

This is close to zero, so a log transformation (Y' = log Y) would probably be appropriate. Moreover, a multiplicative model is the most appropriate for the way in which site and day affect the catch: in other words, a particular site tended to be (say) twice as good as another site, rather than always catching (say) 50 more flies. The only other problem is that there were a few zero catches, and the log of zero is undefined. In this case it was considered acceptable to use a log (Y + 1) transformation, despite the risk of bias in adding one to each data point.
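
The link between the slope b and the power, and the reason a power near zero points to a logarithm, can be made explicit with a short delta-method argument. This is a standard textbook sketch rather than something taken from the analysis above; the symbols a (the Taylor's Power Law constant), μ (the mean) and p (the power) are notation introduced here for illustration.

```latex
\begin{align}
  s^2 &= a\,\mu^{b}
      && \text{(Taylor's Power Law)} \\
  \operatorname{Var}\!\left(Y^{p}\right)
      &\approx \left(p\,\mu^{p-1}\right)^{2} a\,\mu^{b}
       = a\,p^{2}\,\mu^{\,2p-2+b}
      && \text{(delta method)} \\
  2p - 2 + b = 0
      &\;\Longrightarrow\; p = 1 - \frac{b}{2}
      && \text{(variance free of } \mu\text{)} \\
  \lim_{p \to 0} \frac{Y^{p} - 1}{p} &= \log Y
      && \text{(the case } b = 2\text{)}
\end{align}
```

With b = 1.5873 this gives p = 1 − 0.7937 = 0.2064, which is nearer to 0 (the log transformation) than to 0.5 (the square root).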

Log(Y+1) transformed means:         0.997   0.873   0.977   0.661
Log(Y+1) transformed SDs:           0.448   0.416   0.259   0.405
Detransformed (geometric) means:    8.9     6.5     8.5     3.6
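
The detransformed means can be recovered from the transformed means by reversing the log (Y + 1) transformation. The following Python sketch assumes base-10 logarithms, which is consistent with the reported figures; the function names are hypothetical. The single row of counts used at the end is one position from the data table and mixes trap types, so it only illustrates how zero catches are handled, not a treatment mean.

```python
import math


def log_mean(counts, offset=1.0):
    """Mean of log10(Y + offset); the offset allows zero catches."""
    return sum(math.log10(y + offset) for y in counts) / len(counts)


def back_transform(mean_log, offset=1.0):
    """Invert a base-10 log(Y + offset) mean back to the count scale."""
    return 10 ** mean_log - offset


# Reported log(Y+1) transformed means, copied from the text.
reported_log_means = [0.997, 0.873, 0.977, 0.661]
detransformed = [round(back_transform(m), 1) for m in reported_log_means]
print(detransformed)

# Illustration only: one position's catches (area 1, position II), with zeros.
row = [0, 4, 0, 1]
arithmetic = sum(row) / len(row)
geometric = back_transform(log_mean(row))
print(f"arithmetic mean = {arithmetic:.2f}, detransformed mean = {geometric:.2f}")
```

The detransformed mean is lower than the arithmetic mean because averaging on the log scale gives less weight to unusually high catches.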

In the transformed scale the standard deviations are much more similar for all trap types - although the standard deviation for trap type d (blue) is still somewhat lower. Note also that geometric mean catches in trap types (b) (green) and (c) (yellow) are now similar to the control (a) (pink) - the arithmetic mean catch in (a) was unduly inflated by an unusually high catch of 61 flies.