 InfluentialPoints.com
Biology, images, analysis, design...
"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)

# Box-Cox transformation

The Box-Cox transformation is a procedure for obtaining the optimal transformation to normalize data within the following family of power transformations:
$$
Y' = \begin{cases} \dfrac{Y^{\lambda} - 1}{\lambda} & \text{when } \lambda \neq 0 \\ \ln Y & \text{when } \lambda = 0 \end{cases}
$$
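As a minimal sketch of the family above, the function below applies the transformation to a single positive value. The function name `box_cox` and the sample values are illustrative assumptions, not part of the original text.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of one positive value y for a given lambda."""
    if y <= 0:
        raise ValueError("the transformation is only defined for y > 0")
    if lam == 0:
        return math.log(y)          # limiting case: natural log
    return (y ** lam - 1) / lam     # general power case

values = [1.0, 2.0, 4.0, 8.0]       # made-up illustrative data
print([round(box_cox(v, 0.5), 4) for v in values])
print([round(box_cox(v, 0), 4) for v in values])
```

Note that Y = 1 maps to 0 for every λ, which is one reason members of the family are easy to compare with each other.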

The required value of λ is the one that maximizes the following log likelihood function:

#### Algebraically speaking -

$$
L = -\frac{v}{2} \ln s_T^2 + (\lambda - 1)\,\frac{v}{n} \sum \ln Y_i
$$
where
• L is the log likelihood,
• v is the degrees of freedom (number of observations − 1),
• n is the number of observations,
• $s_T^2$ is the variance of the values after they have been transformed using $(Y^{\lambda} - 1)/\lambda$,
• λ is the current estimate of the parameter,
• $Y_i$ are the original data values

The log likelihood is evaluated iteratively for a series of trial values of λ. The resulting values of L are then plotted against λ, and the λ at which L reaches its maximum is taken as the estimate.
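The search over trial values of λ can be sketched as a simple grid search. This is an illustration under stated assumptions: the function names, the grid range of −2 to 2, the step of 0.01, and the sample data are all made up for the example, and the plot is replaced by simply picking the grid value with the largest L.

```python
import math
from statistics import variance

def box_cox_loglik(ys, lam):
    """L = -(v/2) ln s_T^2 + (lam - 1)(v/n) sum(ln Y_i) for positive data ys."""
    n = len(ys)
    v = n - 1
    if lam == 0:
        transformed = [math.log(y) for y in ys]
    else:
        transformed = [(y ** lam - 1) / lam for y in ys]
    s2 = variance(transformed)      # sample variance, divisor n - 1
    sum_log = sum(math.log(y) for y in ys)
    return -(v / 2) * math.log(s2) + (lam - 1) * (v / n) * sum_log

def best_lambda(ys, lo=-2.0, hi=2.0, step=0.01):
    """Return the grid value of lambda that gives the largest log likelihood."""
    steps = int(round((hi - lo) / step))
    grid = [round(lo + i * step, 10) for i in range(steps + 1)]
    return max(grid, key=lambda lam: box_cox_loglik(ys, lam))

ys = [1.2, 1.9, 2.3, 3.1, 4.8, 7.5, 12.0, 20.4]   # made-up right-skewed data
lam_hat = best_lambda(ys)
print(lam_hat, box_cox_loglik(ys, lam_hat))
```

A finer step, or a numerical optimizer started from the best grid value, would give a more precise estimate. The grid version is kept here because it mirrors the plot-and-read-off procedure most directly.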

Where the data include zeros, a constant (usually either 1 or 0.5) is added to each value of Y before transforming, because neither ln Y nor a negative power of Y is defined at zero. As with other transformations, this can result in bias if there are many zeros.
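A small sketch of this adjustment follows. The helper name `shift_if_zeros` and the sample counts are assumptions made for illustration. The constant is returned alongside the data so that the same value can be subtracted again after detransformation.

```python
def shift_if_zeros(ys, constant=1.0):
    """Add `constant` to every value if any value is zero.

    Returns (shifted values, constant actually used); the constant is 0.0
    when the data contain no zeros.
    """
    if any(y < 0 for y in ys):
        raise ValueError("negative values are not handled by this sketch")
    c = constant if any(y == 0 for y in ys) else 0.0
    return [y + c for y in ys], c

counts = [0, 0, 3, 7, 12]           # made-up counts containing zeros
shifted, c = shift_if_zeros(counts, constant=0.5)
print(shifted, c)
```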

Detransformation is achieved using the following:
$$
Y = \begin{cases} (\lambda Y' + 1)^{1/\lambda} & \text{when } \lambda \neq 0 \\ \exp(Y') & \text{when } \lambda = 0 \end{cases}
$$
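As a quick check that detransformation undoes the transformation, the sketch below runs a round trip for several values of λ. The function names and test values are illustrative assumptions.

```python
import math

def box_cox(y, lam):
    """Forward transform for positive y."""
    return math.log(y) if lam == 0 else (y ** lam - 1) / lam

def inverse_box_cox(t, lam):
    """Detransform a value t back to the original scale."""
    if lam == 0:
        return math.exp(t)
    base = lam * t + 1
    if base <= 0:
        # No positive Y maps to this t for this lambda.
        raise ValueError("lam * t + 1 must be positive")
    return base ** (1 / lam)

for lam in (-1, 0, 0.5, 2):
    for y in (0.5, 1.0, 3.7):       # made-up values
        back = inverse_box_cox(box_cox(y, lam), lam)
        print(lam, y, round(back, 6))
```

The guard on `lam * t + 1` matters mainly when detransforming values that were not produced by the forward transform, such as confidence limits computed on the transformed scale.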

Although you can use the precise value of λ in the transformation, it is more usual to use the standard transformation closest to the λ suggested by the Box-Cox procedure, provided that it still lies within the 95% confidence interval of λ. This is known as a 'convenient estimator' (although this model can be impossible to interpret in biological terms).

• A λ of 0 is a log transformation.
• A λ of 0.5 gives Y' = 2(√Y − 1), a linear function of √Y, so it is equivalent to (but not identical to) a square root transformation (provided Y > 1). Then again, a square root transformation cannot cope with negative numbers.
• A λ of −1 gives Y' = 1 − 1/Y, so it is equivalent to a reciprocal transformation.
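Choosing a convenient estimator can be sketched as follows. The text does not say how the 95% confidence interval is obtained, so this example assumes the usual likelihood-ratio approach: every λ whose log likelihood lies within 1.92 (half of 3.84, the 95% point of chi-square with 1 degree of freedom) of the maximum is treated as inside the interval. The function names, grid settings, candidate list (including λ = 1 for "no transformation"), and data are also assumptions.

```python
import math
from statistics import variance

def box_cox_loglik(ys, lam):
    """L = -(v/2) ln s_T^2 + (lam - 1)(v/n) sum(ln Y_i) for positive data ys."""
    n = len(ys)
    v = n - 1
    if lam == 0:
        transformed = [math.log(y) for y in ys]
    else:
        transformed = [(y ** lam - 1) / lam for y in ys]
    sum_log = sum(math.log(y) for y in ys)
    return -(v / 2) * math.log(variance(transformed)) + (lam - 1) * (v / n) * sum_log

def convenient_lambda(ys, candidates=(-1.0, 0.0, 0.5, 1.0),
                      lo=-3.0, hi=3.0, step=0.01):
    """Return (lam_hat, (ci_low, ci_high), chosen lambda)."""
    steps = int(round((hi - lo) / step))
    grid = [round(lo + i * step, 10) for i in range(steps + 1)]
    logliks = [box_cox_loglik(ys, lam) for lam in grid]
    l_max = max(logliks)
    lam_hat = grid[logliks.index(l_max)]
    cutoff = l_max - 1.92           # assumed likelihood-ratio cutoff for 95%
    inside = [lam for lam, ll in zip(grid, logliks) if ll >= cutoff]
    ci = (min(inside), max(inside))  # truncated if it reaches the grid edge
    allowed = [c for c in candidates if ci[0] <= c <= ci[1]]
    # Fall back to the precise estimate when no candidate is in the interval.
    choice = min(allowed, key=lambda c: abs(c - lam_hat)) if allowed else lam_hat
    return lam_hat, ci, choice

ys = [1.2, 1.9, 2.3, 3.1, 4.8, 7.5, 12.0, 20.4]   # made-up right-skewed data
lam_hat, ci, choice = convenient_lambda(ys)
print(lam_hat, ci, choice)
```

With small samples the interval is typically wide, so several standard transformations may qualify. The sketch then simply takes the one nearest to the precise estimate.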