 
Fisher's exact test
Worked example I
Let us assume you have 10 captive-reared and 12 wild-caught birds, which have to choose between 11 high and 11 low nest sites.
Origin            Nest site       Totals
                  High    Low
Captive-reared       2      8       10
Wild-caught          9      3       12
Totals              11     11     N = 22
 
The probability of the observed result is given by:
$$P = \frac{10!\;12!\;11!\;11!}{22!\;2!\;8!\;9!\;3!}$$
 
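Most of these terms cancel out, leaving:
$$
\begin{aligned}
P &= \frac{[12 \times 11 \times 10 \times 9 \times 8 \times 7 \times 6 \times 5 \times 4 \times 3]\,[11 \times 10 \times 9]\,[11 \times 10]}{[22 \times 21 \times 20 \times 19 \times 18 \times 17 \times 16 \times 15 \times 14 \times 13 \times 12 \times 11]\,[3 \times 2]} \\
  &= \frac{2.608163712 \times 10^{13}}{1.858466812 \times 10^{15}} \\
  &= 0.014034
\end{aligned}
$$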
 
Only two more extreme tables are possible.
Firstly:
Origin            Nest site       Totals
                  High    Low
Captive-reared       1      9       10
Wild-caught         10      2       12
Totals              11     11     N = 22
 
The probability of observing this by chance is:
$$P = \frac{10!\;12!\;11!\;11!}{22!\;1!\;9!\;10!\;2!} = 0.0009356$$
 
Secondly:
Origin            Nest site       Totals
                  High    Low
Captive-reared       0     10       10
Wild-caught         11      1       12
Totals              11     11     N = 22
 
The probability of observing this by chance is:
$$P = \frac{10!\;12!\;11!\;11!}{22!\;0!\;10!\;11!\;1!} = 0.0000170$$
 
Summing these probabilities gives the one-tailed P-value:
P = 0.014034 + 0.0009356 + 0.0000170 = 0.0149866
This probability is normally doubled to give the two-tailed P-value, P = 0.02997.
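The tail sum above can be checked with a few lines of Python. This is only a minimal sketch: the function names `table_probability` and `one_tailed_p` are made up for illustration, and the factorial formula is written in its equivalent binomial-coefficient form using the standard library's `math.comb`.

```python
from math import comb

def table_probability(a, b, c, d):
    """Probability of one 2x2 table with fixed margins.

    Equivalent to (a+b)! (c+d)! (a+c)! (b+d)! / (N! a! b! c! d!),
    rewritten with binomial coefficients to keep the integers exact.
    """
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

def one_tailed_p(a, b, c, d):
    """Sum the observed table and every more extreme table reached by
    lowering cell a one step at a time while keeping the margins fixed."""
    total = 0.0
    while a >= 0 and d >= 0:
        total += table_probability(a, b, c, d)
        a, b, c, d = a - 1, b + 1, c + 1, d - 1
    return total

# Worked example I: captive-reared (2, 8) and wild-caught (9, 3)
p_obs = table_probability(2, 8, 9, 3)
p_one = one_tailed_p(2, 8, 9, 3)
print(p_obs, p_one, 2 * p_one)
```

Run on the worked example, this should agree with the hand calculation to the number of figures quoted. The loop assumes the extreme direction is the one in which the top-left cell shrinks, as it does here.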
Hints and shortcuts
For any given set of margin totals the term [(a+b)! (c+d)! (a+c)! (b+d)!] / N! remains constant. This is a useful timesaver when combining a number of probabilities in a tail.
Unfortunately, for anything other than a very small sample, these factorials tend to become unmanageably large. For example, 40! ≅ 8.159 × 10^{47}, a 48-digit number.
There are two ways of coping with this problem 
Even for very large samples, provided the smallest cell frequency is less than 40, you can calculate your results in a conventional manner.
For example, suppose we have these results (four cell frequencies, shown with their row and column totals):

                              Totals
               3      997      1000
               5     1995      2000
Totals         8     2992      3000
 
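Because the smallest marginal total is 8, the top-left cell can only take the values 0 to 8, so only 9 tables are possible, 5 of which (those with 4 to 8 in the top-left cell) are more extreme than this one in the observed direction. From the formula above, the probability (P) of finding the observed cell frequencies is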
$$P = \frac{1000!\;2000!\;8!\;2992!}{3000!\;3!\;997!\;5!\;1995!}$$
 
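Most of this cancels out, leaving us with
$$P = \frac{1000 \times 999 \times 998 \;\times\; 2000 \times 1999 \times 1998 \times 1997 \times 1996 \;\times\; 8 \times 7 \times 6}{3000 \times 2999 \times 2998 \times 2997 \times 2996 \times 2995 \times 2994 \times 2993 \;\times\; 3 \times 2 \times 1}$$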
 
This is rather more straightforward, if rather tedious, to work out.
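The cancelled-down product can be evaluated without any overflow worries in a language with arbitrary-precision integers. The following Python sketch is one possible way of doing so; the variable names are illustrative only, and the cross-check against `math.comb` is an addition of this sketch rather than part of the method described above.

```python
from fractions import Fraction
from math import comb, prod

# Numerator: 1000x999x998, 2000x1999x...x1996, and 8x7x6
numerator = prod([1000, 999, 998]) * prod(range(1996, 2001)) * prod([8, 7, 6])

# Denominator: 3000x2999x...x2993, and 3x2x1
denominator = prod(range(2993, 3001)) * prod([3, 2, 1])

# Exact rational value of P, converted to a float only for display
p_exact = Fraction(numerator, denominator)
p = float(p_exact)

# Cross-check against the same probability in binomial-coefficient form
p_check = Fraction(comb(1000, 3) * comb(2000, 5), comb(3000, 8))

print(p, p_exact == p_check)
```

Keeping the value as a `Fraction` until the last step avoids any rounding in the intermediate products.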
Where the smallest cell frequency is greater than 40, the only way to handle these factorials is as logarithms. Because the terms within these equations are all multiplied and divided, you simply add and subtract their logarithms.
So,
$$
\begin{aligned}
\log P ={}& [\log (a+b)! + \log (c+d)! + \log (a+c)! + \log (b+d)!] \\
          & - [\log N! + \log a! + \log b! + \log c! + \log d!]
\end{aligned}
$$
 
But if these factorials are so colossal, how do you find their logarithms?
There are two methods of finding logs of factorials that avoid working out the factorials themselves.
If you are writing a computer program you can use this formula:
log(N!) = log(N) + log(N−1) + log(N−2) + log(N−3) + ... + log(2)
For example, log(3!) = log(3) + log(2) = 0.4771 + 0.3010 = 0.7781
If you are using a calculator this is clearly impractical.
A less accurate, but much quicker, formula is Stirling's approximation, with 0.92 standing in for ½ ln(2π) ≈ 0.919:
ln(N!) ≅ (N + 0.5) × ln(N) − N + 0.92
Where ln(N!) is the natural log (log_{e}) of N!
If N > 20 the error is less than 0.01%, and if N > 100 it is < 0.002%
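Both ways of finding log factorials are easy to try out in Python. The sketch below uses natural logs throughout (any base works as long as it is used consistently). Note one assumption: the quick formula is taken in its Stirling form, ln(N!) ≅ (N + 0.5) × ln(N) − N + 0.92, which is the form consistent with the error figures quoted above. All function names are hypothetical.

```python
import math

def log_factorial_sum(n):
    """ln(n!) as ln(n) + ln(n-1) + ... + ln(2); returns 0 for n = 0 or 1."""
    return sum(math.log(k) for k in range(2, n + 1))

def log_factorial_approx(n):
    """Quick approximation, assumed here to be (n + 0.5) ln(n) - n + 0.92."""
    return (n + 0.5) * math.log(n) - n + 0.92

def log_table_probability(a, b, c, d):
    """ln(P) for a 2x2 table: add the logs of the margin factorials,
    subtract the logs of N! and the four cell factorials."""
    lf = log_factorial_sum
    n = a + b + c + d
    return (lf(a + b) + lf(c + d) + lf(a + c) + lf(b + d)
            - lf(n) - lf(a) - lf(b) - lf(c) - lf(d))

# Percentage error of the quick formula at N = 20 and N = 100
for n in (20, 100):
    exact = log_factorial_sum(n)
    pct_error = 100 * abs(log_factorial_approx(n) - exact) / exact
    print(n, exact, pct_error)

# Back-transform ln(P) for the first worked example (2, 8 / 9, 3)
p_from_logs = math.exp(log_table_probability(2, 8, 9, 3))
print(p_from_logs)
```

In practice a library routine such as `math.lgamma(n + 1)` gives ln(n!) directly; the explicit sum is shown only because it mirrors the formula in the text.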
