Worked example 2
Our second worked example uses data from Luiselli (2006) who tested hypotheses on the ecological patterns of rarity using snake communities worldwide. We first considered this work in . Data are presented below:
Snake data
No. sp.: 8 24 17 18 16 12 15 13 5 3 12 20 28 6 6 34 6 17 68 7 4 33 23 5 9 46 14 17 18 17 5 10 5 27 17
Rank (No. sp.): 11.0 29.0 22.0 25.5 19.0 14.5 18.0 16.0 4.5 1.0 14.5 27.0 31.0 8.0 8.0 33.0 8.0 22.0 35.0 10.0 2.0 32.0 28.0 4.5 12.0 34.0 17.0 22.0 25.5 22.0 4.5 13.0 4.5 30.0 22.0
% rare: 12.5 20.8 11.8 16.7 18.7 8.33 20 23.1 0 0 0 10 25 0 0 14.7 0 23.5 14.7 14.2 0 3 17.4 0 0 23.9 21.4 5.9 38.9 11.8 0 10 0 33.3 29.4
Rank (% rare): 19.0 27.0 17.5 23.0 25.0 14.0 26.0 29.0 6.0 6.0 6.0 15.5 32.0 6.0 6.0 21.5 6.0 30.0 21.5 20.0 6.0 12.0 24.0 6.0 6.0 31.0 28.0 13.0 35.0 17.5 6.0 15.5 6.0 34.0 33.0
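The rank rows can be reproduced in R. The following is a minimal sketch, assuming the species counts and percentages are entered as vectors named x and y (names chosen here purely for illustration). By default rank() gives tied values their average rank, which is why values such as 4.5 and 25.5 appear.

```r
# Species richness (No. sp.) and percentage of rare species (% rare),
# typed in from the snake data; the names x and y are illustrative.
x <- c(8, 24, 17, 18, 16, 12, 15, 13, 5, 3, 12, 20, 28, 6, 6, 34, 6, 17,
       68, 7, 4, 33, 23, 5, 9, 46, 14, 17, 18, 17, 5, 10, 5, 27, 17)
y <- c(12.5, 20.8, 11.8, 16.7, 18.7, 8.33, 20, 23.1, 0, 0, 0, 10, 25, 0, 0,
       14.7, 0, 23.5, 14.7, 14.2, 0, 3, 17.4, 0, 0, 23.9, 21.4, 5.9, 38.9,
       11.8, 0, 10, 0, 33.3, 29.4)

# rank() uses ties.method = "average" by default,
# so tied observations share the mean of the ranks they occupy
rank_x <- rank(x)
rank_y <- rank(y)

# Show each variable alongside its ranks
print(rbind(x, rank_x))
print(rbind(y, rank_y))
```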
 
{Fig. 2}
Check of assumptions
 The issue of independence of observations was not considered by the author. This was an unfortunate oversight, since there appears to be a real danger of spatial autocorrelation and pseudoreplication if some of the study areas are adjacent and/or overlapping. This would result in the number of degrees of freedom being overestimated, which would make the statistical tests too liberal. We cannot test for this because the paper gives insufficient information.
 Both X and Y are measurement variables.
 The plot of Y on X could indicate a weak linear relationship, but there is so much scatter that this is difficult to assess.
 The Q-Q plots indicated that neither X nor Y was (even remotely) normally distributed; inspection indicates that this cannot be remedied by a transformation (there are many zeros in Y, the percentage of rare species).
 The variance of Y appears to increase with X, and vice versa.
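These checks can be sketched in R as follows. This is only an illustration, assuming x holds the number of species and y the percentage of rare species (an assignment inferred from the sums used in the calculation of r, not stated explicitly); the axis labels and plot titles are our own.

```r
# Snake data; x = number of species, y = % rare species (assumed assignment)
x <- c(8, 24, 17, 18, 16, 12, 15, 13, 5, 3, 12, 20, 28, 6, 6, 34, 6, 17,
       68, 7, 4, 33, 23, 5, 9, 46, 14, 17, 18, 17, 5, 10, 5, 27, 17)
y <- c(12.5, 20.8, 11.8, 16.7, 18.7, 8.33, 20, 23.1, 0, 0, 0, 10, 25, 0, 0,
       14.7, 0, 23.5, 14.7, 14.2, 0, 3, 17.4, 0, 0, 23.9, 21.4, 5.9, 38.9,
       11.8, 0, 10, 0, 33.3, 29.4)

# Scatterplot of Y on X: used to judge linearity and whether spread changes with X
plot(x, y, xlab = "Number of species", ylab = "% rare species")

# Normal quantile-quantile plots: points far from the line suggest non-normality
qqnorm(x, main = "Q-Q plot: number of species")
qqline(x)
qqnorm(y, main = "Q-Q plot: % rare species")
qqline(y)

# Count the zero percentages, which no simple transformation can spread out
n_zero_y <- sum(y == 0)
n_zero_y  # 11 of the 35 values are zero
```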
We conclude that a parametric test of the correlation coefficient is not appropriate. We could instead use a nonparametric correlation coefficient (as done by the author, and by us in the More Information page on Nonparametric correlation and regression), or alternatively test the Pearson correlation coefficient using a randomization test. We will carry out a randomization test using R, and compare the result with that obtained using the standard parametric test.
Using R
Our observed correlation coefficient is obtained as before:
Calculation of Pearson correlation coefficient
\[
r = \frac{9396.36 - (250982.5/35)}{\sqrt{[15757 - 342225/35]\,[9372.219 - 184066.7/35]}} = 0.44875
\]
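The arithmetic can be checked directly from the raw data. The sketch below assumes x is the number of species and y the percentage of rare species (variable names are ours); it builds the corrected sum of products and sums of squares and compares the result with R's built-in cor().

```r
# Snake data; x = number of species, y = % rare species (assumed assignment)
x <- c(8, 24, 17, 18, 16, 12, 15, 13, 5, 3, 12, 20, 28, 6, 6, 34, 6, 17,
       68, 7, 4, 33, 23, 5, 9, 46, 14, 17, 18, 17, 5, 10, 5, 27, 17)
y <- c(12.5, 20.8, 11.8, 16.7, 18.7, 8.33, 20, 23.1, 0, 0, 0, 10, 25, 0, 0,
       14.7, 0, 23.5, 14.7, 14.2, 0, 3, 17.4, 0, 0, 23.9, 21.4, 5.9, 38.9,
       11.8, 0, 10, 0, 33.3, 29.4)
n <- length(x)

# Corrected sum of products: sum(XY) - sum(X) * sum(Y) / n
sp_xy <- sum(x * y) - sum(x) * sum(y) / n

# Corrected sums of squares: sum(X^2) - (sum(X))^2 / n, and likewise for Y
ss_x <- sum(x^2) - sum(x)^2 / n
ss_y <- sum(y^2) - sum(y)^2 / n

# Pearson correlation coefficient
r <- sp_xy / sqrt(ss_x * ss_y)
round(r, 5)

# Should agree with the built-in function
all.equal(r, cor(x, y))
```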
 
Using R, we obtained:
> # observed correlation coefficient
> (C=cov(x,y) / sd(x) / sd(y))
[1] 0.4487518
> # two-sided P-value from the randomization test
> 2*(.5-abs(P-.5))
[1] 0.01
> # conventional t-test of C
> df=length(x)-2
> P=pt(C*sqrt(df/(1-C^2)),df)
> 2*(.5-abs(P-.5))
[1] 0.006852728
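The resampling step that produces the randomization P-value is not shown in the output. A minimal sketch of one common approach follows, assuming the test shuffles y relative to x and recomputes the correlation each time. The seed, the number of permutations (n_perm) and the variable names are our assumptions rather than values taken from the source, so the P-value will differ slightly from the 0.01 reported.

```r
# Snake data; x = number of species, y = % rare species (assumed assignment)
x <- c(8, 24, 17, 18, 16, 12, 15, 13, 5, 3, 12, 20, 28, 6, 6, 34, 6, 17,
       68, 7, 4, 33, 23, 5, 9, 46, 14, 17, 18, 17, 5, 10, 5, 27, 17)
y <- c(12.5, 20.8, 11.8, 16.7, 18.7, 8.33, 20, 23.1, 0, 0, 0, 10, 25, 0, 0,
       14.7, 0, 23.5, 14.7, 14.2, 0, 3, 17.4, 0, 0, 23.9, 21.4, 5.9, 38.9,
       11.8, 0, 10, 0, 33.3, 29.4)

set.seed(1)       # arbitrary seed, only so the sketch is repeatable
n_perm <- 9999    # assumed number of random permutations

# Observed Pearson correlation
r_obs <- cor(x, y)

# Null distribution: shuffling y breaks any association with x
r_perm <- replicate(n_perm, cor(x, sample(y)))

# Upper-tail proportion, counting the observed value as one permutation
P <- (sum(r_perm >= r_obs) + 1) / (n_perm + 1)

# Two-sided P-value, doubling the smaller tail as in the transcript
p_random <- 2 * (0.5 - abs(P - 0.5))
p_random

# Conventional parametric t-test of the same coefficient, for comparison
p_param <- cor.test(x, y)$p.value
p_param
```

Doubling the smaller tail mirrors the expression 2*(.5-abs(P-.5)) used in the transcript; an alternative is to compare absolute values of the permuted and observed coefficients, which usually gives a similar answer.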
 
We conclude there is a significant linear relationship between the percentage of rare species and species richness, although we should emphasize that a linear relationship still seems improbable, and a nonparametric test implying only a monotonically increasing relationship would be greatly preferable.