Kolmogorov-Smirnov test: one- & two-sample, and related tests
One-sample Kolmogorov-Smirnov test
Worked example I
We base our first example on data on sole horn moisture content from a study by Higuchi & Nagahata. We test the observed distribution against a fully defined normal distribution (μ = 35, σ = 2).
The observed cumulative relative frequencies (S(Y)) are obtained by dividing rank (r) by the number of observations (n). Expected cumulative relative frequencies (F0(Y)) for these quantiles from a normal distribution are given by the area under the normal curve; they are readily obtained from tables of the normal distribution or from statistical software.
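The two columns can be sketched in Python as follows. The readings below are illustrative placeholders only, since the study's raw data are not reproduced here:

```python
from math import erf, sqrt

def norm_cdf(y, mu=35.0, sigma=2.0):
    # Expected cumulative relative frequency F0(Y): the area under the
    # normal curve to the left of y
    return 0.5 * (1.0 + erf((y - mu) / (sigma * sqrt(2.0))))

# Illustrative readings only - not the Higuchi & Nagahata data
sample = sorted([33.2, 34.1, 34.8, 35.3, 36.0])
n = len(sample)
for r, y in enumerate(sample, start=1):
    s = r / n          # observed cumulative relative frequency S(Y) = r/n
    f = norm_cdf(y)    # expected cumulative relative frequency F0(Y)
    print(f"rank {r}: Y={y}  S(Y)={s:.2f}  F0(Y)={f:.4f}")
```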
The process may be easier to follow on the first graph below. The red curve shows the expected cumulative relative frequencies from a normal distribution, and the blue points show the observed cumulative relative frequencies. Red points mark the equivalent positions on the cumulative normal curve, and green points lie immediately before each step-up.
A more efficient approach is shown in the second figure above. A correction factor (0.5/n) is subtracted from each observed cumulative relative frequency. Only one difference then has to be calculated for each observed cumulative relative frequency. Once the largest absolute difference is identified, the correction factor is added back on again to give d.
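That shortcut can be sketched as follows (a hand-rolled Python sketch rather than the R session used elsewhere on this page; subtracting 0.5/n, taking the largest absolute difference, and adding 0.5/n back is exactly equivalent to taking the larger of the differences before and after each step):

```python
from math import erf, sqrt

def norm_cdf(y, mu, sigma):
    # Area under the normal curve to the left of y
    return 0.5 * (1.0 + erf((y - mu) / (sigma * sqrt(2.0))))

def ks_one_sample_d(data, mu, sigma):
    # Subtract the correction factor 0.5/n from each observed cumulative
    # relative frequency r/n, find the largest absolute difference from
    # F0(Y), then add the correction factor back on to give d
    y = sorted(data)
    n = len(y)
    biggest = max(abs((r - 0.5) / n - norm_cdf(v, mu, sigma))
                  for r, v in enumerate(y, start=1))
    return biggest + 0.5 / n

# Illustrative readings only - not the study's data
print(ks_one_sample_d([33.2, 34.1, 34.8, 35.3, 36.0], 35.0, 2.0))
```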
We said at the start of this worked example that we were testing the observed distribution against a fully defined normal distribution (μ=35,σ=2). Usually, however, one is more interested in an omnibus test of normality - using the sample mean and standard deviation as estimates of the population parameters.
The Kolmogorov-Smirnov test should not be used to test such a hypothesis - but we will do it here in R in order to see why it is inappropriate. In this example the mean is 34.754 and the standard deviation is 1.92472.
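For reference, the way a KS statistic d is turned into a P-value can be sketched with Kolmogorov's asymptotic distribution. This is an approximation only; for small samples R's ks.test uses an exact calculation, so the figure it gives will differ somewhat:

```python
from math import exp, sqrt

def ks_asymptotic_p(d, n, terms=100):
    # Two-sided P-value from Kolmogorov's limiting distribution:
    # P = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 * k^2 * n * d^2)
    t = d * sqrt(n)
    s = sum((-1) ** (k - 1) * exp(-2.0 * (k * t) ** 2)
            for k in range(1, terms + 1))
    return max(0.0, min(1.0, 2.0 * s))
```

A d that is small relative to 1/sqrt(n) gives a P-value near 1; a large one gives a P-value near 0.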
The P-value we obtain is 0.7026 - which gives no indication of a significant deviation from normality. Let us now use three specialized tests of normality which allow for the fact that one is estimating parameters from the sample.
Whilst the P-value from the Kolmogorov-Smirnov test (0.7026) is not valid for the reasons stated, any of the other three tests could justifiably be used depending on which aspect of the distribution one is most interested in. None of them indicates a significant deviation from normality - although with such a small sample the deviation would have to be very marked to be detected.
Postscript: When there are several (appropriate) tests to choose from, it is very important to select the test a priori, and not just choose the one that gives the desired result. If you want to give more weight to the tails of the distribution, then select the Anderson-Darling test. If you want to give more weight to the centre of the distribution, then select the Lilliefors test.
Two-sample Kolmogorov-Smirnov test
Worked example II
We use the same data on the effect of drug treatment on the length of time from treatment to lambing that we have used previously. An equal-variance t-test on the log transformed data gave a P-value of 0.00986, whilst an unequal-variance t-test on the raw data gave a non-significant P-value.
The observed cumulative relative frequencies (S1(Y) and S2(Y)) are obtained by dividing rank (r) by the number of observations in each sample. Differences between the two sets of observed values (S1(Y)i - S2(Y)i) are then obtained at each observed value. The largest absolute value of these differences is d.
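A minimal Python sketch of this calculation (the lambing data themselves are not reproduced here, so the usage line below uses made-up numbers):

```python
def ks_two_sample_d(sample1, sample2):
    # Largest absolute difference between the two observed cumulative
    # relative frequency curves S1(Y) and S2(Y)
    n1, n2 = len(sample1), len(sample2)
    d = 0.0
    for p in sorted(set(sample1) | set(sample2)):
        s1 = sum(v <= p for v in sample1) / n1   # S1(Y) at this point
        s2 = sum(v <= p for v in sample2) / n2   # S2(Y) at this point
        d = max(d, abs(s1 - s2))
    return d

# Made-up illustration: two completely separated samples give d = 1
print(ks_two_sample_d([1, 2, 3], [4, 5, 6]))   # 1.0
```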
The maximum difference here is between the smallest of the five values in sample 1 (45, S = 1/5 = 0.2) and the ninth ranked value in sample 2 (58, S = 9/11 = 0.8182). Hence d = 0.6182. This is also the value given by R.
However, we do have a problem with the presence of ties - three observations all have the same reading (51). R makes it clear that it cannot compute exact P-values when ties are present (although it does give a P-value of 0.145 anyway).
Since in this case observations were rounded to the nearest hour, one possible way round this problem would be to jitter observations with the same readings - giving them randomly chosen values between 50.6 and 51.4. In this particular example, jittering does not affect the value of the test statistic, and enables R to give a (defensible) P-value of 0.125.
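Jittering of this kind can be sketched as follows. This is a generic helper, with the half-width of 0.4 hours and the fixed seed as illustrative choices rather than anything prescribed above:

```python
import random
from collections import Counter

def jitter_ties(readings, half_width=0.4, seed=42):
    # Give tied readings a randomly chosen offset within +/- half_width,
    # so e.g. three readings of 51 become distinct values between 50.6
    # and 51.4; untied readings are left exactly as they are
    rng = random.Random(seed)
    counts = Counter(readings)
    return [v + rng.uniform(-half_width, half_width) if counts[v] > 1 else v
            for v in readings]
```

Fixing the seed makes the jittered values, and hence the reported P-value, reproducible.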
In other words, much like the unequal variance t-test and the Wald-Wolfowitz test, it suggests there is no significant difference in time to lambing between treated and untreated sheep. This reflects the lack of power of the Kolmogorov-Smirnov test to detect differences in distributions between two small samples.