InfluentialPoints.com Biology, images, analysis, design... 

"It has long been an axiom of mine that the little things are infinitely the most important" 

Kolmogorov-Smirnov test: one- & two-sample, and related tests

On this page: One-sample Kolmogorov-Smirnov test; Two-sample test

One-sample Kolmogorov-Smirnov test

Worked example I

We base our first example on data on sole horn moisture content from a study by Higuchi & Nagahata. The observed cumulative relative frequencies (S(Y)) are obtained by dividing each rank (r) by the number of observations (n). The expected cumulative relative frequencies (F_{o}(Y)) for these quantiles under a normal distribution are given by the area under the normal curve; they are readily obtained from the normal cumulative distribution function (in R, with pnorm()).
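This step can be sketched in Python using the standard library's NormalDist (the article's own analysis uses R; the sample below is hypothetical, not the Higuchi & Nagahata moisture data):

```python
from statistics import NormalDist

# Hypothetical moisture readings (NOT the actual study data),
# sorted so that rank r runs from 1 to n.
y = sorted([32.1, 33.5, 34.2, 34.9, 35.3, 36.0, 36.8, 37.4])
n = len(y)

# Fully specified null distribution, as in the worked example.
null = NormalDist(mu=35, sigma=2)

# Observed cumulative relative frequencies: S(Y) = r / n.
s = [r / n for r in range(1, n + 1)]

# Expected cumulative relative frequencies: F_o(Y) = area under the
# normal curve to the left of each observation.
f = [null.cdf(v) for v in y]

for v, s_i, f_i in zip(y, s, f):
    print(f"{v:5.1f}  S={s_i:.3f}  F_o={f_i:.3f}")
```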
The process may be easier to follow on the first graph below. The red curve shows the expected cumulative relative frequencies from a normal distribution. Blue points show the observed cumulative relative frequencies. Red points show the points on the cumulative normal curve equivalent to the observed cumulative relative frequencies. Green points lie immediately before each step-up. Using both the blue and the green points, two differences from the expected curve are calculated at each observation, one immediately after and one immediately before each step; the largest absolute difference is the test statistic d.
A more efficient approach is shown in the second figure above. A correction factor (0.5/n) is subtracted from each observed cumulative relative frequency, so that only one difference has to be calculated per observation. Once the largest absolute difference is identified, the correction factor is added back on again to give d. We said at the start of this worked example that we were testing the observed distribution against a fully specified normal distribution (μ = 35, σ = 2). Usually, however, one is more interested in an omnibus test of normality, using the sample mean and standard deviation as estimates of the population parameters. The Kolmogorov-Smirnov test should not be used to test such a hypothesis, but we will do it here in R in order to see why it is inappropriate. In this example the mean is 34.754 and the standard deviation is 1.92472; these estimates are passed to R's ks.test() function.
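The two routes to d described above can be checked against each other in a short Python sketch (again with hypothetical data, not the article's sample). The direct method takes two differences per observation, one just after the step (r/n) and one just before it ((r-1)/n); the shortcut subtracts 0.5/n, takes a single difference per observation, and adds 0.5/n back to the largest:

```python
from statistics import NormalDist

# Hypothetical sample (not the article's moisture data).
y = sorted([32.1, 33.5, 34.2, 34.9, 35.3, 36.0, 36.8, 37.4])
n = len(y)
f = [NormalDist(mu=35, sigma=2).cdf(v) for v in y]

# Direct method: at each observation compare F_o(Y) with the observed
# cumulative frequency just after the step (r/n) and just before it ((r-1)/n).
d_direct = max(
    max(abs(r / n - f_i), abs((r - 1) / n - f_i))
    for r, f_i in enumerate(f, start=1)
)

# Shortcut: subtract the 0.5/n correction, take one difference per
# observation, then add the correction back onto the largest difference.
d_shortcut = (
    max(abs((r - 0.5) / n - f_i) for r, f_i in enumerate(f, start=1)) + 0.5 / n
)

print(d_direct, d_shortcut)
```

The two values agree exactly: for any a, max(|a + h|, |a - h|) = |a| + h, with a = (r - 0.5)/n - F_o(Y) and h = 0.5/n.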
The P-value we obtain is 0.7026, which gives no indication of a significant deviation from normality. Let us now use three specialized tests of normality which allow for the fact that one is estimating the parameters from the sample.
Conclusions

Whilst the P-value from the Kolmogorov-Smirnov test (0.7026) is not valid for the reasons stated, any of the other three tests could justifiably be used, depending on which aspect of the distribution one is most interested in. None of them indicates a significant deviation from normality, although with such a small sample the deviation would have to be very marked to be detected.

Postscript: When there are several (appropriate) tests to choose from, it is very important to select the test a priori, and not just choose the one that gives the desired result. If you want to give more weight to the tails of the distribution, select the Anderson-Darling test. If you want to give more weight to the centre of the distribution, select the Lilliefors test.

Two-sample Kolmogorov-Smirnov test

Worked example II

We use the same data on the effect of drug treatment on the length of time from treatment to lambing that we have used previously. An equal-variance t-test on the log-transformed data gave a P-value of 0.00986, whilst an unequal-variance t-test on the raw data gave a non-significant P-value.
The observed cumulative relative frequencies (S_{1}(Y) and S_{2}(Y)) are obtained by dividing rank (r) by the number of observations (n) in each sample. Differences are then obtained between the two sets of observed values (S_{1}(Y)_{i} - S_{2}(Y)_{i}). The largest absolute difference is d. The maximum difference here is between the smallest of the five values in sample 1 (45, S = 1/5 = 0.2) and the ninth-ranked value in sample 2 (58, S = 9/11 = 0.8182). Hence d = 0.6182, which is also the value given by R's ks.test() function.
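The two-sample statistic can be sketched in a few lines of Python. The samples below are hypothetical (the lambing data are not reproduced here), but the computation is the one described above: evaluate |S_1(Y) - S_2(Y)| at every pooled data point, since the empirical CDFs only change at observed values, and take the largest:

```python
# Hypothetical samples of 5 and 11 observations
# (not the lambing data from the worked example).
sample1 = [45, 52, 60, 63, 70]
sample2 = [51, 53, 55, 56, 58, 59, 61, 62, 64, 66, 68]

def ecdf(sample, x):
    """Observed cumulative relative frequency: rank / n at value x."""
    return sum(v <= x for v in sample) / len(sample)

def ks_2samp_stat(a, b):
    # The ECDFs only step at observed values, so the maximum absolute
    # difference must occur at one of the pooled data points.
    pooled = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in pooled)

d = ks_2samp_stat(sample1, sample2)
print(d)
```

For these particular values the maximum occurs at 52, where S_1 = 2/5 and S_2 = 1/11, giving d = 17/55 ≈ 0.309.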
However, we do have a problem with the presence of ties: three observations all have the same reading (51). R makes it clear that it cannot compute correct P-values when ties are present (although it does give a P-value anyway, of 0.145). Since in this case observations were rounded to the nearest hour, one possible way round this problem would be to jitter the tied observations, giving them randomly chosen values between 50.6 and 51.4. In this particular example, jittering does not affect the value of the test statistic, and it enables R to give a (defensible) P-value of 0.125. In other words, much like the unequal-variance t-test and the Wald-Wolfowitz test, it suggests there is no significant difference in time to lambing between treated and untreated sheep. This reflects the lack of power of the Kolmogorov-Smirnov test to detect differences in distributions between two small samples.
