InfluentialPoints.com
Biology, images, analysis, design...
"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)

 

 

Purpose

These tests provide a means of comparing distributions, whether two sample distributions or a sample distribution with a theoretical distribution. The distributions are compared in their cumulative form as empirical distribution functions. The test statistic developed by Kolmogorov and Smirnov to compare distributions was simply the maximum vertical distance between the two functions.

Kolmogorov-Smirnov tests have the advantages that (a) the distribution of the test statistic does not depend on the cumulative distribution function being tested and (b) the test is exact. They have the disadvantage that they are more sensitive to deviations near the centre of the distribution than at the tails.

 

 

One-sample Kolmogorov-Smirnov test

This is also known as the Kolmogorov-Smirnov goodness of fit test. It assesses the degree of agreement between an observed distribution and a completely specified theoretical continuous distribution. It is (reasonably) sensitive to all characteristics of a distribution including location, dispersion and shape.

The key assumptions of the one sample test are that the theoretical distribution is continuous (although a version exists which can cope with discrete distributions) and that it is fully defined. The latter assumption unfortunately means that its most common use - that of testing normality - is actually a misuse if the parameters of that distribution are estimated from the data.

The test statistic is d, the largest deviation between the observed cumulative step function and the expected theoretical cumulative frequency distribution:

Algebraically speaking -

d    =     max(abs[F0(Y)-S(Y)])
where
  • d is the maximum deviation Kolmogorov statistic,
  • F0(Y) is a theoretical cumulative distribution of population Y under H0,
  • S(Y) is the observed cumulative distribution of a sample from population Y, otherwise known as the ECDF (empirical cumulative distribution function).
Note:
  • The most commonly tested statistic is d = max(abs[F0(Y)-S(Y)]) but, depending upon the alternative hypothesis, d+ = max[S(Y)-F0(Y)] or d- = max[F0(Y)-S(Y)] are also used (e.g. see R help). All 3 statistics are subject to an upper 1-tailed test, but although the null distributions of d+ and d- are the same, that of d is different.
  • Also, some authors use D rather than d (in the core text we use d, here we use D & d).

    Procedure

    1. Specify the theoretical cumulative function expected under the null hypothesis.
      For example: for a normal distribution, specify the mean and the standard deviation; for a uniform distribution, specify maximum and minimum.
    2. Calculate the observed cumulative relative frequencies and (optionally) plot as a step-plot.
      • Note, this instruction only makes any sense for very small samples when no computer is available.
      • Alternatively subtract a correction factor of 0.5/n from each observed frequency - this avoids the need to make two measurements for each point.
    3. Calculate the expected cumulative relative frequencies for the range of values in the sample and (optionally) plot as a curve.
    4. For each step on the step-plot, subtract the expected cumulative relative frequency (Fi) from the observed cumulative relative frequency (Si).
    5. For each step on the step-plot, also subtract the expected cumulative relative frequency (Fi) from the previous observed cumulative relative frequency (Si-1), which is the height of the step-plot just before the step.
    6. The largest of the absolute values of these differences is the test statistic (d).
      • If a correction factor was subtracted from the observed cumulative relative frequencies, add that factor to the largest of the absolute values to obtain d.
    7. The test statistic is referred to exact tables (for example Table E in Siegel (1956)) or to a software package.
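
The hand procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the example data are hypothetical, and the theoretical distribution is assumed to be fully specified in advance (here a normal distribution with mean 10 and standard deviation 2, chosen arbitrarily). At each sorted observation the theoretical value is compared with the height of the step function both at the step and just before it. The second function uses the 0.5/n shortcut and should return the same d.

```python
from statistics import NormalDist


def ks_one_sample(sample, cdf):
    """Return (d, max_above, max_below) for a sample against a fully specified CDF."""
    xs = sorted(sample)
    n = len(xs)
    max_above = 0.0  # largest amount by which the ECDF lies above the theoretical CDF
    max_below = 0.0  # largest amount by which the ECDF lies below the theoretical CDF
    for i, x in enumerate(xs, start=1):
        f = cdf(x)                                  # expected cumulative relative frequency
        max_above = max(max_above, i / n - f)       # ECDF height at the step, minus expected
        max_below = max(max_below, f - (i - 1) / n)  # expected, minus ECDF height just before the step
    return max(max_above, max_below), max_above, max_below


def ks_one_sample_corrected(sample, cdf):
    """Same d, using the 0.5/n correction so only one difference is needed per point."""
    xs = sorted(sample)
    n = len(xs)
    largest = max(abs((i - 0.5) / n - cdf(x)) for i, x in enumerate(xs, start=1))
    return largest + 0.5 / n  # add the correction factor back to obtain d


if __name__ == "__main__":
    # Hypothetical measurements; the null distribution is specified, not estimated from them.
    data = [8.1, 9.4, 9.9, 10.3, 11.0, 11.2, 12.7, 13.5]
    null_cdf = NormalDist(mu=10.0, sigma=2.0).cdf
    print(ks_one_sample(data, null_cdf))
    print(ks_one_sample_corrected(data, null_cdf))
```

The resulting d still has to be referred to exact tables or to a software package to assess its significance.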

     

    A warning

    Some authors, for example Sokal & Rohlf (1995), use "Kolmogorov-Smirnov" to denote both the (original) Kolmogorov-Smirnov tests, and Lilliefors test. This practice is perpetuated by some software packages, and can cause confusion. For example, the package SPSS uses the Lilliefors critical values when the Kolmogorov-Smirnov test is done in their 'Explore' module, but not when it is done as a nonparametric test. This leads to many erroneous analyses!

    SPSS also outputs a different test statistic - namely the Kolmogorov-Smirnov Z-statistic:

    Algebraically speaking -

    K-S Z-statistic = d√n

    where

    • n is the number of observations,
    • d is the maximum deviation Kolmogorov statistic = max(|F0(Y)-S(Y)|)
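
As a rough illustration, the Z-statistic can be computed directly from d and n. The sketch below also shows one common way such a statistic is turned into a two-tailed p-value, using the large-sample (asymptotic) Kolmogorov distribution. That second step is an assumption added here for illustration, not a description of what SPSS does internally, and the function names are hypothetical. The asymptotic p-value is only an approximation for small n, and it inherits every assumption of the test itself.

```python
import math


def ks_z(d, n):
    """Kolmogorov-Smirnov Z-statistic: d multiplied by the square root of n."""
    return d * math.sqrt(n)


def kolmogorov_asymptotic_p(z, terms=100):
    """Two-tailed large-sample p-value: 2 * sum over k >= 1 of (-1)^(k-1) * exp(-2 k^2 z^2)."""
    if z < 0.2:
        # The series converges slowly for tiny z; the p-value is 1 to many decimal places here.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * z * z)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


if __name__ == "__main__":
    # Hypothetical values of d and n, for illustration only.
    z = ks_z(0.136, 100)
    print(z, kolmogorov_asymptotic_p(z))
```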

     

    Assumptions

      These are the assumptions of the one-sample Kolmogorov-Smirnov test - not of the K-S Z-statistic.
    • The sample is a random sample
    • The theoretical distribution must be fully specified. The critical values given in tables (and often by software packages) assume this to be the case. If parameters are estimated from the sample, the test result will be (much) too conservative, and Lilliefors test should be used instead. This test is available in certain packages (e.g. StatXact, Systat) for a number of different continuous distributions (normal, exponential and gamma).
    • The theoretical distribution is assumed to be continuous. If it is discrete (for example the Poisson), the result will be too conservative, although Conover (1999) provides an equivalent approach for discrete distributions for small samples.
    • The sample distribution is assumed to have no ties. If there are ties (for example from rounding, or if the variable under consideration is discrete), the result will be (much) too liberal as the large steps give an excessively large d. A categorized distribution can be tested with Kolmogorov-Smirnov by dividing observed differences between cumulative distributions by the number of observations in the class interval (n). But such a test is too conservative because (a) the distribution is discrete (see above) and (b) power is reduced because the number of observations is reduced by a factor of n.

     

     

    Two-sample Kolmogorov-Smirnov test

    The two-sample Kolmogorov-Smirnov test assesses whether two independent samples have been drawn from the same population (Y) - or, equivalently, from two identical populations (X = Y). As with the one-sample test, it is moderately sensitive to all characteristics of a distribution including location, dispersion and shape. The one-tailed version of this test has a specific purpose - namely to test whether values of one population are stochastically larger than values of another population.

    As with the one-sample test cumulative distributions are compared, but here two sample distributions are compared - rather than a sample distribution and a theoretical distribution.

    For the two-tailed version of the test, the test statistic (d) is the largest absolute deviation between the two observed cumulative step functions, irrespective of the direction of the difference.

    Algebraically speaking -

    d    =     max[abs{S1(Y)-S2(Y)}]
    where
    • d is the maximum deviation Kolmogorov statistic,
    • S1(Y) is the observed cumulative distribution of sample 1,
    • S2(Y) is the observed cumulative distribution of sample 2.

    For a one-tailed version of the test, the test statistic (d) is the largest deviation between the two observed cumulative step functions in the predicted direction.

      Thus:
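    • If Ha is that the cumulative distribution of x lies above that of y (values of x tend to be smaller than values of y), then test d+ or max[Sx-Sy]
    • If Ha is that the cumulative distribution of y lies above that of x (values of x tend to be larger than values of y), then test d- or max[Sy-Sx]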

     

    Procedure

    1. Arrange each of the two sets of measurements in a cumulative relative frequency distribution using the same intervals for each distribution. Optionally plot each distribution as a step-plot.
    2. For each point where there is an observation (each step on the step-plot), determine the difference between the two cumulative distributions.
    3. The largest of the absolute values of these differences is the test statistic (d). For a one-tailed test, d is the largest difference in the predicted direction.
    4. Assessing the significance of this test statistic depends on sample sizes and the nature of H1:
      • If n1 = n2 = N and N ≤ 40, exact tables are available for both one- and two-tailed tests (for example Siegel (1956) Table L).
      • For a two-tailed test, if both n1 and n2 are greater than 40, the critical value for d at the 0.05 level is given by 1.36 √[(n1 + n2)/(n1 n2)]
      • For a one-tailed test, if both n1 and n2 are greater than 40, the expression 4d²(n1 n2)/(n1 + n2) is approximately distributed as χ² with 2 df. This method can also be used as an approximation for small unequal sample sizes, but is conservative.
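
The two-sample procedure can be sketched as follows. This is a minimal illustration with hypothetical function names and made-up data. It evaluates both step functions at every observed value, returns the largest difference in each direction, and then applies the two large-sample results quoted above. The χ² approximation is taken in its standard form, 4d²(n1 n2)/(n1 + n2), and uses the fact that the upper tail probability of χ² with 2 df is exp(-χ²/2).

```python
import math


def ks_two_sample(x, y):
    """Return (d, max[Sx-Sy], max[Sy-Sx]) for two independent samples."""
    xs, ys = sorted(x), sorted(y)
    n1, n2 = len(xs), len(ys)
    i = j = 0
    x_above = y_above = 0.0
    for v in sorted(set(xs) | set(ys)):   # every point where either step function jumps
        while i < n1 and xs[i] <= v:
            i += 1                        # i / n1 is now Sx at v
        while j < n2 and ys[j] <= v:
            j += 1                        # j / n2 is now Sy at v
        diff = i / n1 - j / n2
        x_above = max(x_above, diff)
        y_above = max(y_above, -diff)
    return max(x_above, y_above), x_above, y_above


def critical_d_two_tailed_05(n1, n2):
    """Large-sample critical value of d at the 0.05 level (two-tailed)."""
    return 1.36 * math.sqrt((n1 + n2) / (n1 * n2))


def one_tailed_chi2(d_directional, n1, n2):
    """Return (chi2, p) for the one-tailed large-sample approximation with 2 df."""
    chi2 = 4.0 * d_directional ** 2 * (n1 * n2) / (n1 + n2)
    return chi2, math.exp(-chi2 / 2.0)   # upper tail of chi-square with 2 df


if __name__ == "__main__":
    # Hypothetical samples, far too small for the large-sample formulas; shown for the mechanics only.
    a = [1.2, 2.3, 2.9, 3.1, 4.8]
    b = [2.0, 3.5, 4.1, 5.2, 6.0, 6.3]
    print(ks_two_sample(a, b))
    print(critical_d_two_tailed_05(50, 60))
    print(one_tailed_chi2(0.3, 50, 60))
```

For small samples the exact tables remain the appropriate reference; the last two functions only apply the large-sample formulas.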

     

    Assumptions

    • The null hypothesis is that both samples are randomly drawn from the same (pooled) set of values.
    • The two samples are mutually independent.
    • The scale of measurement is at least ordinal.
    • The test is only exact for continuous variables. It is conservative for discrete variables.

    Related topics:

    Anderson-Darling test

    Cramér-von-Mises test

    Lilliefors test