InfluentialPoints.com
Biology, images, analysis, design...
"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)

Calculating quantiles


1.   The median

By convention, the rank (r) of the median of a set of n observations is calculated as r = p[n + 1], where p = 1/2 and r is the (nominal) rank of the pth quantile. This is mathematically equivalent to the mean rank, Σr / n. If n is even, r falls mid-way between two whole-number ranks, and the median is estimated as the mean of the next lower and next higher ranked values.

This formula does not work so well for other quantiles, particularly the more extreme ones, and for most purposes r = 1 + p[n - 1] is best. When p = 0.5 it gives the same result as the conventional method.
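One way to see the difference at the extremes is to compute both ranks for a high quantile in a small sample. The sketch below is only an illustration: the function names and the choice of p = 0.95 and n = 10 are assumptions, not taken from the text above. With r = p[n + 1] the rank can fall outside the range 1 to n, whereas r = 1 + p[n - 1] stays within it for any p between 0 and 1.

```python
def rank_conventional(p, n):
    """Nominal rank from r = p[n + 1]."""
    return p * (n + 1)


def rank_alternative(p, n):
    """Nominal rank from r = 1 + p[n - 1]."""
    return 1 + p * (n - 1)


n = 10
for p in (0.5, 0.95):
    # Print p, then the rank given by each formula.
    print(p, rank_conventional(p, n), rank_alternative(p, n))
```

For p = 0.5 the two ranks agree. For p = 0.95 the first formula asks for a rank of about 10.45, which is beyond the largest of the 10 observations, while the second gives about 9.55.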

For example, if y(r) is a value of (nominal) rank r, then:

  • The median of n = 7 observations has rank 1/2 × [n + 1] = 1/2 × 8, or Σr / n = 28 / 7, or 1 + 1/2 × [n - 1] = 1 + 1/2 × 6, or 4.
      If y(4) = 123.456, then that is your median.

  • Similarly, the median of n = 8 observations has rank 1/2 × [n + 1] = 1/2 × 9, or Σr / n = 36 / 8, or 1 + 1/2 × [n - 1] = 1 + 1/2 × 7, or 4.5, and our best estimate of y(4.5) is assumed to be 1/2 × y(4) + 1/2 × y(5).
      If y(4) = 1.2, and y(5) = 2.4, then the median is 1/2 × 1.2 + 1/2 × 2.4 = 1.8
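The two worked examples can be sketched in Python as follows. This is a minimal sketch, not a reference implementation: the function name median is our own choice, and it assumes the observations are numeric and can be sorted.

```python
def median(values):
    """Median via the nominal rank r = (1/2)(n + 1)."""
    ys = sorted(values)
    n = len(ys)
    if n == 0:
        raise ValueError("median of an empty set is undefined")
    r = 0.5 * (n + 1)      # equals 1 + 0.5 * (n - 1), and sum of ranks / n
    i = int(r)             # next lower whole-number rank
    if r == i:             # n is odd: the median is an observed value
        return ys[i - 1]   # ranks are 1-based, list indices are 0-based
    # n is even: mean of the next lower and next higher ranked values
    return 0.5 * ys[i - 1] + 0.5 * ys[i]
```

With seven values whose fourth-ranked value is 123.456 the function returns that value. With eight values where y(4) = 1.2 and y(5) = 2.4 it returns 1.8, up to floating-point rounding.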

 

2.   Any quantile

Below are six ways of calculating the pth quantile (yp) for a set of n observations of variable Y.

  • Of these, the first method is the conventional way of estimating the median, but does not perform so well on less typical quantiles.
  • The second method gives the same median, and tends to be better on the more extreme quantiles.
  • The fourth method is equivalent to using the cumulative distribution function.
  • The third, fourth and fifth methods tend to produce medians slightly below those given by the first and second methods.

  1. r = p(n + 1)
    If r is not an integer, then yp is interpolated.

  2. r = 1 + p(n - 1)
    If r is not an integer, then yp is interpolated.

  3. r = pn + 0.5
    If r is not an integer, then r is rounded down.

  4. r = pn
    If r is not an integer, then yp is interpolated.

  5. r = pn
    If r is not an integer, then r is rounded up.

  6. r = pn
    If r is not an integer, then r is rounded up.
    But if r is a whole number, r = r + 0.5, and yp is interpolated.
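The six rank rules can be sketched as a single Python function. The name nominal_rank and the method argument (numbered 1 to 6 to match the list above) are our own labels, not an established API. The function returns the final rank after any rounding; a fractional result means the value still has to be interpolated. Note that testing whether p × n is a whole number is sensitive to floating-point error (0.1 × 30 is not exactly 3, for example), so treat this as an illustration rather than production code.

```python
import math


def nominal_rank(p, n, method):
    """Return the (possibly fractional) rank r of the p-th quantile of n values."""
    if method == 1:
        return p * (n + 1)              # interpolate if fractional
    if method == 2:
        return 1 + p * (n - 1)          # interpolate if fractional
    if method == 3:
        return math.floor(p * n + 0.5)  # rounded down
    if method == 4:
        return p * n                    # interpolate if fractional
    if method == 5:
        return math.ceil(p * n)         # rounded up
    if method == 6:
        r = p * n
        if r == int(r):                 # whole number: go half a rank higher
            return r + 0.5              # interpolate between y(r) and y(r+1)
        return math.ceil(r)             # otherwise rounded up
    raise ValueError("method must be an integer from 1 to 6")


# Median rank (p = 0.5) under each method, for n = 7 and n = 8.
for n in (7, 8):
    print(n, [nominal_rank(0.5, n, m) for m in range(1, 7)])
```

For the median, methods 1 and 2 agree at rank 4 for n = 7 and rank 4.5 for n = 8. Method 4 gives a lower rank for n = 7 (3.5), and methods 3, 4 and 5 give a lower rank for n = 8 (4).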

 

In all cases:

  • y(r) denotes a value, from that set, of rank r.

  • If r is a whole number, then yp = y(r)

  • If r is not a whole number (and the method calls for interpolation), yp is assumed to lie between y(i) and y(i+1), and is linearly interpolated as yp = (1-f)y(i) + (f)y(i+1)
      where
    • i = the whole number part of r. So i = r, rounded down.
    • f = the fractional part of r. So f = r - i.
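The interpolation rule can be sketched in Python as below. The function name value_at_rank is hypothetical. The text above does not say what to do when r falls below 1 or above n, so clamping r to that range is an assumption made here to keep the sketch runnable, not part of the method as described.

```python
def value_at_rank(sorted_values, r):
    """Return y(r) for a 1-based, possibly fractional rank r."""
    n = len(sorted_values)
    if n == 0:
        raise ValueError("no observations")
    r = min(max(r, 1), n)   # assumption: clamp ranks outside 1..n
    i = int(r)              # whole-number part of r (r rounded down)
    f = r - i               # fractional part of r
    if f == 0:
        return sorted_values[i - 1]   # ranks are 1-based, indices 0-based
    # Linear interpolation between y(i) and y(i+1).
    return (1 - f) * sorted_values[i - 1] + f * sorted_values[i]
```

For example, with sorted values 10, 20, 30, 40 and r = 2.25, i = 2 and f = 0.25, so the result is 0.75 × 20 + 0.25 × 30 = 22.5.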