InfluentialPoints.com

Beginners statistics: quantile

Example, with R

Quantiles are values chosen to divide ordered values into predefined portions.
The median (1.1), their 50% quantile, divides these 5 ordered values into 2 equal groups:

 -999999   0   1.1   2   2.002 

Or you could find their 50% quantile (the p = 0.5th quantile) with R's quantile function.
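A minimal sketch of such a call in R, assuming the five values are stored in a vector named y (the name, and the exact form of the call, are chosen here for illustration):

```r
# The five ordered values from the example
y <- c(-999999, 0, 1.1, 2, 2.002)

# The 50% quantile (p = 0.5), using R's default quantile method
q50 <- quantile(y, probs = 0.5)
print(q50)

# The median is the same value
print(median(y))
```

Both calls should return the middle value, 1.1, because an odd number of values has exactly one value in the middle.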



Definition and Use

  1. Quantiles are commonly assumed to divide sets of ordered numbers into equal-sized groups.
    • Quartiles are expected to divide them into 4 equal groups.
    • Deciles are supposed to divide them into 10 equal groups.
    • Percentiles should divide them into 100 equal-sized groups.
    For the 5 numbers listed above, this reasoning may seem of academic interest.
  2. More practically perhaps, you can regard a set of n different values as n different quantiles.
  3. The most commonly encountered quantiles are the maximum (the 100% quantile) and minimum (the 0% quantile).
    • Since the maximum has none of the values above it, and the minimum has none of them below it, these are called 'extreme' or 'divergent' quantiles.
    • Conversely, the range enclosed by the first and third quartiles - termed the interquartile range - can be said to typify the distribution. It is commonly used as a summary statistic of spread. Values lying outside that range may be regarded as unrepresentative or outlying.
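The quantiles named in this list can be sketched in R for the 5 example values. This is only an illustration using R's default quantile method; the variable names y, five_num and iqr are assumptions made for the example.

```r
# The five ordered values from the example
y <- c(-999999, 0, 1.1, 2, 2.002)

# Minimum (0%), quartiles (25%, 50%, 75%) and maximum (100%)
five_num <- quantile(y, probs = c(0, 0.25, 0.5, 0.75, 1))
print(five_num)

# The extreme quantiles are simply the minimum and maximum
print(range(y))

# Interquartile range: third quartile minus first quartile
iqr <- IQR(y)
print(iqr)
```

With only 5 values, R's default method places each of these quantiles exactly on one of the observed values, so the interquartile range here is 2 - 0 = 2, and the very low minimum has no effect on it.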


Simple formula

Given that each item's rank (r) is the number of items of less than or equal value, the following is a usable approximation for large sets of n values:

the rank of the pth quantile is r = pn

When pn does not correspond to the rank of any value of y (for example, when pn is not a whole number) you have to interpolate between the neighbouring values, or choose one of them.

When n is small you may prefer R's default quantile formula (type ?quantile at the R prompt for more), which gives the rank of the pth quantile as:

1 + p(n - 1)
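To see how much the two rank formulas can differ when n is small, here is a rough sketch in R. The helper value_at_rank is hypothetical (it is not part of R) and uses simple linear interpolation between neighbouring ordered values, which is one of several possible choices.

```r
# Hypothetical helper: the value at a possibly fractional rank r,
# found by linear interpolation between neighbouring ordered values.
value_at_rank <- function(y, r) {
  y <- sort(y)
  n <- length(y)
  r <- min(max(r, 1), n)   # keep the rank within 1..n
  lo <- floor(r)
  hi <- ceiling(r)
  y[lo] + (r - lo) * (y[hi] - y[lo])
}

y <- c(-999999, 0, 1.1, 2, 2.002)
n <- length(y)
p <- 0.25

rank_simple  <- p * n            # simple formula: pn
rank_default <- 1 + p * (n - 1)  # R's default: 1 + p(n - 1)

q_simple  <- value_at_rank(y, rank_simple)
q_default <- value_at_rank(y, rank_default)
q_builtin <- quantile(y, probs = p)  # R's default method

print(c(simple = q_simple, default = q_default))
print(q_builtin)
```

For these 5 values the simple formula gives a rank of 1.25, which interpolates between the first two values and so is dragged towards the extreme minimum, whereas the default formula gives a rank of exactly 2 and so returns the second value, 0.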


Tips and Notes

  • Instead of percentages, quantiles are commonly expressed using proportions: thus the first quartile is the 25% quantile, or the p = 0.25th quantile.
  • Beware: the simple definitions run into difficulties when some of the numbers have equal values (tied), or where only certain numbers can be observed (discrete variables).
  • When applied to sufficiently large sets of (un-tied) values, the relative rank is virtually indistinguishable from p, the proportion of values below that value.
    When applied to small and/or heavily-tied sets of numbers, these ways of defining quantiles may differ quite noticeably!
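As an illustration of how definitions can disagree for a small, tied set, the sketch below asks R for the first quartile under each of the nine methods offered by its quantile function (see ?quantile). The data vector x is invented for this example.

```r
# A small, heavily tied set of values (made up for illustration)
x <- c(1, 1, 2, 2, 2, 3, 7, 9)

# First quartile (p = 0.25) under each of R's nine quantile types
q_by_type <- sapply(1:9, function(t) {
  quantile(x, probs = 0.25, type = t, names = FALSE)
})
names(q_by_type) <- paste0("type", 1:9)
print(q_by_type)

# How many distinct answers did the nine definitions give?
n_distinct <- length(unique(round(q_by_type, 10)))
print(n_distinct)
```

Type 7 is R's default. The answers all lie between 1 and 2 here, but they are not all the same, which is the kind of discrepancy the note above warns about.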


Useful references

Altman, D.G. & Bland, J.M. (1994) Quartiles, quintiles, centiles and other quantiles. BMJ 309, 996 (15 October).
A good introduction to the use of quantiles in medical statistics.
Wikipedia: Quantile.
A bit heavy on formulae and a bit light on explanation.