
Worked example I
Our first worked example uses some hypothetical data based on the work of Rehbein & Visser (1999)
on the number of Fasciola eggs per gram of faeces in sheep.
sample | eggs | sample | eggs | sample | eggs |
1 | 306 | 8 | 85 | 15 | 75 |
2 | 152 | 9 | 245 | 16 | 77 |
3 | 136 | 10 | 227 | 17 | 43 |
4 | 113 | 11 | 99 | 18 | 211 |
5 | 128 | 12 | 324 | 19 | 301 |
6 | 72 | 13 | 785 | 20 | 80 |
7 | 455 | 14 | 220 | 21 | 354 |
Simply looking at the data suggests that they are skewed, with only a few sheep having large numbers of eggs in their faeces. This becomes clear when we look at a (grouped) bar diagram of the frequency distribution of these data:
{Fig. 3}
Although the distribution is skewed, there is only one mode (at 1-100 eggs per sample) so a box-and-whisker plot is appropriate.
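The grouping behind the bar diagram can be checked with a short script. The following is a minimal Python sketch, assuming classes of 100 eggs (1-100, 101-200, and so on), as suggested by the modal class quoted above. The function name `grouped_frequencies` is ours and purely illustrative.

```python
from collections import Counter

# Egg counts (eggs per gram of faeces) for the 21 samples in the table above.
eggs = [306, 152, 136, 113, 128, 72, 455,
        85, 245, 227, 99, 324, 785, 220,
        75, 77, 43, 211, 301, 80, 354]


def grouped_frequencies(values, width=100):
    """Count values in classes 1-width, (width+1)-2*width, ... (assumed grouping)."""
    counts = Counter((v - 1) // width for v in values)
    # Label each non-empty class by its lower and upper limits.
    return {f"{k * width + 1}-{(k + 1) * width}": counts[k] for k in sorted(counts)}


frequencies = grouped_frequencies(eggs)
modal_class = max(frequencies, key=frequencies.get)

# Print a crude text version of the bar diagram (empty classes are omitted).
for label, count in frequencies.items():
    print(f"{label:>8} | {'#' * count}")
print("Modal class:", modal_class)
```

A single class with the highest count, and no second peak further up the scale, is what justifies treating the distribution as unimodal.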
The first thing to do is to rank the data so we can determine the median and the lower and upper quartiles.
rank | eggs | rank | eggs | rank | eggs |
1 | 43 | 8 | 113 | 15 | 245 |
2 | 72 | 9 | 128 | 16 | 301 |
3 | 75 | 10 | 136 | 17 | 306 |
4 | 77 | 11 | 152 | 18 | 324 |
5 | 80 | 12 | 211 | 19 | 354 |
6 | 85 | 13 | 220 | 20 | 455 |
7 | 99 | 14 | 227 | 21 | 785 |
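With 21 observations, the median is the value at rank 11 (152 eggs per gram), and the lower and upper quartiles are the values at ranks 6 and 16 (85 and 301 eggs per gram). We can then draw the box-and-whisker plot as below: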
{Fig. 4}
Now compare this with the bar diagram of the frequency distribution (above). Note that half the animals have between 85 and 301 eggs per gram of faeces (shown by the box), whilst a few animals have very large numbers of eggs, which results in the very long upper whisker.
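The values needed to draw this plot can be obtained with a few lines of code. The sketch below is in Python and rests on two assumptions that the text does not state explicitly: the quartiles are taken as the medians of the lower and upper halves of the ranked data (with the overall median included in both halves when the sample size is odd), and the whiskers run to the smallest and largest observations. The function name `five_number_summary` is ours.

```python
from statistics import median

# Egg counts (eggs per gram of faeces) for the 21 samples.
eggs = [306, 152, 136, 113, 128, 72, 455,
        85, 245, 227, 99, 324, 785, 220,
        75, 77, 43, 211, 301, 80, 354]


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum).

    Assumption: quartiles are the medians of the lower and upper halves,
    each half including the overall median when the sample size is odd.
    """
    ranked = sorted(values)
    n = len(ranked)
    half = (n + 1) // 2          # size of each half
    lower_half = ranked[:half]
    upper_half = ranked[n - half:]
    return (ranked[0], median(lower_half), median(ranked),
            median(upper_half), ranked[-1])


low, q1, med, q3, high = five_number_summary(eggs)
print("Box:", q1, "to", q3, "with median", med)
print("Whiskers:", low, "to", high)
```

Other quartile definitions exist and may give slightly different box limits for other data sets. For these data the rule assumed here reproduces the limits of 85 and 301 quoted above.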
Worked example II
Our second worked example uses the same cattle weight data as used in the More Information page on frequency distributions. We have rotated (and inverted) histograms of these data so they are directly comparable to the box-and-whisker plots.
The histogram of weights in herd A is unimodal and skewed towards the lower values - a left skew if the histogram were orientated in the usual way. The histogram of weights in herd B is bimodal and not noticeably skewed in either direction. The second graph shows what happens if we try to compare these distributions using box-and-whisker plots.
{Fig. 5&6}
The box-and-whisker plot for herd A is readily interpretable as a skewed distribution, with the lower whisker being longer than the upper whisker. But the box-and-whisker plot for herd B is very difficult to interpret. It has a wider interquartile range than herd A, and appears to be skewed towards higher values. The bimodality of this distribution cannot be detected from the box-and-whisker plot, so these plots should not be used for this sort of data. This is why one should always display data as jittered dot plots, rank scatterplots or histograms before using summary measures.

Worked example III
Our third worked example uses a quantile-quantile plot to compare the two sample distributions of cattle weights shown above.
- Sort each distribution into rank order - this is done in the table below.
Cattle weights in rank order
Rank | Herd A | Herd B |
1 | 420 | 420 |
2 | 430 | 420 |
3 | 430 | 420 |
4 | 445 | 425 |
5 | 450 | 430 |
6 | 460 | 430 |
7 | 470 | 430 |
8 | 475 | 430 |
9 | 480 | 440 |
10 | 485 | 450 |
11 | 490 | 460 |
12 | 495 | 470 |
13 | 495 | 475 |
14 | 500 | 480 |
15 | 505 | 490 |
16 | 510 | 495 |
17 | 520 | 495 |
18 | 520 | 500 |
19 | 520 | 505 |
20 | 530 | 520 |
21 | 530 | 520 |
22 | 535 | 530 |
23 | 535 | 530 |
24 | 535 | 530 |
25 | 540 | 535 |
26 | 545 | 540 |
27 | 545 | 545 |
28 | 545 | 545 |
29 | 570 | 570 |
30 | 570 | 570 |
- Plot each value in one sample against the value with the same rank (the equivalent quantile) in the other sample.
- Draw in a line of equality to indicate where the points would lie if the two distributions were identical.
{Fig. 7}
The difference between the two distributions is immediately apparent, with the value in herd A generally being greater than its equivalent quantile in herd B. This difference may not be at all apparent from the box-and-whisker plot above.
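The steps above are easy to reproduce for samples of equal size. The following Python sketch pairs the ranked weights and counts how many points fall above, on and below the line of equality. Putting herd B on the x-axis is an arbitrary choice of ours, and `qq_pairs` is an illustrative name.

```python
# Cattle weights for the two herds, as listed in the table above.
herd_a = [420, 430, 430, 445, 450, 460, 470, 475, 480, 485,
          490, 495, 495, 500, 505, 510, 520, 520, 520, 530,
          530, 535, 535, 535, 540, 545, 545, 545, 570, 570]
herd_b = [420, 420, 420, 425, 430, 430, 430, 430, 440, 450,
          460, 470, 475, 480, 490, 495, 495, 500, 505, 520,
          520, 530, 530, 530, 535, 540, 545, 545, 570, 570]


def qq_pairs(sample_x, sample_y):
    """Pair the values of equal rank from two samples of the same size."""
    if len(sample_x) != len(sample_y):
        raise ValueError("this simple pairing needs equal sample sizes")
    return list(zip(sorted(sample_x), sorted(sample_y)))


# Each pair is one point on the plot: (herd B value, herd A value).
pairs = qq_pairs(herd_b, herd_a)

above = sum(1 for x, y in pairs if y > x)    # herd A heavier than herd B
on_line = sum(1 for x, y in pairs if y == x)
below = sum(1 for x, y in pairs if y < x)    # herd A lighter than herd B
print("above:", above, "on the line:", on_line, "below:", below)
```

The list `pairs` can be passed to any plotting tool as x and y coordinates. No point falls below the line of equality, which is the numerical counterpart of the pattern described above.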
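This sort of plot is much more time-consuming to draw by hand if sample sizes are not equal, since quantiles then have to be interpolated so that each point in one sample has an equivalent quantile in the other. Fortunately, R can produce a quantile-quantile plot directly, for example with its qqplot function.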
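To show what the interpolation involves, here is a minimal Python sketch of one common approach: both samples are reduced to as many evenly spaced quantiles as there are values in the smaller sample, using linear interpolation between adjacent ranked values. This is an assumption about how the matching might be done, not a description of any particular package. The function names are ours and the numbers in the usage example are made up for illustration.

```python
def interpolate_quantiles(values, n_points):
    """Return n_points evenly spaced quantiles of values (linear interpolation)."""
    ranked = sorted(values)
    n = len(ranked)
    if n < 2 or n_points < 2:
        raise ValueError("need at least two values and two points")
    result = []
    for i in range(n_points):
        pos = i * (n - 1) / (n_points - 1)   # fractional 0-based rank
        lo = int(pos)                        # ranked value just below pos
        hi = min(lo + 1, n - 1)              # ranked value just above pos
        frac = pos - lo
        result.append(ranked[lo] + frac * (ranked[hi] - ranked[lo]))
    return result


def qq_pairs_unequal(sample_x, sample_y):
    """Pair equivalent quantiles from two samples of possibly unequal size."""
    m = min(len(sample_x), len(sample_y))
    return list(zip(interpolate_quantiles(sample_x, m),
                    interpolate_quantiles(sample_y, m)))


# Made-up numbers, purely to show the mechanics.
large_sample = [10, 20, 30, 40, 50]
small_sample = [12, 28, 47]
print(qq_pairs_unequal(large_sample, small_sample))
```

When the two samples are the same size, the interpolation leaves the ranked values unchanged, so the method reduces to the simple pairing used in this worked example.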
For more sophisticated quantile plots, such as rank scatterplots and p-value plots, see the More Information page on frequency distributions.