Example, with R
Here is a straightforward bargraph showing the observed frequency in several categories.
You can get much the same thing with R.
- Bar labels (in this case the values in y) can be numbers or words.
- Each bar height is set by its value (in this case the observed frequency f).
- The bars are equally-wide and equally-spaced rectangles (oblongs).
- For nominal values, the order in f is arbitrary. Ordinal values (e.g. small, medium, large) are assumed to be arranged in order of rank.
- When you provide a negative frequency, R plots the bar with a negative height. This is not as mad as it sounds: you may be examining the observed deviations from an average or expected frequency.
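As a minimal sketch of the points above, the following R code draws a simple bargraph and then a bargraph of deviations. The names y and f follow the text, but the data values, the name dev and the axis labels are invented for illustration, since the original figures are not given here.

```r
# Hypothetical data: the values used in the original example are not known.
y <- c("A", "B", "C", "D", "E")   # bar labels (could equally be numbers)
f <- c(12, 7, 3, 9, 5)            # observed frequency in each category

# barplot() draws equally-wide, equally-spaced bars whose heights are f.
# It returns (invisibly) the x-coordinates of the bar midpoints.
mids <- barplot(f, names.arg = y,
                xlab = "Category", ylab = "Observed frequency")

# Deviations from the mean frequency: some are negative,
# so those bars are drawn below the zero line.
dev <- f - mean(f)
barplot(dev, names.arg = y,
        xlab = "Category", ylab = "Deviation from mean frequency")
abline(h = 0)                     # mark the zero line
```

Reordering y and f together changes nothing of substance for nominal categories, but for ordinal categories the vectors should be entered in rank order.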
Definition and Use
A bargraph (or bar diagram or bar plot) uses a series of rectangular bars to show the values associated with a set of classes. The values may be the frequencies of items in each class, or the amounts in each class, for example the gross national product of different countries.
- Bars may be arranged side-by-side in which case the height of each bar represents the value, or they may be arranged one-above-the-other in which case the width of each bar represents the value.
- The classes must be formed by dividing a set of items into mutually-exclusive classes.
- For example:
- You could divide a group of children into 3 classes: short, tall, and in-between - and validly display the number in each height-class using a bargraph.
- You could also classify those children as boys, girls, short, and tall - but, because some children would fall into more than one of those classes, you could not validly display the number in each of those classes as a simple bargraph.
- If the bars are very narrow, the graph is sometimes described as a linegraph.
- More complex types of bargraph (such as stacked bargraphs) enable you to display the frequency of sub-classes.
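The arrangements described above can be sketched in R as follows. The counts are invented for illustration, and the names counts and totals are assumptions rather than anything from the original page. The height classes are mutually exclusive, so the totals can be shown as a simple bargraph, while the stacked version shows the boys and girls as sub-classes within each height class.

```r
# Hypothetical counts of children, cross-classified by sex and height class.
counts <- matrix(c(4, 9, 3,
                   5, 7, 2),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(sex = c("boys", "girls"),
                                 height = c("short", "in-between", "tall")))

# Each child falls in exactly one height class, so the column totals
# are valid values for a simple bargraph.
totals <- colSums(counts)

# Bars side-by-side: the height of each bar represents the value.
barplot(totals, ylab = "Number of children")

# Bars one-above-the-other: the width of each bar represents the value.
barplot(totals, horiz = TRUE, xlab = "Number of children")

# Stacked bargraph: each bar is split into its sub-classes (rows of counts).
barplot(counts, legend.text = TRUE, ylab = "Number of children")
```

Passing a matrix to barplot() gives stacked bars by default; adding beside = TRUE would instead place the sub-class bars next to each other.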
Tips and Notes
- Do not confuse bargraphs with histograms. Histograms are used exclusively to display the frequency distribution of a continuous measurement variable. Bar graphs are used for many purposes including to display the frequency distribution of discrete measurement variables and of nominal variables. Histograms (should) have no space between the bars, whereas bar diagrams always have a space between each bar.
- The width (or height) of every bar should be the same - if not, assume someone is trying to mislead you.
- Do not assume the order of bars implies the classes are an ordinal variable.
- Do not use a truncated bar graph (one whose value axis does not start at zero)!
- If you convert the frequencies to proportions (as p = f/sum(f)) or percentages, it should not alter the shape of the resulting bargraph.
- Do not assume that all information has been given in a bargraph. Watch out especially for categories being omitted: for example, in a graph of causes of death, 'unknown' may (wrongly) be omitted.
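The point about proportions and percentages can be checked with a short R sketch. The counts are invented for illustration, and the name pct is an assumption; f and p follow the text.

```r
# Hypothetical frequencies for three classes.
f <- c(short = 8, medium = 15, tall = 7)

p   <- f / sum(f)   # proportions, as in the text
pct <- 100 * p      # percentages

# Relative to the tallest bar, the bar heights are identical,
# so only the scale of the value axis changes, not the shape.
same_shape <- isTRUE(all.equal(f / max(f), p / max(p)))

# Draw the three versions next to each other for comparison.
op <- par(mfrow = c(1, 3))
barplot(f,   ylab = "Frequency")
barplot(p,   ylab = "Proportion")
barplot(pct, ylab = "Percentage")
par(op)             # restore the previous plot layout
```

Because each version divides or multiplies every frequency by the same constant, the three plots differ only in the numbers along the value axis.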
Test yourself
There are several ways in which the following barplot could be horribly misleading...
Useful references
- Huff, D. (1954). How to lie with statistics. Victor Gollancz, London.
- Looks at the many problems with the use of bar graphs. Recommended!
- Kabakoff, R.I. (2012). Quick-R: Bar plots.
- Covers simple bar plots, stacked bar plots and grouped bar plots
- Klass, G. (2002). How to construct bad charts and graphs.
- Focuses on the three elements of bad graphical display: data ambiguity, data distortion and data distraction.
- Wikipedia: Bar chart.
- Useful text - but the lack of label on vertical axis of the example makes it a classic example of how not to do it!
