 InfluentialPoints.com
"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)

### Example, with R

Sometimes the average, or most typical value, is very obvious and straightforward:

-2   -1   0   0   0   0   0   1   2

For that rather unusual set of values, the average is obviously 0, and for several different reasons:

• 0 is the most common value (the mode), and would be the value most likely to be selected by chance,
• 0 is the least deviant (middle-ranking, median) value,
• 0 is the mean of the most common values, and of the least deviant values,
• 0 is midway between the most extreme values (=the mid-range),
• 0 is their mean value (their sum, divided by the number of values),
• 0 is the value from which all the others deviate least.

You can check this last point yourself with R.

Note: for most sets of values, one or more of the measures shown above will disagree. For example, the values -2.1, -1, 0, 0, 0, 0, 0, 1, 2 give several different averages, depending upon how you calculate them.
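
The measures listed above can be compared with a short R sketch. This is only one possible way to do it: `stat_mode`, `mid_range`, `sum_sq_dev` and `averages` are hypothetical helper names (base R has no function for the statistical mode), and "deviate least" is taken here to mean the smallest sum of squared deviations, which is an assumption. If absolute deviations were used instead, the median would be the minimising value.

```r
# The two sets of values from the text
y1 <- c(-2, -1, 0, 0, 0, 0, 0, 1, 2)
y2 <- c(-2.1, -1, 0, 0, 0, 0, 0, 1, 2)

# Hypothetical helper: most common value (first one found if there are ties)
stat_mode <- function(y) {
  counts <- table(y)
  as.numeric(names(counts)[which.max(counts)])
}

# Hypothetical helper: midway between the most extreme values
mid_range <- function(y) (min(y) + max(y)) / 2

# Sum of squared deviations of the values y from a candidate value a
sum_sq_dev <- function(a, y) sum((y - a)^2)

averages <- function(y) {
  c(mode      = stat_mode(y),
    median    = median(y),
    mid_range = mid_range(y),
    mean      = sum(y) / length(y),
    # value minimising the squared deviations, found numerically
    least_sq  = optimize(sum_sq_dev, interval = range(y), y = y)$minimum)
}

print(round(averages(y1), 3))  # the measures agree for the first set
print(round(averages(y2), 3))  # they no longer all agree for the second
```

The numerical search in `optimize` is only approximate, so its result is rounded before printing; it should land on (or very near) the arithmetic mean in both cases.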

### Definition and Use

• An average is assumed to be the most typical, usual, normal, expected, representative value of a set.
• The average is assumed to be a simple-to-interpret robust measure of 'location', or expected outcome.
• In statistics the average most often refers to the (simple arithmetic) mean.
• But it is sometimes assumed to be the median or mode, or even the minimum or maximum.
• Many other measures of location, such as 'trimmed' means and geometric means, are also known as averages.
• In real life these measures seldom agree.
• In other words, 'average' is a loose, ill-defined term. So beware.
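
To see how far these measures can disagree, here is a minimal R sketch. The data are a hypothetical, positively skewed set of values invented for illustration (they do not come from this page), and the 20% trimming fraction is an arbitrary choice.

```r
# Hypothetical, positively skewed values (an assumption, for illustration only)
skewed <- c(1, 2, 2, 3, 4, 5, 40)

location <- c(
  mean      = mean(skewed),              # arithmetic mean, pulled up by the 40
  median    = median(skewed),            # middle-ranking value
  trimmed   = mean(skewed, trim = 0.2),  # mean after dropping 20% from each end
  geometric = exp(mean(log(skewed))),    # geometric mean (positive values only)
  minimum   = min(skewed),
  maximum   = max(skewed)
)

print(round(location, 2))
```

Each of these could reasonably be reported as "the average" of the same seven values, which is why it pays to ask which one was used.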

### Simple formula

Assuming y is a list of items, perhaps the simplest way to assess the average is to select an item at random:
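
In R, one way to do that might look like the sketch below. The seed is arbitrary, and indexing with `sample(length(y), 1)` is used rather than `sample(y, 1)` because the latter behaves differently when `y` holds a single number.

```r
y <- c(-2, -1, 0, 0, 0, 0, 0, 1, 2)

set.seed(1)  # arbitrary seed, only so the result can be repeated

# Select one item at random, each item being equally likely
pick <- y[sample(length(y), 1)]
print(pick)

# Repeating the selection many times shows which value turns up most often
picks <- y[sample(length(y), 1000, replace = TRUE)]
print(table(picks))
```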

But, where y is a list of n numbers, the arithmetic mean is more popular:
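
As a sketch in R, using the second set of values from the example above, the arithmetic mean is the sum of the values divided by how many there are. The variable names are chosen here for illustration.

```r
y <- c(-2.1, -1, 0, 0, 0, 0, 0, 1, 2)

n <- length(y)                 # the number of values
arithmetic_mean <- sum(y) / n  # their sum, divided by the number of values

print(arithmetic_mean)

# R's built-in mean() should give the same answer, allowing for rounding error
print(all.equal(arithmetic_mean, mean(y)))
```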

### Tips and Notes

• Even if you assume an average is the arithmetic mean, there are many different ways of calculating that value.
• If the number of values (n) is infinitely large, as is assumed by many statistical models, you cannot calculate their average by adding them up and dividing by infinity!
• The best measure of location depends upon what is being averaged, and to what use you wish to put that average.
• Even when an average is a simple and reasonable measure of location, it can be highly misleading.
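
The first two points can be illustrated with a rough R sketch. It shows three of the many ways to arrive at the same arithmetic mean, and then the mean of an infinitely large population obtained as an expected value instead of a sum. The normal model with mean 3 and standard deviation 1 is purely an assumption made for this illustration.

```r
y <- c(-2.1, -1, 0, 0, 0, 0, 0, 1, 2)

# 1. Sum of the values divided by their number
m_simple <- sum(y) / length(y)

# 2. Each distinct value weighted by how often it occurs
freq <- table(y)
values <- as.numeric(names(freq))
m_weighted <- sum(values * as.numeric(freq)) / sum(freq)

# 3. Running mean, updated one value at a time
m_running <- 0
for (i in seq_along(y)) {
  m_running <- m_running + (y[i] - m_running) / i
}

print(c(simple = m_simple, weighted = m_weighted, running = m_running))

# For an infinite population there is nothing to add up; the mean is instead
# the expected value under an assumed model (here a normal distribution)
model_mean <- integrate(function(x) x * dnorm(x, mean = 3, sd = 1),
                        lower = -Inf, upper = Inf)$value
print(model_mean)
```

The three finite calculations should agree to within rounding error, while the last one depends entirely on the model assumed.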

### Test yourself

We can illustrate some of the beginners statistical issues raised above with

### Useful references

Huff, D. (1954). How to lie with statistics. Victor Gollancz, London. This gives the topic a lighter treatment, emphasizing the problems with the misuse of the arithmetic mean for heavily skewed data. Although very old, this book is still worth looking at. Recommended!

Wikipedia: Average. Notes that the most appropriate statistic to use for a measure of central tendency depends on the nature of the data.
 Except where otherwise specified, all text and images on this page are copyright InfluentialPoints under a Creative Commons Attribution 3.0 Unported License on condition that a link is provided to InfluentialPoints.com