InfluentialPoints.com

Beginners statistics: mode

Example, with R

Modes are the most common values in a set of data.
For example, these numbers (0 0 0 0 1 2 20 21 22 23 100) have one obvious mode: 0.

You can find a single mode, such as that, with R.
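A minimal sketch of one way to do so, assuming the values are stored in a vector called y (the name used further down this page). It is an illustration only, not necessarily the listing this page originally showed.

```r
# The example values from the text
y <- c(0, 0, 0, 0, 1, 2, 20, 21, 22, 23, 100)

# table() counts how often each distinct value occurs;
# which.max() picks the first value with the highest count
single_mode <- as.numeric(names(which.max(table(y))))
single_mode
```

Note that which.max() only ever returns one value, so if two values tie for most common it quietly reports the first of them.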


This works well enough where there is a single identifiable mode - which may not be the case.

The variable called dist will hold the frequency of every value in variable y; entering dist will reveal its contents.
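One way such a frequency table might be built is sketched below, on the assumption that dist is created with table(). The tied sample y2 is hypothetical, added only to show what happens when there is more than one mode.

```r
# The example values from the text
y <- c(0, 0, 0, 0, 1, 2, 20, 21, 22, 23, 100)

# Frequency of every distinct value in y
dist <- table(y)
dist

# Every value whose frequency equals the highest frequency
all_modes <- as.numeric(names(dist)[dist == max(dist)])
all_modes

# Hypothetical sample in which two values tie for most common
y2 <- c(1, 1, 2, 2, 3)
dist2 <- table(y2)
tied_modes <- as.numeric(names(dist2)[dist2 == max(dist2)])
tied_modes
```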


Definition and Use

  1. A mode is a value (in a set of values) which is more common than other values in that set.
    • Conventional statistics commonly assumes data will have a single mode, close to its midrange.
    • Many conventional statistical models also assume data are continuous (so every value is different), in which case a set of n values must have n modes.
    • This problem is usually avoided by arbitrarily dividing the data into class-intervals.
  2. Modes are whichever intervals contain more values than their neighbours.
    Beware: Since the choice of class-intervals is inherently arbitrary, there is no way to unambiguously identify modes!
    For example, if you set the breakpoints suitably, the set of values above has a second 'mode' around a midpoint of 21.5.
  3. The mode is most commonly used as a measure of location for nominal data, where neither the mean nor the median is applicable.
    For example, in a sample of garden bird sightings, blackbird may be the commonest species.
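For nominal data the same counting approach applies. The sketch below uses a made-up list of bird sightings (hypothetical data, not taken from any survey) purely to illustrate the idea.

```r
# Hypothetical garden bird sightings (illustrative data only)
sightings <- c("blackbird", "robin", "blackbird", "blue tit",
               "blackbird", "robin", "house sparrow")

# Count the sightings of each species
counts <- table(sightings)
counts

# The commonest species (or several, if they tie)
modal_species <- names(counts)[counts == max(counts)]
modal_species
```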


Simple formula

Assuming y is a 'normal' (symmetrical, single-peaked) list of numbers, the midrange is the simplest way to estimate the mode:

(y_minimum + y_maximum)/2
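As a rough sketch, the midrange can be computed as below. The symmetrical sample z is hypothetical, included only to contrast with the skewed example values, for which the midrange lands nowhere near the mode.

```r
# The (strongly skewed) example values from the text
y <- c(0, 0, 0, 0, 1, 2, 20, 21, 22, 23, 100)
midrange_y <- (min(y) + max(y)) / 2
midrange_y   # far from the mode of 0, because y is not symmetrical

# Hypothetical symmetrical, single-peaked sample
z <- c(1, 2, 3, 3, 3, 4, 5)
midrange_z <- (min(z) + max(z)) / 2
midrange_z   # coincides with the most common value, 3
```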

A more popular way is to arbitrarily sub-divide y into n ranges of equal width, then select whichever range contains the most values.
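A minimal sketch of that approach, assuming n = 5 ranges (an arbitrary choice made here for illustration) and using hist() only to do the counting:

```r
# The example values from the text
y <- c(0, 0, 0, 0, 1, 2, 20, 21, 22, 23, 100)

n <- 5   # number of ranges: an arbitrary choice for illustration

# n + 1 equally spaced breakpoints from the smallest to the largest value
breaks <- seq(min(y), max(y), length.out = n + 1)

# Count the values falling in each range, without drawing a plot
h <- hist(y, breaks = breaks, plot = FALSE)
h$counts

# The range holding the most values, its limits and its midpoint
modal_range  <- which.max(h$counts)
modal_limits <- breaks[c(modal_range, modal_range + 1)]
modal_mid    <- h$mids[modal_range]
modal_limits
modal_mid
```

A different n, or a different starting point for the breakpoints, can give a different answer.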

Alternatively, the most common value may be selected.


Tips and Notes

  • If you use class-intervals to identify modes, the number and location of those modes will depend directly and strongly upon your choice of interval breakpoints.
  • Given sufficiently narrow class-intervals, discrete data will have as many modes as it has distinct values.
  • Histogram-type dotplots are equivalent to setting the class-interval width to zero.


Test yourself

You can see the effects of changing the class intervals with R.
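One way to explore this is sketched below. The helper local_modes() is hypothetical (it is not a built-in R function), and the interval widths and starting point are arbitrary choices. It reports the midpoint of every class-interval that holds more values than both of its neighbours.

```r
# The example values from the text
y <- c(0, 0, 0, 0, 1, 2, 20, 21, 22, 23, 100)

# Hypothetical helper: midpoints of class-intervals holding more
# values than both neighbouring intervals
local_modes <- function(y, width, start = min(y) - 0.5) {
  n_bins <- ceiling((max(y) - start) / width)
  breaks <- start + width * (0:n_bins)
  h <- hist(y, breaks = breaks, plot = FALSE)
  counts <- h$counts
  left   <- c(0, head(counts, -1))   # count in the interval to the left
  right  <- c(tail(counts, -1), 0)   # count in the interval to the right
  h$mids[counts > left & counts > right]
}

modes_w30 <- local_modes(y, width = 30)    # wide intervals
modes_w4  <- local_modes(y, width = 4)     # includes a 'mode' at 21.5
modes_w05 <- local_modes(y, width = 0.5)   # very narrow intervals

modes_w30
modes_w4
modes_w05
length(modes_w05)   # compare with length(unique(y))
```

By this definition even the isolated value 100 forms a small 'mode' of its own, since the intervals next to it are empty.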

It may be sobering to recall that, for very good reasons, real data is usually 'rounded' or truncated in some fashion. The next example uses values truncated to 3 decimal places.
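A minimal sketch of such an example, assuming simulated values from rnorm() stand in for real measurements (the seed, sample size and distribution are all arbitrary choices made here):

```r
set.seed(42)                      # arbitrary seed, for repeatability
raw <- rnorm(1000)                # simulated 'continuous' measurements
y3  <- trunc(raw * 1000) / 1000   # truncate to 3 decimal places

# Number of distinct values before and after truncation
n_distinct_raw <- length(unique(raw))
n_distinct_y3  <- length(unique(y3))
n_distinct_raw
n_distinct_y3

# Frequency of every truncated value, and the highest frequency found
dist3 <- table(y3)
max(dist3)

# The truncated value(s) sharing that highest frequency
as.numeric(names(dist3)[dist3 == max(dist3)])
```

Once values are truncated, some of them coincide, so 'most common values' appear even though the underlying quantity is continuous.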


Useful references

Wikipedia: Mode.