"It has long been an axiom of mine that the little things are infinitely the most important"
Introduction
R is free, very powerful, and does the boring calculations and graphs for scientists.
Example, with R
Imagine you have been collecting data - for example the colour - on a number of items.
greenishyellow yellowishgreen blue puregreen blue bluegreen
Alternatively you could summarize the data in various ways.
With so little data it is easy to produce such summaries by hand. But if you were summarizing very many results, say a million, you may prefer a bit of help. Some of those summaries can be obtained with R.
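As a minimal sketch, here is one way such summaries might be obtained for the six colour observations listed earlier. The variable names (colours, kinds, counts, commonest) are our own choices; length(), unique(), table() and which.max() are base R functions.

```r
# The six observations from the example
colours <- c("greenishyellow", "yellowishgreen", "blue",
             "puregreen", "blue", "bluegreen")

n <- length(colours)        # how many items were recorded
kinds <- unique(colours)    # the distinct colours observed
counts <- table(colours)    # how often each colour occurs

# The most frequent colour (the mode); ties would return the first alphabetically
commonest <- names(counts)[which.max(counts)]

print(n)
print(counts)
print(commonest)
```

The same lines would work unchanged on a vector of a million observations, which is where the help becomes worthwhile.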
Definition and Use
The term 'statistic' can refer to several rather different things.
Many functions are available to summarize information. For example, a salesman could equally truthfully quote the most typical cost ('on average...'), the maximum ('up to...') or the minimum ('from just $300'). The 'average', 'maximum' and 'minimum' are all statistics.
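To illustrate, here is a sketch using invented prices. The values in price are hypothetical, chosen only so that the minimum is $300; mean(), max() and min() are base R functions.

```r
# Hypothetical prices in dollars, invented for illustration
price <- c(300, 450, 500, 550, 1200)

mean(price)   # the 'on average' figure
max(price)    # the 'up to...' figure
min(price)    # the 'from just...' figure
```

All three results are truthful summaries of the same five prices, yet each leaves a rather different impression.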
Note that summary statistics of a sample are often used as estimates for the population at large. For instance, when you are told 'the average man has 1.8 children', that result was found in a sample of men, because it is usually impossible to check every man.
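The idea of estimating from a sample can be sketched with simulated data. Everything here is an assumption made for illustration: the population is artificial, and the Poisson distribution, its mean of 1.8, and the sample size of 200 are our own choices, not real survey results.

```r
set.seed(1)  # make the simulation repeatable

# A simulated 'population': number of children for each of 100000 men
# (Poisson with mean 1.8 is an assumption, not real data)
population <- rpois(100000, lambda = 1.8)

# In practice we can usually only afford to examine a sample
sample_men <- sample(population, size = 200)

mean(population)   # the population value, normally unknown
mean(sample_men)   # the sample statistic used to estimate it
```

The two means will seldom be identical; how far apart they are likely to be depends on the sample size and on how variable the population is.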
Humans, of course, use non-numerical summaries all of the time. For example when you say 'cats are smaller than dogs' you are probably describing the average situation - however some people assume you mean every cat is smaller than every dog.
Humans also use non-numerical estimates of probability, using a simple scale, ranging from impossible to certain. Research shows most people divide that scale into surprisingly few levels - seldom more than 7 - and have problems in dealing with very small probabilities.
Tips and Notes
Whilst simple numerical measures are a useful way to summarize data:
Since there are innumerable ways to summarize any set of information, and assuming no mistakes are made in making that summary, you should always ask yourself:
Governments and corporations have particular ideas of what summaries are appropriate, and may select their information and summary measures so as to achieve particular outcomes. Hence 'statistics' are commonly seen as 'lies using numbers'.
Nevertheless, since statistics are used for all sorts of important things, and because we all use statistics (consciously or otherwise), it is wise to understand something of their properties - and we do not mean you merely need to know how to calculate them, or to memorize the results of those calculations!
It is easy to get a computer to calculate a statistic; the hard part is knowing whether the result means anything - and how it may be misleading.
Consider this summary of salaries of a small company:
The company said that, since their average salary was $30000, 'our staff receive far above the industry average of $15000 per year'.
Do you think this is an appropriate way to summarize their data?
Would the mid-ranking salary be a better measure of what their average member of staff gets?
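One way to explore these questions is sketched below. The company's salary table is not reproduced here, so the figures in salary are hypothetical, invented so that their mean matches the quoted $30000; mean(), median() and summary() are base R functions.

```r
# Hypothetical salaries in dollars per year (invented for illustration):
# eight staff and one highly paid director
salary <- c(10000, 10000, 12000, 12000, 13000,
            14000, 14000, 15000, 170000)

mean(salary)      # 30000, matching the figure the company quoted
median(salary)    # 13000, the mid-ranking salary
summary(salary)   # minimum, quartiles, median, mean and maximum together
```

With these invented figures a single large salary pulls the mean well above what most staff receive. Whether that applies to any real company depends, of course, on its actual data.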
Note, on most pages we provide the R-code and a few comments/notes, and expect you to ask appropriate questions and reach your own conclusions. The object of these pages is to promote thought, not simply to impart information.