"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)



Frequency & Probability

Analyzing frequencies is central to data analysis because it allows you to estimate probabilities. For example, if you drop a coin a large number of times and record which side lands face up, you can plot your observations as a bar graph: one bar represents the number of 'heads', the other the number of 'tails'. If your sample of observations is large enough, you would expect the bars to be about the same size. In other words, you would expect the coin to be equally likely to land on either side.
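The coin example can be tried out with a short simulation. This is only a minimal sketch: the function names (toss_counts, text_bar_graph), the seed, and the use of a pseudo-random number generator in place of a real coin are all assumptions made for illustration.

```python
import random


def toss_counts(n_tosses, p_heads=0.5, seed=1):
    """Simulate n_tosses coin drops and count how often each side turns up.

    p_heads = 0.5 models an unbiased coin; the seed is fixed (an arbitrary
    choice) only so that repeated runs give the same counts.
    """
    rng = random.Random(seed)
    heads = sum(rng.random() < p_heads for _ in range(n_tosses))
    return {"heads": heads, "tails": n_tosses - heads}


def text_bar_graph(counts, width=40):
    """Draw one bar per side, with bar length proportional to its frequency."""
    total = sum(counts.values())
    lines = []
    for side, count in counts.items():
        bar = "#" * round(width * count / total)
        lines.append(f"{side:<5} {bar} {count}")
    return "\n".join(lines)


counts = toss_counts(10_000)
print(text_bar_graph(counts))
```

With 10,000 simulated tosses the two bars should come out nearly equal in length. Re-running with a much smaller n_tosses (say 4 or 10) shows how uneven the bars can be in a small sample, even when the coin is unbiased.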

But what would you conclude if one bar of your graph was 3 times the size of the other? There are two possible explanations:

  1. It was just luck. In other words, it was an unlikely result that arose by chance alone.
  2. The coin was 'bent', and was somehow biased towards landing on one particular side. In other words, the result you observed was not due to chance alone.

Obviously, if you had only dropped the coin 4 times (giving 3 of one side and 1 of the other), your evidence for the second conclusion would not be very strong. But if you had dropped the coin 4000 times, 3000 'heads' (or 3000 'tails') would be very unlikely if the coin was unbiased. In Unit 4 we consider how to work out just how unlikely that would be. In later units we will use that logic to solve some much more interesting problems.
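To get a feel for the difference between 4 tosses and 4000 tosses, the probabilities can be computed directly. The sketch below is not necessarily the method used in Unit 4. It is an exact binomial calculation that assumes the tosses are independent and the coin is unbiased, and the function name prob_at_least is a hypothetical one chosen for this example.

```python
from math import comb


def prob_at_least(n_tosses, k_heads):
    """Probability of k_heads or more heads in n_tosses of an unbiased coin.

    Each of the 2**n_tosses possible sequences of heads and tails is equally
    likely, so we count the sequences with at least k_heads heads and divide
    by the total number of sequences.
    """
    favourable = sum(comb(n_tosses, k) for k in range(k_heads, n_tosses + 1))
    return favourable / 2 ** n_tosses


print(prob_at_least(4, 3))        # 0.3125, so 3 heads in 4 tosses is unremarkable
print(prob_at_least(4000, 3000))  # a vanishingly small number
```

These are probabilities for 'heads' only. Because 3 or more of one side in 4 tosses (or 3000 or more in 4000) cannot happen for both sides at once, doubling each value gives the probability of so lopsided a result in either direction.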

We can consider this analysis in another way.

When you try dropping a coin, or doing an experiment, you are taking a 'sample' of the possible observations. If the sample is small, you will have little confidence in the result, because a small number of observations is not a good indication of whatever you are trying to examine.

If you take a large sample, containing many observations, you are likely to get a pretty good idea of how common each sort of observation is in the population your sample represents. In other words, provided your observations are an unbiased selection from their population, you would assume the proportion of any class of results in your sample reflects its frequency in the population as a whole. The larger the sample, or the more often you observe a similar result, the more confidence you would have in your result.
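The effect of sample size can be illustrated by drawing unbiased samples from a made-up population. Everything in this sketch is an assumption for illustration only: the population of 100,000 individuals, the 30% that belong to the class of interest, the sample sizes, and the function name sample_proportion.

```python
import random


def sample_proportion(population, sample_size, rng):
    """Proportion of the class of interest (coded 1) in a random sample.

    rng.sample draws without replacement and gives every individual the
    same chance of selection, so the sample is unbiased.
    """
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size


rng = random.Random(42)  # fixed seed (arbitrary) so runs are repeatable

# Hypothetical population: 30% of individuals belong to the class (1),
# the remaining 70% do not (0).
population = [1] * 30_000 + [0] * 70_000

for n in (10, 100, 1_000, 10_000):
    print(n, sample_proportion(population, n, rng))
```

The proportions from the small samples can fall some way from 0.3, while those from the larger samples tend to lie close to it. That is the sense in which a larger sample justifies more confidence in the result.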

Clearly, therefore, proportions and probabilities are closely linked. This link is a powerful tool, if you understand how to use it.