Survey Sampling Methods
Random and systematic sampling
Simple random sampling
You must first be able to list your individual sampling units in some way. This applies whether your sampling unit is a person, a rodent, a tree or an insect. In most cases they will need to be tagged in some way so they can be identified. If you cannot list the individual sampling units, you cannot take a simple random sample (although there are other options as we see below).
The best way to select your sample is to select n numbers from a table of random numbers. Another way is to generate n numbers on the computer using a random number generator. If all else fails, write the numbers of all the sampling units on pieces of paper. Fold the pieces of paper so the numbers are not visible, put them in a box and shake them up. Then select the required number of units, preferably with your eyes shut. Unfortunately this last method is not truly random, but hopefully it will not be unduly biased.
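Where a computer is available, the random number generator option might look something like the sketch below. It assumes the sampling units have already been listed and numbered 1 to N; the function name `simple_random_sample` and the optional seed are illustrative choices, not part of the original text.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct unit numbers from 1..population_size."""
    # Assumption: an optional seed is used only so the draw can be reproduced.
    rng = random.Random(seed)
    # sample() draws without replacement, so no unit is selected twice.
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

units = simple_random_sample(100, 12, seed=1)
print(units)
```

Every unit has the same chance of selection, which is the defining property of a simple random sample.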
For systematic sampling the starting point should be chosen randomly in order to avoid bias. In the diagram right, we wanted to select n = 12 units from a population of N = 100, so k = N/n = 100/12 = 8 1/3. We used random number tables to select a number between 1 and 8 (k rounded down) as our starting point. The number selected was 6, so starting there, we then selected every 8th unit - giving a total sample size of 12.
Since N is fairly small, it would have been better had we employed the exact selection interval of k = 8 1/3 units, rounding each result to the nearest whole number. Thus, instead of (6+0)=6, (6+8)=14, (6+16)=22, (6+24)=30... we should have used (6+0)=6, (6+8 1/3)=14, (6+16 2/3)=23, (6+25)=31...
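The fractional-interval selection described above can be sketched in Python as follows. This is only an illustration: the function name `systematic_sample` is hypothetical, and the starting unit is passed in as an argument on the assumption that it has already been chosen at random.

```python
import math

def systematic_sample(population_size, sample_size, start):
    """Return unit numbers selected with the exact interval k = N/n.

    start is the randomly chosen first unit (assumed to lie between 1 and k).
    """
    k = population_size / sample_size  # e.g. 100/12 = 8.33...
    # Round to the nearest whole number explicitly (halves go up);
    # Python's built-in round() would send halves to the nearest even number.
    return [math.floor(start + i * k + 0.5) for i in range(sample_size)]

selected = systematic_sample(100, 12, start=6)
print(selected)
# [6, 14, 23, 31, 39, 48, 56, 64, 73, 81, 89, 98]
```

With the exact interval the last unit selected is 98, so the sample spans the whole list; with a fixed interval of 8 it would stop at 94.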
The need for an initial random selection means that, even for a systematic sample, you must be able to list all units in the population - or at least locate them unambiguously. You also have to know the total number of units in order to select the sampling interval to get your desired sample size. Sometimes the first unit is haphazardly selected, although this can lead to bias - especially if you interpret haphazard to mean convenience and also select a convenient value of k.
If systematic sampling is being used to select quadrats in a field, the distance between plots can be measured by the number of paces. The distance between sampling units does not have to be measured too precisely, providing there is no risk of bias in the precise positioning of the sample. If there is such a risk, it is better not to look at the ground for the last few paces.
One stage cluster sampling
This is done in the following way.
We first take an example where there are the same number of secondary units in each cluster.
Let us take an example of sampling cages each of 100 laying hens. We take a random sample of 12 cages, and determine the proportion suffering from a particular nutritional disorder in each cage.
Using the formula for one stage cluster sampling with equal numbers per cluster:
Had we used the binomial formula assuming a simple random sample we would have got a confidence interval of 0.269 to 0.326. This interval is narrower than the correct one, and emphasises the importance of ensuring that conditions really are met for use of the binomial formula.
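To illustrate why the binomial interval is too narrow when units are sampled in clusters, the sketch below compares the two calculations. It uses a common textbook form for equal-sized clusters (mean of the cluster proportions, with a standard error based on the variation between clusters), which may differ in detail from the formula used on this page. The cage proportions are hypothetical, not the data from the example, and the critical t value is supplied by the user.

```python
import math
import statistics

def cluster_ci_equal(proportions, t_crit):
    """Interval for a proportion from one-stage cluster sampling, equal cluster sizes.

    proportions: observed proportion in each sampled cluster.
    t_crit: two-sided critical t value for len(proportions) - 1 degrees of freedom.
    Assumes only a small fraction of all clusters was sampled (no finite
    population correction is applied).
    """
    n = len(proportions)
    p_bar = statistics.mean(proportions)
    # The standard error reflects variation between clusters, not between hens.
    se = statistics.stdev(proportions) / math.sqrt(n)
    return p_bar, (p_bar - t_crit * se, p_bar + t_crit * se)

def binomial_ci(p, total_units, z=1.96):
    """Normal approximation interval that (wrongly here) treats units as independent."""
    se = math.sqrt(p * (1 - p) / total_units)
    return (p - z * se, p + z * se)

# Hypothetical proportions affected in 12 cages of 100 hens each.
cages = [0.10, 0.45, 0.22, 0.38, 0.15, 0.52, 0.30, 0.08, 0.41, 0.27, 0.35, 0.33]
p_bar, cluster_interval = cluster_ci_equal(cages, t_crit=2.201)  # t for 11 df, 95%
naive_interval = binomial_ci(p_bar, total_units=12 * 100)
print(round(p_bar, 3), [round(x, 3) for x in cluster_interval])
print([round(x, 3) for x in naive_interval])
```

When proportions vary a lot between cages, as in these made-up data, the cluster-based interval is much wider than the binomial one.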
In this example there are an unequal number of secondary units in each cluster.
Let's take an example of doing a survey to determine the level of immunization coverage for children against measles in a district. You don't have a list of all the children in the district so you cannot take a simple random sample. But you do have a list of schools in the district. Hence you take a random sample of schools. You then sample all pupils from each
Let's take as an example sampling children of a particular age in schools for their percentage immunity to a disease. The schools are selected randomly and all children of the chosen age at the selected schools are tested for immunity. Not surprisingly the number of children sampled at each school varies widely.
The total number of children sampled is 3046, of whom 928 are immune.
Using the formula given above for one stage cluster sampling with unequal numbers per cluster:
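For readers who want to try the calculation, a common form of the ratio estimator for clusters of unequal size can be sketched as below. It is an assumption that this matches the formula referred to here; it also ignores the finite population correction, and the school data shown are hypothetical rather than the survey data from the example.

```python
import math

def cluster_ratio_estimate(positives, sizes):
    """Ratio estimate of a proportion from one-stage cluster sampling.

    positives: number of positive units in each sampled cluster.
    sizes: number of units examined in each sampled cluster.
    Returns (p, se). Assumes few of the available clusters were sampled,
    so no finite population correction is applied.
    """
    n = len(sizes)
    total = sum(sizes)
    p = sum(positives) / total       # overall proportion, not the mean of cluster proportions
    mean_size = total / n
    # Deviation of each cluster's count from what p predicts for its size.
    ss = sum((a - p * m) ** 2 for a, m in zip(positives, sizes))
    se = math.sqrt(ss / (n * (n - 1))) / mean_size
    return p, se

# Hypothetical data: children immune and children tested at five schools.
immune = [30, 20, 75, 35, 60]
tested = [120, 45, 300, 80, 210]
p, se = cluster_ratio_estimate(immune, tested)
print(round(p, 3), round(se, 3))
```

When every cluster has the same size, this reduces to the standard deviation of the cluster proportions divided by the square root of the number of clusters.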
Two stage cluster sampling
This is done in the following way.
Selection by probability proportional to size
Stratified random sampling
Let us take an example of carrying out a survey to determine the prevalence of an allergy in children within a district. The sample was stratified according to rural or urban, as it was anticipated that the rural prevalence rates may be higher. The number of children living in each stratum was known from a recently conducted census. It was decided to use proportional allocation and sample 5% of children in each stratum.
Since sample sizes were determined by proportional allocation, we could have used the simplified formula, but we will use the general formulae for demonstration purposes.
The 95% normal approximation binomial confidence interval to the weighted proportion is then given by:
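The weighted proportion and its normal approximation interval can be computed along the lines of the sketch below. It uses one common form of the general stratified formulae (stratum weights N_h/N, a finite population correction per stratum, and n_h - 1 in the variance), which is an assumption and may differ slightly from the version used on this page. The stratum figures are hypothetical.

```python
import math

def stratified_proportion(strata, z=1.96):
    """Weighted proportion and interval from a stratified random sample.

    strata: list of (N_h, n_h, a_h) tuples giving, for each stratum, the
    population size, the sample size and the number of positives sampled.
    """
    N = sum(N_h for N_h, _, _ in strata)
    p_st = 0.0
    variance = 0.0
    for N_h, n_h, a_h in strata:
        W_h = N_h / N            # stratum weight
        p_h = a_h / n_h          # stratum sample proportion
        fpc = 1 - n_h / N_h      # finite population correction for the stratum
        p_st += W_h * p_h
        variance += W_h ** 2 * fpc * p_h * (1 - p_h) / (n_h - 1)
    se = math.sqrt(variance)
    return p_st, se, (p_st - z * se, p_st + z * se)

# Hypothetical census counts with a 5% sample in each stratum.
strata = [
    (8000, 400, 60),    # rural: population, sample size, children with the allergy
    (12000, 600, 60),   # urban
]
p_st, se, interval = stratified_proportion(strata)
print(round(p_st, 3), round(se, 4), [round(x, 3) for x in interval])
```

Under proportional allocation the weighted proportion equals the simple pooled proportion (here 120/1000), which is why a simplified formula is available in that case.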
Adaptive cluster sampling
We will take the example given in the core text where 12 initial samples were taken. Two samples were positive so adjacent units were also sampled, giving the networks shown in the figure:
Since the initial sample constituted more than 5% of the population, we multiply the standard error by the finite population correction √[1 − 12/100], giving a corrected standard error of 0.144.
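The size of this correction is easy to check. The sketch below simply evaluates the square-root factor for an initial sample of 12 units from a population of 100; the function name is illustrative.

```python
import math

def finite_population_correction(sample_size, population_size):
    """Multiplier applied to a standard error when the sampling fraction is not negligible."""
    return math.sqrt(1 - sample_size / population_size)

fpc = finite_population_correction(12, 100)
print(round(fpc, 3))  # 0.938
```

The correction therefore reduces the standard error by about 6% in this example.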