Biology, images, analysis, design...
"It has long been an axiom of mine that the little things are infinitely the most important"
Survey sampling methods: Use & misuse
(probability sampling, convenience sampling, cluster sampling, adaptive sampling, missing observations, non-response bias, measurement error, data validation)
Statistics courses, especially for biologists, assume formulae = understanding and teach how to do statistics, but largely ignore what those procedures assume, and how their results mislead when those assumptions are unreasonable. The resulting misuse is, shall we say, predictable...
Use and Misuse
All data in the field have to be gathered using some form of sampling. If we want the sample to be representative of the 'population', then some form of probability sampling is essential. In other words each sampling unit must have a known probability of selection. Such methods include simple random sampling, stratified random sampling and cluster random sampling.
A simple random sample of n sampling units is one in which every possible combination of n units is equally likely to be in the sample selected. In stratified random sampling the population is divided into homogeneous subgroups (strata) before sampling, and a simple random sample is taken within each stratum. It can be done using either equal allocation (same number of units sampled in each stratum) or proportional allocation (same proportion of units sampled in each stratum). A systematic sample is one in which the units selected for a sample occupy related positions in the sampling frame, the first unit being selected at random. Like stratified random sampling, it provides a more even coverage of the population, but it has a major disadvantage relative to stratified sampling in that estimates of variability in the population are not straightforward and may be biased.
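The three probability designs just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame is a complete list of units, the function names and the 'wet'/'dry' strata are hypothetical, and the systematic sampler assumes the frame size is an exact multiple of n.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n units so that every combination of n units is equally likely."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Take every k-th unit after a random start within the first interval.

    Assumes len(frame) is an exact multiple of n (a simplification).
    """
    k = len(frame) // n
    start = rng.randrange(k)  # the random start is what makes this a probability sample
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, n, rng, allocation="proportional"):
    """Take a simple random sample within each stratum.

    strata maps a stratum label to its list of units. With proportional
    allocation, rounding can make the total differ slightly from n.
    """
    total = sum(len(units) for units in strata.values())
    sample = {}
    for label, units in strata.items():
        if allocation == "equal":
            n_h = n // len(strata)
        else:
            n_h = round(n * len(units) / total)
        sample[label] = rng.sample(units, n_h)
    return sample

rng = random.Random(1)  # fixed seed only so the illustration is repeatable
frame = list(range(100))  # hypothetical frame of 100 numbered units
strata = {"wet": list(range(0, 20)), "dry": list(range(20, 100))}

srs = simple_random_sample(frame, 10, rng)
syst = systematic_sample(frame, 10, rng)
proportional = stratified_sample(strata, 10, rng)
equal = stratified_sample(strata, 10, rng, allocation="equal")
```

Note that the systematic sample depends on a single random draw (the start), which is one way to see why estimating variability from it is not straightforward.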
In cluster sampling the sampling unit is a group of individuals rather than a single individual. In one-stage cluster sampling the clusters are chosen by simple random sampling, and all secondary (evaluation) units within each selected cluster are included. In two-stage cluster sampling, a random sample of clusters is selected, and then a further (ideally random) sample of secondary (evaluation) units is taken within each selected cluster. In adaptive cluster sampling an initial random sample of units is taken, and additional sampling units are then taken in the immediate neighbourhood of each 'positive' sampling unit. This creates a set of 'networks' of sampling units, each comprising a different number of sampling units.
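A minimal sketch of one-stage and two-stage cluster sampling may make the distinction concrete. The herd and animal labels are hypothetical, the function names are ours, and adaptive cluster sampling is not covered here.

```python
import random

def one_stage_cluster_sample(clusters, n_clusters, rng):
    """Select clusters at random and keep every secondary unit in each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {c: list(clusters[c]) for c in chosen}

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Select clusters at random, then a random subsample within each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        c: rng.sample(clusters[c], min(n_per_cluster, len(clusters[c])))
        for c in chosen
    }

rng = random.Random(2)  # fixed seed only so the illustration is repeatable

# Hypothetical population: 8 herds (clusters) holding 5 to 12 animals each.
herds = {
    f"herd{i}": [f"herd{i}-animal{j}" for j in range(5 + i)]
    for i in range(8)
}

one_stage = one_stage_cluster_sample(herds, 3, rng)
two_stage = two_stage_cluster_sample(herds, 3, 4, rng)
```

In both designs the herd, not the animal, is the unit that was randomly selected, which matters when confidence limits are calculated.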
Non-probability sampling comprises haphazard sampling, purposive sampling and convenience sampling. Haphazard sampling is where a researcher is aware of the need for a random sample but it is impractical, or too expensive, to list every sampling unit. Hence procedures are adopted that will not produce a random sample, but will at least reduce bias in sample selection. Purposive sampling is where the researcher believes they can (non-randomly) select a 'representative sample'. Convenience sampling is to select whatever units happen to be most easily and cheaply available to the researcher. There should be no place in science for convenience or quota sampling, and haphazard sampling should only be used in the secondary stage of cluster sampling if probability sampling is simply not possible.
Sampling methodology has improved over the years in some disciplines, although non-probability sampling is still common. For example, we may read that it is the farmers who select which animals are sampled, and the veterinarians who are purposively selecting 'representative' farms. Unfortunately claiming a sample is representative is no substitute for a proper sampling programme. It is true that we sometimes have to use non-probability methods in the final stage of cluster sampling - we give examples of rapid survey techniques to evaluate civilian mortality rates in war situations and vaccination rates. The same problem also arises when sampling insect populations with traps. But in all these cases one should still use probability sampling in the first stage. Every effort should be made to minimize bias in the latter stages, or if this is not possible the result should be calibrated against a gold standard (for example relating trap catches to absolute population estimates).
Even where the intention is to use probability sampling, it is sometimes not properly implemented. For systematic sampling the starting unit must be randomly selected and there should be no cyclicity. One must also avoid bias at the point of final selection of the sampling unit. We give examples where these criteria may not be met. In veterinary studies one is often looking at both herd level prevalence and individual level prevalence. A common problem is that clusters are sampled randomly, but confidence limits are estimated as if the individual units were sampled randomly. Sometimes we find cluster samples are taken, but exactly what comprises a cluster is not defined. Stratification should be used if one is dealing with an aggregated population to enable good coverage, and we give examples where this was done (sampling toheroa) and not done. Adaptive sampling should offer advantages over other methods for highly aggregated populations, although as we see for sampling reptiles these advantages are not always realised.
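To illustrate the point about confidence limits, the sketch below compares a standard error that treats the cluster as the sampling unit with the naive binomial standard error that treats individuals as if they were sampled independently. The herd counts are invented for illustration, the function names are ours, and the cluster-level formula is the usual ratio-estimator approximation (finite population correction ignored), not necessarily the method used in any of the studies discussed.

```python
import math

def cluster_proportion_se(positives, sizes):
    """Proportion and its standard error with clusters as sampling units.

    Uses the ratio estimator p = sum(y_i) / sum(m_i) and
    SE = sqrt(k / (k - 1) * sum((y_i - p * m_i) ** 2)) / sum(m_i),
    where k is the number of clusters.
    """
    k = len(sizes)
    total = sum(sizes)
    p = sum(positives) / total
    ss = sum((y - p * m) ** 2 for y, m in zip(positives, sizes))
    return p, math.sqrt(k / (k - 1) * ss) / total

def naive_binomial_se(p, n):
    """Standard error that wrongly treats all n individuals as independent."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical data: positives out of 10 animals tested in each of 8 herds.
# Infection is aggregated: most herds are nearly free, a few heavily infected.
positives = [0, 1, 9, 0, 8, 0, 1, 10]
sizes = [10] * 8

p, se_cluster = cluster_proportion_se(positives, sizes)
se_naive = naive_binomial_se(p, sum(sizes))

# Design effect: how much the variance is inflated by clustering.
design_effect = (se_cluster / se_naive) ** 2
```

With aggregated data like these the design effect is well above 1, so limits based on the naive standard error would be far too narrow.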
Even if a probability sample is selected, missing observations and non-response bias can reintroduce selection bias. Although this is often recognised, there is too great a readiness to dismiss it if non-respondents are similar to respondents in one or two (often irrelevant) characteristics. Measurement error is a more serious problem - the sensitivity and specificity of farmers' diagnoses are likely to be very low, yet this is commonly ignored, and prevalences are seldom corrected for test imperfections. For sensitive issues such as food safety and reporting of notifiable diseases, reporting bias is inevitable. Observer bias is also a serious problem for wildlife counts. Such errors should prompt more attempts at data validation, but such studies are rare. We do, however, include one validation study on estimating the density of logs in woodland. Lastly, the need for a sufficient sample size cannot be over-emphasized - small samples tell one very little and can be misleading.
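One widely used way to correct an apparent prevalence for an imperfect test is the Rogan-Gladen estimator; the text above does not name a method, so this is offered as one possible approach. The sketch assumes sensitivity and specificity are known from a separate validation study, and the numbers used are hypothetical.

```python
def corrected_prevalence(apparent, sensitivity, specificity):
    """Rogan-Gladen estimate of true prevalence from apparent prevalence.

    true = (apparent + specificity - 1) / (sensitivity + specificity - 1)

    Requires sensitivity + specificity > 1 (the test must beat guessing).
    The result is clipped to the range 0 to 1.
    """
    youden = sensitivity + specificity - 1
    if youden <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (apparent + specificity - 1) / youden
    return min(1.0, max(0.0, estimate))

# Hypothetical example: 30% of animals reported positive by a diagnosis
# with 80% sensitivity and 90% specificity.
estimate = corrected_prevalence(0.30, 0.80, 0.90)
```

The correction becomes unstable as sensitivity plus specificity approaches 1, which is a further reason to validate diagnoses rather than assume they are adequate.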
What the statisticians say
Lumley (2010) provides a guide to analysis of complex surveys using R. Chambers & Skinner (2003) and Lehtonen & Pahkinen (2003) look at methods for the design and analysis of survey data. Armitage & Berry (2002) provide a useful account of survey sampling methods for the medical researcher in Chapter 19, with particular attention given to systematic, stratified and multistage sampling. Levy & Lemeshow (1991) is the most authoritative and comprehensive text on sampling methods. Two older texts, Cochran (1977) and Kish (1965), are the classics in survey sampling and are still the authorities quoted in many recent textbooks.
Thrusfield (2005) covers the main types of sampling in veterinary research in Chapter 13. Dohoo et al. (2003) provide a more comprehensive treatment of the topic. Sutherland (2006), Elzinga et al. (2001) and Krebs (1999) all provide good accounts of sampling methods for the ecologist. Bart et al. (1998) provide an excellent text on sampling methods for behavioural ecologists. Buckland et al. (1993) cover study design when using distance sampling, whilst Snedecor & Cochran (1989) give a concise account of the design and analysis of sampling in Chapter 17.
Grais et al. (2009) review lessons learnt from some recent surveys conducted in Congo, while Grais et al. (2007) and Luman et al. (2007) consider alternative methods for the second stage sample in two-stage cluster surveys. Fottrell & Byass (2008) compare the performance of various survey sampling methods. Frerichs & Shaheen (2001) demonstrate the advantages of taking a random second stage sample for small community-based surveys. Hoshaw-Woodard (1991) compares rapid survey cluster sampling with Lot Quality Assessment Sampling for assessing immunization coverage. Bennett et al. (1991) and Henderson & Sundaresan (1982) review experience with rapid survey cluster sampling. Frerichs (1989) describes the analytic procedures for rapid survey cluster sampling.
Ziller et al. (2002) compare various two-stage sample strategies for substantiating freedom from animal disease in large areas. McDermott et al. (1994) examine design and analysis for cluster sampling in veterinary applications. White et al. (2005) provide recommendations for best practice in the use of questionnaires in ecology. Sargeant et al. (2003) look at sampling designs for carnivore scent-station surveys. Thompson (1990) considers the principles and practice of adaptive cluster sampling.
Wikipedia provides a general section on sampling which covers most of the techniques. More detail on particular topics is given in simple random sampling, systematic sampling, stratified sampling, cluster sampling, quota sampling, and convenience sampling. CDC has a training module on probability proportional to size cluster sampling. Therese McGinn also gives a useful guide to how to select clusters using probability proportional to size. Putt et al. (1988) summarize appropriate sampling designs in veterinary epidemiology. Edgar Moser provides a useful account of adaptive cluster sampling.
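For readers who want to see how clusters can be selected with probability proportional to size, the sketch below uses the common systematic approach: list the clusters with their cumulative sizes, choose a random start within the first sampling interval, and step through the cumulative list at that interval. It is a generic illustration rather than a reproduction of the CDC or McGinn guides, and the village names and sizes are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def pps_systematic(sizes, n_clusters, rng):
    """Select clusters with probability proportional to size.

    sizes maps a cluster label to its size. A cluster larger than the
    sampling interval can be selected more than once.
    """
    labels = list(sizes)
    cumulative = list(accumulate(sizes[label] for label in labels))
    interval = cumulative[-1] / n_clusters
    start = rng.random() * interval  # random start within the first interval
    picks = []
    for i in range(n_clusters):
        point = start + i * interval
        # Index of the first cluster whose cumulative size exceeds the point.
        idx = min(bisect_right(cumulative, point), len(labels) - 1)
        picks.append(labels[idx])
    return picks

rng = random.Random(3)  # fixed seed only so the illustration is repeatable

# Hypothetical village sizes (number of households).
villages = {"A": 120, "B": 40, "C": 300, "D": 60, "E": 80, "F": 200}
selected = pps_systematic(villages, 4, rng)
```

Here the sampling interval is 200 households, so village C (300 households) is certain to be selected at least once, whereas village B (40 households) has only a one in five chance.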