It is relatively easy to calculate the probability (p) of randomly selecting exactly the same combination of observations as in your sample. Assuming each of your n observations is different from all the others, that probability is
$p = \frac{n!}{n^{n}}$
For example,
- If your sample contains 2 observations, a and b, the chance of selecting one a and one b (in any order) is $\frac{2!}{2^{2}} = \frac{2 \times 1}{2 \times 2} = \frac{1}{2} = 0.5$.
- If your sample contains 20 distinguishable observations, the probability of selecting that combination of observations is $\frac{20!}{20^{20}} \approx 2.3 \times 10^{-8}$.
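The two examples above can be checked with a few lines of Python. This is only a minimal sketch: the function name `prob_same_combination` is made up for illustration, and it assumes, as the text does, that all n observations are distinct. The brute-force check for n = 3 simply lists every possible resample and counts those containing each observation exactly once.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def prob_same_combination(n: int) -> Fraction:
    """Probability that a resample of size n, drawn with replacement from
    n distinct observations, contains each observation exactly once.
    This is n! / n**n (hypothetical helper, named for illustration)."""
    return Fraction(factorial(n), n ** n)


# The two worked examples from the text.
print(float(prob_same_combination(2)))            # 0.5
print(f"{float(prob_same_combination(20)):.2e}")  # 2.32e-08

# Brute-force check for a small case: enumerate all 3**3 ordered resamples
# of 3 distinct observations and count those that are a permutation.
n = 3
resamples = list(product(range(n), repeat=n))
hits = sum(1 for r in resamples if sorted(r) == list(range(n)))
print(Fraction(hits, len(resamples)) == prob_same_combination(n))  # True
```

Exact fractions are used here so that the small cases can be compared without rounding error; converting to a float is only needed for display.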
If you follow through the probability maths, it turns out that if you perform B bootstrap resamples of n (differing) observations then, provided $(B-1) \times p < 1$, the probability of obtaining this result twice is $\frac{1}{2} \times B \times (B-1) \times p$.
So, if you perform 2000 resamples of 20 (differing) observations, this probability is $0.5 \times 2000 \times 1999 \times 2.3 \times 10^{-8} = 0.046$. If, however, you perform 5000 resamples of 40 different observations, this probability is only $8.43 \times 10^{-10}$.
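The arithmetic in these two cases can be reproduced as follows. This is a sketch that just evaluates the formula as stated in the text, including its condition that (B-1)×p must be less than 1; the function names `prob_same_combination` and `prob_repeat` are hypothetical, chosen for this illustration.

```python
from math import factorial


def prob_same_combination(n: int) -> float:
    """p = n! / n**n for n distinct observations (hypothetical helper)."""
    return factorial(n) / n ** n


def prob_repeat(B: int, n: int) -> float:
    """Evaluate 0.5 * B * (B - 1) * p, the expression quoted in the text.
    It is only applied when (B - 1) * p < 1, as the text requires."""
    p = prob_same_combination(n)
    if (B - 1) * p >= 1:
        raise ValueError("formula requires (B - 1) * p < 1")
    return 0.5 * B * (B - 1) * p


print(f"{prob_repeat(2000, 20):.3f}")  # 0.046
print(f"{prob_repeat(5000, 40):.2e}")  # 8.43e-10
```

Python integers have arbitrary precision, so `factorial(40)` and `40 ** 40` are computed exactly and only the final division is rounded to a float.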