InfluentialPoints.com Biology, images, analysis, design... 

"It has long been an axiom of mine that the little things are infinitely the most important" 

Binomial and
Definition
Standard error
Distribution
Assumptions
Mass probability function

Algebraically speaking 

Simple arithmetic tells us that, irrespective of the sample size,
If we code a success as a one and a failure as a zero, then the sum of a binary sample (ΣY) is equal to the frequency of successes (f), and its mean is equal to the proportion of successes (p = f/n).
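As a minimal sketch of this coding, the snippet below uses a made-up binary sample (the values are purely illustrative, not data from this page) and shows that its sum is the frequency of successes and its mean is their proportion.

```python
# A binary sample coded 1 = success, 0 = failure.
# These eight values are hypothetical, chosen only for illustration.
y = [1, 0, 0, 1, 0, 1, 0, 0]

n = len(y)   # sample size
f = sum(y)   # sum of Y = frequency of successes
p = f / n    # mean of Y = proportion of successes
q = 1 - p    # proportion of failures

print(f"n = {n}, f = {f}, p = {p}, q = {q}")
```

For this sample the sum is 3 and the mean is 0.375, whatever order the observations are listed in.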
The figure below shows the observed and expected frequencies of successes in random samples of 8 observations on a variable Y, where P = 0.333. Expected frequencies are calculated using the binomial mass probability function.
The distribution is unimodal, in this case with a mode at 2 to 3 successes for a sample size of 8, as would be expected for P = 0.333.
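The expected frequencies for such a figure can be sketched as follows. This assumes P is exactly 1/3 (the text quotes 0.333) and a hypothetical total of 100 samples, since the number of samples behind the figure is not stated; `binomial_pmf` is simply an illustrative helper name.

```python
from math import comb

def binomial_pmf(r, n, P):
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * P**r * (1 - P)**(n - r)

n, P = 8, 1 / 3     # sample size from the text; P = 1/3 is an assumption
n_samples = 100     # hypothetical number of samples of size 8

# Probability of each possible number of successes, 0 to n
probs = [binomial_pmf(r, n, P) for r in range(n + 1)]
# Expected number of samples showing each number of successes
expected = [n_samples * pr for pr in probs]

for r, (pr, e) in enumerate(zip(probs, expected)):
    print(f"r = {r}: probability = {pr:.4f}, expected frequency = {e:.1f}")
```

With P at exactly 1/3 the probabilities for 2 and 3 successes coincide, which is why the mode spans both values; with P = 0.333 the value for 2 successes is fractionally the larger.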
Sometimes, when sampling a binomial variable, the probability of observing the event is very small (that is P tends to zero) and the sample size is large (that is n tends to infinity). This might be the case, for example, if we were looking at the incidence of a rare disease, where only one in ten thousand people are affected. In such a situation it is difficult and tedious to estimate expected probabilities from the binomial distribution.
Fortunately, another distribution approximates the results of the binomial distribution under these circumstances: this is known as the Poisson distribution.
Since n is tending to infinity, and P is tending to zero, the sample mean is not a meaningful statistic, and the sum is used instead. Hence the sum (ΣY, or f) is an estimate of Pn, which is the expected mean frequency (λ). Since P and n are combined, λ is the only parameter of the Poisson distribution.
In reality, P is assumed to be small (but not zero), and n to be large (but not infinite). Happily, as you can see from the graphs below, the Poisson distribution approximates fairly rapidly to the binomial distribution, even at relatively modest sample sizes.
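How close the two distributions are can be checked numerically. The sketch below takes the rare disease example (one person in ten thousand affected) and assumes a hypothetical sample of 10,000 people, so that λ = Pn = 1; the helper names are illustrative.

```python
from math import comb, exp, factorial

def binomial_pmf(r, n, P):
    """Binomial probability of exactly r successes in n trials."""
    return comb(n, r) * P**r * (1 - P)**(n - r)

def poisson_pmf(r, lam):
    """Poisson probability of exactly r events when the expected count is lam."""
    return exp(-lam) * lam**r / factorial(r)

n, P = 10_000, 1 / 10_000   # hypothetical sample size; P from the example
lam = n * P                 # expected frequency, lambda = Pn

for r in range(5):
    b = binomial_pmf(r, n, P)
    po = poisson_pmf(r, lam)
    print(f"r = {r}: binomial = {b:.6f}, Poisson = {po:.6f}")

# Largest discrepancy over the first ten frequency classes
max_diff = max(abs(binomial_pmf(r, n, P) - poisson_pmf(r, lam)) for r in range(10))
print(f"largest difference = {max_diff:.2e}")
```

Note that the Poisson calculation needs only λ, whereas the binomial one needs both n and P.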
One important advantage of using the Poisson formulae is that the calculations tend to be rather more straightforward than those of the binomial. In addition, the indeterminate sample size makes it applicable to a wider range of situations. For example, if you are attempting a visual count of wildlife, each animal seen can be counted as a success and each not seen as a failure. Then, assuming the probability of success is quite small, we can still estimate the standard error of our counts, provided our other assumptions are reasonable.
The standard error of frequencies and proportions (sums and means of a binary variable) can be estimated in the same way as for those of a continuous variable, most conveniently from the sample variance formula, pq. Notice that this estimate of the standard error makes the same assumptions as the binomial distribution, and may be referred to as the binomial standard error.
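A minimal sketch of that calculation, using a made-up binary sample: the variance of the 0/1 values (with divisor n) equals pq, and the usual standard error formulae for a mean and a sum then give √(pq/n) for the proportion and √(npq) for the frequency. The data and variable names are hypothetical.

```python
from math import sqrt
from statistics import pvariance

# Hypothetical binary sample: 1 = success, 0 = failure
y = [1, 0, 0, 1, 0, 1, 0, 0]

n = len(y)
f = sum(y)          # frequency of successes
p = f / n           # proportion of successes
q = 1 - p

var_pq = p * q              # variance from the formula pq
var_direct = pvariance(y)   # variance of the 0/1 values, divisor n

se_p = sqrt(p * q / n)      # standard error of the proportion
se_f = sqrt(n * p * q)      # standard error of the frequency

print(f"pq = {var_pq}, direct variance = {var_direct}")
print(f"SE of p = {se_p:.3f}, SE of f = {se_f:.3f}")
```

Here the standard error of the proportion is about 0.171 and that of the frequency about 1.369, the latter being n times the former.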
Algebraically speaking 

Where the proportion of successes in a population is very small, and the sample size is very large, the frequency of successes in the sample is used as an estimate of their frequency in similarly large random samples of their population. In other words, f is used as an estimate of λ or Pn. Moreover, the variance of the observed frequencies is equal to the mean (expected) frequency.
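Because the variance is estimated by the count itself, the Poisson standard error of a frequency is simply its square root. A small sketch, assuming a hypothetical count of 25 animals:

```python
from math import sqrt

f = 25            # hypothetical observed count, used as the estimate of lambda
variance = f      # Poisson model: variance equals the mean frequency
se_f = sqrt(f)    # Poisson standard error of the count

print(f"count = {f}, estimated variance = {variance}, standard error = {se_f}")
```

No sample size or proportion appears in the calculation, which is what makes it usable when n is indeterminate.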
Algebraically speaking 

The skew and kurtosis of binomial and Poisson populations, relative to a normal one, can be calculated as follows:
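For a binomial population:
Skew = $(Q - P)/\sqrt{nPQ}$
Kurtosis = $(1 - 6PQ)/(nPQ)$
For a Poisson population:
Skew = $1/\sqrt{\lambda}$
Kurtosis = $1/\lambda$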
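The standard closed-form expressions for these quantities can be checked against moments computed directly from the probabilities. The sketch below does so for a binomial with n = 8 and P = 1/3 and a Poisson with λ = 2 (both chosen only for illustration), taking kurtosis as the excess over the normal value of 3; the function name is hypothetical.

```python
from math import comb, exp, factorial, sqrt

def skew_and_excess_kurtosis(values, probs):
    """Skew and excess kurtosis (kurtosis minus 3) of a discrete distribution."""
    mean = sum(v * p for v, p in zip(values, probs))
    var = sum((v - mean) ** 2 * p for v, p in zip(values, probs))
    m3 = sum((v - mean) ** 3 * p for v, p in zip(values, probs))
    m4 = sum((v - mean) ** 4 * p for v, p in zip(values, probs))
    return m3 / var ** 1.5, m4 / var ** 2 - 3

# Binomial example (illustrative parameter values)
n, P = 8, 1 / 3
Q = 1 - P
b_values = range(n + 1)
b_probs = [comb(n, r) * P**r * Q**(n - r) for r in b_values]
b_skew, b_kurt = skew_and_excess_kurtosis(b_values, b_probs)
b_skew_formula = (Q - P) / sqrt(n * P * Q)       # (Q - P) / sqrt(nPQ)
b_kurt_formula = (1 - 6 * P * Q) / (n * P * Q)   # (1 - 6PQ) / (nPQ)

# Poisson example, truncated at r = 99 where the remaining tail is negligible
lam = 2.0
p_values = range(100)
p_probs = [exp(-lam) * lam**r / factorial(r) for r in p_values]
p_skew, p_kurt = skew_and_excess_kurtosis(p_values, p_probs)
p_skew_formula = 1 / sqrt(lam)                   # 1 / sqrt(lambda)
p_kurt_formula = 1 / lam                         # 1 / lambda

print(f"binomial: skew {b_skew:.4f} vs {b_skew_formula:.4f}, "
      f"kurtosis {b_kurt:.4f} vs {b_kurt_formula:.4f}")
print(f"Poisson:  skew {p_skew:.4f} vs {p_skew_formula:.4f}, "
      f"kurtosis {p_kurt:.4f} vs {p_kurt_formula:.4f}")
```

Both measures shrink towards zero as nPQ or λ increases, consistent with the distributions approaching normal.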
Although the binomial is a discrete distribution function, in some ways the sums (= frequencies) and means (= proportions) of binary variables behave very similarly to those of continuous variables.
In relation to sample size (n):
In relation to the proportion of successes (P):
For computational convenience therefore, a normal distribution function is commonly used as an approximation to the binomial one, providing PQn is at least 5. However, we still have the problem that we are using a continuous distribution to approximate a discrete one. This is commonly dealt with by using a continuity correction which consists of subtracting 0.5 from the
For example, the first graph below shows a normal approximation to a binomial function. Even though PQn is less than 5, the normal density function is not too implausible a fit. However the cumulative function shows a clear difference (of half a unit) between their
For large sample sizes the continuity correction may be ignored, but for moderate sample sizes it is generally required when a continuous function is used to approximate a discrete one. Be aware, however, that although continuity corrections hide within quite a few textbook formulae, not all statisticians agree upon when (or indeed if) they should be used.
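The effect of the correction can be sketched for a cumulative probability. The example assumes n = 30 and P = 0.4 (so PQn = 7.2) and asks for the probability of 9 or fewer successes; applying the correction by evaluating the normal function half a unit beyond the observed frequency is one common form, assumed here rather than taken from this page.

```python
from math import comb, sqrt
from statistics import NormalDist

n, P = 30, 0.4     # hypothetical values; PQn = 7.2, so at least 5
Q = 1 - P
r = 9              # we want the probability of r or fewer successes

# Exact cumulative binomial probability
exact = sum(comb(n, k) * P**k * Q**(n - k) for k in range(r + 1))

# Normal approximation with the binomial mean and standard deviation
normal = NormalDist(mu=n * P, sigma=sqrt(n * P * Q))
uncorrected = normal.cdf(r)        # no continuity correction
corrected = normal.cdf(r + 0.5)    # half-unit continuity correction

print(f"exact = {exact:.4f}")
print(f"normal, uncorrected = {uncorrected:.4f}")
print(f"normal, corrected   = {corrected:.4f}")
```

In this case the corrected value lands much closer to the exact binomial sum than the uncorrected one does.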
Like the binomial distribution the Poisson is discrete, and for large values of λ it approaches normal. For small expected frequencies, like the binomial, it is markedly skewed. Where the frequency is 5 or above the normal distribution is often used as an approximation, usually with a continuity correction.
Because the binomial and Poisson models are often used descriptively, or merely assumed to apply, their assumptions tend to get ignored. This is partly because they tend to be used in different ways, some of which require additional assumptions.
Where observations are recorded as groups (for example, using a quadrat), you are assuming the observations within each group behave as if they were selected individually at random. For this to be reasonable, some additional assumptions must be made:
If you use a complete count of events in a time period, or of organisms in an area, then the issue of how the sample is taken does not arise. But if you are taking a sample, for example counting the number of insects within each of a number of quadrats, the quadrats are assumed to be randomly located.
When the binomial or Poisson model is used to estimate standard errors, or to predict how p or f will vary, it is further assumed that there is no source of variation other than random selection (no measurement error, for example).
If deviations from the binomial or Poisson distributions are used as estimates of nonrandomness, it is assumed that all the other assumptions are met.
N.B. Although a Poisson model is expected to produce counts whose variance and mean are equal, the converse does not apply. Nor is a variance greater than the mean a reliable measure of 'aggregation'.
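A small counterexample illustrates why equal variance and mean do not imply a Poisson model. The counts below are artificial (half the quadrats hold no individuals, half hold exactly two), constructed only to make the point.

```python
from math import exp
from statistics import mean, pvariance

# Artificial counts: 50 quadrats with 0 individuals, 50 with exactly 2
counts = [0, 2] * 50

m = mean(counts)        # mean count per quadrat
v = pvariance(counts)   # variance of the counts, divisor n

# Under a Poisson model with this mean, a count of exactly 1
# would be the joint most likely outcome
poisson_p1 = exp(-m) * m
observed_p1 = counts.count(1) / len(counts)

print(f"mean = {m}, variance = {v}")
print(f"P(count = 1): Poisson model = {poisson_p1:.3f}, observed = {observed_p1}")
```

The variance equals the mean, yet a count of one never occurs, so these counts are clearly not Poisson distributed.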
The parameters P and Q are usually estimated from the numbers in a sample with and without the characteristic. Alternatively they may be obtained from theory such as in genetics studies. The probability of getting r individuals with the characteristic in n observations can then be determined from the general formula for the binomial distribution:
Algebraically speaking 
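In its standard form, and using the symbols of the text (with Pr(r) introduced here to denote the probability of r individuals having the characteristic), the binomial mass probability function can be written as:

```latex
\Pr(r) = \frac{n!}{r!\,(n-r)!}\; P^{r}\, Q^{\,n-r},
\qquad Q = 1 - P, \quad r = 0, 1, \dots, n
```

The factorial term counts the number of ways in which r successes can be arranged among n observations, and the remaining terms give the probability of any one such arrangement.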

The following expression gives the probabilities for each frequency class:
