Binomial proportion confidence interval
In statistics, a binomial proportion confidence interval is a confidence interval for the probability of success calculated from the outcome of a series of success–failure experiments. In other words, a binomial proportion confidence interval is an interval estimate of a success probability when only the number of experiments and the number of successes are known.
There are several formulas for a binomial confidence interval, but all of them rely on the assumption of a binomial distribution. In general, a binomial distribution applies when an experiment is repeated a fixed number of times, each trial of the experiment has two possible outcomes, the probability of success is the same for each trial, and the trials are statistically independent. Because the binomial distribution is a discrete probability distribution and difficult to calculate for large numbers of trials, a variety of approximations are used to calculate this confidence interval, all with their own tradeoffs in accuracy and computational intensity.
A simple example of a binomial distribution is the set of various possible outcomes, and their probabilities, for the number of heads observed when a coin is flipped ten times. The observed binomial proportion is the fraction of the flips that turn out to be heads. Given this observed proportion, the confidence interval for the true probability of the coin landing on heads is a range of possible proportions, which may or may not contain the true proportion. A 95% confidence interval for the proportion, for instance, will contain the true proportion 95% of the times that the procedure for constructing the confidence interval is employed.
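The coverage statement above can be checked exactly for the ten-flip coin example by summing binomial probabilities over every possible outcome. The following is a minimal sketch, not a reference implementation: the function names `normal_approx_interval` and `exact_coverage` are hypothetical, and the interval used is the simple normal-approximation interval discussed in the next section, chosen only for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist


def normal_approx_interval(k, n, alpha=0.05):
    """Normal-approximation interval for k successes in n trials (illustrative choice)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    p_hat = k / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def exact_coverage(interval, n, p_true, alpha=0.05):
    """Probability, over all outcomes k = 0..n, that the interval contains p_true."""
    total = 0.0
    for k in range(n + 1):
        lower, upper = interval(k, n, alpha)
        if lower <= p_true <= upper:
            # Binomial probability of observing exactly k successes.
            total += comb(n, k) * p_true**k * (1 - p_true) ** (n - k)
    return total


# Fair coin flipped ten times, nominal 95% interval.
print(exact_coverage(normal_approx_interval, 10, 0.5))
```

For this particular procedure and sample size the actual coverage comes out at roughly 0.89 rather than the nominal 0.95, which is one motivation for the alternative intervals described later.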
Problems with using a normal approximation or "Wald interval"
A commonly used formula for a binomial confidence interval relies on approximating the distribution of error about a binomially-distributed observation, $\hat p$, with a normal distribution. The normal approximation depends on the de Moivre–Laplace theorem and becomes unreliable when the theorem's premises are violated, that is, as the sample size becomes small or the success probability grows close to either 0 or 1.
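Using the normal approximation, the success probability $p$ is estimated by
$$p \approx \hat p \pm \frac{z_\alpha}{\sqrt{n}} \sqrt{\hat p\,(1 - \hat p)},$$
where $\hat p \equiv n_S / n$ is the proportion of successes in a Bernoulli trial process and an estimator for $p$ in the underlying Bernoulli distribution. The equivalent formula in terms of observation counts is
$$p \approx \frac{n_S}{n} \pm \frac{z_\alpha}{\sqrt{n}} \sqrt{\frac{n_S}{n}\,\frac{n_F}{n}},$$
where the data are the results of $n$ trials that yielded $n_S$ successes and $n_F = n - n_S$ failures. The distribution function argument $z_\alpha$ is the $1 - \tfrac{\alpha}{2}$ quantile of a standard normal distribution corresponding to the target error rate $\alpha$. For a 95% confidence level, the error $\alpha = 1 - 0.95 = 0.05$, so that $1 - \tfrac{\alpha}{2} = 0.975$ and $z_{.05} = 1.96$.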
When using the Wald formula to estimate $p$, or just considering the possible outcomes of this calculation, two problems immediately become apparent:
- First, for $\hat p$ approaching either 1 or 0, the interval narrows to zero width.
- Second, for values of $\hat p < \frac{1}{1 + n / z_\alpha^2}$ (or, symmetrically, values of $1 - \hat p$ below the same threshold), the interval boundaries exceed $[0, 1]$.
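The interval can also be derived by inverting a hypothesis test: it collects the values $\theta$ of the population proportion that the test would not reject,
$$\left\{ \theta \;\middle|\; -z_\alpha \le \frac{\hat p - \theta}{\sqrt{\hat p\,(1 - \hat p)/n}} \le z_\alpha \right\},$$
where $-z_\alpha$ is the lower $\tfrac{\alpha}{2}$ quantile of a standard normal distribution, vs. $z_\alpha$, which is the upper (i.e., $1 - \tfrac{\alpha}{2}$) quantile.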
Since the test in the middle of the inequality is a Wald test, the normal approximation interval is sometimes called the Wald interval or Wald method, after Abraham Wald, but it was first described by Laplace.
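The Wald formula and its two failure modes can be sketched in a few lines. This is an illustrative sketch only: the function name `wald_interval` and its signature are assumptions made for this example, and the standard library's `statistics.NormalDist` supplies the normal quantile.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(n_success, n, alpha=0.05):
    """Normal-approximation (Wald) interval for a binomial proportion."""
    if n <= 0 or not 0 <= n_success <= n:
        raise ValueError("need n > 0 and 0 <= n_success <= n")
    z = NormalDist().inv_cdf(1 - alpha / 2)  # 1 - alpha/2 quantile of N(0, 1)
    p_hat = n_success / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    # No clipping to [0, 1], so that the overshoot problem stays visible.
    return p_hat - half_width, p_hat + half_width


print(wald_interval(50, 100))  # moderate p_hat: a plausible interval
print(wald_interval(0, 10))    # p_hat = 0: the interval collapses to zero width
print(wald_interval(1, 10))    # small p_hat: the lower bound falls below 0
```

The second call shows the zero-width problem and the third shows overshoot, since 1/10 is below the threshold $1/(1 + n/z_\alpha^2) \approx 0.28$ for $n = 10$.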
Bracketing the confidence interval
Extending the normal approximation and Wald-Laplace interval concepts, Michael Short has shown that inequalities on the approximation error between the binomial distribution and the normal distribution can be used to accurately bracket the estimate of the confidence interval around $p$. In these inequalities, $p$ is again the proportion of successes in a Bernoulli trial process measured with $n$ trials yielding $n_S$ successes, $z_\alpha$ is the $1 - \tfrac{\alpha}{2}$ quantile of a standard normal distribution corresponding to the target error rate $\alpha$, and the constants appearing in the bounds are simple algebraic functions of $z_\alpha$. For a fixed $\alpha$ (and hence $z_\alpha$), the inequalities give easily computed one- or two-sided intervals which bracket the exact binomial upper and lower confidence limits corresponding to the error rate $\alpha$.
Standard error of a proportion estimation when using weighted data
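Let there be a simple random sample $X_1, \ldots, X_n$ where each $X_i$ is i.i.d. from a Bernoulli($p$) distribution and $w_i$ is the weight for each observation, with the weights normalized so they sum to 1. The weighted sample proportion is:
$$\hat p = \sum_{i=1}^{n} w_i X_i.$$
Since each of the $X_i$ is independent from all the others, and each one has variance $\operatorname{var}\{X_i\} = p(1 - p)$ for every $i = 1, \ldots, n$, the sampling variance of the proportion therefore is:
$$\operatorname{var}\{\hat p\} = \sum_{i=1}^{n} w_i^2 \operatorname{var}\{X_i\} = p(1 - p) \sum_{i=1}^{n} w_i^2.$$
The standard error of $\hat p$ is the square root of this quantity. Because we do not know $p(1 - p)$, we have to estimate it. Although there are many possible estimators, a conventional one is to use $\hat p$, the sample mean, and plug this into the formula. That gives:
$$\operatorname{SE}\{\hat p\} \approx \sqrt{\hat p\,(1 - \hat p) \sum_{i=1}^{n} w_i^2}.$$
For otherwise unweighted data, the effective weights are uniform, $w_i = 1/n$, giving $\sum_{i=1}^{n} w_i^2 = 1/n$. The standard error becomes $\sqrt{\hat p\,(1 - \hat p)/n}$, leading to the familiar formulas, showing that the calculation for weighted data is a direct generalization of them.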
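The weighted standard error can be computed directly from the definitions above. The sketch below is illustrative; the function name `weighted_proportion_se` is hypothetical, and it normalizes the weights itself so that callers may pass unnormalized positive weights (an added convenience, not something the text requires).

```python
from math import sqrt


def weighted_proportion_se(outcomes, weights):
    """Return (p_hat, SE) for 0/1 outcomes with positive observation weights."""
    if len(outcomes) != len(weights) or not outcomes:
        raise ValueError("outcomes and weights must be non-empty and equally long")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    total = sum(weights)
    w = [wi / total for wi in weights]  # normalize so the weights sum to 1
    p_hat = sum(wi * xi for wi, xi in zip(w, outcomes))
    # Plug-in estimate: sqrt(p_hat * (1 - p_hat) * sum of squared weights).
    se = sqrt(p_hat * (1 - p_hat) * sum(wi * wi for wi in w))
    return p_hat, se


print(weighted_proportion_se([1, 0, 1, 1, 0], [2, 1, 1, 3, 3]))
print(weighted_proportion_se([1, 0, 1, 1], [1, 1, 1, 1]))  # uniform weights
```

With uniform weights the second call reduces to the unweighted formula $\sqrt{\hat p(1 - \hat p)/n}$.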
Wilson score interval
The Wilson score interval was developed by E.B. Wilson. It is an improvement over the normal approximation interval in multiple respects: unlike the symmetric normal approximation interval, the Wilson score interval is asymmetric, and it does not suffer from the problems of overshoot and zero-width intervals that afflict the normal interval. It can be safely employed with small samples and skewed observations. The observed coverage probability is consistently closer to the nominal value, $1 - \alpha$.
Like the normal interval, the interval can be computed directly from a formula.
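Wilson started with the normal approximation to the binomial:
$$z_\alpha \approx \frac{p - \hat p}{\sigma_n},$$
where $z_\alpha$ is the standard normal interval half-width corresponding to the desired confidence $1 - \alpha$. The analytic formula for a binomial sample standard deviation is
$$\sigma_n = \sqrt{\frac{p(1 - p)}{n}}.$$
Combining the two, and squaring out the radical, gives an equation that is quadratic in $p$:
$$(\hat p - p)^2 = z_\alpha^2\,\frac{p(1 - p)}{n},$$
or
$$p^2 - 2 p \hat p + \hat p^2 = p\,\frac{z_\alpha^2}{n} - p^2\,\frac{z_\alpha^2}{n}.$$
Transforming the relation into a standard-form quadratic equation for $p$, treating $\hat p$ and $n$ as known values from the sample, and using the value of $z_\alpha$ that corresponds to the desired confidence $1 - \alpha$ for the estimate of $p$, gives this:
$$\left(1 + \frac{z_\alpha^2}{n}\right) p^2 - \left(2 \hat p + \frac{z_\alpha^2}{n}\right) p + \left(\hat p^2\right) = 0,$$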
where all of the values bracketed by parentheses are known quantities.
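The solution for $p$ estimates the upper and lower limits of the confidence interval for $p$. Hence the probability of success $p$ is estimated by $\hat p$ and, with $1 - \alpha$ confidence, bracketed in the interval
$$p \in (w^-, w^+) = \frac{1}{1 + z_\alpha^2/n} \left( \hat p + \frac{z_\alpha^2}{2n} \pm \frac{z_\alpha}{2n} \sqrt{4 n \hat p\,(1 - \hat p) + z_\alpha^2} \right),$$
where $p \in (w^-, w^+)$ is an abbreviation for $w^- \le p \le w^+$.
An equivalent expression using the observation counts $n_S$ and $n_F$ is
$$p \approx \frac{n_S + \tfrac{1}{2} z_\alpha^2}{n + z_\alpha^2} \pm \frac{z_\alpha}{n + z_\alpha^2} \sqrt{\frac{n_S\, n_F}{n} + \frac{z_\alpha^2}{4}},$$
with the counts as above: $n_S$ is the count of observed "successes", $n_F$ is the count of observed "failures", and their sum is the total number of observations, $n = n_S + n_F$.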
In practical tests of the formula's results, users find that this interval has good properties even for a small number of trials and/or the extremes of the probability estimate, $\hat p$.
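The closed-form Wilson limits can be evaluated directly. The following is a minimal sketch under the definitions above; the function name `wilson_interval` and its signature are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(n_success, n, alpha=0.05):
    """Wilson score interval (w_minus, w_plus) for a binomial proportion."""
    if n <= 0 or not 0 <= n_success <= n:
        raise ValueError("need n > 0 and 0 <= n_success <= n")
    z = NormalDist().inv_cdf(1 - alpha / 2)  # 1 - alpha/2 quantile of N(0, 1)
    p_hat = n_success / n
    scale = 1 / (1 + z * z / n)
    center = scale * (p_hat + z * z / (2 * n))
    half_width = scale * (z / (2 * n)) * sqrt(4 * n * p_hat * (1 - p_hat) + z * z)
    return center - half_width, center + half_width


print(wilson_interval(50, 100))  # moderate p_hat
print(wilson_interval(0, 10))    # extreme p_hat: nonzero width, stays within [0, 1]
```

Unlike the normal-approximation interval, the second call yields an interval of nonzero width whose lower limit is zero up to floating-point rounding.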
Intuitively, the center value of this interval is the weighted average of $\hat p$ and $\tfrac{1}{2}$, with $\hat p$ receiving greater weight as the sample size increases. Formally, the center value corresponds to using a pseudocount of $\tfrac{1}{2} z_\alpha^2$, where $z_\alpha$ is the number of standard deviations of the confidence interval: add this number to both the count of successes and the count of failures to yield the estimate of the ratio. For the common interval of two standard deviations in each direction, this yields the estimate $(n_S + 2)/(n + 4)$, which is known as the "plus four rule".
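The weighted-average reading of the center value follows from a short rearrangement of the Wilson formula. The worked steps below use $n_S$ for the success count and $z_\alpha$ for the normal quantile, as in the Wilson formula, and substitute $z_\alpha = 2$ in the last line as the usual rounding of 1.96.

```latex
\begin{align*}
\text{center}
  &= \frac{\hat p + \dfrac{z_\alpha^2}{2n}}{1 + \dfrac{z_\alpha^2}{n}}
   = \frac{n_S + \tfrac{1}{2} z_\alpha^2}{n + z_\alpha^2} \\
  &= \frac{n}{n + z_\alpha^2}\,\hat p
   + \frac{z_\alpha^2}{n + z_\alpha^2}\cdot\frac{1}{2} \\
  &= \frac{n_S + 2}{n + 4} \qquad \text{for } z_\alpha = 2 .
\end{align*}
```

The two weights sum to 1, and the weight on $\hat p$ tends to 1 as $n$ grows.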
Although the quadratic can be solved explicitly, in most cases Wilson's equations can also be solved numerically using the fixed-point iteration
$$p_{k+1} = \hat p \pm z_\alpha \sqrt{\frac{p_k\,(1 - p_k)}{n}}$$
with $p_0 = \hat p$.
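The fixed-point iteration can be sketched as follows. This is an illustration rather than a robust solver: the function name `wilson_fixed_point`, the tolerance, and the iteration cap are assumptions, and the code deliberately raises an error instead of guessing when an iterate leaves $[0, 1]$ or fails to settle, which can happen for small samples or extreme $\hat p$.

```python
from math import sqrt
from statistics import NormalDist


def wilson_fixed_point(n_success, n, alpha=0.05, tol=1e-12, max_iter=1000):
    """Approximate the Wilson limits by iterating p <- p_hat +/- z*sqrt(p(1-p)/n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = n_success / n
    limits = []
    for sign in (-1.0, 1.0):  # minus sign gives the lower limit, plus the upper
        p = p_hat             # starting value p_0 = p_hat
        for _ in range(max_iter):
            p_next = p_hat + sign * z * sqrt(p * (1 - p) / n)
            if not 0.0 <= p_next <= 1.0:
                raise ArithmeticError("iterate left [0, 1]; use the closed form")
            if abs(p_next - p) < tol:
                break
            p = p_next
        else:
            raise ArithmeticError("iteration did not converge; use the closed form")
        limits.append(p_next)
    return tuple(limits)


print(wilson_fixed_point(30, 100))
```

For 30 successes in 100 trials the iterates settle on the same limits as the explicit solution of the quadratic, approximately 0.219 and 0.396.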
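The Wilson interval can also be derived from the single sample z-test or Pearson's chi-squared test with two categories. The resulting interval,
$$\left\{ \theta \;\middle|\; -z_\alpha \le \frac{\hat p - \theta}{\sqrt{\theta\,(1 - \theta)/n}} \le z_\alpha \right\},$$
can then be solved for $\theta$ to produce the Wilson score interval. The test in the middle of the inequality is a score test.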
The interval equality principle
Because the interval is obtained by solving the normal approximation to the binomial for $p$, the Wilson score interval is guaranteed to give the same result as the equivalent z-test or chi-squared test. This property can be visualised by plotting the probability density function for the Wilson score interval and then plotting a normal probability density function across each bound. The tail areas of the resulting Wilson and normal distributions, which represent the chance of a significant result in that direction, must be equal.
The continuity-corrected Wilson score interval and the Clopper-Pearson interval are also compliant with this property. The practical import is that these intervals may be employed as significance tests, with identical results to the source test, and new tests may be derived by geometry.
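The equality with the z-test can be checked numerically: testing the hypothesis that the true proportion equals either Wilson limit should give a two-sided p-value equal to the error rate. The sketch below assumes hypothetical helper names `wilson_interval` and `score_test_p_value` and covers only the uncorrected Wilson interval.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(n_success, n, alpha=0.05):
    """Closed-form Wilson score interval."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = n_success / n
    scale = 1 / (1 + z * z / n)
    center = scale * (p_hat + z * z / (2 * n))
    half_width = scale * (z / (2 * n)) * sqrt(4 * n * p_hat * (1 - p_hat) + z * z)
    return center - half_width, center + half_width


def score_test_p_value(n_success, n, theta):
    """Two-sided p-value of the single-sample z-test of H0: p = theta."""
    p_hat = n_success / n
    # The standard error uses the hypothesized theta, which makes this a score test.
    z_stat = (p_hat - theta) / sqrt(theta * (1 - theta) / n)
    return 2 * (1 - NormalDist().cdf(abs(z_stat)))


lower, upper = wilson_interval(30, 100, alpha=0.05)
print(score_test_p_value(30, 100, lower))  # approximately 0.05
print(score_test_p_value(30, 100, upper))  # approximately 0.05
```

Values of $\theta$ strictly inside the interval give p-values above the error rate, and values outside give p-values below it, which is what allows the interval to be used as a significance test.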
Wilson score interval with continuity correction
The Wilson interval may be modified by employing a continuity correction, in order to align the minimum coverage probability, rather than the average coverage probability, with the nominal value, $1 - \alpha$. Just as the Wilson interval mirrors Pearson's chi-squared test, the Wilson interval with continuity correction mirrors the equivalent Yates' chi-squared test.
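The following formulae for the lower and upper bounds of the Wilson score interval with continuity correction, $(w_{cc}^-, w_{cc}^+)$, are derived from Newcombe:
$$w_{cc}^- = \max\left\{ 0,\; \frac{2 n \hat p + z_\alpha^2 - \left[ z_\alpha \sqrt{z_\alpha^2 - \frac{1}{n} + 4 n \hat p\,(1 - \hat p) + (4 \hat p - 2)} + 1 \right]}{2\,(n + z_\alpha^2)} \right\},$$
$$w_{cc}^+ = \min\left\{ 1,\; \frac{2 n \hat p + z_\alpha^2 + \left[ z_\alpha \sqrt{z_\alpha^2 - \frac{1}{n} + 4 n \hat p\,(1 - \hat p) - (4 \hat p - 2)} + 1 \right]}{2\,(n + z_\alpha^2)} \right\},$$
for $\hat p \ne 0$ and $\hat p \ne 1$.
If $\hat p = 0$, then $w_{cc}^-$ must instead be set to 0; if $\hat p = 1$, then $w_{cc}^+$ must instead be set to 1.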
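The continuity-corrected bounds, including the special cases at $\hat p = 0$ and $\hat p = 1$, can be sketched as follows. The function name `wilson_cc_interval` is hypothetical, and the expressions follow the form of the continuity-corrected score limits given by Newcombe, with the lower limit clipped at 0 and the upper limit at 1.

```python
from math import sqrt
from statistics import NormalDist


def wilson_cc_interval(n_success, n, alpha=0.05):
    """Wilson score interval with continuity correction (Newcombe's form)."""
    if n <= 0 or not 0 <= n_success <= n:
        raise ValueError("need n > 0 and 0 <= n_success <= n")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = n_success / n
    denom = 2 * (n + z * z)
    base = z * z - 1 / n + 4 * n * p_hat * (1 - p_hat)
    shift = 4 * p_hat - 2

    if n_success == 0:
        lower = 0.0  # special case: p_hat = 0
    else:
        lower = max(0.0, (2 * n * p_hat + z * z - (z * sqrt(base + shift) + 1)) / denom)

    if n_success == n:
        upper = 1.0  # special case: p_hat = 1
    else:
        upper = min(1.0, (2 * n * p_hat + z * z + (z * sqrt(base - shift) + 1)) / denom)

    return lower, upper


print(wilson_cc_interval(30, 100))
print(wilson_cc_interval(0, 10))
```

For 30 successes in 100 trials the corrected interval is slightly wider on both sides than the uncorrected Wilson interval, as expected of a correction aimed at the minimum coverage.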
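Wallis identifies a simpler method for computing continuity-corrected Wilson intervals that employs a special function based on Wilson's lower-bound formula. In Wallis' notation, for the lower bound, let
$$\mathrm{Wilson_{lower}}\left(\hat p, n, \tfrac{\alpha}{2}\right) \equiv w^- = \frac{1}{1 + z_\alpha^2/n} \left( \hat p + \frac{z_\alpha^2}{2n} - \frac{z_\alpha}{2n} \sqrt{4 n \hat p\,(1 - \hat p) + z_\alpha^2} \right),$$
where $\alpha$ is the selected tolerable error level for $z_\alpha$. Then
$$w_{cc}^- = \mathrm{Wilson_{lower}}\left( \max\left\{ \hat p - \frac{1}{2n},\; 0 \right\}, n, \tfrac{\alpha}{2} \right).$$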
This method has the advantage of being further decomposable.
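The decomposition can be sketched by applying the ordinary Wilson bound to a proportion shifted by $1/(2n)$. The names `wilson_lower`, `wilson_upper` and `wilson_cc_shifted` are hypothetical, and the functions take the two-sided error level $\alpha$ rather than the $\alpha/2$ argument of Wallis' notation. The upper bound is written here as the mirror image of the lower one, which is an assumption of this sketch rather than a formula quoted from the text.

```python
from math import sqrt
from statistics import NormalDist


def wilson_lower(p, n, alpha=0.05):
    """Lower Wilson score limit for an observed proportion p."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    scale = 1 / (1 + z * z / n)
    return scale * (p + z * z / (2 * n) - (z / (2 * n)) * sqrt(4 * n * p * (1 - p) + z * z))


def wilson_upper(p, n, alpha=0.05):
    """Upper Wilson score limit, obtained by mirroring the lower limit."""
    return 1 - wilson_lower(1 - p, n, alpha)


def wilson_cc_shifted(n_success, n, alpha=0.05):
    """Continuity-corrected interval via a 1/(2n) shift of the observed proportion."""
    p_hat = n_success / n
    lower = wilson_lower(max(p_hat - 1 / (2 * n), 0.0), n, alpha)
    upper = wilson_upper(min(p_hat + 1 / (2 * n), 1.0), n, alpha)
    return lower, upper


print(wilson_cc_shifted(30, 100))
```

Because the correction is isolated in the shift, the same building blocks can be reused with other values of $n$, $\alpha$ or other adjustments to the observed proportion.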