Standard deviation
In statistics, the standard deviation is a measure of the amount of variation of the values of a variable about its mean. A low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range. Standard deviation may be abbreviated SD or std dev, and is most commonly represented in mathematical texts and equations by the lowercase Greek letter σ, for the population standard deviation, or the Latin letter s, for the sample standard deviation.
The standard deviation of a random variable, sample, statistical population, data set, or probability distribution is the square root of its variance. A useful property of the standard deviation is that, unlike the variance, it is expressed in the same unit as the data. Standard deviation can also be used to calculate standard error for a finite sample, and to determine statistical significance.
When only a sample of data from a population is available, the term standard deviation of the sample or sample standard deviation can refer to either the above-mentioned quantity as applied to those data, or to a modified quantity that is an unbiased estimate of the population standard deviation.
Relationship with standard error and statistical significance
The standard deviation of a population or sample and the standard error of a statistic (for example, of the sample mean) are different but related quantities. The sample mean's standard error is the standard deviation of the set of means that would be found by drawing an infinite number of repeated samples from the population and computing a mean for each sample. The mean's standard error equals the population standard deviation divided by the square root of the sample size, and is estimated by the sample standard deviation divided by the square root of the sample size. For example, a poll's standard error is the expected standard deviation of the estimated mean if the same poll were to be conducted multiple times. Thus, the standard error estimates the standard deviation of an estimate, which itself measures how much the estimate depends on the particular sample that was taken from the population.
In science, it is common to report both the standard deviation of the data and the standard error of the estimate. By convention, only effects more than two standard errors away from a null expectation are considered "statistically significant", a safeguard against spurious conclusions that are really due to random sampling error.
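As a minimal sketch of the relationship described above, the following Python snippet estimates the standard error of a sample mean as the sample standard deviation divided by the square root of the sample size. The function name `standard_error_of_mean`, the measurements, and the null value of 4.0 are hypothetical and chosen only for illustration.

```python
import math
import statistics


def standard_error_of_mean(sample):
    """Estimate the standard error of the sample mean as s / sqrt(n)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    s = statistics.stdev(sample)  # corrected sample standard deviation (n - 1)
    return s / math.sqrt(n)


# Hypothetical measurements, used only for illustration.
sample = [4.1, 3.9, 4.3, 4.0, 4.2, 3.8, 4.4, 4.1]
mean = statistics.mean(sample)
se = standard_error_of_mean(sample)

# Distance of the sample mean from an assumed null expectation, in standard errors.
null_value = 4.0
distance = (mean - null_value) / se
print(f"mean={mean:.3f}, SE={se:.3f}, distance from null={distance:.2f} SE")
```

Under the two-standard-error convention, a distance whose absolute value exceeds 2 would be reported as statistically significant.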
Basic examples
Population standard deviation of grades of eight students
Suppose that the entire population of interest is eight students in a particular class. Their marks are the following eight values:
$$2,\ 4,\ 4,\ 4,\ 5,\ 5,\ 7,\ 9.$$
For a finite set of numbers, the population standard deviation is found by taking the square root of the average of the squared deviations of the values from their average value, that is:
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2}, \qquad \mu = \frac{1}{N}\sum_{i=1}^{N}x_i.$$
These eight data points have the mean of 5:
$$\mu = \frac{2+4+4+4+5+5+7+9}{8} = \frac{40}{8} = 5.$$
First, calculate the deviations of each data point from the mean, and square the result of each:
$$\begin{array}{ll}
(2-5)^2 = (-3)^2 = 9 & (5-5)^2 = 0^2 = 0 \\
(4-5)^2 = (-1)^2 = 1 & (5-5)^2 = 0^2 = 0 \\
(4-5)^2 = (-1)^2 = 1 & (7-5)^2 = 2^2 = 4 \\
(4-5)^2 = (-1)^2 = 1 & (9-5)^2 = 4^2 = 16
\end{array}$$
The variance is the mean of these values:
$$\sigma^2 = \frac{9+1+1+1+0+0+4+16}{8} = \frac{32}{8} = 4,$$
and the population standard deviation is equal to the square root of the variance:
$$\sigma = \sqrt{4} = 2.$$
This formula is valid only if the eight values with which we began form the complete population. If the values instead were a random sample drawn from some large parent population, then one divides by 7 (which is $N-1$) instead of 8 (which is $N$) in the denominator of the variance formula, and the result is $s = \sqrt{32/7} \approx 2.14$. In that case, the result would be called the sample standard deviation and denoted by $s$ instead of $\sigma$. Dividing by $N-1$ rather than by $N$ gives an unbiased estimate of the variance of the larger parent population. This is known as Bessel's correction. Roughly, the reason for it is that the formula for the sample variance relies on computing differences of observations from the sample mean, and the sample mean itself was constructed to be as close as possible to the observations, so just dividing by $N$ would underestimate the variability.
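The arithmetic of this example can be sketched in a few lines of Python. The marks below are illustrative values (assumed here: eight marks with a mean of 5), and the manual results are compared with `statistics.pstdev` and `statistics.stdev` from the standard library.

```python
import math
import statistics

# Illustrative marks for the eight students (assumed values with mean 5).
marks = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(marks)
mean = sum(marks) / n
squared_deviations = [(x - mean) ** 2 for x in marks]

# Whole population: divide the sum of squared deviations by n.
population_sd = math.sqrt(sum(squared_deviations) / n)
# Random sample from a larger population: divide by n - 1 (Bessel's correction).
sample_sd = math.sqrt(sum(squared_deviations) / (n - 1))

print(population_sd)        # 2.0
print(round(sample_sd, 3))  # 2.138

# The standard library agrees with the manual computation.
assert math.isclose(population_sd, statistics.pstdev(marks))
assert math.isclose(sample_sd, statistics.stdev(marks))
```

The only difference between the two results is the divisor, which is why the gap between them shrinks as the number of observations grows.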
Standard deviation of average height for adult men
If the population of interest is approximately normally distributed, the standard deviation provides information on the proportion of observations above or below certain values. For example, the average height for adult men in the United States is about 69 inches, with a standard deviation of around 3 inches. This means that most men (about 68%, assuming a normal distribution) have a height within 3 inches of the mean (66 to 72 inches), that is, within one standard deviation, and almost all men (about 95%) have a height within 6 inches of the mean (63 to 75 inches), that is, within two standard deviations. If the standard deviation were zero, then all men would share an identical height of 69 inches. Three standard deviations account for 99.73% of the sample population being studied, assuming the distribution is normal or bell-shaped.
Definition of population values
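Let $\mu$ be the expected value (the average) of random variable $X$ with density $f(x)$:
$$\mu \equiv \operatorname{E}[X] = \int_{-\infty}^{+\infty} x f(x)\,dx.$$
The standard deviation $\sigma$ of $X$ is defined as
$$\sigma \equiv \sqrt{\operatorname{E}\left[(X-\mu)^2\right]} = \sqrt{\int_{-\infty}^{+\infty}(x-\mu)^2 f(x)\,dx},$$
which can be shown to equal
$$\sqrt{\operatorname{E}[X^2]-(\operatorname{E}[X])^2}.$$
In words, the standard deviation is the square root of the variance of $X$.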
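The equivalence of the two forms, $\sqrt{\operatorname{E}[(X-\mu)^2]}$ and $\sqrt{\operatorname{E}[X^2]-(\operatorname{E}[X])^2}$, follows from expanding the square and using the linearity of expectation. A short derivation, writing $\mu = \operatorname{E}[X]$ and assuming both expectations exist:

```latex
\begin{aligned}
\operatorname{E}\left[(X-\mu)^2\right]
  &= \operatorname{E}\left[X^2 - 2\mu X + \mu^2\right] \\
  &= \operatorname{E}[X^2] - 2\mu\operatorname{E}[X] + \mu^2 \\
  &= \operatorname{E}[X^2] - 2\mu^2 + \mu^2 \\
  &= \operatorname{E}[X^2] - (\operatorname{E}[X])^2 .
\end{aligned}
```

Taking the square root of both sides gives the second expression for the standard deviation.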
The standard deviation of a probability distribution is the same as that of a random variable having that distribution.
Not all random variables have a standard deviation. If the distribution has fat tails going out to infinity, the standard deviation might not exist, because the integral might not converge. The normal distribution has tails going out to infinity, but its mean and standard deviation do exist, because the tails diminish quickly enough. The Pareto distribution with parameter $\alpha \in (1, 2]$ has a mean, but not a standard deviation. The Cauchy distribution has neither a mean nor a standard deviation.
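The Pareto case can be made concrete with a short calculation. This is a sketch that assumes the usual parameterization with scale $x_m > 0$ and shape $\alpha$, that is, density $f(x) = \alpha x_m^{\alpha} / x^{\alpha+1}$ for $x \ge x_m$. The symbol $x_m$ is introduced here for illustration.

```latex
\begin{aligned}
\operatorname{E}[X]
  &= \int_{x_m}^{\infty} x\,\frac{\alpha x_m^{\alpha}}{x^{\alpha+1}}\,dx
   = \frac{\alpha x_m}{\alpha-1} \qquad (\alpha > 1), \\
\operatorname{E}[X^2]
  &= \int_{x_m}^{\infty} x^2\,\frac{\alpha x_m^{\alpha}}{x^{\alpha+1}}\,dx
   = \alpha x_m^{\alpha}\int_{x_m}^{\infty} x^{1-\alpha}\,dx
   = \begin{cases}
       \dfrac{\alpha x_m^{2}}{\alpha-2}, & \alpha > 2, \\[4pt]
       \infty, & \alpha \le 2 .
     \end{cases}
\end{aligned}
```

For $1 < \alpha \le 2$ the first integral converges while the second diverges, so the mean exists but the variance, and hence the standard deviation, does not.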
Discrete random variable
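In the case where $X$ takes random values from a finite data set $x_1, x_2, \ldots, x_N$, with each value having the same probability, the standard deviation is
$$\sigma = \sqrt{\frac{1}{N}\left[(x_1-\mu)^2 + (x_2-\mu)^2 + \cdots + (x_N-\mu)^2\right]}, \qquad \mu = \frac{1}{N}(x_1 + \cdots + x_N),$$
or, by using summation notation,
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2}, \qquad \mu = \frac{1}{N}\sum_{i=1}^{N}x_i.$$
Note: when the data set is a sample used to estimate the standard deviation of a larger population, the above expression has a built-in bias. See the discussion on Bessel's correction further down below.
If, instead of having equal probabilities, the values have different probabilities, let $x_1$ have probability $p_1$, $x_2$ have probability $p_2$, ..., $x_N$ have probability $p_N$. In this case, the standard deviation will be
$$\sigma = \sqrt{\sum_{i=1}^{N} p_i (x_i-\mu)^2}, \qquad \mu = \sum_{i=1}^{N} p_i x_i.$$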
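Both discrete cases, equal and unequal probabilities, can be sketched with one small Python function that computes the probability-weighted mean and then the square root of the probability-weighted squared deviations. The function name `discrete_std` and the example distributions are assumptions made for illustration.

```python
import math


def discrete_std(values, probabilities=None):
    """Standard deviation of a discrete random variable.

    If probabilities is None, every value is taken to be equally likely.
    """
    if probabilities is None:
        probabilities = [1 / len(values)] * len(values)
    if len(values) != len(probabilities):
        raise ValueError("values and probabilities must have the same length")
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("probabilities must sum to 1")
    # Probability-weighted mean, then probability-weighted squared deviations.
    mu = sum(p * x for x, p in zip(values, probabilities))
    variance = sum(p * (x - mu) ** 2 for x, p in zip(values, probabilities))
    return math.sqrt(variance)


# Fair six-sided die: equal probabilities, variance 35/12.
print(discrete_std([1, 2, 3, 4, 5, 6]))

# Hypothetical payoff of 1 with probability 0.1 and 0 otherwise:
# the variance is 0.9 * 0.1 = 0.09, so the result is approximately 0.3.
print(discrete_std([0, 1], [0.9, 0.1]))
```

With equal probabilities the weighted formula reduces to the summation formula with the factor 1/N.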
Continuous random variable
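The standard deviation of a continuous real-valued random variable $X$ with probability density function $f(x)$ is
$$\sigma = \sqrt{\int_{\mathcal{X}} (x-\mu)^2 f(x)\,dx}, \qquad \mu = \int_{\mathcal{X}} x f(x)\,dx,$$
where the integrals are definite integrals taken for $x$ ranging over $\mathcal{X}$, which represents the set of possible values of the random variable $X$.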
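When the integrals have no convenient closed form, they can be approximated numerically. The sketch below uses a plain midpoint rule on a bounded interval. The function name `continuous_std`, the step count, and the uniform density are assumptions for illustration, and this is not a robust general-purpose integrator.

```python
import math


def continuous_std(pdf, lower, upper, steps=10_000):
    """Approximate the standard deviation of a density supported on
    [lower, upper] using the midpoint rule."""
    width = (upper - lower) / steps
    midpoints = [lower + (i + 0.5) * width for i in range(steps)]
    # First integral: the mean. Second integral: the variance about that mean.
    mu = sum(x * pdf(x) for x in midpoints) * width
    variance = sum((x - mu) ** 2 * pdf(x) for x in midpoints) * width
    return math.sqrt(variance)


def uniform_pdf(x):
    """Density of the uniform distribution on [0, 1]."""
    return 1.0


# For the uniform density on [0, 1] the exact value is 1 / sqrt(12).
print(continuous_std(uniform_pdf, 0.0, 1.0))
print(1 / math.sqrt(12))
```

The two printed values should agree to several decimal places; a density with unbounded support would need truncation or a change of variables first.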
In the case of a parametric family of distributions, the standard deviation can often be expressed in terms of the parameters for the underlying distribution. For example, in the case of the log-normal distribution with parameters $\mu$ and $\sigma^2$ for the underlying normal distribution, the standard deviation of the log-normal variable $X$ is given by the expression
$$\sqrt{\left(e^{\sigma^2}-1\right)e^{2\mu+\sigma^2}}.$$
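As a quick illustration of a parametric formula, the log-normal expression can be evaluated directly. In this sketch the hypothetical function `lognormal_std` takes the mean `mu` and the standard deviation `sigma` (not the variance) of the underlying normal distribution.

```python
import math


def lognormal_std(mu, sigma):
    """Standard deviation of exp(Z), where Z is normal with mean mu and
    standard deviation sigma (parameters of the underlying normal)."""
    return math.sqrt((math.exp(sigma ** 2) - 1) * math.exp(2 * mu + sigma ** 2))


# With mu = 0 and sigma = 1 the expression reduces to sqrt(e^2 - e), about 2.161.
print(lognormal_std(0.0, 1.0))
```

Note that the result depends on both parameters, even though `sigma` alone is the standard deviation of the underlying normal variable.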
Estimation
One can find the standard deviation of an entire population in cases where every member of a population is sampled. In cases where that cannot be done, the standard deviation σ is estimated by examining a random sample taken from the population and computing a statistic of the sample, which is used as an estimate of the population standard deviation. Such a statistic is called an estimator, and the estimator is called a sample standard deviation, and is denoted by s.
Unlike in the case of estimating the population mean of a normal distribution, for which the sample mean is a simple estimator with many desirable properties (unbiased, efficient, maximum likelihood), there is no single estimator for the standard deviation with all these properties, and unbiased estimation of standard deviation is a very technically involved problem. Most often, the standard deviation is estimated using the corrected sample standard deviation (using N − 1), defined below, and this is often referred to as the "sample standard deviation", without qualifiers. However, other estimators are better in other respects: the uncorrected estimator (using N) yields lower mean squared error, while using N − 1.5 (for the normal distribution) almost completely eliminates bias.
Uncorrected sample standard deviation
The formula for the population standard deviation (of a finite population) can be applied to the sample, using the size of the sample as the size of the population. This estimator, denoted by $s_N$, is known as the uncorrected sample standard deviation, or sometimes the standard deviation of the sample (considered as the entire population), and is defined as follows:
$$s_N = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i-\bar{x})^2},$$
where $x_1, x_2, \ldots, x_N$ are the observed values of the sample items, and $\bar{x}$ is the mean value of these observations, while the denominator $N$ stands for the size of the sample. This is the square root of the uncorrected sample variance, which is the average of the squared deviations about the sample mean.
This is a consistent estimator (it converges in probability to the population value as the sample size goes to infinity), and is the maximum-likelihood estimate when the population is normally distributed. However, this is a biased estimator, as the estimates are generally too low. The bias decreases as sample size grows, dropping off as 1/N, and thus is most significant for small or moderate sample sizes; for N > 75 the bias is below 1%. Thus for very large sample sizes, the uncorrected sample standard deviation is generally acceptable. This estimator also has a uniformly smaller mean squared error than the corrected sample standard deviation.
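The size of the bias can be sketched for one specific model. Assuming N independent observations from a normal distribution, the expected value of the uncorrected estimator divided by the true σ equals sqrt(2/N) · Γ(N/2) / Γ((N − 1)/2), which follows from the chi distribution of the scaled estimator. The function name below is hypothetical, and the 3/(4N) column is a first-order approximation shown only for comparison.

```python
import math


def expected_uncorrected_sd_ratio(n):
    """E[s_N] / sigma for n independent normal observations (assumed model).

    Evaluates sqrt(2/n) * Gamma(n/2) / Gamma((n-1)/2) through log-gamma,
    which stays numerically stable for large n.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2 / n) * math.exp(log_ratio)


for n in (5, 10, 30, 76, 1000):
    bias = 1 - expected_uncorrected_sd_ratio(n)
    print(f"N={n:5d}  relative bias={bias:.4%}  3/(4N)={3 / (4 * n):.4%}")
```

Under this normality assumption the relative bias shrinks roughly in proportion to 1/N; other distributions would give different values.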
Corrected sample standard deviation
If the biased sample variance (the second central moment of the sample, which is a downward-biased estimate of the population variance) is used to compute an estimate of the population's standard deviation, the result is
$$s_N = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i-\bar{x})^2}.$$
Here taking the square root introduces further downward bias, by Jensen's inequality, because the square root is a concave function. The bias in the variance is easily corrected, but the bias from the square root is more difficult to correct, and depends on the distribution in question.
An unbiased estimator for the variance is given by applying Bessel's correction, using N − 1 instead of N to yield the unbiased sample variance, denoted $s^2$:
$$s^2 = \frac{1}{N-1}\sum_{i=1}^{N}(x_i-\bar{x})^2.$$
This estimator is unbiased if the variance exists and the sample values are drawn independently with replacement. N − 1 corresponds to the number of degrees of freedom in the vector of deviations from the mean, $(x_1-\bar{x}, \ldots, x_N-\bar{x})$.
Taking square roots reintroduces bias (because the square root is a nonlinear function), yielding the corrected sample standard deviation, denoted by $s$:
$$s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i-\bar{x})^2}.$$
As explained above, while $s^2$ is an unbiased estimator for the population variance, $s$ is still a biased estimator for the population standard deviation, though markedly less biased than the uncorrected sample standard deviation. This estimator is commonly used and generally known simply as the "sample standard deviation". The bias may still be large for small samples. As sample size increases, the amount of bias decreases: more information is obtained, and the difference between $\frac{1}{N}$ and $\frac{1}{N-1}$ becomes smaller.
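The two estimators differ only in the divisor, which a short Python sketch makes explicit. The function names and the data values are illustrative assumptions. The snippet also shows that the deviations from the sample mean sum to zero, the constraint behind the N − 1 degrees of freedom.

```python
import math
import statistics


def uncorrected_sd(sample):
    """s_N: divide the sum of squared deviations by N."""
    mean = sum(sample) / len(sample)
    return math.sqrt(sum((x - mean) ** 2 for x in sample) / len(sample))


def corrected_sd(sample):
    """s: divide the sum of squared deviations by N - 1 (Bessel's correction)."""
    mean = sum(sample) / len(sample)
    return math.sqrt(sum((x - mean) ** 2 for x in sample) / (len(sample) - 1))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values

# The deviations from the sample mean always sum to zero, so only N - 1
# of them are free to vary.
deviations = [x - statistics.mean(data) for x in data]
print(sum(deviations))

print(uncorrected_sd(data), corrected_sd(data))

# For any sample, s_N / s = sqrt((N - 1) / N), which approaches 1 as N grows.
for n in (2, 10, 100, 10_000):
    print(n, math.sqrt((n - 1) / n))
```

The ratio in the final loop depends only on the sample size, not on the data, so the choice between the estimators matters mainly for small samples.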