Moment (mathematics)


Moments of a function in mathematics are certain quantitative measures related to the shape of the function's graph. For example, if the function represents mass density, then the zeroth moment is the total mass, the first moment (normalized by total mass) is the center of mass, and the second moment is the moment of inertia. If the function is a probability distribution, then the first moment is the expected value, the second central moment is the variance, the third standardized moment is the skewness, and the fourth standardized moment is the kurtosis.
For a distribution of mass or probability on a bounded interval, the collection of all the moments uniquely determines the distribution. The same is not true on unbounded intervals.
In the mid-nineteenth century, Pafnuty Chebyshev became the first person to think systematically in terms of the moments of random variables.

Significance of the moments

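The $n$-th raw moment (that is, the moment about zero) of a random variable $X$ with density function $f(x)$ is defined by
$$\mu'_n = \operatorname{E}[X^n] = \int_{-\infty}^{\infty} x^n f(x)\,dx.$$
The $n$-th moment of a real-valued continuous random variable with density function $f(x)$ about a value $c$ is the integral
$$\mu_n = \int_{-\infty}^{\infty} (x - c)^n f(x)\,dx.$$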
It is possible to define moments for random variables in a more general fashion than moments for real-valued functions; see the section on central moments in metric spaces. The moment of a function, without further explanation, usually refers to the above expression with $c = 0$.
For the second and higher moments, the central moments (moments about the mean, with $c$ being the mean) are usually used rather than the moments about zero, because they provide clearer information about the distribution's shape.
Other moments may also be defined. For example, the $n$-th inverse moment about zero is $\operatorname{E}[X^{-n}]$ and the $n$-th logarithmic moment about zero is $\operatorname{E}[\ln^n(X)]$.
The $n$-th moment about zero of a probability density function $f(x)$ is the expected value of $X^n$ and is called a raw moment or crude moment. The moments about its mean $\mu$ are called central moments; these describe the shape of the function, independently of translation.
If $f$ is a probability density function, then the value of the integral above is called the $n$-th moment of the probability distribution. More generally, if $F$ is a cumulative probability distribution function of any probability distribution, which may not have a density function, then the $n$-th moment of the probability distribution is given by the Riemann–Stieltjes integral
$$\mu'_n = \operatorname{E}[X^n] = \int_{-\infty}^{\infty} x^n\,dF(x),$$
where $X$ is a random variable that has this cumulative distribution $F$, and $\operatorname{E}$ is the expectation operator or mean.
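When
$$\operatorname{E}[|X^n|] = \int_{-\infty}^{\infty} |x^n|\,dF(x) = \infty,$$
the moment is said not to exist. If the $n$-th moment about any point exists, so does the $(n-1)$-th moment (and thus all lower-order moments) about every point. The zeroth moment of any probability density function is 1, since the area under any probability density function must be equal to one.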
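As a minimal sketch of these definitions, the following Python snippet computes moments of a discrete distribution, for which the integral reduces to a finite sum. The function name `moment` and the fair-die example are illustrative assumptions, not part of the definitions above.

```python
from fractions import Fraction

def moment(values, probs, n, c=0):
    """n-th moment about c of a discrete distribution: sum of (x - c)^n * p(x)."""
    return sum((Fraction(x) - c) ** n * p for x, p in zip(values, probs))

# Hypothetical example: a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

zeroth = moment(values, probs, 0)          # total probability
mean = moment(values, probs, 1)            # first raw moment
second_raw = moment(values, probs, 2)      # second raw moment (about zero)
variance = moment(values, probs, 2, mean)  # second central moment (about the mean)
```

Exact rational arithmetic is used here only so that the results can be compared without rounding; the zeroth moment comes out as 1, as expected for a probability distribution.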

Standardized moments

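The normalised $n$-th central moment or standardised moment is the $n$-th central moment divided by $\sigma^n$; the normalised $n$-th central moment of the random variable $X$ is
$$\frac{\mu_n}{\sigma^n} = \frac{\operatorname{E}[(X - \mu)^n]}{\sigma^n} = \frac{\operatorname{E}[(X - \mu)^n]}{\operatorname{E}[(X - \mu)^2]^{n/2}}.$$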
These normalised central moments are dimensionless quantities, which represent the distribution independently of any linear change of scale.
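The scale independence can be checked numerically. The sketch below, with hypothetical helper names and an arbitrary three-point distribution chosen for illustration, compares the third standardised moment before and after a linear change of variable with a positive scale factor.

```python
import math

def central_moment(values, probs, n):
    """n-th central moment of a discrete distribution."""
    mean = sum(x * p for x, p in zip(values, probs))
    return sum((x - mean) ** n * p for x, p in zip(values, probs))

def standardized_moment(values, probs, n):
    """n-th central moment divided by sigma^n."""
    sigma = math.sqrt(central_moment(values, probs, 2))
    return central_moment(values, probs, n) / sigma ** n

# Arbitrary illustrative distribution and a rescaled copy (x -> 3x + 10).
values = [0.0, 1.0, 4.0]
probs = [0.5, 0.3, 0.2]
rescaled = [3.0 * x + 10.0 for x in values]

original = standardized_moment(values, probs, 3)
transformed = standardized_moment(rescaled, probs, 3)
```

Up to floating-point rounding, `original` and `transformed` agree. A negative scale factor would flip the sign of the odd-order standardised moments, which is why the example assumes a positive one.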

Notable moments

Mean

The first raw moment is the mean, usually denoted $\mu \equiv \operatorname{E}[X]$.

Variance

The second central moment is the variance. The positive square root of the variance is the standard deviation $\sigma \equiv \left(\operatorname{E}[(X - \mu)^2]\right)^{1/2}$.

Skewness

The third central moment is a measure of the lopsidedness of the distribution; any symmetric distribution will have a third central moment, if defined, of zero. The normalised third central moment is called the skewness, often denoted $\gamma$. A distribution that is skewed to the left will have a negative skewness. A distribution that is skewed to the right will have a positive skewness.
For distributions that are not too different from the normal distribution, the median will be somewhere near $\mu - \gamma\sigma/6$; the mode about $\mu - \gamma\sigma/2$.

Kurtosis

The fourth central moment is a measure of the heaviness of the tail of the distribution. Since it is the expectation of a fourth power, the fourth central moment, where defined, is always nonnegative; and except for a point distribution, it is always strictly positive. The fourth central moment of a normal distribution is $3\sigma^4$.
The kurtosis $\kappa$ is defined to be the standardized fourth central moment. If a distribution has heavy tails, the kurtosis will be high; conversely, light-tailed distributions have low kurtosis.
The kurtosis $\kappa$ can be positive without limit, but must be greater than or equal to $\gamma^2 + 1$, where $\gamma$ is the skewness; equality only holds for binary distributions. For unbounded skew distributions not too far from normal, $\kappa$ tends to be somewhere in the area of $\gamma^2$ and $2\gamma^2$.
The inequality can be proven by considering
$$\operatorname{E}\left[(T^2 - aT - 1)^2\right],$$
where $T = (X - \mu)/\sigma$. This is the expectation of a square, so it is non-negative for all $a$; however it is also a quadratic polynomial in $a$. Its discriminant must be non-positive, which gives the required relationship.
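The bound relating kurtosis to squared skewness can be illustrated with exact arithmetic. In the sketch below, both quantities are computed from central moments as $\gamma^2 = \mu_3^2/\mu_2^3$ and $\kappa = \mu_4/\mu_2^2$, which avoids square roots. The helper names and the two example distributions are assumptions made for illustration.

```python
from fractions import Fraction

def central_moment(values, probs, n):
    """n-th central moment of a discrete distribution, in exact arithmetic."""
    mean = sum(Fraction(x) * p for x, p in zip(values, probs))
    return sum((Fraction(x) - mean) ** n * p for x, p in zip(values, probs))

def skewness_squared_and_kurtosis(values, probs):
    """Return (gamma^2, kappa) built from the 2nd, 3rd and 4th central moments."""
    m2 = central_moment(values, probs, 2)
    m3 = central_moment(values, probs, 3)
    m4 = central_moment(values, probs, 4)
    return m3 ** 2 / m2 ** 3, m4 / m2 ** 2

# A binary (two-point) distribution: the bound should hold with equality.
binary = skewness_squared_and_kurtosis([0, 1], [Fraction(3, 4), Fraction(1, 4)])

# A three-point distribution: the bound should hold strictly.
three_point = skewness_squared_and_kurtosis(
    [0, 1, 5], [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
)
```

For the binary case the kurtosis equals the squared skewness plus one exactly; for the three-point case it is strictly larger.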

Higher moments

High-order moments are moments beyond 4th-order moments.
As with variance, skewness, and kurtosis, these are higher-order statistics, involving non-linear combinations of the data, and can be used for description or estimation of further shape parameters. The higher the moment, the harder it is to estimate, in the sense that larger samples are required in order to obtain estimates of similar quality. This is due to the excess degrees of freedom consumed by the higher orders. Further, they can be subtle to interpret, often being most easily understood in terms of lower order moments – compare the higher-order derivatives of jerk and jounce in physics. For example, just as the 4th-order moment can be interpreted as "relative importance of tails as compared to shoulders in contribution to dispersion", the 5th-order moment can be interpreted as measuring "relative importance of tails as compared to center in contribution to skewness".

Mixed moments

Mixed moments are moments involving multiple variables.
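The value $\operatorname{E}[X^k]$ is called the moment of order $k$. The moments of the joint distribution of random variables $X_1, \ldots, X_n$ are defined similarly. For any integers $k_i \geq 0$, the mathematical expectation $\operatorname{E}[X_1^{k_1} \cdots X_n^{k_n}]$ is called a mixed moment of order $k$ (where $k = k_1 + \cdots + k_n$), and $\operatorname{E}[(X_1 - \operatorname{E}[X_1])^{k_1} \cdots (X_n - \operatorname{E}[X_n])^{k_n}]$ is called a central mixed moment of order $k$. The mixed moment $\operatorname{E}[(X_1 - \operatorname{E}[X_1])(X_2 - \operatorname{E}[X_2])]$ is called the covariance and is one of the basic characteristics of dependency between random variables.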
Some examples are covariance, coskewness and cokurtosis. While there is a unique covariance, there are multiple co-skewnesses and co-kurtoses.
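A small sketch of mixed moments for two discrete random variables follows. The joint distribution is represented as a dictionary from `(x, y)` pairs to probabilities; that representation, the function name `mixed_moment`, and the example values are all assumptions made for illustration.

```python
from fractions import Fraction

def mixed_moment(joint, k1, k2, central=False):
    """E[X^k1 * Y^k2], or its central version, for a joint pmf {(x, y): prob}."""
    # Subtract the marginal means only when a central mixed moment is requested.
    mx = sum(x * p for (x, _), p in joint.items()) if central else 0
    my = sum(y * p for (_, y), p in joint.items()) if central else 0
    return sum((x - mx) ** k1 * (y - my) ** k2 * p for (x, y), p in joint.items())

# Hypothetical joint distribution of two dependent 0/1 variables.
joint = {
    (0, 0): Fraction(1, 2),
    (1, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
}

raw_xy = mixed_moment(joint, 1, 1)                    # E[XY]
covariance = mixed_moment(joint, 1, 1, central=True)  # central mixed moment of order 2
coskewness_xxy = mixed_moment(joint, 2, 1, central=True)  # one of several order-3 central mixed moments
```

The exponent pairs (2, 1) and (1, 2) give different order-3 central mixed moments, which is one way to see why there are several co-skewnesses but only one covariance.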

Properties of moments

Transformation of center

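Since
$$(x - b)^n = (x - a + a - b)^n = \sum_{i=0}^{n} \binom{n}{i} (x - a)^i (a - b)^{n-i},$$
where $\binom{n}{i}$ is the binomial coefficient, it follows that the moments about $b$ can be calculated from the moments about $a$ by:
$$\operatorname{E}[(x - b)^n] = \sum_{i=0}^{n} \binom{n}{i} \operatorname{E}[(x - a)^i] (a - b)^{n-i}.$$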
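The change-of-center relation can be sketched in code as follows. The helper names `moment_about` and `shift_center` are hypothetical, and the fair-die distribution is just a convenient test case: the third moment about the mean is obtained from the raw moments (moments about $a = 0$) and compared with a direct computation.

```python
from fractions import Fraction
from math import comb

def moment_about(values, probs, n, c):
    """n-th moment about c of a discrete distribution."""
    return sum((Fraction(x) - c) ** n * p for x, p in zip(values, probs))

def shift_center(moments_about_a, a, b):
    """Given [E[(x-a)^0], ..., E[(x-a)^n]], return E[(x-b)^n] via the binomial expansion."""
    n = len(moments_about_a) - 1
    return sum(
        comb(n, i) * moments_about_a[i] * (a - b) ** (n - i) for i in range(n + 1)
    )

values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

a, b = Fraction(0), Fraction(7, 2)  # from the origin to the mean of the die
about_a = [moment_about(values, probs, i, a) for i in range(4)]

shifted = shift_center(about_a, a, b)       # third moment about b, from moments about a
direct = moment_about(values, probs, 3, b)  # third moment about b, computed directly
```

Both routes give the same value; here it is zero because the distribution is symmetric about its mean.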

Moment of a convolution of functions

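The raw moment of a convolution $h(t) = (f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau$ reads
$$\mu_n[h] = \sum_{i=0}^{n} \binom{n}{i} \mu_i[f]\, \mu_{n-i}[g],$$
where $\mu_n[\,\cdot\,]$ denotes the $n$-th moment of the function given in the brackets. This identity follows from the convolution theorem for moment generating functions and the general Leibniz rule for differentiating a product.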
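The identity has a direct discrete analogue, in which the convolution integral becomes a sum over probability mass functions. The sketch below checks it for two small, arbitrarily chosen mass functions; the dictionary representation and the helper names are assumptions for illustration.

```python
from fractions import Fraction
from math import comb

def raw_moment(pmf, n):
    """n-th raw moment of a pmf given as {value: probability}."""
    return sum(Fraction(x) ** n * p for x, p in pmf.items())

def convolve(f, g):
    """Discrete convolution: h(t) = sum over tau of f(tau) * g(t - tau)."""
    h = {}
    for x, p in f.items():
        for y, q in g.items():
            h[x + y] = h.get(x + y, Fraction(0)) + p * q
    return h

f = {0: Fraction(1, 2), 1: Fraction(1, 2)}
g = {1: Fraction(1, 3), 2: Fraction(1, 3), 4: Fraction(1, 3)}
h = convolve(f, g)

n = 3
lhs = raw_moment(h, n)
rhs = sum(comb(n, i) * raw_moment(f, i) * raw_moment(g, n - i) for i in range(n + 1))
```

Here `lhs` is the third raw moment computed from the convolved mass function, and `rhs` is the binomial sum over moments of the two factors; the two agree exactly.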

Cumulants

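The first raw moment and the second and third unnormalized central moments are additive in the sense that if $X$ and $Y$ are independent random variables then
$$\begin{aligned}
m_1(X + Y) &= m_1(X) + m_1(Y), \\
\operatorname{Var}(X + Y) &= \operatorname{Var}(X) + \operatorname{Var}(Y), \\
\mu_3(X + Y) &= \mu_3(X) + \mu_3(Y).
\end{aligned}$$
These are the first three cumulants and all cumulants share this additivity property.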
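The additivity can be illustrated for two independent discrete variables, whose sum has the convolution of their mass functions as its distribution. The sketch below also shows that the fourth central moment is not additive, while the fourth cumulant, $\mu_4 - 3\mu_2^2$, is. The helper names and example distributions are assumptions for illustration.

```python
from fractions import Fraction

def convolve(f, g):
    """pmf of X + Y for independent X ~ f and Y ~ g (pmfs as {value: prob})."""
    h = {}
    for x, p in f.items():
        for y, q in g.items():
            h[x + y] = h.get(x + y, Fraction(0)) + p * q
    return h

def central_moment(pmf, n):
    """n-th central moment of a pmf."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** n * p for x, p in pmf.items())

def fourth_cumulant(pmf):
    """Fourth cumulant expressed through central moments: mu_4 - 3 * mu_2^2."""
    return central_moment(pmf, 4) - 3 * central_moment(pmf, 2) ** 2

x = {0: Fraction(3, 4), 1: Fraction(1, 4)}
y = {1: Fraction(1, 3), 2: Fraction(1, 3), 4: Fraction(1, 3)}
s = convolve(x, y)

third_additive = central_moment(s, 3) == central_moment(x, 3) + central_moment(y, 3)
fourth_central_additive = central_moment(s, 4) == central_moment(x, 4) + central_moment(y, 4)
fourth_cumulant_additive = fourth_cumulant(s) == fourth_cumulant(x) + fourth_cumulant(y)
```

With these inputs, the third central moment and the fourth cumulant add exactly, whereas the fourth central moment of the sum picks up an extra cross term involving the two variances.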

Sample moments

For all $k$, the $k$-th raw moment of a population can be estimated using the $k$-th raw sample moment
$$\frac{1}{n} \sum_{i=1}^{n} X_i^k$$
applied to a sample $X_1, \ldots, X_n$ drawn from the population.
It can be shown that the expected value of the raw sample moment is equal to the $k$-th raw moment of the population, if that moment exists, for any sample size $n$. It is thus an unbiased estimator. This contrasts with the situation for central moments, whose computation uses up a degree of freedom by using the sample mean. So for example an unbiased estimate of the population variance (the second central moment) is given by
$$\frac{1}{n - 1} \sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2,$$
in which the previous denominator $n$ has been replaced by the degrees of freedom $n - 1$, and in which $\bar{X}$ refers to the sample mean. This estimate of the population moment is greater than the unadjusted observed sample moment by a factor of $\tfrac{n}{n-1}$, and it is referred to as the "adjusted sample variance" or sometimes simply the "sample variance".
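A short sketch of these estimators, using a made-up sample and hypothetical function names, is given below. Python's standard `statistics.variance` uses the same $n - 1$ denominator and serves as a cross-check.

```python
import statistics

def raw_sample_moment(sample, k):
    """k-th raw sample moment: the average of x^k over the sample."""
    return sum(x ** k for x in sample) / len(sample)

def unadjusted_sample_variance(sample):
    """Second central sample moment, with denominator n."""
    mean = raw_sample_moment(sample, 1)
    return sum((x - mean) ** 2 for x in sample) / len(sample)

def adjusted_sample_variance(sample):
    """Unbiased variance estimate, with denominator n - 1."""
    n = len(sample)
    return unadjusted_sample_variance(sample) * n / (n - 1)

# Made-up data for illustration.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

sample_mean = raw_sample_moment(sample, 1)
unadjusted = unadjusted_sample_variance(sample)
adjusted = adjusted_sample_variance(sample)
library_value = statistics.variance(sample)  # also divides by n - 1
```

For this sample of size 8 the adjusted estimate exceeds the unadjusted one by the factor 8/7.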

Problem of moments

Problems of determining a probability distribution from its sequence of moments are called problems of moments. Such problems were first discussed by P.L. Chebyshev in connection with research on limit theorems. In order that the probability distribution of a random variable $X$ be uniquely defined by its moments $\alpha_k = \operatorname{E}[X^k]$ it is sufficient, for example, that Carleman's condition be satisfied:
$$\sum_{k=1}^{\infty} \frac{1}{\alpha_{2k}^{1/(2k)}} = \infty.$$
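A similar result even holds for moments of random vectors. The problem of moments seeks characterizations of sequences $\{\mu'_n : n = 1, 2, 3, \ldots\}$ that are sequences of moments of some function $f$. A related convergence result is the following. Let $\{F_n\}$ be a sequence of distribution functions, all moments $\alpha_k(n)$ of which are finite, and for each integer $k \geq 1$ let
$$\alpha_k(n) \to \alpha_k \quad \text{as } n \to \infty,$$
where $\alpha_k$ is finite. Then there is a subsequence of $\{F_n\}$ that weakly converges to a distribution function $F$ having $\alpha_k$ as its moments. If the moments determine $F$ uniquely, then the sequence $\{F_n\}$ weakly converges to $F$.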

Partial moments

Partial moments are sometimes referred to as "one-sided moments". The $n$-th order lower and upper partial moments with respect to a reference point $r$ may be expressed as
$$\mu_n^-(r) = \int_{-\infty}^{r} (r - x)^n f(x)\,dx,$$
$$\mu_n^+(r) = \int_{r}^{\infty} (x - r)^n f(x)\,dx.$$
If the integral does not converge, the partial moment does not exist.
Partial moments are normalized by being raised to the power $1/n$. The upside potential ratio may be expressed as a ratio of a first-order upper partial moment to a normalized second-order lower partial moment.
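As a sketch, partial moments can be estimated from a sample by replacing the density with the empirical distribution, so each integral becomes an average over observations. That substitution, the function names, and the data below are assumptions made for illustration; orders $n \geq 1$ are assumed.

```python
def lower_partial_moment(sample, r, n):
    """Empirical n-th lower partial moment about r: mean of max(r - x, 0)^n."""
    return sum(max(r - x, 0.0) ** n for x in sample) / len(sample)

def upper_partial_moment(sample, r, n):
    """Empirical n-th upper partial moment about r: mean of max(x - r, 0)^n."""
    return sum(max(x - r, 0.0) ** n for x in sample) / len(sample)

def upside_potential_ratio(sample, r):
    """First-order upper partial moment over the normalized (power 1/2) second-order lower one."""
    return upper_partial_moment(sample, r, 1) / lower_partial_moment(sample, r, 2) ** 0.5

# Made-up observations and reference point.
sample = [-2.0, -1.0, 0.0, 1.0, 3.0]
r = 0.0

upper_1 = upper_partial_moment(sample, r, 1)
lower_1 = lower_partial_moment(sample, r, 1)
lower_2 = lower_partial_moment(sample, r, 2)
ratio = upside_potential_ratio(sample, r)
```

A useful sanity check is that the first-order upper partial moment minus the first-order lower partial moment equals the sample mean minus the reference point.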

Central moments in metric spaces

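Let $(M, d)$ be a metric space, and let $\mathcal{B}(M)$ be the Borel $\sigma$-algebra on $M$, the $\sigma$-algebra generated by the $d$-open subsets of $M$. Let $1 \leq p < \infty$.
The $p$-th central moment of a measure $\mu$ on the measurable space $(M, \mathcal{B}(M))$ about a given point $x_0 \in M$ is defined to be
$$\int_M d(x, x_0)^p \, d\mu(x).$$
$\mu$ is said to have finite $p$-th central moment if the $p$-th central moment of $\mu$ about $x_0$ is finite for some $x_0 \in M$.
This terminology for measures carries over to random variables in the usual way: if $(\Omega, \Sigma, \mathbf{P})$ is a probability space and $X : \Omega \to M$ is a random variable, then the $p$-th central moment of $X$ about $x_0 \in M$ is defined to be
$$\int_M d(x, x_0)^p \, d\left(X_*(\mathbf{P})\right)(x) = \int_\Omega d(X(\omega), x_0)^p \, d\mathbf{P}(\omega) = \operatorname{E}[d(X, x_0)^p],$$
and $X$ has finite $p$-th central moment if the $p$-th central moment of $X$ about $x_0$ is finite for some $x_0 \in M$.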
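For a measure concentrated on finitely many points, the defining integral is a weighted sum of distances raised to the power $p$. The sketch below assumes the Euclidean plane as the metric space and uses a hypothetical function name and made-up weights; any other metric could be passed in place of `math.dist`.

```python
import math

def central_moment_about(points, weights, x0, p, metric=math.dist):
    """p-th central moment about x0 of a discrete measure: sum of d(x, x0)^p * weight."""
    return sum(metric(x, x0) ** p * w for x, w in zip(points, weights))

# Three points in the plane carrying probability weights (illustrative values).
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
weights = [0.5, 0.25, 0.25]
x0 = (0.0, 0.0)

first = central_moment_about(points, weights, x0, 1)
second = central_moment_about(points, weights, x0, 2)
```

Because the measure has finite support, every such moment is finite, whichever reference point is chosen.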