Average


An average of a collection or group is a value that is most central or most common in some sense, and represents its overall position. In mathematics, especially in colloquial usage, it most commonly refers to the arithmetic mean, i.e. the sum divided by the count, so the "average" of the list of numbers 2, 3, 4, 7, 9 is generally considered to be (2 + 3 + 4 + 7 + 9)/5 = 25/5 = 5.
In situations where the data is skewed or has outliers, and it is desired to focus on the main part of the group rather than the long tail, "average" often instead refers to the median, i.e. the value in the center after the values have been sorted. For example, the average personal income is usually given as the median income, so that it represents the majority of the population rather than being overly influenced by the much higher incomes of the few rich people.
In many situations involving rates or ratios, such as computing the average speed from multiple measurements taken over the same distance, the average used is the harmonic mean, i.e. the count divided by the sum of the reciprocals. This is because unlike an arithmetic mean or median of speeds, a harmonic mean of speeds will give the value of the constant speed that would cause one to travel the same distance in the same amount of time. In some situations where the frequency of each value is relevant, such as where a histogram or probability density function is being referenced, the "average" could instead refer to the mode, or most common value. Other statistics that can be used as an average include the mid-range and geometric mean, but they would rarely, if ever, be colloquially referred to as "the average".
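The claim about speeds can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the two speeds and the leg distance are assumed values chosen only for illustration.

```python
from statistics import harmonic_mean, mean

# Hypothetical trip: the same distance is covered twice, at two different speeds.
speeds = [60.0, 40.0]   # km/h (assumed values)
distance = 120.0        # km per leg (assumed value)

# Total travel time, and the constant speed that would cover both legs in that time.
total_time = sum(distance / v for v in speeds)
constant_speed = len(speeds) * distance / total_time

arithmetic = mean(speeds)          # overestimates the effective speed
harmonic = harmonic_mean(speeds)   # matches constant_speed

print(constant_speed, harmonic, arithmetic)
```

Under these assumptions the harmonic mean agrees with the constant speed computed from total distance and total time, while the arithmetic mean is higher.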

General properties

All averages of a collection are somewhere within its bounding box. Therefore, if a collection consists entirely of the same value, any average of it is that value.
Most averages are monotonic, i.e. moving a member of the collection in one direction causes the average to move in the same direction (or stay unchanged). Equivalently, if two collections of numbers A and B have the same number of elements, and they can be arranged such that each entry in A ≥ the corresponding entry in B, then the average of A ≥ the average of B.
All commonly-used averages are linearly homogeneous, i.e. multiplying every value by the same scale factor multiplies the average by that same scale factor.
Most averages remain identical when the list of items is permuted, i.e. the ordering does not matter.

Pythagorean means

The arithmetic mean, the geometric mean, and the harmonic mean are known collectively as the Pythagorean means.

Statistical location

The mode, the median, and the mid-range are often used in addition to the mean as estimates of central tendency in descriptive statistics. These can all be seen as minimizing variation by some measure.
| Type | Description | Example | Result |
| --- | --- | --- | --- |
| Arithmetic mean | Sum of values of a data set divided by number of values | (1 + 2 + 2 + 3 + 4 + 7 + 9) / 7 | 4 |
| Median | Middle value separating the greater and lesser halves of a data set | 1, 2, 2, 3, 4, 7, 9 | 3 |
| Mode | Most frequent value in a data set | 1, 2, 2, 3, 4, 7, 9 | 2 |
| Mid-range | The arithmetic mean of the highest and lowest values of a set | (1 + 9) / 2 | 5 |
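
The four results in the comparison above can be reproduced with a short sketch using Python's standard `statistics` module. The variable names are illustrative; the mid-range is computed by hand because the module has no dedicated function for it.

```python
from statistics import mean, median, mode

data = [1, 2, 2, 3, 4, 7, 9]

arithmetic_mean = mean(data)               # sum of values divided by their count
median_value = median(data)                # middle value of the sorted data
mode_value = mode(data)                    # most frequent value
mid_range = (max(data) + min(data)) / 2    # mean of the two extremes

print(arithmetic_mean, median_value, mode_value, mid_range)
```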

Mode

The most frequently occurring number in a list is called the mode. For example, the mode of the list 1, 2, 2, 3, 3, 3, 4 is 3. It may happen that there are two or more numbers which occur equally often and more often than any other number. In this case there is no agreed definition of mode: either they are all modes or there is no mode.
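
The tie case can be illustrated with `statistics.multimode` from Python's standard library (available since Python 3.8), which follows the "they are all modes" convention. The example lists are assumptions chosen for illustration.

```python
from statistics import multimode

# One value is strictly the most frequent: a single mode.
single = multimode([1, 2, 2, 3, 3, 3, 4])

# Two values tie for most frequent: both are reported,
# in the order in which they first appear in the data.
tied = multimode([1, 1, 2, 2, 3])

print(single, tied)
```

A caller that prefers the "there is no mode" convention could treat any result with more than one element as undefined.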

Median

The median is the middle number of the group when the numbers are ranked in order. If the group contains an even number of values, the mean of the middle two is taken.
Thus to find the median, order the list according to its elements' magnitude and then repeatedly remove the pair consisting of the highest and lowest values until either one or two values are left. If exactly one value is left, it is the median; if two values are left, the median is the arithmetic mean of these two. For example, this method takes the list 1, 7, 3, 13 and orders it to read 1, 3, 7, 13. Then the 1 and 13 are removed to obtain the list 3, 7. Since there are two elements in this remaining list, the median is their arithmetic mean, (3 + 7)/2 = 5.
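
The removal procedure described above can be sketched directly in Python. The function name `median_by_trimming` is hypothetical, and the approach is written for clarity rather than efficiency; in practice one would index the middle of the sorted list directly.

```python
def median_by_trimming(values):
    """Median found by repeatedly discarding the lowest and highest values."""
    remaining = sorted(values)
    if not remaining:
        raise ValueError("median of an empty list is undefined")
    # Drop one value from each end until one or two values are left.
    while len(remaining) > 2:
        remaining = remaining[1:-1]
    # One value left: it is the median. Two left: take their arithmetic mean.
    return sum(remaining) / len(remaining)


print(median_by_trimming([1, 7, 3, 13]))
print(median_by_trimming([1, 2, 2, 3, 4, 7, 9]))
```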

Mid-range

The mid-range is the arithmetic mean of the highest and lowest values of a set.

Summary of types

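| Name | Equation or description | As solution to optimization problem |
| --- | --- | --- |
| Arithmetic mean | $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{n}(x_1 + \cdots + x_n)$ | $\underset{x \in \mathbb{R}}{\operatorname{argmin}} \sum_{i=1}^{n} (x - x_i)^2$ |
| Median | A middle value that separates the higher half from the lower half of the data set; may not be unique if the data set contains an even number of points | $\underset{x \in \mathbb{R}}{\operatorname{argmin}} \sum_{i=1}^{n} \lvert x - x_i \rvert$ |
| Geometric median | A rotation-invariant extension of the median for points in $\mathbb{R}^d$ | $\underset{\vec{x} \in \mathbb{R}^d}{\operatorname{argmin}} \sum_{i=1}^{n} \lVert \vec{x} - \vec{x}_i \rVert_2$ |
| Tukey median | Another rotation-invariant extension of the median for points in $\mathbb{R}^d$: a point that maximizes the Tukey depth | $\underset{\vec{x} \in \mathbb{R}^d}{\operatorname{argmax}} \min_{\vec{u} \neq \vec{0}} \sum_{i=1}^{n} \mathbf{1}\left[(\vec{x}_i - \vec{x}) \cdot \vec{u} \ge 0\right]$ |
| Mode | The most frequent value in the data set | $\underset{x \in \mathbb{R}}{\operatorname{argmax}} \sum_{i=1}^{n} \mathbf{1}[x = x_i]$ |
| Geometric mean | $\sqrt[n]{\prod_{i=1}^{n} x_i} = \sqrt[n]{x_1 \cdot x_2 \cdots x_n}$ | $\underset{x > 0}{\operatorname{argmin}} \sum_{i=1}^{n} (\ln x - \ln x_i)^2$, if $x_i > 0$ |
| Harmonic mean | $\dfrac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}$ | $\underset{x \neq 0}{\operatorname{argmin}} \sum_{i=1}^{n} \left(\frac{1}{x} - \frac{1}{x_i}\right)^2$ |
| Contraharmonic mean | $\dfrac{x_1^2 + x_2^2 + \cdots + x_n^2}{x_1 + x_2 + \cdots + x_n}$ | $\underset{x \in \mathbb{R}}{\operatorname{argmin}} \sum_{i=1}^{n} x_i (x - x_i)^2$ |
| Lehmer mean | $\dfrac{\sum_{i=1}^{n} x_i^{p}}{\sum_{i=1}^{n} x_i^{p-1}}$ | $\underset{x \in \mathbb{R}}{\operatorname{argmin}} \sum_{i=1}^{n} x_i^{p-1} (x - x_i)^2$ |
| Quadratic mean (or RMS) | $\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2} = \sqrt{\frac{1}{n}\left(x_1^2 + x_2^2 + \cdots + x_n^2\right)}$ | $\underset{x \ge 0}{\operatorname{argmin}} \sum_{i=1}^{n} (x^2 - x_i^2)^2$ |
| Cubic mean | $\sqrt[3]{\frac{1}{n}\sum_{i=1}^{n} x_i^3} = \sqrt[3]{\frac{1}{n}\left(x_1^3 + x_2^3 + \cdots + x_n^3\right)}$ | $\underset{x \ge 0}{\operatorname{argmin}} \sum_{i=1}^{n} (x^3 - x_i^3)^2$, if $x_i \ge 0$ |
| Generalized mean | $\sqrt[p]{\frac{1}{n}\sum_{i=1}^{n} x_i^p}$ | $\underset{x \ge 0}{\operatorname{argmin}} \sum_{i=1}^{n} (x^p - x_i^p)^2$, if $x_i \ge 0$ |
| Quasi-arithmetic mean | $f^{-1}\left(\frac{1}{n}\sum_{i=1}^{n} f(x_i)\right)$ | $\underset{x}{\operatorname{argmin}} \sum_{i=1}^{n} \left(f(x) - f(x_i)\right)^2$, if $f$ is monotonic |
| Weighted mean | $\dfrac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} = \dfrac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n}$ | $\underset{x \in \mathbb{R}}{\operatorname{argmin}} \sum_{i=1}^{n} w_i (x - x_i)^2$ |
| Truncated mean | The arithmetic mean of data values after a certain number or proportion of the highest and lowest data values have been discarded | - |
| Interquartile mean | A special case of the truncated mean, using the interquartile range. A special case of the inter-quantile truncated mean, which operates on quantiles that are equidistant but on opposite sides of the median. | - |
| Midrange | $\frac{1}{2}\left(\max_i x_i + \min_i x_i\right)$ | $\underset{x \in \mathbb{R}}{\operatorname{argmin}} \max_{i} \lvert x - x_i \rvert$ |
| Winsorized mean | Similar to the truncated mean, but, rather than deleting the extreme values, they are set equal to the largest and smallest values that remain | - |
| Medoid | A representative object of a set of objects with minimal sum of dissimilarities to all the objects in the set, according to some dissimilarity function $d$ | $\underset{x \in \{x_1, \ldots, x_n\}}{\operatorname{argmin}} \sum_{i=1}^{n} d(x, x_i)$ |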
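
The truncated and Winsorized means are described in words only, so a small sketch may help to contrast them. The function names and the parameter `k` (the number of values affected at each end) are assumptions made for this illustration, not a standard API.

```python
def truncated_mean(values, k):
    """Arithmetic mean after discarding the k lowest and k highest values."""
    ordered = sorted(values)
    if k < 0 or 2 * k >= len(ordered):
        raise ValueError("k must leave at least one value")
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def winsorized_mean(values, k):
    """Arithmetic mean after clamping the k lowest and k highest values
    to the smallest and largest values that remain."""
    ordered = sorted(values)
    if k < 0 or 2 * k >= len(ordered):
        raise ValueError("k must leave at least one value")
    low, high = ordered[k], ordered[-k - 1]
    clamped = [min(max(v, low), high) for v in ordered]
    return sum(clamped) / len(clamped)


data = [1, 2, 2, 3, 4, 7, 9]
print(truncated_mean(data, 1))    # mean of 2, 2, 3, 4, 7
print(winsorized_mean(data, 1))   # mean of 2, 2, 2, 3, 4, 7, 7
```

With `k = 0` both functions reduce to the ordinary arithmetic mean.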

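Even though perhaps not an average, the $\tau$th quantile can similarly be expressed as a solution to the optimization problem
$$\underset{x \in \mathbb{R}}{\operatorname{argmin}} \sum_{i=1}^{n} \left[ (1 - \tau) \max(x - x_i, 0) + \tau \max(x_i - x, 0) \right],$$
which aims to minimize the total tilted absolute value loss.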
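
As a rough numerical check of this characterization, the sketch below evaluates the tilted absolute value loss at each data point and picks the minimizer. It relies on the fact that the loss is piecewise linear and convex with kinks only at the data points, so a minimizer can always be found among them. The function names and the symbol `tau` for the quantile level are assumptions of this sketch.

```python
def tilted_loss(x, data, tau):
    """Total tilted absolute value loss of candidate x at quantile level tau."""
    return sum(
        (1 - tau) * max(x - xi, 0) + tau * max(xi - x, 0)
        for xi in data
    )


def quantile_by_search(data, tau):
    """Data point minimizing the tilted loss (brute-force search)."""
    candidates = sorted(set(data))
    return min(candidates, key=lambda x: tilted_loss(x, data, tau))


data = [1, 2, 2, 3, 4, 7, 9]
print(quantile_by_search(data, 0.5))   # tau = 0.5 recovers the median
print(quantile_by_search(data, 0.9))
```

When several points minimize the loss (for example, the median of an even-sized data set), this search returns only the smallest of them.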
The table of mathematical symbols explains the symbols used below.

Miscellaneous types

Other more sophisticated averages are: trimean, trimedian, and normalized mean, with their generalizations.
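One can create one's own average metric using the generalized f-mean:
$$y = f^{-1}\left(\frac{1}{n}\left[f(x_1) + f(x_2) + \cdots + f(x_n)\right]\right),$$
where $f$ is any invertible function. The harmonic mean is an example of this using $f(x) = 1/x$, and the geometric mean is another, using $f(x) = \log x$.
However, this method for generating means is not general enough to capture all averages. A more general method for defining an average takes any function $g(x_1, x_2, \ldots, x_n)$ of a list of arguments that is continuous, strictly increasing in each argument, and symmetric. The average $y$ is then the value that, when replacing each member of the list, results in the same function value: $g(y, y, \ldots, y) = g(x_1, x_2, \ldots, x_n)$. This most general definition still captures the important property of all averages that the average of a list of identical elements is that element itself. The function $g(x_1, x_2, \ldots, x_n) = x_1 + x_2 + \cdots + x_n$ provides the arithmetic mean. The function $g(x_1, x_2, \ldots, x_n) = x_1 x_2 \cdots x_n$ (where the list elements are positive numbers) provides the geometric mean. The function $g(x_1, x_2, \ldots, x_n) = \left(x_1^{-1} + x_2^{-1} + \cdots + x_n^{-1}\right)^{-1}$ (where the list elements are positive numbers) provides the harmonic mean.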
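
The generalized f-mean translates almost directly into code. The following is a minimal sketch in which the caller supplies both `f` and its inverse; the function name `f_mean` and the sample data are assumptions made for illustration.

```python
import math


def f_mean(values, f, f_inverse):
    """Generalized f-mean: apply f, take the arithmetic mean, then invert f."""
    return f_inverse(sum(f(x) for x in values) / len(values))


data = [1.0, 2.0, 4.0]

arithmetic = f_mean(data, lambda x: x, lambda y: y)        # f(x) = x
geometric = f_mean(data, math.log, math.exp)               # f(x) = log x
harmonic = f_mean(data, lambda x: 1 / x, lambda y: 1 / y)  # f(x) = 1/x

print(arithmetic, geometric, harmonic)
```

For positive data the three results are ordered harmonic ≤ geometric ≤ arithmetic, which gives a quick sanity check on any implementation.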

Average percentage return and CAGR

A type of average used in finance is the average percentage return, which is an example of a geometric mean. When the returns are annual, it is called the Compound Annual Growth Rate (CAGR). For example, if we are considering a period of two years, and the investment return in the first year is −10% and the return in the second year is +60%, then the average percentage return or CAGR, $R$, can be obtained by solving the equation $(1 - 0.1) \times (1 + 0.6) = (1 + R) \times (1 + R)$. The value of $R$ that makes this equation true is 0.2, or 20%. This means that the total return over the 2-year period is the same as if there had been 20% growth each year. The order of the years makes no difference: the average percentage return for +60% followed by −10% is the same as that for −10% followed by +60%.
This method can be generalized to examples in which the periods are not equal. For example, consider a period of a half of a year for which the return is −23% and a period of two and a half years for which the return is +13%. The average percentage return for the combined period is the single year return, $R$, that is the solution of the following equation: $(1 - 0.23)^{0.5} \times (1 + 0.13)^{2.5} = (1 + R)^{0.5 + 2.5}$, giving an average return $R$ of 0.0600 or 6.00%.
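
Both worked examples can be reproduced with a short sketch. The function name `average_return` and its input format, a list of (annual return, duration in years) pairs, are assumptions made for this illustration.

```python
import math


def average_return(periods):
    """Average annual return R over periods given as (annual_return, years) pairs.

    Solves prod((1 + r_k) ** t_k) = (1 + R) ** sum(t_k) for R.
    """
    total_years = sum(years for _, years in periods)
    growth = math.prod((1 + r) ** years for r, years in periods)
    return growth ** (1 / total_years) - 1


# Two one-year periods: -10% followed by +60%.
equal_periods = average_return([(-0.10, 1), (0.60, 1)])

# Half a year at -23%, then two and a half years at +13%.
unequal_periods = average_return([(-0.23, 0.5), (0.13, 2.5)])

print(round(equal_periods, 4), round(unequal_periods, 4))
```

Because multiplication is commutative, reordering the periods leaves the result unchanged (up to floating-point rounding).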

Moving average

Given a time series, such as daily stock market prices or yearly temperatures, people often want to create a smoother series. This helps to show underlying trends or perhaps periodic behavior. An easy way to do this is the moving average: one chooses a number n and creates a new series by taking the arithmetic mean of the first n values, then moving forward one place by dropping the oldest value and introducing a new value at the other end of the list, and so on. This is the simplest form of moving average. More complicated forms involve using a weighted average. The weighting can be used to enhance or suppress various periodic behaviors and there is extensive analysis of what weightings to use in the literature on filtering. In digital signal processing the term "moving average" is used even when the sum of the weights is not 1.0. The reason for this is that the analyst is usually interested only in the trend or the periodic behavior.
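
The simple moving average, and a weighted variant, can be sketched as follows. The function names and the sample series are assumptions for illustration; the weighted version does not normalize the weights, in line with the signal-processing usage mentioned above.

```python
def simple_moving_average(series, n):
    """Arithmetic mean of each run of n consecutive values."""
    if n <= 0 or n > len(series):
        raise ValueError("n must be between 1 and len(series)")
    window_sum = sum(series[:n])
    averages = [window_sum / n]
    for i in range(n, len(series)):
        # Slide the window: add the newest value, drop the oldest.
        window_sum += series[i] - series[i - n]
        averages.append(window_sum / n)
    return averages


def weighted_moving_average(series, weights):
    """Weighted sum over each window; weights are applied oldest-first
    and are not required to sum to 1."""
    n = len(weights)
    if n == 0 or n > len(series):
        raise ValueError("need between 1 and len(series) weights")
    return [
        sum(w * x for w, x in zip(weights, series[i:i + n]))
        for i in range(len(series) - n + 1)
    ]


prices = [1.0, 2.0, 3.0, 4.0, 5.0]
print(simple_moving_average(prices, 3))
print(weighted_moving_average(prices, [0.25, 0.5, 0.25]))
```

Both functions return a series that is shorter than the input by n − 1 values, since no average is produced until a full window is available.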