Expected value


In probability theory, the expected value is a generalization of the weighted average.
The expected value of a random variable with a finite number of outcomes is a weighted average of all possible outcomes. In the case of a continuum of possible outcomes, the expectation is defined by integration. In the axiomatic foundation for probability provided by measure theory, the expectation is given by Lebesgue integration.
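The expected value of a random variable \(X\) is often denoted by \(\operatorname{E}(X)\), \(\operatorname{E}[X]\), or \(\operatorname{E}X\), with \(\operatorname{E}\) also often stylized as \(\mathbb{E}\) or \(E\).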

History

The concept of expected value emerged in the mid-17th century from the "problem of points", a puzzle centered on how to fairly divide stakes between two players forced to end a game prematurely. While the problem had been debated for centuries, it gained new momentum in 1654 when the Chevalier de Méré, a French writer and amateur mathematician, presented it to Blaise Pascal. Méré claimed that this problem could not be solved and that it showed just how flawed mathematics was when it came to its application to the real world. Pascal, being a mathematician, decided to work on a solution to the problem.
He began to discuss the problem in the famous series of letters to Pierre de Fermat. Soon enough, they both independently came up with a solution. They solved the problem in different computational ways, but their results were identical because their computations were based on the same fundamental principle. The principle is that the value of a future gain should be directly proportional to the chance of getting it. This principle seemed to have come naturally to both of them. They were very pleased by the fact that they had found essentially the same solution, and this in turn made them absolutely convinced that they had solved the problem conclusively; however, they did not publish their findings. They only informed a small circle of mutual scientific friends in Paris about it.
The Dutch mathematician Christiaan Huygens considered the problem of points in his treatise on probability theory, "De ratiociniis in ludo aleæ", which he published in 1657 just after visiting Paris. He presented a solution based on the same principle as the solutions of Pascal and Fermat. The book extended the concept of expectation by adding rules for calculating expectations in more complicated situations than the original problem, and can be seen as the first successful attempt at laying down the foundations of the theory of probability.
In the foreword to his treatise, Huygens wrote:
In the mid-nineteenth century, Pafnuty Chebyshev became the first person to think systematically in terms of the expectations of random variables.

Etymology

Neither Pascal nor Huygens used the term "expectation" in its modern sense. In particular, Huygens writes:
More than a hundred years later, in 1814, Pierre-Simon Laplace published his tract "Théorie analytique des probabilités", where the concept of expected value was defined explicitly:

Notations

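The use of the letter E to denote "expected value" goes back to W. A. Whitworth in 1901. The symbol has since become popular for English writers. In German, E stands for Erwartungswert, in Spanish for esperanza matemática, and in French for espérance mathématique.
When "E" is used to denote "expected value", authors use a variety of stylizations: the expectation operator can be stylized as \(\operatorname{E}\) (upright), \(E\) (italic), or \(\mathbb{E}\) (blackboard bold), while a variety of bracket notations (such as \(\operatorname{E}(X)\), \(\operatorname{E}[X]\), and \(\operatorname{E}X\)) are all used.
Another popular notation is \(\mu_X\). The notations \(\langle X \rangle\), \(\langle X \rangle_{\mathrm{av}}\), and \(\overline{X}\) are commonly used in physics. \(\operatorname{M}(X)\) is used in Russian-language literature.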

Definition

As discussed above, there are several context-dependent ways of defining the expected value. The simplest and original definition deals with the case of finitely many possible outcomes, such as in the flip of a coin. With the theory of infinite series, this can be extended to the case of countably many possible outcomes. It is also very common to consider the distinct case of random variables dictated by continuous probability density functions, as these arise in many natural contexts. All of these specific definitions may be viewed as special cases of the general definition based upon the mathematical tools of measure theory and Lebesgue integration, which provide these different contexts with an axiomatic foundation and common language.
Any definition of expected value may be extended to define an expected value of a multidimensional random variable, i.e. a random vector \(X\). It is defined component by component, as \(\operatorname{E}[X]_i = \operatorname{E}[X_i]\). Similarly, one may define the expected value of a random matrix \(X\) with components \(X_{ij}\) by \(\operatorname{E}[X]_{ij} = \operatorname{E}[X_{ij}]\).
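As a minimal sketch of the component-by-component rule, the following Python snippet computes the expectation of a random vector that takes finitely many values, using the weighted-average definition for each component. The function name `expected_vector` and the two-coin example are illustrative assumptions, not part of the source.

```python
from fractions import Fraction


def expected_vector(outcomes, probs):
    """Component-wise expectation of a random vector with finitely many outcomes.

    outcomes: list of equal-length tuples (the possible vector values)
    probs:    list of probabilities, one per outcome
    """
    dim = len(outcomes[0])
    return [sum(p * v[i] for v, p in zip(outcomes, probs)) for i in range(dim)]


# Hypothetical example: two fair coin flips.
# The vector is (heads on first flip, total number of heads).
outcomes = [(0, 0), (0, 1), (1, 1), (1, 2)]
probs = [Fraction(1, 4)] * 4

mean_vector = expected_vector(outcomes, probs)
print(mean_vector)  # [Fraction(1, 2), Fraction(1, 1)]
```

Each entry of the result is simply the expectation of the corresponding component taken on its own.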

Random variables with finitely many outcomes

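Consider a random variable \(X\) with a finite list \(x_1, \ldots, x_k\) of possible outcomes, each of which (respectively) has probability \(p_1, \ldots, p_k\) of occurring. The expectation of \(X\) is defined as
\[ \operatorname{E}[X] = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k. \]
Since the probabilities must satisfy \(p_1 + \cdots + p_k = 1\), it is natural to interpret \(\operatorname{E}[X]\) as a weighted average of the \(x_i\) values, with weights given by their probabilities \(p_i\).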
In the special case that all possible outcomes are equiprobable, the weighted average is given by the standard average. In the general case, the expected value takes into account the fact that some outcomes are more likely than others.
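The weighted-average definition can be sketched in a few lines of Python. The helper `expected_value` and the sample values below are hypothetical and chosen only for illustration. Exact fractions are used so that the check that the probabilities sum to 1 is not disturbed by rounding.

```python
from fractions import Fraction


def expected_value(outcomes, probs):
    """Weighted average of outcomes, with probabilities as weights."""
    if sum(probs) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(outcomes, probs))


# Hypothetical three-outcome random variable with unequal probabilities.
values = [0, 1, 10]
weights = [Fraction(1, 2), Fraction(2, 5), Fraction(1, 10)]
biased_mean = expected_value(values, weights)
print(biased_mean)  # 7/5

# Equiprobable case: the weighted average reduces to the ordinary average.
uniform_mean = expected_value(values, [Fraction(1, 3)] * 3)
plain_average = Fraction(sum(values), len(values))
print(uniform_mean == plain_average)  # True
```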

Examples

  • Let \(X\) represent the outcome of a roll of a fair six-sided die. More specifically, \(X\) will be the number of pips showing on the top face of the die after the toss. The possible values for \(X\) are 1, 2, 3, 4, 5, and 6, all of which are equally likely with a probability of \(\tfrac{1}{6}\). The expectation of \(X\) is \(\operatorname{E}[X] = 1 \cdot \tfrac{1}{6} + 2 \cdot \tfrac{1}{6} + 3 \cdot \tfrac{1}{6} + 4 \cdot \tfrac{1}{6} + 5 \cdot \tfrac{1}{6} + 6 \cdot \tfrac{1}{6} = 3.5.\) If one rolls the die \(n\) times and computes the average of the results, then as \(n\) grows, the average will almost surely converge to the expected value, a fact known as the strong law of large numbers.
  • The roulette game consists of a small ball and a wheel with 38 numbered pockets around the edge. As the wheel is spun, the ball bounces around randomly until it settles down in one of the pockets. Suppose the random variable \(X\) represents the monetary outcome of a $1 bet on a single number. If the bet wins, which happens with probability \(\tfrac{1}{38}\), the payoff is $35; otherwise the player loses the bet. The expected profit from such a bet will be \(\operatorname{E}[X] = -\$1 \cdot \tfrac{37}{38} + \$35 \cdot \tfrac{1}{38} = -\$\tfrac{1}{19}.\) That is, the expected value to be won from a $1 bet is −$1/19. Thus, over 190 bets, the expected net loss is $10.
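Both examples can be checked numerically. The sketch below computes the two expectations exactly and then simulates die rolls to illustrate the law of large numbers. The function `average_of_rolls` and the fixed seed are assumptions made for reproducibility; the simulated averages depend on the pseudo-random generator and are therefore printed without stated values.

```python
import random
from fractions import Fraction

# Fair die: each face 1..6 has probability 1/6.
die_mean = sum(Fraction(face, 6) for face in range(1, 7))
print(die_mean)  # 7/2

# Roulette, $1 on a single number: lose $1 with probability 37/38,
# win $35 with probability 1/38.
roulette_mean = Fraction(37, 38) * (-1) + Fraction(1, 38) * 35
print(roulette_mean)  # -1/19

# Expected net result over 190 independent bets.
loss_over_190 = 190 * roulette_mean
print(loss_over_190)  # -10


def average_of_rolls(n, seed=0):
    """Sample mean of n simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)
    return sum(rng.randint(1, 6) for _ in range(n)) / n


# The sample mean drifts toward 3.5 as the number of rolls grows.
for n in (10, 1_000, 100_000):
    print(n, average_of_rolls(n))
```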

Random variables with countably infinitely many outcomes

Informally, the expectation of a random variable with a countably infinite set of possible outcomes is defined analogously as the weighted average of all possible outcomes, where the weights are given by the probabilities of realizing each given value. This is to say that
\[ \operatorname{E}[X] = \sum_{i=1}^{\infty} x_i p_i, \]
where \(x_1, x_2, \ldots\) are the possible outcomes of the random variable \(X\) and \(p_1, p_2, \ldots\) are their corresponding probabilities. In many non-mathematical textbooks, this is presented as the full definition of expected values in this context.
However, there are some subtleties with infinite summation, so the above formula is not suitable as a mathematical definition. In particular, the Riemann series theorem of mathematical analysis illustrates that the value of certain infinite sums involving positive and negative summands depends on the order in which the summands are given. Since the outcomes of a random variable have no naturally given order, this creates a difficulty in defining expected value precisely.
For this reason, many mathematical textbooks only consider the case that the infinite sum given above converges absolutely, which implies that the infinite sum is a finite number independent of the ordering of summands. In the alternative case that the infinite sum does not converge absolutely, one says the random variable "does not have finite expectation".
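The dependence on ordering can be seen numerically. As an illustrative assumption, take a random variable with outcomes \(x_k = (-1)^{k+1} 2^k / k\) and probabilities \(p_k = 2^{-k}\), so that the summands \(x_k p_k = (-1)^{k+1}/k\) form the alternating harmonic series, which converges but not absolutely. The sketch below sums these terms in two different orders; the function names are hypothetical.

```python
import math


def natural_order(n_terms):
    """Partial sum 1 - 1/2 + 1/3 - 1/4 + ... in the usual order."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))


def rearranged(n_blocks):
    """Same summands, reordered: one positive term, then two negative terms.

    Block j contributes 1/(2j-1) - 1/(4j-2) - 1/(4j),
    i.e. 1 - 1/2 - 1/4 + 1/3 - 1/6 - 1/8 + ...
    """
    total = 0.0
    for j in range(1, n_blocks + 1):
        total += 1 / (2 * j - 1) - 1 / (4 * j - 2) - 1 / (4 * j)
    return total


print(natural_order(200_000), math.log(2))    # both close to ln 2
print(rearranged(100_000), math.log(2) / 2)   # both close to (ln 2) / 2
```

The two orderings use exactly the same summands but approach different limits, so no single value can be assigned to the sum without fixing an order.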

Example

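Suppose \(x_i = i\) and \(p_i = \tfrac{c}{i \cdot 2^i}\) for \(i = 1, 2, 3, \ldots,\) where \(c = \tfrac{1}{\ln 2}\) is the scaling factor which makes the probabilities sum to 1:
\[ \sum_{i=1}^{\infty} p_i = c \sum_{i=1}^{\infty} \frac{1}{i \cdot 2^i} = c \ln 2 = 1, \]
by the logarithm series \(-\ln(1 - t) = \sum_{i=1}^{\infty} \tfrac{t^i}{i}\) evaluated at \(t = \tfrac{1}{2}\). Then we have
\[ \operatorname{E}[X] = \sum_{i=1}^{\infty} x_i p_i = c \sum_{i=1}^{\infty} \frac{1}{2^i} = c = \frac{1}{\ln 2}, \]
due to the geometric series \(\sum_{i=1}^{\infty} 2^{-i} = 1\).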
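A quick numerical check of this example, assuming the distribution is \(x_i = i\) with \(p_i = c / (i \cdot 2^i)\) and \(c = 1/\ln 2\): truncating both series after 60 terms already agrees with the closed forms to within floating-point accuracy, because the summands decay geometrically. The cutoff `N` is an arbitrary choice.

```python
import math

c = 1 / math.log(2)  # scaling factor


def p(i):
    """Probability of the outcome x_i = i."""
    return c / (i * 2 ** i)


N = 60  # truncation point; remaining terms are smaller than 2**-60

total_probability = sum(p(i) for i in range(1, N + 1))
expectation = sum(i * p(i) for i in range(1, N + 1))

print(total_probability)  # close to 1
print(expectation, c)     # both close to 1 / ln 2
```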

Random variables with density

Now consider a random variable \(X\) which has a probability density function given by a function \(f\) on the real number line. This means that the probability of \(X\) taking on a value in any given open interval is given by the integral of \(f\) over that interval. The expectation of \(X\) is then given by the integral
\[ \operatorname{E}[X] = \int_{-\infty}^{\infty} x f(x) \, dx. \]
A general and mathematically precise formulation of this definition uses measure theory and Lebesgue integration, and the corresponding theory of absolutely continuous random variables is described in the next section. The density functions of many common distributions are piecewise continuous, and as such the theory is often developed in this restricted setting. For such functions, it is sufficient to only consider the standard Riemann integration. Sometimes continuous random variables are defined as those corresponding to this special class of densities, although the term is used differently by various authors.
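For a piecewise-continuous density, the expectation integral of \(x f(x)\) can be approximated with an ordinary Riemann-type sum. The sketch below uses a midpoint rule on a truncated interval. The exponential density with rate 2 (whose mean is known to be 1/2), the helper name `expectation_from_density`, and the truncation at 40 are all assumptions chosen for illustration.

```python
import math


def expectation_from_density(f, lower, upper, steps=200_000):
    """Midpoint-rule approximation of the integral of x * f(x) over [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * width  # midpoint of the k-th subinterval
        total += x * f(x) * width
    return total


rate = 2.0


def exponential_density(x):
    """Density of an exponential distribution with the given rate."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


# The density is negligible beyond x = 40, so the truncation error is tiny.
approx_mean = expectation_from_density(exponential_density, 0.0, 40.0)
print(approx_mean)  # close to 1 / rate = 0.5
```

Truncating the range is harmless here because the integral converges absolutely; the Cauchy distribution shows what goes wrong when it does not.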
Analogously to the countably-infinite case above, there are subtleties with this expression due to the infinite region of integration. Such subtleties can be seen concretely if the distribution of \(X\) is given by the Cauchy distribution, so that \(f(x) = (x^2 + \pi^2)^{-1}\). It is straightforward to compute in this case that
\[ \int_a^b x f(x) \, dx = \int_a^b \frac{x}{x^2 + \pi^2} \, dx = \frac{1}{2} \ln \frac{b^2 + \pi^2}{a^2 + \pi^2}. \]
The limit of this expression as \(a \to -\infty\) and \(b \to \infty\) does not exist: if the limits are taken so that \(a = -b\), then the limit is zero, while if the constraint \(2a = -b\) is taken, then the limit is \(\ln 2\).
To avoid such ambiguities, in mathematical textbooks it is common to require that the given integral converges absolutely, with \(\operatorname{E}[X]\) left undefined otherwise. However, measure-theoretic notions as given below can be used to give a systematic definition of \(\operatorname{E}[X]\) for more general random variables \(X\).
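The ambiguity for the Cauchy distribution can be made concrete with the closed form of the truncated integral, assuming the density \(f(x) = 1/(x^2 + \pi^2)\) and the antiderivative \(\tfrac{1}{2}\ln(x^2 + \pi^2)\) of \(x f(x)\). The sketch below evaluates that closed form along two different ways of letting the limits go to infinity; the function name is hypothetical.

```python
import math


def truncated_integral(a, b):
    """Closed form of the integral of x / (x**2 + pi**2) over [a, b]."""
    return 0.5 * math.log((b * b + math.pi ** 2) / (a * a + math.pi ** 2))


for b in (10.0, 1e3, 1e6):
    symmetric = truncated_integral(-b, b)    # limits tied by a = -b
    skewed = truncated_integral(-b / 2, b)   # limits tied by 2a = -b
    print(b, symmetric, skewed)

print(math.log(2))  # value approached by the skewed truncation
```

The symmetric truncation is always zero, while the skewed one approaches \(\ln 2\), so the improper integral has no well-defined value.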

Arbitrary real-valued random variables

All definitions of the expected value may be expressed in the language of measure theory. In general, if \(X\) is a real-valued random variable defined on a probability space \((\Omega, \Sigma, \operatorname{P})\), then the expected value of \(X\), denoted by \(\operatorname{E}[X]\), is defined as the Lebesgue integral
\[ \operatorname{E}[X] = \int_{\Omega} X \, d\operatorname{P}. \]
Despite the newly abstract situation, this definition is extremely similar in nature to the very simplest definition of expected values, given above, as certain weighted averages. This is because, in measure theory, the value of the Lebesgue integral of \(X\) is defined via weighted averages of approximations of \(X\) which take on finitely many values. Moreover, if given a random variable with finitely or countably many possible values, the Lebesgue theory of expectation is identical to the summation formulas given above. However, the Lebesgue theory clarifies the scope of the theory of probability density functions. A random variable \(X\) is said to be absolutely continuous if any of the following conditions are satisfied:
  • there is a nonnegative measurable function \(f\) on the real line such that \(\operatorname{P}(X \in A) = \int_A f(x) \, dx\) for any Borel set \(A\), in which the integral is Lebesgue.
  • the cumulative distribution function of \(X\) is absolutely continuous.
  • for any Borel set \(A\) of real numbers with Lebesgue measure equal to zero, the probability of \(X\) being valued in \(A\) is also equal to zero.
  • for any positive number \(\varepsilon\) there is a positive number \(\delta\) such that: if \(A\) is a Borel set with Lebesgue measure less than \(\delta\), then the probability of \(X\) being valued in \(A\) is less than \(\varepsilon\).
These conditions are all equivalent, although this is nontrivial to establish. In this definition, \(f\) is called the probability density function of \(X\). According to the change-of-variables formula for Lebesgue integration, combined with the law of the unconscious statistician, it follows that
\[ \operatorname{E}[X] \equiv \int_{\Omega} X \, d\operatorname{P} = \int_{\mathbb{R}} x f(x) \, dx \]
for any absolutely continuous random variable \(X\). The above discussion of continuous random variables is thus a special case of the general Lebesgue theory, due to the fact that every piecewise-continuous function is measurable.
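The expected value of any real-valued random variable \(X\) can also be defined on the graph of its cumulative distribution function \(F\) by a nearby equality of areas. In fact, \(\operatorname{E}[X] = \mu\) with a real number \(\mu\) if and only if the two surfaces in the \(x\)-\(y\)-plane, described by
\[ x \le \mu, \;\; 0 \le y \le F(x) \quad \text{or} \quad x \ge \mu, \;\; F(x) \le y \le 1 \]
respectively, have the same finite area, i.e. if
\[ \int_{-\infty}^{\mu} F(x) \, dx = \int_{\mu}^{\infty} \bigl(1 - F(x)\bigr) \, dx \]
and both improper Riemann integrals converge. Finally, this is equivalent to the representation
\[ \operatorname{E}[X] = \int_{0}^{\infty} \bigl(1 - F(x)\bigr) \, dx - \int_{-\infty}^{0} F(x) \, dx, \]
also with convergent integrals.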
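The description of the expected value through the cumulative distribution function \(F\), namely the integral of \(1 - F\) over the positive half-line minus the integral of \(F\) over the negative half-line, can be checked on a simple case. The sketch below assumes a uniform distribution on \([-1, 3]\), whose mean is 1; this distribution, the helper names, and the midpoint-rule integrator are illustrative choices rather than anything prescribed by the text.

```python
def integrate(g, lower, upper, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [lower, upper]."""
    width = (upper - lower) / steps
    return sum(g(lower + (k + 0.5) * width) for k in range(steps)) * width


def uniform_cdf(x, low=-1.0, high=3.0):
    """Cumulative distribution function of the uniform distribution on [low, high]."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


# Outside [-1, 3] the integrands vanish, so finite ranges suffice.
upper_part = integrate(lambda x: 1.0 - uniform_cdf(x), 0.0, 3.0)
lower_part = integrate(uniform_cdf, -1.0, 0.0)
mean_from_cdf = upper_part - lower_part
print(mean_from_cdf)  # close to 1.0

# Equality of areas on either side of the mean mu = 1.
mu = 1.0
left_area = integrate(uniform_cdf, -1.0, mu)
right_area = integrate(lambda x: 1.0 - uniform_cdf(x), mu, 3.0)
print(left_area, right_area)  # both close to 0.5
```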