Probability distribution


In probability theory and statistics, a probability distribution is a function that gives the probabilities of occurrence of possible events for an experiment. It is a mathematical description of a random phenomenon in terms of its sample space and the probabilities of events.
Each random variable has a probability distribution. For instance, if $X$ is used to denote the outcome of a coin toss, then the probability distribution of $X$ would take the value 0.5 for $X = \text{heads}$, and 0.5 for $X = \text{tails}$ (assuming the coin is fair). More commonly, probability distributions are used to compare the relative occurrence of many different random values.
In practice, probability distributions are often described using cumulative distribution functions, probability mass functions or probability density functions.
In probability theory, probability distributions are represented by probability measures, and the term probability distribution is often used in reference to probability measures associated with random variables.
Probability distributions of particular importance are given specific names.

Introduction

A probability distribution is a mathematical description of the probabilities of events, subsets of the sample space. The sample space, often represented in notation by $\Omega$, is the set of all possible outcomes of a random phenomenon being observed. The sample space may be any set: a set of real numbers, a set of descriptive labels, a set of vectors, a set of arbitrary non-numerical values, etc. For example, the sample space of a coin flip could be $\Omega = \{\text{heads}, \text{tails}\}$.
To define probability distributions for the specific case of random variables, it is common to distinguish between discrete and continuous random variables. In the discrete case, it is sufficient to specify a probability mass function $p$ assigning a probability to each possible outcome (e.g. when throwing a fair die, each of the six digits 1 to 6, corresponding to the number of dots on the die, has probability $\tfrac{1}{6}$). The probability of an event is then defined to be the sum of the probabilities of all outcomes that satisfy the event; for example, the probability of the event "the die rolls an even value" is $p(2) + p(4) + p(6) = \tfrac{1}{6} + \tfrac{1}{6} + \tfrac{1}{6} = \tfrac{1}{2}.$
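For illustration, a minimal Python sketch of this computation (the names die_pmf and event_probability are arbitrary helpers introduced here, not standard terminology):

    from fractions import Fraction

    # Probability mass function of a fair six-sided die: each face has probability 1/6.
    die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

    def event_probability(pmf, event):
        """Probability of an event = sum of the probabilities of the outcomes in it."""
        return sum(p for outcome, p in pmf.items() if outcome in event)

    # Event "the die rolls an even value" = {2, 4, 6}.
    print(event_probability(die_pmf, {2, 4, 6}))  # 1/2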
In contrast, when a random variable takes values from a continuum then by convention, any individual outcome is assigned probability zero. For such continuous random variables, only events that include infinitely many outcomes such as intervals have probability greater than 0.
For example, consider measuring the weight of a piece of ham in the supermarket, and assume the scale can provide arbitrarily many digits of precision. Then, the probability that it weighs exactly 500 g must be zero because no matter how high the level of precision chosen, it cannot be assumed that there are no non-zero decimal digits in the remaining omitted digits ignored by the precision level.
However, for the same use case, it is possible to meet quality control requirements such as that a package of "500 g" of ham must weigh between 490 g and 510 g with at least 98% probability. This is possible because this measurement does not require as much precision from the underlying equipment.
Continuous probability distributions can be described by means of the cumulative distribution function, which describes the probability that the random variable is no larger than a given value (i.e., $P(X \le x)$ for some $x$). The cumulative distribution function is the area under the probability density function from $-\infty$ to $x$, as shown in figure 1.
Most continuous probability distributions encountered in practice are not only continuous but also absolutely continuous. Such distributions can be described by their probability density function. Informally, the probability density $f(x)$ of a random variable $X$ describes the infinitesimal probability that $X$ takes the value $x$, that is $P(x \le X < x + \Delta x) \approx f(x)\,\Delta x$ as $\Delta x$ becomes arbitrarily small. The probability that $X$ lies in a given interval can be computed rigorously by integrating the probability density function over that interval.
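A minimal numerical sketch, assuming purely for illustration that the packaged weight in the ham example follows a normal distribution with mean 500 g and standard deviation 4 g: the probability of an interval is approximated by integrating the density over it.

    import math

    MU, SIGMA = 500.0, 4.0  # assumed mean and standard deviation in grams

    def normal_pdf(x, mu=MU, sigma=SIGMA):
        """Probability density function of a normal distribution."""
        return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

    def interval_probability(a, b, steps=10_000):
        """P(a <= X <= b), approximated by integrating the density (trapezoidal rule)."""
        h = (b - a) / steps
        xs = [a + i * h for i in range(steps + 1)]
        ys = [normal_pdf(x) for x in xs]
        return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

    # Probability that a "500 g" package weighs between 490 g and 510 g.
    print(round(interval_probability(490, 510), 4))  # about 0.9876 under these assumptions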

General probability definition

Let $(\Omega, \mathcal{F}, P)$ be a probability space, $(E, \mathcal{E})$ be a measurable space, and $X \colon \Omega \to E$ be an $(E, \mathcal{E})$-valued random variable. Then the probability distribution of $X$ is the pushforward measure $X_* P$ of the probability measure $P$ onto $(E, \mathcal{E})$ induced by $X$. Explicitly, this pushforward measure on $(E, \mathcal{E})$ is given by
$X_* P(B) = P\left(X^{-1}(B)\right) = P(\{\omega \in \Omega \mid X(\omega) \in B\})$ for $B \in \mathcal{E}$.
Any probability distribution is a probability measure on $(E, \mathcal{E})$.
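A finite sketch of a pushforward, assuming a toy probability space of ordered pairs of fair-die rolls and taking $X$ to be the sum of the two rolls (an arbitrary choice made here for illustration):

    from fractions import Fraction
    from collections import defaultdict
    from itertools import product

    # Probability space: ordered pairs of two fair die rolls, each with probability 1/36.
    omega = list(product(range(1, 7), repeat=2))
    P = {outcome: Fraction(1, 36) for outcome in omega}

    def X(outcome):
        """The random variable: the sum of the two rolls."""
        return outcome[0] + outcome[1]

    # Pushforward measure X_*P(B) = P(X^{-1}(B)), tabulated pointwise on the values of X.
    distribution = defaultdict(Fraction)
    for outcome, p in P.items():
        distribution[X(outcome)] += p

    print(dict(distribution))  # e.g. distribution[7] == Fraction(1, 6)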
A probability distribution can be described in various forms, such as by a probability mass function or a cumulative distribution function. One of the most general descriptions, which applies for absolutely continuous and discrete variables, is by means of a probability function $P \colon \mathcal{A} \to \mathbb{R}$ whose input space $\mathcal{A}$ is a σ-algebra, and gives a real number probability as its output, particularly, a number in $[0, 1] \subset \mathbb{R}$.
The probability function $P$ can take as argument subsets of the sample space itself, as in the coin toss example, where the function was defined so that $P(\text{heads}) = 0.5$ and $P(\text{tails}) = 0.5$. However, because of the widespread use of random variables, which transform the sample space into a set of numbers, it is more common to study probability distributions whose argument are subsets of these particular kinds of sets, and all probability distributions discussed in this article are of this type. It is common to denote as $P(X \in E)$ the probability that a certain value of the variable $X$ belongs to a certain event $E$.
The above probability function only characterizes a probability distribution if it satisfies all the Kolmogorov axioms, that is:
  1. $P(X \in E) \ge 0$ for all events $E$, so the probability is non-negative
  2. $P(X \in E) \le 1$ for all events $E$, so no probability exceeds 1
  3. $P\bigl(X \in \bigcup_i E_i\bigr) = \sum_i P(X \in E_i)$ for any countable disjoint family of sets $\{E_i\}$
The concept of probability function is made more rigorous by defining it as the element of a probability space $(\Omega, \mathcal{F}, P)$, where $\Omega$ is the set of possible outcomes, $\mathcal{F}$ is the set of all subsets $E \subseteq \Omega$ whose probability can be measured, and $P$ is the probability function, or probability measure, that assigns a probability to each of these measurable subsets $E \in \mathcal{F}$.
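A small sketch checking the finite analogues of these axioms for a probability mass function (the helper name check_axioms is arbitrary):

    from fractions import Fraction

    def check_axioms(pmf):
        """Check finite analogues of the Kolmogorov axioms for a probability mass function."""
        probs = list(pmf.values())
        non_negative = all(p >= 0 for p in probs)   # axiom 1: probabilities are non-negative
        bounded = all(p <= 1 for p in probs)        # axiom 2: no probability exceeds 1
        total_is_one = sum(probs) == 1              # total probability of the sample space is 1
        # Additivity on a pair of disjoint events, as a spot check of axiom 3.
        outcomes = list(pmf)
        a, b = {outcomes[0]}, {outcomes[-1]}
        additive = sum(pmf[o] for o in a | b) == sum(pmf[o] for o in a) + sum(pmf[o] for o in b)
        return non_negative and bounded and total_is_one and additive

    die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
    print(check_axioms(die_pmf))  # True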
Probability distributions usually belong to one of two classes.
A discrete probability distribution is applicable to the scenarios where the set of possible outcomes is discrete and the probabilities are encoded by a discrete list of the probabilities of the outcomes; in this case probabilities are described by a probability mass function, and the probability of any event is given by summing the probability mass function over the outcomes in the event.
An absolutely continuous probability distribution is applicable to scenarios where the set of possible outcomes can take on values in a continuous range, such as the temperature on a given day. In the absolutely continuous case, probabilities are described by a probability density function, and the probability distribution is by definition the integral of the probability density function. The normal distribution is a commonly encountered absolutely continuous probability distribution. More complex experiments, such as those involving stochastic processes defined in continuous time, may demand the use of more general probability measures.
A probability distribution whose sample space is one-dimensional is called univariate, while a distribution whose sample space is a vector space of dimension 2 or more is called multivariate. A univariate distribution gives the probabilities of a single random variable taking on various different values; a multivariate distribution gives the probabilities of a random vector – a list of two or more random variables – taking on various combinations of values. Important and commonly encountered univariate probability distributions include the binomial distribution, the hypergeometric distribution, and the normal distribution. A commonly encountered multivariate distribution is the multivariate normal distribution.
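A brief sketch of sampling from a bivariate normal distribution with NumPy, using an arbitrary covariance matrix chosen here for illustration:

    import numpy as np

    rng = np.random.default_rng(0)

    mean = [0.0, 0.0]                      # mean vector of the two components
    cov = [[1.0, 0.8],                     # covariance matrix: unit variances,
           [0.8, 1.0]]                     # correlation 0.8 between the components

    samples = rng.multivariate_normal(mean, cov, size=10_000)   # shape (10000, 2)
    print(np.corrcoef(samples[:, 0], samples[:, 1])[0, 1])      # close to 0.8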
Besides the probability function, the cumulative distribution function, the probability mass function and the probability density function, the moment generating function and the characteristic function also serve to identify a probability distribution, as they uniquely determine an underlying cumulative distribution function.
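For a real-valued random variable $X$, these are defined as

$$M_X(t) = \operatorname{E}\left[e^{tX}\right], \qquad \varphi_X(t) = \operatorname{E}\left[e^{itX}\right],$$

and two random variables with the same characteristic function (or the same moment generating function, when it exists on a neighborhood of zero) have the same cumulative distribution function.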
Figure 2: The probability density function of the normal distribution, also called Gaussian or "bell curve", the most important absolutely continuous random distribution. As notated on the figure, the probabilities of intervals of values correspond to the area under the curve.

Terminology

Some key concepts and terms, widely used in the literature on the topic of probability distributions, are listed below.

Basic terms

  • Random variable: takes values from a sample space; probabilities describe which values and sets of values are more likely to be taken.
  • Event: set of possible values of a random variable that occurs with a certain probability.
  • Probability function or probability measure: describes the probability $P(X \in E)$ that the event $E$ occurs.
  • Cumulative distribution function: function evaluating the probability that $X$ will take a value less than or equal to $x$ for a random variable $X$.
  • Quantile function: the inverse of the cumulative distribution function. Gives $x$ such that, with probability $p$, $X$ will not exceed $x$ (a short sketch follows this list).
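A short sketch using Python's standard-library statistics.NormalDist, whose cdf and inv_cdf methods give the cumulative distribution function and the quantile function of a normal distribution:

    from statistics import NormalDist

    dist = NormalDist(mu=0.0, sigma=1.0)   # standard normal distribution

    p = 0.975
    x = dist.inv_cdf(p)                    # quantile function: x such that P(X <= x) = p
    print(round(x, 4))                     # about 1.96
    print(round(dist.cdf(x), 4))           # back through the CDF: 0.975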

Discrete probability distributions

  • Discrete probability distribution: for many random variables with finitely or countably infinitely many values.
  • Probability mass function: function that gives the probability that a discrete random variable is equal to some value.
  • Frequency distribution: a table that displays the frequency of various outcomes.
  • Relative frequency distribution: a frequency distribution where each value has been divided by the number of outcomes in a sample.
  • Categorical distribution: for discrete random variables with a finite set of values.

Absolutely continuous probability distributions

  • Absolutely continuous probability distribution: for many random variables with uncountably many values.
  • Probability density function or probability density: function whose value at any given sample in the sample space can be interpreted as providing a relative likelihood that the value of the random variable would equal that sample.