Statistical distance
In statistics, probability theory, and information theory, a statistical distance quantifies the distance between two statistical objects. These objects can be two random variables, two probability distributions, two samples, or an individual sample point and a population or a wider sample of points.
A distance between populations can be interpreted as measuring the distance between two probability distributions; such distances are therefore essentially measures of distance between probability measures. Where statistical distance measures relate to differences between random variables, the variables may be statistically dependent, and hence these distances are not directly related to measures of distance between probability measures. Rather, a measure of distance between random variables may reflect the extent of dependence between them, not just their individual values.
Many statistical distance measures are not metrics, and some are not symmetric. Some types of distance measures, which generalize squared distance, are referred to as divergences.
Terminology
Many terms are used to refer to various notions of distance; these are often confusingly similar, and may be used inconsistently between authors and over time, either loosely or with precise technical meaning. In addition to "distance", similar terms include deviance, deviation, discrepancy, discrimination, and divergence, as well as others such as contrast function and metric. Terms from information theory include cross entropy, relative entropy, discrimination information, and information gain.
Distances as metrics
Metrics
A metric on a set X is a function d : X × X → R+. For all x, y, z in X, this function is required to satisfy the following conditions:
- d(x, y) ≥ 0 (non-negativity)
- d(x, y) = 0 if and only if x = y (identity of indiscernibles)
- d(x, y) = d(y, x) (symmetry)
- d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).
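As an illustration of the four axioms, the sketch below checks them numerically for a familiar metric, the Euclidean distance on the plane, at a few sample points. The choice of metric and points is ours, purely for demonstration; this is not specific to statistical distances.

```python
import math

def d(p, q):
    """Euclidean distance between two points in R^2 (a standard metric)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Arbitrary sample points chosen for the illustration.
points = [(0.0, 0.0), (3.0, 4.0), (1.0, -2.0)]

for x in points:
    for y in points:
        assert d(x, y) >= 0                      # non-negativity
        assert (d(x, y) == 0) == (x == y)        # identity of indiscernibles
        assert d(x, y) == d(y, x)                # symmetry
        for z in points:
            assert d(x, z) <= d(x, y) + d(y, z)  # triangle inequality

print("all four axioms hold on the sample points")
```

Many of the statistical distances listed later fail one or more of these checks, which is what separates proper metrics from divergences and other generalized distances.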
Generalized metrics
Many statistical distances are not full metrics because they fail one or more of the axioms above. Such generalized metrics include pseudometrics, which allow d(x, y) = 0 for distinct x and y; quasimetrics, which drop symmetry; and semimetrics, which drop the triangle inequality.
Statistically close
The total variation distance of two distributions X and Y over a finite domain D is defined as

Δ(X, Y) = (1/2) Σ_{α ∈ D} | Pr[X = α] − Pr[Y = α] |.

We say that two probability ensembles {X_k} and {Y_k} are statistically close if Δ(X_k, Y_k) is a negligible function in k.
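The total variation distance on a finite domain can be computed directly from this definition: half the sum of absolute differences of the two probability mass functions. The sketch below uses exact rational arithmetic; the fair and loaded die distributions are invented for the example.

```python
from fractions import Fraction

def total_variation(p, q):
    """Total variation distance between two PMFs given as dicts
    mapping outcomes to probabilities."""
    support = set(p) | set(q)
    return Fraction(1, 2) * sum(
        abs(p.get(a, Fraction(0)) - q.get(a, Fraction(0)))
        for a in support
    )

# Example distributions: a fair six-sided die and a die loaded toward 6.
fair = {face: Fraction(1, 6) for face in range(1, 7)}
loaded = {face: Fraction(1, 8) for face in range(1, 6)}
loaded[6] = Fraction(3, 8)

print(total_variation(fair, loaded))  # → 5/24
```

Total variation satisfies all four metric axioms, so it appears under the metrics in the Examples section below.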
Examples
Metrics
- Total variation distance
- Hellinger distance
- Lévy–Prokhorov metric
- Wasserstein metric: also known as the Kantorovich metric, or earth mover's distance
- Mahalanobis distance
- Integral probability metrics generalize several metrics or pseudometrics on distributions
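As one concrete example from this list, the Hellinger distance between two discrete distributions is the Euclidean distance between the square roots of their probability vectors, scaled by 1/√2 so that it lies in [0, 1]. The distributions below are made up for the illustration.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two PMFs given as equal-length
    lists of probabilities over the same outcomes."""
    return math.sqrt(sum(
        (math.sqrt(pi) - math.sqrt(qi)) ** 2
        for pi, qi in zip(p, q)
    )) / math.sqrt(2)

p = [0.5, 0.3, 0.2]   # arbitrary example distribution
q = [0.4, 0.4, 0.2]   # arbitrary example distribution

h = hellinger(p, q)
assert 0.0 <= h <= 1.0                      # bounded in [0, 1]
assert hellinger(p, p) == 0.0               # identity
assert hellinger(p, q) == hellinger(q, p)   # symmetry
print(round(h, 4))
```

Unlike the divergences listed next, the Hellinger distance is symmetric and satisfies the triangle inequality, which is why it qualifies as a metric.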
Divergences
- Kullback–Leibler divergence
- Rényi divergence
- Jensen–Shannon divergence
- Ball divergence
- Bhattacharyya distance
- f-divergence: generalizes several distances and divergences
- Discriminability index, specifically the Bayes discriminability index, is a positive-definite symmetric measure of the overlap of two distributions.
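The contrast between divergences and metrics can be seen numerically: the Kullback–Leibler divergence is asymmetric, while the Jensen–Shannon divergence, built from KL via the midpoint mixture, is symmetric. The sketch below uses base-2 logarithms and arbitrary strictly positive example distributions.

```python
import math

def kl(p, q):
    """Kullback–Leibler divergence D(p || q), in bits, for strictly
    positive PMFs given as equal-length lists."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))

def js(p, q):
    """Jensen–Shannon divergence: average KL divergence of p and q
    to their midpoint mixture m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return (kl(p, m) + kl(q, m)) / 2

p = [0.6, 0.3, 0.1]   # arbitrary example distribution
q = [0.2, 0.5, 0.3]   # arbitrary example distribution

print(kl(p, q), kl(q, p))   # the two values differ: KL is not symmetric
print(js(p, q), js(q, p))   # the two values agree: JS is symmetric
```

This asymmetry is exactly why KL divergence is classified as a divergence rather than a metric, whereas the square root of the Jensen–Shannon divergence is in fact a metric.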