Ergodicity


In mathematics, ergodicity expresses the idea that a point of a moving system, either a dynamical system or a stochastic process, will eventually visit all parts of the space in which the system moves, in a uniform and random sense. This implies that the average behavior of the system can be deduced from the trajectory of a "typical" point. Equivalently, a sufficiently large collection of random samples from a process can represent the average statistical properties of the entire process. Ergodicity is a property of the system; it is a statement that the system cannot be reduced or factored into smaller components. Ergodic theory is the study of systems possessing ergodicity.
Ergodicity occurs in a broad range of systems in physics and in geometry. This can be roughly understood to be due to a common phenomenon: the motion of particles, that is, of geodesics on a hyperbolic manifold, is divergent; when that manifold is compact, that is, of finite size, those orbits return to the same general area, eventually filling the entire space.
Ergodic systems capture the common-sense, everyday notions of randomness, such as the idea that smoke might come to fill all of a room, or that a block of metal might eventually come to have the same temperature throughout, or that flips of a fair coin may come up heads half the time and tails half the time. A stronger concept than ergodicity is that of mixing, which aims to mathematically describe the common-sense notions of mixing, such as mixing drinks or mixing cooking ingredients.
The proper mathematical formulation of ergodicity is founded on the formal definitions of measure theory and dynamical systems, and rather specifically on the notion of a measure-preserving dynamical system. The origins of ergodicity lie in statistical physics, where Ludwig Boltzmann formulated the ergodic hypothesis.

Informal explanation

Ergodicity occurs in broad settings in physics and mathematics. All of these settings are unified by a common mathematical description, that of the measure-preserving dynamical system. Equivalently, ergodicity can be understood in terms of stochastic processes. The two descriptions are one and the same, despite using dramatically different notation and language.

Measure-preserving dynamical systems

The mathematical definition of ergodicity aims to capture ordinary everyday ideas about randomness. This includes ideas about systems that move in such a way as to fill up all of space, such as diffusion and Brownian motion, as well as common-sense notions of mixing, such as mixing paints, drinks, cooking ingredients, industrial process mixing, smoke in a smoke-filled room, the dust in Saturn's rings and so on. To provide a solid mathematical footing, descriptions of ergodic systems begin with the definition of a measure-preserving dynamical system. This is written as $(X, \mathcal{A}, \mu, T)$.
The set $X$ is understood to be the total space to be filled: the mixing bowl, the smoke-filled room, etc. The measure $\mu$ is understood to define the natural volume of the space $X$ and of its subspaces. The collection of subspaces is denoted by $\mathcal{A}$, and the size of any given subset $A \subset X$ is $\mu(A)$; the size is its volume. Naively, one could imagine $\mathcal{A}$ to be the power set of $X$; this doesn't quite work, as not all subsets of a space have a volume. Thus, conventionally, $\mathcal{A}$ consists of the measurable subsets—the subsets that do have a volume. It is conventionally taken to be the collection of Borel sets—the subsets that can be constructed by taking countable intersections, unions and set complements of open sets; these can always be taken to be measurable.
The time evolution of the system is described by a map $T : X \to X$. Given some subset $A \subset X$, its image $T(A)$ will in general be a deformed version of $A$ – it is squashed or stretched, folded or cut into pieces. Mathematical examples include the baker's map and the horseshoe map, both inspired by bread-making. The set $T(A)$ must have the same volume as $A$; the squashing/stretching does not alter the volume of the space, only its distribution. Such a system is "measure-preserving".
A formal difficulty arises when one tries to reconcile the volume of sets with the need to preserve their size under a map. The problem arises because, in general, several different points in the domain of a function can map to the same point in its range; that is, there may be $x \ne y$ with $T(x) = T(y)$. Worse, a single point $x \in X$ has no size. These difficulties can be avoided by working with the inverse map $T^{-1} : \mathcal{A} \to \mathcal{A}$; it will map any given subset $A$ to the parts that were assembled to make it: these parts are $T^{-1}(A) \in \mathcal{A}$. It has the important property of not losing track of where things came from. More strongly, it has the important property that any (measure-preserving) map $X \to X$ is the inverse of some map $\mathcal{A} \to \mathcal{A}$. The proper definition of a volume-preserving map is one for which $\mu(A) = \mu(T^{-1}(A))$, because $T^{-1}(A)$ describes all the pieces-parts that $A$ came from.
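As a concrete sketch (not part of the formal development above), one can take the doubling map $T(x) = 2x \bmod 1$ on the unit interval, with ordinary length as the measure. The preimage of an interval consists of two pieces of half the length, so the condition $\mu(A) = \mu(T^{-1}(A))$ holds even though the forward image of a set can be badly distorted:

```python
# Sketch: measure preservation for the doubling map T(x) = 2x mod 1
# on [0, 1) with Lebesgue measure (length).  A hypothetical example,
# chosen only to illustrate the condition mu(A) = mu(T^{-1}(A)).

def preimage_of_interval(a, b):
    """T^{-1}([a, b)) for the doubling map: two intervals of half the length."""
    return [(a / 2, b / 2), (a / 2 + 0.5, b / 2 + 0.5)]

def length(intervals):
    return sum(b - a for a, b in intervals)

A = (0.2, 0.7)                   # a test set A with mu(A) = 0.5
pre = preimage_of_interval(*A)   # its preimage T^{-1}(A)
print(length([A]), length(pre))  # both are (approximately) 0.5

# Note: the forward image T(A) covers essentially all of [0, 1), so its
# length is 1, not 0.5; it is the preimage, not the forward image, that
# preserves volume.  This is why T^{-1} appears in the definition.
```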
One is now interested in studying the time evolution of the system. If a set $A \in \mathcal{A}$ eventually comes to fill all of $X$ over a long period of time (that is, if the iterates $T^n(A)$ approach all of $X$ for large $n$), the system is said to be ergodic. If every set of positive volume eventually returns to overlap itself, so that $\mu(T^{-n}(A) \cap A) > 0$ for some $n \ge 1$ whenever $\mu(A) > 0$, the system is a conservative system, placed in contrast to a dissipative system, where some subsets wander away, never to be returned to. An example would be water running downhill: once it's run down, it will never come back up again. The lake that forms at the bottom of this river can, however, become well-mixed. The ergodic decomposition theorem states that every conservative system can be decomposed into a family of ergodic components.
Mixing is a stronger statement than ergodicity. Mixing asks for this ergodic property to hold between any two sets $A, B$, and not just between some set $A$ and $X$. That is, given any two sets $A, B \in \mathcal{A}$, a system is said to be (topologically) mixing if there is an integer $N$ such that, for all $A, B$ and all $n > N$, one has that $T^n(A) \cap B \ne \varnothing$. Here, $\cap$ denotes set intersection and $\varnothing$ is the empty set. Other notions of mixing include strong and weak mixing, which describe the notion that the mixed substances intermingle everywhere, in equal proportion. This can be non-trivial, as practical experience of trying to mix sticky, gooey substances shows.
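The mixing condition can be illustrated numerically. In the following sketch (again using the doubling map as an assumed example, not one drawn from the text above), sample points from a very small set spread out under iteration until they meet an arbitrarily chosen target set:

```python
# Sketch: topological mixing for the doubling map T(x) = 2x mod 1
# (an assumed example).  Points starting in a tiny set A = [0, 0.001)
# spread out until they intersect an arbitrary target set B.

import random

def T(x):
    return (2 * x) % 1.0

A = [random.uniform(0.0, 0.001) for _ in range(10_000)]  # sample of the set A
B = (0.42, 0.43)                                          # an arbitrary target set B

points = A
for n in range(1, 21):
    points = [T(x) for x in points]
    hits = sum(1 for x in points if B[0] <= x < B[1])
    if hits:
        print(f"after {n} steps, T^n(A) meets B ({hits} sample points)")
        break
```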

Ergodic processes

The above discussion appeals to a physical sense of a volume. The volume does not have to literally be some portion of 3D space; it can be some abstract volume. This is generally the case in statistical systems, where the volume is given by the probability. The total volume corresponds to probability one. This correspondence works because the axioms of probability theory are identical to those of measure theory; these are the Kolmogorov axioms.
The idea of a volume can be very abstract. Consider, for example, the set of all possible coin-flips: the set of infinite sequences of heads and tails. Assigning the volume of 1 to this space, it is clear that half of all such sequences start with heads, and half start with tails. One can slice up this volume in other ways: one can say "I don't care about the first $n-1$ coin-flips; but I want the $n$'th of them to be heads, and then I don't care about what comes after that". This can be written as the set $(*, \cdots, *, h, *, \cdots)$ where $*$ is "don't care" and $h$ is "heads". The volume of this space is again one-half.
The above is enough to build up a measure-preserving dynamical system, in its entirety. The sets of $h$ or $t$ occurring in the $n$'th place are called cylinder sets. The set of all possible intersections, unions and complements of the cylinder sets then form the Borel set $\mathcal{A}$ defined above. In formal terms, the cylinder sets form the base for a topology on the space $X$ of all possible infinite-length coin-flips. The measure $\mu$ has all of the common-sense properties one might hope for: the measure of a cylinder set with $h$ in the $m$'th position, and $t$ in the $k$'th position is obviously 1/4, and so on. These common-sense properties persist for set-complement and set-union: everything except for $h$ and $t$ in locations $m$ and $k$ obviously has the volume of 3/4. All together, these form the axioms of a sigma-additive measure; measure-preserving dynamical systems always use sigma-additive measures. For coin flips, this measure is called the Bernoulli measure.
For the coin-flip process, the time-evolution operator $T$ is the shift operator that says "throw away the first coin-flip, and keep the rest". Formally, if $(x_1, x_2, x_3, \dots)$ is a sequence of coin-flips, then $T(x_1, x_2, x_3, \dots) = (x_2, x_3, \dots)$. The measure is obviously shift-invariant: as long as we are talking about some set $A$ where the first coin-flip $x_1 = *$ is the "don't care" value, then the volume $\mu(A)$ does not change: $\mu(A) = \mu(T(A))$. In order to avoid talking about the first coin-flip, it is easier to define $T^{-1}$ as inserting a "don't care" value into the first position: $T^{-1}(x_1, x_2, \dots) = (*, x_1, x_2, \dots)$. With this definition, one obviously has that $\mu(T^{-1}(A)) = \mu(A)$ with no constraints on $A$. This is again an example of why $T^{-1}$ is used in the formal definitions.
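The cylinder-set volumes and their invariance under the shift can be checked with a small Monte Carlo simulation. The sketch below is illustrative only; the variable names are invented for the example:

```python
# Sketch: cylinder-set volumes for the Bernoulli (fair-coin) measure,
# estimated by Monte Carlo, and their invariance under the shift T.

import random

n_flips, n_samples = 20, 200_000
samples = [[random.choice("ht") for _ in range(n_flips)] for _ in range(n_samples)]

def measure(predicate, seqs):
    """Fraction of sampled sequences lying in the given cylinder set."""
    return sum(predicate(s) for s in seqs) / len(seqs)

# Cylinder set: heads in position 3 (don't care elsewhere)  -> volume ~ 1/2
print(measure(lambda s: s[2] == "h", samples))
# Cylinder set: heads in position 3 and tails in position 7 -> volume ~ 1/4
print(measure(lambda s: s[2] == "h" and s[6] == "t", samples))

# Shift invariance: drop the first flip from every sample (apply T); the
# probability of any cylinder condition on the remaining flips is unchanged.
shifted = [s[1:] for s in samples]
print(measure(lambda s: s[2] == "h", shifted))   # still ~ 1/2
```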
The above development takes a random process, the Bernoulli process, and converts it to a measure-preserving dynamical system $(X, \mathcal{A}, \mu, T)$. The same conversion can be applied to any stochastic process. Thus, an informal definition of ergodicity is that a sequence is ergodic if it visits all of $X$; such sequences are "typical" for the process. Another is that its statistical properties can be deduced from a single, sufficiently long, random sample of the process, or that any collection of random samples from a process must represent the average statistical properties of the entire process. In the present example, a sequence of coin flips, where half are heads, and half are tails, is a "typical" sequence.
There are several important points to be made about the Bernoulli process. If one writes 0 for tails and 1 for heads, one gets the set of all infinite strings of binary digits. These correspond to the base-two expansion of real numbers. Explicitly, given a sequence $(x_1, x_2, x_3, \dots)$, the corresponding real number is
$$y = \sum_{n=1}^{\infty} \frac{x_n}{2^n}.$$
The statement that the Bernoulli process is ergodic is equivalent to the statement that the corresponding real numbers are uniformly distributed on the unit interval. The set of all such strings can be written in a variety of ways: $\{h,t\}^\infty = \{h,t\}^\omega = \{0,1\}^\omega = 2^\omega = 2^{\mathbb{N}}$. This set is the Cantor set, sometimes called the Cantor space to avoid confusion with the Cantor function.
In the end, these are all "the same thing".
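To see the connection concretely, the following sketch (illustrative, with invented names) maps randomly generated coin flips to real numbers through the base-two expansion above and tallies them into ten bins; uniform distribution means each bin receives roughly one tenth of the samples:

```python
# Sketch: the base-two expansion y = sum_n x_n / 2^n sends fair coin-flip
# sequences to real numbers in [0, 1]; ergodicity of the Bernoulli process
# corresponds to these numbers being uniformly distributed.

import random

def flips_to_real(bits):
    """Map a finite prefix (x_1, x_2, ...) of coin flips to sum x_n / 2^n."""
    return sum(b / 2 ** (n + 1) for n, b in enumerate(bits))

reals = [flips_to_real([random.randint(0, 1) for _ in range(40)])
         for _ in range(100_000)]

# Count how many fall into each tenth of the unit interval: roughly 10% each.
counts = [0] * 10
for y in reals:
    counts[min(int(10 * y), 9)] += 1
print([c / len(reals) for c in counts])
```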
The Cantor set plays key roles in many branches of mathematics. In recreational mathematics, it underpins the period-doubling fractals; in analysis, it appears in a vast variety of theorems. A key one for stochastic processes is the Wold decomposition, which states that any stationary process can be decomposed into a pair of uncorrelated processes, one deterministic, and the other being a moving average process.
The Ornstein isomorphism theorem states that Bernoulli schemes with the same entropy are isomorphic; as a consequence, a large class of stationary stochastic processes, including many mixing Markov chains, is equivalent to a Bernoulli scheme. Other results include that every non-dissipative ergodic system is equivalent to the Markov odometer, sometimes called an "adding machine" because it looks like elementary-school addition, that is, taking a base-N digit sequence, adding one, and propagating the carry bits. The proof of equivalence is very abstract; understanding the result is not: by adding one at each time step, every possible state of the odometer is visited, until it rolls over, and starts again. Likewise, ergodic systems visit each state, uniformly, moving on to the next, until they have all been visited.
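The "adding machine" picture can be made concrete in a few lines. The sketch below (with invented names, and a base and digit count chosen only for illustration) applies the add-one-and-carry step repeatedly and confirms that every state is visited before the odometer rolls over:

```python
# Sketch of the base-N odometer ("adding machine"): add one to the lowest
# digit and propagate the carry.  Repeated application visits every one of
# the N**k states before rolling over to the starting state.

def odometer_step(digits, N):
    """Add one to a little-endian base-N digit sequence, carrying as needed."""
    out = list(digits)
    for i in range(len(out)):
        out[i] += 1
        if out[i] < N:
            return out        # no carry needed; done
        out[i] = 0            # digit rolls over; carry into the next place
    return out                # all digits rolled over: back to (0, 0, ..., 0)

N, k = 3, 4                   # base 3, four digits: 81 possible states
state, seen = [0] * k, set()
for _ in range(N ** k):
    seen.add(tuple(state))
    state = odometer_step(state, N)
print(len(seen), state)       # prints "81 [0, 0, 0, 0]": every state visited
```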
Systems that generate sequences of N letters are studied by means of symbolic dynamics. Important special cases include subshifts of finite type and sofic systems.