Persistence barcode


In topological data analysis, a persistence barcode, sometimes shortened to barcode, is an algebraic invariant associated with a filtered chain complex or a persistence module that characterizes the stability of topological features throughout a growing family of spaces. Formally, a persistence barcode consists of a multiset of intervals in the extended real line, where the length of each interval corresponds to the lifetime of a topological feature in a filtration, usually built on a point cloud, a graph, a function, or, more generally, a simplicial complex or a chain complex. Generally, longer intervals in a barcode correspond to more robust features, whereas shorter intervals are more likely to be noise in the data. A persistence barcode is a complete invariant that captures all the topological information in a filtration. In algebraic topology, persistence barcodes were first introduced by Sergey Barannikov in 1994 as the "canonical forms" invariants, consisting of a multiset of line segments with ends on two parallel lines, and later, in geometry processing, by Gunnar Carlsson et al. in 2004.

Definition

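Let $\mathbb{F}$ be a fixed field. Consider a real-valued function $f\colon K \to \mathbb{R}$ on a chain complex $K$ that is compatible with the differential, so that $f(\sigma_i) \leq f(\tau)$ whenever $\partial \tau = \sum_i \sigma_i$ in $K$. Then for every $a \in \mathbb{R}$ the sublevel set $K_a = f^{-1}((-\infty, a])$ is a subcomplex of $K$, and the values of $f$ on the generators in $K$ define a filtration:
$$\emptyset = K_0 \subseteq K_1 \subseteq \cdots \subseteq K_n = K.$$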
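The sublevel-set construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the complex is a small simplicial complex (a filled triangle), simplices are encoded as tuples of vertex labels, and the values in `f` as well as the helper names `faces`, `is_compatible`, `sublevel` and `sublevel_filtration` are hypothetical choices made for this example. The empty complex at the start of the filtration is omitted.

```python
from itertools import combinations

# Hypothetical example: a filled triangle with a value of f on each simplex.
# Values are chosen so that every face has a value no larger than the
# simplex it bounds (compatibility with the boundary operator).
f = {
    ("a",): 0.0, ("b",): 0.0, ("c",): 1.0,
    ("a", "b"): 1.0, ("b", "c"): 2.0, ("a", "c"): 2.0,
    ("a", "b", "c"): 3.0,
}

def faces(simplex):
    """Codimension-one faces of a simplex (none for a vertex)."""
    if len(simplex) == 1:
        return []
    return [tuple(face) for face in combinations(simplex, len(simplex) - 1)]

def is_compatible(f):
    """Check f(sigma) <= f(tau) for every face sigma of every simplex tau."""
    return all(f[s] <= f[t] for t in f for s in faces(t))

def sublevel(f, a):
    """The sublevel set K_a: all simplices whose value is at most a."""
    return {s for s, value in f.items() if value <= a}

def sublevel_filtration(f):
    """Sublevel sets at the distinct values of f, in increasing order."""
    return [sublevel(f, a) for a in sorted(set(f.values()))]

filtration = sublevel_filtration(f)
```

Because `f` is compatible with the boundary, each set in `filtration` is closed under taking faces, and consecutive sets are nested.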
Then, the filtered complexes classification theorem states that for any filtered chain complex over $\mathbb{F}$, there exists a linear transformation that preserves the filtration and brings the filtered complex into so-called canonical form, a canonically defined direct sum of filtered complexes of two types: two-dimensional complexes with trivial homology, $d(e_{a_j}) = e_{a_i}$, and one-dimensional complexes with trivial differential, $d(e_{a'_i}) = 0$. The multiset $\mathcal{B}_f$ of the intervals $[a_i, a_j)$ or $[a'_i, \infty)$ describing the canonical form is called the barcode, and it is a complete invariant of the filtered chain complex.
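In practice, a barcode of a small filtered simplicial complex can be computed by the standard column reduction of the boundary matrix. The sketch below is an illustration of that common algorithm, not the construction used in the proof of the classification theorem. It assumes coefficients in the two-element field, simplices given as tuples of vertex labels and listed so that faces precede cofaces with non-decreasing values; the function name `barcode` and the example data are hypothetical.

```python
def barcode(simplices, values):
    """Barcode of a filtered simplicial complex over the field Z/2.

    simplices: list of vertex tuples, faces before cofaces.
    values:    non-decreasing filtration value of each simplex.
    Returns sorted (dimension, birth, death) triples; death is inf
    for features that never die.
    """
    index = {s: i for i, s in enumerate(simplices)}
    # Column j holds the indices of the codimension-one faces of simplex j.
    columns = []
    for s in simplices:
        if len(s) == 1:
            columns.append(set())
        else:
            columns.append({index[s[:k] + s[k + 1:]] for k in range(len(s))})

    pivot_owner = {}  # lowest nonzero row index -> column owning that pivot
    paired = set()
    bars = []
    for j, col in enumerate(columns):
        # Add earlier reduced columns (mod 2) until the pivot is new or col is zero.
        while col and max(col) in pivot_owner:
            col ^= columns[pivot_owner[max(col)]]
        if col:
            i = max(col)
            pivot_owner[i] = j
            paired.update((i, j))
            if values[i] < values[j]:  # skip empty intervals
                bars.append((len(simplices[i]) - 1, values[i], values[j]))
    # Unpaired simplices give intervals of the form [birth, inf).
    for i, s in enumerate(simplices):
        if i not in paired:
            bars.append((len(s) - 1, values[i], float("inf")))
    return sorted(bars)

# Hypothetical example: a filled triangle entering in four steps.
simplices = [("a",), ("b",), ("c",), ("a", "b"),
             ("b", "c"), ("a", "c"), ("a", "b", "c")]
values = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0]
bars = barcode(simplices, values)
# bars == [(0, 0.0, 1.0), (0, 0.0, inf), (0, 1.0, 2.0), (1, 2.0, 3.0)]
```

Here each pair (i, j) plays the role of a two-dimensional summand with interval [values[i], values[j]), and each unpaired simplex plays the role of a one-dimensional summand with an infinite interval.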
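The concept of a persistence module is intimately linked to the notion of a filtered chain complex. A persistence module $M$ indexed over $\mathbb{R}$ consists of a family of $\mathbb{F}$-vector spaces $\{M_t\}_{t \in \mathbb{R}}$ and linear maps $\varphi_{s,t}\colon M_s \to M_t$ for each $s \leq t$ such that $\varphi_{s,t} \circ \varphi_{r,s} = \varphi_{r,t}$ for all $r \leq s \leq t$. This construction is not specific to $\mathbb{R}$; indeed, it works identically with any totally ordered set.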
A persistence module $M$ is said to be of finite type if it contains only finitely many distinct vector spaces $M_t$, each of which is finite-dimensional. The latter condition, that every $M_t$ is finite-dimensional, is sometimes referred to as pointwise finite-dimensional.
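Let $I$ be an interval in $\mathbb{R}$. Define a persistence module $Q(I)$ via $Q(I)_s = 0$ if $s \notin I$ and $Q(I)_s = \mathbb{F}$ otherwise, where the linear maps are the identity map inside the interval and zero otherwise. The module $Q(I)$ is sometimes referred to as an interval module.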
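An interval module is simple enough to model directly. The following sketch assumes a half-open interval of the form [birth, death) and represents each structure map by its single matrix entry (1 for the identity on the field, 0 for the zero map); the names `interval_module` and `satisfies_composition_law` are hypothetical. It also checks, on a finite sample of indices, the composition law required of a persistence module.

```python
import itertools

def interval_module(birth, death):
    """Return (dim, phi) for the interval module on I = [birth, death).

    dim(t)    is the dimension of the vector space at index t.
    phi(s, t) is the 1x1 matrix entry of the map from index s to index t,
              defined for s <= t: 1 if both lie in I, else 0.
    """
    def dim(t):
        return 1 if birth <= t < death else 0

    def phi(s, t):
        if s > t:
            raise ValueError("structure maps are only defined for s <= t")
        return dim(s) * dim(t)

    return dim, phi

def satisfies_composition_law(phi, sample_points):
    """Check phi(s, t) * phi(r, s) == phi(r, t) for all sampled r <= s <= t."""
    points = sorted(sample_points)
    return all(
        phi(s, t) * phi(r, s) == phi(r, t)
        for r, s, t in itertools.combinations_with_replacement(points, 3)
    )

dim, phi = interval_module(1.0, 3.0)
law_holds = satisfies_composition_law(phi, [0.0, 1.0, 2.0, 3.0, 4.0])
```

The law holds because an interval is convex: if r and t both lie in I and r <= s <= t, then s lies in I as well.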
Then for any $\mathbb{R}$-indexed persistence module $M$ of finite type, there exists a multiset $\mathcal{B}_M$ of intervals such that $M \cong \bigoplus_{I \in \mathcal{B}_M} Q(I)$, where the direct sum of persistence modules is carried out index-wise. The multiset $\mathcal{B}_M$ is called the barcode of $M$, and it is unique up to a reordering of the intervals.
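One consequence of this decomposition is that dimensions and ranks can be read off the barcode by counting intervals: the dimension of the vector space at index t is the number of intervals containing t, and the rank of the map from index s to index t is the number of intervals containing both. A minimal sketch, assuming half-open intervals stored as (birth, death) pairs and using hypothetical helper names:

```python
import math

def contains(interval, t):
    """Membership of t in the half-open interval [birth, death)."""
    birth, death = interval
    return birth <= t < death

def dimension(bars, t):
    """Dimension at index t of the direct sum of interval modules over bars."""
    return sum(contains(I, t) for I in bars)

def rank(bars, s, t):
    """Rank of the structure map from index s to index t, for s <= t."""
    if s > t:
        raise ValueError("need s <= t")
    return sum(contains(I, s) and contains(I, t) for I in bars)

# Hypothetical barcode with one infinite and two finite intervals.
bars = [(0.0, 1.0), (0.0, math.inf), (1.0, 2.0)]
dims = [dimension(bars, t) for t in (0.5, 1.0, 2.5)]  # [2, 2, 1]
r = rank(bars, 0.5, 1.5)                              # 1
```

Only the infinite interval contains both 0.5 and 1.5, so the map between those indices has rank 1 even though both vector spaces are two-dimensional.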
This result was extended to the case of pointwise finite-dimensional persistence modules indexed over an arbitrary totally ordered set by William Crawley-Boevey and Magnus Botnan in 2020, building upon known results from the structure theorem for finitely generated modules over a principal ideal domain (PID), as well as the work of Cary Webb for the case of the integers.