Sturmian sequences can be defined strictly in terms of their combinatoric properties or geometrically as cutting sequences for lines of irrational slope or codings for irrational rotations. They are traditionally taken to be infinite sequences on the alphabet of the two symbols 0 and 1.
Sequences of low complexity
For an infinite sequence of symbolsw, let σ be the complexity function of w; i.e., σ = the number of distinct contiguous subwords in w of length n. Then w is Sturmian if σ = n + 1 for all n.
A set X of binary strings is called balanced if the Hamming weight of elements of X takes at most two distinct values. That is, for any |s|1 = k or |s|1 = k' where |s|1 is the number of 1s in s. Let w be an infinite sequence of 0s and 1s and let denote the set of all length-n subwords of w. The sequence w is Sturmian if is balanced for all n and w is not eventually periodic.
A set S of finite binary words is balanced if for each n the subset Sn of words of length n has the property that the Hamming weight of the words in Sn takes at most two distinct values. A balanced sequence is one for which the set of factors is balanced. A balanced sequence has at most n+1 distinct factors of length n. An aperiodic sequence is one which does not consist of a finite sequence followed by a finite cycle. An aperiodic sequence has at least n + 1 distinct factors of length n. A sequence is Sturmian if and only if it is balanced and aperiodic.
Slope and intercept
A sequence over is a Sturmian word if and only if there exist two real numbers, the slope and the intercept, with irrational, such that for all. Thus a Sturmian word provides a discretization of the straight line with slope and intercept ρ. Without loss of generality, we can always assume, because for any integer k we have All the Sturmian words corresponding to the same slope have the same set of factors; the word corresponding to the intercept is the standard word or characteristic word of slope. Hence, if, the characteristic word is the first difference of the Beatty sequence corresponding to the irrational number. The standard word is also the limit of a sequence of words defined recursively as follows: Let be the continued fraction expansion of, and define
where the product between words is just their concatenation. Every word in the sequence is a prefix of the next ones, so that the sequence itself converges to an infinite word, which is. The infinite sequence of words defined by the above recursion is called the standard sequence for the standard word, and the infinite sequence d = of nonnegative integers, with d1 ≥ 0 and dn > 0, is called its directive sequence. A Sturmian word w over is characteristic if and only if both 0w and 1w are Sturmian.
If s is an infinite sequence word and w is a finite word, let μN denote the number of occurrences of w as a factor in the prefix of s of length N + |w| − 1. If μN has a limit as N→∞, we call this the frequency of w, denoted by μ. For a Sturmian word s, every finite factor has a frequency. The three-gap theorem implies that the factors of fixed length n have at most three distinct frequencies, and if there are three values then one is the sum of the other two.
For words over an alphabet of size kgreater than 2, we define a Sturmian word to be one with complexity function n + k − 1. They can be described in terms of cutting sequences for k-dimensional space. An alternative definition is as words of minimal complexity subject to not being ultimately periodic.