Straight-line grammar
A straight-line grammar is a formal grammar that generates exactly one string. Consequently, it does not branch nor loop.
Areas of usefulness
Straight-line grammars are widely used in the development of algorithms that execute directly on compressed structures.SLGs are of interest in fields like Kolmogorov complexity, Lossless data compression, Structure discovery and Compressed data structures.
The problem of finding a context-free grammar of minimal size that generates a given string is called the smallest grammar problem.
Straight-line grammars can be generalized to Straight-line context-free tree grammars.
The latter can be used conveniently to compress trees.
Formal definition
A context-free grammar G is an SLG if:1. for every non-terminal N, there is at most one production rule that has N as its left-hand side, and
2. the directed graph G=<V,''E>, defined by V'' being the set of non-terminals and ∈ E whenever B appears at the right-hand side of a production rule for A, is acyclic.
A mathematical definition of the more general formalism of straight-line context-free tree grammars can be found in Lohrey et al.
An SLG in Chomsky normal form is equivalent to a straight-line program.
A list of algorithms using SLGs
- The Sequitur algorithm constructs a straight-line grammar for a given string.
- The Lempel-Ziv-Welch algorithm creates a context-free grammar in such a deterministic way that it is necessary to store only the start rule of the generated grammar.
- Byte pair encoding
- The Grammatical-Ziv-Lempel algorithm (GLZA),, which creates a low entropy context-free grammar using a recursive, relatively greedy, grammar rule creation method. It often achieves Pareto Frontier decompression speed vs size results for text files.