Ogden's lemma
In the theory of formal languages, Ogden's lemma is a generalization of the pumping lemma for context-free languages.
Despite Ogden's lemma being a strengthening of the pumping lemma, it is insufficient to fully characterize the class of context-free languages. This is in contrast to the Myhill–Nerode theorem, which unlike the pumping lemma for regular languages is a necessary and sufficient condition for regularity.
Statement
We will use underlines to indicate "marked" positions.Special cases
Ogden's lemma is often stated in the following form, which can be obtained by "forgetting about" the grammar, and concentrating on the language itself:If a language is context-free, then there exists some number such that for any string of length at least in and every way of "marking" or more of the positions in, can be written as
with strings and, such that
- has at least one marked position,
- has at most marked positions, and
- for all.
Example applications
Non-context-freeness
The special case of Ogden's lemma is often sufficient to prove some languages are not context-free. For example, is a standard example of non-context-free language,Similarly, one can prove the "copy twice" language is not context-free, by using Ogden's lemma on.
And the given example last section is not context-free by using Ogden's lemma on.
Inherent ambiguity
Ogden's lemma can be used to prove the inherent ambiguity of some languages, which is implied by the title of Ogden's paper.Example: Let. The language is inherently ambiguous.
Similarly, is inherently ambiguous, and for any CFG of the language, letting be the constant for Ogden's lemma, we find that has at least different parses. Thus has an unbounded degree of inherent ambiguity.
Undecidability
The proof can be extended to show that deciding whether a CFG is inherently ambiguous is undecidable, by reduction to the Post correspondence problem. It can also show that deciding whether a CFG has an unbounded degree of inherent ambiguity is undecidable.Generalized condition
Bader and Moura have generalized the lemma to allow marking some positions that are not to be included in. Their dependence of the parameters was later improved by Dömösi and Kudlek. If we denote the number of such excluded positions by, then the number of marked positions must satisfy, where is some constant that depends only on the language. The statement becomes that every can be written aswith strings and, such that
- has at least one marked position and no excluded position,
- Let be the number of marked positions and the number of excluded positions in ; then.
- for all.