Noncontracting grammar


In formal language theory, a noncontracting grammar is a type of formal grammar whose production rules never decrease the total length of a string during derivation. This means that when applying any rule to transform one string into another, the resulting string must have at least as many symbols as the original.
Noncontracting grammars are significant because they are equivalent in expressive power to context-sensitive grammars: apart from the treatment of the empty string, both define the same class of languages in the Chomsky hierarchy. They can describe languages that no context-free grammar generates, while the length condition keeps the membership problem decidable, since a derivation of a word never passes through a string longer than the word itself. Some authors use the term context-sensitive grammar to refer to noncontracting grammars in general, though this usage varies in the literature.
A closely related concept is the essentially noncontracting grammar, which allows one special exception: a rule that produces the empty string from the start symbol, provided that the start symbol never appears elsewhere in the grammar.

Formal definitions

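A grammar is noncontracting if for all of its production rules α → β (where α and β are strings of nonterminal and terminal symbols), it holds that |α| ≤ |β|, that is, β has at least as many symbols as α.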
A grammar is essentially noncontracting if there may be one exception, namely, a rule
S → ε
where S is the start symbol and ε the empty string, and furthermore, S never occurs in the right-hand side of any rule.
A context-sensitive grammar is a noncontracting grammar in which all rules are of the form αAβ → αγβ, where A is a nonterminal, and γ is a nonempty string of nonterminal and/or terminal symbols. However, some authors use the term context-sensitive grammar to refer to noncontracting grammars in general.
A noncontracting grammar in which |α| < |β| for all rules is called a growing context-sensitive grammar.
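The definitions above are simple length and shape conditions on the rules, so they can be checked mechanically. The following Python sketch is one possible illustration, not a reference implementation. It assumes a hypothetical representation in which a rule is a pair of tuples of symbol names and the empty tuple stands for ε; the function names are made up for this example. It also assumes that a grammar without the rule S → ε counts as essentially noncontracting whenever it is noncontracting.

```python
def is_noncontracting(rules):
    """|alpha| <= |beta| for every rule alpha -> beta."""
    return all(len(lhs) <= len(rhs) for lhs, rhs in rules)


def is_growing(rules):
    """|alpha| < |beta| for every rule alpha -> beta."""
    return all(len(lhs) < len(rhs) for lhs, rhs in rules)


def is_essentially_noncontracting(rules, start):
    """Noncontracting, except possibly for the single rule S -> epsilon."""
    exception = ((start,), ())
    others = [rule for rule in rules if rule != exception]
    if not is_noncontracting(others):
        return False
    if exception in rules:
        # S -> epsilon is only allowed if S never occurs on a right-hand side.
        return all(start not in rhs for _, rhs in rules)
    return True  # assumption: no exception rule present is also acceptable


def is_context_sensitive_rule(lhs, rhs, nonterminals):
    """True if the rule has the shape alpha A beta -> alpha gamma beta, gamma nonempty."""
    if len(rhs) < len(lhs):
        return False
    for i, symbol in enumerate(lhs):
        k = len(lhs) - i - 1  # length of the right context beta
        if (symbol in nonterminals
                and lhs[:i] == rhs[:i]
                and lhs[i + 1:] == rhs[len(rhs) - k:]):
            return True
    return False


# A small noncontracting grammar, chosen here only for illustration.
EXAMPLE = [
    (("S",), ("a", "b", "c")),
    (("S",), ("a", "S", "B", "c")),
    (("c", "B"), ("B", "c")),
    (("b", "B"), ("b", "b")),
]
```

In this sketch, `EXAMPLE` is noncontracting but not growing, because the rule cB → Bc keeps the length unchanged; that same rule is also the only one that is not of the context-sensitive shape.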

History

Chomsky (1959) introduced the Chomsky hierarchy, in which context-sensitive grammars occur as "type 1" grammars; general noncontracting grammars do not occur.
Chomsky (1963) calls a noncontracting grammar a "type 1 grammar", and a context-sensitive grammar a "type 2 grammar", and by presenting a conversion from the former into the latter, proves the two weakly equivalent.
Kuroda (1964) introduced Kuroda normal form, into which all noncontracting grammars can be converted.

Example

S → abc
S → aSBc
cB → Bc
bB → bb
This grammar, with the start symbol S, generates the language { aⁿbⁿcⁿ : n ≥ 1 }, which is not context-free due to the pumping lemma.
A context-sensitive grammar for the same language is shown below.
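Because no rule shortens a string, a derivation of a word of length n only passes through strings of length at most n, so all generated words up to a given length can be found by exhaustive search. The Python sketch below illustrates this under stated assumptions: it takes the four-rule grammar S → abc, S → aSBc, cB → Bc, bB → bb as the example grammar, writes every symbol as a single character (uppercase for nonterminals, lowercase for terminals), and uses the made-up function name `generated_words`.

```python
from collections import deque

# Assumed example grammar; each rule is a (left-hand side, right-hand side) pair.
RULES = [("S", "abc"), ("S", "aSBc"), ("cB", "Bc"), ("bB", "bb")]


def generated_words(rules, start, max_len):
    """Breadth-first search over all sentential forms of length <= max_len."""
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        if form.islower():  # only terminals left: a generated word
            words.add(form)
        for lhs, rhs in rules:
            pos = form.find(lhs)
            while pos != -1:
                successor = form[:pos] + rhs + form[pos + len(lhs):]
                # Pruning is safe: a longer form can never shrink back.
                if len(successor) <= max_len and successor not in seen:
                    seen.add(successor)
                    queue.append(successor)
                pos = form.find(lhs, pos + 1)
    return sorted(words, key=lambda w: (len(w), w))


print(generated_words(RULES, "S", 9))  # ['abc', 'aabbcc', 'aaabbbccc']
```

The search terminates because there are only finitely many strings of bounded length over a finite alphabet; the same pruning argument would not work for a grammar with contracting rules.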

Expressive power

Every context-sensitive grammar is a noncontracting grammar.
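There are easy procedures for bringing any noncontracting grammar into Kuroda normal form, and for converting any noncontracting grammar in Kuroda normal form into a context-sensitive grammar.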
Hence, these three types of grammar are equal in expressive power, all describing exactly the context-sensitive languages that do not include the empty string; the essentially noncontracting grammars describe exactly the set of context-sensitive languages.

A direct conversion

A direct conversion into context-sensitive grammars, avoiding Kuroda normal form:
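For an arbitrary noncontracting grammar (N, Σ, P, S), construct the context-sensitive grammar (N’, Σ, P’, S) as follows:
  1. For every terminal symbol a ∈ Σ, introduce a new nonterminal symbol [a] ∈ N’, and a new rule ([a] → a) ∈ P’.
  2. In the rules of P, replace every terminal symbol a by its corresponding nonterminal symbol [a]. As a result, all these rules are of the form X1…Xm → Y1…Yn for nonterminals Xi, Yj and m ≤ n.
  3. Replace each rule X1…Xm → Y1…Yn with m > 1 by 2m rules:
       X1 X2 … Xm−1 Xm → Z1 X2 … Xm−1 Xm
       Z1 X2 … Xm−1 Xm → Z1 Z2 … Xm−1 Xm
       ⋮
       Z1 Z2 … Zm−1 Xm → Z1 Z2 … Zm−1 Zm Ym+1 … Yn
       Z1 Z2 … Zm−1 Zm Ym+1 … Yn → Y1 Z2 … Zm−1 Zm Ym+1 … Yn
       Y1 Z2 … Zm−1 Zm Ym+1 … Yn → Y1 Y2 … Zm−1 Zm Ym+1 … Yn
       ⋮
       Y1 Y2 … Ym−1 Zm Ym+1 … Yn → Y1 Y2 … Ym−1 Ym Ym+1 … Yn
     where each Zi ∈ N’ is a new nonterminal not occurring elsewhere. Each of these rules rewrites a single nonterminal and leaves its context unchanged, so it is of the context-sensitive form.
For example, the above noncontracting grammar for { aⁿbⁿcⁿ : n ≥ 1 } leads to the following context-sensitive grammar (with start symbol S) for the same language:
  [a] → a,  [b] → b,  [c] → c    (step 1)
  S → [a] [b] [c]                (from S → abc)
  S → [a] S B [c]                (from S → aSBc)
  [c] B → Z1 B                   (from cB → Bc)
  Z1 B → Z1 Z2
  Z1 Z2 → B Z2
  B Z2 → B [c]
  [b] B → Z3 B                   (from bB → bb)
  Z3 B → Z3 Z4
  Z3 Z4 → [b] Z4
  [b] Z4 → [b] [b]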
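The three steps of the conversion can be sketched in Python as follows. This is an illustration under assumptions rather than a definitive implementation: a rule X1…Xm → Y1…Yn with m > 1 is simulated by first rewriting the Xi into fresh symbols Zi from left to right (the last of these steps also appends Ym+1…Yn), then rewriting the Zi into Yi from left to right. The function name `to_context_sensitive`, the bracketed names such as `[a]` and the fresh names `Z1`, `Z2`, … are hypothetical and are assumed not to clash with symbols already in the grammar.

```python
from itertools import count


def to_context_sensitive(rules, terminals):
    """Convert noncontracting rules (pairs of symbol tuples) to context-sensitive ones."""
    def wrap(symbol):
        # Step 2: a terminal a is replaced by its stand-in nonterminal [a].
        return f"[{symbol}]" if symbol in terminals else symbol

    # Step 1: one rule [a] -> a per terminal symbol.
    result = [((wrap(a),), (a,)) for a in sorted(terminals)]
    fresh = count(1)  # numbering for the fresh nonterminals Z1, Z2, ...

    for lhs, rhs in rules:
        xs = tuple(wrap(s) for s in lhs)
        ys = tuple(wrap(s) for s in rhs)
        m = len(xs)
        if m <= 1:
            result.append((xs, ys))  # already of context-sensitive shape
            continue
        # Step 3: replace the rule by 2m rules using fresh Z symbols.
        zs = tuple(f"Z{next(fresh)}" for _ in range(m))
        tail = ys[m:]  # Y(m+1) ... Yn, possibly empty
        for i in range(m):  # turn X1..Xm into Z1..Zm, left to right
            before = zs[:i] + xs[i:]
            after = zs[:i + 1] + xs[i + 1:] + (tail if i == m - 1 else ())
            result.append((before, after))
        for i in range(m):  # turn Z1..Zm into Y1..Ym, left to right
            before = ys[:i] + zs[i:] + tail
            after = ys[:i + 1] + zs[i + 1:] + tail
            result.append((before, after))
    return result


# Assumed input: the four-rule grammar for a^n b^n c^n.
GRAMMAR = [
    (("S",), ("a", "b", "c")),
    (("S",), ("a", "S", "B", "c")),
    (("c", "B"), ("B", "c")),
    (("b", "B"), ("b", "b")),
]

for left, right in to_context_sensitive(GRAMMAR, {"a", "b", "c"}):
    print(" ".join(left), "→", " ".join(right))
```

For this input the sketch produces 13 rules: three for the terminals, the two rewritten S rules, and four rules each for cB → Bc and bB → bb.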