Tree transducer
In theoretical computer science and formal language theory, a tree transducer is an abstract machine taking as input a tree, and generating output – generally other trees, but models producing words or other structures exist. Roughly speaking, tree transducers extend tree automata in the same way that word transducers extend word automata.
Manipulating tree structures instead of words enable TT to model syntax-directed transformations of formal or natural languages. However, TT are not as well-behaved as their word counterparts in terms of algorithmic complexity, closure properties, etcetera. In particular, most of the main classes are not closed under composition.
The main classes of tree transducers are:
Top-Down Tree Transducers (TOP)
A TOP T is a tuple such that:- is a finite set, the set of states;
- is a finite ranked alphabet, called the input alphabet;
- is a finite ranked alphabet, called the output alphabet;
- is a subset of Q, the set of initial states; and
- is a set of rules of the form, where f is a symbol of Σ, n is the arity of f, q is a state, and u is a tree on Γ and, such pairs being nullary.
Examples of rules and intuitions on semantics
For instance,is a rule – one customarily writes instead of the pair – and its intuitive semantics is that, under the action of q, a tree with f at the root and three children is transformed into
where, recursively, and are replaced, respectively, with the application of on the first child and
with the application of on the third.
Semantics as term rewriting">Term rewriting system">term rewriting
The semantics of each state of the transducer T, and of T itself, is a binary relation between input trees and output trees.A way of defining the semantics formally is to see as a term rewriting system, provided that in the right-hand sides the calls are written in the form, where states q are unary symbols. Then the semantics of a state q is given by
The semantics of T is then defined as the union of the semantics of its initial states:
Determinism and domain
As with tree automata, a TOP is said to be deterministic if no two rules of δ share the same left-hand side, and there is at most one initial state. In that case, the semantics of the DTOP is a partial function from input trees to output trees, as are the semantics of each of the DTOP's states.The domain of a transducer is the domain of its semantics. Likewise, the image of a transducer is the image of its semantics.
Properties of DTOP
- DTOP are not closed under union: this is already the case for deterministic word transducers.
- The domain of a DTOP is a regular tree language. Furthermore, the domain is recognisable by a deterministic top-down tree automaton of size at most exponential in that of the initial DTOP.
- The image of a DTOP is not a regular tree language.
- However, the domain of a DTOP cannot be restricted to a regular tree language. That is to say, given a DTOP T and a language L, one cannot in general build a DTOP such that the semantics of is that of T, restricted to L.
- DTOP are not closed under composition. However this problem can be solved by the addition of a lookahead: a tree automaton, coupled to the transducer, that can perform tests on the domain which the transducer is incapable of.
- The typechecking problem—testing whether the image of a regular tree language is included in another regular tree language—is decidable.
- The equivalence problem—testing whether two DTOP define the same functions—is decidable.