Generalized suffix tree
In computer science, a generalized suffix tree is a suffix tree for a set of strings. Given the set of strings of total length, it is a Patricia tree containing all suffixes of the strings. It is mostly used in bioinformatics.
Functionality
It can be built in time and space, and can be used to find all occurrences of a string of length in time, which is asymptotically optimal.When constructing such a tree, each string should be padded with a unique out-of-alphabet marker symbol to ensure no suffix is a substring of another, guaranteeing each suffix is represented by a unique leaf node.
Algorithms for constructing a GST include Ukkonen's algorithm and McCreight's algorithm.
Example
A suffix tree for the stringsABAB and BABA is shown in a figure above. They are padded with the unique terminator strings $0 and $1. The numbers in the leaf nodes are string number and starting position. Notice how a left to right traversal of the leaf nodes corresponds to the sorted order of the suffixes. The terminators might be strings or unique single symbols. Edges on $ from the root are left out in this example.