struct SuffixTreeNode

Declaration

struct SuffixTreeNode { /* full declaration omitted */ };

Description

A node in a suffix tree which represents a substring or suffix. Each node has either no children or at least two children, with the root being a exception in the empty tree. Children are represented as a map between unsigned integers and nodes. If a node N has a child M on unsigned integer k, then the mapping represented by N is a proper prefix of the mapping represented by M. Note that this, although similar to a trie is somewhat different: each node stores a full substring of the full mapping rather than a single character state. Each internal node contains a pointer to the internal node representing the same string, but with the first character chopped off. This is stored in \p Link. Each leaf node stores the start index of its respective suffix in \p SuffixIdx.

Declared at: llvm/include/llvm/Support/SuffixTree.h:40

Member Variables

public llvm::DenseMap<unsigned int, SuffixTreeNode*> Children
A child existing on an unsigned integer implies that from the mapping represented by the current node, there is a way to reach another mapping by tacking that character on the end of the current string.
public unsigned int StartIdx = EmptyIdx
The start index of this node's substring in the main string.
public unsigned int* EndIdx = nullptr
Every leaf node must have its \p EndIdx incremented at the end of every step in the construction algorithm. To avoid having to update O(N) nodes individually at the end of every step, the end index is stored as a pointer.
public unsigned int SuffixIdx = EmptyIdx
For all other nodes, this is ignored.
Say we add a child to an internal node with associated mapping S. The next insertion must be at the node representing S - its first character. This is given by the way that we iteratively build the tree in Ukkonen's algorithm. The main idea is to look at the suffixes of each prefix in the string, starting with the longest suffix of the prefix, and ending with the shortest. Therefore, if we keep pointers between such nodes, we can move to the next insertion point in O(1) time. If we don't, then we'd have to query from the root, which takes O(N) time. This would make the construction algorithm O(N^2) rather than O(N).
public unsigned int ConcatLen = 0
The length of the string formed by concatenating the edge labels from the root to this node.

Method Overview

Methods

SuffixTreeNode(unsigned int StartIdx,
               unsigned int* EndIdx,
               llvm::SuffixTreeNode* Link)

Declared at: llvm/include/llvm/Support/SuffixTree.h:109

Parameters

unsigned int StartIdx
unsigned int* EndIdx
llvm::SuffixTreeNode* Link

SuffixTreeNode()

Declared at: llvm/include/llvm/Support/SuffixTree.h:112

bool isLeaf() const

Description

Returns true if this node is a leaf.

Declared at: llvm/include/llvm/Support/SuffixTree.h:90

bool isRoot() const

Description

Returns true if this node is the root of its owning \p SuffixTree.

Declared at: llvm/include/llvm/Support/SuffixTree.h:93

size_t size() const

Description

Return the number of elements in the substring associated with this node.

Declared at: llvm/include/llvm/Support/SuffixTree.h:96