Actions

Click a frame to move through the build timeline.

Here's something we want to tell you:

Ukkonen's algorithm

Ukkonen's algorithm constructs a suffix tree in linear time by scanning the string left to right and updating the tree after each new character.

It keeps an active point (active node, active edge, active length), a remainder of suffixes still to add, and uses suffix links to move between related internal nodes efficiently.

The active point is the cursor for the current extension: the active node is where the search starts, the active edge is the next outgoing edge being matched, and the active length is how many characters are already matched on that edge. This compactly represents a position inside the tree without rebuilding paths, allowing the algorithm to continue matching or splitting from the exact in-progress location.

The remainder is the count of suffixes that still need to be added for the current phase. It shrinks as extensions are resolved and grows when a new character is read, letting the algorithm defer work and avoid rescanning the string from scratch.

For each phase, extensions follow three rules:

  • If the next character already exists on the active edge, the algorithm increments active length and stops.
  • If there is no matching edge from the active node, it creates a new leaf.
  • If a mismatch happens on an edge, it splits the edge, creates a new internal node, then adds the new leaf and suffix link.

The result is a compact tree that represents every suffix of the input string, built in linear time, which makes it suitable for long inputs and interactive visualization.

Original paper

typescript implementation

Source: src/core/suffix-tree/suffix-tree-functions.ts


function ukkonenBuildSuffixTree({ text }): void {
  if (text.substring(-1) !== "$") {
    text += "$";
  }
  text = text.toLowerCase();
  const rootNode = new SuffixTreeNode({ fullString: text });
  // -- State Variables --
  let activeNode = rootNode;
  let activeEdgeIndex = -1;
  let activeLength = 0;
  let remainder = 0;
  let leafEnd = { value: -1 };
  // -- Main Construction Loop --
  for (let charToAddIndex = 0; charToAddIndex < text.length; charToAddIndex++) {
    remainder++;
    let lastNewNode = undefined;
    leafEnd.value = charToAddIndex;
    // Insert the current character into the tree 'remainder' times
    while (remainder > 0) {
      if (activeLength === 0) {
        activeEdgeIndex = charToAddIndex;
      }
      const child = activeNode.children.find((child) => text[child.startIndex] === text[activeEdgeIndex]);
      // Case 1: Active edge does not exist from activeNode
      if (!child) {
        // -- Rule 1 --
        // Create a new leaf node
        const newChildNode = new SuffixTreeNode({
          parent: activeNode,
          startIndex: charToAddIndex,
          endIndex: leafEnd
        });
        activeNode.children.push(newChildNode);
        if (lastNewNode) {
          lastNewNode.suffixLink = activeNode;
          lastNewNode = undefined;
        }
      } else {
        // Case 2: Active edge exists
        const childEdgeLength = child.endIndex.value - child.startIndex;
        // -- Walk Down --
        // If active length goes beyond the edge, walk down to the next node
        if (activeLength >= childEdgeLength) {
          activeLength -= childEdgeLength;
          activeEdgeIndex += childEdgeLength;
          activeNode = child;
          // Continue to next iteration of while loop to process from the new active node
          continue;
        }
        const nextCharOnEdge = text[child.startIndex + activeLength];
        // -- Rule 2 --
        // Character is already on the edge (implicit extension)
        if (text[charToAddIndex] === nextCharOnEdge) {
          activeLength++;
          if (lastNewNode) {
            lastNewNode.suffixLink = activeNode;
            lastNewNode = undefined;
          }
          break;
        }
        // -- Rule 3 --
        // Split the edge
        const splitEnd = { value: child.startIndex + activeLength };
        const split = new SuffixTreeNode({
          parent: activeNode,
          startIndex: child.startIndex,
          endIndex: splitEnd
        });
        activeNode.children.push(split);
        const leaf = new SuffixTreeNode({
          parent: split,
          startIndex: charToAddIndex,
          endIndex: leafEnd
        });
        split.children.push(leaf);
        // Adjust original child
        child.startIndex += activeLength;
        child.parent = split;
        // Replace child in activeNode's children list
        activeNode.children.splice(activeNode.children.findIndex((activeNodeChild) => activeNodeChild === child), 1);
        split.children.push(child);
        // Suffix link maintenance
        if (lastNewNode) {
          lastNewNode.suffixLink = split;
        }
        lastNewNode = split;
      }
      remainder--;
      // -- Active Point Traversal --
      if (activeNode === rootNode && activeLength > 0) {
        // Rule: If at root, drop activeLength and shift activeEdgeIndex
        activeLength--;
        activeEdgeIndex = charToAddIndex - remainder + 1;
      } else {
        // Rule: Follow suffix link
        if (activeNode !== (activeNode.suffixLink ?? rootNode)) {
          activeNode = activeNode.suffixLink ?? rootNode;
        }
      }
    }
  }
  leafEnd.value++;
  return rootNode;
}