Back to Technology

Union-Find, Trie, Fenwick Tree & Segment Tree

June 10, 2026 Wasil Zafar 18 min read

A decision guide for four advanced data structures that unlock a whole tier of interview problems: Union-Find for connectivity, Trie for prefix search, Fenwick Tree for mutable prefix sums, and Segment Tree for arbitrary range queries.

Table of Contents

  1. Decision Guide
  2. Union-Find
  3. Trie (Prefix Tree)
  4. Fenwick Tree (BIT)
  5. Segment Tree
  6. Comparison Table
  7. Interview & Practice

Decision Guide: Which Structure to Use?

Quick Selection

Problem PatternUseWhy
Are two nodes connected? Merge groups?Union-FindO(α(n)) per op — near-constant
Autocomplete / prefix matching / word existenceTrieO(L) per op (L = word length) regardless of vocabulary size
Prefix sums with point updatesFenwick Tree (BIT)O(log n) update + query; simpler than segment tree
Range queries with range updates (min, max, sum, GCD...)Segment TreeO(log n) any range query; fully general
Static range min/max (no updates)Sparse TableO(1) query after O(n log n) preprocessing — but immutable
Selection Flowchart
flowchart TD
    Q1{Connectivity or group membership?}
    Q1 -->|Yes| UF[Union-Find]
    Q1 -->|No| Q2{Strings or prefix queries?}
    Q2 -->|Yes| TR[Trie]
    Q2 -->|No| Q3{Range queries on array?}
    Q3 -->|No| Q4[Array or HashMap is sufficient]
    Q3 -->|Yes| Q5{Point updates and range sum only?}
    Q5 -->|Yes| FW[Fenwick Tree - BIT]
    Q5 -->|No| Q6{Range updates or min/max/GCD?}
    Q6 -->|Yes| ST[Segment Tree]
    Q6 -->|No| SP[Sparse Table - static O-1 query]
                            

Union-Find (Disjoint Set Union)

Mental Model

Problem it solves: Efficiently answer "are A and B in the same group?" and "merge the groups containing A and B". Canonical for connectivity problems in undirected graphs.

When NOT to use: Directed graphs (connectivity is not symmetric), need to query which group a node was in at a past time (use persistent DSU), need to actually traverse the group members (use adjacency list + BFS).

Common misconception: "Union-Find finds shortest paths." It does NOT — it only answers yes/no connectivity. For paths, use BFS/DFS or Dijkstra.

Union-Find Implementation

class UnionFind:
    """
    Union-Find with path compression + union by rank.
    All operations: O(alpha(n)) amortized ~ O(1) practical.
    """
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
        self.components = n  # number of distinct groups

    def find(self, x):
        """Path compression: flatten the tree on every find."""
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        """Union by rank: attach smaller tree under larger."""
        px, py = self.find(x), self.find(y)
        if px == py:
            return False  # already connected
        if self.rank[px] < self.rank[py]:
            px, py = py, px
        self.parent[py] = px
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1
        self.components -= 1
        return True  # merged two different groups

    def connected(self, x, y):
        return self.find(x) == self.find(y)


# Number of Islands (LeetCode 200) using Union-Find
def numIslands(grid):
    rows, cols = len(grid), len(grid[0])
    uf = UnionFind(rows * cols)
    water_count = 0

    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == '0':
                water_count += 1
            else:
                for dr, dc in [(0,1),(1,0)]:
                    nr, nc = r+dr, c+dc
                    if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == '1':
                        uf.union(r*cols+c, nr*cols+nc)

    return uf.components - water_count


grid = [['1','1','0'],['0','1','0'],['0','0','1']]
print("Islands:", numIslands(grid))  # 2

Union-Find Applications

Classic Use Cases

  • Kruskal's MST algorithm — add edge if endpoints are not yet connected; uses Union-Find to detect cycles in O(α(n))
  • Dynamic connectivity — network routing, social network friend groups
  • Percolation — does a grid percolate (top-to-bottom connected path of open sites)?
  • Redundant Connection (LeetCode 684), Accounts Merge (LeetCode 721), Satisfiability of Equality Equations (LeetCode 990)

Trie (Prefix Tree)

Mental Model

Problem it solves: Store and search a set of strings where prefix queries are needed. O(L) per operation (L = word length) regardless of how many words are stored.

When NOT to use: Memory is constrained and the alphabet is large (each node can have 26+ children). Use a hash-based variant (TrieMap with dict children) for large alphabets. If you only need exact lookups, a hash set is simpler.

Common misconception: "Trie is always better than a hash set for strings." Hash set O(L) lookup is often faster in practice due to cache effects. Trie wins when prefix queries are needed.

Trie Implementation

class TrieNode:
    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.is_end = False
        self.count = 0       # words that pass through this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """O(L) — insert word into trie."""
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
            node.count += 1
        node.is_end = True

    def search(self, word):
        """O(L) — True if word exists in trie."""
        node = self.root
        for ch in word:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return node.is_end

    def starts_with(self, prefix):
        """O(P) — True if any word starts with prefix."""
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return True

    def count_starts_with(self, prefix):
        """O(P) — count of words with given prefix."""
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return 0
            node = node.children[ch]
        return node.count


# Demo
t = Trie()
for w in ['apple', 'app', 'apply', 'apt', 'banana']:
    t.insert(w)

print(t.search('app'))           # True
print(t.search('ap'))            # False (not a full word)
print(t.starts_with('ap'))       # True
print(t.count_starts_with('app')) # 3 (app, apple, apply)
print(t.count_starts_with('ban')) # 1

Trie Applications

Classic Use Cases

  • Autocomplete — IDE suggestions, search bar prefixes
  • Spell checker — find closest word in trie using DFS with edit distance
  • IP routing — longest prefix match for CIDR blocks (binary trie on IP bits)
  • Word Search II (LeetCode 212) — build trie from word list, DFS the grid; avoids re-checking each word individually
  • Maximum XOR Pair (LeetCode 421) — binary trie, greedily pick opposite bit

Fenwick Tree (Binary Indexed Tree)

Mental Model

Problem it solves: Maintain a mutable array where you frequently need prefix sums. O(log n) update and O(log n) prefix query — much faster than recomputing from scratch (O(n)).

When NOT to use: Range minimum/maximum queries (BIT only supports invertible operations like sum). Need range updates AND range queries (use Segment Tree with lazy propagation).

Common misconception: BIT is 1-indexed. Index 0 is unused. Every tutorial that forgets this causes subtle off-by-one bugs. Always allocate n+1 elements.

Fenwick Tree Implementation

class FenwickTree:
    """
    Binary Indexed Tree (BIT) for prefix sums.
    1-indexed. O(log n) update and query.
    """
    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)  # 1-indexed; index 0 unused

    def update(self, i, delta):
        """Add delta to position i (1-indexed). O(log n)."""
        while i <= self.n:
            self.tree[i] += delta
            i += i & (-i)   # move to parent by adding lowest set bit

    def query(self, i):
        """Prefix sum from 1 to i (inclusive). O(log n)."""
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & (-i)   # move to previous range by removing lowest set bit
        return s

    def range_query(self, l, r):
        """Sum from l to r inclusive. O(log n)."""
        return self.query(r) - self.query(l - 1)


# Count of Smaller Numbers After Self (LeetCode 315)
def countSmaller(nums):
    rank = {v: i+1 for i, v in enumerate(sorted(set(nums)))}
    n = len(rank)
    bit = FenwickTree(n)
    result = []
    for num in reversed(nums):
        r = rank[num]
        result.append(bit.query(r - 1))  # count of elements < num inserted so far
        bit.update(r, 1)
    return result[::-1]


print(countSmaller([5, 2, 6, 1]))  # [2, 1, 1, 0]

Segment Tree

Mental Model

Problem it solves: Any range query (sum, min, max, GCD, count...) with point or range updates. The most general range data structure — if BIT doesn't support your operation, Segment Tree does.

When NOT to use: Only prefix sums needed (BIT is simpler and faster in practice). Static array, no updates (Sparse Table gives O(1) range min/max). Memory is extremely tight (Segment Tree uses 4n space).

Common misconception: Segment Trees are hard to implement. They follow a single recursive pattern — build from leaves, query/update by splitting at midpoint. Once you internalize the pattern, all variants follow.

Segment Tree Implementation

class SegmentTree:
    """
    Segment Tree for range sum queries with point updates.
    Build: O(n), Update: O(log n), Query: O(log n).
    """
    def __init__(self, nums):
        self.n = len(nums)
        self.tree = [0] * (4 * self.n)
        self._build(nums, 0, 0, self.n - 1)

    def _build(self, nums, node, start, end):
        if start == end:
            self.tree[node] = nums[start]
        else:
            mid = (start + end) // 2
            self._build(nums, 2*node+1, start, mid)
            self._build(nums, 2*node+2, mid+1, end)
            self.tree[node] = self.tree[2*node+1] + self.tree[2*node+2]

    def update(self, idx, val, node=0, start=0, end=None):
        """Update index idx to value val. O(log n)."""
        if end is None: end = self.n - 1
        if start == end:
            self.tree[node] = val
        else:
            mid = (start + end) // 2
            if idx <= mid:
                self.update(idx, val, 2*node+1, start, mid)
            else:
                self.update(idx, val, 2*node+2, mid+1, end)
            self.tree[node] = self.tree[2*node+1] + self.tree[2*node+2]

    def query(self, l, r, node=0, start=0, end=None):
        """Range sum [l, r]. O(log n)."""
        if end is None: end = self.n - 1
        if r < start or end < l:
            return 0  # out of range
        if l <= start and end <= r:
            return self.tree[node]  # fully within range
        mid = (start + end) // 2
        left = self.query(l, r, 2*node+1, start, mid)
        right = self.query(l, r, 2*node+2, mid+1, end)
        return left + right


# Demo: Range Sum Query - Mutable (LeetCode 307)
st = SegmentTree([1, 3, 5, 7, 9, 11])
print(st.query(1, 3))   # sum [3,5,7] = 15
st.update(1, 10)        # set index 1 to 10
print(st.query(1, 3))   # sum [10,5,7] = 22

Comprehensive Comparison

StructureBuildUpdateQuerySpaceSupports
Union-FindO(n)O(α(n))O(α(n))O(n)Connectivity, group merge
TrieO(n×L)O(L)O(L)O(n×L)Prefix search, word existence
Fenwick TreeO(n log n)O(log n)O(log n)O(n)Prefix sums (invertible ops)
Segment TreeO(n)O(log n)O(log n)O(4n)Any range query + updates
Sparse TableO(n log n)N/AO(1)O(n log n)Static range min/max only

Interview & Practice

Quick Check

  • Explain why path compression makes Union-Find O(α(n)) — what does the tree look like after repeated finds?
  • Implement a Trie that supports wildcard search (e.g., search("a.c") matches "abc", "aXc").
  • Given array [2,4,5,3,6,1], use Fenwick Tree to answer: how many elements to the right of index i are smaller than arr[i]? (LeetCode 315)

Key LeetCode Problems

  • Union-Find: 684, 721, 990, 547 (Number of Provinces), 200 (Number of Islands)
  • Trie: 208 (Implement Trie), 212 (Word Search II), 421 (Max XOR)
  • Fenwick Tree: 307 (Range Sum Query Mutable), 315 (Count Smaller), 493 (Reverse Pairs)
  • Segment Tree: 307, 715 (Range Module), 732 (My Calendar III)