Decision Guide: Which Structure to Use?
Quick Selection
| Problem Pattern | Use | Why |
|---|---|---|
| Are two nodes connected? Merge groups? | Union-Find | O(α(n)) per op — near-constant |
| Autocomplete / prefix matching / word existence | Trie | O(L) per op (L = word length) regardless of vocabulary size |
| Prefix sums with point updates | Fenwick Tree (BIT) | O(log n) update + query; simpler than segment tree |
| Range queries with range updates (min, max, sum, GCD...) | Segment Tree | O(log n) any range query; fully general |
| Static range min/max (no updates) | Sparse Table | O(1) query after O(n log n) preprocessing — but immutable |
flowchart TD
Q1{Connectivity or group membership?}
Q1 -->|Yes| UF[Union-Find]
Q1 -->|No| Q2{Strings or prefix queries?}
Q2 -->|Yes| TR[Trie]
Q2 -->|No| Q3{Range queries on array?}
Q3 -->|No| Q4[Array or HashMap is sufficient]
Q3 -->|Yes| Q5{Point updates and range sum only?}
Q5 -->|Yes| FW[Fenwick Tree - BIT]
Q5 -->|No| Q6{Range updates or min/max/GCD?}
Q6 -->|Yes| ST[Segment Tree]
Q6 -->|No| SP[Sparse Table - static O-1 query]
Union-Find (Disjoint Set Union)
Mental Model
Problem it solves: Efficiently answer "are A and B in the same group?" and "merge the groups containing A and B". Canonical for connectivity problems in undirected graphs.
When NOT to use: Directed graphs (connectivity is not symmetric), need to query which group a node was in at a past time (use persistent DSU), need to actually traverse the group members (use adjacency list + BFS).
Common misconception: "Union-Find finds shortest paths." It does NOT — it only answers yes/no connectivity. For paths, use BFS/DFS or Dijkstra.
Union-Find Implementation
class UnionFind:
"""
Union-Find with path compression + union by rank.
All operations: O(alpha(n)) amortized ~ O(1) practical.
"""
def __init__(self, n):
self.parent = list(range(n))
self.rank = [0] * n
self.components = n # number of distinct groups
def find(self, x):
"""Path compression: flatten the tree on every find."""
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x]) # path compression
return self.parent[x]
def union(self, x, y):
"""Union by rank: attach smaller tree under larger."""
px, py = self.find(x), self.find(y)
if px == py:
return False # already connected
if self.rank[px] < self.rank[py]:
px, py = py, px
self.parent[py] = px
if self.rank[px] == self.rank[py]:
self.rank[px] += 1
self.components -= 1
return True # merged two different groups
def connected(self, x, y):
return self.find(x) == self.find(y)
# Number of Islands (LeetCode 200) using Union-Find
def numIslands(grid):
rows, cols = len(grid), len(grid[0])
uf = UnionFind(rows * cols)
water_count = 0
for r in range(rows):
for c in range(cols):
if grid[r][c] == '0':
water_count += 1
else:
for dr, dc in [(0,1),(1,0)]:
nr, nc = r+dr, c+dc
if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == '1':
uf.union(r*cols+c, nr*cols+nc)
return uf.components - water_count
grid = [['1','1','0'],['0','1','0'],['0','0','1']]
print("Islands:", numIslands(grid)) # 2
Union-Find Applications
Classic Use Cases
- Kruskal's MST algorithm — add edge if endpoints are not yet connected; uses Union-Find to detect cycles in O(α(n))
- Dynamic connectivity — network routing, social network friend groups
- Percolation — does a grid percolate (top-to-bottom connected path of open sites)?
- Redundant Connection (LeetCode 684), Accounts Merge (LeetCode 721), Satisfiability of Equality Equations (LeetCode 990)
Trie (Prefix Tree)
Mental Model
Problem it solves: Store and search a set of strings where prefix queries are needed. O(L) per operation (L = word length) regardless of how many words are stored.
When NOT to use: Memory is constrained and the alphabet is large (each node can have 26+ children). Use a hash-based variant (TrieMap with dict children) for large alphabets. If you only need exact lookups, a hash set is simpler.
Common misconception: "Trie is always better than a hash set for strings." Hash set O(L) lookup is often faster in practice due to cache effects. Trie wins when prefix queries are needed.
Trie Implementation
class TrieNode:
def __init__(self):
self.children = {} # char -> TrieNode
self.is_end = False
self.count = 0 # words that pass through this node
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word):
"""O(L) — insert word into trie."""
node = self.root
for ch in word:
if ch not in node.children:
node.children[ch] = TrieNode()
node = node.children[ch]
node.count += 1
node.is_end = True
def search(self, word):
"""O(L) — True if word exists in trie."""
node = self.root
for ch in word:
if ch not in node.children:
return False
node = node.children[ch]
return node.is_end
def starts_with(self, prefix):
"""O(P) — True if any word starts with prefix."""
node = self.root
for ch in prefix:
if ch not in node.children:
return False
node = node.children[ch]
return True
def count_starts_with(self, prefix):
"""O(P) — count of words with given prefix."""
node = self.root
for ch in prefix:
if ch not in node.children:
return 0
node = node.children[ch]
return node.count
# Demo
t = Trie()
for w in ['apple', 'app', 'apply', 'apt', 'banana']:
t.insert(w)
print(t.search('app')) # True
print(t.search('ap')) # False (not a full word)
print(t.starts_with('ap')) # True
print(t.count_starts_with('app')) # 3 (app, apple, apply)
print(t.count_starts_with('ban')) # 1
Trie Applications
Classic Use Cases
- Autocomplete — IDE suggestions, search bar prefixes
- Spell checker — find closest word in trie using DFS with edit distance
- IP routing — longest prefix match for CIDR blocks (binary trie on IP bits)
- Word Search II (LeetCode 212) — build trie from word list, DFS the grid; avoids re-checking each word individually
- Maximum XOR Pair (LeetCode 421) — binary trie, greedily pick opposite bit
Fenwick Tree (Binary Indexed Tree)
Mental Model
Problem it solves: Maintain a mutable array where you frequently need prefix sums. O(log n) update and O(log n) prefix query — much faster than recomputing from scratch (O(n)).
When NOT to use: Range minimum/maximum queries (BIT only supports invertible operations like sum). Need range updates AND range queries (use Segment Tree with lazy propagation).
Common misconception: BIT is 1-indexed. Index 0 is unused. Every tutorial that forgets this causes subtle off-by-one bugs. Always allocate n+1 elements.
Fenwick Tree Implementation
class FenwickTree:
"""
Binary Indexed Tree (BIT) for prefix sums.
1-indexed. O(log n) update and query.
"""
def __init__(self, n):
self.n = n
self.tree = [0] * (n + 1) # 1-indexed; index 0 unused
def update(self, i, delta):
"""Add delta to position i (1-indexed). O(log n)."""
while i <= self.n:
self.tree[i] += delta
i += i & (-i) # move to parent by adding lowest set bit
def query(self, i):
"""Prefix sum from 1 to i (inclusive). O(log n)."""
s = 0
while i > 0:
s += self.tree[i]
i -= i & (-i) # move to previous range by removing lowest set bit
return s
def range_query(self, l, r):
"""Sum from l to r inclusive. O(log n)."""
return self.query(r) - self.query(l - 1)
# Count of Smaller Numbers After Self (LeetCode 315)
def countSmaller(nums):
rank = {v: i+1 for i, v in enumerate(sorted(set(nums)))}
n = len(rank)
bit = FenwickTree(n)
result = []
for num in reversed(nums):
r = rank[num]
result.append(bit.query(r - 1)) # count of elements < num inserted so far
bit.update(r, 1)
return result[::-1]
print(countSmaller([5, 2, 6, 1])) # [2, 1, 1, 0]
Segment Tree
Mental Model
Problem it solves: Any range query (sum, min, max, GCD, count...) with point or range updates. The most general range data structure — if BIT doesn't support your operation, Segment Tree does.
When NOT to use: Only prefix sums needed (BIT is simpler and faster in practice). Static array, no updates (Sparse Table gives O(1) range min/max). Memory is extremely tight (Segment Tree uses 4n space).
Common misconception: Segment Trees are hard to implement. They follow a single recursive pattern — build from leaves, query/update by splitting at midpoint. Once you internalize the pattern, all variants follow.
Segment Tree Implementation
class SegmentTree:
"""
Segment Tree for range sum queries with point updates.
Build: O(n), Update: O(log n), Query: O(log n).
"""
def __init__(self, nums):
self.n = len(nums)
self.tree = [0] * (4 * self.n)
self._build(nums, 0, 0, self.n - 1)
def _build(self, nums, node, start, end):
if start == end:
self.tree[node] = nums[start]
else:
mid = (start + end) // 2
self._build(nums, 2*node+1, start, mid)
self._build(nums, 2*node+2, mid+1, end)
self.tree[node] = self.tree[2*node+1] + self.tree[2*node+2]
def update(self, idx, val, node=0, start=0, end=None):
"""Update index idx to value val. O(log n)."""
if end is None: end = self.n - 1
if start == end:
self.tree[node] = val
else:
mid = (start + end) // 2
if idx <= mid:
self.update(idx, val, 2*node+1, start, mid)
else:
self.update(idx, val, 2*node+2, mid+1, end)
self.tree[node] = self.tree[2*node+1] + self.tree[2*node+2]
def query(self, l, r, node=0, start=0, end=None):
"""Range sum [l, r]. O(log n)."""
if end is None: end = self.n - 1
if r < start or end < l:
return 0 # out of range
if l <= start and end <= r:
return self.tree[node] # fully within range
mid = (start + end) // 2
left = self.query(l, r, 2*node+1, start, mid)
right = self.query(l, r, 2*node+2, mid+1, end)
return left + right
# Demo: Range Sum Query - Mutable (LeetCode 307)
st = SegmentTree([1, 3, 5, 7, 9, 11])
print(st.query(1, 3)) # sum [3,5,7] = 15
st.update(1, 10) # set index 1 to 10
print(st.query(1, 3)) # sum [10,5,7] = 22
Comprehensive Comparison
| Structure | Build | Update | Query | Space | Supports |
|---|---|---|---|---|---|
| Union-Find | O(n) | O(α(n)) | O(α(n)) | O(n) | Connectivity, group merge |
| Trie | O(n×L) | O(L) | O(L) | O(n×L) | Prefix search, word existence |
| Fenwick Tree | O(n log n) | O(log n) | O(log n) | O(n) | Prefix sums (invertible ops) |
| Segment Tree | O(n) | O(log n) | O(log n) | O(4n) | Any range query + updates |
| Sparse Table | O(n log n) | N/A | O(1) | O(n log n) | Static range min/max only |
Interview & Practice
Quick Check
- Explain why path compression makes Union-Find O(α(n)) — what does the tree look like after repeated finds?
- Implement a Trie that supports wildcard search (e.g.,
search("a.c")matches "abc", "aXc"). - Given array [2,4,5,3,6,1], use Fenwick Tree to answer: how many elements to the right of index i are smaller than arr[i]? (LeetCode 315)
Key LeetCode Problems
- Union-Find: 684, 721, 990, 547 (Number of Provinces), 200 (Number of Islands)
- Trie: 208 (Implement Trie), 212 (Word Search II), 421 (Max XOR)
- Fenwick Tree: 307 (Range Sum Query Mutable), 315 (Count Smaller), 493 (Reverse Pairs)
- Segment Tree: 307, 715 (Range Module), 732 (My Calendar III)