
System Design Series Part 4: Database Design & Sharding

January 25, 2026 · Wasil Zafar · 40 min read

Master database design patterns and sharding strategies for building scalable systems. Learn SQL vs NoSQL trade-offs, partitioning strategies, replication, and distributed database architectures.

Table of Contents

  1. Database Fundamentals
  2. Sharding Strategies
  3. Replication
  4. Next Steps

Database Fundamentals

Series Navigation: This is Part 4 of the 15-part System Design Series. Review Part 3: Load Balancing & Caching first.

Database design is one of the most critical decisions in system architecture. The choice between SQL and NoSQL, along with proper sharding strategies, directly impacts scalability, performance, and maintainability.

Key Insight: There's no one-size-fits-all database. Understanding your data access patterns is crucial for making the right choice.

SQL vs NoSQL: The Fundamental Choice

Before diving into advanced topics, understand the core differences:

| Aspect | SQL (Relational) | NoSQL (Non-Relational) |
|---|---|---|
| Schema | Fixed schema, enforced structure | Dynamic/flexible schema |
| Scaling | Vertical (scale up) | Horizontal (scale out) |
| Consistency | Strong ACID guarantees | Eventually consistent (BASE) |
| Query Language | SQL (standardized) | Varies by database |
| Joins | Powerful multi-table joins | Limited/no joins |
| Best For | Complex queries, transactions | High volume, flexible data |
| Examples | PostgreSQL, MySQL, SQL Server | MongoDB, Cassandra, DynamoDB |

NoSQL Database Types

Document Stores

Store data as JSON-like documents. Each document can have different structure.

// MongoDB document example
{
    "_id": "user_123",
    "name": "John Doe",
    "email": "john@example.com",
    "orders": [
        {"id": "ord_1", "total": 99.99, "items": 3},
        {"id": "ord_2", "total": 149.50, "items": 2}
    ],
    "preferences": {
        "newsletter": true,
        "notifications": ["email", "sms"]
    }
}

Use cases: Content management, user profiles, product catalogs
Examples: MongoDB, CouchDB, Firestore


Key-Value Stores

Simplest NoSQL type. Store data as key-value pairs with O(1) lookups.

# Redis key-value operations
import redis
r = redis.Redis()

# Simple string
r.set("user:123:name", "John Doe")
r.get("user:123:name")  # b"John Doe"

# Hash (object-like)
r.hset("user:123", mapping={
    "name": "John Doe",
    "email": "john@example.com",
    "login_count": 42
})
r.hgetall("user:123")

Use cases: Caching, session storage, real-time data
Examples: Redis, Memcached, DynamoDB


Wide-Column Stores

Store data in column families. Optimized for queries over large datasets.

-- Cassandra CQL example
CREATE TABLE user_activity (
    user_id UUID,
    timestamp TIMESTAMP,
    action TEXT,
    details MAP<TEXT, TEXT>,
    PRIMARY KEY (user_id, timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);

-- Query last 100 activities for a user
SELECT * FROM user_activity 
WHERE user_id = ? 
LIMIT 100;

Use cases: Time-series data, IoT, logging, analytics
Examples: Cassandra, HBase, ScyllaDB


Graph Databases

Optimized for storing and querying relationships between entities.

// Neo4j Cypher examples
// Find friends of friends who also like "Science Fiction"
MATCH (user:Person {name: "Alice"})-[:FRIEND]->(friend)-[:FRIEND]->(fof)
WHERE (fof)-[:LIKES]->(:Genre {name: "Science Fiction"})
AND NOT (user)-[:FRIEND]->(fof)
RETURN DISTINCT fof.name AS recommendation

// Find shortest path between two users
MATCH path = shortestPath(
    (a:Person {name: "Alice"})-[:FRIEND*]-(b:Person {name: "Bob"})
)
RETURN path

Use cases: Social networks, recommendations, fraud detection
Examples: Neo4j, Amazon Neptune, JanusGraph


ACID Properties

ACID guarantees are fundamental to relational databases and critical for financial systems, inventory management, and any application where data integrity is paramount.

Atomicity

Transactions are "all or nothing." Either all operations succeed, or none do.

-- Bank transfer: Must be atomic
BEGIN TRANSACTION;

-- Debit from source account
UPDATE accounts SET balance = balance - 100 WHERE id = 'account_A';

-- Credit to destination account
UPDATE accounts SET balance = balance + 100 WHERE id = 'account_B';

-- If both succeed, commit; otherwise rollback
COMMIT;  -- or ROLLBACK if any step fails

Without atomicity, a system crash between the two UPDATEs could leave money "lost."

Consistency

Database moves from one valid state to another. All constraints (foreign keys, checks, triggers) are enforced.

-- Consistency enforced by constraints
CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    user_id INT NOT NULL REFERENCES users(id),
    total DECIMAL(10,2) CHECK (total >= 0),
    status VARCHAR(20) CHECK (status IN ('pending', 'shipped', 'delivered'))
);

-- This INSERT will fail if user_id doesn't exist in users table
INSERT INTO orders (user_id, total, status) VALUES (999, 50.00, 'pending');
-- ERROR: violates foreign key constraint

Isolation

Concurrent transactions don't interfere with each other. Different isolation levels offer different guarantees:

| Level | Dirty Read | Non-Repeatable Read | Phantom Read |
|---|---|---|---|
| READ UNCOMMITTED | Yes | Yes | Yes |
| READ COMMITTED | No | Yes | Yes |
| REPEATABLE READ | No | No | Yes |
| SERIALIZABLE | No | No | No |

-- Set isolation level in PostgreSQL
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
-- Operations here see a consistent snapshot
COMMIT;

Durability

Once committed, data survives system crashes. Achieved through:

  • Write-Ahead Logging (WAL): Log changes before applying them
  • Checkpoints: Periodic snapshots to disk
  • Replication: Copy data to multiple servers

-- PostgreSQL synchronous commit (maximum durability)
SET synchronous_commit = on;

-- For slightly faster writes with small durability window
SET synchronous_commit = off;  -- Commits may be lost in crash
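The write-ahead logging idea above can be sketched as a toy append-only log (illustrative only; real WAL formats and recovery protocols are far more involved). Each change is appended and fsynced to the log before the in-memory state is touched, so a crash can be recovered by replaying the log:

```python
import json
import os

class ToyWAL:
    """Minimal write-ahead-log sketch: durable log first, state second."""

    def __init__(self, path):
        self.path = path
        self.state = {}
        self._replay()  # recover state from any prior log

    def set(self, key, value):
        # 1. Append the change to the log and force it to disk...
        with open(self.path, "a") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        # 2. ...only then apply it to the in-memory state.
        self.state[key] = value

    def _replay(self):
        # Reapply every logged change in order to rebuild state.
        if not os.path.exists(self.path):
            return
        with open(self.path) as f:
            for line in f:
                entry = json.loads(line)
                self.state[entry["key"]] = entry["value"]
```

Because the log is flushed before the state changes, a crash between the two steps loses nothing: replay simply reapplies the logged write.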

BASE: The NoSQL Alternative

NoSQL databases often follow BASE properties instead:

  • Basically Available: System guarantees availability (may return stale data)
  • Soft state: State may change over time, even without input
  • Eventually consistent: System will become consistent over time

Sharding Strategies

Sharding (horizontal partitioning) distributes data across multiple database servers. Each shard holds a subset of the data, enabling horizontal scaling.

When to Shard

Consider sharding when:

  • Single database can't handle read/write load
  • Data size exceeds single server storage
  • Geographic distribution required for latency
  • Workload isolation needed (multi-tenant systems)

Warning: Sharding adds significant complexity. Exhaust other options first: query optimization, caching, read replicas, connection pooling.

Sharding Approaches

Hash-Based Sharding

Apply hash function to shard key, then modulo by number of shards.

# Hash-based sharding
import hashlib

class HashShardRouter:
    def __init__(self, num_shards):
        self.num_shards = num_shards
        self.shards = [Database(f"shard_{i}") for i in range(num_shards)]
    
    def get_shard(self, key):
        """Determine which shard holds data for given key"""
        hash_value = int(hashlib.md5(str(key).encode()).hexdigest(), 16)
        shard_index = hash_value % self.num_shards
        return self.shards[shard_index]
    
    def insert(self, user_id, data):
        shard = self.get_shard(user_id)
        shard.insert(data)
    
    def query(self, user_id):
        shard = self.get_shard(user_id)
        return shard.query(user_id)

# Usage
router = HashShardRouter(num_shards=4)
router.insert(user_id=12345, data={"name": "John"})  # Routed by md5(user_id) % 4
router.insert(user_id=67890, data={"name": "Jane"})  # Different keys spread across shards

Pros: Even distribution, simple implementation
Cons: Resharding is expensive (all data must move when adding shards)
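The resharding cost is easy to quantify: with modulo placement, a key stays put only when its hash gives the same remainder under both shard counts, so growing from N to N+1 shards remaps roughly N/(N+1) of all keys. A small simulation using the same md5 routing as above illustrates this:

```python
import hashlib

def shard_for(key, num_shards):
    """Same md5-modulo routing as the HashShardRouter sketch."""
    h = int(hashlib.md5(str(key).encode()).hexdigest(), 16)
    return h % num_shards

keys = range(100_000)
moved = sum(1 for k in keys if shard_for(k, 4) != shard_for(k, 5))
print(f"{moved / len(keys):.0%} of keys move when growing 4 -> 5 shards")
# About 80% of keys change shards (a key stays only if h % 4 == h % 5,
# which holds for 4 of every 20 hash residues), so adding one shard
# forces a near-total data migration.
```

This is why systems that expect to grow often prefer consistent hashing, which moves only about 1/(N+1) of keys per added node.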


Range-Based Sharding

Partition data by value ranges. Good when queries often filter by range.

# Range-based sharding
class RangeShardRouter:
    def __init__(self):
        self.ranges = [
            (0, 1000000, "shard_1"),
            (1000000, 2000000, "shard_2"),
            (2000000, 3000000, "shard_3"),
            (3000000, float('inf'), "shard_4"),
        ]
        self.shards = {name: Database(name) for _, _, name in self.ranges}
    
    def get_shard(self, user_id):
        for start, end, shard_name in self.ranges:
            if start <= user_id < end:
                return self.shards[shard_name]
        raise ValueError(f"No shard for user_id {user_id}")
    
    def range_query(self, start_id, end_id):
        """Query across multiple shards for range"""
        results = []
        for range_start, range_end, shard_name in self.ranges:
            if start_id < range_end and end_id > range_start:
                shard = self.shards[shard_name]
                results.extend(shard.range_query(
                    max(start_id, range_start),
                    min(end_id, range_end)
                ))
        return results

Pros: Efficient range queries, easy to add new ranges
Cons: Hot spots if data isn't evenly distributed (e.g., recent user IDs)


Geographic Sharding

Partition by geographic region. Reduces latency for users in specific regions.

# Geographic sharding
class GeoShardRouter:
    def __init__(self):
        self.region_shards = {
            "us-east": Database("us-east-db"),
            "us-west": Database("us-west-db"),
            "eu-west": Database("eu-west-db"),
            "ap-south": Database("ap-south-db"),
        }
    
    def get_user_region(self, user):
        # Based on user's country or preference
        country_to_region = {
            "US": "us-east",
            "CA": "us-west",
            "UK": "eu-west",
            "DE": "eu-west",
            "IN": "ap-south",
        }
        return country_to_region.get(user.country, "us-east")
    
    def get_shard(self, user):
        region = self.get_user_region(user)
        return self.region_shards[region]

Pros: Lower latency, data residency compliance (GDPR)
Cons: Complex cross-region queries, imbalanced load


Directory-Based Sharding

Use a lookup service to map keys to shards. Most flexible but adds latency.

# Directory-based sharding
import redis

class DirectoryShardRouter:
    def __init__(self):
        # Lookup service (could be Redis, separate database, etc.)
        self.directory = redis.Redis()
        self.shards = {
            "shard_1": Database("shard_1"),
            "shard_2": Database("shard_2"),
            "shard_3": Database("shard_3"),
        }
    
    def get_shard(self, user_id):
        # Look up which shard holds this user
        shard_name = self.directory.get(f"user:{user_id}:shard")
        
        if shard_name is None:
            # New user - assign to shard with least data
            shard_name = self._assign_to_best_shard(user_id)
        else:
            shard_name = shard_name.decode()  # Redis returns bytes
        
        return self.shards[shard_name]
    
    def _assign_to_best_shard(self, user_id):
        # Find shard with smallest size
        shard_sizes = {name: shard.size() for name, shard in self.shards.items()}
        best_shard = min(shard_sizes, key=shard_sizes.get)
        
        # Record mapping
        self.directory.set(f"user:{user_id}:shard", best_shard)
        return best_shard
    
    def migrate_user(self, user_id, new_shard):
        """Move user to different shard"""
        old_shard = self.get_shard(user_id)
        data = old_shard.get_all_data(user_id)
        
        self.shards[new_shard].insert_all(user_id, data)
        old_shard.delete(user_id)
        
        # Update directory
        self.directory.set(f"user:{user_id}:shard", new_shard)

Pros: Flexible, easy to rebalance, supports any sharding logic
Cons: Directory is single point of failure, lookup latency


Choosing Shard Keys

The shard key determines how data is distributed. A good shard key:

  • Has high cardinality: Many unique values for even distribution
  • Avoids hotspots: Doesn't concentrate writes on single shard
  • Matches query patterns: Most queries can target single shard
  • Is immutable: Changing shard key requires moving data

Example: For a social media app, sharding by user_id is often ideal. All user data is co-located, and most queries are user-centric. Sharding by timestamp would be problematic—recent data would concentrate on one shard.

Shard Key Anti-Patterns

| Anti-Pattern | Problem | Better Alternative |
|---|---|---|
| Monotonically increasing ID | All new writes go to last shard | Hash the ID or use compound key |
| Timestamp | Hot shard for current time range | Combine with entity ID |
| Boolean field | Only 2 possible values | Use different field |
| Country code alone | US/CN get most traffic | Compound key with user_id |

Compound Shard Keys

# Compound shard key example
import hashlib

class CompoundShardRouter:
    def __init__(self, num_shards):
        self.num_shards = num_shards
    
    def get_shard(self, tenant_id, user_id):
        """Shard by (tenant_id, user_id) for multi-tenant SaaS"""
        compound_key = f"{tenant_id}:{user_id}"
        # Use a stable hash: Python's built-in hash() is salted per process,
        # so it would route the same key differently across restarts
        hash_value = int(hashlib.md5(compound_key.encode()).hexdigest(), 16)
        return hash_value % self.num_shards

# Multi-tenant: Each tenant's data distributed across shards
# Queries within tenant are efficient
# Cross-tenant queries require scatter-gather
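The scatter-gather pattern mentioned above can be sketched as follows. The shard handles and the `query_fn` callable are assumptions for illustration; the idea is simply to fan the query out to every shard in parallel and merge the partial results:

```python
from concurrent.futures import ThreadPoolExecutor

def scatter_gather(shards, query_fn):
    """Fan a query out to all shards in parallel and merge results.

    shards   -- any list of shard handles (hypothetical interface)
    query_fn -- callable taking a shard and returning a list of rows
    """
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        # map() preserves shard order, which keeps merging deterministic
        partials = pool.map(query_fn, shards)
    merged = []
    for rows in partials:
        merged.extend(rows)
    return merged

# Usage sketch: a cross-tenant query must touch every shard
# results = scatter_gather(router.shards, lambda s: s.query_tenant("acme"))
```

Scatter-gather latency is bounded by the slowest shard, which is why single-shard queries (good shard key choice) matter so much.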

Replication

Replication copies data across multiple servers for high availability and read scaling. Unlike sharding (which divides data), replication creates copies of the same data.

Replication Topologies

Primary-Replica (Master-Slave)

One primary handles all writes; replicas receive copies and serve reads.

# Primary-Replica pattern
class PrimaryReplicaDatabase:
    def __init__(self):
        self.primary = Database("primary")
        self.replicas = [
            Database("replica_1"),
            Database("replica_2"),
            Database("replica_3")
        ]
        self.read_index = 0
    
    def write(self, key, value):
        # All writes go to primary
        self.primary.write(key, value)
        # Primary replicates to replicas asynchronously
    
    def read(self, key, consistency="eventual"):
        if consistency == "strong":
            # Read from primary for latest data
            return self.primary.read(key)
        else:
            # Round-robin across replicas
            replica = self.replicas[self.read_index % len(self.replicas)]
            self.read_index += 1
            return replica.read(key)  # May be slightly stale

Pros: Simple, read scaling, automatic failover
Cons: Single write bottleneck, replication lag


Multi-Primary (Master-Master)

Multiple nodes accept writes. Requires conflict resolution.

# Multi-Primary with conflict resolution
import time

class MultiPrimaryDatabase:
    def __init__(self):
        self.nodes = [Database(f"node_{i}") for i in range(3)]
    
    def write(self, key, value, node_index=0):
        node = self.nodes[node_index]
        
        # Attach vector clock for conflict detection
        timestamp = time.time()
        node_id = node_index
        
        record = {
            "value": value,
            "vector_clock": {node_id: timestamp}
        }
        
        node.write(key, record)
        # Replicate to other nodes asynchronously
        self._replicate(key, record, exclude=node_index)
    
    def _resolve_conflict(self, record1, record2):
        # Last-Write-Wins (simplest strategy)
        # Alternative: merge both values (application-specific)
        ts1 = max(record1["vector_clock"].values())
        ts2 = max(record2["vector_clock"].values())
        return record1 if ts1 > ts2 else record2

Pros: Write scaling, no single point of failure
Cons: Conflict resolution complexity, eventual consistency


Synchronous vs Asynchronous Replication

| Aspect | Synchronous | Asynchronous |
|---|---|---|
| Write Latency | High (wait for replicas) | Low (return immediately) |
| Data Safety | No data loss on primary failure | May lose recent writes |
| Read Consistency | Strong (replicas always current) | Eventual (replication lag) |
| Availability | Lower (replica failure blocks writes) | Higher (replicas independent) |

Hybrid Approach: Many systems use semi-synchronous replication—wait for at least one replica to acknowledge before committing. This balances durability and performance.
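The semi-synchronous hybrid can be sketched as follows (the `primary`/`replica` objects with a `write(key, value)` method are an assumed interface): the call returns once the primary plus at least `min_acks` replicas have acknowledged, while the remaining replicas continue catching up in the background.

```python
import threading

def semi_sync_write(primary, replicas, key, value, min_acks=1):
    """Commit once the primary plus `min_acks` replicas acknowledge."""
    primary.write(key, value)

    acked = threading.Semaphore(0)

    def replicate(replica):
        replica.write(key, value)   # may be slow or lag behind
        acked.release()             # signal one acknowledgement

    for replica in replicas:
        threading.Thread(target=replicate, args=(replica,), daemon=True).start()

    # Block only until the required number of replicas confirm;
    # the rest keep replicating asynchronously.
    for _ in range(min_acks):
        acked.acquire()
```

A real implementation would add timeouts and retries; the point of the sketch is the durability/latency trade-off controlled by `min_acks`.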

Consistency Models

Different consistency models offer different guarantees about when writes become visible to reads:

Strong Consistency

All reads return the most recent write. System appears as single copy.

# Strong consistency (at cost of availability)
# Client always sees latest value

write(x, 1)  # At time T1
# Wait for all replicas to acknowledge
read(x)  # Always returns 1, even from replica

Example: Google Spanner uses synchronized clocks and 2PC for global strong consistency.

Eventual Consistency

If no new updates, all replicas will eventually converge. Reads may return stale data.

# Eventual consistency
write(x, 1)  # At time T1
read(x)  # May return old value for a while
# ...time passes...
read(x)  # Eventually returns 1

Example: Amazon DynamoDB, Cassandra. Good for high availability when stale reads are acceptable.

Read-Your-Writes Consistency

A user always sees their own writes immediately, even if other users see stale data.

# Read-your-writes implementation
import time

class ReadYourWritesDB:
    def __init__(self):
        self.primary = Database("primary")
        self.replicas = [Database(f"replica_{i}") for i in range(3)]
        # Track last write timestamp per user
        self.user_timestamps = {}
    
    def write(self, user_id, key, value):
        timestamp = time.time()
        self.primary.write(key, value, timestamp)
        self.user_timestamps[user_id] = timestamp
    
    def read(self, user_id, key):
        last_write = self.user_timestamps.get(user_id, 0)
        
        for replica in self.replicas:
            if replica.last_sync_time >= last_write:
                return replica.read(key)
        
        # Fallback to primary if replicas too stale
        return self.primary.read(key)

Consistency Spectrum

From strongest to weakest:

  1. Linearizability: Real-time ordering, strictest guarantee
  2. Sequential Consistency: Operations appear in program order
  3. Causal Consistency: Causally related operations ordered
  4. Read-Your-Writes: See your own writes
  5. Monotonic Reads: Never see older values after seeing newer
  6. Eventual Consistency: Will converge, eventually
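Monotonic reads from the list above can be enforced client-side: remember the highest replica version this client has seen and skip replicas that are behind it. A minimal sketch, assuming each replica exposes a `version` counter and a `read()` method (both hypothetical):

```python
class MonotonicReader:
    """Client-side monotonic reads: never accept a replica whose
    version is older than anything this client has already seen."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.max_seen = 0  # highest replica version observed so far

    def read(self, key):
        for replica in self.replicas:
            if replica.version >= self.max_seen:
                self.max_seen = replica.version
                return replica.read(key)
        raise RuntimeError("all replicas are behind this client's view")
```

This guarantees a client never "travels back in time" between reads, even though different clients may still see different states.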

Next Steps
