
System Design Series Part 11: Real-World Case Studies

January 25, 2026 Wasil Zafar 45 min read

Apply system design principles to real-world case studies. Design URL shorteners, chat systems, social media feeds, video streaming platforms, and learn from production architectures.

Table of Contents

  1. URL Shortener
  2. Chat System
  3. Social Media Feed
  4. Video Streaming
  5. Ride-Sharing
  6. E-commerce Platform
  7. Search Engine
  8. Series Summary

URL Shortener (like bit.ly)

Series Navigation: This is Part 11 of the 15-part System Design Series. Review Part 10: Monitoring & Observability first.

A URL shortener is a classic system design interview question. It tests your understanding of hashing, databases, caching, and handling high read/write ratios.

Key Insight: URL shorteners have a high read-to-write ratio (100:1 or more). Design for read-heavy workloads with aggressive caching.

Functional Requirements

  • Given a long URL, return a short URL
  • Given a short URL, redirect to original URL
  • Custom short URLs (optional)
  • Analytics: click counts, geographic data
  • URL expiration (optional)

Scale Estimation

# Back-of-envelope estimation
# Assumptions:
# - 100M new URLs per month
# - 100:1 read/write ratio (10B redirects/month)

# Writes: 100M / (30 * 24 * 3600) ≈ 40 URLs/second
# Reads: 40 * 100 = 4000 redirects/second (peak: 40,000)

# Storage (5 years):
# - 100M * 12 * 5 = 6B URLs
# - Each URL: ~500 bytes (original URL + metadata)
# - Total: 6B * 500 = 3TB

# Short URL length:
# - Base62 (a-z, A-Z, 0-9) = 62 characters
# - 6 characters: 62^6 = 56.8B combinations (enough for 6B URLs)
# - 7 characters: 62^7 = 3.5T combinations (future-proof)
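These back-of-envelope figures are easy to verify in a few lines of Python:

```python
# Back-of-envelope checks for the numbers above
SECONDS_PER_MONTH = 30 * 24 * 3600          # 2,592,000

writes_per_sec = 100_000_000 / SECONDS_PER_MONTH
reads_per_sec = writes_per_sec * 100        # 100:1 read/write ratio

print(round(writes_per_sec))                # 39 (≈ 40 URLs/second)
print(round(reads_per_sec))                 # 3858 (≈ 4000 redirects/second)

# Base62 keyspace at 6 and 7 characters
print(62 ** 6)                              # 56800235584 (≈ 56.8B)
print(62 ** 7)                              # 3521614606208 (≈ 3.5T)
```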

URL Shortener Architecture

Short URL Generation

# Approach 1: Base62 Encoding of Counter
import string
from datetime import datetime

class Base62Encoder:
    CHARS = string.ascii_letters + string.digits  # 62 chars
    
    def encode(self, num):
        if num == 0:
            return self.CHARS[0]
        
        result = []
        while num:
            result.append(self.CHARS[num % 62])
            num //= 62
        return ''.join(reversed(result))
    
    def decode(self, short_code):
        num = 0
        for char in short_code:
            num = num * 62 + self.CHARS.index(char)
        return num

# Use distributed counter service for unique IDs
class URLShortener:
    def __init__(self, counter_service, db, cache):
        self.counter = counter_service
        self.encoder = Base62Encoder()
        self.db = db
        self.cache = cache
    
    def create_short_url(self, long_url, user_id=None):
        # Get unique ID from counter service
        unique_id = self.counter.get_next()
        short_code = self.encoder.encode(unique_id)
        
        # Store mapping
        self.db.insert({
            'short_code': short_code,
            'long_url': long_url,
            'user_id': user_id,
            'created_at': datetime.now(),
            'click_count': 0
        })
        
        # Warm the cache so the first redirects hit the fast path
        self.cache.set(short_code, long_url, ttl=86400)
        
        return f"https://short.url/{short_code}"
    
    def redirect(self, short_code):
        # Check cache first (fast path)
        long_url = self.cache.get(short_code)
        if long_url:
            self.increment_clicks_async(short_code)
            return long_url
        
        # Cache miss - query database
        result = self.db.find_one({'short_code': short_code})
        if not result:
            raise NotFoundError()
        
        # Update cache
        self.cache.set(short_code, result['long_url'], ttl=86400)
        self.increment_clicks_async(short_code)
        
        return result['long_url']

System Architecture

# High-Level Architecture
"""
                    +------------------+
                    |   Load Balancer  |
                    +--------+---------+
                             |
              +--------------+---------------+
              |              |               |
        +-----v-----+  +-----v-----+  +-----v-----+
        | API Server|  | API Server|  | API Server|
        +-----+-----+  +-----+-----+  +-----+-----+
              |              |               |
              +--------------+---------------+
                             |
              +--------------+---------------+
              |              |               |
        +-----v-----+  +-----v-----+  +-----v-----+
        |   Redis   |  |   Redis   |  |   Redis   | (Cache Cluster)
        +-----+-----+  +-----+-----+  +-----+-----+
              |              |               |
              +--------------+---------------+
                             |
              +--------------+---------------+
              |              |               |
        +-----v-----+  +-----v-----+  +-----v-----+
        |  DB Shard |  |  DB Shard |  |  DB Shard | (Sharded by short_code)
        +-----------+  +-----------+  +-----------+
"""

Chat System (like WhatsApp)

Functional Requirements

  • 1:1 messaging with delivery receipts
  • Group chats (up to 500 members)
  • Online/offline status
  • Message history and sync across devices
  • Media sharing (images, videos, files)

Scale Estimation

# WhatsApp-scale estimation
# 2B users, 100B messages/day

# Messages per second: 100B / 86400 ≈ 1.15M messages/sec
# Peak: 3-5M messages/sec

# Storage per day:
# - Average message: 100 bytes
# - 100B * 100 bytes = 10TB/day
# - 7 days retention = 70TB active storage

# Connections:
# - 500M concurrent users
# - Each user maintains a persistent WebSocket connection
# - At ~1M connections per server, that is ~500 connection servers

Chat Architecture

Real-time Messaging

# WebSocket-based Chat Server
import asyncio
import websockets
import json
import time
import redis

class ChatServer:
    def __init__(self):
        self.connections = {}  # user_id -> websocket
        self.redis = redis.Redis()
        self.pubsub = self.redis.pubsub()
    
    async def handle_connection(self, websocket, path):
        user_id = await self.authenticate(websocket)
        self.connections[user_id] = websocket
        
        # Subscribe to user's channel
        await self.subscribe_to_messages(user_id)
        
        try:
            async for message in websocket:
                await self.handle_message(user_id, json.loads(message))
        finally:
            del self.connections[user_id]
    
    async def handle_message(self, sender_id, message):
        msg_type = message['type']
        
        if msg_type == 'send':
            await self.send_message(
                sender_id,
                message['recipient_id'],
                message['content']
            )
        elif msg_type == 'ack':
            await self.acknowledge_message(message['message_id'])
    
    async def send_message(self, sender_id, recipient_id, content):
        # Generate message ID (generate_uuid is an assumed helper)
        msg_id = generate_uuid()
        
        # Store message
        msg = {
            'id': msg_id,
            'sender': sender_id,
            'recipient': recipient_id,
            'content': content,
            'timestamp': time.time(),
            'status': 'sent'
        }
        
        # Persist to database
        await self.db.messages.insert_one(msg)
        
        # Try to deliver in real-time
        if recipient_id in self.connections:
            # Same server - deliver directly
            await self.connections[recipient_id].send(json.dumps(msg))
        else:
            # Different server - publish to Redis
            self.redis.publish(f"user:{recipient_id}", json.dumps(msg))
        
        # Send delivery receipt to sender
        await self.connections[sender_id].send(json.dumps({
            'type': 'sent',
            'message_id': msg_id
        }))

Message Delivery Flow

# Message States: sent → delivered → read

"""
Sender                Server              Recipient
  |                     |                     |
  |--- Send Message --->|                     |
  |                     |-- Store in DB       |
  |<-- Sent Receipt ----|                     |
  |                     |--- Push Message --->|
  |                     |<-- Delivered Ack ---|
  |<-- Delivered Receipt|                     |
  |                     |                     |
  |                     |<---- Read Ack ------|
  |<-- Read Receipt ----|                     |
"""

Social Media Feed (like Twitter)

Functional Requirements

  • Post tweets (280 chars, images, videos)
  • Follow/unfollow users
  • Home timeline (posts from followed users)
  • User timeline (user's own posts)
  • Like, retweet, reply
  • Notifications

The Fan-out Problem

# Two approaches for generating timelines:

# 1. Fan-out on Write (Push Model)
# When user posts, push to all followers' timelines
# Pros: Fast reads (timeline pre-computed)
# Cons: Slow writes for users with millions of followers

# 2. Fan-out on Read (Pull Model)
# When user views timeline, fetch from followed users
# Pros: Fast writes, no wasted work
# Cons: Slow reads (must query many users)

# Hybrid Approach (Twitter's solution):
# - Regular users: Fan-out on write
# - Celebrities (>10K followers): Fan-out on read
# - Merge results at read time

Feed Architecture

Timeline Generation

# Hybrid Fan-out Implementation
# (generate_id is an assumed unique-ID helper)
import time

class TimelineService:
    def __init__(self, redis, db, celebrity_threshold=10000):
        self.redis = redis
        self.db = db
        self.celebrity_threshold = celebrity_threshold
    
    def post_tweet(self, user_id, content):
        tweet = {
            'id': generate_id(),
            'user_id': user_id,
            'content': content,
            'timestamp': time.time()
        }
        
        # Store tweet
        self.db.tweets.insert_one(tweet)
        
        # Check if user is celebrity
        follower_count = self.get_follower_count(user_id)
        
        if follower_count < self.celebrity_threshold:
            # Fan-out on write for regular users
            self.fan_out_tweet(user_id, tweet)
        else:
            # Celebrities: tweet stays in their timeline only
            # Followers will pull on read
            pass
        
        return tweet
    
    def fan_out_tweet(self, user_id, tweet):
        """Push tweet to all followers' timelines"""
        followers = self.get_followers(user_id)
        
        # Use Redis pipeline for efficiency
        pipe = self.redis.pipeline()
        for follower_id in followers:
            # Add to follower's timeline (sorted set by timestamp)
            pipe.zadd(
                f"timeline:{follower_id}",
                {tweet['id']: tweet['timestamp']}
            )
            # Trim timeline to the most recent 800 tweets
            pipe.zremrangebyrank(f"timeline:{follower_id}", 0, -801)
        pipe.execute()
    
    def get_timeline(self, user_id, count=50):
        # Get pre-computed timeline
        tweet_ids = self.redis.zrevrange(
            f"timeline:{user_id}", 0, count - 1
        )
        
        # Get celebrity tweets (fan-out on read)
        celebrity_tweets = self.get_celebrity_tweets(user_id)
        
        # Merge and sort
        all_ids = list(tweet_ids) + [t['id'] for t in celebrity_tweets]
        
        # Fetch full tweet data
        tweets = self.db.tweets.find({'id': {'$in': all_ids}})
        
        return sorted(tweets, key=lambda t: t['timestamp'], reverse=True)[:count]

Video Streaming (like Netflix)

Key Challenges

  • Content Delivery: Serve terabytes of video globally
  • Adaptive Streaming: Adjust quality based on bandwidth
  • Encoding: Multiple resolutions and codecs
  • Recommendations: Personalized content suggestions

Streaming Architecture

Video Processing Pipeline

# Video Processing Pipeline
"""
Upload → Validation → Transcoding → Packaging → CDN Distribution

1. Upload: Original video to object storage
2. Validation: Format check, virus scan, content moderation
3. Transcoding: Multiple resolutions (4K, 1080p, 720p, 480p, 360p)
4. Packaging: HLS/DASH adaptive streaming formats
5. CDN: Distribute to edge locations worldwide
"""

# HLS Adaptive Streaming
# Master playlist points to variant playlists
"""
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=1280000,RESOLUTION=720x480
480p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2560000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=7680000,RESOLUTION=1920x1080
1080p/playlist.m3u8
"""

# Client adaptively switches quality based on bandwidth
class AdaptivePlayer:
    def __init__(self, master_playlist_url):
        self.variants = self.parse_master_playlist(master_playlist_url)
        self.current_bandwidth = self.measure_bandwidth()
    
    def select_quality(self):
        # Choose the highest quality whose bandwidth fits within 80% of
        # measured throughput (the headroom absorbs fluctuations)
        for variant in sorted(self.variants, key=lambda v: -v['bandwidth']):
            if variant['bandwidth'] < self.current_bandwidth * 0.8:
                return variant
        return self.variants[-1]  # Lowest quality fallback
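The `parse_master_playlist` helper assumed above can be sketched as a small scan over the `#EXT-X-STREAM-INF` tags (attribute parsing is simplified; a real player would use a full HLS parser):

```python
import re

def parse_master_playlist(text):
    """Extract (bandwidth, resolution, uri) variants from an HLS master playlist."""
    variants = []
    lines = text.strip().splitlines()
    for i, line in enumerate(lines):
        if line.startswith("#EXT-X-STREAM-INF:"):
            attrs = line.split(":", 1)[1]
            bw = re.search(r"BANDWIDTH=(\d+)", attrs)
            res = re.search(r"RESOLUTION=(\d+x\d+)", attrs)
            variants.append({
                "bandwidth": int(bw.group(1)) if bw else 0,
                "resolution": res.group(1) if res else None,
                "uri": lines[i + 1],  # the variant URI follows its STREAM-INF tag
            })
    return variants
```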

Ride-Sharing (like Uber)

Key Challenges

  • Real-time Location: Track millions of drivers continuously
  • Matching: Connect riders with nearby drivers efficiently
  • ETA Calculation: Real-time traffic and route optimization
  • Surge Pricing: Dynamic pricing based on demand

Location Service

# Geospatial Indexing for Driver Matching
import time
import redis

class LocationService:
    def __init__(self, redis_client):
        # Assumes redis.Redis(decode_responses=True) so hget returns str
        self.redis = redis_client
    
    def update_driver_location(self, driver_id, lat, lng):
        """Update driver position in geospatial index"""
        # Redis GEO commands use sorted sets
        self.redis.geoadd("drivers:active", (lng, lat, driver_id))
        
        # Store additional driver metadata
        self.redis.hset(f"driver:{driver_id}", mapping={
            "lat": lat,
            "lng": lng,
            "updated_at": time.time(),
            "status": "available"
        })
    
    def find_nearby_drivers(self, lat, lng, radius_km=5, limit=10):
        """Find available drivers within radius"""
        # Query drivers within radius
        nearby = self.redis.georadius(
            "drivers:active",
            lng, lat,
            radius_km, unit="km",
            withdist=True,
            sort="ASC",
            count=limit * 2  # Get extra for filtering
        )
        
        # Filter by availability
        available = []
        for driver_id, distance in nearby:
            status = self.redis.hget(f"driver:{driver_id}", "status")
            if status == "available":
                available.append({
                    "driver_id": driver_id,
                    "distance_km": distance
                })
                if len(available) >= limit:
                    break
        
        return available
    
    def match_ride(self, rider_lat, rider_lng, destination_lat, destination_lng):
        """Match rider with optimal driver"""
        nearby_drivers = self.find_nearby_drivers(rider_lat, rider_lng)
        
        if not nearby_drivers:
            return None  # No drivers available
        
        # Select closest available driver
        driver = nearby_drivers[0]
        
        # Mark driver as assigned
        self.redis.hset(f"driver:{driver['driver_id']}", "status", "assigned")
        
        return driver
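Surge pricing, listed in the challenges above, typically maps a per-cell demand/supply ratio to a capped price multiplier. A simplified sketch (the linear ramp and the 3x cap are illustrative values, not Uber's actual formula):

```python
def surge_multiplier(open_requests, available_drivers, cap=3.0):
    """Price multiplier for one geographic cell based on demand/supply ratio."""
    if available_drivers == 0:
        return cap  # no supply at all: charge the capped maximum
    ratio = open_requests / available_drivers
    if ratio <= 1.0:
        return 1.0  # enough drivers, no surge
    # Linear ramp above a 1:1 ratio, clamped to the cap
    return min(cap, 1.0 + 0.5 * (ratio - 1.0))
```

Production systems compute this per geo cell (e.g. per geohash or H3 hexagon) on a sliding window so the multiplier tracks local demand in near real time.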

E-commerce Platform (like Amazon)

Key Insight: E-commerce platforms must handle massive read traffic (browsing), complex inventory management, and maintain data consistency during high-volume sales events.

Functional Requirements

  • Product catalog with search and filtering
  • Shopping cart with real-time inventory checks
  • Order placement with payment processing
  • Order tracking and delivery status
  • User reviews and recommendations

Non-Functional Requirements

  • Availability: 99.99% (minutes of downtime = millions in lost sales)
  • Scale: Handle 10x traffic during flash sales
  • Consistency: Inventory counts must be accurate (no overselling)
  • Latency: Sub-200ms for search, 500ms for checkout

E-commerce Architecture

Key Components

  • Product Service: Catalog, search (Elasticsearch), CDN for images
  • Cart Service: Redis for session, real-time inventory reservation
  • Order Service: Saga pattern for distributed transactions
  • Payment Service: Idempotent APIs, retry with exponential backoff
  • Inventory Service: Eventual consistency with reservation system
  • Recommendation Engine: Collaborative filtering, ML models
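The idempotent payment APIs mentioned above are commonly built around a client-supplied idempotency key: retrying a request with the same key replays the stored result instead of charging twice. A minimal in-memory sketch (production systems persist the key-to-result map; the `gateway` interface is an assumption):

```python
class PaymentService:
    """Deduplicates charges by idempotency key (in-memory store for illustration)."""

    def __init__(self, gateway):
        self.gateway = gateway   # assumed: gateway.charge(amount) -> transaction id
        self._results = {}       # idempotency_key -> stored result

    def charge(self, idempotency_key, amount):
        if idempotency_key in self._results:
            return self._results[idempotency_key]  # replay: no double charge
        result = self.gateway.charge(amount)
        self._results[idempotency_key] = result
        return result
```

The client generates the key once per logical purchase and reuses it on every retry, so network timeouts and duplicate clicks cannot produce duplicate charges.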

Handling Flash Sales

  • Rate limiting: Queue requests, serve in batches
  • Inventory reservation: Optimistic locking with TTL
  • Circuit breakers: Prevent cascade failures
  • Read replicas: Scale product catalog reads
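The rate-limiting item above is often a token bucket in front of the checkout path. A single-process sketch (in production the bucket usually lives in Redis so all API servers share one limit):

```python
import time

class TokenBucket:
    """Allow `rate` requests/second with bursts up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        # Refill tokens for the time elapsed since the last call
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```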

Inventory Reservation

# Flash Sale Inventory Reservation
import uuid

class InventoryService:
    def __init__(self, redis, db):
        self.redis = redis
        self.db = db
    
    def reserve_stock(self, product_id, quantity, user_id, ttl=600):
        """Reserve stock with TTL (prevents cart abandonment lock-up)"""
        reservation_id = f"res:{user_id}:{product_id}:{uuid.uuid4()}"
        
        # Lua script for atomic decrement and reservation
        lua_script = """
        local stock = tonumber(redis.call('GET', KEYS[1]) or 0)
        local quantity = tonumber(ARGV[1])
        
        if stock >= quantity then
            redis.call('DECRBY', KEYS[1], quantity)
            redis.call('SETEX', KEYS[2], ARGV[2], ARGV[1])
            return 1
        else
            return 0
        end
        """
        
        result = self.redis.eval(
            lua_script,
            2,  # Number of keys
            f"stock:{product_id}",  # KEYS[1]
            reservation_id,          # KEYS[2]
            quantity,                # ARGV[1]
            ttl                      # ARGV[2]
        )
        
        if result == 1:
            return reservation_id
        else:
            raise InsufficientStockError()
    
    def confirm_reservation(self, reservation_id):
        """Convert reservation to permanent stock reduction"""
        quantity = self.redis.get(reservation_id)
        if quantity:
            self.redis.delete(reservation_id)
            # Persist to database
            self.db.execute(
                "UPDATE inventory SET stock = stock - %s WHERE product_id = %s",
                (quantity, self.extract_product_id(reservation_id))
            )
    
    def release_reservation(self, reservation_id):
        """Release reservation (cart abandoned or payment failed)"""
        quantity = self.redis.get(reservation_id)
        if quantity:
            product_id = self.extract_product_id(reservation_id)
            self.redis.incrby(f"stock:{product_id}", int(quantity))
            self.redis.delete(reservation_id)

Search Engine (like Google)

Key Insight: Web-scale search requires crawling billions of pages, building inverted indexes, and ranking results in milliseconds.

Functional Requirements

  • Crawl and index web pages continuously
  • Return relevant results for any query
  • Autocomplete and spell correction
  • Personalized results based on user history
  • Image, video, and news search
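The autocomplete requirement above is usually served from a prefix trie, in practice with precomputed top-k completions cached at each node. A minimal sketch of the lookup side:

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_word = False

class Autocomplete:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def suggest(self, prefix, limit=5):
        # Walk to the prefix node, then DFS for up to `limit` completions
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        results, stack = [], [(node, prefix)]
        while stack and len(results) < limit:
            cur, word = stack.pop()
            if cur.is_word:
                results.append(word)
            for ch in sorted(cur.children, reverse=True):
                stack.append((cur.children[ch], word + ch))
        return results
```

At search-engine scale the trie is sharded by prefix, and each node stores its top suggestions ranked by query popularity rather than walking the subtree at request time.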

Scale Considerations

  • Crawling: Billions of pages, politeness policies (robots.txt)
  • Indexing: Petabytes of data, inverted indexes
  • Query processing: Millions of QPS, sub-500ms latency
  • Freshness: Breaking news indexed in minutes

Search Engine Architecture

Core Components

  • URL Frontier: Priority queue of URLs to crawl
  • Web Crawler: Distributed crawlers with politeness policies
  • Document Store: Raw HTML storage (distributed file system)
  • Indexer: Build inverted index (word → document IDs)
  • Index Server: Sharded index across thousands of machines
  • Query Processor: Parse query, retrieve docs, rank results
  • Ranker: PageRank + ML models for relevance scoring

Inverted Index Structure

// Inverted Index Example
{
  "distributed": [
    {"doc_id": 1, "positions": [5, 42], "tf": 0.02},
    {"doc_id": 7, "positions": [12], "tf": 0.01}
  ],
  "systems": [
    {"doc_id": 1, "positions": [6, 43], "tf": 0.02},
    {"doc_id": 3, "positions": [1, 15, 89], "tf": 0.03}
  ]
}
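An index like the one above can be built in one pass over tokenized documents. A simplified sketch (term frequency here is raw count divided by document length; stemming and stop-word removal are omitted):

```python
from collections import defaultdict

def build_inverted_index(docs):
    """docs: dict of doc_id -> text. Returns word -> list of postings."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        words = text.lower().split()
        positions = defaultdict(list)
        for pos, word in enumerate(words):
            positions[word].append(pos)
        for word, pos_list in positions.items():
            index[word].append({
                "doc_id": doc_id,
                "positions": pos_list,
                "tf": round(len(pos_list) / len(words), 4),
            })
    return dict(index)
```

At web scale the same logic runs as a distributed batch job (classically MapReduce), with the resulting postings lists sharded across the index servers listed above.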

PageRank Algorithm

PageRank scores pages based on incoming links. Pages linked by many high-quality pages rank higher.

# Simplified PageRank
def pagerank(graph, damping=0.85, iterations=100):
    """
    graph: dict of page -> list of outgoing links
    """
    n = len(graph)
    ranks = {page: 1/n for page in graph}
    
    for _ in range(iterations):
        new_ranks = {}
        for page in graph:
            rank_sum = sum(
                ranks[incoming] / len(graph[incoming])
                for incoming in graph
                if page in graph[incoming]
            )
            new_ranks[page] = (1 - damping) / n + damping * rank_sum
        ranks = new_ranks
    
    return ranks

Series Summary

You've completed the case studies! Continue to Part 12 to learn Low-Level Design (LLD) fundamentals including object-oriented design, SOLID principles, and design patterns.

Technology