
System Design Series Part 11: Real-World Case Studies

January 25, 2026 Wasil Zafar 45 min read

Apply system design principles to real-world case studies. Design URL shorteners, chat systems, social media feeds, video streaming platforms, and learn from production architectures.

Table of Contents

  1. URL Shortener
  2. Chat System
  3. Social Media Feed
  4. Video Streaming
  5. Ride-Sharing
  6. E-commerce Platform
  7. Search Engine
  8. Series Summary

URL Shortener (like bit.ly)

Series Navigation: This is Part 11 of the 15-part System Design Series. Review Part 10: Monitoring & Observability first.

A URL shortener is a classic system design interview question. It tests your understanding of hashing, databases, caching, and handling high read/write ratios.

Key Insight: URL shorteners have a high read-to-write ratio (100:1 or more). Design for read-heavy workloads with aggressive caching.

Functional Requirements

  • Given a long URL, return a short URL
  • Given a short URL, redirect to original URL
  • Custom short URLs (optional)
  • Analytics: click counts, geographic data
  • URL expiration (optional)

Scale Estimation

# Back-of-envelope estimation
# Assumptions:
# - 100M new URLs per month
# - 100:1 read/write ratio (10B redirects/month)

# Writes: 100M / (30 * 24 * 3600) ≈ 40 URLs/second
# Reads: 40 * 100 = 4000 redirects/second (peak: 40,000)

# Storage (5 years):
# - 100M * 12 * 5 = 6B URLs
# - Each URL: ~500 bytes (original URL + metadata)
# - Total: 6B * 500 = 3TB

# Short URL length:
# - Base62 (a-z, A-Z, 0-9) = 62 characters
# - 6 characters: 62^6 = 56.8B combinations (enough for 6B URLs)
# - 7 characters: 62^7 = 3.5T combinations (future-proof)
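These back-of-envelope figures are easy to verify in a few lines of Python:

```python
# Back-of-envelope checks for the numbers above
SECONDS_PER_MONTH = 30 * 24 * 3600          # 2,592,000

writes_per_sec = 100_000_000 / SECONDS_PER_MONTH
reads_per_sec = writes_per_sec * 100        # 100:1 read/write ratio

print(round(writes_per_sec))                # 39 (≈ 40 URLs/second)
print(round(reads_per_sec))                 # 3858 (≈ 4000 redirects/second)

# Base62 keyspace at 6 and 7 characters
print(62 ** 6)                              # 56800235584 (≈ 56.8B)
print(62 ** 7)                              # 3521614606208 (≈ 3.5T)
```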

URL Shortener Architecture

Short URL Generation

# Approach 1: Base62 Encoding of Counter
import string
from datetime import datetime

class Base62Encoder:
    CHARS = string.ascii_letters + string.digits  # 62 chars
    
    def encode(self, num):
        if num == 0:
            return self.CHARS[0]
        
        result = []
        while num:
            result.append(self.CHARS[num % 62])
            num //= 62
        return ''.join(reversed(result))
    
    def decode(self, short_code):
        num = 0
        for char in short_code:
            num = num * 62 + self.CHARS.index(char)
        return num

# Use distributed counter service for unique IDs
class URLShortener:
    def __init__(self, counter_service, db, cache):
        self.counter = counter_service
        self.encoder = Base62Encoder()
        self.db = db
        self.cache = cache
    
    def create_short_url(self, long_url, user_id=None):
        # Get unique ID from counter service
        unique_id = self.counter.get_next()
        short_code = self.encoder.encode(unique_id)
        
        # Store mapping
        self.db.insert({
            'short_code': short_code,
            'long_url': long_url,
            'user_id': user_id,
            'created_at': datetime.now(),
            'click_count': 0
        })
        
        # Warm the cache so the first redirects hit the fast path
        self.cache.set(short_code, long_url, ttl=86400)
        
        return f"https://short.url/{short_code}"
    
    def redirect(self, short_code):
        # Check cache first (fast path)
        long_url = self.cache.get(short_code)
        if long_url:
            self.increment_clicks_async(short_code)
            return long_url
        
        # Cache miss - query database
        result = self.db.find_one({'short_code': short_code})
        if not result:
            raise NotFoundError()
        
        # Update cache
        self.cache.set(short_code, result['long_url'], ttl=86400)
        self.increment_clicks_async(short_code)
        
        return result['long_url']

System Architecture

# High-Level Architecture
"""
                    +------------------+
                    |   Load Balancer  |
                    +--------+---------+
                             |
              +--------------+---------------+
              |              |               |
        +-----v-----+  +-----v-----+  +-----v-----+
        | API Server|  | API Server|  | API Server|
        +-----+-----+  +-----+-----+  +-----+-----+
              |              |               |
              +--------------+---------------+
                             |
              +--------------+---------------+
              |              |               |
        +-----v-----+  +-----v-----+  +-----v-----+
        |   Redis   |  |   Redis   |  |   Redis   | (Cache Cluster)
        +-----+-----+  +-----+-----+  +-----+-----+
              |              |               |
              +--------------+---------------+
                             |
              +--------------+---------------+
              |              |               |
        +-----v-----+  +-----v-----+  +-----v-----+
        |  DB Shard |  |  DB Shard |  |  DB Shard | (Sharded by short_code)
        +-----------+  +-----------+  +-----------+
"""

Chat System (like WhatsApp)

Functional Requirements

  • 1:1 messaging with delivery receipts
  • Group chats (up to 500 members)
  • Online/offline status
  • Message history and sync across devices
  • Media sharing (images, videos, files)

Scale Estimation

# WhatsApp-scale estimation
# 2B users, 100B messages/day

# Messages per second: 100B / 86400 ≈ 1.15M messages/sec
# Peak: 3-5M messages/sec

# Storage per day:
# - Average message: 100 bytes
# - 100B * 100 bytes = 10TB/day
# - 7 days retention = 70TB active storage

# Connections:
# - 500M concurrent users
# - Each user maintains a persistent WebSocket connection
# - At ~1M connections per server, that is ~500 connection servers

Chat Architecture

Real-time Messaging

# WebSocket-based Chat Server
import asyncio
import websockets
import json
import time
import redis

class ChatServer:
    def __init__(self):
        self.connections = {}  # user_id -> websocket
        self.redis = redis.Redis()
        self.pubsub = self.redis.pubsub()
    
    async def handle_connection(self, websocket, path):
        user_id = await self.authenticate(websocket)
        self.connections[user_id] = websocket
        
        # Subscribe to user's channel
        await self.subscribe_to_messages(user_id)
        
        try:
            async for message in websocket:
                await self.handle_message(user_id, json.loads(message))
        finally:
            del self.connections[user_id]
    
    async def handle_message(self, sender_id, message):
        msg_type = message['type']
        
        if msg_type == 'send':
            await self.send_message(
                sender_id,
                message['recipient_id'],
                message['content']
            )
        elif msg_type == 'ack':
            await self.acknowledge_message(message['message_id'])
    
    async def send_message(self, sender_id, recipient_id, content):
        # Generate message ID (generate_uuid is an assumed helper)
        msg_id = generate_uuid()
        
        # Store message
        msg = {
            'id': msg_id,
            'sender': sender_id,
            'recipient': recipient_id,
            'content': content,
            'timestamp': time.time(),
            'status': 'sent'
        }
        
        # Persist to database
        await self.db.messages.insert_one(msg)
        
        # Try to deliver in real-time
        if recipient_id in self.connections:
            # Same server - deliver directly
            await self.connections[recipient_id].send(json.dumps(msg))
        else:
            # Different server - publish to Redis
            self.redis.publish(f"user:{recipient_id}", json.dumps(msg))
        
        # Send delivery receipt to sender
        await self.connections[sender_id].send(json.dumps({
            'type': 'sent',
            'message_id': msg_id
        }))

Message Delivery Flow

# Message States: sent → delivered → read

"""
Sender                Server              Recipient
  |                     |                     |
  |--- Send Message --->|                     |
  |                     |-- Store in DB       |
  |<-- Sent Receipt ----|                     |
  |                     |--- Push Message --->|
  |                     |<-- Delivered Ack ---|
  |<-- Delivered Receipt|                     |
  |                     |                     |
  |                     |<---- Read Ack ------|
  |<-- Read Receipt ----|                     |
"""

Social Media Feed (like Twitter)

Functional Requirements

  • Post tweets (280 chars, images, videos)
  • Follow/unfollow users
  • Home timeline (posts from followed users)
  • User timeline (user's own posts)
  • Like, retweet, reply
  • Notifications

The Fan-out Problem

# Two approaches for generating timelines:

# 1. Fan-out on Write (Push Model)
# When user posts, push to all followers' timelines
# Pros: Fast reads (timeline pre-computed)
# Cons: Slow writes for users with millions of followers

# 2. Fan-out on Read (Pull Model)
# When user views timeline, fetch from followed users
# Pros: Fast writes, no wasted work
# Cons: Slow reads (must query many users)

# Hybrid Approach (Twitter's solution):
# - Regular users: Fan-out on write
# - Celebrities (>10K followers): Fan-out on read
# - Merge results at read time

Feed Architecture

Timeline Generation

# Hybrid Fan-out Implementation
# (generate_id is an assumed unique-ID helper)
import time

class TimelineService:
    def __init__(self, redis, db, celebrity_threshold=10000):
        self.redis = redis
        self.db = db
        self.celebrity_threshold = celebrity_threshold
    
    def post_tweet(self, user_id, content):
        tweet = {
            'id': generate_id(),
            'user_id': user_id,
            'content': content,
            'timestamp': time.time()
        }
        
        # Store tweet
        self.db.tweets.insert_one(tweet)
        
        # Check if user is celebrity
        follower_count = self.get_follower_count(user_id)
        
        if follower_count < self.celebrity_threshold:
            # Fan-out on write for regular users
            self.fan_out_tweet(user_id, tweet)
        else:
            # Celebrities: tweet stays in their timeline only
            # Followers will pull on read
            pass
        
        return tweet
    
    def fan_out_tweet(self, user_id, tweet):
        """Push tweet to all followers' timelines"""
        followers = self.get_followers(user_id)
        
        # Use Redis pipeline for efficiency
        pipe = self.redis.pipeline()
        for follower_id in followers:
            # Add to follower's timeline (sorted set by timestamp)
            pipe.zadd(
                f"timeline:{follower_id}",
                {tweet['id']: tweet['timestamp']}
            )
            # Trim timeline to the most recent 800 tweets
            pipe.zremrangebyrank(f"timeline:{follower_id}", 0, -801)
        pipe.execute()
    
    def get_timeline(self, user_id, count=50):
        # Get pre-computed timeline
        tweet_ids = self.redis.zrevrange(
            f"timeline:{user_id}", 0, count - 1
        )
        
        # Get celebrity tweets (fan-out on read)
        celebrity_tweets = self.get_celebrity_tweets(user_id)
        
        # Merge and sort
        all_ids = list(tweet_ids) + [t['id'] for t in celebrity_tweets]
        
        # Fetch full tweet data
        tweets = self.db.tweets.find({'id': {'$in': all_ids}})
        
        return sorted(tweets, key=lambda t: t['timestamp'], reverse=True)[:count]

Video Streaming (like Netflix)

Key Challenges

  • Content Delivery: Serve terabytes of video globally
  • Adaptive Streaming: Adjust quality based on bandwidth
  • Encoding: Multiple resolutions and codecs
  • Recommendations: Personalized content suggestions

Streaming Architecture

Video Processing Pipeline

# Video Processing Pipeline
"""
Upload → Validation → Transcoding → Packaging → CDN Distribution

1. Upload: Original video to object storage
2. Validation: Format check, virus scan, content moderation
3. Transcoding: Multiple resolutions (4K, 1080p, 720p, 480p, 360p)
4. Packaging: HLS/DASH adaptive streaming formats
5. CDN: Distribute to edge locations worldwide
"""

# HLS Adaptive Streaming
# Master playlist points to variant playlists
"""
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=1280000,RESOLUTION=720x480
480p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2560000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=7680000,RESOLUTION=1920x1080
1080p/playlist.m3u8
"""

# Client adaptively switches quality based on bandwidth
class AdaptivePlayer:
    def __init__(self, master_playlist_url):
        self.variants = self.parse_master_playlist(master_playlist_url)
        self.current_bandwidth = self.measure_bandwidth()
    
    def select_quality(self):
        # Choose the highest quality whose bandwidth fits within 80% of
        # measured throughput (the headroom absorbs fluctuations)
        for variant in sorted(self.variants, key=lambda v: -v['bandwidth']):
            if variant['bandwidth'] < self.current_bandwidth * 0.8:
                return variant
        return self.variants[-1]  # Lowest quality fallback
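The `parse_master_playlist` helper assumed above can be sketched as a small scan over the `#EXT-X-STREAM-INF` tags (attribute parsing is simplified; a real player would use a full HLS parser):

```python
import re

def parse_master_playlist(text):
    """Extract (bandwidth, resolution, uri) variants from an HLS master playlist."""
    variants = []
    lines = text.strip().splitlines()
    for i, line in enumerate(lines):
        if line.startswith("#EXT-X-STREAM-INF:"):
            attrs = line.split(":", 1)[1]
            bw = re.search(r"BANDWIDTH=(\d+)", attrs)
            res = re.search(r"RESOLUTION=(\d+x\d+)", attrs)
            variants.append({
                "bandwidth": int(bw.group(1)) if bw else 0,
                "resolution": res.group(1) if res else None,
                "uri": lines[i + 1],  # the variant URI follows its STREAM-INF tag
            })
    return variants
```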

Ride-Sharing (like Uber)

Key Challenges

  • Real-time Location: Track millions of drivers continuously
  • Matching: Connect riders with nearby drivers efficiently
  • ETA Calculation: Real-time traffic and route optimization
  • Surge Pricing: Dynamic pricing based on demand

Location Service

# Geospatial Indexing for Driver Matching
import time
import redis

class LocationService:
    def __init__(self, redis_client):
        # Assumes redis.Redis(decode_responses=True) so hget returns str
        self.redis = redis_client
    
    def update_driver_location(self, driver_id, lat, lng):
        """Update driver position in geospatial index"""
        # Redis GEO commands use sorted sets
        self.redis.geoadd("drivers:active", (lng, lat, driver_id))
        
        # Store additional driver metadata
        self.redis.hset(f"driver:{driver_id}", mapping={
            "lat": lat,
            "lng": lng,
            "updated_at": time.time(),
            "status": "available"
        })
    
    def find_nearby_drivers(self, lat, lng, radius_km=5, limit=10):
        """Find available drivers within radius"""
        # Query drivers within radius
        nearby = self.redis.georadius(
            "drivers:active",
            lng, lat,
            radius_km, unit="km",
            withdist=True,
            sort="ASC",
            count=limit * 2  # Get extra for filtering
        )
        
        # Filter by availability
        available = []
        for driver_id, distance in nearby:
            status = self.redis.hget(f"driver:{driver_id}", "status")
            if status == "available":
                available.append({
                    "driver_id": driver_id,
                    "distance_km": distance
                })
                if len(available) >= limit:
                    break
        
        return available
    
    def match_ride(self, rider_lat, rider_lng, destination_lat, destination_lng):
        """Match rider with optimal driver"""
        nearby_drivers = self.find_nearby_drivers(rider_lat, rider_lng)
        
        if not nearby_drivers:
            return None  # No drivers available
        
        # Select closest available driver
        driver = nearby_drivers[0]
        
        # Mark driver as assigned
        self.redis.hset(f"driver:{driver['driver_id']}", "status", "assigned")
        
        return driver
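Surge pricing, listed in the challenges above, typically maps a per-cell demand/supply ratio to a capped price multiplier. A simplified sketch (the linear ramp and the 3x cap are illustrative values, not Uber's actual formula):

```python
def surge_multiplier(open_requests, available_drivers, cap=3.0):
    """Price multiplier for one geographic cell based on demand/supply ratio."""
    if available_drivers == 0:
        return cap  # no supply at all: charge the capped maximum
    ratio = open_requests / available_drivers
    if ratio <= 1.0:
        return 1.0  # enough drivers, no surge
    # Linear ramp above a 1:1 ratio, clamped to the cap
    return min(cap, 1.0 + 0.5 * (ratio - 1.0))
```

Production systems compute this per geo cell (e.g. per geohash or H3 hexagon) on a sliding window so the multiplier tracks local demand in near real time.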

E-commerce Platform (like Amazon)

Key Insight: E-commerce platforms must handle massive read traffic (browsing), complex inventory management, and maintain data consistency during high-volume sales events.

Functional Requirements

  • Product catalog with search and filtering
  • Shopping cart with real-time inventory checks
  • Order placement with payment processing
  • Order tracking and delivery status
  • User reviews and recommendations

Non-Functional Requirements

  • Availability: 99.99% (minutes of downtime = millions in lost sales)
  • Scale: Handle 10x traffic during flash sales
  • Consistency: Inventory counts must be accurate (no overselling)
  • Latency: Sub-200ms for search, 500ms for checkout

E-commerce Architecture

Key Components

  • Product Service: Catalog, search (Elasticsearch), CDN for images
  • Cart Service: Redis for session, real-time inventory reservation
  • Order Service: Saga pattern for distributed transactions
  • Payment Service: Idempotent APIs, retry with exponential backoff
  • Inventory Service: Eventual consistency with reservation system
  • Recommendation Engine: Collaborative filtering, ML models
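The idempotent payment APIs mentioned above are commonly built around a client-supplied idempotency key: retrying a request with the same key replays the stored result instead of charging twice. A minimal in-memory sketch (production systems persist the key-to-result map; the `gateway` interface is an assumption):

```python
class PaymentService:
    """Deduplicates charges by idempotency key (in-memory store for illustration)."""

    def __init__(self, gateway):
        self.gateway = gateway   # assumed: gateway.charge(amount) -> transaction id
        self._results = {}       # idempotency_key -> stored result

    def charge(self, idempotency_key, amount):
        if idempotency_key in self._results:
            return self._results[idempotency_key]  # replay: no double charge
        result = self.gateway.charge(amount)
        self._results[idempotency_key] = result
        return result
```

The client generates the key once per logical purchase and reuses it on every retry, so network timeouts and duplicate clicks cannot produce duplicate charges.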

Handling Flash Sales

  • Rate limiting: Queue requests, serve in batches
  • Inventory reservation: Optimistic locking with TTL
  • Circuit breakers: Prevent cascade failures
  • Read replicas: Scale product catalog reads
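The rate-limiting item above is often a token bucket in front of the checkout path. A single-process sketch (in production the bucket usually lives in Redis so all API servers share one limit):

```python
import time

class TokenBucket:
    """Allow `rate` requests/second with bursts up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        # Refill tokens for the time elapsed since the last call
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```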

Inventory Reservation

# Flash Sale Inventory Reservation
import uuid

class InventoryService:
    def __init__(self, redis, db):
        self.redis = redis
        self.db = db
    
    def reserve_stock(self, product_id, quantity, user_id, ttl=600):
        """Reserve stock with TTL (prevents cart abandonment lock-up)"""
        reservation_id = f"res:{user_id}:{product_id}:{uuid.uuid4()}"
        
        # Lua script for atomic decrement and reservation
        lua_script = """
        local stock = tonumber(redis.call('GET', KEYS[1]) or 0)
        local quantity = tonumber(ARGV[1])
        
        if stock >= quantity then
            redis.call('DECRBY', KEYS[1], quantity)
            redis.call('SETEX', KEYS[2], ARGV[2], ARGV[1])
            return 1
        else
            return 0
        end
        """
        
        result = self.redis.eval(
            lua_script,
            2,  # Number of keys
            f"stock:{product_id}",  # KEYS[1]
            reservation_id,          # KEYS[2]
            quantity,                # ARGV[1]
            ttl                      # ARGV[2]
        )
        
        if result == 1:
            return reservation_id
        else:
            raise InsufficientStockError()
    
    def confirm_reservation(self, reservation_id):
        """Convert reservation to permanent stock reduction"""
        quantity = self.redis.get(reservation_id)
        if quantity:
            self.redis.delete(reservation_id)
            # Persist to database
            self.db.execute(
                "UPDATE inventory SET stock = stock - %s WHERE product_id = %s",
                (quantity, self.extract_product_id(reservation_id))
            )
    
    def release_reservation(self, reservation_id):
        """Release reservation (cart abandoned or payment failed)"""
        quantity = self.redis.get(reservation_id)
        if quantity:
            product_id = self.extract_product_id(reservation_id)
            self.redis.incrby(f"stock:{product_id}", int(quantity))
            self.redis.delete(reservation_id)

Search Engine (like Google)

Key Insight: Web-scale search requires crawling billions of pages, building inverted indexes, and ranking results in milliseconds.

Functional Requirements

  • Crawl and index web pages continuously
  • Return relevant results for any query
  • Autocomplete and spell correction
  • Personalized results based on user history
  • Image, video, and news search
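The autocomplete requirement above is usually served from a prefix trie, in practice with precomputed top-k completions cached at each node. A minimal sketch of the lookup side:

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_word = False

class Autocomplete:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def suggest(self, prefix, limit=5):
        # Walk to the prefix node, then DFS for up to `limit` completions
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        results, stack = [], [(node, prefix)]
        while stack and len(results) < limit:
            cur, word = stack.pop()
            if cur.is_word:
                results.append(word)
            for ch in sorted(cur.children, reverse=True):
                stack.append((cur.children[ch], word + ch))
        return results
```

At search-engine scale the trie is sharded by prefix, and each node stores its top suggestions ranked by query popularity rather than walking the subtree at request time.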

Scale Considerations

  • Crawling: Billions of pages, politeness policies (robots.txt)
  • Indexing: Petabytes of data, inverted indexes
  • Query processing: Millions of QPS, sub-500ms latency
  • Freshness: Breaking news indexed in minutes

Search Engine Architecture

Core Components

  • URL Frontier: Priority queue of URLs to crawl
  • Web Crawler: Distributed crawlers with politeness policies
  • Document Store: Raw HTML storage (distributed file system)
  • Indexer: Build inverted index (word → document IDs)
  • Index Server: Sharded index across thousands of machines
  • Query Processor: Parse query, retrieve docs, rank results
  • Ranker: PageRank + ML models for relevance scoring

Inverted Index Structure

// Inverted Index Example
{
  "distributed": [
    {"doc_id": 1, "positions": [5, 42], "tf": 0.02},
    {"doc_id": 7, "positions": [12], "tf": 0.01}
  ],
  "systems": [
    {"doc_id": 1, "positions": [6, 43], "tf": 0.02},
    {"doc_id": 3, "positions": [1, 15, 89], "tf": 0.03}
  ]
}
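An index like the one above can be built in one pass over tokenized documents. A simplified sketch (term frequency here is raw count divided by document length; stemming and stop-word removal are omitted):

```python
from collections import defaultdict

def build_inverted_index(docs):
    """docs: dict of doc_id -> text. Returns word -> list of postings."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        words = text.lower().split()
        positions = defaultdict(list)
        for pos, word in enumerate(words):
            positions[word].append(pos)
        for word, pos_list in positions.items():
            index[word].append({
                "doc_id": doc_id,
                "positions": pos_list,
                "tf": round(len(pos_list) / len(words), 4),
            })
    return dict(index)
```

At web scale the same logic runs as a distributed batch job (classically MapReduce), with the resulting postings lists sharded across the index servers listed above.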

PageRank Algorithm

PageRank scores pages based on incoming links. Pages linked by many high-quality pages rank higher.

# Simplified PageRank
def pagerank(graph, damping=0.85, iterations=100):
    """
    graph: dict of page -> list of outgoing links
    """
    n = len(graph)
    ranks = {page: 1/n for page in graph}
    
    for _ in range(iterations):
        new_ranks = {}
        for page in graph:
            rank_sum = sum(
                ranks[incoming] / len(graph[incoming])
                for incoming in graph
                if page in graph[incoming]
            )
            new_ranks[page] = (1 - damping) / n + damping * rank_sum
        ranks = new_ranks
    
    return ranks

Series Summary

You've completed the case studies! Continue to Part 12 to learn Low-Level Design (LLD) fundamentals including object-oriented design, SOLID principles, and design patterns.

Technology