Capstone: NSA for a Content Platform (like Netflix)

Platform Profile: StreamVerse

For this capstone, we design the NSA for StreamVerse — a video/audio content platform serving 80M subscribers globally with original content, user-generated content, and live streaming.

Scenario: StreamVerse, a Global Content Platform

Attribute	Details
Subscribers	80M globally across 45 countries
Content Library	2M titles (video, audio, interactive)
Daily Streams	500M stream starts per day
Creators	200K active content creators + in-house studios
Content Types	Movies, series, live sports, UGC, podcasts, interactive stories
Current Pain	Cold-start problem, buffering in emerging markets, manual content tagging

Scale Challenges

                            
Content Platform Scale Problems:
500M daily stream starts: each requires instant recommendation, auth, and CDN routing
2M content titles: metadata, encoding, and rights management at massive scale
45 countries: content rights vary by region; the target is edge response latency under 100ms in every region
Real-time personalization: the homepage must be different for every user, on every visit
Creator tools: upload, transcode, analytics, and monetization, all self-service at scale
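
To make the first figure concrete, the short calculation below converts 500M daily stream starts into a per-second rate. It is a back-of-envelope sketch: the peak-to-average factor of 3 is an assumption chosen for illustration, not a number from the StreamVerse profile.

```python
# Back-of-envelope load estimate from the StreamVerse profile.
DAILY_STREAM_STARTS = 500_000_000   # from the platform profile
SECONDS_PER_DAY = 24 * 60 * 60
PEAK_TO_AVERAGE = 3.0               # ASSUMPTION: not stated in the profile


def average_starts_per_second(daily_starts: int) -> float:
    """Average rate if starts were spread evenly over the day."""
    return daily_starts / SECONDS_PER_DAY


def peak_starts_per_second(daily_starts: int, peak_factor: float) -> float:
    """Scale the average by an assumed peak-to-average ratio."""
    return average_starts_per_second(daily_starts) * peak_factor


avg = average_starts_per_second(DAILY_STREAM_STARTS)
peak = peak_starts_per_second(DAILY_STREAM_STARTS, PEAK_TO_AVERAGE)
print(f"average: {avg:,.0f} starts/s, assumed peak: {peak:,.0f} starts/s")
# prints: average: 5,787 starts/s, assumed peak: 17,361 starts/s
```

Each of those starts fans out to recommendation, auth, and CDN routing calls, so the per-service request rate is a multiple of this number.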

                        

Content Platform NSA Principles

                            
StreamVerse Architectural Principles:
Personalization-Native: every surface is personalized in real time; no two users see the same experience
Edge-First Delivery: content and computation are pushed to the edge; users are served from the nearest node
Content-as-Graph: all content is a node in a knowledge graph; relationships drive discovery
Creator-Empowered: self-service tools for upload, analytics, and monetization; creator success = platform success
Adaptive Quality: every stream adapts to device, network, and user preference in real time
Data-Informed Everything: every on-screen element earns its place through experimentation

                        

Target Architecture

StreamVerse North Star Architecture

flowchart TB
    subgraph L5["🖥️ Client Experience Layer"]
        direction LR
        C1[Smart TV]
        C2[Mobile]
        C3[Web]
        C4[Gaming Console]
    end

    subgraph L4["🎯 Personalization Engine"]
        direction LR
        P1[Recommendation API]
        P2[Home Feed Builder]
        P3[Search & Discovery]
        P4[Notification Engine]
    end

    subgraph L3["🎬 Content Services"]
        direction LR
        S1[Catalog Service]
        S2[Playback Service]
        S3[Creator Platform]
        S4[Rights Engine]
    end

    subgraph L2["📊 Data & ML Platform"]
        direction LR
        D1[Interaction Events]
        D2[Content Graph]
        D3[ML Models]
        D4[A/B Platform]
    end

    subgraph L1["🌍 Global Infrastructure"]
        direction LR
        I1[Multi-Region Cloud]
        I2[CDN / Edge]
        I3[Encoding Farm]
        I4[Observability]
    end

    L5 --> L4
    L4 --> L3
    L3 --> L2
    L2 --> L1

    style L5 fill:#e8f4f4,stroke:#3B9797
    style L4 fill:#f0f4f8,stroke:#16476A
    style L3 fill:#e8f4f4,stroke:#3B9797
    style L2 fill:#f0f4f8,stroke:#16476A
    style L1 fill:#e8f4f4,stroke:#3B9797

Content Pipeline

Content ingestion is the lifeblood of the platform — from upload to playable asset in under 2 hours for standard content:

Content Ingestion Pipeline

flowchart LR
    U[Upload] --> V[Validation]
    V --> T["Transcode<br/>Multiple bitrates"]
    T --> AI["AI Processing<br/>Tags, thumbnails, chapters"]
    AI --> QC[Quality Check]
    QC --> R["Rights Check<br/>Region availability"]
    R --> CDN["CDN Distribution<br/>Edge pre-warm"]
    CDN --> Live[Available to Users]

    style U fill:#e8f4f4,stroke:#3B9797
    style AI fill:#f0f4f8,stroke:#16476A
    style CDN fill:#3B9797,stroke:#3B9797,color:#fff
    style Live fill:#132440,stroke:#132440,color:#fff
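
The stages in the diagram can be read as an ordered chain of checks, where a failure at any stage stops the asset from going live. The sketch below illustrates that control flow only. The `Asset` fields, the stage functions, and the accepted codec set are hypothetical stand-ins, not StreamVerse's actual services.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Asset:
    title: str
    codec: str
    regions: list[str]                      # regions where rights are held
    status: str = "uploaded"
    completed: list[str] = field(default_factory=list)


# Hypothetical stage checks: each returns True when the asset may proceed.
def validation(asset: Asset) -> bool:
    return asset.codec in {"H.264", "H.265", "ProRes"}  # assumed allow-list


def transcode(asset: Asset) -> bool:
    return True  # placeholder for producing the bitrate ladder


def ai_processing(asset: Asset) -> bool:
    return True  # placeholder for tags, thumbnails, chapters


def quality_check(asset: Asset) -> bool:
    return True  # placeholder for automated QC


def rights_check(asset: Asset) -> bool:
    return len(asset.regions) > 0  # needs at least one licensed region


def cdn_distribution(asset: Asset) -> bool:
    return True  # placeholder for edge pre-warming


STAGES: list[Callable[[Asset], bool]] = [
    validation, transcode, ai_processing,
    quality_check, rights_check, cdn_distribution,
]


def run_pipeline(asset: Asset) -> Asset:
    """Run stages in diagram order; stop at the first failing stage."""
    for stage in STAGES:
        if not stage(asset):
            asset.status = f"rejected at {stage.__name__}"
            return asset
        asset.completed.append(stage.__name__)
    asset.status = "available"
    return asset


ok = run_pipeline(Asset("Pilot", "H.265", ["US", "DE"]))
blocked = run_pipeline(Asset("No Rights", "H.264", []))
print(ok.status)       # available
print(blocked.status)  # rejected at rights_check
```

A production pipeline would run these stages asynchronously with retries; the linear loop is only meant to show the ordering and the fail-fast behavior.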

{
  "content_pipeline": {
    "ingestion": {
      "upload_max_size": "200GB",
      "supported_formats": ["MP4", "MKV", "MOV", "ProRes", "H.265"],
      "validation": ["codec_check", "audio_levels", "resolution_verify"]
    },
    "transcoding": {
      "profiles": [
        { "resolution": "4K HDR", "bitrate": "16 Mbps", "codec": "AV1" },
        { "resolution": "1080p", "bitrate": "5 Mbps", "codec": "H.265" },
        { "resolution": "720p", "bitrate": "2.5 Mbps", "codec": "H.264" },
        { "resolution": "480p", "bitrate": "1 Mbps", "codec": "H.264" }
      ],
      "adaptive_streaming": "DASH + HLS"
    },
    "ai_enrichment": {
      "auto_tags": "scene detection + object recognition",
      "thumbnails": "AI-selected best frames per genre",
      "chapters": "audio/scene boundary detection",
      "subtitles": "whisper-based ASR + translation (20 languages)"
    }
  }
}
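
As a rough sizing aid, the bitrate ladder in the config can be turned into storage per hour of content. The sketch below counts video bitrate only and assumes one stored copy per profile. Audio tracks, container overhead, and any duplication between DASH and HLS packaging are ignored, so real numbers would be higher.

```python
# Video bitrates (Mbps) copied from the transcoding profiles above.
LADDER_MBPS = {"4K HDR": 16.0, "1080p": 5.0, "720p": 2.5, "480p": 1.0}


def gigabytes_per_hour(bitrate_mbps: float) -> float:
    """Megabits/s -> megabytes/s -> decimal gigabytes per hour."""
    megabytes_per_second = bitrate_mbps / 8
    return megabytes_per_second * 3600 / 1000


per_profile = {name: gigabytes_per_hour(rate) for name, rate in LADDER_MBPS.items()}
total_gb_per_hour = sum(per_profile.values())

for name, size in per_profile.items():
    print(f"{name}: {size:.3f} GB per hour")
print(f"all profiles: {total_gb_per_hour:.3f} GB per hour of content")
# all profiles: 11.025 GB per hour of content
```

The 4K HDR rendition alone accounts for 7.2 GB of that total, which is one reason to encode it only for titles and devices that benefit from it.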

Recommendation Engine

The recommendation system is StreamVerse's most critical competitive advantage — 80% of viewing hours come from recommendations:

Recommendation Architecture: Multi-Stage Ranking

Stage	Purpose	Input	Output
Candidate Gen	Find 1000 candidates from 2M titles	User embeddings + content graph	~1000 candidates
Ranking	Score each candidate for this user	User features + content features + context	Ranked list
Re-Ranking	Apply business rules (diversity, freshness)	Ranked list + business constraints	Final display order
Presentation	Optimize thumbnails + copy per user	User preferences + A/B assignment	Personalized UI
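
The first three stages in the table can be sketched as a small funnel. This is a toy illustration under stated assumptions: dot-product similarity stands in for the real candidate and ranking models, the context boost is a hypothetical per-genre bonus, and the only business rule applied in re-ranking is a per-genre cap for diversity (freshness is omitted).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Title:
    title_id: str
    genre: str
    embedding: tuple[float, ...]


def dot(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    return sum(x * y for x, y in zip(a, b))


def generate_candidates(user_vec, catalog, k):
    """Stage 1: cheap similarity search narrows the catalog to k titles."""
    return sorted(catalog, key=lambda t: dot(user_vec, t.embedding), reverse=True)[:k]


def rank(user_vec, candidates, context_boost):
    """Stage 2: score each candidate using user features plus context."""
    def score(t: Title) -> float:
        return dot(user_vec, t.embedding) + context_boost.get(t.genre, 0.0)
    return sorted(candidates, key=score, reverse=True)


def rerank(ranked, max_per_genre):
    """Stage 3: enforce diversity by capping titles per genre."""
    seen: dict[str, int] = {}
    final = []
    for t in ranked:
        if seen.get(t.genre, 0) < max_per_genre:
            final.append(t)
            seen[t.genre] = seen.get(t.genre, 0) + 1
    return final


catalog = [
    Title("a", "drama", (1.0, 0.0)),
    Title("b", "drama", (0.9, 0.1)),
    Title("c", "drama", (0.8, 0.2)),
    Title("d", "comedy", (0.1, 0.9)),
    Title("e", "sports", (0.0, 1.0)),
]
user_vec = (1.0, 0.2)

candidates = generate_candidates(user_vec, catalog, k=4)
ranked = rank(user_vec, candidates, context_boost={"comedy": 0.7})
final_ids = [t.title_id for t in rerank(ranked, max_per_genre=2)]
print(final_ids)  # ['a', 'd', 'b']
```

In the example, the context boost lifts the comedy title to second place, and the diversity cap drops the third drama even though its raw score is higher than titles outside the candidate set.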

Content Knowledge Graph:

Every piece of content is a node connected by multiple edge types:

Genre / Mood / Theme — semantic relationships
Cast / Creator — people-based connections
Viewing Patterns — "users who watched X also watched Y"
Series / Franchise — narrative continuity
Temporal — trending now, seasonally popular, new release
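
A minimal in-memory sketch of typed edges is shown below. It is not the graph database itself, only an illustration of how multiple edge types between two titles can drive both discovery and a "because you watched X" explanation. The class, method names, and titles are hypothetical.

```python
from collections import defaultdict


class ContentGraph:
    """Undirected graph whose edges carry a type such as 'genre' or 'cast'."""

    def __init__(self) -> None:
        self._edges: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def add_edge(self, a: str, edge_type: str, b: str) -> None:
        self._edges[a].append((edge_type, b))
        self._edges[b].append((edge_type, a))

    def related(self, title: str) -> dict[str, list[str]]:
        """Map each neighbor to the edge types that connect it to `title`."""
        reasons: dict[str, list[str]] = defaultdict(list)
        for edge_type, neighbor in self._edges.get(title, []):
            reasons[neighbor].append(edge_type)
        return dict(reasons)

    def because_you_watched(self, title: str) -> list[str]:
        """Neighbors ordered by how many edge types link them to `title`."""
        reasons = self.related(title)
        return sorted(reasons, key=lambda n: len(reasons[n]), reverse=True)


graph = ContentGraph()
graph.add_edge("Harbor Lights", "viewing_pattern", "Salt Road")
graph.add_edge("Harbor Lights", "cast", "Night Ferry")
graph.add_edge("Harbor Lights", "genre", "Night Ferry")

print(graph.related("Harbor Lights"))
# {'Salt Road': ['viewing_pattern'], 'Night Ferry': ['cast', 'genre']}
print(graph.because_you_watched("Harbor Lights"))
# ['Night Ferry', 'Salt Road']
```

Counting edge types is a deliberately crude relevance signal; a real system would weight edge types differently and combine them with the ranking model.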

Global Delivery Architecture

Delivering video to 80M users across 45 countries with sub-second start times requires a multi-tier delivery system. The fundamental challenge is a heavily skewed popularity distribution: a small head of popular titles accounts for most traffic (in the tiering below, the top 500 titles per region cover about 60% of stream starts), while the long tail is accessed infrequently but must still start quickly when requested.

CDN Tiering Strategy

StreamVerse uses a 3-tier CDN architecture that optimizes the cost/performance tradeoff:

Edge tier (200+ PoPs): Stores top 500 titles per region in SSD cache — covers 60% of stream starts. Sub-100ms response time. Cost: highest per GB but delivers majority of traffic.
Mid-tier (12 regional nodes): Stores top 50K titles on HDD+SSD mix — covers 35% of starts. 200-500ms first-byte time. Balances cost and coverage.
Origin (3 master regions): Complete library (2M titles) in object storage; handles the remaining 5% of starts (long tail). 500ms-2s first-byte time on a miss, after which the title is cached at the mid-tier for subsequent requests.
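
The lookup order and the traffic shares above can be combined into a small sketch. The per-tier latencies used here are representative values picked from the stated ranges (100ms for edge, 350ms for mid-tier, 1250ms for origin); they are assumptions for the arithmetic, and the `Tier` class and title names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    name: str
    first_byte_ms: int                      # representative value, see lead-in
    titles: set[str] = field(default_factory=set)


def serve(title: str, edge: Tier, mid: Tier, origin: Tier) -> Tier:
    """Return the tier that serves this request, trying the nearest first."""
    if title in edge.titles:
        return edge
    if title in mid.titles:
        return mid
    # Long-tail miss: origin holds the full library and serves the request;
    # the mid-tier keeps a copy so the next request is faster.
    mid.titles.add(title)
    return origin


def expected_first_byte_ms(shares_pct: dict[str, int], latency_ms: dict[str, int]) -> float:
    """Traffic-weighted average first-byte time across tiers."""
    return sum(shares_pct[t] * latency_ms[t] for t in shares_pct) / 100


edge = Tier("edge", 100, {"hit-show"})
mid = Tier("mid", 350, {"hit-show", "cult-classic"})
origin = Tier("origin", 1250)

first = serve("rare-doc", edge, mid, origin).name    # 'origin'
second = serve("rare-doc", edge, mid, origin).name   # 'mid'

blended = expected_first_byte_ms(
    {"edge": 60, "mid": 35, "origin": 5},
    {t.name: t.first_byte_ms for t in (edge, mid, origin)},
)
print(first, second, blended)  # origin mid 245.0
```

Under these assumed latencies the blended first-byte time is about 245ms, with the 5% of long-tail starts contributing roughly a quarter of it.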

Adaptive Bitrate & Quality of Experience

Every stream session dynamically adapts video quality based on network conditions, device capability, and user preferences:

Initial quality: Start with lowest viable quality for instant playback (<1s), then ramp up as buffer fills
Bandwidth estimation: Client reports throughput every 2 seconds; server-side model predicts next 30 seconds of available bandwidth
Device-aware: 4K HDR only on capable displays; mobile defaults to 720p (saves bandwidth without visible quality loss on small screens)
Data saver mode: For users on metered connections, reduces bitrate by 50%, using AI-optimized encoding to preserve perceptual quality
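
The selection logic described above can be sketched as a simple rule: pick the highest rung of the ladder that fits both the device and a conservative share of the estimated bandwidth. This is a simplification under stated assumptions. The 0.8 safety factor is hypothetical, data saver is modeled by halving the bandwidth budget instead of using dedicated encodes, and the function name is illustrative.

```python
# (name, video bitrate in Mbps, vertical resolution), highest quality first.
# Bitrates come from the transcoding profiles earlier in the document.
LADDER = [
    ("4K HDR", 16.0, 2160),
    ("1080p", 5.0, 1080),
    ("720p", 2.5, 720),
    ("480p", 1.0, 480),
]


def select_profile(
    throughput_mbps: float,
    max_height: int,
    data_saver: bool = False,
    safety: float = 0.8,        # ASSUMPTION: headroom to absorb estimate error
) -> str:
    """Pick the best profile within the device cap and bandwidth budget."""
    budget = throughput_mbps * safety
    if data_saver:
        budget *= 0.5           # simplified stand-in for the 50% reduction
    for name, bitrate, height in LADDER:
        if height <= max_height and bitrate <= budget:
            return name
    return LADDER[-1][0]        # always fall back to the lowest rung


print(select_profile(25.0, max_height=2160))                  # 4K HDR
print(select_profile(25.0, max_height=720))                   # 720p
print(select_profile(4.0, max_height=1080))                   # 720p
print(select_profile(4.0, max_height=1080, data_saver=True))  # 480p
```

Starting playback on the lowest rung and calling a function like this as the buffer fills matches the ramp-up behavior described in the first bullet.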

Global CDN Architecture

flowchart TD
    subgraph Origin["Origin (3 regions)"]
        O1[US-East Master]
        O2[EU-West Master]
        O3[APAC Master]
    end

    subgraph MidTier["Mid-Tier CDN"]
        M1[US Regional]
        M2[EU Regional]
        M3[APAC Regional]
        M4[LATAM Regional]
    end

    subgraph Edge["Edge (200+ PoPs)"]
        E1[Edge Cache A]
        E2[Edge Cache B]
        E3[Edge Cache C]
        E4[Edge Cache N]
    end

    subgraph Users["Users"]
        U1[Smart TV]
        U2[Mobile]
        U3[Web]
    end

    Origin --> MidTier
    MidTier --> Edge
    Edge --> Users

    style O1 fill:#16476A,stroke:#16476A,color:#fff
    style M1 fill:#3B9797,stroke:#3B9797,color:#fff
    style E1 fill:#e8f4f4,stroke:#3B9797

Real-Time Personalization

Every user session is personalized in real-time — from homepage layout to thumbnail selection:

Personalization Layers: Real-Time Stack

Layer	What's Personalized	Latency Budget
Homepage Rows	Which categories appear, in what order	<50ms
Row Content	Which titles in each row	<100ms
Thumbnails	Which frame/artwork shown per title	<20ms (pre-computed)
Search Results	Result ranking + autocomplete	<150ms
Notifications	What, when, and how to notify	Async (minutes)

                            
Latency Architecture:
Pre-computed: heavy ML models run offline; results are cached per user and refreshed every 4 hours
Near-real-time: session signals (what you just watched) update recommendations within 30 seconds
Real-time: context signals (time of day, device, mood selection) are applied at request time

Conclusion

Gap Analysis: Current vs Target

Dimension	Current State	North Star Target	Gap
Recommendations	Collaborative filtering only	Multi-stage ranking with content graph + contextual signals	Critical
Content Tagging	Manual editorial tags	AI-generated tags, chapters, thumbnails at upload	High
Delivery Latency	3-5s start in emerging markets	<1s start globally via edge pre-warming	High
Cold Start	New users see generic homepage for 7+ days	Personalized within first session via onboarding signals	Critical
Experimentation	Monthly A/B tests, manual analysis	Continuous experimentation platform (100+ concurrent tests)	High
Creator Tools	Basic upload + revenue dashboard	Full self-service: analytics, audience insights, collaboration	High

Architecture Decision Summary

Decision	Choice	Rationale
CDN strategy	3-tier (origin → mid → edge)	Balances cost vs latency; popular content at edge, broader catalog at mid-tier, long tail served from origin
Encoding	Multi-codec (AV1 + H.265 + H.264)	AV1 for modern devices (40% bandwidth savings); H.264 fallback for legacy
Personalization	Hybrid pre-computed + real-time	Heavy ML offline; session context applied at request time for freshness
Content metadata	Knowledge graph (Neo4j)	Enables rich discovery paths; supports "because you watched X" explanations
Experimentation	Server-side A/B with client holdback	No client SDK dependency; instant rollout; clean measurement
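
Server-side assignment is commonly implemented by hashing a stable user identifier, so the same user always lands in the same group without any client SDK. The sketch below shows that general technique, not StreamVerse's actual platform. It interprets "holdback" as a slice of users excluded from the experiment, which is an assumption, and the function names and percentages are hypothetical.

```python
import hashlib


def bucket(user_id: str, experiment: str, buckets: int = 100) -> int:
    """Deterministically map a user to a bucket in [0, buckets)."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def assign(user_id: str, experiment: str, treatment_pct: int, holdback_pct: int = 0) -> str:
    """Split users into holdback, treatment, and control by bucket range."""
    b = bucket(user_id, experiment)
    if b < holdback_pct:
        return "holdback"
    if b < holdback_pct + treatment_pct:
        return "treatment"
    return "control"


# The same user and experiment always produce the same assignment.
first = assign("user-42", "new-homepage-rows", treatment_pct=45, holdback_pct=10)
again = assign("user-42", "new-homepage-rows", treatment_pct=45, holdback_pct=10)
print(first == again)  # True
```

Including the experiment name in the hash input keeps assignments independent across experiments, which matters when many tests run concurrently.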

A content platform NSA is defined by three forces: scale (500M daily streams), personalization (every user gets a unique experience), and global reach (content delivered from 200+ edge locations). The architecture must support all three simultaneously while enabling rapid content experimentation and creator empowerment.

                            
Key Takeaway: In a content platform, the recommendation engine isn't a feature; it's the product. The entire architecture exists to deliver the right content to the right user at the right time. Every other component (CDN, encoding, metadata, search) is in service of that mission. The platforms that master this flywheel (better recommendations → more engagement → more signal → even better recommendations) compound their advantage over time.
                        
