
Complete Database Mastery Part 15: Capstone Projects

January 31, 2026 · Wasil Zafar · 50 min read

Apply your database mastery with portfolio-ready capstone projects. Build an e-commerce system, real-time analytics pipeline, multi-tenant SaaS platform, and global distributed database—complete implementations showcasing production-grade skills.

Table of Contents

  1. Introduction
  2. Project 1: E-Commerce Platform
  3. Project 2: Real-Time Analytics
  4. Project 3: Multi-Tenant SaaS
  5. Project 4: Global Distributed DB
  6. Building Your Portfolio
  7. Series Conclusion

Introduction: Capstone Overview

Capstone projects are your opportunity to synthesize everything you've learned across this series. These four projects demonstrate real-world database design, implementation, and optimization skills that impress employers and clients.

Series Finale: This is Part 15 of 15 in the Complete Database Mastery series. We're bringing together all the concepts from SQL to distributed systems into portfolio-ready implementations.

Four Capstone Projects

Project                  Skills Demonstrated                      Difficulty
E-Commerce Platform      Data modeling, transactions, indexing    ⭐⭐⭐
Real-Time Analytics      Redis, time-series, dashboards           ⭐⭐⭐⭐
Multi-Tenant SaaS        Security, RLS, tenant isolation          ⭐⭐⭐⭐
Global Distributed DB    Replication, sharding, CAP trade-offs    ⭐⭐⭐⭐⭐

Project 1: E-Commerce Platform Database

Requirements

E-Commerce Database Requirements:

Functional:
- User registration and authentication
- Product catalog with categories and variants
- Shopping cart and order management
- Inventory tracking
- Reviews and ratings
- Payment processing records

Non-Functional:
- Sub-100ms product queries
- Handle 10,000+ concurrent users
- 99.9% uptime
- PCI-DSS compliance for payment data

Schema Design

-- Users & Authentication
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    first_name VARCHAR(100),
    last_name VARCHAR(100),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE addresses (
    id SERIAL PRIMARY KEY,
    user_id INT REFERENCES users(id),
    address_type VARCHAR(20),  -- 'shipping', 'billing'
    street_line1 VARCHAR(255),
    street_line2 VARCHAR(255),
    city VARCHAR(100),
    state VARCHAR(100),
    postal_code VARCHAR(20),
    country VARCHAR(3),
    is_default BOOLEAN DEFAULT false
);

-- Product Catalog
CREATE TABLE categories (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    parent_id INT REFERENCES categories(id),
    slug VARCHAR(100) UNIQUE,
    path LTREE  -- PostgreSQL hierarchical path
);

CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    description TEXT,
    category_id INT REFERENCES categories(id),
    base_price DECIMAL(10,2) NOT NULL,
    sku VARCHAR(50) UNIQUE,
    status VARCHAR(20) DEFAULT 'active',
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE product_variants (
    id SERIAL PRIMARY KEY,
    product_id INT REFERENCES products(id),
    sku VARCHAR(50) UNIQUE,
    attributes JSONB,  -- {size: 'L', color: 'blue'}
    price_adjustment DECIMAL(10,2) DEFAULT 0,
    inventory_count INT DEFAULT 0,
    low_stock_threshold INT DEFAULT 10
);

-- Orders & Transactions
CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    user_id INT REFERENCES users(id),
    status VARCHAR(20) DEFAULT 'pending',
    shipping_address_id INT REFERENCES addresses(id),
    billing_address_id INT REFERENCES addresses(id),
    subtotal DECIMAL(12,2),
    tax_amount DECIMAL(10,2),
    shipping_amount DECIMAL(10,2),
    total_amount DECIMAL(12,2),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE order_items (
    id SERIAL PRIMARY KEY,
    order_id INT REFERENCES orders(id),
    variant_id INT REFERENCES product_variants(id),
    quantity INT NOT NULL,
    unit_price DECIMAL(10,2) NOT NULL,
    total_price DECIMAL(12,2) NOT NULL
);

-- Indexes for performance
CREATE INDEX idx_products_category ON products(category_id);
CREATE INDEX idx_products_status ON products(status);
CREATE INDEX idx_variants_product ON product_variants(product_id);
CREATE INDEX idx_variants_attrs ON product_variants USING GIN(attributes);
CREATE INDEX idx_orders_user ON orders(user_id);
CREATE INDEX idx_orders_status ON orders(status);
CREATE INDEX idx_orders_created ON orders(created_at DESC);
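
The GIN index on attributes makes JSONB containment searches fast. A quick sanity-check query (the attribute values are illustrative) might look like this — the `@>` operator can use the GIN index:

```sql
-- Find in-stock large blue variants via JSONB containment
SELECT p.name, v.sku, v.inventory_count
FROM product_variants v
JOIN products p ON p.id = v.product_id
WHERE v.attributes @> '{"size": "L", "color": "blue"}'
  AND v.inventory_count > 0;
```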

Implementation

-- Checkout transaction with inventory management
CREATE OR REPLACE FUNCTION process_checkout(
    p_user_id INT,
    p_cart_items JSONB,  -- [{variant_id, quantity}, ...]
    p_shipping_address_id INT,
    p_billing_address_id INT
) RETURNS INT AS $$
DECLARE
    v_order_id INT;
    v_item JSONB;
    v_variant RECORD;
    v_unit_price DECIMAL(10,2);
    v_subtotal DECIMAL(12,2) := 0;
BEGIN
    -- Create order
    INSERT INTO orders (user_id, shipping_address_id, billing_address_id, status)
    VALUES (p_user_id, p_shipping_address_id, p_billing_address_id, 'processing')
    RETURNING id INTO v_order_id;

    -- Process each cart item
    FOR v_item IN SELECT * FROM jsonb_array_elements(p_cart_items)
    LOOP
        -- Lock the variant row and verify inventory
        SELECT * INTO v_variant
        FROM product_variants
        WHERE id = (v_item->>'variant_id')::INT
        FOR UPDATE;

        IF NOT FOUND THEN
            RAISE EXCEPTION 'Unknown variant %', v_item->>'variant_id';
        END IF;

        IF v_variant.inventory_count < (v_item->>'quantity')::INT THEN
            RAISE EXCEPTION 'Insufficient inventory for variant %', v_variant.id;
        END IF;

        -- Deduct inventory
        UPDATE product_variants
        SET inventory_count = inventory_count - (v_item->>'quantity')::INT
        WHERE id = v_variant.id;

        -- Compute the unit price once: base price plus variant adjustment
        SELECT base_price + v_variant.price_adjustment INTO v_unit_price
        FROM products WHERE id = v_variant.product_id;

        -- Add order item
        INSERT INTO order_items (order_id, variant_id, quantity, unit_price, total_price)
        VALUES (
            v_order_id,
            v_variant.id,
            (v_item->>'quantity')::INT,
            v_unit_price,
            (v_item->>'quantity')::INT * v_unit_price
        );

        v_subtotal := v_subtotal + (v_item->>'quantity')::INT * v_unit_price;
    END LOOP;

    -- Update order totals
    UPDATE orders SET 
        subtotal = v_subtotal,
        tax_amount = v_subtotal * 0.08,
        total_amount = v_subtotal * 1.08
    WHERE id = v_order_id;

    RETURN v_order_id;
END;
$$ LANGUAGE plpgsql;
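
Because the function runs in a single transaction, either every inventory deduction and order row commits or none do. An example invocation (the IDs are illustrative, not tied to any seed data):

```sql
-- Buy two units of variant 7 and one of variant 12, atomically
SELECT process_checkout(
    42,
    '[{"variant_id": 7, "quantity": 2}, {"variant_id": 12, "quantity": 1}]'::jsonb,
    1,   -- shipping address id
    2    -- billing address id
);
```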

Project 2: Real-Time Analytics Pipeline

Requirements

Real-Time Analytics Requirements:

Functional:
- Track website events (page views, clicks, conversions)
- Real-time dashboards with sub-second updates
- Historical trend analysis
- Funnel analysis and cohort tracking

Non-Functional:
- Ingest 100,000+ events/second
- Dashboard queries < 100ms
- 1 year data retention
- 99.9% uptime

Architecture

Architecture:

[Web/Mobile Apps]
       ↓
[Event Collectors] → [Message Queue (Kafka)]
                            ↓
                    [Stream Processor]
                     ↓           ↓
              [Redis]      [TimescaleDB]
           (real-time)     (historical)
                ↓               ↓
            [Dashboard API Gateway]
                      ↓
              [React Dashboard]

Implementation

-- TimescaleDB: Hypertable for events
CREATE TABLE events (
    time TIMESTAMPTZ NOT NULL,
    session_id UUID,
    user_id INT,
    event_type VARCHAR(50),
    page_url TEXT,
    referrer TEXT,
    properties JSONB,
    device_type VARCHAR(20),
    country VARCHAR(3)
);

-- Convert to hypertable
SELECT create_hypertable('events', 'time');

-- Compression for older data
ALTER TABLE events SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'event_type, device_type'
);

SELECT add_compression_policy('events', INTERVAL '7 days');

-- Continuous aggregates for dashboards
CREATE MATERIALIZED VIEW events_hourly
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', time) AS bucket,
    event_type,
    device_type,
    country,
    COUNT(*) as event_count,
    COUNT(DISTINCT user_id) as unique_users,
    COUNT(DISTINCT session_id) as sessions
FROM events
GROUP BY 1, 2, 3, 4
WITH NO DATA;

-- Refresh policy
SELECT add_continuous_aggregate_policy('events_hourly',
    start_offset => INTERVAL '3 hours',
    end_offset => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour'
);
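
Dashboard queries should then read from the continuous aggregate rather than the raw events table. A sketch using the column names from the view above:

```sql
-- Hourly event counts for the last 24 hours, by device type
SELECT bucket, device_type, SUM(event_count) AS events
FROM events_hourly
WHERE bucket > NOW() - INTERVAL '24 hours'
GROUP BY bucket, device_type
ORDER BY bucket;
```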

# Redis for real-time counters
import redis
from datetime import datetime, timezone

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

def track_event(event_type: str, user_id: int, properties: dict):
    now = datetime.now(timezone.utc)  # utcnow() is deprecated
    minute_bucket = now.strftime('%Y%m%d%H%M')
    
    pipe = r.pipeline()
    
    # Real-time counters
    pipe.incr(f'events:{event_type}:total')
    pipe.incr(f'events:{event_type}:minute:{minute_bucket}')
    pipe.expire(f'events:{event_type}:minute:{minute_bucket}', 3600)
    
    # Unique users (HyperLogLog for approximate counting)
    pipe.pfadd(f'events:{event_type}:users:{minute_bucket}', user_id)
    
    # Recent events stream, capped at 10,000 entries
    pipe.xadd('stream:events', {'type': event_type, 'user': user_id}, maxlen=10000)
    
    pipe.execute()

def get_realtime_stats(event_type: str):
    minute_bucket = datetime.now(timezone.utc).strftime('%Y%m%d%H%M')
    
    return {
        'total': r.get(f'events:{event_type}:total'),
        'this_minute': r.get(f'events:{event_type}:minute:{minute_bucket}'),
        'unique_users': r.pfcount(f'events:{event_type}:users:{minute_bucket}')
    }
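
The requirements also call for funnel analysis. Against the raw events hypertable, a simple session-level funnel can be expressed with `FILTER` clauses (the `event_type` values here are assumptions, not defined earlier):

```sql
-- Of sessions in the last 7 days: how many viewed, added to cart, purchased?
SELECT
    COUNT(DISTINCT session_id) FILTER (WHERE event_type = 'page_view')   AS viewed,
    COUNT(DISTINCT session_id) FILTER (WHERE event_type = 'add_to_cart') AS added_to_cart,
    COUNT(DISTINCT session_id) FILTER (WHERE event_type = 'purchase')    AS purchased
FROM events
WHERE time > NOW() - INTERVAL '7 days';
```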

Project 3: Multi-Tenant SaaS Platform

Requirements

Multi-Tenant SaaS Requirements:

Functional:
- Multiple organizations (tenants)
- User management per organization
- Subscription tiers with feature limits
- Data isolation between tenants

Non-Functional:
- Complete data isolation (GDPR/SOC2)
- Per-tenant query performance SLAs
- Scalable to 10,000+ tenants
- Support enterprise single-tenant option

Tenancy Models

Tenancy Model Comparison:

1. Shared Schema (Row-Level Isolation)
   ✅ Efficient resource usage
   ✅ Easy to manage/deploy
   ❌ Limited isolation
   Best for: SMB SaaS, cost-sensitive

2. Schema-per-Tenant
   ✅ Better isolation
   ✅ Per-tenant backups
   ❌ Schema migration complexity
   Best for: Mid-market

3. Database-per-Tenant
   ✅ Maximum isolation
   ✅ Easy compliance
   ❌ Highest cost
   Best for: Enterprise, regulated industries

Implementation

-- Shared Schema with Row-Level Security
CREATE TABLE tenants (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(100) NOT NULL,
    subdomain VARCHAR(50) UNIQUE,
    plan VARCHAR(20) DEFAULT 'free',
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id UUID REFERENCES tenants(id) NOT NULL,
    email VARCHAR(255) NOT NULL,
    role VARCHAR(20) DEFAULT 'member',
    UNIQUE(tenant_id, email)
);

CREATE TABLE projects (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id UUID REFERENCES tenants(id) NOT NULL,
    name VARCHAR(100) NOT NULL,
    created_by UUID REFERENCES users(id)
);

-- Enable RLS
ALTER TABLE users ENABLE ROW LEVEL SECURITY;
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;

-- Tenant isolation policies
CREATE POLICY tenant_isolation_users ON users
    USING (tenant_id = current_setting('app.current_tenant')::uuid);

CREATE POLICY tenant_isolation_projects ON projects
    USING (tenant_id = current_setting('app.current_tenant')::uuid);

-- Application sets tenant context
-- SET app.current_tenant = 'tenant-uuid-here';
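
One gotcha worth flagging: by default, PostgreSQL exempts table owners from row-level security, and superusers or roles with BYPASSRLS always bypass it. If the application connects as the owning role, force the policies:

```sql
-- Ensure RLS also applies to the table owner
ALTER TABLE users FORCE ROW LEVEL SECURITY;
ALTER TABLE projects FORCE ROW LEVEL SECURITY;
```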

# Python middleware for tenant context
from functools import wraps

def tenant_context(f):
    @wraps(f)
    def wrapper(*args, **kwargs):
        tenant_id = get_tenant_from_request()  # From subdomain or header
        
        conn = get_db_connection()
        cursor = conn.cursor()
        
        # Set tenant context for RLS on *this* connection
        cursor.execute(
            "SELECT set_config('app.current_tenant', %s, false)",
            (str(tenant_id),)
        )
        
        try:
            # Pass the tenant-scoped cursor to the handler so every
            # query runs on the connection that holds the setting
            return f(cursor, *args, **kwargs)
        finally:
            cursor.execute("RESET app.current_tenant")
            conn.commit()
    
    return wrapper

@tenant_context
def get_projects(cursor):
    # RLS automatically filters by tenant!
    cursor.execute("SELECT * FROM projects")
    return cursor.fetchall()
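
The requirements also mention subscription tiers with feature limits. A minimal enforcement helper might look like this — the plan names and limits are hypothetical; in a real system they would live in the tenants table or a billing service:

```python
# Hypothetical per-plan limits; None means unlimited
PLAN_LIMITS = {
    'free':       {'projects': 3,    'users': 5},
    'pro':        {'projects': 50,   'users': 50},
    'enterprise': {'projects': None, 'users': None},
}

def within_limit(plan: str, resource: str, current_count: int) -> bool:
    """Return True if the tenant may create one more of `resource`.

    Unknown plans fall back to 'free' limits as a safe default.
    """
    limits = PLAN_LIMITS.get(plan, PLAN_LIMITS['free'])
    limit = limits.get(resource)
    return limit is None or current_count < limit
```

The check would run inside the tenant-scoped request handler, before the INSERT that creates the resource.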

Project 4: Global Distributed Database

Requirements

Global Distributed Database Requirements:

Functional:
- Users access from NA, EU, APAC regions
- Low-latency reads in each region
- Cross-region writes with conflict resolution
- Data residency compliance (GDPR)

Non-Functional:
- Sub-50ms reads in each region
- Handle network partitions gracefully
- RPO < 1 second, RTO < 30 seconds
- Support for both strong and eventual consistency

Design

Multi-Region Architecture:

         [NA Region]              [EU Region]             [APAC Region]
    ┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐
    │   App Servers   │      │   App Servers   │      │   App Servers   │
    └────────┬────────┘      └────────┬────────┘      └────────┬────────┘
             │                        │                        │
    ┌────────┴────────┐      ┌────────┴────────┐      ┌────────┴────────┐
    │  CockroachDB    │◄────►│  CockroachDB    │◄────►│  CockroachDB    │
    │  (Primary)      │      │  (Replica)      │      │  (Replica)      │
    └─────────────────┘      └─────────────────┘      └─────────────────┘
    
Data Placement:
- User profiles: Pinned to user's home region
- Global data: Replicated across all regions
- EU user data: Must stay in EU (GDPR)

Implementation

-- CockroachDB: Multi-region table configuration
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email VARCHAR(255) UNIQUE,
    region VARCHAR(10) NOT NULL,  -- 'na', 'eu', 'apac'
    profile JSONB,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Pin data to specific regions
ALTER TABLE users CONFIGURE ZONE USING
    num_replicas = 3,
    constraints = '{"+region=na": 1, "+region=eu": 1, "+region=apac": 1}',
    lease_preferences = '[[+region=na]]';

-- Regional pinning for EU compliance (assumes a multi-region database
-- with "eu-west-1" added as a database region)
CREATE TABLE eu_user_data (
    user_id UUID REFERENCES users(id),
    sensitive_data JSONB
) LOCALITY REGIONAL BY TABLE IN "eu-west-1";


# Python: Region-aware connection routing
from typing import Optional
import os
import psycopg2

REGION_ENDPOINTS = {
    'na': 'cockroach-na.example.com:26257',
    'eu': 'cockroach-eu.example.com:26257',
    'apac': 'cockroach-apac.example.com:26257'
}

def get_regional_connection(user_region: Optional[str] = None):
    """Route to nearest region for low latency."""
    region = user_region or detect_client_region()
    endpoint = REGION_ENDPOINTS.get(region, REGION_ENDPOINTS['na'])
    
    return psycopg2.connect(
        host=endpoint.split(':')[0],
        port=26257,
        database='myapp',
        user='app_user',
        sslmode='require'
    )

def detect_client_region() -> str:
    """Detect region from request headers or IP."""
    # In practice: use CloudFlare/AWS geo headers or IP geolocation,
    # then map the result onto the 'na'/'eu'/'apac' keys above
    aws_region = os.environ.get('AWS_REGION', '')
    if aws_region.startswith('eu'):
        return 'eu'
    if aws_region.startswith('ap'):
        return 'apac'
    return 'na'
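
The non-functional requirements ask for graceful handling of network partitions. One simple approach is a static failover order per home region, consulted when the preferred endpoint is unhealthy. A hypothetical sketch (region keys match `REGION_ENDPOINTS` above; real systems would drive `healthy` from health checks or service discovery):

```python
# Hypothetical failover preference per home region
FAILOVER = {
    'na':   ['eu', 'apac'],
    'eu':   ['na', 'apac'],
    'apac': ['na', 'eu'],
}

def pick_region(home: str, healthy: set) -> str:
    """Prefer the user's home region; fall back to the next healthy one."""
    if home in healthy:
        return home
    for alt in FAILOVER.get(home, []):
        if alt in healthy:
            return alt
    raise RuntimeError('no healthy region available')
```

`get_regional_connection` could call `pick_region` before choosing an endpoint, trading read latency for availability during a partition.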

Building Your Portfolio

Portfolio Tips:
  • GitHub Repository: Create a well-documented repo with README, schema diagrams, and setup instructions
  • Docker Compose: Make projects easy to run locally with docker-compose
  • Performance Benchmarks: Include load test results showing your optimization work
  • Blog Post: Write about challenges faced and solutions implemented
  • Live Demo: Deploy a read-only version for interviewers to explore

Recommended Portfolio Structure:

database-capstone-projects/
├── README.md                    # Overview of all projects
├── docker-compose.yml           # Run everything locally
├── 01-ecommerce/
│   ├── README.md
│   ├── schema/
│   │   └── migrations/
│   ├── seed-data/
│   ├── queries/
│   └── benchmarks/
├── 02-realtime-analytics/
│   ├── README.md
│   ├── timescaledb/
│   ├── redis/
│   └── dashboard/
├── 03-multi-tenant-saas/
│   ├── README.md
│   ├── schema/
│   ├── rls-policies/
│   └── api-examples/
└── 04-global-distributed/
    ├── README.md
    ├── cockroachdb-config/
    ├── region-routing/
    └── chaos-testing/

Series Conclusion: Your Database Mastery Journey

Congratulations on completing the Complete Database Mastery Series! You've journeyed from SQL fundamentals through advanced query optimization, explored both relational and NoSQL databases, understood distributed systems and cloud offerings, and applied security and analytics best practices.

Series Complete! You now have the knowledge to design, implement, optimize, secure, and scale databases for any application. These capstone projects demonstrate your mastery—build them, deploy them, and showcase them in your portfolio.

What's Next?

  • Build the capstone projects — Start with the e-commerce project and progress through all four
  • Get certified — Pursue PostgreSQL, AWS Database, or MongoDB certifications
  • Contribute to open source — Apply your skills to database-related open source projects
  • Teach others — Solidify your knowledge by helping others learn databases