Module 10: Event-Driven Architecture
In a synchronous world, services call each other directly — Service A sends a request to Service B and waits for a response. This creates temporal coupling (both must be alive at the same time) and behavioral coupling (A must know B's API). Event-driven architecture eliminates both.
Instead of "Service A calls Service B," the model becomes: "Service A publishes an event. Anyone interested can react." The producer doesn't know — or care — who's listening. This is the fundamental shift that enables truly decoupled, scalable systems.
Event Fundamentals: Producers, Consumers, Brokers
An event-driven system has three core participants:
- Producers — Emit events when something interesting happens ("OrderPlaced", "PaymentProcessed", "InventoryLow")
- Consumers — Subscribe to events they care about and react accordingly
- Brokers — The intermediary that receives events from producers, stores them durably, and delivers them to consumers
Here's a well-structured event schema that captures intent clearly:
{
"eventId": "evt_a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"eventType": "OrderPlaced",
"aggregateId": "order_12345",
"aggregateType": "Order",
"timestamp": "2026-05-15T14:30:00.000Z",
"version": 1,
"metadata": {
"correlationId": "req_xyz789",
"causationId": "cmd_abc123",
"userId": "user_456",
"source": "order-service"
},
"payload": {
"orderId": "order_12345",
"customerId": "cust_789",
"items": [
{ "productId": "prod_001", "quantity": 2, "unitPrice": 29.99 }
],
"totalAmount": 59.98,
"currency": "USD"
}
}
Notice the correlationId and causationId — these enable distributed tracing across async boundaries. Every event carries the context of why it happened, not just what happened.
Event Sourcing: Storing Events, Not State
Traditional systems store current state — the latest snapshot of an entity. Event sourcing inverts this: you store the sequence of events that led to the current state. The current state is a derived view, computed by replaying events.
flowchart LR
subgraph EVENTS["Event Store (Immutable Log)"]
E1["OrderCreated
$0.00"]
E2["ItemAdded
+$29.99"]
E3["ItemAdded
+$19.99"]
E4["DiscountApplied
-$5.00"]
E5["OrderConfirmed
$44.98"]
end
subgraph STATE["Current State (Derived)"]
S1["Order #12345
Status: Confirmed
Total: $44.98
Items: 2"]
end
E1 --> E2 --> E3 --> E4 --> E5
E5 -->|"Replay"| S1
style E1 fill:#e8f4f4,stroke:#3B9797,color:#132440
style E2 fill:#e8f4f4,stroke:#3B9797,color:#132440
style E3 fill:#e8f4f4,stroke:#3B9797,color:#132440
style E4 fill:#e8f4f4,stroke:#3B9797,color:#132440
style E5 fill:#e8f4f4,stroke:#3B9797,color:#132440
style S1 fill:#f0f4f8,stroke:#16476A,color:#132440
Why store events instead of state?
- Complete audit trail — You can answer "what happened and when" for any entity at any point in time
- Temporal queries — "What was the order state at 2:30 PM?" Replay events up to that timestamp
- Event replay — Fix a bug in your projection logic, replay all events, get corrected state
- Decoupled projections — Different services can build different read models from the same event stream
The tradeoffs:
- Event schemas become your API contract — changing them requires careful versioning
- Replaying millions of events is slow — you need snapshots (periodic state captures) for performance
- Querying is complex — you can't "SELECT * WHERE status = 'confirmed'" against an event store
CQRS: Separating Reads from Writes
Command Query Responsibility Segregation (CQRS) separates the write model (commands that change state) from the read model (queries that return data). This isn't just architectural tidiness — it solves a fundamental scaling mismatch.
flowchart TD
CLIENT["Client Application"]
subgraph WRITE["Write Side (Commands)"]
CMD["Command Handler"]
AGG["Domain Aggregate"]
ES["Event Store"]
end
subgraph READ["Read Side (Queries)"]
PROJ["Projection Engine"]
RM1["Read Model A
(SQL - List View)"]
RM2["Read Model B
(Elasticsearch - Search)"]
RM3["Read Model C
(Redis - Dashboard)"]
end
CLIENT -->|"PlaceOrder"| CMD
CMD --> AGG
AGG -->|"OrderPlaced"| ES
ES -->|"Publish"| PROJ
PROJ --> RM1
PROJ --> RM2
PROJ --> RM3
CLIENT -->|"GetOrders"| RM1
CLIENT -->|"SearchOrders"| RM2
CLIENT -->|"GetMetrics"| RM3
style CMD fill:#e8f4f4,stroke:#3B9797,color:#132440
style AGG fill:#e8f4f4,stroke:#3B9797,color:#132440
style ES fill:#e8f4f4,stroke:#3B9797,color:#132440
style PROJ fill:#f0f4f8,stroke:#16476A,color:#132440
style RM1 fill:#f0f4f8,stroke:#16476A,color:#132440
style RM2 fill:#f0f4f8,stroke:#16476A,color:#132440
style RM3 fill:#f0f4f8,stroke:#16476A,color:#132440
In most systems, reads outnumber writes by 100:1 or more. CQRS lets you:
- Scale reads and writes independently — 50 read replicas, one write master
- Optimize each side differently — Normalize writes for consistency, denormalize reads for speed
- Use different storage engines — Writes to PostgreSQL, reads from Elasticsearch and Redis
- Evolve read models without touching business logic — Add a new read model by simply processing existing events
Apache Kafka: The Distributed Event Platform
Kafka isn't just a message queue — it's a distributed commit log. Understanding its architecture explains why it scales to millions of events per second while maintaining strict ordering guarantees.
flowchart TD
subgraph TOPIC["Topic: order-events"]
subgraph P0["Partition 0"]
P0E1["Offset 0: OrderPlaced"]
P0E2["Offset 1: OrderConfirmed"]
P0E3["Offset 2: OrderShipped"]
end
subgraph P1["Partition 1"]
P1E1["Offset 0: OrderPlaced"]
P1E2["Offset 1: OrderCancelled"]
end
subgraph P2["Partition 2"]
P2E1["Offset 0: OrderPlaced"]
P2E2["Offset 1: OrderPlaced"]
P2E3["Offset 2: OrderConfirmed"]
end
end
subgraph CG["Consumer Group: fulfillment"]
C1["Consumer 1"]
C2["Consumer 2"]
C3["Consumer 3"]
end
P0 --> C1
P1 --> C2
P2 --> C3
style P0E1 fill:#e8f4f4,stroke:#3B9797,color:#132440
style P0E2 fill:#e8f4f4,stroke:#3B9797,color:#132440
style P0E3 fill:#e8f4f4,stroke:#3B9797,color:#132440
style P1E1 fill:#e8f4f4,stroke:#3B9797,color:#132440
style P1E2 fill:#e8f4f4,stroke:#3B9797,color:#132440
style P2E1 fill:#e8f4f4,stroke:#3B9797,color:#132440
style P2E2 fill:#e8f4f4,stroke:#3B9797,color:#132440
style P2E3 fill:#e8f4f4,stroke:#3B9797,color:#132440
style C1 fill:#f0f4f8,stroke:#16476A,color:#132440
style C2 fill:#f0f4f8,stroke:#16476A,color:#132440
style C3 fill:#f0f4f8,stroke:#16476A,color:#132440
Core Kafka concepts:
- Topics — Named categories of events (like database tables for events)
- Partitions — Each topic splits into partitions for parallelism. Events with the same key always go to the same partition (guaranteeing order for that key)
- Consumer Groups — Multiple consumers share partitions. Each partition is read by exactly one consumer in a group — this enables horizontal scaling
- Offsets — Each event in a partition has a sequential offset. Consumers track their position — they can replay from any offset
- Retention — Kafka retains events for a configurable period (days, weeks, forever). Unlike traditional queues, reading doesn't delete the message
A typical Kafka topic configuration:
# Kafka topic configuration for order events
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
name: order-events
labels:
strimzi.io/cluster: production-cluster
spec:
partitions: 12
replicas: 3
config:
retention.ms: 604800000 # 7 days retention
cleanup.policy: delete
min.insync.replicas: 2 # At least 2 replicas must ACK
compression.type: lz4 # Compress for network efficiency
max.message.bytes: 1048576 # 1MB max event size
segment.bytes: 1073741824 # 1GB log segment files
And a Python producer that publishes order events with proper error handling:
from confluent_kafka import Producer
import json
import uuid
from datetime import datetime, timezone
# Configure Kafka producer with reliability settings
producer_config = {
'bootstrap.servers': 'kafka-broker-1:9092,kafka-broker-2:9092',
'acks': 'all', # Wait for all replicas to ACK
'retries': 3, # Retry transient failures
'retry.backoff.ms': 100,
'enable.idempotence': True, # Exactly-once semantics
'max.in.flight.requests.per.connection': 5,
'compression.type': 'lz4',
'linger.ms': 5, # Batch small messages
}
producer = Producer(producer_config)
def publish_order_event(order_id, customer_id, items, total):
"""Publish an OrderPlaced event to Kafka with idempotent delivery."""
event = {
'eventId': str(uuid.uuid4()),
'eventType': 'OrderPlaced',
'aggregateId': order_id,
'timestamp': datetime.now(timezone.utc).isoformat(),
'payload': {
'orderId': order_id,
'customerId': customer_id,
'items': items,
'totalAmount': total,
}
}
# Use order_id as key — ensures all events for same order
# go to same partition (preserving ordering per order)
producer.produce(
topic='order-events',
key=order_id.encode('utf-8'),
value=json.dumps(event).encode('utf-8'),
callback=delivery_callback
)
producer.flush()
def delivery_callback(err, msg):
if err:
print(f"Delivery failed: {err}")
else:
print(f"Delivered to {msg.topic()}[{msg.partition()}] @ offset {msg.offset()}")
# Example usage
publish_order_event(
order_id='order_12345',
customer_id='cust_789',
items=[{'productId': 'prod_001', 'quantity': 2, 'unitPrice': 29.99}],
total=59.98
)
print("Event published successfully")
Key Kafka commands for operations:
# Create a topic with 12 partitions and replication factor 3
kafka-topics.sh --bootstrap-server localhost:9092 \
--create --topic order-events \
--partitions 12 --replication-factor 3
# Describe topic configuration and partition assignments
kafka-topics.sh --bootstrap-server localhost:9092 \
--describe --topic order-events
# Check consumer group lag (how far behind consumers are)
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
--group fulfillment-service --describe
# Read events from beginning (useful for debugging)
kafka-console-consumer.sh --bootstrap-server localhost:9092 \
--topic order-events --from-beginning --max-messages 10
# Reset consumer group offset to replay events
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
--group fulfillment-service --topic order-events \
--reset-offsets --to-earliest --execute
RabbitMQ & NATS: Alternative Broker Patterns
Kafka excels at high-throughput event streaming with durable storage. But not every system needs a distributed commit log. RabbitMQ and NATS serve different use cases:
When to Use Each Broker
| Criteria | Kafka | RabbitMQ | NATS |
|---|---|---|---|
| Model | Distributed log | Smart broker / dumb consumer | At-most-once pub/sub |
| Ordering | Per-partition | Per-queue | None (best effort) |
| Throughput | Millions/sec | Tens of thousands/sec | Millions/sec |
| Retention | Configurable (days–forever) | Until consumed | JetStream for persistence |
| Best For | Event sourcing, analytics pipelines | Task queues, routing, RPC | IoT, edge computing, microservices |
RabbitMQ Exchange Types provide sophisticated routing:
- Direct — Route by exact routing key match (e.g., "payment.success" → payment queue)
- Fanout — Broadcast to all bound queues (e.g., order event → notification + analytics + audit)
- Topic — Pattern matching with wildcards (e.g., "order.*.failed" matches "order.payment.failed")
- Headers — Route by message header attributes (rarely used)
Message Ordering & Idempotency
Two problems that plague every event-driven system:
Ordering: In Kafka, order is guaranteed within a partition. If you need total ordering for an entity (all events for order #123 in sequence), use the entity ID as the partition key. Across partitions, there's no ordering guarantee — design for this.
Idempotency: Messages can be delivered more than once (network retries, consumer crashes before committing offset). Every consumer must handle duplicates safely:
import hashlib
import redis
# Redis-based idempotency check for event consumers
redis_client = redis.Redis(host='localhost', port=6379, db=0)
def process_event_idempotently(event):
"""Process event exactly once using deduplication with Redis."""
event_id = event['eventId']
dedup_key = f"processed:{event_id}"
# Check if already processed (atomic SET NX with TTL)
already_processed = not redis_client.set(
dedup_key, '1', nx=True, ex=86400 # 24h TTL
)
if already_processed:
print(f"Skipping duplicate event: {event_id}")
return
# Process the event (business logic here)
if event['eventType'] == 'OrderPlaced':
order = event['payload']
print(f"Processing order {order['orderId']} "
f"for customer {order['customerId']}")
# ... fulfill order logic ...
print(f"Successfully processed event: {event_id}")
# Example: simulate processing with deduplication
sample_event = {
'eventId': 'evt_abc123',
'eventType': 'OrderPlaced',
'payload': {'orderId': 'order_001', 'customerId': 'cust_456'}
}
process_event_idempotently(sample_event) # Processes
process_event_idempotently(sample_event) # Skips (duplicate)
Module 11: Data Architecture
Once you move beyond a single database server, every operation involves tradeoffs. You can't have instant consistency, high availability, and network partition tolerance simultaneously. Understanding these tradeoffs — and knowing which ones your system can tolerate — is the core skill of data architecture.
Replication Strategies
Replication serves two purposes: fault tolerance (survive node failures) and read scaling (spread read load across replicas). The strategy you choose determines your consistency guarantees.
Leader-Follower (Single-Leader) Replication:
- One node accepts writes (leader); followers replicate asynchronously or synchronously
- Synchronous: follower confirms before client gets ACK — strong consistency but higher latency
- Asynchronous: leader ACKs immediately, followers catch up eventually — lower latency but risk of data loss if leader crashes
- Used by: PostgreSQL, MySQL, MongoDB (primary/secondary)
Multi-Leader Replication:
- Multiple nodes accept writes — useful for multi-datacenter deployments
- Conflict resolution required: last-write-wins (LWW), merge, or custom resolution
- Used by: CockroachDB, Google Spanner, Cassandra (any node writes)
Leaderless Replication:
- Any node accepts reads and writes. Quorum-based: W + R > N ensures consistency
- Write to W of N nodes, read from R of N nodes — if W + R > N, at least one node has the latest value
- Used by: Cassandra, DynamoDB, Riak
Sharding & Partitioning
When a single node can't hold all your data or handle all your traffic, you partition (shard) data across multiple nodes. Each shard holds a subset of the data. The sharding strategy determines data distribution and query efficiency.
Range-Based Sharding:
- Assign key ranges to shards (e.g., users A–M → shard 1, N–Z → shard 2)
- Pro: Range queries are efficient (find all users between "Smith" and "Taylor")
- Con: Hot spots if data isn't uniformly distributed (most users start with "S")
Hash-Based Sharding:
- Hash the key, mod by number of shards:
shard = hash(key) % num_shards - Pro: Even distribution regardless of key patterns
- Con: Range queries require scatter-gather across all shards
- Con: Adding shards requires rehashing (consistent hashing mitigates this)
Directory-Based Sharding:
- A lookup service maps keys to shards. Most flexible but adds a single point of failure and latency
- Used when sharding logic is complex or data needs manual placement (e.g., geographic compliance)
CAP Theorem: The Fundamental Tradeoff
The CAP theorem (Brewer, 2000) states: in a distributed system experiencing a network partition, you must choose between Consistency (every read returns the most recent write) and Availability (every request receives a response). You cannot have both during a partition.
flowchart TD
CAP["CAP Theorem"]
C["Consistency
Every read sees latest write"]
A["Availability
Every request gets a response"]
P["Partition Tolerance
System works despite network splits"]
CAP --> C
CAP --> A
CAP --> P
CP["CP Systems
etcd, ZooKeeper, HBase
MongoDB (default)"]
AP["AP Systems
Cassandra, DynamoDB
CouchDB, Riak"]
C ---|"+ P"| CP
A ---|"+ P"| AP
style C fill:#e8f4f4,stroke:#3B9797,color:#132440
style A fill:#f0f4f8,stroke:#16476A,color:#132440
style P fill:#fdf0f0,stroke:#BF092F,color:#132440
style CP fill:#e8f4f4,stroke:#3B9797,color:#132440
style AP fill:#f0f4f8,stroke:#16476A,color:#132440
Key clarification: CAP isn't "pick 2 of 3" — it's "during a partition (P is forced), choose C or A." When there's no partition, you can have both consistency and availability. The question is: what happens when the network splits?
CP vs AP in Production
CP (Consistency + Partition Tolerance):
- etcd / ZooKeeper: Used for service discovery and leader election. During a partition, minority nodes stop serving reads/writes rather than risk serving stale data. Correctness > availability.
- Google Spanner: Globally consistent via TrueTime. Transactions may block during partitions. Chosen for financial systems where inconsistency = money loss.
AP (Availability + Partition Tolerance):
- DynamoDB: Every request gets a response, even during partitions. Eventual consistency — reads may return stale data. Chosen for shopping carts where "add to cart always works" beats "cart is perfectly consistent."
- Cassandra: Tunable consistency (ONE, QUORUM, ALL). At QUORUM, it's CP-ish; at ONE, it's fully AP. Flexibility per query.
PACELC: Beyond CAP
CAP only describes behavior during partitions. PACELC extends this: "If there's a Partition, choose Availability or Consistency. Else (normal operation), choose Latency or Consistency."
This captures the latency-consistency tradeoff that exists even when the network is healthy:
- PA/EL (DynamoDB, Cassandra at ONE): During partition → available. Normal operation → low latency (async replication)
- PC/EC (etcd, ZooKeeper): During partition → consistent. Normal operation → consistent (synchronous replication, higher latency)
- PA/EC (rare): Available during partitions, but consistent during normal ops. Possible with sync replication + failover-to-AP
Case Studies
LinkedIn: The Birth of Kafka
LinkedIn's Data Pipeline Problem
By 2010, LinkedIn had dozens of services that needed to share data: the activity feed consumed events from connections, messaging, job applications, and endorsements. They tried point-to-point integrations — each service calling each other service — and ended up with an O(N²) coupling nightmare.
The solution: Jay Kreps, Neha Narkhede, and Jun Rao built Kafka as an internal platform (2011). Instead of N² connections, every service publishes to Kafka and consumes from Kafka. The coupling became O(N) — each service has one integration point.
Scale today: LinkedIn processes over 7 trillion messages per day through Kafka. Every action — profile view, connection request, message sent — becomes an event that feeds dozens of downstream systems: search indexing, recommendation engines, analytics, compliance, and the activity feed itself.
Key architectural decisions:
- Append-only log (not destructive read) — multiple consumers read the same stream independently
- Partition-based parallelism — scale consumers by adding partitions
- Retention-based storage — keep events for days/weeks, enabling replay and reprocessing
DynamoDB: Choosing Availability Over Consistency
Amazon's Dynamo Paper (2007)
During the 2004 holiday season, Amazon's shopping cart service experienced availability issues. The engineering team asked: "Is it worse if a customer sees a stale cart (missing a recently added item) or if the cart is completely unavailable?" The answer was clear — an available but slightly stale cart is better than a 500 error.
Design choices that followed:
- AP over CP: The system always accepts writes, even during network partitions. Conflicting writes are resolved later using vector clocks and application-level merging.
- Consistent hashing: Data distributed across a ring of nodes. Adding/removing nodes moves minimal data.
- Sloppy quorums: If the designated node is unreachable, write to a "hint" node temporarily. Reconcile when the original node recovers.
- Anti-entropy with Merkle trees: Background process compares data between replicas using hash trees, identifying and repairing inconsistencies.
Impact: The Dynamo paper influenced Cassandra (Facebook), Riak (Basho), and Voldemort (LinkedIn). DynamoDB (the managed AWS service) builds on these principles, serving single-digit millisecond latency at any scale.
Conclusion & Next Steps
Modules 10 and 11 covered the two pillars of distributed data systems: how data moves (event-driven architecture) and how data persists (distributed data architecture).
The key takeaways:
- Event sourcing stores history, not snapshots. You gain auditability, temporal queries, and replay capability — at the cost of query complexity and schema evolution challenges.
- CQRS separates concerns that have different scaling profiles. Writes need consistency; reads need speed. Optimize each independently, but accept eventual consistency between them.
- Kafka is a distributed commit log, not a message queue. Its partition model enables both ordering guarantees (per key) and horizontal scaling (more partitions = more consumers).
- Idempotency is non-negotiable. In any system where messages can be delivered more than once (which is every distributed system), consumers must safely handle duplicates.
- CAP theorem forces a choice during partitions. Most systems choose AP (availability) for user-facing paths and CP (consistency) for coordination/metadata paths.
- PACELC reveals the latency-consistency tradeoff that exists even during normal operation. Synchronous replication = consistent + slow. Asynchronous = fast + eventually consistent.
Next in the Series
In Part 8: API & Cloud-Native Architecture, we'll explore API design styles (REST, gRPC, GraphQL), gateway patterns, and the principles of cloud-native architecture — immutability, elasticity, and the Twelve-Factor methodology that underpins modern deployments.