Introduction
Software architecture is the skeleton of a system. Just as a building's architecture determines whether it can support ten floors or fifty, software architecture determines whether a system can serve ten users or ten million — whether a team of five can evolve it or whether it will require two hundred engineers to keep it alive.
A common shorthand, popularised by Martin Fowler, describes architecture as "the decisions that are hard to change." Fowler also relays Ralph Johnson's definition: architecture is the shared understanding that the expert developers in a project have of the system design. Both definitions point to the same truth: architecture is about the significant decisions, the ones that constrain everything else.
In this article, we explore what software architecture actually is, how it differs from design, and then dive deep into the major architectural patterns you will encounter in real systems. We finish with a practical tool — the Architectural Decision Record — that helps teams document and communicate these critical choices.
Why Architecture Matters for Delivery
Architecture directly determines three delivery properties:
- Deployment units: How many independent pieces can be deployed separately? A monolith is one deployment unit. A microservices system may have hundreds.
- Team boundaries: Conway's Law states that organisations design systems that mirror their communication structures. Architecture defines where teams can work independently without blocking each other.
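- Scalability constraints: A single-process monolith scales vertically (bigger machines) or by running copies of the whole application behind a load balancer; it cannot scale one component on its own. A distributed system can scale individual components horizontally (more machines). Architecture determines which scaling paths are available.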
Poor architecture choices early in a project create accidental complexity that compounds over time. Every feature takes longer, every change risks breaking something else, and eventually the system becomes so rigid that a rewrite is the only option.
Architecture vs Design
These terms are often used interchangeably, but they describe different levels of abstraction:
| Dimension | Architecture | Design |
|---|---|---|
| Scope | System-wide structure and boundaries | Within a single component or module |
| Concerns | Quality attributes, deployment topology, communication patterns | Classes, interfaces, algorithms, data structures |
| Change Cost | High — affects multiple teams, services, infrastructure | Lower — contained within a module |
| Decided By | Senior architects, tech leads, cross-team decisions | Individual developers within their component |
| Examples | "We use event-driven communication between services" | "This class uses the Strategy pattern for payment processing" |
| Documentation | Architecture Decision Records, C4 diagrams | Code itself, UML class diagrams, inline comments |
What Makes Something Architectural?
A decision is architectural if it satisfies one or more of these criteria:
- It is hard to reverse — Changing it would require significant rework across the system
- It constrains other decisions — Once chosen, it limits what designs are possible within components
- It affects multiple stakeholders — Development teams, operations, security, and business all have opinions
- It involves tradeoffs between quality attributes — You cannot optimise for everything simultaneously
Key Architectural Patterns
An architectural pattern is a reusable solution to a commonly occurring problem in system structure. Patterns are not prescriptions — they are options. The skill of architecture lies in knowing which pattern fits which context.
Client-Server
The most fundamental distributed architecture pattern. A client sends requests; a server processes them and returns responses. The entire web is built on this pattern.
flowchart LR
subgraph Clients
A[Web Browser]
B[Mobile App]
C[CLI Tool]
end
subgraph Server
D[Load Balancer]
E[Application Server]
F[Database]
end
A -->|HTTP Request| D
B -->|REST API| D
C -->|gRPC| D
D --> E
E --> F
E -->|Response| D
D -->|HTTP Response| A
Thin vs Thick Clients:
- Thin client: Minimal logic in the client. The server does most processing. Example: traditional server-rendered web apps (Rails, Django).
- Thick client: Significant logic in the client. The server provides APIs. Example: Single Page Applications (React, Angular), mobile apps.
When to use: Almost every web application, mobile backend, API service. It is the default starting point for most systems.
Tradeoffs: Simple to understand and deploy. Single point of failure at the server. Server must scale to handle all client load. Network latency affects every interaction.
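The request/response cycle can be sketched with Python's standard library alone. This is a minimal illustration, not a production setup: `GreetingHandler` and `fetch_greeting` are hypothetical names, the server lives in a background thread of the same process, and there is no load balancer or database.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class GreetingHandler(BaseHTTPRequestHandler):
    """Server side: owns the logic and returns a JSON response."""

    def do_GET(self):
        body = json.dumps({"path": self.path, "message": "hello"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging for the demo


def fetch_greeting(path="/greet"):
    """Start a throwaway server, act as a client once, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), GreetingHandler)  # port 0: OS picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client side: send one request and wait for the response.
        connection = HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        connection.request("GET", path)
        response = connection.getresponse()
        result = response.status, json.loads(response.read())
        connection.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_greeting("/greet")` returns the status code and the decoded JSON body. The client here is thin: it knows only the address and the path, and all logic stays on the server.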
Layered (N-Tier) Architecture
The layered pattern organises code into horizontal layers, each with a specific responsibility. Each layer communicates only with the layer directly below it (strict layering) or with any layer below it (relaxed layering).
flowchart TD
A[Presentation Layer<br/>UI, Controllers, Views] --> B[Business Logic Layer<br/>Services, Domain Objects, Rules]
B --> C[Data Access Layer<br/>Repositories, ORMs, Queries]
C --> D[Database Layer<br/>PostgreSQL, MongoDB, Redis]
style A fill:#3B9797,color:#fff
style B fill:#16476A,color:#fff
style C fill:#132440,color:#fff
style D fill:#BF092F,color:#fff
Strict vs Relaxed Layering:
- Strict: Layer N can only call Layer N-1. Forces all requests through every layer. Maximum separation but can create "pass-through" layers that add no value.
- Relaxed: Layer N can call any layer below it. More flexible but creates hidden dependencies that make refactoring harder.
Common Layer Configurations:
- 3-tier: Presentation → Business → Data (most web applications)
- 4-tier: Presentation → Application → Domain → Infrastructure (Domain-Driven Design)
- 2-tier: Client → Database (simple desktop applications)
When to use: Business applications with clear separation of concerns. Teams that want predictable structure. Codebases where multiple developers work on different layers simultaneously.
Tradeoffs: Easy to understand and implement. Can become monolithic if all layers deploy together. Performance overhead from layer-to-layer calls. Risk of "architecture sinkhole" where layers just pass data through without transformation.
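Strict layering can be sketched in a few lines. The class names (`UserRepository`, `UserService`, `UserController`) are hypothetical, and a dict stands in for the database. Each layer holds a reference only to the layer directly below it.

```python
class UserRepository:
    """Data access layer: the only code that touches storage (a dict here)."""

    def __init__(self):
        self._rows = {1: {"id": 1, "name": "ada", "active": True}}

    def find_by_id(self, user_id):
        return self._rows.get(user_id)


class UserService:
    """Business logic layer: applies rules, knows nothing about HTTP or UI."""

    def __init__(self, repository):
        self._repository = repository

    def get_active_user(self, user_id):
        user = self._repository.find_by_id(user_id)
        if user is None or not user["active"]:
            raise LookupError(f"no active user with id {user_id}")
        return user


class UserController:
    """Presentation layer: translates between the outside world and the service."""

    def __init__(self, service):
        self._service = service

    def show(self, user_id):
        try:
            user = self._service.get_active_user(user_id)
        except LookupError as error:
            return {"status": 404, "body": str(error)}
        return {"status": 200, "body": user["name"].title()}


# Wiring happens once, top to bottom.
controller = UserController(UserService(UserRepository()))
```

Because the controller never sees the repository, the storage layer can be replaced without touching presentation code. Under relaxed layering, `UserController` would be allowed to hold a `UserRepository` directly for simple reads.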
The Architecture Sinkhole Anti-Pattern
A team building a financial reporting system implemented strict 4-tier architecture. They discovered that 80% of their requests simply passed data from the database through the Data Access Layer → Business Layer → Application Layer → Presentation Layer without any transformation. The "Business Logic" layer was just calling repository.findById(id) and returning the result unchanged. The solution: bypass layers when they add no value. Allow the Presentation layer to call the Data Access layer directly for simple read operations. This is the pragmatic reality of relaxed layering.
Pipe-and-Filter
Data flows through a chain of processing stages (filters), connected by channels (pipes). Each filter is independent — it receives input, transforms it, and produces output. The Unix command line is the canonical example: cat file.log | grep ERROR | sort | uniq -c | sort -rn.
flowchart LR
A[Data Source<br/>CSV Files] --> B[Extract<br/>Parse & Validate]
B --> C[Transform<br/>Clean & Enrich]
C --> D[Aggregate<br/>Group & Sum]
D --> E[Format<br/>JSON Output]
E --> F[Load<br/>Data Warehouse]
style A fill:#132440,color:#fff
style B fill:#3B9797,color:#fff
style C fill:#3B9797,color:#fff
style D fill:#3B9797,color:#fff
style E fill:#3B9797,color:#fff
style F fill:#BF092F,color:#fff
Key Properties:
- Composability: Filters can be rearranged, added, or removed without changing other filters
- Reusability: A "validate email" filter can be used in multiple pipelines
- Parallelism: Independent filters can run concurrently on different data chunks
- Testability: Each filter can be tested in isolation with known input/output
When to use: Data processing pipelines (ETL), stream processing, compiler stages (lexing → parsing → semantic analysis → code generation), image processing, log analysis.
Tradeoffs: Excellent composability and testability. Not suitable for interactive applications. Overhead from serialisation/deserialisation between stages. Error handling across the pipeline is complex.
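In Python, generators make a natural stand-in for pipes. The sketch below loosely mirrors the shell pipeline from the start of this section. The filter names and the sample log format (a timestamp, a space, then the message) are assumptions made for illustration.

```python
from collections import Counter


def grep(lines, needle):
    """Filter: pass through only the lines containing `needle`."""
    return (line for line in lines if needle in line)


def strip_timestamp(lines):
    """Filter: drop the leading timestamp field from each line."""
    return (line.split(" ", 1)[1] for line in lines)


def count_descending(lines):
    """Sink: count identical lines, most frequent first."""
    return Counter(lines).most_common()


def run_pipeline(source):
    # Each stage lazily consumes the previous stage's output; a stage can be
    # swapped or removed without editing the others.
    return count_descending(strip_timestamp(grep(source, "ERROR")))


SAMPLE_LOG = [
    "10:00 ERROR disk full",
    "10:01 INFO started",
    "10:02 ERROR timeout",
    "10:03 ERROR disk full",
]

print(run_pipeline(SAMPLE_LOG))
# → [('ERROR disk full', 2), ('ERROR timeout', 1)]
```

Each filter can be tested on its own with a plain list as input, which is the testability property listed above.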
Event-Driven Architecture
Components communicate by producing and consuming events, which are records of something that happened. Producers do not know (or care) who consumes their events, and consumers do not know who produced them. The two sides share only the event schema and the broker, which keeps coupling very low.
flowchart TD
subgraph Producers
A[Order Service]
B[Payment Service]
C[Inventory Service]
end
subgraph Event Bus
D[Message Broker<br/>Kafka / RabbitMQ]
end
subgraph Consumers
E[Email Service]
F[Analytics Service]
G[Audit Log Service]
H[Shipping Service]
end
A -->|OrderPlaced| D
B -->|PaymentProcessed| D
C -->|StockUpdated| D
D -->|OrderPlaced| E
D -->|OrderPlaced| F
D -->|PaymentProcessed| G
D -->|OrderPlaced| H
Event Types:
- Domain Events: "OrderPlaced", "UserRegistered" — business-meaningful things that happened
- Integration Events: Events published for other services to consume across boundaries
- Event Notifications: Thin events that say "something changed" — consumers must query for details
- Event-Carried State Transfer: Fat events containing all the data consumers need — no callbacks required
Eventual Consistency: Because events are processed asynchronously, the system is eventually consistent — different services may have different views of the world for brief periods. This is the fundamental tradeoff of event-driven systems.
When to use: Systems requiring high decoupling between services. Scenarios where multiple consumers need to react to the same event. Systems where eventual consistency is acceptable. High-throughput systems (millions of events per second).
Tradeoffs: Maximum decoupling and scalability. Difficult to debug (no single call stack). Eventual consistency complicates business logic. Event schema evolution requires careful versioning.
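The publish/subscribe relationship can be shown with a tiny in-process bus. This is a sketch only: `EventBus` is a hypothetical class, delivery is synchronous, and nothing is persisted, so it models the decoupling but not the asynchrony or durability of a real broker such as Kafka or RabbitMQ.

```python
from collections import defaultdict


class EventBus:
    """In-process stand-in for a message broker (hypothetical, synchronous)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        # The producer names the event; it never sees who is listening.
        for handler in self._handlers[event_type]:
            handler(payload)


bus = EventBus()
sent_emails = []
shipments = []

# Two independent consumers react to the same event.
bus.subscribe("OrderPlaced", lambda event: sent_emails.append(event["order_id"]))
bus.subscribe("OrderPlaced", lambda event: shipments.append(event["order_id"]))

bus.publish("OrderPlaced", {"order_id": 42})
bus.publish("StockUpdated", {"sku": "A1"})  # no subscribers, so nothing happens
```

Adding a third consumer means one more `subscribe` call; the code that publishes `OrderPlaced` does not change.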
Primary-Replica (Master-Slave)
One node (the primary) handles all write operations. Multiple replicas receive copies of the data and handle read operations. This separates read and write workloads, enabling horizontal scaling of reads.
Use cases:
- Database replication: PostgreSQL primary with read replicas for reporting queries
- Content distribution: Primary content server replicating to CDN edge nodes
- High availability: If the primary fails, a replica is promoted (failover)
Consistency Models:
- Synchronous replication: Primary waits for replicas to confirm before acknowledging writes. Strong consistency but higher latency.
- Asynchronous replication: Primary acknowledges writes immediately, replicates in the background. Lower latency but risk of data loss on primary failure.
- Semi-synchronous: Primary waits for at least one replica to confirm. Balance between consistency and performance.
When to use: Read-heavy workloads (90%+ reads), systems requiring high availability, scenarios where read scaling is more important than write scaling.
Tradeoffs: Excellent read scalability. Write bottleneck at primary. Replication lag creates potential for stale reads. Failover adds operational complexity.
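Replication lag under asynchronous replication is easy to see in a toy model. The `Primary` and `Replica` classes below are hypothetical, and the explicit `replicate()` call stands in for a background replication stream.

```python
class Replica:
    """Serves reads from its own copy of the data."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)


class Primary:
    """Accepts every write and ships it to replicas later (asynchronous mode)."""

    def __init__(self, replicas):
        self.data = {}
        self._replicas = replicas
        self._pending = []  # writes acknowledged but not yet replicated

    def write(self, key, value):
        self.data[key] = value
        self._pending.append((key, value))  # acknowledged before replication

    def replicate(self):
        # Stand-in for the background replication stream.
        for key, value in self._pending:
            for replica in self._replicas:
                replica.data[key] = value
        self._pending.clear()


replica = Replica()
primary = Primary([replica])

primary.write("balance", 100)
stale = replica.read("balance")  # None: the replica has not caught up yet
primary.replicate()
fresh = replica.read("balance")  # 100: the replica is now consistent
```

In this model, synchronous replication would amount to calling `replicate()` inside `write()` before returning. That removes the stale read at the cost of making every write wait for the replicas.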
Microservices
The system is decomposed into small, independently deployable services, each owning its own data and communicating via well-defined APIs or events. Each service is built, deployed, and scaled independently.
Key Properties:
- Single Responsibility: Each service does one business capability well (User Service, Payment Service, Inventory Service)
- Independent Deployment: Changing the Payment Service does not require redeploying the User Service
- Polyglot Persistence: Each service chooses the best database for its needs (SQL, NoSQL, Graph, Time-series)
- API Contracts: Services communicate through versioned APIs — internal implementation is hidden
- Fault Isolation: If the Recommendation Service crashes, the core checkout flow still works
When to use: Large organisations (100+ engineers) where team autonomy is critical. Systems requiring different scaling characteristics for different components. When deployment independence is worth the operational overhead.
Tradeoffs: Maximum team autonomy and scaling flexibility. Massive operational complexity (service mesh, distributed tracing, API gateways). Network calls replace function calls (latency). Data consistency across services is fundamentally hard.
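Fault isolation does not come for free; the calling service has to be written to tolerate a failing dependency. A minimal sketch, assuming a hypothetical `fetch_recommendations` client that stands in for a network call to a Recommendation Service that is currently down:

```python
def fetch_recommendations(user_id):
    """Hypothetical client for a remote Recommendation Service that is down."""
    raise ConnectionError("recommendation service unavailable")


def checkout(user_id, cart, recommend=fetch_recommendations):
    """Complete the order even when the optional dependency fails."""
    total = sum(item["price"] * item["quantity"] for item in cart)
    try:
        suggestions = recommend(user_id)
    except ConnectionError:
        # The optional feature degrades; the core flow continues.
        suggestions = []
    return {"user_id": user_id, "total": total, "suggestions": suggestions}


order = checkout(7, [{"price": 20, "quantity": 2}, {"price": 5, "quantity": 1}])
```

Here `order` still carries the correct total, with an empty suggestion list. Real systems usually add timeouts and circuit breakers around such calls so that a slow dependency is handled as well as a dead one.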
Amazon's "Two-Pizza Teams" and Service-Oriented Architecture
Around 2002, Jeff Bezos issued his famous mandate: all teams must communicate through service interfaces. No direct database access. No shared-memory models. Every team's service must be designed so that it can be exposed externally. This forced Amazon to decompose its monolithic bookstore into hundreds of independent services, each owned by a "two-pizza team" (6-8 people). The result: Amazon could scale both its technology and its organisation. Each team could innovate independently, deploy multiple times per day, and choose its own technology stack. This became the blueprint for what we now call "microservices."
Monolithic Architecture
The entire application is a single deployment unit. All code runs in one process, shares one database, and is deployed together. Despite its reputation, the monolith is often the correct architectural choice — especially for new products, small teams, and systems where simplicity trumps flexibility.
When monoliths are correct:
- Team size is small (< 20 engineers)
- Domain boundaries are not yet clear (early product)
- Deployment simplicity is valued over independent scaling
- The system does not have drastically different scaling requirements for different components
- You cannot afford the operational overhead of distributed systems (Kubernetes, service mesh, distributed tracing)
The Modular Monolith: A pragmatic middle ground. The codebase is structured into well-defined modules with clear boundaries and interfaces — but deployed as a single unit. Each module could theoretically become a microservice, but you defer that decision until it is actually needed. This preserves simplicity while maintaining clean architecture.
# Modular Monolith Directory Structure
src/
├── modules/
│   ├── users/
│   │   ├── api/          # Public interface (what other modules can call)
│   │   ├── domain/       # Business logic (private to this module)
│   │   ├── persistence/  # Database access (private)
│   │   └── tests/
│   ├── payments/
│   │   ├── api/
│   │   ├── domain/
│   │   ├── persistence/
│   │   └── tests/
│   └── inventory/
│       ├── api/
│       ├── domain/
│       ├── persistence/
│       └── tests/
├── shared/               # Cross-cutting concerns (logging, auth)
└── main.py               # Single entry point
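Module boundaries in a monolith hold only if something enforces them. One option is an automated import check, sketched below. It assumes a rule that the layout implies but does not state outright: code outside a module may import only that module's `api` package. The function name `violates_boundary` and the dotted paths are hypothetical.

```python
def violates_boundary(importer, imported):
    """Return True if `importer` reaches into another module's private package.

    Both arguments are dotted paths such as "modules.users.domain.models".
    Assumed rule: code outside a module may import only its `api` package.
    """
    importer_parts = importer.split(".")
    imported_parts = imported.split(".")
    if imported_parts[0] != "modules" or len(imported_parts) < 3:
        return False  # shared/ and third-party imports are not restricted here
    same_module = importer_parts[:2] == imported_parts[:2]
    return not same_module and imported_parts[2] != "api"


# payments reaching into users' persistence package breaks the boundary
print(violates_boundary("modules.payments.domain.charge",
                        "modules.users.persistence.repo"))  # → True
# payments calling users through its public api is allowed
print(violates_boundary("modules.payments.domain.charge",
                        "modules.users.api"))               # → False
```

A check like this could run in CI over the import statements of each file, so that a module can later be extracted into a service without untangling hidden dependencies first.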
Architectural Quality Attributes
Architecture decisions are fundamentally about tradeoffs between quality attributes — the non-functional requirements that determine how a system behaves under various conditions.
| Attribute | Definition | Measured By | Architecture Impact |
|---|---|---|---|
| Performance | Response time and throughput under load | Latency (p50, p95, p99), requests/second | Caching layers, async processing, database choice |
| Scalability | Ability to handle increased load | Linear vs sublinear throughput growth | Statelessness, horizontal partitioning, load balancing |
| Availability | System uptime and fault tolerance | Nines (99.9%, 99.99%), MTTR, MTBF | Redundancy, failover, circuit breakers, health checks |
| Security | Protection against threats and unauthorised access | Vulnerability count, time-to-patch, compliance | Network segmentation, auth layers, encryption at rest/transit |
| Maintainability | Ease of modification and evolution | Change lead time, defect rate after changes | Modularity, loose coupling, clear boundaries |
| Testability | Ease of verifying correctness | Test coverage achievable, test execution time | Dependency injection, interface contracts, isolation |
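The availability row is the easiest one to turn into numbers. The sketch below converts an availability target into allowed downtime per year and derives steady-state availability from MTBF and MTTR using the standard formula MTBF / (MTBF + MTTR). The function names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def yearly_downtime_hours(availability_percent):
    """Maximum downtime per year allowed by an availability target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


def steady_state_availability(mtbf_hours, mttr_hours):
    """Fraction of time the system is up: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


for target in (99.9, 99.99):
    print(f"{target}% allows {yearly_downtime_hours(target):.2f} hours of downtime per year")
```

Three nines allow roughly 8.76 hours of downtime per year and four nines roughly 53 minutes, which is why each extra nine tends to require more redundancy and automation.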
Tradeoff Analysis
You cannot optimise all attributes simultaneously. Architecture is the art of choosing which attributes matter most for your context:
- Performance vs Maintainability: Optimised code is often harder to read and change
- Availability vs Consistency: The CAP theorem says that during a network partition a distributed system must give up either availability (CP) or consistency (AP)
- Security vs Usability: More security layers create more friction for users
- Scalability vs Simplicity: Distributed systems scale better but are orders of magnitude more complex
Architectural Decision Records (ADRs)
Architectural decisions are some of the most important choices a team makes — yet they are often buried in meeting notes, Slack threads, or (worst of all) a single person's memory. When that person leaves, the why behind the architecture is lost forever.
An Architectural Decision Record (ADR) is a short document that captures one architectural decision — the context, the decision itself, and its consequences. ADRs are stored in the repository alongside the code they govern.
ADR Template
# ADR-NNN: [Short Title of Decision]
## Status
[Proposed | Accepted | Deprecated | Superseded by ADR-XXX]
## Context
[What is the issue we are facing? What forces are at play?
What constraints exist? What options did we consider?]
## Decision
[What is the change we are proposing or have agreed to?
State it clearly and definitively.]
## Consequences
[What becomes easier or harder as a result of this decision?
What are the positive, negative, and neutral consequences?]
## Alternatives Considered
[What other options were evaluated and why were they rejected?]
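Teams that write ADRs regularly often script the boilerplate. The sketch below fills the template from plain strings. The helper names and the `0007-use-kafka.md` file naming scheme are assumptions, not part of any standard.

```python
def render_adr(number, title, status, context, decision, consequences, alternatives):
    """Fill the ADR template with plain strings and return Markdown text."""
    sections = [
        ("Status", status),
        ("Context", context),
        ("Decision", decision),
        ("Consequences", consequences),
        ("Alternatives Considered", alternatives),
    ]
    lines = [f"# ADR-{number:03d}: {title}"]
    for heading, body in sections:
        lines.append(f"## {heading}")
        lines.append(body.strip())
    return "\n".join(lines) + "\n"


def adr_filename(number, title):
    """Build a sortable file name (the naming scheme is an assumption)."""
    slug = "-".join(title.lower().split())
    return f"{number:04d}-{slug}.md"


print(adr_filename(7, "Use Kafka"))  # → 0007-use-kafka.md
```

Zero-padded numbers keep the records in chronological order in a directory listing, which makes the decision history easy to scan.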
Sample ADR: Choosing an Event Bus
# ADR-007: Use Apache Kafka as the Event Bus
## Status
Accepted (2026-03-15)
## Context
Our e-commerce platform needs asynchronous communication between
services (Order, Payment, Inventory, Notification). We need:
- At-least-once delivery guarantees
- Message ordering within a partition
- Support for 50,000+ events/second at peak
- Message retention for replay (consumer catch-up)
- Multi-consumer support (same event, multiple subscribers)
Options evaluated: RabbitMQ, Apache Kafka, AWS SQS/SNS, Redis Streams.
## Decision
We will use Apache Kafka (managed via Confluent Cloud) as our
primary event bus for all inter-service communication.
## Consequences
Positive:
- High throughput (100K+ events/sec demonstrated in load testing)
- Built-in partitioning for horizontal scaling
- Log-based retention allows consumer replay
- Strong ecosystem (Schema Registry, Connect, Streams)
Negative:
- Operational complexity higher than RabbitMQ
- At-least-once delivery can redeliver events, so consumers must be idempotent
- Team needs Kafka-specific training (partitions, consumer groups)
- Cost: ~$1,200/month for Confluent Cloud at expected throughput
Neutral:
- Requires schema registry for event versioning (additional component)
- Consumer offset management is our responsibility
## Alternatives Considered
- RabbitMQ: Rejected due to lack of built-in log retention and
replay capability. Better for task queues, not event streaming.
- AWS SQS/SNS: Rejected to avoid AWS vendor lock-in (multi-cloud
requirement from stakeholders).
- Redis Streams: Rejected due to durability concerns and limited
ecosystem for schema management.
Pattern Comparison Table
| Pattern | Coupling | Scalability | Complexity | Deployment | Team Size | Best For |
|---|---|---|---|---|---|---|
| Client-Server | Medium | Vertical | Low | Single unit | Any | Web apps, APIs |
| Layered | Medium-High | Vertical | Low | Single unit | 5-30 | Enterprise apps, CRUD |
| Pipe-and-Filter | Very Low | Horizontal | Medium | Per filter | 3-20 | Data pipelines, ETL |
| Event-Driven | Very Low | Horizontal | High | Per service | 20-200+ | Reactive systems, IoT |
| Primary-Replica | Medium | Read-horizontal | Medium | Per node | Any | Read-heavy workloads |
| Microservices | Very Low | Horizontal | Very High | Per service | 50-1000+ | Large orgs, scaling teams |
| Monolith | High | Vertical | Low | Single unit | 2-20 | Startups, MVPs, small teams |
Conclusion & Next Steps
Software architecture is not about choosing the "best" pattern — it is about choosing the right pattern for your context. A startup with three engineers choosing microservices is making a different mistake than an enterprise with five hundred engineers staying on a single monolith. Context determines correctness.
The patterns we explored — Client-Server, Layered, Pipe-and-Filter, Event-Driven, Primary-Replica, Microservices, and Monolithic — are tools in your architectural toolkit. The ADR is the mechanism for documenting why you reached for a particular tool. Together, they form the foundation for making and communicating architectural decisions.
In the next article, we go one level deeper — from system-level architecture down to module-level design, exploring the three forces that determine whether software is maintainable: modularity, coupling, and cohesion.
Next in the Series
In Part 7: Modularity, Coupling & Cohesion, we explore the three forces that determine whether your modules are maintainable, testable, and evolvable — from David Parnas's information hiding principle to measuring instability and abstractness.