
Part 40: LEAN Principles in Software Projects

May 14, 2026 Wasil Zafar 40 min read

Toyota revolutionised manufacturing. Then Mary and Tom Poppendieck brought those same principles to software. Learn how LEAN thinking — waste elimination, value stream mapping, WIP limits, and continuous improvement — transforms how teams deliver software.

Table of Contents

  1. Introduction
  2. The Seven Principles
  3. The Seven Wastes of Software
  4. Value Stream Mapping
  5. Flow & WIP Limits
  6. Pull Systems
  7. Continuous Improvement (Kaizen)
  8. Batch Size Reduction
  9. Lean Metrics
  10. Case Studies
  11. Exercises
  12. Conclusion & Next Steps

Introduction

In the late 1940s, Toyota was a struggling car manufacturer with limited capital, operating in a devastated post-war economy. Out of necessity, Taiichi Ohno and Shigeo Shingo developed the Toyota Production System (TPS): a radically different approach to manufacturing that eliminated waste, respected workers, and achieved extraordinary quality at scale.

More than fifty years later, Mary and Tom Poppendieck published Lean Software Development: An Agile Toolkit (2003), translating Toyota's principles into the software domain. Their central insight: software development shares the same fundamental challenges as manufacturing, namely variability, queues, feedback delays, and the temptation to overproduce.

Why LEAN Thinking Transforms Delivery

Most process frameworks (Scrum, SAFe, XP) prescribe practices. LEAN is different — it provides thinking tools. Rather than saying "have a daily standup," LEAN asks: "Where is the waste in your system? What is preventing flow?" This makes LEAN universally applicable, whether you run Scrum sprints, Kanban boards, or something entirely custom.

Key Insight: LEAN is not a methodology; it is a mindset. You do not "implement LEAN" the way you implement Scrum. You develop LEAN thinking and then apply it to whatever process you already use. Every team's application of LEAN looks different because every team's waste profile is different.

The core LEAN question is deceptively simple: What activities add value from the customer's perspective, and what activities do not? Everything that does not directly contribute to delivering customer value is waste — and waste should be eliminated or minimised.

The Seven Principles of Lean Software Development

The Poppendiecks distilled Toyota's philosophy into seven principles specifically adapted for software teams. These principles form a coherent system — they reinforce each other and create compounding benefits when applied together.

1. Eliminate Waste

The foundational principle. Waste (Japanese: muda) is anything that does not add value from the customer's perspective. In manufacturing, waste is visible — scrap metal on the floor, parts waiting in inventory. In software, waste is invisible — half-finished features in a branch, meetings that produce no decisions, handoff documents nobody reads.

The first step is learning to see waste. Most teams are so accustomed to their waste that it becomes invisible. Value stream mapping (covered in Section 4) makes waste visible and quantifiable.

2. Amplify Learning

Software development is fundamentally a learning process, not a production process. You are not assembling known components — you are discovering what the right solution looks like. This means:

  • Short feedback cycles so you learn quickly whether you are on the right track
  • Iterative development to refine understanding through building
  • Pair programming and code reviews to spread knowledge
  • Retrospectives to learn from experience
  • Spikes and prototypes to reduce uncertainty before commitment

3. Decide as Late as Possible

Irreversible decisions made with incomplete information are expensive. LEAN advocates deferring commitment until the "last responsible moment" — the point at which not deciding becomes more costly than deciding with imperfect information.

This is not procrastination. It is strategic delay — keeping options open while gathering information. Examples:

  • Choose your database after understanding access patterns (not before writing the first line of code)
  • Decide on microservices vs monolith after understanding team boundaries
  • Defer UI framework choice until user research clarifies interaction patterns

4. Deliver as Fast as Possible

Speed is not about working harder — it is about reducing cycle time. The faster you deliver, the sooner you get feedback, the less inventory accumulates, and the more responsive you are to change. Speed comes from:

  • Eliminating queues and wait times
  • Reducing batch sizes
  • Automating repetitive work
  • Removing handoffs between teams

5. Empower the Team

Toyota learned that the people closest to the work are the best positioned to improve it. In software, this means:

  • Developers choose their tools and approaches
  • Teams own their delivery pipeline end-to-end
  • Decisions are pushed down to the lowest level with sufficient context
  • Managers create conditions for success rather than directing work

6. Build Integrity In

Quality is not inspected into a product — it is built in from the start. In manufacturing, this means designing the process so defects cannot occur (poka-yoke). In software, this means:

  • Test-driven development (TDD) — tests before code
  • Continuous integration — catch problems immediately
  • Refactoring — maintain conceptual integrity as the system grows
  • Automated quality gates — prevent bad code from progressing

7. Optimize the Whole

Local optimisation often causes global degradation. A team that optimises its own throughput by batching large PRs may slow down every other team that depends on their changes. LEAN thinking requires systems thinking — optimising the entire value stream, not individual stations.

Case Study

The Sub-Optimisation Trap

A platform team measured their success by "number of features shipped." They shipped 40 features in a quarter — a record. But downstream teams could only consume 12 of those features because the documentation was incomplete and APIs were inconsistent. The platform team optimised their throughput at the expense of system throughput. LEAN thinking would measure the platform team by features successfully adopted by consumers, not features shipped.


The Seven Wastes of Software

The Poppendiecks mapped Toyota's seven wastes of manufacturing to their software equivalents. Learning to recognise these wastes is the first step toward eliminating them.

Manufacturing Waste | Software Equivalent | Examples | Impact
Inventory | Partially Done Work | Unmerged branches, undeployed features, half-written specs | Becomes stale, merge conflicts, delayed feedback
Over-production | Extra Features | Gold-plating, "just in case" features, unused configuration options | Maintenance burden, complexity, wasted effort
Extra Processing | Relearning | Lost knowledge, poor documentation, team member departure without handover | Repeated mistakes, slow onboarding, duplicated effort
Transportation | Handoffs | Dev → QA → Ops transitions, requirements thrown over walls, approval chains | Information loss, delays, context switching
Motion | Task Switching | Context switching between projects, interrupt-driven work, multi-tasking | Cognitive load, reduced focus, lower quality
Waiting | Delays | Waiting for code review, waiting for approvals, waiting for environments | Blocked developers, slow lead time, frustration
Defects | Defects | Bugs found in production, rework, misunderstood requirements | Rework, customer impact, firefighting

The Seven Wastes of Software Development
mindmap
    root((Seven Wastes))
        Partially Done Work
            Unmerged branches
            Feature flags never removed
            Specs without implementation
        Extra Features
            Gold-plating
            Unused config options
            Premature abstraction
        Relearning
            Lost tribal knowledge
            No documentation
            Repeated mistakes
        Handoffs
            Dev to QA walls
            Approval chains
            Ticket ping-pong
        Task Switching
            Multiple projects
            Interrupt-driven culture
            Slack/email overload
        Delays
            Waiting for review
            Environment provisioning
            Dependency on other teams
        Defects
            Production bugs
            Misunderstood requirements
            Integration failures
                            

How to Identify Waste in Your Team

Waste identification requires observation and measurement. Here are practical techniques:

  • Walk the board: For every item on your Kanban board, ask "Is someone actively working on this right now?" Items that are not being worked on are partially done work waste.
  • Measure wait time: Track how long work items spend in each column. If items spend 3 days "In Review" but only 30 minutes of actual review time, you have 2.98 days of delay waste.
  • Count handoffs: How many people must touch a feature between "idea" and "production"? Each handoff loses information and adds delay.
  • Ask "Who uses this?": For every feature, report, or meeting — who actually uses the output? Features nobody uses are extra features waste.
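
The "measure wait time" technique can be sketched in a few lines of Python. This is a minimal sketch, assuming a hypothetical export of board transitions as (item, column, timestamp) tuples; the tuple layout, item ids, and sample data are illustrative and not taken from any particular tool.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical export: (item id, column entered, timestamp of entry).
transitions = [
    ("FEAT-1", "In Dev",    datetime(2026, 5, 4, 9, 0)),
    ("FEAT-1", "In Review", datetime(2026, 5, 5, 9, 0)),
    ("FEAT-1", "Done",      datetime(2026, 5, 8, 9, 0)),
    ("FEAT-2", "In Dev",    datetime(2026, 5, 4, 9, 0)),
    ("FEAT-2", "In Review", datetime(2026, 5, 6, 9, 0)),
    ("FEAT-2", "Done",      datetime(2026, 5, 7, 9, 0)),
]

def hours_per_column(events):
    """Sum the calendar hours items spent in each column they have left."""
    by_item = defaultdict(list)
    for item, column, entered_at in events:
        by_item[item].append((entered_at, column))

    totals = defaultdict(float)
    for history in by_item.values():
        history.sort()
        # Time in a column = next transition's timestamp minus this one's.
        for (start, column), (end, _next_column) in zip(history, history[1:]):
            totals[column] += (end - start).total_seconds() / 3600
    return dict(totals)

column_hours = hours_per_column(transitions)
for column, hours in column_hours.items():
    print(f"{column}: {hours:.0f} h total")
```

Subtracting the hands-on time people actually spent in a column from these totals gives a rough estimate of the delay waste in that column.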

Value Stream Mapping

A Value Stream Map (VSM) is a visual representation of the entire flow from customer request to delivered value. It shows every step, the time spent actively working, the time spent waiting, and the handoffs between people or teams.

Value Stream Mapping grew out of Toyota's material and information flow diagrams and was popularised for manufacturing by Mike Rother and John Shook in Learning to See (1999). Karen Martin and Mike Osterling adapted it for knowledge work in Value Stream Mapping (2014).

Step-by-Step: Creating a Value Stream Map

  1. Define the scope: What is the start event (e.g., "feature request created") and end event (e.g., "feature live in production")?
  2. Walk the process: Identify every step the work item passes through. Include waiting states.
  3. Measure times: For each step, record process time (active work) and wait time (sitting idle).
  4. Identify handoffs: Mark every point where work transfers between people or teams.
  5. Calculate flow efficiency: Total process time ÷ total elapsed time × 100%.
  6. Identify bottlenecks: Where are the longest wait times? Where does WIP pile up?
Example Value Stream Map — Feature Delivery
flowchart LR
    A["Feature Request\n(Wait: 5d)"] --> B["Prioritisation\n(Work: 1h | Wait: 3d)"]
    B --> C["Design\n(Work: 4h | Wait: 2d)"]
    C --> D["Development\n(Work: 16h | Wait: 1d)"]
    D --> E["Code Review\n(Work: 1h | Wait: 2d)"]
    E --> F["QA Testing\n(Work: 3h | Wait: 3d)"]
    F --> G["Deployment\n(Work: 0.5h | Wait: 1d)"]
    G --> H["Live in Production"]
                            

Flow Efficiency

Flow efficiency is the ratio of value-adding time to total elapsed time:

Flow Efficiency = Process Time ÷ Total Lead Time × 100%

In the example above:

  • Total process time: 1h + 4h + 16h + 1h + 3h + 0.5h = 25.5 hours
  • Total wait time: 5d + 3d + 2d + 1d + 2d + 3d + 1d = 17 days = 136 hours (counting 8-hour working days)
  • Total lead time: 25.5 + 136 = 161.5 hours
  • Flow efficiency: 25.5 ÷ 161.5 ≈ 15.8%

Typical flow efficiency in software teams is 5–15%. This means that for every hour of actual work, the feature waits roughly 6–19 hours doing nothing. The biggest opportunity for improvement is almost always reducing wait time, not making developers type faster.
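
The same calculation can be scripted so it is easy to repeat for your own map. This is a minimal sketch using the work and wait figures from the example map; the 8-hour working day is an assumption, and lead time here counts both work and wait. Dividing by the 136 hours of wait alone would give 18.75%, which slightly overstates the result.

```python
# (step, work hours, wait days) copied from the example value stream map.
steps = [
    ("Feature Request", 0.0, 5),
    ("Prioritisation", 1.0, 3),
    ("Design", 4.0, 2),
    ("Development", 16.0, 1),
    ("Code Review", 1.0, 2),
    ("QA Testing", 3.0, 3),
    ("Deployment", 0.5, 1),
]

HOURS_PER_WORKING_DAY = 8  # assumption: waits are counted in 8-hour working days

process_hours = sum(work for _, work, _ in steps)
wait_hours = sum(wait_days for _, _, wait_days in steps) * HOURS_PER_WORKING_DAY
lead_time_hours = process_hours + wait_hours

flow_efficiency = process_hours / lead_time_hours * 100
print(f"Process time: {process_hours} h")
print(f"Wait time: {wait_hours} h")
print(f"Lead time: {lead_time_hours} h")
print(f"Flow efficiency: {flow_efficiency:.1f}%")

# The steps with the longest waits are the first candidates for improvement.
worst = sorted(steps, key=lambda step: step[2], reverse=True)[:3]
for name, _, wait_days in worst:
    print(f"{name}: waits {wait_days} days")
```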

Flow & WIP Limits

Little's Law

John D.C. Little proved in 1961 that for any stable system:

Lead Time = WIP ÷ Throughput

This has profound implications for software delivery:

  • If your throughput is 5 items/week and you have 20 items in progress, your lead time is 4 weeks.
  • To halve your lead time without changing throughput, halve your WIP.
  • The fastest way to improve lead time is to reduce WIP.
# Little's Law Calculator
def calculate_lead_time(wip: int, throughput: float) -> float:
    """
    Little's Law: W = L / λ (Lead Time = WIP / Throughput),
    where L is average WIP, λ is throughput, and W is average lead time.

    Args:
        wip: Number of items currently in progress
        throughput: Items completed per time unit (e.g., per week)

    Returns:
        Average lead time in the same time unit as throughput
    """
    if throughput <= 0:
        raise ValueError("Throughput must be positive")
    return wip / throughput

# Example: Team with 15 items in progress, completing 5 per week
current_wip = 15
weekly_throughput = 5.0

lead_time = calculate_lead_time(current_wip, weekly_throughput)
print(f"Current lead time: {lead_time} weeks")  # 3.0 weeks

# If we reduce WIP to 8:
reduced_wip = 8
new_lead_time = calculate_lead_time(reduced_wip, weekly_throughput)
print(f"New lead time: {new_lead_time} weeks")  # 1.6 weeks
print(f"Improvement: {((lead_time - new_lead_time) / lead_time) * 100:.0f}%")  # 47%

Kanban WIP Limits

WIP limits are constraints placed on each stage of your workflow. They are the mechanism that turns a push system into a pull system. When a stage reaches its WIP limit, no new work can enter until existing work exits.

Setting WIP limits forces teams to:

  • Stop starting, start finishing — complete existing work before taking on new work
  • Expose bottlenecks — when a stage is full, upstream stages are blocked, making the constraint visible
  • Collaborate — team members swarm on blocked items rather than starting new items
  • Reduce multitasking — fewer items in progress means more focus per item
# Example Kanban Board with WIP Limits
board:
  columns:
    - name: "Backlog"
      wip_limit: null  # No limit on ideas
    - name: "Ready"
      wip_limit: 5     # Only 5 items refined and ready
    - name: "In Dev"
      wip_limit: 3     # Max 3 items being coded
    - name: "In Review"
      wip_limit: 3     # Max 3 items awaiting/in review
    - name: "Testing"
      wip_limit: 2     # Max 2 items being tested
    - name: "Done"
      wip_limit: null  # No limit on completed items

# Rule: If "In Dev" is at WIP limit (3), no one pulls
# from "Ready" until a dev item moves to "In Review"

Starting WIP Limits: A good starting point is to set WIP limits equal to the number of people who work in that stage. For a team of 4 developers, start with an "In Dev" WIP limit of 4. Then reduce it gradually: many teams find that WIP = number of pairs (2) or even WIP = 1 produces the best flow.
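
To make the rule concrete, here is a minimal sketch of a board that refuses a pull when the target column is full. The KanbanBoard class and WipLimitExceeded exception are hypothetical names invented for this illustration, not the API of any Kanban tool; the column limits mirror the example configuration.

```python
class WipLimitExceeded(Exception):
    """Raised when a move would push a column past its WIP limit."""


class KanbanBoard:
    def __init__(self, limits):
        # limits maps column name -> max items (None means unlimited).
        self.limits = dict(limits)
        self.columns = {name: [] for name in self.limits}

    def has_capacity(self, column):
        limit = self.limits[column]
        return limit is None or len(self.columns[column]) < limit

    def _check(self, column):
        if not self.has_capacity(column):
            raise WipLimitExceeded(
                f"'{column}' is at its WIP limit of {self.limits[column]}"
            )

    def add(self, item, column="Backlog"):
        self._check(column)
        self.columns[column].append(item)

    def pull(self, item, source, target):
        """Move item from source to target only if target has capacity."""
        self._check(target)
        self.columns[source].remove(item)
        self.columns[target].append(item)


board = KanbanBoard({"Backlog": None, "Ready": 5, "In Dev": 3,
                     "In Review": 3, "Testing": 2, "Done": None})
for n in range(1, 5):
    board.add(f"item-{n}", "Ready")
for n in range(1, 4):
    board.pull(f"item-{n}", "Ready", "In Dev")

try:
    board.pull("item-4", "Ready", "In Dev")  # "In Dev" already holds 3 items
    blocked = False
except WipLimitExceeded as exc:
    blocked = True
    print(exc)

# Finishing work frees capacity, and only then can the next item be pulled.
board.pull("item-1", "In Dev", "In Review")
board.pull("item-4", "Ready", "In Dev")
print(board.columns["In Dev"])
```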

Pull Systems

Traditional software development uses a push system: managers assign work to developers based on priority lists, capacity planning, and sprint commitments. Work is pushed into the system regardless of whether the system can handle it.

LEAN advocates pull systems: work is only started when there is capacity to process it. Workers pull the next item when they finish their current item. This is the fundamental mechanism behind Kanban.

Aspect | Push System | Pull System
Work assignment | Manager assigns work to people | People pull work when ready
WIP control | Grows unbounded | Constrained by limits
Overload signal | Team burnout (late signal) | WIP limit hit (early signal)
Bottleneck visibility | Hidden in queues | Exposed immediately
Lead time | Unpredictable | Stabilises over time

Why does pull reduce overproduction? Because the system only produces what is needed, when it is needed. No more "building features in advance" that may never be used. No more stacking up a backlog of code reviews that creates merge conflicts.
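
The difference in WIP control can be illustrated with a deliberately simple simulation. This is a sketch under stated assumptions: demand and capacity are constant, work finishes in fixed daily amounts, and the simulate function and its parameters are invented for this example.

```python
def simulate(days, arrivals_per_day, capacity_per_day, wip_limit=None):
    """Return one (waiting, in_progress) snapshot per day.

    Snapshots are taken after work has been started for the day and before
    anything is finished, so in_progress is that day's peak WIP.
    """
    waiting = 0
    in_progress = 0
    snapshots = []
    for _ in range(days):
        waiting += arrivals_per_day
        if wip_limit is None:
            starting = waiting  # push: everything requested is started at once
        else:
            starting = min(waiting, max(0, wip_limit - in_progress))  # pull
        waiting -= starting
        in_progress += starting
        snapshots.append((waiting, in_progress))
        in_progress -= min(in_progress, capacity_per_day)
    return snapshots


# Demand (3 items/day) exceeds capacity (2 items/day) in both runs.
push = simulate(days=5, arrivals_per_day=3, capacity_per_day=2)
pull = simulate(days=5, arrivals_per_day=3, capacity_per_day=2, wip_limit=3)

print("push:", push)
print("pull:", pull)
```

In this model the pull system does not create extra capacity. The same seven items are outstanding on day five in both runs. What changes is where the overload shows up: under push it hides as ever-growing work in progress, while under pull in-progress work stays at the limit and the excess is visible as a queue in front of it.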

Continuous Improvement (Kaizen)

Kaizen (改善) means "change for better" in Japanese. It is the philosophy of small, incremental, continuous improvements rather than large, disruptive transformations. In software teams, Kaizen manifests as:

  • Retrospectives — regular team reflection on what to improve (Scrum's Sprint Retrospective is a Kaizen event)
  • Process experiments — try a small change for one sprint, measure the impact, decide whether to keep it
  • Improvement backlogs — treat process improvements as work items alongside feature work
  • Gemba walks — managers observe actual work processes (in software: sit with developers, watch the deployment process)

The PDCA Cycle

PDCA (Plan-Do-Check-Act) is the scientific method applied to process improvement:

PDCA (Deming) Cycle
flowchart TD
    P["PLAN\nIdentify problem\nAnalyse root cause\nHypothesise solution"] --> D["DO\nImplement change\nSmall scale first\nCollect data"]
    D --> C["CHECK\nMeasure results\nCompare to baseline\nDid it work?"]
    C --> A["ACT\nStandardise if successful\nAdjust if partial\nAbandon if failed"]
    A --> P
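
The ACT step of the cycle can be written down as an explicit decision rule. This is a minimal sketch, assuming the experiment tracks a single numeric metric; the function name and thresholds are illustrative, and real decisions usually weigh more than one number.

```python
def act_on_experiment(baseline, result, target, lower_is_better=True):
    """Return the ACT decision for a process experiment.

    'standardise' if the result meets the target,
    'adjust'      if it beat the baseline but missed the target,
    'abandon'     if it did not improve on the baseline.
    """
    if not lower_is_better:
        # Flip signs so the comparisons below always read "lower is better".
        baseline, result, target = -baseline, -result, -target
    if result <= target:
        return "standardise"
    if result < baseline:
        return "adjust"
    return "abandon"


# Hypothetical experiment: hours a pull request waits for its first review.
print(act_on_experiment(baseline=48.0, result=3.5, target=4.0))   # standardise
print(act_on_experiment(baseline=48.0, result=20.0, target=4.0))  # adjust
print(act_on_experiment(baseline=48.0, result=52.0, target=4.0))  # abandon
```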
                            

A3 Problem Solving

The A3 report is a structured problem-solving format that fits on a single A3-sized sheet of paper. It forces clarity and brevity. The format:

  1. Background: Why is this problem worth solving?
  2. Current condition: What is happening now? (Data, not opinions)
  3. Goal: What should be happening? (Measurable target)
  4. Root cause analysis: Why is there a gap? (5 Whys, fishbone diagrams)
  5. Countermeasures: What changes will address root causes?
  6. Implementation plan: Who, what, when?
  7. Follow-up: How will we know it worked?
# A3 Problem Solving Template — YAML Format
title: "Code Review Bottleneck"
owner: "Platform Team"
date: "2026-05-14"

background: |
  Code reviews are our largest source of delay.
  Average wait time: 2.3 days per PR.
  Team frustration score: 7/10.

current_condition:
  metric: "Time from PR opened to first review"
  baseline: "2.3 days average (measured over 4 weeks)"
  data_source: "GitHub PR analytics"

goal:
  target: "First review within 4 hours"
  timeline: "Achieve within 6 weeks"

root_cause_analysis:
  - why: "Reviews wait 2+ days"
    because: "Reviewers batch reviews to end of day"
  - why: "Reviewers batch reviews"
    because: "Deep work is interrupted by review requests"
  - why: "Reviews interrupt deep work"
    because: "No dedicated review time in schedule"

countermeasures:
  - action: "Establish 'review hour' — 10-11am daily"
    owner: "All developers"
    expected_impact: "Reviews started within 4h"
  - action: "Reduce PR size to <200 lines"
    owner: "All developers"
    expected_impact: "Reviews take 15min not 45min"

follow_up:
  check_date: "2026-06-25"
  success_metric: "95th percentile first review < 4h"
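
The follow-up metric in this template can be checked with a short script. This is a minimal sketch: the timestamps are made-up sample data, waits are measured in calendar hours (nights count), and the percentile uses the simple nearest-rank method rather than whatever your analytics tool reports.

```python
import math
from datetime import datetime

# Hypothetical sample: (PR opened, first review) timestamps.
prs = [
    (datetime(2026, 6, 22, 9, 0),  datetime(2026, 6, 22, 10, 30)),
    (datetime(2026, 6, 22, 13, 0), datetime(2026, 6, 22, 15, 0)),
    (datetime(2026, 6, 23, 9, 30), datetime(2026, 6, 23, 10, 0)),
    (datetime(2026, 6, 23, 11, 0), datetime(2026, 6, 23, 14, 0)),
    (datetime(2026, 6, 24, 16, 0), datetime(2026, 6, 25, 10, 0)),  # left overnight
]

def hours_to_first_review(opened, reviewed):
    return (reviewed - opened).total_seconds() / 3600

def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]

waits = [hours_to_first_review(opened, reviewed) for opened, reviewed in prs]
mean_wait = sum(waits) / len(waits)
p95 = percentile(waits, 95)
goal_met = p95 < 4  # success metric from the A3: 95th percentile under 4 hours

print(f"Mean wait: {mean_wait:.1f} h")
print(f"95th percentile: {p95:.1f} h")
print(f"Goal met: {goal_met}")
```

In this sample four of the five reviews start within three hours, yet the single PR left overnight is enough to miss a 95th-percentile goal. That is the point of choosing a percentile over an average as the success metric.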

Batch Size Reduction

Batch size is one of the most powerful levers in software delivery. Smaller batches:

  • Flow through the system faster (Little's Law)
  • Get feedback sooner
  • Have lower risk (less change = less that can go wrong)
  • Are easier to review (200-line PRs are reviewed in minutes, 2000-line PRs take days)
  • Have fewer merge conflicts
  • Are easier to roll back

The U-Curve of Batch Size

There is an economic tradeoff in batch size:

  • Transaction cost: The overhead of processing a batch (creating a PR, running CI, deploying). This cost is fixed per batch, so splitting the same work into more, smaller batches increases total transaction cost.
  • Holding cost: The cost of carrying inventory (merge conflicts, delayed feedback, increased risk). This cost increases with batch size.

The optimal batch size minimises total cost (transaction + holding). The LEAN approach is to reduce transaction costs (faster CI, cheaper deployments, automated testing) so that the optimal batch size shrinks.
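
The tradeoff can be explored numerically. This is a sketch under an assumed cost model, not a measured one: each batch pays a fixed transaction cost, holding cost grows in proportion to batch size, and the units are arbitrary. The function names and parameter values are chosen only for illustration.

```python
def total_cost(batch_size, total_items, transaction_cost, holding_cost_per_item):
    """Assumed model: fixed cost per batch plus holding cost proportional to batch size."""
    batches = total_items / batch_size
    return transaction_cost * batches + holding_cost_per_item * batch_size


def optimal_batch_size(total_items, transaction_cost, holding_cost_per_item,
                       max_batch=100):
    """Brute-force search for the batch size with the lowest total cost."""
    return min(
        range(1, max_batch + 1),
        key=lambda b: total_cost(b, total_items, transaction_cost,
                                 holding_cost_per_item),
    )


# Slow, manual release process: each batch is expensive to ship.
slow = optimal_batch_size(total_items=100, transaction_cost=16,
                          holding_cost_per_item=1.0)
# Automated pipeline: the same work, but each batch is cheap to ship.
fast = optimal_batch_size(total_items=100, transaction_cost=1,
                          holding_cost_per_item=1.0)

print(f"Optimal batch with costly releases: {slow} items")
print(f"Optimal batch with cheap releases:  {fast} items")

# The U-curve itself: cost is high at both extremes and lowest in between.
for b in (1, 5, 10, 20, 50, 100):
    print(b, round(total_cost(b, 100, 1, 1.0), 1))
```

Under these assumptions, cutting the per-batch cost from 16 to 1 moves the optimum from 40 items to 10, which is the mechanism the paragraph above describes.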

Connection to DORA Metrics: Teams that deploy more frequently (higher deployment frequency) are working in smaller batches. The Accelerate research shows these teams also have lower change failure rates. Smaller batches are both faster AND safer — this is not a tradeoff.

The ideal — single-piece flow — means each change flows through the entire system independently. In software, this is trunk-based development with feature flags: every commit is a potential release. Transaction costs must be near zero (fully automated CI/CD) for this to work.

Lean Metrics

LEAN teams measure flow, not busyness. The key metrics:

Metric | Definition | Target Direction | How to Measure
Lead Time | Time from request to delivery | ↓ Lower is better | Ticket created → deployed to production
Cycle Time | Time from work started to work completed | ↓ Lower is better | "In Progress" → "Done"
Throughput | Items completed per time period | ↑ Higher is better | Count items entering "Done" per week
WIP | Items currently in progress | ↓ Lower is better | Count items between "In Progress" and "Done"
Flow Efficiency | Active time ÷ total elapsed time | ↑ Higher is better | Value stream mapping
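
Most of these metrics fall out of three dates per ticket. This is a minimal sketch, assuming a hypothetical ticket export with created, started, and done dates; the field names, ticket ids, and the two-week window are illustrative.

```python
from datetime import date
from statistics import mean

# Hypothetical export; None means the ticket has not reached that state yet.
tickets = [
    {"id": "T-1", "created": date(2026, 5, 1), "started": date(2026, 5, 4),  "done": date(2026, 5, 8)},
    {"id": "T-2", "created": date(2026, 5, 2), "started": date(2026, 5, 5),  "done": date(2026, 5, 11)},
    {"id": "T-3", "created": date(2026, 5, 6), "started": date(2026, 5, 9),  "done": date(2026, 5, 14)},
    {"id": "T-4", "created": date(2026, 5, 6), "started": date(2026, 5, 12), "done": None},
    {"id": "T-5", "created": date(2026, 5, 8), "started": None,              "done": None},
]

finished = [t for t in tickets if t["done"] is not None]

# Lead time: request to delivery. Cycle time: work started to work completed.
lead_times = [(t["done"] - t["created"]).days for t in finished]
cycle_times = [(t["done"] - t["started"]).days for t in finished]

# WIP: started but not yet done.
wip = sum(1 for t in tickets if t["started"] is not None and t["done"] is None)

# Throughput: items finished per week over the observed window.
window_weeks = 2
throughput = len(finished) / window_weeks

print(f"Average lead time:  {mean(lead_times)} days")
print(f"Average cycle time: {mean(cycle_times)} days")
print(f"WIP: {wip} item(s)")
print(f"Throughput: {throughput} items/week")
```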

Cumulative Flow Diagrams (CFDs)

A CFD is the most powerful visualisation for LEAN teams. It shows the cumulative count of items in each workflow state over time. The vertical distance between bands shows WIP; the horizontal distance shows lead time; the slope of the "Done" band shows throughput.

# Generating a Cumulative Flow Diagram
import matplotlib.pyplot as plt
import numpy as np

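# Simulated data: cumulative count of items that have reached each stage
rng = np.random.default_rng(seed=42)  # fixed seed so the chart is reproducible
days = np.arange(1, 31)
arrived = 20 + np.cumsum(rng.poisson(2.5, days.size))  # entered the backlog
started = np.minimum(arrived, 8 + np.cumsum(rng.poisson(2.2, days.size)))
done = np.minimum(started, np.cumsum(rng.poisson(2, days.size)))

# Items currently in each state are the gaps between the cumulative lines
backlog = arrived - started
in_progress = started - done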

# Stack the areas
fig, ax = plt.subplots(figsize=(12, 6))
ax.stackplot(days, done, in_progress, backlog,
             labels=['Done', 'In Progress', 'Backlog'],
             colors=['#3B9797', '#16476A', '#BF092F'],
             alpha=0.8)

ax.set_xlabel('Day')
ax.set_ylabel('Cumulative Items')
ax.set_title('Cumulative Flow Diagram')
ax.legend(loc='upper left')
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

Case Studies

Case Study

Toyota's Influence on Tech

Toyota's production system directly influenced Kanban (David J. Anderson, 2010), Lean Startup (Eric Ries, 2011), Continuous Delivery (Jez Humble & Dave Farley, 2010), and DevOps (The Phoenix Project, Gene Kim, Kevin Behr & George Spafford, 2013). All trace their intellectual heritage to Taiichi Ohno's shop floor innovations. The core insight that crossed from manufacturing to software is that flow trumps utilisation: it is better to have developers waiting for work than work waiting for developers.

Case Study

Amazon's Two-Pizza Teams as Lean Units

Jeff Bezos's "two-pizza teams" (teams small enough to feed with two pizzas, ~6-8 people) are LEAN units in disguise. Each team owns a service end-to-end: they build it, deploy it, run it, and respond to its incidents. This removes handoffs (waste #4), shortens the path from decision to delivery (principle #4), and empowers the team (principle #5). The result: by 2011 Amazon reported a production deployment every 11.6 seconds on average, with thousands of independent teams operating as autonomous LEAN value streams.

Case Study

Spotify's Squad Model Through a Lean Lens

Spotify organised into "squads" (stream-aligned teams), "tribes" (collections of related squads), "chapters" (discipline communities), and "guilds" (cross-cutting interest groups). Through the LEAN lens: squads minimise handoffs by owning features end-to-end; tribes optimise the whole (principle #7) by aligning related work; chapters amplify learning (principle #2) by sharing expertise across squads. The model is not perfect — Spotify themselves have acknowledged it evolved significantly — but it demonstrates LEAN thinking at scale.


Exercises

Exercise 1 — Create a Value Stream Map: Map your team's delivery process from "feature request" to "live in production." For each step, estimate the process time (actual work) and wait time (sitting idle). Calculate your flow efficiency. Where is the largest source of waste?
Exercise 2 — Identify the Seven Wastes: Over the next sprint, keep a "waste log." Every time you encounter one of the seven wastes, note it down with: (a) which waste type, (b) how much time it consumed, (c) a proposed countermeasure. At the end of the sprint, rank them by impact and pick the top one to address.
Exercise 3 — Set WIP Limits: Examine your current Kanban board or backlog. How many items are "in progress" right now? Apply Little's Law to calculate your current lead time. Then set a WIP limit that would halve your lead time. Try it for two weeks and measure the effect on cycle time and throughput.
Exercise 4 — Calculate Flow Efficiency: Pick 5 recently completed work items. For each one, calculate the total elapsed time and the active working time. Compute the flow efficiency for each. What is the average? What would it take to double it?

Conclusion & Next Steps

LEAN thinking is powerful precisely because it is not a prescriptive framework: it gives you thinking tools to diagnose and improve any delivery system. The seven principles provide the philosophy; the seven wastes give you a taxonomy of problems; value stream mapping makes those problems visible; WIP limits and pull systems provide the mechanism for improvement; and Kaizen ensures you never stop getting better.

The most important shift is this: stop optimising for resource utilisation and start optimising for flow. A developer who is 100% utilised has no slack, so every new request queues behind existing work: WIP and wait times climb, and the throughput of finished work suffers. A developer who is 80% utilised has slack to respond to pull signals, help teammates, and maintain flow.
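
Queueing theory gives a feel for why slack matters. This is a sketch under a strong simplifying assumption: a single-server queue with random arrivals and random service times (the textbook M/M/1 model), which real teams only approximate. In that model the average time a request waits before work starts is ρ ÷ (1 − ρ) times the average time the work itself takes, where ρ is utilisation.

```python
def relative_wait(utilisation):
    """Average queueing delay as a multiple of average work time (M/M/1 model)."""
    if not 0 <= utilisation < 1:
        raise ValueError("utilisation must be at least 0 and below 1")
    return utilisation / (1 - utilisation)


for u in (0.5, 0.8, 0.9, 0.95):
    print(f"{u:.0%} utilised: wait is about {relative_wait(u):.1f}x the work time")
```

Under this model the wait roughly doubles between 80% and 90% utilisation and doubles again by 95%. At 100% the formula has no finite answer, which is why the function rejects it.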

Next in the Series

In Part 41: Teams, Speed vs Quality & High-Performance Delivery, we explore the human side of delivery — team topologies, Conway's Law, the speed-vs-quality myth, psychological safety, and engineering culture patterns that scale.