Introduction — Testing in Agile Is Fundamentally Different
In waterfall, testing happens after development is "complete." There is a test phase, a test team, and a handoff. In agile, there is no test phase. There is no handoff. Testing is a continuous activity performed by the whole team, within the sprint, alongside development.
This fundamental difference means that agile testing is not just "doing the same testing faster." It requires different techniques, different roles, different thinking. The goal shifts from "finding defects" to "preventing defects" — and from "proving it works" to "building confidence continuously."
The Mindset Shift
| Aspect | Waterfall Testing | Agile Testing |
|---|---|---|
| When | After development phase | Continuously within each sprint |
| Who | Dedicated QA team | Whole team (devs + testers + PO) |
| Goal | Find defects before release | Prevent defects; build confidence |
| Automation | Separate automation phase after manual testing | Automation written alongside code |
| Feedback loop | Weeks to months | Minutes to hours |
| Documentation | Test plans, test cases, RTM | Living documentation (BDD scenarios) |
| Defect handling | Defect backlog, triage meetings | Same-day fix, zero-bug sprints |
The Agile Testing Quadrants
Brian Marick's Testing Quadrants (popularised by Lisa Crispin and Janet Gregory in Agile Testing) provide a framework for understanding what kinds of testing serve what purposes. The quadrants map tests along two axes: business-facing vs technology-facing and guiding development vs critiquing the product.
quadrantChart
title Agile Testing Quadrants
x-axis "Technology-Facing" --> "Business-Facing"
y-axis "Critique Product" --> "Guide Development"
quadrant-1 "Q2: Business-Facing, Guide Dev"
quadrant-2 "Q1: Technology-Facing, Guide Dev"
quadrant-3 "Q4: Technology-Facing, Critique"
quadrant-4 "Q3: Business-Facing, Critique"
"Unit Tests": [0.2, 0.8]
"Component Tests": [0.3, 0.7]
"Functional Tests": [0.7, 0.8]
"Story Tests (BDD)": [0.8, 0.75]
"Prototypes": [0.85, 0.65]
"Exploratory Testing": [0.8, 0.3]
"Usability Testing": [0.75, 0.2]
"UAT": [0.85, 0.35]
"Performance Tests": [0.25, 0.3]
"Security Tests": [0.2, 0.2]
"Load Tests": [0.3, 0.25]
What Fits in Each Quadrant
| Quadrant | Purpose | Activities | Automated? | Who |
|---|---|---|---|---|
| Q1 (Tech, Guide) | Support the team — verify code works as designed | Unit tests, component tests, integration tests | Fully automated | Developers |
| Q2 (Business, Guide) | Validate we are building the right thing | BDD scenarios, story tests, prototypes, simulations | Automated where possible | Team + PO |
| Q3 (Business, Critique) | Evaluate the product from user perspective | Exploratory testing, usability testing, UAT, beta testing | Manual (human judgment) | Testers + Users |
| Q4 (Tech, Critique) | Critique non-functional properties | Performance tests, security scans, load tests, stress tests | Automated with tools | Specialists |
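The quadrant table can be read as a lookup on two axes. The sketch below is a minimal illustration of that idea, not part of the Crispin and Gregory model itself: the names `Facing`, `Purpose`, `Activity`, and `missing_quadrants` are hypothetical, and the quadrant numbering simply follows the table above. It shows one way a team might check whether its current test activities leave a quadrant empty.

```python
from dataclasses import dataclass
from enum import Enum


class Facing(Enum):
    TECHNOLOGY = "technology-facing"
    BUSINESS = "business-facing"


class Purpose(Enum):
    GUIDE = "guide development"
    CRITIQUE = "critique product"


# Quadrant numbering as used in the table above.
QUADRANTS = {
    (Facing.TECHNOLOGY, Purpose.GUIDE): "Q1",
    (Facing.BUSINESS, Purpose.GUIDE): "Q2",
    (Facing.BUSINESS, Purpose.CRITIQUE): "Q3",
    (Facing.TECHNOLOGY, Purpose.CRITIQUE): "Q4",
}


@dataclass(frozen=True)
class Activity:
    """A test activity placed on the two quadrant axes (hypothetical helper)."""
    name: str
    facing: Facing
    purpose: Purpose

    @property
    def quadrant(self) -> str:
        return QUADRANTS[(self.facing, self.purpose)]


def missing_quadrants(activities):
    """Return the quadrants that no listed activity falls into."""
    covered = {activity.quadrant for activity in activities}
    return sorted(set(QUADRANTS.values()) - covered)


team_activities = [
    Activity("Unit tests", Facing.TECHNOLOGY, Purpose.GUIDE),
    Activity("Exploratory testing", Facing.BUSINESS, Purpose.CRITIQUE),
]
print(missing_quadrants(team_activities))
```

A team that only writes unit tests and runs exploratory sessions would see Q2 and Q4 reported as gaps, which is a prompt for discussion rather than a hard rule.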
In-Sprint Testing — Same-Day Quality
In-sprint testing means that every user story is developed, tested, and verified within the same sprint, ideally within a day or two of the code being written. There is no "we'll test it next sprint" and no separate QA sprint. The Definition of Done includes testing.
The Same-Day Defect Resolution Goal
When a defect is found during the sprint (not weeks later), the cost of fixing it is minimal. The developer still has context. The code is fresh. The fix is a 30-minute task, not a 2-day archaeological dig through code you wrote months ago.
- Day 1: Developer picks up story, writes code + unit tests
- Day 1-2: Tester pairs with developer to write acceptance tests
- Day 2: Story is "code complete" — tests passing, ready for review
- Day 2-3: Exploratory testing finds edge case
- Day 3: Developer fixes edge case same day, story is Done
Definition of Done — Testing Criteria
# Definition of Done (includes testing requirements)
definition_of_done:
  code:
    - Code peer-reviewed and approved
    - No compiler warnings or linter errors
    - Feature flag configured (if applicable)
  testing:
    - Unit tests written and passing (>80% branch coverage for new code)
    - Integration tests passing
    - Acceptance criteria verified (BDD scenarios green)
    - Exploratory testing session completed (30 min minimum)
    - No open defects of severity Critical or High
    - Performance baseline not degraded (p99 latency)
  automation:
    - New automated tests added to CI pipeline
    - No increase in flaky test count
    - Test data factories updated (if new domain objects)
  documentation:
    - BDD scenarios serve as living documentation
    - API changes documented in OpenAPI spec
    - Release notes updated
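Several of the testing criteria above are measurable, so they could be checked automatically before a story is marked Done. The following is a rough sketch under stated assumptions: `StoryStatus` and `unmet_testing_criteria` are hypothetical names, the input values would have to come from your own coverage, defect-tracking, and performance tooling, and "not degraded" is interpreted here as "p99 latency no higher than the baseline", with no tolerance band.

```python
from dataclasses import dataclass, field


@dataclass
class StoryStatus:
    """Hypothetical snapshot of one story's test evidence."""
    branch_coverage_pct: float        # branch coverage of new code, 0-100
    integration_tests_passing: bool
    bdd_scenarios_green: bool
    exploratory_minutes: int
    p99_latency_ms: float
    baseline_p99_latency_ms: float
    open_defect_severities: list = field(default_factory=list)


def unmet_testing_criteria(story):
    """Return the testing criteria from the Definition of Done not yet met."""
    unmet = []
    if story.branch_coverage_pct <= 80:
        unmet.append("branch coverage for new code above 80%")
    if not story.integration_tests_passing:
        unmet.append("integration tests passing")
    if not story.bdd_scenarios_green:
        unmet.append("BDD scenarios green")
    if story.exploratory_minutes < 30:
        unmet.append("exploratory session of at least 30 minutes")
    if any(s in ("Critical", "High") for s in story.open_defect_severities):
        unmet.append("no open Critical or High defects")
    # Assumption: any increase over the baseline counts as degradation.
    if story.p99_latency_ms > story.baseline_p99_latency_ms:
        unmet.append("p99 latency not degraded")
    return unmet


story = StoryStatus(
    branch_coverage_pct=76.0,
    integration_tests_passing=True,
    bdd_scenarios_green=True,
    exploratory_minutes=45,
    p99_latency_ms=118.0,
    baseline_p99_latency_ms=120.0,
    open_defect_severities=["Low"],
)
for criterion in unmet_testing_criteria(story):
    print("Not done:", criterion)
```

In practice a team would probably allow a small tolerance on the latency comparison, since p99 measurements are noisy from run to run.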
Behavior-Driven Development (BDD)
BDD bridges the gap between business requirements and automated tests by expressing behaviour in a structured natural language that both humans and machines can read. The Given-When-Then format is a shared language between Product Owners, Developers, and Testers.
Given-When-Then (Gherkin Language)
# features/shopping_cart.feature
Feature: Shopping Cart
  As a customer
  I want to manage items in my shopping cart
  So that I can purchase products I need

  Background:
    Given the product catalog contains:
      | name      | price | stock |
      | Widget A  | 19.99 | 100   |
      | Gadget B  | 49.99 | 50    |
      | Doohickey | 9.99  | 200   |

  Scenario: Add item to empty cart
    Given my cart is empty
    When I add "Widget A" to my cart
    Then my cart should contain 1 item
    And the cart total should be $19.99

  Scenario: Add multiple quantities
    Given my cart is empty
    When I add 3 of "Widget A" to my cart
    Then my cart should contain 3 items
    And the cart total should be $59.97

  Scenario: Remove item from cart
    Given my cart contains 2 of "Gadget B"
    When I remove "Gadget B" from my cart
    Then my cart should be empty
    And the cart total should be $0.00

  Scenario: Cannot add out-of-stock item
    Given "Widget A" has 0 in stock
    When I try to add "Widget A" to my cart
    Then I should see an error "Widget A is out of stock"
    And my cart should be empty

  Scenario Outline: Discount tiers
    Given my cart total is <total>
    When the discount is calculated
    Then the discount should be <discount>%

    Examples:
      | total   | discount |
      | $49.99  | 0        |
      | $100.00 | 5        |
      | $200.00 | 10       |
      | $500.00 | 15       |
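The "Discount tiers" outline pins down four data points but leaves the boundaries between them open (for example, what a $150.00 cart should get). As a minimal sketch, assuming each listed total is the inclusive lower bound of its tier, the rule might be implemented like this. `DISCOUNT_TIERS` and `discount_percent` are hypothetical names, not part of the feature file.

```python
from decimal import Decimal

# (minimum cart total, discount percent), highest tier first.
# Assumption: each total in the Examples table is an inclusive lower bound.
DISCOUNT_TIERS = [
    (Decimal("500.00"), 15),
    (Decimal("200.00"), 10),
    (Decimal("100.00"), 5),
]


def discount_percent(total: Decimal) -> int:
    """Return the discount percentage for a cart total."""
    for minimum, percent in DISCOUNT_TIERS:
        if total >= minimum:
            return percent
    return 0


for total in ("49.99", "100.00", "200.00", "500.00"):
    print(total, discount_percent(Decimal(total)))
```

If the Product Owner intends different boundaries, adding rows such as $99.99 and $199.99 to the Examples table is a cheap way to make that decision explicit.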
Implementation with pytest-bdd
import pytest
from pytest_bdd import scenarios, given, when, then, parsers

from shopping_cart import Cart, Product, ProductCatalog

# Load all scenarios from the feature file.
# Steps without a definition below (remove, out-of-stock, discount tiers)
# still need their own step functions before those scenarios can pass.
scenarios('features/shopping_cart.feature')


@pytest.fixture
def catalog():
    return ProductCatalog()


@pytest.fixture
def cart():
    return Cart()


@given("the product catalog contains:", target_fixture="catalog")
def catalog_with_products(catalog, datatable):
    # pytest-bdd (8.0 and later) passes the step's data table as a list of
    # rows, each a list of strings, with the header row first.
    header, *rows = datatable
    for row in rows:
        record = dict(zip(header, row))
        catalog.add(Product(
            name=record["name"],
            price=float(record["price"]),
            stock=int(record["stock"]),
        ))
    return catalog


@given("my cart is empty", target_fixture="cart")
def empty_cart():
    return Cart()


@given(parsers.parse('my cart contains {quantity:d} of "{product_name}"'))
def cart_with_items(cart, catalog, quantity, product_name):
    product = catalog.get(product_name)
    cart.add(product, quantity)


@when(parsers.parse('I add "{product_name}" to my cart'))
def add_to_cart(cart, catalog, product_name):
    product = catalog.get(product_name)
    cart.add(product, 1)


@when(parsers.parse('I add {quantity:d} of "{product_name}" to my cart'))
def add_quantity_to_cart(cart, catalog, quantity, product_name):
    product = catalog.get(product_name)
    cart.add(product, quantity)


# Two patterns so that both "1 item" and "3 items" match.
@then(parsers.parse("my cart should contain {count:d} item"))
@then(parsers.parse("my cart should contain {count:d} items"))
def cart_item_count(cart, count):
    assert cart.item_count == count


@then(parsers.parse("the cart total should be ${total:f}"))
def cart_total(cart, total):
    # Compare to the nearest cent; a 1% relative tolerance would let a
    # $19.99 expectation pass for any total within about 20 cents.
    assert cart.total == pytest.approx(total, abs=0.005)
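The step definitions import `Cart`, `Product`, and `ProductCatalog` from a `shopping_cart` module that is not shown. Purely as an illustration of what those steps expect, here is one possible minimal version. Everything in it is an assumption inferred from how the steps call it (`catalog.add`, `catalog.get`, `cart.add`, `cart.item_count`, `cart.total`); the `remove` method and `OutOfStockError` are hypothetical additions that the remove and out-of-stock scenarios would need.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    price: float
    stock: int


class OutOfStockError(Exception):
    """Raised when a cart would hold more units than are in stock."""


class ProductCatalog:
    def __init__(self):
        self._products = {}

    def add(self, product):
        self._products[product.name] = product

    def get(self, name):
        return self._products[name]


class Cart:
    def __init__(self):
        self._lines = {}  # product name -> (Product, quantity)

    def add(self, product, quantity=1):
        _, held = self._lines.get(product.name, (product, 0))
        if held + quantity > product.stock:
            raise OutOfStockError(f"{product.name} is out of stock")
        self._lines[product.name] = (product, held + quantity)

    def remove(self, product):
        self._lines.pop(product.name, None)

    @property
    def item_count(self):
        return sum(quantity for _, quantity in self._lines.values())

    @property
    def total(self):
        # Rounded to cents; float prices mirror the step definitions.
        return round(sum(p.price * q for p, q in self._lines.values()), 2)


catalog = ProductCatalog()
catalog.add(Product("Widget A", 19.99, 100))
cart = Cart()
cart.add(catalog.get("Widget A"), 3)
print(cart.item_count, cart.total)
```

Prices are kept as `float` only to match the step definitions; production code handling money would more likely use `decimal.Decimal` or integer cents.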
Acceptance Test-Driven Development (ATDD)
ATDD takes BDD one step further: acceptance tests are written before development begins, as part of sprint planning. The "Three Amigos" — Product Owner, Developer, and Tester — collaborate to define acceptance criteria as executable specifications.
The Three Amigos Conversation
sequenceDiagram
participant PO as Product Owner
participant Dev as Developer
participant QA as Tester
PO->>Dev: Here's the user story
PO->>QA: Here's the user story
Note over PO,QA: Three Amigos Session (30 min)
QA->>PO: What about edge case X?
PO->>QA: Good catch — here's the expected behaviour
Dev->>PO: Is this technically feasible in sprint?
PO->>Dev: Yes, simplified version is fine
QA->>Dev: I'll write acceptance scenarios
Dev->>QA: I'll make them pass
Note over Dev,QA: Development + Testing in parallel
QA-->>Dev: Scenarios are green ✓
Dev-->>PO: Story is Done ✓
TDD vs BDD vs ATDD — When to Use Each
| Practice | Scope | Written By | Language | Primary Benefit |
|---|---|---|---|---|
| TDD | Method/function level | Developer alone | Code (test framework) | Better code design, regression safety |
| BDD | Feature/behaviour level | Team collaboration | Gherkin (natural language) | Shared understanding, living docs |
| ATDD | Story/acceptance level | Three Amigos before dev | Executable specifications | Right thing built, no rework |
Rule of thumb: Use TDD for internal code quality. Use BDD for features that cross team boundaries. Use ATDD for stories where requirements are ambiguous or have high business impact.
Exploratory Testing — Human Intelligence
Exploratory testing is, in James Bach's widely cited definition, "simultaneous learning, test design, and test execution." Unlike scripted testing (where you follow predefined steps), exploratory testing uses human creativity, domain knowledge, and intuition to find bugs that automation cannot.
It is not ad hoc or unstructured: in its session-based form it runs on clear charters, time-boxes, and debriefs.
Session-Based Test Management (SBTM)
# Exploratory Testing Charter
session:
  id: ET-2026-05-034
  tester: "Jane Developer"
  date: "2026-05-13"
  duration: 45 minutes
  charter: |
    Explore the checkout flow with international addresses
    to discover edge cases in address validation
    using various country formats (UK, Germany, Japan, Brazil)
  areas:
    - Checkout address form
    - Address validation API
    - Shipping cost calculation
    - Order confirmation display
  notes: |
    - UK postcodes with spaces (SW1A 1AA) accepted correctly ✓
    - German addresses with umlauts (Müller Straße) display correctly ✓
    - Japanese addresses: 3-line format not supported (BUG filed)
    - Brazilian CEP format (12345-678) rejected by validator (BUG filed)
    - Empty state/province field causes 500 error for countries without states (BUG filed)
  bugs_found: 3
  bugs:
    - "JIRA-4521: Japanese address format not supported (Medium)"
    - "JIRA-4522: Brazilian CEP postal code regex too restrictive (Low)"
    - "JIRA-4523: 500 error when state/province is empty (High)"
  insights: |
    The address validation relies on a US-centric regex pattern.
    Recommend: Replace with Google Address Validation API or
    country-specific validation libraries.
  debrief_notes: |
    Shared findings in standup. High-severity bug fixed same day.
    Medium- and low-severity bugs added to next sprint backlog.
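The session's insight, that a single US-centric pattern rejects valid international postal codes, can be made concrete with a small sketch. The patterns below are deliberately simplified approximations chosen for illustration, not complete postal specifications, and `POSTAL_CODE_PATTERNS` and `is_valid_postal_code` are hypothetical names. A real system would more plausibly rely on a maintained validation library or service, as the charter recommends.

```python
import re

# Simplified, illustrative patterns keyed by ISO 3166-1 alpha-2 country code.
POSTAL_CODE_PATTERNS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),
    "GB": re.compile(r"[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}"),
    "DE": re.compile(r"\d{5}"),
    "JP": re.compile(r"\d{3}-?\d{4}"),
    "BR": re.compile(r"\d{5}-?\d{3}"),
}


def is_valid_postal_code(country_code: str, postal_code: str) -> bool:
    """Validate a postal code against the pattern for its country."""
    pattern = POSTAL_CODE_PATTERNS.get(country_code.upper())
    candidate = postal_code.strip().upper()
    if pattern is None:
        # Assumption: for countries without a pattern, accept any non-empty
        # value instead of rejecting a possibly valid address.
        return bool(candidate)
    return pattern.fullmatch(candidate) is not None


print(is_valid_postal_code("GB", "SW1A 1AA"))   # UK postcode from the notes
print(is_valid_postal_code("BR", "12345-678"))  # Brazilian CEP from the notes
print(is_valid_postal_code("US", "12345-678"))  # CEP checked as a US ZIP
```

The last call illustrates the kind of failure behind JIRA-4522: a value that is valid for Brazil is rejected when checked against a US ZIP pattern.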
Exploratory Testing Effectiveness (Itkonen et al., 2012)
A study at the University of Helsinki compared exploratory testing with scripted test case execution across four software projects. Key findings: exploratory testing found 48% more defects per hour than scripted testing, and the defects found were of higher severity (more likely to affect users). However, scripted tests provided better coverage of specified requirements. The researchers concluded that the approaches are complementary, not competing — exploratory testing excels at finding unexpected issues that scripted tests cannot anticipate, while scripted/automated tests provide regression safety. The optimal strategy uses both: automated tests for the known, exploratory testing for the unknown.
Testing & the Sprint Cycle
Testing is not a separate activity that happens after coding in a sprint. It is integrated into every sprint ceremony:
flowchart TD
A[Sprint Planning] --> B[Daily Development]
B --> C[Daily Standup]
C --> B
B --> D[Sprint Review]
D --> E[Retrospective]
E --> A
A -.- A1[Testability discussions]
A -.- A2[Write acceptance criteria]
A -.- A3[Identify test data needs]
B -.- B1[TDD / Write tests alongside code]
B -.- B2[Pair testing sessions]
B -.- B3[Exploratory testing]
C -.- C1[Test progress updates]
C -.- C2[Blocker escalation]
D -.- D1[Demo with confidence]
D -.- D2[No surprises - all tested]
E -.- E1[Testing process improvements]
E -.- E2[Flakiness review]
Testing in Each Ceremony
- Sprint Planning: Discuss testability of stories. Estimate testing effort. Identify risky areas needing exploratory sessions. Define acceptance criteria as a team.
- Daily Standup: Report test progress (not just code progress). Raise blockers — "I can't test story X because test data isn't ready." Coordinate pairing sessions.
- Sprint Review: Demonstrate features with confidence because they are already tested. No "we still need to verify this" disclaimers. Show BDD scenario results as evidence.
- Retrospective: Review testing process. Were stories held up by testing? Were there defects that should have been caught earlier? Is the automation suite growing sustainably?
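For the flakiness review in the retrospective, it helps to have an agreed definition of "flaky". One common working definition, used here as an assumption, is a test that both passed and failed against the same commit. The sketch below applies that rule to a list of CI results; `RunResult` and `flaky_tests` are hypothetical names, and the data would need to be exported from your own CI system.

```python
from collections import defaultdict
from typing import NamedTuple


class RunResult(NamedTuple):
    test_name: str
    commit: str
    passed: bool


def flaky_tests(results):
    """Return names of tests that both passed and failed on the same commit."""
    outcomes = defaultdict(set)
    for result in results:
        outcomes[(result.test_name, result.commit)].add(result.passed)
    flaky = {name for (name, _), seen in outcomes.items() if len(seen) == 2}
    return sorted(flaky)


history = [
    RunResult("test_checkout", "abc123", True),
    RunResult("test_checkout", "abc123", False),  # same commit, mixed outcome
    RunResult("test_login", "abc123", False),
    RunResult("test_login", "abc123", False),     # consistently failing
    RunResult("test_search", "abc123", True),
    RunResult("test_search", "def456", False),    # changed with the code
]
print(flaky_tests(history))
```

Tracking the size of this list sprint over sprint gives the "No increase in flaky test count" criterion something concrete to compare against.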
Test-First Acceptance — Specifications as Tests
The most powerful pattern in agile testing is writing acceptance criteria as executable specifications before development starts. This eliminates ambiguity, prevents rework, and creates living documentation automatically.
# Write BEFORE development: this IS the requirement
Feature: Loyalty Points Calculation
  As a returning customer
  I want to earn loyalty points on purchases
  So that I can redeem them for discounts

  Rule: Earn 1 point per $10 spent (rounded down)

    Scenario: Standard purchase earns points
      Given I am a loyalty member
      When I complete a purchase of $75.50
      Then I should earn 7 loyalty points
      And my points balance should increase by 7

  Rule: Double points on birthday month

    Scenario: Birthday month doubles points
      Given I am a loyalty member
      And today is within my birthday month
      When I complete a purchase of $50.00
      Then I should earn 10 loyalty points

  Rule: Points expire after 12 months of inactivity

    Scenario: Points expire when inactive
      Given I am a loyalty member with 150 points
      And I have not made a purchase in 13 months
      When the nightly expiration job runs
      Then my points balance should be 0
      And I should receive an expiration notification email
When the developer picks up this story, they have zero ambiguity about what to build. The tester can run these scenarios immediately when code is delivered. The Product Owner can read them and confirm they match intent. Everyone speaks the same language.
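To show how directly these rules translate into code, here is a minimal sketch of the calculation the scenarios describe. The function names `points_earned` and `balance_after_expiry` are hypothetical, and one boundary is an assumption: the scenarios only cover 13 months of inactivity, so the sketch treats 12 full calendar months as already expired. The notification email is out of scope here.

```python
from datetime import date
from decimal import Decimal


def points_earned(amount: Decimal, purchase_date: date, birthday: date) -> int:
    """1 point per full $10 spent, doubled during the member's birthday month."""
    base = int(amount // 10)  # floor division rounds down, e.g. 75.50 -> 7
    if purchase_date.month == birthday.month:
        return base * 2
    return base


def balance_after_expiry(balance: int, last_purchase: date, today: date) -> int:
    """Zero the balance once 12 full months have passed without a purchase."""
    months_inactive = (
        (today.year - last_purchase.year) * 12
        + (today.month - last_purchase.month)
    )
    if today.day < last_purchase.day:
        months_inactive -= 1  # the current month is not yet complete
    # Assumption: exactly 12 full months of inactivity already expires points.
    return 0 if months_inactive >= 12 else balance


today = date(2026, 5, 13)
print(points_earned(Decimal("75.50"), today, birthday=date(1990, 8, 1)))
print(points_earned(Decimal("50.00"), today, birthday=date(1990, 5, 20)))
print(balance_after_expiry(150, last_purchase=date(2025, 4, 1), today=today))
```

Writing the code exposes the open question about the 12-month boundary, which is the kind of gap a Three Amigos session would close by adding one more scenario.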
Whole-Team Quality
In mature agile teams, quality is everyone's responsibility, not just the tester's job. This does not mean everyone does the same testing — it means everyone contributes to quality in their area of expertise.
| Role | Quality Contribution | Testing Activities |
|---|---|---|
| Developer | Code quality, design quality, test automation | Unit tests, integration tests, TDD, code review |
| Tester / QA Engineer | Test strategy, exploratory skills, risk analysis | Exploratory testing, E2E automation, test data strategy |
| Product Owner | Clear acceptance criteria, priority decisions | Define acceptance scenarios, UAT, prioritise bug fixes |
| Scrum Master | Process quality, removing impediments | Ensure testing is in DoD, facilitate three amigos |
| DevOps/Platform | Pipeline reliability, environment quality | CI/CD quality gates, test environment provisioning |
Common Agile Testing Mistakes
The Seven Deadly Sins of Agile Testing
Based on patterns observed across hundreds of agile teams, these are the most common testing anti-patterns that undermine sprint delivery:
| Mistake | Symptom | Root Cause | Fix |
|---|---|---|---|
| Mini-Waterfall | Dev in week 1, testing in week 2 of sprint | Testing not integrated into Definition of Done | Pair testing, story-level DoD with tests |
| QA Sprint | "Hardening sprint" or "stabilisation sprint" | Quality debt accumulating sprint over sprint | Zero-bug policy, same-sprint testing |
| Automation Backlog | Growing list of "tests to automate later" | Automation not part of story estimation | Include automation in story points |
| Testing Not Estimated | Stories finish "code complete" but not "done" | Only development effort estimated | Estimate dev + test + automation together |
| Tester Bottleneck | Stories queue up waiting for the one tester | Team relies on single person for all testing | Developers write tests, share testing responsibility |
| No Exploratory Time | Only scripted/automated tests, subtle bugs escape | 100% focus on automation, no time for thinking | Budget 20% of testing time for exploration |
| Requirements as Tests | Hundreds of UI tests mirroring requirements doc | Confusing "acceptance criteria" with "test the UI" | Test at the lowest level that covers the behaviour (prefer API tests over UI tests) |
The Zero-Bug Sprint Goal
A "zero-bug sprint" does not mean "no bugs are found." It means every bug found during the sprint is fixed during the sprint. No defect backlog. No "we'll fix it later." This forces the team to:
- Limit work-in-progress (fewer stories = more focus = fewer bugs)
- Write tests before code (prevents defects instead of finding them)
- Fix bugs immediately while context is fresh (30-minute fix vs 2-day fix)
- Improve quality over time (fewer defects per sprint as practices mature)
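The zero-bug definition above is precise enough to check mechanically. As a small sketch, assuming defects can be exported with the date they were found and the date they were fixed, the following reports which defects found in a sprint were carried past its end. `Defect`, `carried_over`, and `is_zero_bug_sprint` are hypothetical names, and the ticket keys are made up for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Defect:
    key: str
    found_on: date
    fixed_on: Optional[date] = None  # None means still open


def carried_over(defects, sprint_start: date, sprint_end: date):
    """Return keys of defects found in the sprint but not fixed by its end."""
    return [
        d.key
        for d in defects
        if sprint_start <= d.found_on <= sprint_end
        and (d.fixed_on is None or d.fixed_on > sprint_end)
    ]


def is_zero_bug_sprint(defects, sprint_start: date, sprint_end: date) -> bool:
    """True when every defect found in the sprint was fixed in the sprint."""
    return not carried_over(defects, sprint_start, sprint_end)


sprint_start, sprint_end = date(2026, 5, 4), date(2026, 5, 15)
defects = [
    Defect("BUG-1", found_on=date(2026, 5, 6), fixed_on=date(2026, 5, 6)),
    Defect("BUG-2", found_on=date(2026, 5, 12), fixed_on=date(2026, 5, 19)),
    Defect("BUG-3", found_on=date(2026, 5, 14)),
]
print(carried_over(defects, sprint_start, sprint_end))
print(is_zero_bug_sprint(defects, sprint_start, sprint_end))
```

Reviewing the carried-over list in the retrospective keeps the focus on why a defect could not be fixed in-sprint rather than on the raw bug count.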
Exercises
Put agile testing concepts into practice with your current team or a sample project.
Conclusion & Next Steps
Agile testing is not "do testing faster." It is a fundamental rethinking of when, how, and by whom testing happens. The key practices are: testing quadrants (balance all four), BDD/ATDD (executable specifications before code), exploratory testing (human creativity finds what automation cannot), and whole-team quality (no handoffs, shared responsibility).
The most impactful change a team can make is adopting the zero-bug sprint policy: every defect found in the sprint is fixed in the sprint. This single practice forces all the other good habits — smaller stories, test-first development, immediate feedback, and continuous quality improvement.
Next in the Series
In Part 37: AI in Software Development, we will explore how artificial intelligence is transforming every stage of the software delivery lifecycle — from AI-assisted coding and automated test generation to intelligent deployment decisions and self-healing systems.