Introduction — The Base of the Pyramid
Unit tests form the widest, fastest, and cheapest layer of the testing pyramid. They verify individual units of code — functions, methods, classes — in complete isolation from external dependencies. A well-tested codebase might have thousands of unit tests that execute in under ten seconds.
But what makes a test a "unit" test? The industry has debated this for decades. Some define "unit" as a single function; others define it as a single class or module. The practical definition comes down to five properties:
Properties of Good Unit Tests
| Property | What It Means | Why It Matters |
|---|---|---|
| Fast | Executes in milliseconds, not seconds | Developers run them constantly during development |
| Isolated | No filesystem, network, database, or external service | Can run anywhere, in any order, in parallel |
| Deterministic | Same input always produces same result | No flaky failures, no time-dependent behaviour |
| Self-Validating | Pass or fail without human interpretation | Automation requires binary outcomes |
| Timely | Written close to (or before) the production code | Catches bugs when they are cheapest to fix |
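The Isolated and Deterministic rows are easiest to see with time-dependent code. The sketch below is a hypothetical example (the `is_expired` function and its `now` parameter are illustrative, not from any library). Passing the current time as an argument, instead of reading the system clock inside the function, lets the test fix every input and get the same result on every run.

```python
from datetime import datetime, timedelta


def is_expired(created_at: datetime, ttl: timedelta, now: datetime) -> bool:
    """Return True once `ttl` has elapsed since `created_at`.

    `now` is passed in rather than read from the system clock, so the
    function depends only on its arguments.
    """
    return now - created_at >= ttl


def test_token_expires_after_ttl():
    # Arrange: fixed timestamps, no call to datetime.now()
    created = datetime(2024, 1, 1, 12, 0, 0)
    ttl = timedelta(minutes=30)

    # Act / Assert: just before the boundary and exactly at it
    assert is_expired(created, ttl, now=created + timedelta(minutes=29)) is False
    assert is_expired(created, ttl, now=created + timedelta(minutes=30)) is True
```

Production code would call the function with the real current time; only the test supplies a fixed value.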
Anatomy of a Unit Test
Every unit test follows the same three-phase structure, regardless of language or framework:
Arrange-Act-Assert (AAA)
- Arrange — Set up the test data, create objects, configure mocks
- Act — Call the function or method under test
- Assert — Verify the result matches expectations
Some teams prefer the BDD-style equivalent: Given-When-Then. The semantics are identical.
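As a minimal sketch of how the two vocabularies line up, the test below labels each phase with both names. The `apply_discount` function is a hypothetical example written for this illustration.

```python
def apply_discount(total: float, percent: float) -> float:
    """Hypothetical function under test: reduce `total` by `percent`."""
    return round(total * (1 - percent / 100), 2)


def test_ten_percent_discount_on_order_of_200():
    # Given an order total and a discount rate (Arrange)
    total = 200.0
    percent = 10

    # When the discount is applied (Act)
    discounted = apply_discount(total, percent)

    # Then the total is reduced by that percentage (Assert)
    assert discounted == 180.0
```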
Naming Conventions
Good test names communicate what is being tested, under what conditions, and what the expected outcome is. Common patterns:
- methodName_condition_expectedResult, e.g., calculateDiscount_orderOver100_returns10Percent
- should [expected behaviour] when [condition], e.g., should return 10% discount when order exceeds 100
- test_[scenario] (Python convention), e.g., test_discount_applied_for_large_orders
JavaScript (Jest) — Complete Examples
// calculator.js — The module under test
function add(a, b) {
if (typeof a !== 'number' || typeof b !== 'number') {
throw new TypeError('Arguments must be numbers');
}
return a + b;
}
function divide(a, b) {
if (b === 0) throw new Error('Division by zero');
return a / b;
}
module.exports = { add, divide };
// calculator.test.js — Unit tests using Jest
const { add, divide } = require('./calculator');
describe('add', () => {
// Arrange-Act-Assert pattern
test('adds two positive numbers', () => {
// Arrange
const a = 2;
const b = 3;
// Act
const result = add(a, b);
// Assert
expect(result).toBe(5);
});
test('adds negative numbers correctly', () => {
expect(add(-1, -2)).toBe(-3);
});
test('throws TypeError for non-numeric input', () => {
expect(() => add('2', 3)).toThrow(TypeError);
expect(() => add(2, null)).toThrow(TypeError);
});
});
describe('divide', () => {
test('divides two numbers', () => {
expect(divide(10, 2)).toBe(5);
});
test('returns float for non-even division', () => {
expect(divide(7, 2)).toBeCloseTo(3.5);
});
test('throws error when dividing by zero', () => {
expect(() => divide(10, 0)).toThrow('Division by zero');
});
});
Python (pytest) — Complete Examples
# calculator.py: the module under test
def add(a: float, b: float) -> float:
    if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
        raise TypeError("Arguments must be numbers")
    return a + b

def divide(a: float, b: float) -> float:
    if b == 0:
        raise ZeroDivisionError("Division by zero")
    return a / b
# test_calculator.py: unit tests using pytest
import pytest

# Import the module under test instead of redefining its functions here
from calculator import add, divide

# Test: add two positive numbers
def test_add_positive_numbers():
    # Arrange
    a, b = 2, 3
    # Act
    result = add(a, b)
    # Assert
    assert result == 5

# Test: add negative numbers
def test_add_negative_numbers():
    assert add(-1, -2) == -3

# Test: add raises TypeError for non-numeric input
def test_add_raises_type_error():
    with pytest.raises(TypeError):
        add("2", 3)

# Test: divide two numbers
def test_divide_numbers():
    assert divide(10, 2) == 5.0

# Test: divide returns float
def test_divide_returns_float():
    assert divide(7, 2) == pytest.approx(3.5)

# Test: divide by zero raises error
def test_divide_by_zero():
    with pytest.raises(ZeroDivisionError):
        divide(10, 0)
Jest uses describe/test blocks with nested structure. Pytest typically uses flat functions prefixed with test_. Both achieve the same isolation and clarity, so choose based on your ecosystem.
Test Doubles
In unit testing, you need to isolate the code under test from its dependencies. Test doubles replace real collaborators with controlled substitutes. Gerard Meszaros' taxonomy (from xUnit Test Patterns) defines five types:
flowchart LR
TD[Test Double] --> DU[Dummy]
TD --> ST[Stub]
TD --> SP[Spy]
TD --> MO[Mock]
TD --> FA[Fake]
DU -.- DU_DESC["Fills parameter lists
Never actually used"]
ST -.- ST_DESC["Returns canned data
No verification"]
SP -.- SP_DESC["Records calls
Verified after act"]
MO -.- MO_DESC["Pre-programmed expectations
Verifies interactions"]
FA -.- FA_DESC["Working implementation
Simplified version"]
Dummy
A dummy fills a required parameter but is never actually used. It exists only to satisfy a function signature.
// JavaScript — Dummy example
// The logger is required by the function signature but never used in this test
const dummyLogger = {};
function createUser(name, logger) {
return { name, createdAt: Date.now() };
}
test('createUser returns user with name', () => {
const user = createUser('Alice', dummyLogger);
expect(user.name).toBe('Alice');
});
Stub
A stub provides canned responses to calls made during a test. It does not verify how it was called — only the test's assertions matter.
# Python: stub example
from unittest.mock import MagicMock

def get_user_greeting(user_service, user_id):
    """Function under test; depends on user_service."""
    user = user_service.find_by_id(user_id)
    return f"Hello, {user['name']}!"

def test_greeting_uses_user_name():
    # Arrange: stub the user service
    stub_service = MagicMock()
    stub_service.find_by_id.return_value = {"name": "Alice", "email": "alice@example.com"}
    # Act
    result = get_user_greeting(stub_service, 42)
    # Assert: only verify the output, not how the stub was called
    assert result == "Hello, Alice!"
Spy
A spy records information about how it was called. After the act phase, you verify calls were made correctly.
// JavaScript (Jest) — Spy example
function processOrder(order, notificationService) {
// Business logic...
const total = order.items.reduce((sum, item) => sum + item.price, 0);
notificationService.sendConfirmation(order.email, total);
return { status: 'confirmed', total };
}
test('processOrder sends confirmation email', () => {
// Arrange: spy on the notification service
const spyService = { sendConfirmation: jest.fn() };
const order = {
email: 'bob@example.com',
items: [{ price: 10 }, { price: 20 }]
};
// Act
const result = processOrder(order, spyService);
// Assert: verify the spy was called correctly
expect(spyService.sendConfirmation).toHaveBeenCalledWith('bob@example.com', 30);
expect(spyService.sendConfirmation).toHaveBeenCalledTimes(1);
expect(result.total).toBe(30);
});
Mock
A mock is pre-programmed with expectations. It verifies that specific interactions occur — often including call order, argument values, and call count. Mocks fail the test if expectations aren't met.
# Python: mock with assertions on calls
from unittest.mock import MagicMock

def transfer_funds(from_account, to_account, amount, audit_log):
    """Transfer money and log the transaction."""
    from_account.debit(amount)
    to_account.credit(amount)
    audit_log.record(f"Transferred {amount} from {from_account.id} to {to_account.id}")

def test_transfer_funds_logs_transaction():
    # Arrange: create mocks
    from_acc = MagicMock(id="ACC-001")
    to_acc = MagicMock(id="ACC-002")
    mock_audit = MagicMock()
    # Act
    transfer_funds(from_acc, to_acc, 500, mock_audit)
    # Assert: mock verifies interactions
    from_acc.debit.assert_called_once_with(500)
    to_acc.credit.assert_called_once_with(500)
    mock_audit.record.assert_called_once_with("Transferred 500 from ACC-001 to ACC-002")
Fake
A fake has a working implementation but takes shortcuts. Common examples: in-memory databases, fake file systems, local email servers.
// JavaScript — Fake repository (in-memory implementation)
class FakeUserRepository {
constructor() {
this.users = new Map();
this.nextId = 1;
}
save(user) {
const id = this.nextId++;
const saved = { ...user, id };
this.users.set(id, saved);
return saved;
}
findById(id) {
return this.users.get(id) || null;
}
findAll() {
return Array.from(this.users.values());
}
}
test('user service creates and retrieves users', () => {
const repo = new FakeUserRepository();
const saved = repo.save({ name: 'Alice', email: 'alice@test.com' });
const found = repo.findById(saved.id);
expect(found.name).toBe('Alice');
expect(found.id).toBe(1);
});
Meszaros' xUnit Test Patterns (2007)
Gerard Meszaros' seminal work introduced the unified taxonomy of test doubles that the industry still uses today. Before it, terms like "mock" and "stub" were often used interchangeably. His classification (Dummy, Stub, Spy, Mock, Fake) provides precise language for discussing test isolation strategies. Martin Fowler's article "Mocks Aren't Stubs" (first published in 2004 and later revised to adopt Meszaros' vocabulary) popularised the distinction between state verification (typical of stubs) and behaviour verification (typical of mocks).
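The difference between state verification and behaviour verification can be sketched by testing one function both ways. Everything here is hypothetical and written for illustration: `register_user`, the `InMemoryMailbox` double, and the message text are assumptions, not part of any library.

```python
from unittest.mock import MagicMock


class InMemoryMailbox:
    """Hand-written test double that simply stores what it is given."""

    def __init__(self):
        self.sent = []

    def send(self, to, subject):
        self.sent.append((to, subject))


def register_user(email, mailer):
    """Hypothetical function under test: sends one welcome message."""
    mailer.send(email, "Welcome!")
    return {"email": email}


def test_register_user_state_verification():
    # State verification: inspect the collaborator's state after the act
    mailbox = InMemoryMailbox()
    register_user("alice@example.com", mailbox)
    assert mailbox.sent == [("alice@example.com", "Welcome!")]


def test_register_user_behaviour_verification():
    # Behaviour verification: assert on the interaction itself
    mailer = MagicMock()
    register_user("alice@example.com", mailer)
    mailer.send.assert_called_once_with("alice@example.com", "Welcome!")
```

Both tests check the same requirement. The first keeps passing if `register_user` later reaches the mailbox through a different call path that leaves the same state; the second is tied to the exact `send` call.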
Mocking Best Practices
Mocking is powerful but dangerous. Overuse creates brittle tests that break on every refactor. Here are the guiding principles:
Rule 1: Mock at Boundaries
Mock external dependencies — HTTP clients, databases, file systems, third-party APIs. Do not mock internal collaborators within the same module. If you find yourself mocking everything, your code probably has too many dependencies.
// GOOD: Mock the HTTP client (external boundary)
const axios = require('axios');
jest.mock('axios');
async function fetchUserProfile(userId) {
const response = await axios.get(`/api/users/${userId}`);
return { name: response.data.name, email: response.data.email };
}
test('fetchUserProfile extracts name and email', async () => {
axios.get.mockResolvedValue({
data: { name: 'Alice', email: 'alice@test.com', id: 1, role: 'admin' }
});
const profile = await fetchUserProfile(1);
expect(profile).toEqual({ name: 'Alice', email: 'alice@test.com' });
});
Rule 2: Don't Over-Mock
// BAD: Mocking internal implementation details
function calculateTotal(items) {
return items.reduce((sum, item) => sum + item.price * item.quantity, 0);
}
// This is WRONG — don't mock Array.reduce!
// Instead, just test with real data:
test('calculateTotal sums price * quantity', () => {
const items = [
{ price: 10, quantity: 2 },
{ price: 5, quantity: 3 }
];
expect(calculateTotal(items)).toBe(35);
});
Rule 3: Don't Mock What You Don't Own
If you mock a third-party library's internals, your test is coupled to that library's implementation. Instead, create a thin wrapper (adapter) around the library and mock your own wrapper.
# Python: adapter pattern for testability
from unittest.mock import MagicMock

import requests

# Your adapter (you own this, so it is safe to mock)
class WeatherClient:
    def __init__(self, base_url="https://api.weather.com"):
        self.base_url = base_url

    def get_temperature(self, city):
        # params= lets requests URL-encode the city; timeout avoids hanging forever
        response = requests.get(
            f"{self.base_url}/current", params={"city": city}, timeout=10
        )
        response.raise_for_status()
        return response.json()["temperature"]

# Function under test
def should_wear_jacket(weather_client, city):
    temp = weather_client.get_temperature(city)
    return temp < 15

# Tests: mock YOUR adapter, not the requests library
def test_jacket_recommended_when_cold():
    mock_client = MagicMock()
    mock_client.get_temperature.return_value = 8
    assert should_wear_jacket(mock_client, "London") is True

def test_no_jacket_when_warm():
    mock_client = MagicMock()
    mock_client.get_temperature.return_value = 22
    assert should_wear_jacket(mock_client, "London") is False
Test-Driven Development (TDD)
TDD inverts the natural order: you write the test before the production code. Kent Beck formalised this approach in Test-Driven Development: By Example (2002), and it remains one of the most debated practices in software engineering.
The Red-Green-Refactor Cycle
flowchart LR
R["🔴 RED
Write a failing test"] --> G["🟢 GREEN
Write minimum code
to pass the test"]
G --> RF["🔵 REFACTOR
Clean up code
while tests stay green"]
RF --> R
- Red — Write a test for the next piece of functionality. Run it. Watch it fail (red). This proves the test is actually testing something.
- Green — Write the simplest code that makes the test pass. Don't over-engineer. Don't optimise. Just make it green.
- Refactor — Now that you have a green test as a safety net, clean up the code. Remove duplication, improve names, extract methods. Run tests again to confirm they still pass.
The Three Laws of TDD (Robert C. Martin)
- You may not write production code until you have written a failing unit test.
- You may not write more of a unit test than is sufficient to fail (and not compiling counts as failing).
- You may not write more production code than is sufficient to pass the currently failing test.
Step-by-Step TDD Example: FizzBuzz
Let's build FizzBuzz using strict TDD. We add one test at a time, write minimum code, then refactor.
# Step 1: RED - First test: return "1" for input 1
# test_fizzbuzz.py
def fizzbuzz(n):
    pass  # Not implemented yet

def test_returns_1_for_1():
    assert fizzbuzz(1) == "1"
# FAILS: fizzbuzz returns None

# Step 2: GREEN - Simplest code to pass
def fizzbuzz(n):
    return str(n)

def test_returns_1_for_1():
    assert fizzbuzz(1) == "1"  # PASSES

def test_returns_2_for_2():
    assert fizzbuzz(2) == "2"  # PASSES (freebie)

# Step 3: RED - Add Fizz for multiples of 3
def fizzbuzz(n):
    return str(n)

def test_returns_fizz_for_3():
    assert fizzbuzz(3) == "Fizz"
# FAILS: returns "3" instead of "Fizz"

# Step 4: GREEN - Handle multiples of 3
def fizzbuzz(n):
    if n % 3 == 0:
        return "Fizz"
    return str(n)

def test_returns_1_for_1():
    assert fizzbuzz(1) == "1"  # PASSES

def test_returns_fizz_for_3():
    assert fizzbuzz(3) == "Fizz"  # PASSES

def test_returns_fizz_for_6():
    assert fizzbuzz(6) == "Fizz"  # PASSES

# Step 5: RED → GREEN - Handle Buzz (multiples of 5)
def fizzbuzz(n):
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

def test_returns_buzz_for_5():
    assert fizzbuzz(5) == "Buzz"  # PASSES

def test_returns_buzz_for_10():
    assert fizzbuzz(10) == "Buzz"  # PASSES

# Step 6: RED → GREEN - Handle FizzBuzz (multiples of both)
def fizzbuzz(n):
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

def test_returns_fizzbuzz_for_15():
    assert fizzbuzz(15) == "FizzBuzz"  # PASSES

def test_returns_fizzbuzz_for_30():
    assert fizzbuzz(30) == "FizzBuzz"  # PASSES
# All tests pass. REFACTOR if needed (this is already clean)
The final version checks n % 15 before the individual checks. Without TDD, many developers write the % 3 and % 5 checks first and forget that 15 satisfies both, leading to a bug. TDD's incremental approach naturally surfaces edge cases.
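To make the ordering issue concrete, here is a small sketch comparing the final implementation with a hypothetical variant (`fizzbuzz_wrong_order`, written only for this illustration) that places the combined check last. The test for 15 is what tells them apart.

```python
def fizzbuzz(n: int) -> str:
    """Final version from the walkthrough: combined check first."""
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


def fizzbuzz_wrong_order(n: int) -> str:
    """Hypothetical variant with the combined check placed last."""
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    if n % 15 == 0:
        return "FizzBuzz"  # unreachable: any multiple of 15 already returned "Fizz"
    return str(n)


# Both versions agree on inputs that are not multiples of 15...
assert fizzbuzz(3) == fizzbuzz_wrong_order(3) == "Fizz"
assert fizzbuzz(5) == fizzbuzz_wrong_order(5) == "Buzz"
# ...but only the correct ordering handles 15.
assert fizzbuzz(15) == "FizzBuzz"
assert fizzbuzz_wrong_order(15) == "Fizz"
```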
TDD Benefits & Criticisms
Benefits
- Design Pressure: TDD forces you to think about interfaces before implementation. Hard-to-test code signals design problems.
- Regression Safety: Every feature has a test from birth. Refactoring is safe.
- Living Documentation: Tests document what the code actually does, not what someone thinks it should do.
- Smaller Increments: You build in tiny steps, catching errors immediately rather than debugging a 200-line function.
- Confidence: The green test suite gives you psychological permission to refactor boldly.
Criticisms
- Slow for Exploration: When you don't know what you're building (spike/prototype phase), writing tests first adds friction.
- Over-Specification: Tests can become coupled to implementation details, making refactoring harder rather than easier.
- The 100% Coverage Myth: TDD can push teams toward testing trivial code (getters, setters) to hit coverage targets.
- Learning Curve: Writing testable code is a design skill that takes months to develop.
- Not All Code Benefits Equally: CRUD operations, UI layouts, and data transformations often don't benefit from test-first.
Code Coverage
Code coverage measures what percentage of your code is exercised by tests. It is a necessary but insufficient quality metric — high coverage doesn't guarantee quality, but low coverage guarantees blind spots.
Coverage Types
| Type | What It Measures | Example |
|---|---|---|
| Statement | % of statements executed | Every line runs at least once |
| Branch | % of if/else branches taken | Both true and false paths tested |
| Function | % of functions called | Every exported function invoked |
| Line | % of executable lines hit | Similar to statement (multi-statement lines differ) |
| Condition | % of boolean sub-expressions evaluated both ways | if (a && b): tests must make a both true and false, and b both true and false (e.g. a=true/b=true, a=true/b=false, a=false) |
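The gap between statement and branch coverage shows up as soon as an `if` has no `else`. In the sketch below (the `shipping_cost` function is a hypothetical example), the first test alone executes every statement, yet the False outcome of the condition is never taken until the second test is added.

```python
def shipping_cost(total: float, is_member: bool) -> float:
    """Hypothetical pricing rule: members get free shipping."""
    cost = 5.0
    if is_member:
        cost = 0.0
    return cost


def test_member_ships_free():
    # Executes every statement (100% statement coverage),
    # but only the True outcome of the `if`.
    assert shipping_cost(40.0, is_member=True) == 0.0


def test_non_member_pays_shipping():
    # Exercises the False outcome, which branch coverage requires.
    assert shipping_cost(40.0, is_member=False) == 5.0
```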
Coverage Tools
# JavaScript — Run Jest with coverage
npx jest --coverage
# Output:
# ----------------------|---------|----------|---------|---------|
# File | % Stmts | % Branch | % Funcs | % Lines |
# ----------------------|---------|----------|---------|---------|
# calculator.js | 100 | 100 | 100 | 100 |
# userService.js | 87 | 75 | 100 | 87 |
# ----------------------|---------|----------|---------|---------|
# Python — Run pytest with coverage
pip install pytest-cov
pytest --cov=src --cov-report=term-missing
# Output:
# Name Stmts Miss Cover Missing
# ---------------------------------------------------
# src/calculator.py 12 0 100%
# src/user_service.py 28 4 86% 34-37
# ---------------------------------------------------
# TOTAL 40 4 90%
Coverage as a Metric vs Coverage as a Goal
Coverage is useful for identifying untested areas, not for proving quality. A function with 100% coverage can still have logic errors if the assertions are weak. Conversely, well-written tests at 80% coverage often catch more bugs than poorly-written tests at 100%.
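A minimal sketch of that point, using hypothetical functions written for this illustration: both checks below execute every line of the function they are given, so a coverage tool would report 100% either way, but only the one that asserts on the value detects the bug.

```python
def average_buggy(values):
    """Deliberately wrong: divides by a constant instead of len(values)."""
    return sum(values) / 2


def average_fixed(values):
    return sum(values) / len(values)


def weak_check(average) -> bool:
    # Calls the function (so every line counts as covered) but only checks the type
    return isinstance(average([1, 2, 3]), float)


def strong_check(average) -> bool:
    # Same coverage, but checks the actual value
    return average([1, 2, 3]) == 2.0


# The weak check cannot tell the two implementations apart...
assert weak_check(average_buggy) and weak_check(average_fixed)
# ...while the strong check rejects the buggy one.
assert strong_check(average_fixed) and not strong_check(average_buggy)
```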
Property-Based Testing
Traditional unit tests use specific examples: "add(2, 3) should return 5". Property-based testing generates many random inputs (100 per property by default in both Hypothesis and fast-check) and verifies that properties (invariants) always hold.
# Python: property-based testing with Hypothesis
from hypothesis import given
from hypothesis.strategies import integers

def add(a, b):
    return a + b

# Property: addition is commutative
@given(integers(), integers())
def test_addition_is_commutative(a, b):
    assert add(a, b) == add(b, a)

# Property: adding zero is identity
@given(integers())
def test_adding_zero_is_identity(a):
    assert add(a, 0) == a

# Property: addition is associative
@given(integers(), integers(), integers())
def test_addition_is_associative(a, b, c):
    assert add(add(a, b), c) == add(a, add(b, c))
// JavaScript — Property-based testing with fast-check
const fc = require('fast-check');
function sort(arr) {
return [...arr].sort((a, b) => a - b);
}
// Property: sorted output has same length as input
test('sort preserves array length', () => {
fc.assert(
fc.property(fc.array(fc.integer()), (arr) => {
return sort(arr).length === arr.length;
})
);
});
// Property: sorted output is ordered
test('sort produces ordered output', () => {
fc.assert(
fc.property(fc.array(fc.integer()), (arr) => {
const sorted = sort(arr);
for (let i = 1; i < sorted.length; i++) {
if (sorted[i] < sorted[i - 1]) return false;
}
return true;
})
);
});
// Property: sorted output contains same elements
test('sort preserves elements', () => {
fc.assert(
fc.property(fc.array(fc.integer()), (arr) => {
const sorted = sort(arr);
return JSON.stringify([...arr].sort((a, b) => a - b)) === JSON.stringify([...sorted].sort((a, b) => a - b));
})
);
});
When to use property-based testing: parsers (parse(serialize(x)) === x), mathematical functions (commutativity, associativity), data structures (invariants), serialization/deserialization roundtrips.
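The round-trip property can be sketched without any extra dependency. The example below is a hand-rolled stand-in that uses the standard library's `random` module in place of Hypothesis, so it has no shrinking or smart edge-case generation; `random_record` is a hypothetical generator written for this illustration.

```python
import json
import random


def random_record(rng: random.Random) -> dict:
    """Generate a small random JSON-compatible record (illustrative only)."""
    return {
        "id": rng.randint(-10**6, 10**6),
        "name": "".join(rng.choice("abc xyz") for _ in range(rng.randint(0, 8))),
        "tags": [rng.randint(0, 9) for _ in range(rng.randint(0, 5))],
        "active": rng.choice([True, False]),
    }


def test_json_round_trip():
    rng = random.Random(12345)  # fixed seed keeps the test deterministic
    for _ in range(200):
        record = random_record(rng)
        # Property: deserialising the serialised value gives back the original
        assert json.loads(json.dumps(record)) == record
```

A dedicated library is still preferable in practice, since it reports a minimal failing input rather than whatever random value happened to break the property.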
Mutation Testing
Mutation testing answers the question: "If I introduce a bug, will my tests catch it?" The tool creates small modifications (mutants) to your production code — changing + to -, > to >=, true to false — and runs your test suite. If tests still pass, the mutant survived, revealing a weakness in your tests.
| Tool | Language | Mutation Types |
|---|---|---|
| Stryker | JavaScript/TypeScript | Arithmetic, conditional, string, array, block removal |
| PIT (Pitest) | Java/Kotlin | Conditional boundary, negation, void method, return values |
| mutmut | Python | Operator replacement, keyword mutation, number changes |
| infection | PHP | 50+ mutation operators |
# Run Stryker on a JavaScript project
npx stryker run
# Output:
# Mutation score: 82%
# Killed: 45 Survived: 10 Timeout: 2 No coverage: 3
#
# Survived mutants:
# src/discount.js:12 — Changed >= to > (boundary mutation)
# src/validator.js:8 — Removed return statement
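What a surviving boundary mutant means can be shown by hand, without a mutation tool. In this sketch all names are hypothetical: `is_adult_mutant` is the kind of change a tool might generate (>= replaced with >), and the two "suites" are plain functions that return True when every check passes.

```python
def is_adult(age: int) -> bool:
    """Original production code."""
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    """The same code with a boundary mutation applied (>= changed to >)."""
    return age > 18


def weak_suite(fn) -> bool:
    """Only checks values far from the boundary."""
    return fn(30) is True and fn(5) is False


def strong_suite(fn) -> bool:
    """Adds the boundary value, which distinguishes >= from >."""
    return weak_suite(fn) and fn(18) is True


# The weak suite passes for both versions, so the mutant survives.
assert weak_suite(is_adult) and weak_suite(is_adult_mutant)
# The strong suite passes for the original but fails for the mutant: it is killed.
assert strong_suite(is_adult) and not strong_suite(is_adult_mutant)
```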
Test Organization
When a project grows to hundreds or thousands of tests, organization becomes critical for maintainability.
File Structure Conventions
# Convention 1: Mirror source structure (JavaScript/TypeScript)
src/
  services/
    userService.js
    orderService.js
  utils/
    validator.js
tests/
  services/
    userService.test.js
    orderService.test.js
  utils/
    validator.test.js
# Convention 2: Co-located tests (common in React, Go)
src/
  services/
    userService.js
    userService.test.js
    orderService.js
    orderService.test.js
# Convention 3: Python standard layout
src/
  my_package/
    services/
      user_service.py
      order_service.py
tests/
  test_user_service.py
  test_order_service.py
  conftest.py  # Shared fixtures
Test Fixtures and Setup
# Python: conftest.py for shared fixtures
import pytest

@pytest.fixture
def sample_user():
    """Reusable test fixture for user data."""
    return {
        "id": 1,
        "name": "Alice",
        "email": "alice@example.com",
        "role": "admin"
    }

@pytest.fixture
def sample_order(sample_user):
    """Fixture that depends on another fixture."""
    return {
        "id": 101,
        "user_id": sample_user["id"],
        "items": [
            {"product": "Widget", "price": 25.00, "qty": 2},
            {"product": "Gadget", "price": 49.99, "qty": 1}
        ],
        "total": 99.99
    }

# Any test file can use these fixtures by name:
def test_order_belongs_to_user(sample_order, sample_user):
    assert sample_order["user_id"] == sample_user["id"]
// JavaScript (Jest): setup and teardown
// UserService is assumed to be imported from the module under test
describe('UserService', () => {
let service;
let mockRepo;
// Runs before each test in this describe block
beforeEach(() => {
mockRepo = {
findById: jest.fn(),
save: jest.fn(),
delete: jest.fn()
};
service = new UserService(mockRepo);
});
// Runs after each test (cleanup)
afterEach(() => {
jest.clearAllMocks();
});
test('getUser returns user from repository', () => {
mockRepo.findById.mockReturnValue({ id: 1, name: 'Alice' });
const user = service.getUser(1);
expect(user.name).toBe('Alice');
});
test('deleteUser calls repository delete', () => {
service.deleteUser(1);
expect(mockRepo.delete).toHaveBeenCalledWith(1);
});
});
Common Unit Testing Anti-Patterns
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Testing Implementation | Tests break on every refactor because they verify how code works internally | Test behaviour (inputs → outputs), not method call sequences |
| Shared Mutable State | Tests affect each other through global variables or singletons | Create fresh state in each test; avoid module-level variables |
| Test Interdependence | Test B depends on Test A running first (ordering issues) | Each test must stand alone; use setup/teardown properly |
| Slow Tests | Unit tests hitting network/DB/filesystem take seconds each | Replace external calls with test doubles; move slow tests to integration suite |
| No Assertions | Test runs code but never verifies results, so it passes as long as nothing throws | Every test must have at least one meaningful assertion |
| Testing Trivial Code | Testing getters/setters/constructors wastes time | Test behaviour and logic, not data containers |
| The Giant Test | One test validates 10 different things — unclear what failed | One logical assertion per test; split into focused tests |
| Copy-Paste Tests | Duplicated setup across dozens of tests | Use fixtures, builders, or parameterised tests |
Parameterised Tests — Eliminating Duplication
# Python: parameterised tests with pytest
import pytest

def is_palindrome(s):
    cleaned = s.lower().replace(" ", "")
    return cleaned == cleaned[::-1]

@pytest.mark.parametrize("input_str,expected", [
    ("racecar", True),
    ("hello", False),
    ("A man a plan a canal Panama", True),
    ("", True),
    ("ab", False),
    ("aba", True),
])
def test_is_palindrome(input_str, expected):
    assert is_palindrome(input_str) == expected
// JavaScript (Jest) — Parameterized tests with test.each
function isEven(n) {
return n % 2 === 0;
}
test.each([
[2, true],
[3, false],
[0, true],
[-4, true],
[7, false],
])('isEven(%i) returns %s', (input, expected) => {
expect(isEven(input)).toBe(expected);
});
Exercises
Write Unit Tests for a Shopping Cart
Implement a ShoppingCart class with methods: addItem(name, price, quantity), removeItem(name), getTotal(), and applyDiscount(percentage). Write at least 8 unit tests covering: adding items, removing items, calculating totals, applying discounts, edge cases (empty cart, negative quantities, discount > 100%).
Practice TDD — Build a String Calculator
Using strict TDD (Red-Green-Refactor), build a stringCalculator(input) function: (1) Empty string returns 0, (2) Single number returns that number, (3) Two comma-separated numbers returns their sum, (4) Handles newlines as delimiters, (5) Throws on negative numbers. Write one test at a time, make it pass, then add the next.
Refactor Legacy Code with Tests
Take any function with complex logic (a tax calculator, a URL parser, or a date formatter). First, write characterisation tests that document the current behaviour. Then refactor the implementation while keeping all tests green. Document which tests you added before vs after refactoring.
Find Coverage Gaps with Mutation Testing
Take a project with >90% line coverage and run Stryker (JS) or mutmut (Python). Identify at least 3 surviving mutants. For each, explain: (1) What mutation was introduced, (2) Why existing tests didn't catch it, (3) What test you would add to kill it. Write the missing tests.
Conclusion & Next Steps
Unit testing is the foundation — fast, isolated, deterministic tests that give you instant feedback on every code change. Test doubles let you isolate units from their dependencies. TDD provides a disciplined workflow that improves design and catches bugs early. Coverage and mutation testing help you identify weak spots in your test suite.
But unit tests alone are insufficient. They verify that components work in isolation — they say nothing about whether components work together. In the next article, we move up the testing pyramid to integration tests, consumer-driven contracts, and API testing.
Next in the Series
In Part 20: Integration, Contract & API Testing, we'll explore testing real dependencies with Testcontainers, consumer-driven contracts with Pact, API testing strategies, and service virtualisation.