Part 19: Unit Testing & Test-Driven Development

Introduction — The Base of the Pyramid

Unit tests form the widest, fastest, and cheapest layer of the testing pyramid. They verify individual units of code — functions, methods, classes — in complete isolation from external dependencies. A well-tested codebase might have thousands of unit tests that execute in under ten seconds.

But what makes a test a "unit" test? The industry has debated this for decades. Some define "unit" as a single function; others define it as a single class or module. The practical definition comes down to five properties:

Properties of Good Unit Tests

Property	What It Means	Why It Matters
Fast	Executes in milliseconds, not seconds	Developers run them constantly during development
Isolated	No filesystem, network, database, or external service	Can run anywhere, in any order, in parallel
Deterministic	Same input always produces same result	No flaky failures, no time-dependent behaviour
Self-Validating	Pass or fail without human interpretation	Automation requires binary outcomes
Timely	Written close to (or before) the production code	Catches bugs when they are cheapest to fix
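The Isolated and Deterministic rows can be illustrated with a small sketch. It assumes a hypothetical `is_expired` helper (not part of this article's other examples) that receives the current time as an argument instead of reading the system clock, so the test gets the same result on every run.

```python
from datetime import datetime, timedelta, timezone


def is_expired(issued_at, ttl, now):
    """Return True once `ttl` has elapsed since `issued_at`.

    `now` is passed in rather than read from the system clock,
    so the result depends only on the arguments.
    """
    return now - issued_at >= ttl


def test_token_expires_exactly_at_ttl():
    # Arrange: fixed timestamps, no dependency on the real clock
    issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    ttl = timedelta(minutes=30)

    # Act / Assert: one minute before the limit, then exactly at it
    assert is_expired(issued, ttl, now=issued + timedelta(minutes=29)) is False
    assert is_expired(issued, ttl, now=issued + timedelta(minutes=30)) is True
```

Passing the clock in as a parameter is one design choice among several; freezing time with a test double would serve the same purpose.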

                            
                            Key Insight: A test that hits the database is not a unit test — it's an integration test. A test that takes 2 seconds to run is not a unit test — it's too slow for the tight feedback loop developers need. The distinction matters because unit tests give you speed, while integration tests give you confidence in wiring.
                        

Anatomy of a Unit Test

Every unit test follows the same three-phase structure, regardless of language or framework:

Arrange-Act-Assert (AAA)

Arrange — Set up the test data, create objects, configure mocks
Act — Call the function or method under test
Assert — Verify the result matches expectations

Some teams prefer the BDD-style equivalent: Given-When-Then. The semantics are identical.

Naming Conventions

Good test names communicate what is being tested, under what conditions, and what the expected outcome is. Common patterns:

methodName_condition_expectedResult — e.g., calculateDiscount_orderOver100_returns10Percent
should [expected behaviour] when [condition] — e.g., should return 10% discount when order exceeds 100
test_[scenario] (Python convention) — e.g., test_discount_applied_for_large_orders

JavaScript (Jest) — Complete Examples

// calculator.js — The module under test
function add(a, b) {
    if (typeof a !== 'number' || typeof b !== 'number') {
        throw new TypeError('Arguments must be numbers');
    }
    return a + b;
}

function divide(a, b) {
    if (b === 0) throw new Error('Division by zero');
    return a / b;
}

module.exports = { add, divide };

// calculator.test.js — Unit tests using Jest
const { add, divide } = require('./calculator');

describe('add', () => {
    // Arrange-Act-Assert pattern
    test('adds two positive numbers', () => {
        // Arrange
        const a = 2;
        const b = 3;

        // Act
        const result = add(a, b);

        // Assert
        expect(result).toBe(5);
    });

    test('adds negative numbers correctly', () => {
        expect(add(-1, -2)).toBe(-3);
    });

    test('throws TypeError for non-numeric input', () => {
        expect(() => add('2', 3)).toThrow(TypeError);
        expect(() => add(2, null)).toThrow(TypeError);
    });
});

describe('divide', () => {
    test('divides two numbers', () => {
        expect(divide(10, 2)).toBe(5);
    });

    test('returns float for non-even division', () => {
        expect(divide(7, 2)).toBeCloseTo(3.5);
    });

    test('throws error when dividing by zero', () => {
        expect(() => divide(10, 0)).toThrow('Division by zero');
    });
});

Python (pytest) — Complete Examples

# calculator.py — The module under test
def add(a: float, b: float) -> float:
    if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
        raise TypeError("Arguments must be numbers")
    return a + b

def divide(a: float, b: float) -> float:
    if b == 0:
        raise ZeroDivisionError("Division by zero")
    return a / b

# test_calculator.py — Unit tests using pytest
import pytest

from calculator import add, divide

# Test: add two positive numbers
def test_add_positive_numbers():
    # Arrange
    a, b = 2, 3

    # Act
    result = add(a, b)

    # Assert
    assert result == 5

# Test: add negative numbers
def test_add_negative_numbers():
    assert add(-1, -2) == -3

# Test: add raises TypeError for non-numeric input
def test_add_raises_type_error():
    with pytest.raises(TypeError):
        add("2", 3)

# Test: divide two numbers
def test_divide_numbers():
    assert divide(10, 2) == 5.0

# Test: divide returns float
def test_divide_returns_float():
    assert divide(7, 2) == pytest.approx(3.5)

# Test: divide by zero raises error
def test_divide_by_zero():
    with pytest.raises(ZeroDivisionError):
        divide(10, 0)

                            
                            Convention Difference: Jest uses describe/test blocks with nested structure. Pytest uses flat functions prefixed with test_. Both achieve the same isolation and clarity — choose based on your ecosystem.
                        

Test Doubles

In unit testing, you need to isolate the code under test from its dependencies. Test doubles replace real collaborators with controlled substitutes. Gerard Meszaros' taxonomy (from xUnit Test Patterns) defines five types:

Test Double Taxonomy

flowchart LR
    TD[Test Double] --> DU[Dummy]
    TD --> ST[Stub]
    TD --> SP[Spy]
    TD --> MO[Mock]
    TD --> FA[Fake]
    DU -.- DU_DESC["Fills parameter lists
Never actually used"]
    ST -.- ST_DESC["Returns canned data
No verification"]
    SP -.- SP_DESC["Records calls
Verified after act"]
    MO -.- MO_DESC["Pre-programmed expectations
Verifies interactions"]
    FA -.- FA_DESC["Working implementation
Simplified version"]

Dummy

A dummy fills a required parameter but is never actually used. It exists only to satisfy a function signature.

// JavaScript — Dummy example
// The logger is required by createUser's signature but never called in this test
const dummyLogger = {};

function createUser(name, logger) {
    return { name, createdAt: Date.now() };
}

test('createUser returns user with name', () => {
    const user = createUser('Alice', dummyLogger);
    expect(user.name).toBe('Alice');
});

Stub

A stub provides canned responses to calls made during a test. It does not verify how it was called — only the test's assertions matter.

# Python — Stub example
from unittest.mock import MagicMock

def get_user_greeting(user_service, user_id):
    """Function under test — depends on user_service."""
    user = user_service.find_by_id(user_id)
    return f"Hello, {user['name']}!"

def test_greeting_uses_user_name():
    # Arrange: stub the user service
    stub_service = MagicMock()
    stub_service.find_by_id.return_value = {"name": "Alice", "email": "alice@example.com"}

    # Act
    result = get_user_greeting(stub_service, 42)

    # Assert: only verify the output, not how the stub was called
    assert result == "Hello, Alice!"

Spy

A spy records information about how it was called. After the act phase, you verify calls were made correctly.

// JavaScript (Jest) — Spy example
function processOrder(order, notificationService) {
    // Business logic...
    const total = order.items.reduce((sum, item) => sum + item.price, 0);
    notificationService.sendConfirmation(order.email, total);
    return { status: 'confirmed', total };
}

test('processOrder sends confirmation email', () => {
    // Arrange: spy on the notification service
    const spyService = { sendConfirmation: jest.fn() };
    const order = {
        email: 'bob@example.com',
        items: [{ price: 10 }, { price: 20 }]
    };

    // Act
    const result = processOrder(order, spyService);

    // Assert: verify the spy was called correctly
    expect(spyService.sendConfirmation).toHaveBeenCalledWith('bob@example.com', 30);
    expect(spyService.sendConfirmation).toHaveBeenCalledTimes(1);
    expect(result.total).toBe(30);
});
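For comparison, a Python spy can be sketched with `unittest.mock.MagicMock(wraps=...)`, which records calls while still delegating to a real object. The `EmailNotifier` class and `process_order` function below are hypothetical stand-ins that mirror the Jest example; they are assumptions for illustration only.

```python
from unittest.mock import MagicMock


class EmailNotifier:
    """Hypothetical collaborator; keeps sent messages in memory."""

    def __init__(self):
        self.sent = []

    def send_confirmation(self, email, total):
        self.sent.append((email, total))


def process_order(order, notifier):
    total = sum(item["price"] for item in order["items"])
    notifier.send_confirmation(order["email"], total)
    return {"status": "confirmed", "total": total}


def test_process_order_sends_confirmation():
    # Arrange: wrap the real notifier so calls are recorded AND delegated
    real_notifier = EmailNotifier()
    spy = MagicMock(wraps=real_notifier)
    order = {"email": "bob@example.com", "items": [{"price": 10}, {"price": 20}]}

    # Act
    result = process_order(order, spy)

    # Assert: inspect the recorded call after the act phase
    spy.send_confirmation.assert_called_once_with("bob@example.com", 30)
    assert real_notifier.sent == [("bob@example.com", 30)]
    assert result["total"] == 30
```

Because the spy wraps a real object, the original behaviour still runs; a plain `MagicMock()` would record the call without delegating.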

Mock

A mock is pre-programmed with expectations. It verifies that specific interactions occur — often including call order, argument values, and call count. Mocks fail the test if expectations aren't met.

# Python — Mock with assertion on call
from unittest.mock import MagicMock

def transfer_funds(from_account, to_account, amount, audit_log):
    """Transfer money and log the transaction."""
    from_account.debit(amount)
    to_account.credit(amount)
    audit_log.record(f"Transferred {amount} from {from_account.id} to {to_account.id}")

def test_transfer_funds_logs_transaction():
    # Arrange: create mocks
    from_acc = MagicMock(id="ACC-001")
    to_acc = MagicMock(id="ACC-002")
    mock_audit = MagicMock()

    # Act
    transfer_funds(from_acc, to_acc, 500, mock_audit)

    # Assert: verify the interactions (unittest.mock checks expectations after the act)
    from_acc.debit.assert_called_once_with(500)
    to_acc.credit.assert_called_once_with(500)
    mock_audit.record.assert_called_once_with("Transferred 500 from ACC-001 to ACC-002")
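The description above mentions call order, which per-mock assertions do not check. One way to verify it with `unittest.mock`, sketched here under the assumption that all collaborators can be children of a single parent mock, is to compare the parent's `mock_calls` list against the expected sequence. The name `manager` is illustrative.

```python
from unittest.mock import MagicMock, call


def transfer_funds(from_account, to_account, amount, audit_log):
    """Transfer money and log the transaction."""
    from_account.debit(amount)
    to_account.credit(amount)
    audit_log.record(f"Transferred {amount} from {from_account.id} to {to_account.id}")


def test_transfer_debits_before_crediting():
    # Arrange: child mocks of one parent share a single ordered call log
    manager = MagicMock()
    manager.from_acc.id = "ACC-001"
    manager.to_acc.id = "ACC-002"

    # Act
    transfer_funds(manager.from_acc, manager.to_acc, 500, manager.audit)

    # Assert: the exact sequence of interactions, in order
    assert manager.mock_calls == [
        call.from_acc.debit(500),
        call.to_acc.credit(500),
        call.audit.record("Transferred 500 from ACC-001 to ACC-002"),
    ]
```

Order-sensitive assertions are stricter than the per-mock checks and couple the test more tightly to the implementation, so they are best reserved for cases where the order is part of the contract.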

Fake

A fake has a working implementation but takes shortcuts. Common examples: in-memory databases, fake file systems, local email servers.

// JavaScript — Fake repository (in-memory implementation)
class FakeUserRepository {
    constructor() {
        this.users = new Map();
        this.nextId = 1;
    }

    save(user) {
        const id = this.nextId++;
        const saved = { ...user, id };
        this.users.set(id, saved);
        return saved;
    }

    findById(id) {
        return this.users.get(id) || null;
    }

    findAll() {
        return Array.from(this.users.values());
    }
}

test('fake repository saves and retrieves users', () => {
    const repo = new FakeUserRepository();

    const saved = repo.save({ name: 'Alice', email: 'alice@test.com' });
    const found = repo.findById(saved.id);

    expect(found.name).toBe('Alice');
    expect(found.id).toBe(1);
});

Research

Meszaros' xUnit Test Patterns (2007)

Gerard Meszaros' seminal work introduced the unified taxonomy of test doubles that the industry still uses today. Before this book, terms like "mock" and "stub" were used interchangeably. His classification — Dummy, Stub, Spy, Mock, Fake — provides precise language for discussing test isolation strategies. Martin Fowler's article "Mocks Aren't Stubs" (2004) popularised the distinction between state verification (stubs) and behaviour verification (mocks).

Test Patterns Meszaros Fowler

Mocking Best Practices

Mocking is powerful but dangerous. Overuse creates brittle tests that break on every refactor. Here are the guiding principles:

Rule 1: Mock at Boundaries

Mock external dependencies — HTTP clients, databases, file systems, third-party APIs. Do not mock internal collaborators within the same module. If you find yourself mocking everything, your code probably has too many dependencies.

// GOOD: Mock the HTTP client (external boundary)
const axios = require('axios');
jest.mock('axios');

async function fetchUserProfile(userId) {
    const response = await axios.get(`/api/users/${userId}`);
    return { name: response.data.name, email: response.data.email };
}

test('fetchUserProfile extracts name and email', async () => {
    axios.get.mockResolvedValue({
        data: { name: 'Alice', email: 'alice@test.com', id: 1, role: 'admin' }
    });

    const profile = await fetchUserProfile(1);

    expect(profile).toEqual({ name: 'Alice', email: 'alice@test.com' });
});

Rule 2: Don't Over-Mock

// BAD: Mocking internal implementation details
function calculateTotal(items) {
    return items.reduce((sum, item) => sum + item.price * item.quantity, 0);
}

// Mocking Array.prototype.reduce (or the items) here would be over-mocking.
// Instead, just test with real data:
test('calculateTotal sums price * quantity', () => {
    const items = [
        { price: 10, quantity: 2 },
        { price: 5, quantity: 3 }
    ];
    expect(calculateTotal(items)).toBe(35);
});

Rule 3: Don't Mock What You Don't Own

If you mock a third-party library's internals, your test is coupled to that library's implementation. Instead, create a thin wrapper (adapter) around the library and mock your own wrapper.

# Python — Adapter pattern for testability
import requests

# Your adapter (you own this — safe to mock)
class WeatherClient:
    def __init__(self, base_url="https://api.weather.com"):
        self.base_url = base_url

    def get_temperature(self, city):
        response = requests.get(
            f"{self.base_url}/current", params={"city": city}, timeout=5
        )
        response.raise_for_status()
        return response.json()["temperature"]

# Function under test
def should_wear_jacket(weather_client, city):
    temp = weather_client.get_temperature(city)
    return temp < 15

# Test — mock YOUR adapter, not requests library
from unittest.mock import MagicMock

def test_jacket_recommended_when_cold():
    mock_client = MagicMock()
    mock_client.get_temperature.return_value = 8

    assert should_wear_jacket(mock_client, "London") is True

def test_no_jacket_when_warm():
    mock_client = MagicMock()
    mock_client.get_temperature.return_value = 22

    assert should_wear_jacket(mock_client, "London") is False

Test-Driven Development (TDD)

TDD inverts the natural order: you write the test before the production code. Kent Beck formalised this approach in Test-Driven Development: By Example (2002), and it remains one of the most debated practices in software engineering.

The Red-Green-Refactor Cycle

TDD Cycle — Red-Green-Refactor

flowchart LR
    R["🔴 RED
Write a failing test"] --> G["🟢 GREEN
Write minimum code
to pass the test"]
    G --> RF["🔵 REFACTOR
Clean up code
while tests stay green"]
    RF --> R

Red — Write a test for the next piece of functionality. Run it. Watch it fail (red). This proves the test is actually testing something.
Green — Write the simplest code that makes the test pass. Don't over-engineer. Don't optimise. Just make it green.
Refactor — Now that you have a green test as a safety net, clean up the code. Remove duplication, improve names, extract methods. Run tests again to confirm they still pass.

The Three Laws of TDD (Robert C. Martin)

You may not write production code until you have written a failing unit test.
You may not write more of a unit test than is sufficient to fail (and not compiling counts as failing).
You may not write more production code than is sufficient to pass the currently failing test.

Step-by-Step TDD Example: FizzBuzz

Let's build FizzBuzz using strict TDD. We add one test at a time, write minimum code, then refactor.

# Step 1: RED — First test: return "1" for input 1
# test_fizzbuzz.py

def fizzbuzz(n):
    pass  # Not implemented yet

def test_returns_1_for_1():
    assert fizzbuzz(1) == "1"
    # FAILS: fizzbuzz returns None

# Step 2: GREEN — Simplest code to pass
def fizzbuzz(n):
    return str(n)

def test_returns_1_for_1():
    assert fizzbuzz(1) == "1"  # PASSES

def test_returns_2_for_2():
    assert fizzbuzz(2) == "2"  # PASSES (freebie)

# Step 3: RED — Add Fizz for multiples of 3
def fizzbuzz(n):
    return str(n)

def test_returns_fizz_for_3():
    assert fizzbuzz(3) == "Fizz"
    # FAILS: returns "3" instead of "Fizz"

# Step 4: GREEN — Handle multiples of 3
def fizzbuzz(n):
    if n % 3 == 0:
        return "Fizz"
    return str(n)

def test_returns_1_for_1():
    assert fizzbuzz(1) == "1"  # PASSES

def test_returns_fizz_for_3():
    assert fizzbuzz(3) == "Fizz"  # PASSES

def test_returns_fizz_for_6():
    assert fizzbuzz(6) == "Fizz"  # PASSES

# Step 5: RED → GREEN — Handle Buzz (multiples of 5)
def fizzbuzz(n):
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

def test_returns_buzz_for_5():
    assert fizzbuzz(5) == "Buzz"  # PASSES

def test_returns_buzz_for_10():
    assert fizzbuzz(10) == "Buzz"  # PASSES

# Step 6: RED → GREEN — Handle FizzBuzz (multiples of both)
def fizzbuzz(n):
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

def test_returns_fizzbuzz_for_15():
    assert fizzbuzz(15) == "FizzBuzz"  # PASSES

def test_returns_fizzbuzz_for_30():
    assert fizzbuzz(30) == "FizzBuzz"  # PASSES

# All tests pass — REFACTOR if needed (this is already clean)

                            
                            Key Insight: Notice how TDD forced us to handle n % 15 before the individual checks. Without TDD, many developers write the % 3 and % 5 checks first and forget that 15 satisfies both — leading to a bug. TDD's incremental approach naturally surfaces edge cases.
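The walkthrough ends without an actual refactor, so here is one possible refactor step as a sketch. It swaps the if-chain for a table-driven variant (an alternative chosen for illustration, not the approach the steps above arrive at) and reruns the same expectations to confirm behaviour is unchanged.

```python
def fizzbuzz(n):
    """Table-driven variant: join the word for every divisor that matches."""
    rules = ((3, "Fizz"), (5, "Buzz"))
    words = "".join(word for divisor, word in rules if n % divisor == 0)
    # An empty string means no rule matched, so fall back to the number
    return words or str(n)


def test_behaviour_unchanged_after_refactor():
    # Same inputs and outputs as the tests written during the TDD steps
    expected = {
        1: "1", 2: "2",
        3: "Fizz", 6: "Fizz",
        5: "Buzz", 10: "Buzz",
        15: "FizzBuzz", 30: "FizzBuzz",
    }
    for n, result in expected.items():
        assert fizzbuzz(n) == result
```

The existing tests are what make this change safe: if the rewritten function dropped the combined case, the assertions for 15 and 30 would fail immediately.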
                        

TDD Benefits & Criticisms

Benefits

Design Pressure: TDD forces you to think about interfaces before implementation. Hard-to-test code signals design problems.
Regression Safety: Every feature has a test from birth. Refactoring is safe.
Living Documentation: Tests document what the code actually does, not what someone thinks it should do.
Smaller Increments: You build in tiny steps, catching errors immediately rather than debugging a 200-line function.
Confidence: The green test suite gives you psychological permission to refactor boldly.

Criticisms

Slow for Exploration: When you don't know what you're building (spike/prototype phase), writing tests first adds friction.
Over-Specification: Tests can become coupled to implementation details, making refactoring harder rather than easier.
The 100% Coverage Myth: TDD can push teams toward testing trivial code (getters, setters) to hit coverage targets.
Learning Curve: Writing testable code is a design skill that takes months to develop.
Not All Code Benefits Equally: CRUD operations, UI layouts, and data transformations often don't benefit from test-first.

                            
                            When TDD Works Best: Business logic, algorithms, state machines, parsers, validation rules — any code where correctness is critical and the interface is somewhat stable. When TDD works poorly: rapid prototypes, UI experiments, infrastructure glue code, scripts you'll run once.
                        

Code Coverage

Code coverage measures what percentage of your code is exercised by tests. It is a necessary but insufficient quality metric — high coverage doesn't guarantee quality, but low coverage guarantees blind spots.

Coverage Types

Type	What It Measures	Example
Statement	% of statements executed	Every line runs at least once
Branch	% of if/else branches taken	Both true and false paths tested
Function	% of functions called	Every exported function invoked
Line	% of executable lines hit	Similar to statement (multi-statement lines differ)
Condition	% of boolean sub-expressions evaluated both ways	`if (a && b)`: tests a=true/b=true, a=true/b=false, and a=false
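The gap between statement and branch coverage is easiest to see on an `if` with no `else`. In the sketch below (`shipping_cost` is a hypothetical function), the first test alone executes every statement, yet the path where the condition is false is never taken; only branch coverage would report that.

```python
def shipping_cost(is_member):
    cost = 10
    if is_member:
        cost = 0  # members ship free
    return cost


def test_members_ship_free():
    # On its own, this test executes every statement in shipping_cost,
    # but it only ever takes the True branch of the `if`.
    assert shipping_cost(is_member=True) == 0


def test_non_members_pay_shipping():
    # Takes the False branch; it adds no newly executed statements,
    # so statement coverage would not have flagged its absence.
    assert shipping_cost(is_member=False) == 10
```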

Coverage Tools

# JavaScript — Run Jest with coverage
npx jest --coverage

# Output:
# ----------------------|---------|----------|---------|---------|
# File                  | % Stmts | % Branch | % Funcs | % Lines |
# ----------------------|---------|----------|---------|---------|
# calculator.js         |     100 |      100 |     100 |     100 |
# userService.js        |      87 |       75 |     100 |      87 |
# ----------------------|---------|----------|---------|---------|

# Python — Run pytest with coverage
pip install pytest-cov
pytest --cov=src --cov-report=term-missing

# Output:
# Name                  Stmts   Miss  Cover   Missing
# ---------------------------------------------------
# src/calculator.py        12      0   100%
# src/user_service.py      28      4    86%   34-37
# ---------------------------------------------------
# TOTAL                    40      4    90%

Coverage as a Metric vs Coverage as a Goal

                            
                            The 80% Guideline: Many teams treat 80% line coverage as a minimum threshold rather than a goal to chase. The remaining 20% often covers error-handling paths, edge cases in third-party integrations, and code that would require extensive mocking for marginal benefit. The aim is not "100% coverage" but "meaningful tests that catch real bugs."
                        

Coverage is useful for identifying untested areas, not for proving quality. A function with 100% coverage can still have logic errors if the assertions are weak. Conversely, well-written tests at 80% coverage often catch more bugs than poorly-written tests at 100%.

Property-Based Testing

Traditional unit tests use specific examples: "add(2, 3) should return 5". Property-based testing generates many random inputs (100 per property by default in both Hypothesis and fast-check) and verifies that properties (invariants) always hold.

# Python — Property-based testing with Hypothesis
from hypothesis import given
from hypothesis.strategies import integers

def add(a, b):
    return a + b

# Property: addition is commutative
@given(integers(), integers())
def test_addition_is_commutative(a, b):
    assert add(a, b) == add(b, a)

# Property: adding zero is identity
@given(integers())
def test_adding_zero_is_identity(a):
    assert add(a, 0) == a

# Property: addition is associative
@given(integers(), integers(), integers())
def test_addition_is_associative(a, b, c):
    assert add(add(a, b), c) == add(a, add(b, c))

// JavaScript — Property-based testing with fast-check
const fc = require('fast-check');

function sort(arr) {
    return [...arr].sort((a, b) => a - b);
}

// Property: sorted output has same length as input
test('sort preserves array length', () => {
    fc.assert(
        fc.property(fc.array(fc.integer()), (arr) => {
            return sort(arr).length === arr.length;
        })
    );
});

// Property: sorted output is ordered
test('sort produces ordered output', () => {
    fc.assert(
        fc.property(fc.array(fc.integer()), (arr) => {
            const sorted = sort(arr);
            for (let i = 1; i < sorted.length; i++) {
                if (sorted[i] < sorted[i - 1]) return false;
            }
            return true;
        })
    );
});

// Property: sorted output contains same elements
test('sort preserves elements', () => {
    fc.assert(
        fc.property(fc.array(fc.integer()), (arr) => {
            const sorted = sort(arr);
            // Compare as multisets: put both arrays into the same canonical order first
            return JSON.stringify([...arr].sort()) === JSON.stringify([...sorted].sort());
        })
    );
});

When to use property-based testing: parsers (parse(serialize(x)) === x), mathematical functions (commutativity, associativity), data structures (invariants), serialization/deserialization roundtrips.
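The roundtrip property can be sketched without any library. The example below is a hand-rolled stand-in for what Hypothesis automates; it uses only the standard library, and the run-length codec (`rle_encode`, `rle_decode`) is a hypothetical subject chosen for illustration. A fixed seed keeps the run deterministic.

```python
import random
from itertools import groupby

RUNS = 200  # number of generated inputs (arbitrary choice)
SEED = 0    # fixed seed so the test is repeatable


def rle_encode(text):
    """Run-length encode: 'aab' -> [('a', 2), ('b', 1)]."""
    return [(char, len(list(run))) for char, run in groupby(text)]


def rle_decode(pairs):
    """Inverse of rle_encode."""
    return "".join(char * count for char, count in pairs)


def test_decode_inverts_encode():
    rng = random.Random(SEED)
    for _ in range(RUNS):
        # A two-letter alphabet makes repeated runs likely
        length = rng.randint(0, 20)
        text = "".join(rng.choice("ab") for _ in range(length))
        assert rle_decode(rle_encode(text)) == text, f"roundtrip failed for {text!r}"
```

A real property-based library adds what this sketch lacks, most notably shrinking a failing input down to a minimal counterexample.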

Mutation Testing

Mutation testing answers the question: "If I introduce a bug, will my tests catch it?" The tool creates small modifications (mutants) to your production code — changing + to -, > to >=, true to false — and runs your test suite. If tests still pass, the mutant survived, revealing a weakness in your tests.

Tool	Language	Mutation Types
Stryker	JavaScript/TypeScript	Arithmetic, conditional, string, array, block removal
PIT (Pitest)	Java/Kotlin	Conditional boundary, negation, void method, return values
mutmut	Python	Operator replacement, keyword mutation, number changes
infection	PHP	50+ mutation operators

# Run Stryker on a JavaScript project
npx stryker run

# Output:
# Mutation score: 82%
# Killed: 45  Survived: 10  Timeout: 2  No coverage: 3
# 
# Survived mutants:
#   src/discount.js:12 — Changed >= to > (boundary mutation)
#   src/validator.js:8 — Removed return statement

                            
                            Key Insight: A 100% code coverage score with a 60% mutation score means your tests execute all lines but don't actually verify the results. Mutation testing reveals assertion quality, not just execution coverage.
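To see why a boundary mutant survives, the sketch below writes one by hand instead of running a mutation tool. All names are hypothetical, and the two "suites" are plain functions that return True when every check passes.

```python
def qualifies_for_discount(total):
    return total >= 100


def mutant_qualifies_for_discount(total):
    # Hand-written mutant: `>=` changed to `>` (a boundary mutation)
    return total > 100


def weak_suite(fn):
    """Checks values far from the boundary only."""
    return fn(150) is True and fn(50) is False


def strong_suite(fn):
    """Adds the boundary value itself."""
    return weak_suite(fn) and fn(100) is True


# The weak suite passes for both versions, so the mutant survives.
assert weak_suite(qualifies_for_discount)
assert weak_suite(mutant_qualifies_for_discount)

# The boundary check passes for the original and fails for the mutant,
# which is what "killing" a mutant means.
assert strong_suite(qualifies_for_discount)
assert not strong_suite(mutant_qualifies_for_discount)
```

Both suites execute every line of the function, so line coverage is identical; only the assertion at the boundary distinguishes them.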
                        

Test Organization

When a project grows to hundreds or thousands of tests, organization becomes critical for maintainability.

File Structure Conventions

# Convention 1: Mirror source structure (JavaScript/TypeScript)
src/
  services/
    userService.js
    orderService.js
  utils/
    validator.js
tests/
  services/
    userService.test.js
    orderService.test.js
  utils/
    validator.test.js

# Convention 2: Co-located tests (common in React, Go)
src/
  services/
    userService.js
    userService.test.js
    orderService.js
    orderService.test.js

# Convention 3: Python standard layout
src/
  my_package/
    services/
      user_service.py
      order_service.py
tests/
  test_user_service.py
  test_order_service.py
  conftest.py            # Shared fixtures

Test Fixtures and Setup

# Python — conftest.py for shared fixtures
import pytest

@pytest.fixture
def sample_user():
    """Reusable test fixture for user data."""
    return {
        "id": 1,
        "name": "Alice",
        "email": "alice@example.com",
        "role": "admin"
    }

@pytest.fixture
def sample_order(sample_user):
    """Fixture that depends on another fixture."""
    return {
        "id": 101,
        "user_id": sample_user["id"],
        "items": [
            {"product": "Widget", "price": 25.00, "qty": 2},
            {"product": "Gadget", "price": 49.99, "qty": 1}
        ],
        "total": 99.99
    }

# Any test file can use these fixtures by name:
def test_order_belongs_to_user(sample_order, sample_user):
    assert sample_order["user_id"] == sample_user["id"]

// JavaScript (Jest) — Setup and teardown
describe('UserService', () => {
    let service;
    let mockRepo;

    // Runs before each test in this describe block
    beforeEach(() => {
        mockRepo = {
            findById: jest.fn(),
            save: jest.fn(),
            delete: jest.fn()
        };
        service = new UserService(mockRepo);
    });

    // Runs after each test (cleanup)
    afterEach(() => {
        jest.clearAllMocks();
    });

    test('getUser returns user from repository', () => {
        mockRepo.findById.mockReturnValue({ id: 1, name: 'Alice' });
        const user = service.getUser(1);
        expect(user.name).toBe('Alice');
    });

    test('deleteUser calls repository delete', () => {
        service.deleteUser(1);
        expect(mockRepo.delete).toHaveBeenCalledWith(1);
    });
});

Common Unit Testing Anti-Patterns

Anti-Pattern	Problem	Solution
Testing Implementation	Tests break on every refactor because they verify how code works internally	Test behaviour (inputs → outputs), not method call sequences
Shared Mutable State	Tests affect each other through global variables or singletons	Create fresh state in each test; avoid module-level variables
Test Interdependence	Test B depends on Test A running first (ordering issues)	Each test must stand alone; use setup/teardown properly
Slow Tests	Unit tests hitting network/DB/filesystem take seconds each	Replace external calls with test doubles; move slow tests to integration suite
No Assertions	Test runs code but never verifies results (it only proves the code doesn't throw)	Every test must have at least one meaningful assertion
Testing Trivial Code	Testing getters/setters/constructors wastes time	Test behaviour and logic, not data containers
The Giant Test	One test validates 10 different things — unclear what failed	One logical assertion per test; split into focused tests
Copy-Paste Tests	Duplicated setup across dozens of tests	Use fixtures, builders, or parameterised tests
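Of the remedies in the last row, builders are the only one not shown elsewhere in this article. A minimal sketch, assuming a hypothetical `build_order` helper and `order_total` function: the builder supplies sensible defaults, and each test overrides only the field it cares about.

```python
def build_order(**overrides):
    """Test data builder: returns a fresh order dict with sensible defaults."""
    order = {
        "id": 101,
        "user_id": 1,
        "items": [{"product": "Widget", "price": 25, "qty": 2}],
        "status": "pending",
    }
    order.update(overrides)
    return order


def order_total(order):
    """Function under test (hypothetical)."""
    return sum(item["price"] * item["qty"] for item in order["items"])


def test_total_multiplies_price_by_quantity():
    assert order_total(build_order()) == 50


def test_empty_order_totals_zero():
    # Only the field relevant to this test is spelled out
    assert order_total(build_order(items=[])) == 0
```

Because every call builds a new dict, tests cannot leak state to each other through the shared defaults, which also addresses the Shared Mutable State row.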

Parameterised Tests — Eliminating Duplication

# Python — Parameterized tests with pytest
import pytest

def is_palindrome(s):
    cleaned = s.lower().replace(" ", "")
    return cleaned == cleaned[::-1]

@pytest.mark.parametrize("input_str,expected", [
    ("racecar", True),
    ("hello", False),
    ("A man a plan a canal Panama", True),
    ("", True),
    ("ab", False),
    ("aba", True),
])
def test_is_palindrome(input_str, expected):
    assert is_palindrome(input_str) == expected

// JavaScript (Jest) — Parameterized tests with test.each
function isEven(n) {
    return n % 2 === 0;
}

test.each([
    [2, true],
    [3, false],
    [0, true],
    [-4, true],
    [7, false],
])('isEven(%i) returns %s', (input, expected) => {
    expect(isEven(input)).toBe(expected);
});

Exercises

Exercise 1

Write Unit Tests for a Shopping Cart

Implement a ShoppingCart class with methods: addItem(name, price, quantity), removeItem(name), getTotal(), and applyDiscount(percentage). Write at least 8 unit tests covering: adding items, removing items, calculating totals, applying discounts, edge cases (empty cart, negative quantities, discount > 100%).

AAA Pattern Edge Cases Assertions

Exercise 2

Practice TDD — Build a String Calculator

Using strict TDD (Red-Green-Refactor), build a stringCalculator(input) function: (1) Empty string returns 0, (2) Single number returns that number, (3) Two comma-separated numbers returns their sum, (4) Handles newlines as delimiters, (5) Throws on negative numbers. Write one test at a time, make it pass, then add the next.

TDD Incremental Red-Green-Refactor

Exercise 3

Refactor Legacy Code with Tests

Take any function with complex logic (a tax calculator, a URL parser, or a date formatter). First, write characterisation tests that document the current behaviour. Then refactor the implementation while keeping all tests green. Document which tests you added before vs after refactoring.

Characterisation Tests Refactoring Safety Net

Exercise 4

Find Coverage Gaps with Mutation Testing

Take a project with >90% line coverage and run Stryker (JS) or mutmut (Python). Identify at least 3 surviving mutants. For each, explain: (1) What mutation was introduced, (2) Why existing tests didn't catch it, (3) What test you would add to kill it. Write the missing tests.

Mutation Testing Coverage Quality Stryker

Conclusion & Next Steps

Unit testing is the foundation — fast, isolated, deterministic tests that give you instant feedback on every code change. Test doubles let you isolate units from their dependencies. TDD provides a disciplined workflow that improves design and catches bugs early. Coverage and mutation testing help you identify weak spots in your test suite.

But unit tests alone are insufficient. They verify that components work in isolation — they say nothing about whether components work together. In the next article, we move up the testing pyramid to integration tests, consumer-driven contracts, and API testing.

Next in the Series

In Part 20: Integration, Contract & API Testing, we'll explore testing real dependencies with Testcontainers, consumer-driven contracts with Pact, API testing strategies, and service virtualisation.
