Introduction: Why Transactions Matter
Transactions are the foundation of reliable database operations. They ensure that your data remains consistent even when multiple users access it simultaneously, or when failures occur mid-operation.
Series Context: This is Part 5 of 15 in the Complete Database Mastery series. We're diving into the critical concepts of data integrity and concurrency.
- Part 1: SQL Fundamentals & Syntax (database basics, CRUD operations, joins, constraints)
- Part 2: Advanced SQL & Query Mastery (CTEs, window functions, stored procedures)
- Part 3: PostgreSQL Deep Dive (advanced types, indexing, extensions, tuning)
- Part 4: MySQL & MariaDB (storage engines, replication, optimization)
- Part 5: Transactions & Concurrency (ACID, isolation levels, locking, MVCC) ← You Are Here
- Part 6: Query Optimization & Indexing (EXPLAIN plans, index design, performance)
- Part 7: Data Modeling & Normalization (ERDs, normal forms, schema design)
- Part 8: MongoDB & Document Databases (NoSQL, aggregation, sharding)
- Part 9: Redis & Caching Strategies (data structures, caching patterns, pub/sub)
- Part 10: Database Administration & Migrations (backup, versioning, maintenance)
- Part 11: Scaling & Distributed Systems (replication, sharding, CAP theorem)
- Part 12: Cloud Databases & Managed Services (AWS, Azure, GCP database offerings)
- Part 13: Database Security & Governance (encryption, access control, compliance)
- Part 14: Data Warehousing & Analytics (OLAP, star schemas, columnar DBs)
- Part 15: Capstone Projects (portfolio-ready database implementations)
Think of a transaction like an ATM withdrawal: either the money comes out AND your balance is updated, OR neither happens. Partial completion is unacceptable.
Real-World Transaction Example
Consider a bank transfer of $100 from Account A to Account B:
1. Read balance of Account A (has $500)
2. Subtract $100 from Account A ($400)
3. Read balance of Account B (has $200)
4. Add $100 to Account B ($300)
What if the system crashes after step 2? Without transactions, Account A loses $100 but Account B never receives it!
Key Insight: Transactions group multiple operations into a single "all-or-nothing" unit. Either all operations succeed, or they all roll back.
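The all-or-nothing contract is easy to exercise from application code. Below is a minimal sketch using Python's built-in sqlite3 module and a hypothetical accounts table; the schema and amounts are illustrative, not from any particular application. Crediting before debiting is deliberate: it lets the failing case demonstrate an already-applied statement being rolled back.

```python
import sqlite3

# In-memory database with two hypothetical accounts
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 500), (2, 200)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` between accounts as one all-or-nothing unit."""
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            # Credit first, then debit: if the debit violates the CHECK
            # constraint, the rollback must also undo the credit
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:  # CHECK (balance >= 0) violated
        return False

print(transfer(conn, 1, 2, 100))   # True: both updates applied
print(transfer(conn, 1, 2, 9999))  # False: debit fails, credit is rolled back too
print(conn.execute("SELECT balance FROM accounts ORDER BY id").fetchall())
# [(400,), (300,)]
```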
ACID Principles
ACID is the set of properties that guarantee reliable transaction processing. Every major relational database implements these guarantees.
Atomicity: All or Nothing
A transaction is atomic—indivisible. If any part fails, the entire transaction rolls back as if nothing happened.
-- Atomicity example
BEGIN TRANSACTION;
-- Step 1: Deduct from sender
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
-- Step 2: Add to receiver
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
-- If both succeed:
COMMIT;
-- If anything fails:
ROLLBACK;
-- Using savepoints for partial rollback
BEGIN TRANSACTION;
UPDATE inventory SET quantity = quantity - 5 WHERE product_id = 100;
SAVEPOINT after_inventory;
UPDATE orders SET status = 'shipped' WHERE order_id = 500;
-- Oops, something went wrong with the order update
ROLLBACK TO SAVEPOINT after_inventory;
-- Inventory change is still pending, order change is undone
COMMIT; -- Only inventory change is committed
Consistency: Valid State to Valid State
A transaction brings the database from one valid state to another valid state. All constraints, triggers, and rules are enforced.
-- Consistency enforced by constraints
CREATE TABLE accounts (
id INT PRIMARY KEY,
balance DECIMAL(10,2) CHECK (balance >= 0) -- No negative balances!
);
-- This transaction would violate consistency:
BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 1000 WHERE id = 1;
-- If the balance was only $500, the CHECK constraint rejects the UPDATE,
-- the transaction is aborted, and COMMIT performs a rollback instead
COMMIT;
Isolation: Concurrent Independence
Isolation ensures concurrent transactions don't interfere with each other. Each transaction sees a consistent snapshot of the database.
Concurrency Problems Without Isolation
| Problem | Description | Example |
| --- | --- | --- |
| Dirty Read | Reading uncommitted data | Txn A sees Txn B's changes before B commits |
| Non-Repeatable Read | Same query, different results | Txn A reads a row, B updates it, A reads a different value |
| Phantom Read | New rows appear | Txn A runs a query, B inserts rows, A's second query shows new rows |
| Lost Update | Overwrites another's changes | A and B both read the same row, both update, one change is lost |
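Of these, the lost update is the one most often fixed in application code. A common remedy is optimistic locking with a version column: the UPDATE only applies if the version is unchanged since the read. The sketch below uses Python's sqlite3 and a hypothetical products table to show the pattern; the same `WHERE version = ?` predicate works in any SQL database.

```python
import sqlite3

# Hypothetical products table with a version column for optimistic locking
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, stock INTEGER, version INTEGER)")
conn.execute("INSERT INTO products VALUES (1, 10, 0)")
conn.commit()

def decrement_stock(conn, product_id, amount):
    """Compare-and-swap style update: fails instead of silently losing a write."""
    stock, version = conn.execute(
        "SELECT stock, version FROM products WHERE id = ?", (product_id,)
    ).fetchone()
    cur = conn.execute(
        "UPDATE products SET stock = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",  # applies only if nobody changed the row meanwhile
        (stock - amount, product_id, version),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows updated => a concurrent write won; caller retries

print(decrement_stock(conn, 1, 3))  # True: stock is now 7, version is now 1
```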
Durability: Committed = Permanent
Once a transaction is committed, it's permanent—even if the system crashes immediately after.
-- Durability is ensured by Write-Ahead Logging (WAL)
-- The database writes changes to a log BEFORE applying them
-- PostgreSQL: Check WAL settings
SHOW wal_level; -- minimal, replica, or logical
SHOW synchronous_commit; -- on = wait for disk write
-- MySQL: InnoDB durability settings
-- innodb_flush_log_at_trx_commit = 1 (safest, every commit syncs to disk)
-- innodb_flush_log_at_trx_commit = 2 (sync once per second, slight risk)
-- innodb_flush_log_at_trx_commit = 0 (no sync, fast but risky)
Trade-off: Full durability (syncing every commit to disk) is slower. Some applications trade durability for performance with batch commits or async replication.
Transaction Isolation Levels
Databases offer several isolation levels, each a trade-off between correctness and performance: stronger isolation prevents more anomalies but reduces throughput.
Isolation Levels Comparison
| Level | Dirty Read | Non-Repeatable | Phantom | Performance |
| --- | --- | --- | --- | --- |
| Read Uncommitted | ⚠️ Possible | ⚠️ Possible | ⚠️ Possible | Fastest |
| Read Committed | ✅ Prevented | ⚠️ Possible | ⚠️ Possible | Fast |
| Repeatable Read | ✅ Prevented | ✅ Prevented | ⚠️ Possible* | Medium |
| Serializable | ✅ Prevented | ✅ Prevented | ✅ Prevented | Slowest |
*InnoDB actually prevents phantoms at Repeatable Read using next-key locking
Read Uncommitted (Chaos Mode)
The weakest level: transactions can see uncommitted changes from other transactions. Almost never used in production. (PostgreSQL doesn't even implement it: requesting Read Uncommitted gives you Read Committed.)
-- Session 1: Update but don't commit
BEGIN;
UPDATE products SET price = 0.01 WHERE id = 1; -- Was $100
-- Don't commit yet!
-- Session 2: With Read Uncommitted, sees the $0.01
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
BEGIN;
SELECT price FROM products WHERE id = 1; -- Returns 0.01 (dirty read!)
-- Session 1: Oops, rollback!
ROLLBACK;
-- Session 2 read a value that never existed!
Read Committed (PostgreSQL/Oracle Default)
Only sees committed data. Each query sees the latest committed snapshot at query execution time.
-- Set isolation level
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
-- Session 1
BEGIN;
SELECT balance FROM accounts WHERE id = 1; -- Returns 1000
-- Session 2 commits a change
-- UPDATE accounts SET balance = 500 WHERE id = 1; COMMIT;
-- Session 1 (same transaction!)
SELECT balance FROM accounts WHERE id = 1; -- Returns 500!
-- Non-repeatable read: same query, different result
COMMIT;
Repeatable Read (MySQL Default)
A transaction sees a consistent snapshot taken at its first read (not literally at BEGIN). The same query always returns the same result within a transaction.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN;
SELECT balance FROM accounts WHERE id = 1; -- Returns 1000
-- Another session commits: UPDATE accounts SET balance = 500...
SELECT balance FROM accounts WHERE id = 1; -- Still returns 1000!
-- Snapshot isolation: sees the data as of the transaction's first read
COMMIT;
-- After commit, new transaction will see 500
Serializable (Maximum Safety)
Transactions execute as if running one at a time. Prevents all anomalies but has significant performance cost.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- PostgreSQL uses Serializable Snapshot Isolation (SSI)
-- Detects conflicts and aborts transactions if needed
BEGIN;
SELECT SUM(balance) FROM accounts; -- $10,000 total
-- Another session tries to transfer between accounts
-- It might get aborted with serialization failure
INSERT INTO audit_log VALUES ('Total checked: $10,000');
COMMIT;
-- If conflict detected:
-- ERROR: could not serialize access due to read/write dependencies
-- Solution: Retry the transaction
Best Practice: Use READ COMMITTED for most OLTP applications. Use SERIALIZABLE only when absolute correctness is required (financial reconciliation, inventory counts).
Locking Mechanisms
Locks prevent concurrent transactions from conflicting. The database automatically acquires locks, but understanding them helps you avoid performance problems.
Row-Level Locks (Fine-Grained)
Lock only the specific rows being accessed. Allows high concurrency—different transactions can modify different rows simultaneously.
-- Shared lock (read): Multiple transactions can hold simultaneously
SELECT * FROM orders WHERE id = 1 FOR SHARE;
-- Also: FOR KEY SHARE (PostgreSQL - weakest, allows FK checks)
-- Exclusive lock (write): Only one transaction can hold
SELECT * FROM orders WHERE id = 1 FOR UPDATE;
-- Blocks other FOR UPDATE and FOR SHARE on same row
-- FOR UPDATE with SKIP LOCKED (job queues!)
-- Grab the next available row without waiting
SELECT * FROM jobs WHERE status = 'pending'
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- NOWAIT: Fail immediately if can't get lock
SELECT * FROM inventory WHERE product_id = 100
FOR UPDATE NOWAIT;
-- Returns error instead of waiting if row is locked
Table-Level Locks (Coarse-Grained)
Lock the entire table. Simpler but reduces concurrency. Used for DDL operations or bulk loads.
-- PostgreSQL table locks
LOCK TABLE orders IN ACCESS SHARE MODE; -- Read only, blocks drops
LOCK TABLE orders IN ROW SHARE MODE; -- SELECT FOR UPDATE allowed
LOCK TABLE orders IN ROW EXCLUSIVE MODE; -- UPDATE/DELETE allowed
LOCK TABLE orders IN SHARE UPDATE EXCLUSIVE MODE; -- VACUUM, CREATE INDEX CONCURRENTLY
LOCK TABLE orders IN SHARE MODE; -- Read-only, blocks writes
LOCK TABLE orders IN SHARE ROW EXCLUSIVE MODE; -- Blocks writes and SHARE mode
LOCK TABLE orders IN EXCLUSIVE MODE; -- Blocks almost everything
LOCK TABLE orders IN ACCESS EXCLUSIVE MODE; -- Blocks everything (DDL)
-- MySQL table locks
LOCK TABLES orders WRITE; -- Exclusive, blocks all access
LOCK TABLES orders READ; -- Shared, blocks writes
UNLOCK TABLES;
-- InnoDB intention locks (internal)
-- IS (intention shared): Transaction wants row shared locks
-- IX (intention exclusive): Transaction wants row exclusive locks
Lock Compatibility Matrix
| Existing Lock | Shared (Read) Requested | Exclusive (Write) Requested |
| --- | --- | --- |
| None | ✅ Granted | ✅ Granted |
| Shared | ✅ Granted | ❌ Wait |
| Exclusive | ❌ Wait | ❌ Wait |
Deadlocks
A deadlock occurs when two or more transactions are waiting for each other's locks—neither can proceed.
Classic Deadlock Scenario
| Time | Transaction A | Transaction B |
| --- | --- | --- |
| T1 | Lock Row 1 ✓ | Lock Row 2 ✓ |
| T2 | Request Lock Row 2... ⏳ | Request Lock Row 1... ⏳ |
| T3 | DEADLOCK! A waits for B | DEADLOCK! B waits for A |
Deadlock Detection
-- PostgreSQL: View current locks
SELECT
l.pid,
a.usename,
l.mode,
l.granted,
a.query
FROM pg_locks l
JOIN pg_stat_activity a ON l.pid = a.pid
WHERE NOT l.granted; -- Waiting for locks
-- Find blocked queries
SELECT
blocked.pid AS blocked_pid,
blocked.query AS blocked_query,
blocking.pid AS blocking_pid,
blocking.query AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_locks blocked_locks ON blocked.pid = blocked_locks.pid
JOIN pg_locks blocking_locks ON blocked_locks.locktype = blocking_locks.locktype
AND blocked_locks.relation = blocking_locks.relation
AND blocked_locks.pid != blocking_locks.pid -- exclude the process's own locks
JOIN pg_stat_activity blocking ON blocking_locks.pid = blocking.pid
WHERE NOT blocked_locks.granted
AND blocking_locks.granted;
-- MySQL: Show InnoDB locks
SELECT * FROM performance_schema.data_locks;
SELECT * FROM performance_schema.data_lock_waits;
SHOW ENGINE INNODB STATUS\G -- Includes deadlock info
Deadlock Prevention Strategies
-- Strategy 1: Consistent ordering - always lock in same order
-- BAD: Random order
-- Transaction A: Lock accounts(1), then accounts(2)
-- Transaction B: Lock accounts(2), then accounts(1)
-- GOOD: Always lock lower ID first
BEGIN;
SELECT * FROM accounts WHERE id IN (1, 2) ORDER BY id FOR UPDATE;
-- Both transactions lock 1 first, then 2
-- Strategy 2: Lock timeout
SET lock_timeout = '5s'; -- PostgreSQL
SET innodb_lock_wait_timeout = 5; -- MySQL (seconds)
-- Strategy 3: NOWAIT - fail immediately
SELECT * FROM orders WHERE id = 1 FOR UPDATE NOWAIT;
-- Raises error immediately if can't get lock
-- Strategy 4: SKIP LOCKED - for job queues
SELECT * FROM jobs WHERE status = 'pending'
LIMIT 1 FOR UPDATE SKIP LOCKED;
-- If row locked, just grab the next one
-- Strategy 5: Reduce transaction duration
-- Keep transactions short!
-- Do expensive computation OUTSIDE the transaction
When Deadlock Occurs: The database automatically detects it and kills one transaction (the "victim"). Your application must be ready to RETRY the aborted transaction!
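A retry wrapper might look like the following sketch. The exception class here is a placeholder: each driver raises its own error for a killed victim (e.g. psycopg2 raises `psycopg2.errors.DeadlockDetected`, MySQL reports error 1213), so a real application would catch that type instead. Exponential backoff with jitter keeps the two competing transactions from colliding again in lockstep.

```python
import random
import time

class TransientDBError(Exception):
    """Stand-in for a driver's deadlock/serialization-failure exception."""

def run_with_retry(txn_fn, max_attempts=5, base_delay=0.05):
    """Run a transaction function, retrying if it is picked as a deadlock victim."""
    for attempt in range(max_attempts):
        try:
            return txn_fn()
        except TransientDBError:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # exponential backoff with jitter before retrying
            time.sleep(base_delay * (2 ** attempt) * random.random())

# Demo: fails twice (as if chosen as a deadlock victim), then succeeds
attempts = {"n": 0}
def flaky_transfer():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransientDBError("deadlock detected")
    return "committed"

print(run_with_retry(flaky_transfer))  # committed (on the 3rd attempt)
```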
MVCC (Multi-Version Concurrency Control)
MVCC is how PostgreSQL, MySQL (InnoDB), and Oracle handle concurrent access without heavy locking. Instead of locking rows, they keep multiple versions of each row.
How MVCC Works
- Write: Instead of overwriting, create a new version of the row
- Read: See the version that was committed before your transaction started
- Cleanup: Garbage collector removes old versions no longer needed
Result: Readers never block writers, writers never block readers!
-- PostgreSQL MVCC implementation
-- Each row has hidden system columns:
-- xmin: Transaction ID that created this row version
-- xmax: Transaction ID that deleted/updated (0 if current)
-- See hidden columns
SELECT xmin, xmax, * FROM products LIMIT 5;
-- Visibility rules:
-- A row version is visible if:
-- 1. xmin is committed AND before your snapshot
-- 2. xmax is 0 (not deleted) OR xmax is not committed OR after your snapshot
-- Check for bloat (old row versions not yet cleaned)
SELECT
relname,
n_dead_tup AS dead_rows,
n_live_tup AS live_rows,
round(n_dead_tup * 100.0 / nullif(n_live_tup + n_dead_tup, 0), 2) AS dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
-- Force cleanup
VACUUM ANALYZE products; -- Normal cleanup
VACUUM FULL products; -- Reclaim disk space (locks table!)
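The visibility rules above can be condensed into a toy function. This is a deliberately simplified model (it ignores in-progress transactions, subtransactions, and hint bits), but it captures the two conditions listed in the comments.

```python
# Toy model of PostgreSQL's MVCC visibility rules (simplified sketch)
def is_visible(xmin, xmax, snapshot_xid, committed):
    """A row version is visible to a snapshot taken at `snapshot_xid` if:
    1. its creator (xmin) committed before the snapshot, AND
    2. it is not deleted (xmax == 0), or the deleter did not commit
       before the snapshot."""
    created_visible = xmin in committed and xmin < snapshot_xid
    deleted_for_us = xmax != 0 and xmax in committed and xmax < snapshot_xid
    return created_visible and not deleted_for_us

committed = {100, 105}
# Row version created by txn 100, later deleted by txn 105:
print(is_visible(100, 105, snapshot_xid=103, committed=committed))  # True: delete not yet visible
print(is_visible(100, 105, snapshot_xid=110, committed=committed))  # False: delete is visible
print(is_visible(100, 0, snapshot_xid=110, committed=committed))    # True: live row
```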
MVCC Trade-off: Great concurrency but requires regular VACUUM to clean old row versions. PostgreSQL's autovacuum handles this, but busy tables may need tuning.
Distributed Transactions Overview
When transactions span multiple databases or services, maintaining ACID becomes much harder. The classic solution is the Two-Phase Commit (2PC) protocol.
Two-Phase Commit (2PC)
Phase 1 - Prepare:
- Coordinator sends "prepare to commit" to all participants
- Each participant writes changes to log, responds "ready" or "abort"
Phase 2 - Commit:
- If all say "ready": Coordinator sends "commit" to all
- If any say "abort": Coordinator sends "rollback" to all
-- PostgreSQL prepared transactions (for 2PC)
-- Enable in postgresql.conf: max_prepared_transactions = 100
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
PREPARE TRANSACTION 'transfer_123'; -- Phase 1: Prepare
-- Later, after all participants ready:
COMMIT PREPARED 'transfer_123'; -- Phase 2: Commit
-- OR if any failed:
ROLLBACK PREPARED 'transfer_123';
-- Check pending prepared transactions
SELECT * FROM pg_prepared_xacts;
2PC Problems: Blocking (if coordinator fails), performance overhead, doesn't scale well. Modern systems often use Saga pattern (compensating transactions) or eventual consistency instead.
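A compensating-transaction saga can be sketched in a few lines. The step names below are purely illustrative; in a real system each action and compensation would be a call to a service that commits its own local transaction.

```python
# Minimal saga sketch: run local steps in order; on failure, execute the
# compensating action for every step that already succeeded, in reverse.
def run_saga(steps):
    """steps: list of (action, compensation) callables."""
    done = []
    try:
        for action, compensation in steps:
            action()
            done.append(compensation)
    except Exception:
        for compensation in reversed(done):  # undo in reverse order
            compensation()
        return False
    return True

log = []
def fail():
    raise RuntimeError("shipping service down")  # third step fails

ok = run_saga([
    (lambda: log.append("reserve inventory"), lambda: log.append("release inventory")),
    (lambda: log.append("charge card"),       lambda: log.append("refund card")),
    (fail,                                    lambda: None),
])
print(ok)   # False
print(log)  # ['reserve inventory', 'charge card', 'refund card', 'release inventory']
```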
Conclusion & Next Steps
Understanding transactions and concurrency control is essential for building reliable database applications. These concepts protect data integrity in real-world scenarios with concurrent users and potential failures.
Continue the Database Mastery Series
Part 4: MySQL & MariaDB
Understand how MySQL handles transactions with InnoDB storage engine.
Part 6: Query Optimization & Indexing
Master EXPLAIN plans, index design, and performance optimization.
Part 11: Scaling & Distributed Systems
Apply transaction concepts to distributed database architectures.