Back to Technology

Part 13: Deadlocks & Prevention

January 31, 2026 Wasil Zafar 25 min read

Understand deadlock conditions, learn detection algorithms, and master prevention strategies including Banker's algorithm.

Table of Contents

  1. Introduction
  2. Coffman Conditions
  3. Resource Allocation Graph
  4. Deadlock Prevention
  5. Deadlock Avoidance
  6. Banker's Algorithm
  7. Deadlock Detection
  8. Recovery from Deadlock
  9. Conclusion & Next Steps

Introduction

A deadlock occurs when processes are waiting for resources held by each other, creating a circular dependency that prevents any progress. Understanding deadlocks is crucial for building reliable systems.

Series Context: This is Part 13 of 24 in the Computer Architecture & Operating Systems Mastery series. Building on synchronization, we now explore the dangerous deadlock condition.

Computer Architecture & OS Mastery

Your 24-step learning path • Currently on Step 13
1
Part 1: Foundations of Computer Systems
System overview, architectures, OS role
2
Digital Logic & CPU Building Blocks
Gates, registers, datapath, microarchitecture
3
Instruction Set Architecture (ISA)
RISC vs CISC, instruction formats, addressing
4
Assembly Language & Machine Code
Registers, stack, calling conventions
5
Assemblers, Linkers & Loaders
Object files, ELF, dynamic linking
6
Compilers & Program Translation
Lexing, parsing, code generation
7
CPU Execution & Pipelining
Fetch-decode-execute, hazards, prediction
8
OS Architecture & Kernel Design
Monolithic, microkernel, system calls
9
Processes & Program Execution
Process lifecycle, PCB, fork/exec
10
Threads & Concurrency
Threading models, pthreads, race conditions
11
CPU Scheduling Algorithms
FCFS, RR, CFS, real-time scheduling
12
Synchronization & Coordination
Locks, semaphores, classic problems
13
Deadlocks & Prevention
Coffman conditions, Banker's algorithm
You Are Here
14
Memory Hierarchy & Cache
L1/L2/L3, cache coherence, NUMA
15
Memory Management Fundamentals
Address spaces, fragmentation, allocation
16
Virtual Memory & Paging
Page tables, TLB, demand paging
17
File Systems & Storage
Inodes, journaling, ext4, NTFS
18
I/O Systems & Device Drivers
Interrupts, DMA, disk scheduling
19
Multiprocessor Systems
SMP, NUMA, cache coherence
20
OS Security & Protection
Privilege levels, ASLR, sandboxing
21
Virtualization & Containers
Hypervisors, namespaces, cgroups
22
Advanced Kernel Internals
Linux subsystems, kernel debugging
23
Case Studies
Linux vs Windows vs macOS
24
Capstone Projects
Shell, thread pool, paging simulator
The Deadlock Problem: Two cars meet on a narrow bridge. Neither can proceed because each is waiting for the other to back up. This is deadlock—and it happens in computer systems too!

Coffman Conditions

Deadlock can only occur when all four conditions hold simultaneously. Break any one condition to prevent deadlock.

The Four Coffman Conditions

Coffman Conditions for Deadlock (1971):
══════════════════════════════════════════════════════════════

1. MUTUAL EXCLUSION
   • At least one resource must be non-shareable
   • Only one process can use resource at a time
   Example: Printer can only print one job at a time

2. HOLD AND WAIT
   • Process holds at least one resource while waiting
     for additional resources held by others
   Example: Hold lock A, wait for lock B

3. NO PREEMPTION
   • Resources cannot be forcibly taken away
   • Process must voluntarily release resources
   Example: Cannot force a process to release a mutex

4. CIRCULAR WAIT
   • Chain of processes: P1 waits for P2, P2 waits for P3,
     ..., Pn waits for P1
   Example: Dining philosophers all holding left fork


ALL FOUR must be true for deadlock to occur!
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
┌────────────┐     ┌────────────┐
│ Process A  │────→│ Resource 1 │ (holds)
│ waits for  │     └────────────┘
│ Resource 2 │            ↑
└────────────┘     ┌──────┴─────┐
      │            │ Process B  │ (holds)
      └───────────→│ waits for  │
                   │ Resource 1 │
                   └────────────┘
      
DEADLOCK! Circular wait exists.

Resource Allocation Graph

A Resource Allocation Graph (RAG) visualizes the state of resource allocation and can detect potential deadlocks.

Resource Allocation Graph Notation:
══════════════════════════════════════════════════════════════

Nodes:
  ○ Process (circle)
  ▢ Resource type (rectangle with dots for instances)

Edges:
  ○ ──→ ▢  Request edge: Process requests resource
  ▢ ──→ ○  Assignment edge: Resource assigned to process


Example RAG - No Deadlock:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

     ┌──────────────────────────┐
     ↓                          │
    ○ P1     ○ P2 ───────→ ▢ R2 ●
     │        ↑              │
     │        │              │
     ▼        └──────────────┘
   ▢ R1 ● ───────────────→ P3 ○

P1 holds R1, P2 waits for R2
R2 assigned to P3 → P2 can proceed when P3 releases R2
No cycle → No deadlock


Example RAG - With Deadlock:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

   ○ P1 ───────→ ▢ R1 ● ───────→ ○ P2
     ↑                             │
     │                             ▼
     └────────── ▢ R2 ● ←──────────┘

P1 holds R2, requests R1
P2 holds R1, requests R2
Cycle exists → DEADLOCK!


Detection Rule:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
• Single instance per resource type: Cycle = Deadlock
• Multiple instances: Cycle necessary but not sufficient
  (may still be able to satisfy from other instances)
Graph Reduction: Remove processes that can complete (have all needed resources). If all processes can be removed, no deadlock. If reduction gets stuck with remaining processes—deadlock!

Deadlock Prevention

Prevent deadlock by ensuring at least one Coffman condition can never hold.

Breaking Each Condition

Deadlock Prevention Strategies:
══════════════════════════════════════════════════════════════

1. BREAK MUTUAL EXCLUSION
   • Make resources shareable where possible
   • Example: Read-only files, spooling (virtual printers)
   ✗ Often not practical - many resources inherently exclusive

2. BREAK HOLD AND WAIT
   Option A: Request ALL resources at once before starting
      • process_start(need: [R1, R2, R3])
      ✗ Poor utilization, process may not know all needs
   
   Option B: Release all resources before requesting new ones
      • release_all(); request(new_resources);
      ✗ May lose work, expensive to re-acquire

3. BREAK NO PREEMPTION
   • If process can't get resource, release what it holds
   • OS can forcibly take resources
   ✗ Only works for state-saveable resources (CPU, memory)
   ✗ Doesn't work for printers, mutex locks

4. BREAK CIRCULAR WAIT (Most Practical!)
   • Impose total ordering on all resource types
   • Processes must request resources in order
   
   Example: R1 < R2 < R3 < R4
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
   Process holding R3 can only request R4 (higher)
   Process holding R4 cannot request R1, R2, R3 (lower)
   
   ✓ No circular wait possible!
   ✗ Must define ordering, may be inconvenient
/* Resource Ordering to Prevent Circular Wait */

/* Define resource ordering: LOCK_A < LOCK_B < LOCK_C */
#define LOCK_A 1
#define LOCK_B 2
#define LOCK_C 3

pthread_mutex_t locks[4];

void safe_acquire(int resource1, int resource2) {
    /* Always acquire in order to prevent circular wait */
    int first = (resource1 < resource2) ? resource1 : resource2;
    int second = (resource1 < resource2) ? resource2 : resource1;
    
    pthread_mutex_lock(&locks[first]);
    pthread_mutex_lock(&locks[second]);
}

void safe_release(int resource1, int resource2) {
    pthread_mutex_unlock(&locks[resource1]);
    pthread_mutex_unlock(&locks[resource2]);
}

/* Usage */
void thread_func() {
    safe_acquire(LOCK_B, LOCK_A);  /* Automatically acquires A then B */
    /* ... critical section ... */
    safe_release(LOCK_A, LOCK_B);
}

Deadlock Avoidance

Prevention is restrictive. Avoidance allows more flexibility—dynamically check if granting a request could lead to deadlock.

Deadlock Avoidance Concept:
══════════════════════════════════════════════════════════════

SAFE STATE:
• System can allocate resources to each process (up to max)
  in SOME order and avoid deadlock
• Exists at least one "safe sequence" of execution

UNSAFE STATE:
• No guaranteed safe sequence
• Deadlock is POSSIBLE (not certain, but might occur)


State Transitions:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

      ┌─────────────────────┐
      │                     │
      │     SAFE STATE      │──── All requests considered
      │                     │     carefully before granting
      └─────────────────────┘
               │
      Request would make │
      state unsafe       │
               ↓
      ┌─────────────────────┐
      │                     │
      │    UNSAFE STATE     │──── Deadlock may or may not
      │                     │     occur
      └─────────────────────┘
               │
      Processes make bad  │
      decisions           │
               ↓
      ┌─────────────────────┐
      │                     │
      │      DEADLOCK       │──── Guaranteed stuck
      │                     │
      └─────────────────────┘


Avoidance Strategy:
• Only grant request if resulting state is SAFE
• If granting would lead to unsafe state → make process wait
Key Insight: An unsafe state doesn't mean deadlock has occurred—it means the system can no longer guarantee deadlock won't occur. The system might still complete successfully if processes happen to request resources in a favorable order.

Banker's Algorithm

Dijkstra's Banker's Algorithm determines if granting a request leaves the system in a safe state. Named after how a banker decides loan requests.

Banker's Algorithm Example

Banker's Algorithm:
══════════════════════════════════════════════════════════════

Data Structures (n processes, m resource types):
• Available[m]:    Available instances of each resource
• Max[n][m]:       Maximum demand of each process
• Allocation[n][m]: Currently allocated to each process
• Need[n][m]:      Remaining need (Max - Allocation)


Example: 5 processes (P0-P4), 3 resource types (A, B, C)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Available: A=3, B=3, C=2

         Allocation    Max        Need
         A  B  C      A  B  C    A  B  C
   P0    0  1  0      7  5  3    7  4  3
   P1    2  0  0      3  2  2    1  2  2
   P2    3  0  2      9  0  2    6  0  0
   P3    2  1  1      2  2  2    0  1  1
   P4    0  0  2      4  3  3    4  3  1


Safety Algorithm - Find Safe Sequence:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Work = Available = [3, 3, 2]
Finish = [false, false, false, false, false]

Round 1: Find process with Need ≤ Work
  P0: Need[7,4,3] ≤ [3,3,2]? NO
  P1: Need[1,2,2] ≤ [3,3,2]? YES ✓
  
  Execute P1: Work = Work + Allocation[P1] = [3,3,2]+[2,0,0] = [5,3,2]
  Finish[P1] = true
  Sequence: <P1>

Round 2: Work = [5, 3, 2]
  P0: Need[7,4,3] ≤ [5,3,2]? NO
  P3: Need[0,1,1] ≤ [5,3,2]? YES ✓
  
  Execute P3: Work = [5,3,2]+[2,1,1] = [7,4,3]
  Finish[P3] = true
  Sequence: <P1, P3>

Round 3: Work = [7, 4, 3]
  P0: Need[7,4,3] ≤ [7,4,3]? YES ✓
  
  Execute P0: Work = [7,4,3]+[0,1,0] = [7,5,3]
  Sequence: <P1, P3, P0>

Round 4: Work = [7, 5, 3]
  P2: Need[6,0,0] ≤ [7,5,3]? YES ✓
  
  Execute P2: Work = [7,5,3]+[3,0,2] = [10,5,5]
  Sequence: <P1, P3, P0, P2>

Round 5: Work = [10, 5, 5]
  P4: Need[4,3,1] ≤ [10,5,5]? YES ✓
  
  Execute P4
  Sequence: <P1, P3, P0, P2, P4>

SAFE! All processes can complete.
# Banker's Algorithm Implementation
def is_safe_state(available, max_need, allocation):
    n = len(allocation)  # Number of processes
    m = len(available)   # Number of resource types
    
    # Calculate need matrix
    need = [[max_need[i][j] - allocation[i][j] 
             for j in range(m)] for i in range(n)]
    
    work = available[:]
    finish = [False] * n
    safe_sequence = []
    
    while len(safe_sequence) < n:
        found = False
        for i in range(n):
            if not finish[i]:
                # Check if Need[i] <= Work
                if all(need[i][j] <= work[j] for j in range(m)):
                    # Process can complete
                    for j in range(m):
                        work[j] += allocation[i][j]
                    finish[i] = True
                    safe_sequence.append(f"P{i}")
                    found = True
                    break
        
        if not found:
            # No process can proceed - unsafe!
            return False, []
    
    return True, safe_sequence

# Example usage
available = [3, 3, 2]
max_need = [[7,5,3], [3,2,2], [9,0,2], [2,2,2], [4,3,3]]
allocation = [[0,1,0], [2,0,0], [3,0,2], [2,1,1], [0,0,2]]

safe, sequence = is_safe_state(available, max_need, allocation)
print(f"Safe: {safe}, Sequence: {sequence}")
# Output: Safe: True, Sequence: ['P1', 'P3', 'P0', 'P2', 'P4']
Practical Limitations: Banker's requires knowing maximum resource needs in advance (often unknown), runs in O(n²m) per request (expensive), and assumes fixed number of processes/resources.

Deadlock Detection

Instead of preventing/avoiding, allow deadlock to occur but detect it and recover. May be more efficient if deadlocks are rare.

Deadlock Detection Algorithm:
══════════════════════════════════════════════════════════════

Similar to Safety Algorithm, but:
• Use current Request matrix instead of Max
• Optimistic: assume processes will release resources

Algorithm:
1. Work = Available
2. For each process not known to be involved in deadlock:
   - If Request[i] ≤ Work, assume it will complete
   - Work = Work + Allocation[i]
   - Mark process as able to finish
3. Repeat until no more processes can finish
4. Any unmarked processes are DEADLOCKED


When to Run Detection?
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
• On every resource request (expensive, immediate detection)
• Periodically (e.g., every 5 minutes)
• When CPU utilization drops below threshold
• When resource request fails

Trade-off: Detection frequency vs overhead
/* Simple deadlock detection via wait-for graph */
#include <stdbool.h>

#define MAX_PROCS 100

/* Wait-for graph: waitfor[i][j] = 1 if process i waits for j */
int waitfor[MAX_PROCS][MAX_PROCS];
bool visited[MAX_PROCS];
bool rec_stack[MAX_PROCS];  /* Recursion stack for cycle detection */

/* DFS to detect cycle (deadlock) */
bool detect_cycle(int proc, int n) {
    visited[proc] = true;
    rec_stack[proc] = true;
    
    for (int j = 0; j < n; j++) {
        if (waitfor[proc][j]) {  /* proc waits for j */
            if (!visited[j]) {
                if (detect_cycle(j, n))
                    return true;  /* Cycle found */
            } else if (rec_stack[j]) {
                return true;  /* Back edge = cycle */
            }
        }
    }
    
    rec_stack[proc] = false;
    return false;
}

bool deadlock_exists(int n) {
    for (int i = 0; i < n; i++) {
        visited[i] = false;
        rec_stack[i] = false;
    }
    
    for (int i = 0; i < n; i++) {
        if (!visited[i] && detect_cycle(i, n))
            return true;
    }
    return false;
}

Recovery from Deadlock

Once deadlock is detected, the system must break it. Options range from gentle to drastic.

Recovery Strategies

Deadlock Recovery Options:
══════════════════════════════════════════════════════════════

1. PROCESS TERMINATION
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
   Option A: Abort ALL deadlocked processes
      ✓ Guaranteed to break deadlock
      ✗ Expensive - lose all work
   
   Option B: Abort ONE at a time until deadlock broken
      ✓ Less work lost
      ✗ Multiple detection cycles needed
   
   Which to abort? Consider:
   • Priority of process
   • How long it has computed
   • How much longer to complete
   • Resources it holds
   • Resources it needs to complete
   • Type of process (interactive vs batch)


2. RESOURCE PREEMPTION
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
   Temporarily take resources from a process
   
   Selecting victim:
   • Choose process that loses least work
   • Consider cost of rollback
   
   Rollback:
   • Roll back process to safe state (checkpoint)
   • Or roll back completely and restart
   
   Starvation prevention:
   • Limit how many times same process can be victim
   • Include rollback count in victim selection


3. CHECKPOINT AND ROLLBACK
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
   Periodically save process state
   On deadlock: roll back one process to earlier checkpoint
   
   ┌─────────────────────────────────────────────────┐
   │ Process execution                               │
   │ ─────●─────────●─────────●──────────►          │
   │    chkpt1    chkpt2    chkpt3                  │
   │                                     ↑           │
   │              Deadlock detected here─┘           │
   │                        │                        │
   │              Roll back to chkpt2               │
   └─────────────────────────────────────────────────┘
Real-World Deadlock Handling:
══════════════════════════════════════════════════════════════

Operating Systems:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
• Most ignore deadlock (ostrich algorithm)!
• Assume deadlocks are rare
• User reboot is the "recovery"
• Windows: Waits with timeouts

Databases:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
• Actively detect deadlocks (wait-for graphs)
• ABORT one transaction (victim selection)
• Transaction automatically retries
• PostgreSQL: deadlock_timeout parameter

Distributed Systems:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
• Much harder - no global state!
• Timeout-based detection
• Two-phase locking with deadlock prevention
• Distributed deadlock detection algorithms
Practical Reality: Most systems use a combination—prevention for easy cases (resource ordering), timeouts for detection, and termination for recovery. Pure avoidance (Banker's) is rarely used due to overhead.

Conclusion & Next Steps

Deadlocks are a fundamental challenge in concurrent systems. We've covered:

  • Coffman Conditions: The four necessary conditions for deadlock
  • Resource Allocation Graphs: Visual deadlock analysis
  • Prevention: Break Coffman conditions (especially circular wait)
  • Avoidance: Dynamic safety checking (Banker's algorithm)
  • Detection & Recovery: Allow deadlock, then fix it
Key Takeaway: Resource ordering to prevent circular wait is the most practical prevention strategy. For systems where deadlocks are rare, detection and recovery may be more efficient than prevention overhead.

Next in the Series

In Part 14: Memory Hierarchy & Cache, we'll explore how memory systems are organized for performance—from CPU registers through L1/L2/L3 caches to main memory and beyond.