Process Lifecycle
A process is a running program — an ELF binary loaded into memory with its own address space, file descriptor table, and CPU state. On Linux, every process except PID 1 (systemd/init) is created by another process. The family tree of processes stretches back to PID 1, which is the ancestor of all user-space processes.
stateDiagram-v2
[*] --> Created: fork() called
Created --> Running: Scheduler selects process
Running --> Sleeping_Interruptible: Wait for I/O or event (e.g., read())
Running --> Sleeping_Uninterruptible: Wait for hardware I/O (cannot be interrupted)
Running --> Stopped: SIGSTOP or SIGTSTP received (Ctrl+Z sends SIGTSTP)
Stopped --> Running: SIGCONT received (fg/bg)
Sleeping_Interruptible --> Running: Event occurs or signal received
Sleeping_Uninterruptible --> Running: I/O completes
Running --> Zombie: Process calls exit()
Zombie --> [*]: Parent calls wait() — entry removed from process table
fork() and exec()
Linux creates new processes using two system calls in sequence:
- fork(): Creates an exact copy of the calling process. The child gets a new PID but inherits the parent's memory, file descriptors, and signal handlers. Memory is copied lazily via Copy-on-Write (CoW) — pages are shared read-only until one process writes to them, at which point a private copy is made.
- exec(): Replaces the current process's address space with a new program. The PID stays the same; the code, data, stack, and heap are replaced with the new program's.
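The two calls combine with wait() into the pattern every shell follows. The sketch below is a minimal illustration, assuming a Unix-like system; run_command is a hypothetical helper name chosen for this example, not part of any library.

```python
import os
import sys


def run_command(argv):
    """Run argv the way a shell does: fork, exec in the child, wait in the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            # exec failed (e.g. program not found); 127 mirrors the shell convention.
            os._exit(127)
    # Parent: block until the child exits, then decode the wait status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # negative value: child was killed by a signal


if __name__ == "__main__":
    code = run_command([sys.executable, "-c", "import sys; sys.exit(3)"])
    print(f"child exit code: {code}")
```

The child never returns from a successful execvp(), which is why the code after it only handles the failure path.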
# Observe the fork/exec pattern in your shell
# When you type a command, the shell:
# 1. Calls fork() — creates a copy of itself
# 2. The child calls exec() — replaces itself with the command binary
# 3. The parent (shell) calls wait() — blocks until child exits
# See the process-management system calls a command makes
strace -e trace=process ls /tmp 2>&1 | head -20
# Look for: execve("/usr/bin/ls", ...), which is the exec() step (strace did the fork itself, so ls shows no fork/clone)
# See the parent/child relationship
echo "Shell PID: $$"
bash -c 'echo "Child PID: $$, Parent PID: $PPID"'
# Child's PPID == Shell's PID
# Observe fork in Python
python3 -c "
import os
pid = os.fork()
if pid == 0:
    print(f'Child: PID={os.getpid()}, PPID={os.getppid()}')
    os._exit(0)
else:
    print(f'Parent: PID={os.getpid()}, child_pid={pid}')
    os.wait()  # Wait for child to finish
"
wait() and Zombie Processes
When a child process exits, its entry in the kernel's process table is not immediately removed. It becomes a zombie — it holds only its exit status, waiting for the parent to call wait() to retrieve it. Once the parent calls wait(), the zombie is fully removed.
If the parent never calls wait() (a common bug), zombie processes accumulate. They consume no CPU and almost no memory (just a process-table slot and a PID), but if enough accumulate, the system runs out of PIDs and can't create new processes. This is why container runtimes need a proper init process that reaps zombies; it's one of the reasons tini exists as a Docker init wrapper.
# Observe zombie processes
# Z = zombie in the STAT column (often with extra flags, e.g. Z+ or Zs)
ps aux | awk '$8 ~ /^Z/'
# Create a zombie deliberately (academic example: the child exits immediately, the parent reaps it 5 s later)
python3 -c "
import os, time
pid = os.fork()
if pid == 0:
    os._exit(0)  # Child exits immediately
else:
    print(f'Child {pid} is now a zombie (parent not calling wait yet)', flush=True)
    time.sleep(5)  # During these 5s the child shows up as Z / <defunct> in ps
    os.wait()  # Now reap the zombie
    print('Zombie reaped')
" &
sleep 1 && ps aux | grep '[d]efunct' | head -3
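A long-running parent usually avoids zombies by reaping children as soon as the kernel reports their exit. The sketch below shows one common approach, a SIGCHLD handler that drains every exited child with a non-blocking waitpid(); the names reaped, reap_children, and spawn_short_lived are hypothetical, and this is not a description of how tini itself is implemented.

```python
import os
import signal
import time

reaped = []  # PIDs collected by the handler (kept only so the demo can inspect them)


def reap_children(signum, frame):
    """SIGCHLD handler: collect every child that has already exited."""
    # Several exits can be coalesced into a single SIGCHLD, so loop until
    # waitpid() reports that nothing more is ready.
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children at all
        if pid == 0:
            return  # children exist, but none has exited yet
        reaped.append(pid)


def spawn_short_lived(count):
    """Fork `count` children that exit immediately; return their PIDs."""
    pids = []
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            os._exit(0)
        pids.append(pid)
    return pids


# Must be called from the main thread of the process.
signal.signal(signal.SIGCHLD, reap_children)

if __name__ == "__main__":
    children = spawn_short_lived(3)
    time.sleep(0.5)  # give the kernel time to deliver SIGCHLD
    print(f"spawned {children}, reaped {sorted(reaped)}")
```

Because the handler uses WNOHANG, it never blocks the parent even when some children are still running.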
Process States
| State | Code | Meaning | Example |
|---|---|---|---|
| Running | R | Currently executing on CPU or in run queue | Active computation |
| Interruptible Sleep | S | Waiting for event; can be woken by signal | Waiting for network I/O, user input |
| Uninterruptible Sleep | D | Waiting for hardware I/O; cannot be interrupted | Disk read in progress, NFS stall |
| Stopped | T | Paused by SIGSTOP or SIGTSTP | Ctrl+Z in terminal (sends SIGTSTP) |
| Zombie | Z | Exited but parent hasn't called wait() | Bug in parent process |
| Idle | I | Kernel thread with no work to do | kworker threads when idle |
# See process states with ps
ps aux | awk 'NR==1 || $8 ~ /[RSDT]/' | head -20
# STAT column: R=running, S=sleeping, D=disk wait, T=stopped, Z=zombie
# Additional flags: s=session leader, <=high priority, N=low priority, l=multithreaded, +=foreground
# Count processes in each state
ps -e -o stat= | sort | uniq -c | sort -rn
# Processes in uninterruptible sleep (D state) — potential I/O bottleneck
ps aux | awk '$8 ~ /^D/'
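The state codes that ps prints come from /proc. As a rough sketch, assuming a Linux system with /proc mounted, the same information can be read directly; process_state and count_states are hypothetical helper names.

```python
import os


def process_state(pid):
    """Return the one-letter state code (R, S, D, T, Z, I, ...) for a PID."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # Layout: "pid (comm) state ...". comm may itself contain spaces or ')',
    # so locate the LAST ')' and take the field that follows it.
    return data[data.rindex(")") + 2]


def count_states():
    """Count processes per state code, similar to the ps | sort | uniq pipeline."""
    counts = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            state = process_state(entry)
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            continue  # process exited between listdir() and open()
        counts[state] = counts.get(state, 0) + 1
    return counts


if __name__ == "__main__":
    print("this process:", process_state(os.getpid()))
    print("all processes:", count_states())
```

A process that inspects its own entry always sees R, since it has to be running to perform the read.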
The CFS Scheduler
Linux's CFS (Completely Fair Scheduler) tries to give every process an equal share of CPU time. It tracks each process's virtual runtime (vruntime) — the amount of CPU time the process has used, normalised by its priority. The scheduler always picks the process with the lowest vruntime — the one that has received the least CPU time relative to its fair share.
Process priority is controlled by nice values, ranging from -20 (highest priority) to +19 (lowest priority). Higher priority (lower nice value) means the process's vruntime accumulates more slowly, so it gets selected more often.
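The effect of nice values on vruntime can be illustrated with a toy simulation. This is a deliberately simplified sketch, not the kernel's implementation: it assumes every task is always runnable, uses a fixed time slice, and approximates the kernel's weight table with the rule of thumb that each nice step changes the weight by a factor of about 1.25 (nice 0 has weight 1024). The names weight and simulate are hypothetical.

```python
import heapq

NICE_0_WEIGHT = 1024  # weight the kernel assigns to nice 0


def weight(nice):
    """Approximate scheduler weight: each nice step is roughly a 1.25x change."""
    return NICE_0_WEIGHT / (1.25 ** nice)


def simulate(nice_values, ticks=10_000, slice_ms=1.0):
    """Return CPU milliseconds received by each task after `ticks` time slices."""
    # Min-heap keyed by vruntime: the task with the lowest vruntime runs next.
    heap = [(0.0, task_id) for task_id in range(len(nice_values))]
    heapq.heapify(heap)
    cpu_ms = [0.0] * len(nice_values)
    for _ in range(ticks):
        vruntime, task_id = heapq.heappop(heap)
        cpu_ms[task_id] += slice_ms
        # Lower nice => larger weight => vruntime grows more slowly.
        vruntime += slice_ms * NICE_0_WEIGHT / weight(nice_values[task_id])
        heapq.heappush(heap, (vruntime, task_id))
    return cpu_ms


if __name__ == "__main__":
    shares = simulate([0, 5])
    print(f"nice 0: {shares[0]:.0f} ms, nice 5: {shares[1]:.0f} ms")
```

With one task at nice 0 and one at nice 5, the first ends up with roughly three times the CPU time of the second, which matches the ratio of their weights.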
# View nice values of processes
ps -eo pid,ni,cmd | head -20
# NI column = nice value (top additionally shows the scheduler priority as PR)
# Start a CPU-intensive process at low priority (nice +10)
nice -n 10 python3 -c "
while True: pass # 100% CPU, but nice=10 so it yields to others
" &
LOW_PRI_PID=$!
# Start another at normal priority
python3 -c "
import time; time.sleep(5) # Not CPU intensive, for comparison
" &
# Change nice value of running process
renice -n 15 -p $LOW_PRI_PID # Make it even lower priority
# Kill our test processes
kill $LOW_PRI_PID 2>/dev/null
# View memory use, thread count, and context-switch counters for the current shell (via /proc)
grep -E "VmRSS|Threads|voluntary" /proc/$$/status
# cgroups CPU quota (preview — covered fully in Part 21)
# cat /sys/fs/cgroup/cpu/docker//cpu.cfs_quota_us
cgroups CPU Limits
cgroups (control groups) allow the kernel to limit, account for, and isolate the resource usage (CPU, memory, I/O) of groups of processes. Docker container CPU limits are implemented via cgroups. When you run docker run --cpus 0.5, Docker sets the container's CPU quota to half the scheduling period (cpu.cfs_quota_us and cpu.cfs_period_us on cgroup v1, the cpu.max file on cgroup v2), capping the container at the equivalent of 50% of one CPU core regardless of nice values.
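The quota arithmetic is simple enough to write down. The sketch below assumes the commonly used default period of 100000 µs and the cgroup v2 cpu.max layout of "quota period" (with the literal word max meaning unlimited); cpu_quota_us and parse_cpu_max are hypothetical helper names.

```python
def cpu_quota_us(cpus, period_us=100_000):
    """Quota in microseconds per period for a --cpus style limit."""
    return round(cpus * period_us)


def parse_cpu_max(text):
    """Turn a cgroup v2 cpu.max value into a CPU count, or None if unlimited."""
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


if __name__ == "__main__":
    print(cpu_quota_us(0.5))              # quota for half a core
    print(parse_cpu_max("50000 100000"))  # limit expressed in cores
    print(parse_cpu_max("max 100000"))    # no limit configured
```

A limit above 1.0 simply means the quota exceeds the period, which lets the group run on more than one core at once.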
Signals
Signals are asynchronous notifications sent to a process. They're the primary way the OS and other processes communicate with a running process about events (user requesting exit, terminal resize, timer expiry, etc.).
| Signal | Number | Default Action | Common Use |
|---|---|---|---|
| SIGHUP | 1 | Terminate | Daemon config reload (nginx, sshd) |
| SIGINT | 2 | Terminate | Ctrl+C — user interrupt |
| SIGQUIT | 3 | Core dump | Ctrl+\ — quit with core dump |
| SIGKILL | 9 | Terminate (uncatchable) | Force kill — cannot be caught or ignored |
| SIGSEGV | 11 | Core dump | Segmentation fault — invalid memory access |
| SIGTERM | 15 | Terminate | Graceful shutdown — catchable |
| SIGSTOP | 19 | Stop (uncatchable) | Pause process; cannot be caught or ignored |
| SIGTSTP | 20 | Stop | Ctrl+Z, terminal stop (catchable) |
| SIGCONT | 18 | Continue | Resume stopped process |
| SIGCHLD | 17 | Ignore | Child process changed state |
| SIGUSR1/2 | 10/12 | Terminate | User-defined — app-specific signals |
# Send signals to processes
kill -TERM PID # Graceful shutdown (default kill)
kill -HUP $(pgrep nginx) # Reload nginx config
kill -9 PID # Force kill (last resort only)
kill -0 PID # Check if process exists (no signal sent)
# Signal handling in a shell script
cleanup() {
    echo "Signal received, cleaning up..."
    rm -f /tmp/my-lockfile
    exit 0
}
trap cleanup SIGTERM SIGINT # Register handler
# Demonstrate signal handling in Python
python3 -c "
import os, signal, sys, time
def handler(signum, frame):
    print(f'Caught signal {signum}, shutting down gracefully')
    sys.exit(0)
signal.signal(signal.SIGTERM, handler)
signal.signal(signal.SIGINT, handler)
print(f'PID {os.getpid()}: send SIGTERM to test', flush=True)
while True: time.sleep(1)
" &
PID=$!
sleep 2
kill -TERM $PID # Should print graceful shutdown message
wait $PID
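The sending side can be sketched in Python as well. The example below assumes a Unix-like system: process_exists mirrors kill -0, and signal_child forks a sleeping child, signals it, and reports how it ended. Both function names are hypothetical.

```python
import os
import signal
import time


def process_exists(pid):
    """Equivalent of `kill -0 PID`: run the permission/existence check only."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def signal_child(sig):
    """Fork a sleeping child, send it `sig`, and describe how it terminated."""
    pid = os.fork()
    if pid == 0:
        time.sleep(60)  # stand-in for real work; no handler is installed
        os._exit(0)
    os.kill(pid, sig)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        # The wait status records which signal ended the process.
        return signal.Signals(os.WTERMSIG(status)).name
    return f"exit {os.WEXITSTATUS(status)}"


if __name__ == "__main__":
    print("self exists:", process_exists(os.getpid()))
    print("child ended by:", signal_child(signal.SIGTERM))
```

Since the child installs no handler, the default action applies and the parent sees the signal name in the wait status rather than an exit code.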
Inter-Process Communication
Processes are isolated by design — they can't access each other's memory. IPC mechanisms provide controlled, safe ways for processes to communicate and synchronise.
# === 1. Anonymous Pipes: unidirectional, between related processes (shared ancestor) ===
# The | operator in your shell creates anonymous pipes
ls -la | grep '\.conf' # ls stdout → pipe → grep stdin
# Named pipes (FIFOs) — work between unrelated processes
mkfifo /tmp/my-pipe
echo "hello from writer" > /tmp/my-pipe & # Writer blocks until reader connects
cat /tmp/my-pipe # Reader — unblocks both
rm /tmp/my-pipe
# === 2. Unix Domain Sockets — bidirectional, same machine ===
# Used by: Docker (dockerd socket), systemd, X11, DBus, Postgres
ls -la /var/run/*.sock 2>/dev/null # List Unix sockets on system
file /var/run/docker.sock 2>/dev/null
# "socket" type file
# Query Docker via its Unix socket directly
curl --unix-socket /var/run/docker.sock http://localhost/version 2>/dev/null | python3 -m json.tool | head -10
# === 3. Shared Memory ===
# Fastest IPC — both processes map the same physical pages
# Used by: databases (PostgreSQL shared_buffers), Redis, high-perf apps
ipcs -m # List System V shared memory segments
# POSIX shared memory objects appear as files under /dev/shm
ls -lh /dev/shm/
# === 4. Message Queues — async, buffered ===
ipcs -q # List SysV message queues
# === Summary: Which IPC to use? ===
# Pipe: simple parent-child, one direction, streaming
# Unix socket: bidirectional, unrelated processes, same machine
# Shared memory: highest throughput, needs synchronisation (mutexes)
# Network socket: across machines (TCP/UDP — covered in Part 11)
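What the shell's | operator does can be reproduced by hand. The sketch below, assuming a Unix-like system, creates an anonymous pipe, forks, and streams bytes from child to parent; send_through_pipe is a hypothetical helper name, and the single os.write() call assumes a message small enough to fit in the pipe buffer.

```python
import os


def send_through_pipe(message: bytes) -> bytes:
    """Child writes `message` into a pipe; parent reads until end-of-file."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child is the writer: it does not need the read end.
        os.close(read_fd)
        os.write(write_fd, message)
        os.close(write_fd)
        os._exit(0)
    # Parent is the reader. It must close its copy of the write end,
    # otherwise read() never sees end-of-file.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break  # all write ends are closed: end of stream
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return b"".join(chunks)


if __name__ == "__main__":
    print(send_through_pipe(b"hello from child"))
```

The pipe works here only because the file descriptors are inherited across fork(), which is why anonymous pipes are limited to related processes.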
The Docker Daemon Uses a Unix Socket
When you run docker ps, the Docker CLI doesn't talk to dockerd over TCP — it sends HTTP requests over the Unix domain socket at /var/run/docker.sock. This is fast (no TCP overhead, no TLS), stays on the machine, and can be permission-controlled via file permissions on the socket file. When you mount -v /var/run/docker.sock:/var/run/docker.sock into a container, you're granting it full control over the Docker daemon — equivalent to root on the host. This is a major security concern in CI/CD pipelines.
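The socket-file model can be tried without Docker. The sketch below is a self-contained illustration, assuming a Unix-like system: it binds a stream socket to a temporary path, restricts it with file permissions, and exchanges one message with a forked server. The path demo.sock and the function unix_socket_echo are hypothetical, and the upper-casing server merely stands in for a real daemon.

```python
import os
import socket
import tempfile


def unix_socket_echo(message: bytes) -> bytes:
    """Send `message` to a forked server over a Unix socket; return its reply."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")  # hypothetical socket path
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)      # creates the socket file on disk
        os.chmod(path, 0o600)  # access control via ordinary file permissions
        server.listen(1)
        pid = os.fork()
        if pid == 0:
            # Child acts as the daemon: accept one client and reply in upper case.
            conn, _ = server.accept()
            conn.sendall(conn.recv(4096).upper())
            conn.close()
            os._exit(0)
        server.close()  # the parent only needs the client side
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
            client.connect(path)
            client.sendall(message)
            reply = client.recv(4096)  # bidirectional: same socket, other direction
        os.waitpid(pid, 0)
        return reply


if __name__ == "__main__":
    print(unix_socket_echo(b"ping"))
```

Unlike a pipe, the client here locates the server purely by filesystem path, so the two processes do not need to be related.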
Exercises
# Exercise 1: Explore the process tree
pstree -p | head -30 # See parent-child relationships
# Note: systemd(1) is the root of all user processes
# Exercise 2: Observe fork() in shell
echo "My PID: $$"
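(echo "Subshell PID: $BASHPID, \$\$ still reports: $$, Parent: $PPID")
# Inside a subshell, $$ and $PPID keep the original shell's values; only $BASHPID shows the forked subshell's own PID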
# Exercise 3: Find zombie and D-state processes
ps aux | awk '$8 ~ /^Z/ {print "ZOMBIE:", $0}'
ps aux | awk '$8 ~ /^D/ {print "DISK WAIT:", $0}'
# Exercise 4: Test signal handling
sleep 60 &
SLEEP_PID=$!
echo "Sleep PID: $SLEEP_PID"
kill -STOP $SLEEP_PID # Pause it
ps -o pid,stat,cmd -p $SLEEP_PID # Should show T state
kill -CONT $SLEEP_PID # Resume it
kill $SLEEP_PID # Terminate it
# Exercise 5: View IPC resources on your system
ipcs # Show all System V IPC resources (shared memory, semaphores, message queues)
ls /dev/shm/ # POSIX shared memory
Conclusion & Next Steps
Processes are the fundamental unit of execution in Linux. The fork/exec model creates all user processes from PID 1 down. The CFS scheduler shares CPU time fairly while respecting priority. Signals provide asynchronous communication. IPC mechanisms — pipes, Unix sockets, shared memory — enable coordination without breaking process isolation. All of this is the substrate on which containers, web servers, databases, and every other system software runs.