The Storage Problem
Kubernetes was built for stateless workloads — containers that can be created, destroyed, and rescheduled without consequence. But real applications need to persist data: databases store records, applications cache uploads, message queues hold undelivered messages. Without storage abstractions, every pod restart means total data loss.
Ephemeral Containers vs Persistent Data
Every container starts with a fresh copy of its image's filesystem. Any files written during execution exist only in a thin writable layer (provided by a union filesystem such as OverlayFS). This layer is tied to the container's lifecycle: when the container is removed, or replaced after a crash, the layer is discarded with it.
# Demonstrate ephemeral storage — write a file, delete the pod, observe data loss
kubectl run ephemeral-demo --image=busybox --restart=Never -- sh -c "echo 'important data' > /tmp/data.txt && sleep 3600"
# Verify data exists
kubectl exec ephemeral-demo -- cat /tmp/data.txt
# Output: important data
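# Delete and recreate: the data is gone
kubectl delete pod ephemeral-demo
kubectl run ephemeral-demo --image=busybox --restart=Never -- sh -c "cat /tmp/data.txt 2>/dev/null || echo 'FILE NOT FOUND'"
# The command's output goes to the container log, so wait for completion and read it
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/ephemeral-demo --timeout=60s
kubectl logs ephemeral-demo
# Output: FILE NOT FOUND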
When You Need Persistence
Not every workload needs persistent storage. Here's the decision framework:
| Workload Type | Storage Needed? | Examples |
|---|---|---|
| Stateless APIs | No | REST APIs, GraphQL servers, microservices |
| Databases | Yes — Critical | PostgreSQL, MySQL, MongoDB, Redis (with AOF) |
| Message Queues | Yes — Critical | Kafka, RabbitMQ, NATS JetStream |
| File Uploads | Yes | User uploads, media processing pipelines |
| ML Training | Yes | Model checkpoints, training datasets |
| Caching | Maybe | Redis (ephemeral OK), local disk caches |
| Batch Jobs | Maybe | Scratch space for intermediate results |
Volume Types
Kubernetes provides multiple volume types, each with different lifecycle guarantees and use cases. A volume is declared in the pod spec and mounted into one or more containers at specified paths.
emptyDir — Shared Scratch Space
An emptyDir volume is created when a pod is assigned to a node, exists for the lifetime of that pod, and is deleted when the pod is removed. It survives container restarts within the same pod but not pod deletion or rescheduling.
# emptyDir: shared between containers in a pod
apiVersion: v1
kind: Pod
metadata:
  name: shared-scratch
spec:
  containers:
    - name: writer
      image: busybox
      command: ["sh", "-c", "while true; do date >> /scratch/log.txt; sleep 5; done"]
      volumeMounts:
        - name: scratch-volume
          mountPath: /scratch
    - name: reader
      image: busybox
      command: ["sh", "-c", "tail -f /data/log.txt"]
      volumeMounts:
        - name: scratch-volume
          mountPath: /data
  volumes:
    - name: scratch-volume
      emptyDir: {}
      # emptyDir with memory backing (tmpfs) for sensitive data:
      # emptyDir:
      #   medium: Memory
      #   sizeLimit: 128Mi
hostPath — Node Filesystem Access
hostPath volumes expose the node's filesystem to pods, which creates significant security risks. They bypass the container's filesystem isolation, can expose sensitive system files, and are generally discouraged in production. The Baseline Pod Security Standard forbids hostPath volumes outright.
# hostPath: mounts a path from the host node (use sparingly!)
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-demo
spec:
  containers:
    - name: log-reader
      image: busybox
      command: ["sh", "-c", "tail -f /host-logs/syslog"]
      volumeMounts:
        - name: host-logs
          mountPath: /host-logs
          readOnly: true
  volumes:
    - name: host-logs
      hostPath:
        path: /var/log
        type: Directory # Other options: DirectoryOrCreate, File, FileOrCreate, Socket, CharDevice, BlockDevice
configMap & secret Volumes
ConfigMaps and Secrets can be projected as volumes, exposing their keys as files in the container filesystem. This enables applications to read configuration from files without environment variable pollution.
# ConfigMap mounted as files
apiVersion: v1
kind: Pod
metadata:
  name: config-volume-demo
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: config-vol
          mountPath: /etc/app-config
          readOnly: true
        - name: secret-vol
          mountPath: /etc/app-secrets
          readOnly: true
  volumes:
    - name: config-vol
      configMap:
        name: app-settings
        items: # Optional: project specific keys
          - key: database.conf
            path: db.conf # Mounted as /etc/app-config/db.conf
    - name: secret-vol
      secret:
        secretName: app-credentials
        defaultMode: 0400 # Read-only for owner
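The pod above only starts if the referenced objects already exist in its namespace. As a minimal sketch, the ConfigMap and Secret might be defined as follows. The key names (database.conf, config.yaml, api-key) are the ones used in this article's examples; every value is a hypothetical placeholder.

```yaml
# Hypothetical contents for the objects referenced by the pod spec.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-settings
data:
  database.conf: |
    host=db.example.internal
    port=5432
  config.yaml: |
    logLevel: info
---
apiVersion: v1
kind: Secret
metadata:
  name: app-credentials
type: Opaque
stringData:            # stringData accepts plain text; the API server stores it base64-encoded
  api-key: replace-me  # placeholder, never commit real credentials
```

Each key becomes one file under the mount path, so database.conf (remapped by the items list) appears as /etc/app-config/db.conf, and api-key appears as /etc/app-secrets/api-key.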
Projected & DownwardAPI Volumes
Projected volumes combine multiple sources (configMap, secret, downwardAPI, serviceAccountToken) into a single mount point. The Downward API exposes pod and container metadata as files.
# Projected volume: multiple sources in one mount
apiVersion: v1
kind: Pod
metadata:
  name: projected-demo
  labels:
    app: myapp
    version: v2
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "ls -la /etc/pod-info/ && cat /etc/pod-info/labels && sleep 3600"]
      volumeMounts:
        - name: pod-info
          mountPath: /etc/pod-info
  volumes:
    - name: pod-info
      projected:
        sources:
          - downwardAPI:
              items:
                - path: labels
                  fieldRef:
                    fieldPath: metadata.labels
                - path: cpu-limit
                  resourceFieldRef:
                    containerName: app
                    resource: limits.cpu
          - configMap:
              name: app-settings
              items:
                - key: config.yaml
                  path: config.yaml
          - secret:
              name: app-credentials
              items:
                - key: api-key
                  path: api-key
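The serviceAccountToken source is listed among the projected sources but not shown in the manifest above. The sketch below illustrates it on its own; the pod name, mount path, and audience value are assumptions chosen for illustration.

```yaml
# Sketch: a short-lived, audience-bound service account token as a projected source.
apiVersion: v1
kind: Pod
metadata:
  name: projected-token-demo          # hypothetical name
spec:
  serviceAccountName: default
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "ls -la /var/run/secrets/tokens && sleep 3600"]
      volumeMounts:
        - name: api-token
          mountPath: /var/run/secrets/tokens   # hypothetical mount path
          readOnly: true
  volumes:
    - name: api-token
      projected:
        sources:
          - serviceAccountToken:
              path: token                  # file name under the mount path
              audience: example-audience   # hypothetical intended recipient of the token
              expirationSeconds: 3600      # kubelet refreshes the file before it expires
```

Because the kubelet rotates the token file in place, an application using it should re-read the file rather than cache its contents indefinitely.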
Persistent Volumes & Claims
The Persistent Volume (PV) subsystem provides an API for provisioning and consuming durable storage. It separates storage provisioning (admin concern) from storage consumption (developer concern) through a binding model.
PV Lifecycle
stateDiagram-v2
    [*] --> Available: PV Created (Static or Dynamic)
    Available --> Bound: PVC claims PV (matching capacity & access mode)
    Bound --> Released: PVC deleted (pod no longer needs storage)
    Released --> Available: Recycle policy (deprecated)
    Released --> Deleted: Delete policy (underlying storage removed)
    Released --> Available: Admin manually clears claimRef (Retain)
    Bound --> Bound: Pod using PVC (data read/write)
# Static PV: admin manually provisions
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs-data
  labels:
    type: nfs
    environment: production
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.0.1.50
    path: /exports/data
  mountOptions:
    - hard
    - nfsvers=4.1
---
# PVC: developer requests storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
  namespace: production
spec:
  storageClassName: "" # Empty string: skip any default StorageClass and bind only to a classless PV
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
  selector:
    matchLabels:
      type: nfs
      environment: production
# Inspect PV/PVC binding
kubectl get pv
# NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS
# pv-nfs-data 100Gi RWX Retain Bound production/data-claim
kubectl get pvc -n production
# NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS
# data-claim Bound pv-nfs-data 100Gi RWX
# Describe PV for detailed binding info
kubectl describe pv pv-nfs-data | grep -A5 "Claim"
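Once the claim is Bound, a pod consumes it by name. The following is a minimal sketch; the pod name and the command are assumptions for illustration, while data-claim and the production namespace come from the manifests above.

```yaml
# Sketch: a pod mounting the bound PVC.
apiVersion: v1
kind: Pod
metadata:
  name: data-consumer          # hypothetical name
  namespace: production        # must be the same namespace as the PVC
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "date >> /data/heartbeat.txt && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-claim  # the pod references the claim, never the PV directly
```

The pod never mentions the NFS server or export path. Those details stay in the PV, which is the separation of admin and developer concerns described at the start of this section.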
Access Modes
| Mode | Abbreviation | Description | Use Case |
|---|---|---|---|
| ReadWriteOnce | RWO | Read-write by a single node | Databases, single-instance apps |
| ReadOnlyMany | ROX | Read-only by many nodes | Shared config, static assets |
| ReadWriteMany | RWX | Read-write by many nodes | Shared uploads, CMS, NFS workloads |
| ReadWriteOncePod | RWOP | Read-write by a single pod (K8s 1.27+) | Strict single-writer guarantee |
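ReadWriteOnce restricts access to a single node, not a single pod, so several pods on that node can mount the volume at the same time. Use ReadWriteOncePod (RWOP) when you need true single-pod exclusive access.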
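Requesting single-pod access only changes the access mode on the claim. The sketch below assumes a CSI-backed StorageClass (here the fast-ssd class defined later in this article), since ReadWriteOncePod is only supported for CSI volumes; the claim name and size are hypothetical.

```yaml
# Sketch: a claim that at most one pod in the cluster may use at a time.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: single-writer-claim    # hypothetical name
  namespace: production
spec:
  accessModes:
    - ReadWriteOncePod         # cannot be combined with other access modes on the same claim
  storageClassName: fast-ssd   # assumes a CSI driver that supports this mode
  resources:
    requests:
      storage: 10Gi
```

A second pod that references this claim stays Pending until the first pod is gone, which is the behavior you want for a strict single-writer workload.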
Reclaim Policies
| Policy | Behavior When PVC Deleted | Data Safety | Use When |
|---|---|---|---|
| Retain | PV moves to Released; data preserved; admin must manually clean up | Safe | Production databases, compliance requirements |
| Delete | PV and underlying storage asset (e.g., EBS volume) deleted | Risky | Dynamically provisioned volumes in dev/test |
| Recycle | Basic scrub (rm -rf /volume/*) then PV made Available again | Deprecated | Legacy; use dynamic provisioning instead |
StorageClasses & Dynamic Provisioning
Static PV provisioning doesn't scale — admins can't pre-create volumes for every possible PVC. StorageClasses enable dynamic provisioning: when a PVC references a StorageClass, Kubernetes automatically creates a matching PV from the underlying storage backend.
How Dynamic Provisioning Works
sequenceDiagram
    participant Dev as Developer
    participant API as API Server
    participant SC as StorageClass Controller
    participant Prov as Provisioner (CSI Driver)
    participant Cloud as Cloud Provider (AWS/GCP/Azure)
    Dev->>API: Create PVC (storageClassName: fast-ssd)
    API->>SC: PVC pending, no matching PV
    SC->>Prov: Provision volume (100Gi, gp3, encrypted)
    Prov->>Cloud: Create EBS volume (API call)
    Cloud-->>Prov: Volume ID: vol-0abc123
    Prov->>API: Create PV object (bound to PVC)
    API-->>Dev: PVC Bound ✓
    Note over Dev,Cloud: Pod can now mount PVC
# StorageClass: defines HOW storage is provisioned
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "5000"
  throughput: "250" # MB/s
  encrypted: "true"
  kmsKeyId: "arn:aws:kms:us-east-1:123456789:key/abc-123"
reclaimPolicy: Delete
allowVolumeExpansion: true # Allow PVC resize
volumeBindingMode: WaitForFirstConsumer
mountOptions:
  - noatime
  - nodiratime
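A claim against this class might look like the sketch below. The name postgres-data and the 100Gi size are borrowed from the snapshot example later in this article; treat them as illustrative rather than required.

```yaml
# Sketch: a PVC that triggers dynamic provisioning through the fast-ssd class.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd   # selects the StorageClass, and through it the provisioner
  resources:
    requests:
      storage: 100Gi
```

Because the class uses WaitForFirstConsumer, this claim is expected to stay Pending until a pod that mounts it has been scheduled. Only then is the volume created, in the zone of the chosen node.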
Provisioners & Parameters
| Provider | Provisioner | Access Modes | Max IOPS | RWX Support |
|---|---|---|---|---|
| AWS EBS (gp3) | ebs.csi.aws.com | RWO | 16,000 | No |
| AWS EFS | efs.csi.aws.com | RWX, ROX | Elastic | Yes |
| GCP Persistent Disk | pd.csi.storage.gke.io | RWO, ROX | 100,000 | No (ROX only) |
| Azure Disk | disk.csi.azure.com | RWO | 160,000 | No |
| Azure Files | file.csi.azure.com | RWX, ROX | 100,000 | Yes |
| Ceph RBD | rbd.csi.ceph.com | RWO, ROX | Hardware-dependent | No |
| CephFS | cephfs.csi.ceph.com | RWX, ROX | Hardware-dependent | Yes |
| NFS (external) | nfs.csi.k8s.io | RWX, ROX | Network-limited | Yes |
# GCP PD StorageClass: SSD with regional replication
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: regional-ssd
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd
  replication-type: regional-pd
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
  - matchLabelExpressions:
      - key: topology.gke.io/zone
        values:
          - us-central1-a
          - us-central1-b
---
# Azure Disk StorageClass: Premium SSD v2
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: premium-ssd-v2
provisioner: disk.csi.azure.com
parameters:
  skuName: PremiumV2_LRS
  DiskIOPSReadWrite: "5000"
  DiskMBpsReadWrite: "200"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
Volume Binding Modes
| Mode | Behavior | When to Use |
|---|---|---|
| Immediate | PV created and bound as soon as PVC is created, before pod scheduling | Storage accessible from all zones; simple setups |
| WaitForFirstConsumer | PV creation delayed until a pod using the PVC is scheduled | Zone-aware storage (EBS, PD); topology constraints; multi-AZ clusters |
Use WaitForFirstConsumer for cloud block storage. With Immediate binding, the volume may be provisioned in zone-a before the scheduler has placed the pod; if the pod can only run in zone-b, it stays Pending, because a zonal block device cannot be attached across zones.
Container Storage Interface (CSI)
Before CSI, storage drivers were compiled directly into the Kubernetes binary ("in-tree" plugins). This forced storage vendors to contribute code to the Kubernetes core repo and align with the Kubernetes release cycle. CSI decouples storage plugins from the core, allowing independent development and deployment.
CSI Architecture
flowchart TD
    subgraph Control Plane
        API[API Server]
        SC[StorageClass]
        EC["External Controller: csi-provisioner, csi-attacher, csi-snapshotter"]
    end
    subgraph CSI Driver
        CP["Controller Plugin: CreateVolume, DeleteVolume, ControllerPublish"]
        NP["Node Plugin: NodeStage, NodePublish, NodeGetInfo"]
    end
    subgraph Worker Node
        Kubelet[kubelet]
        Pod["Pod with mounted volume"]
        Device["Block Device /dev/xvdf"]
    end
    subgraph Storage Backend
        Cloud["Cloud API: AWS/GCP/Azure/Ceph"]
        Disk["Physical Storage"]
    end
    API --> EC
    EC --> CP
    CP --> Cloud
    Cloud --> Disk
    Kubelet --> NP
    NP --> Device
    Device --> Pod
    SC -.->|references| CP
- Controller Plugin — Runs as a Deployment; handles volume lifecycle (create, delete, attach, detach, snapshot)
- Node Plugin — Runs as a DaemonSet on every node; handles mount/unmount, filesystem format, device staging
- Sidecar Containers — Kubernetes-maintained helpers (csi-provisioner, csi-attacher, csi-resizer, csi-snapshotter) that bridge Kubernetes API events to CSI RPC calls
CSI Driver Examples
# List installed CSI drivers
kubectl get csidriver
# NAME ATTACHREQUIRED PODINFOONMOUNT STORAGECAPACITY
# ebs.csi.aws.com true false false
# efs.csi.aws.com false false false
# Check CSI driver pods (typically DaemonSet + Deployment)
kubectl get pods -n kube-system -l app=ebs-csi-controller
kubectl get pods -n kube-system -l app=ebs-csi-node
# Verify a CSI node plugin is running on all nodes
kubectl get csinodes
# NAME DRIVERS AGE
# node-1 1 30d
# node-2 1 30d
# node-3 1 30d
# Install AWS EBS CSI driver via Helm
helm repo add aws-ebs-csi-driver https://kubernetes-sigs.github.io/aws-ebs-csi-driver
helm install aws-ebs-csi-driver aws-ebs-csi-driver/aws-ebs-csi-driver \
--namespace kube-system \
--set controller.serviceAccount.annotations."eks\.amazonaws\.com/role-arn"="arn:aws:iam::123456789:role/ebs-csi-role"
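The columns printed by kubectl get csidriver correspond to fields on a CSIDriver object, which the driver's installer (such as the Helm chart above) normally creates for you. The sketch below shows only the three fields that appear in that output, with values taken from the ebs.csi.aws.com row; the manifest a real driver ships may set additional fields.

```yaml
# Sketch: the CSIDriver object behind the `kubectl get csidriver` row for EBS.
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: ebs.csi.aws.com      # must match the provisioner name used in StorageClasses
spec:
  attachRequired: true       # volumes need a controller-side attach step before mounting
  podInfoOnMount: false      # kubelet does not pass pod metadata to the node plugin on mount
  storageCapacity: false     # driver does not publish capacity information for scheduling
```

Compare this with the efs.csi.aws.com row, where ATTACHREQUIRED is false: a network filesystem is mounted directly by the node plugin, so there is no attach step.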
Volume Snapshots
CSI enables volume snapshots — point-in-time copies of persistent volumes that can be used for backups, migrations, or cloning.
# VolumeSnapshotClass: defines snapshot provider
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ebs-snapshot-class
driver: ebs.csi.aws.com
deletionPolicy: Retain
---
# Take a snapshot of an existing PVC
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: db-snapshot-2026-05-14
  namespace: production
spec:
  volumeSnapshotClassName: ebs-snapshot-class
  source:
    persistentVolumeClaimName: postgres-data
---
# Restore from snapshot: create new PVC from snapshot
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data-restored
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: fast-ssd
  dataSource:
    name: db-snapshot-2026-05-14
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
# Check snapshot status
kubectl get volumesnapshot -n production
# NAME READYTOUSE SOURCEPVC RESTORESIZE SNAPSHOTCLASS AGE
# db-snapshot-2026-05-14 true postgres-data 100Gi ebs-snapshot-class 2m
# List snapshot contents (the actual storage-level snapshot reference)
kubectl get volumesnapshotcontent
# NAME READYTOUSE RESTORESIZE DELETIONPOLICY DRIVER VOLUMESNAPSHOTCLASS
# snapcontent-abc123 true 107374182400 Retain ebs.csi.aws.com ebs-snapshot-class
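Cloning, mentioned at the start of this section, skips the intermediate snapshot by naming an existing PVC as the data source. This is a sketch under one important assumption: the CSI driver behind the StorageClass must implement volume cloning, which not every driver does, so check your driver's documentation first. The clone name is hypothetical.

```yaml
# Sketch: clone an existing PVC directly (requires CSI driver support for cloning).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data-clone    # hypothetical name
  namespace: production        # a clone must live in the same namespace as its source
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd   # assumption: this class's driver supports cloning
  resources:
    requests:
      storage: 100Gi           # must be at least as large as the source volume
  dataSource:
    kind: PersistentVolumeClaim
    name: postgres-data
```

Like a snapshot, a clone taken from a running database is only crash-consistent, so the same caveats apply when using it as a backup.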
StatefulSets & Storage
StatefulSets are the Kubernetes workload controller designed for stateful applications. Unlike Deployments (where pods are interchangeable), StatefulSet pods have stable identities, ordered lifecycle operations, and — critically — dedicated persistent storage via volumeClaimTemplates.
volumeClaimTemplates — Per-Pod PVCs
Each pod in a StatefulSet gets its own PVC for every volumeClaimTemplate, named {volumeClaimTemplate.name}-{statefulset.name}-{ordinal}. When a pod is rescheduled to a different node, it reattaches to the same PVC. When a StatefulSet is scaled down, the PVCs are not deleted by default, which preserves data for a later scale-up.
# StatefulSet with volumeClaimTemplates: each pod gets dedicated storage
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: production
spec:
  serviceName: postgres-headless
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: postgres
          image: postgres:16
          ports:
            - containerPort: 5432
              name: postgres
          env:
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
            - name: wal
              mountPath: /var/lib/postgresql/wal
          resources:
            requests:
              memory: "2Gi"
              cpu: "1000m"
            limits:
              memory: "4Gi"
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values: ["postgres"]
              topologyKey: topology.kubernetes.io/zone
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 100Gi
    - metadata:
        name: wal
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 20Gi
# Observe per-pod PVCs created by StatefulSet
kubectl get pvc -n production -l app=postgres
# NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS
# data-postgres-0 Bound pvc-abc123 100Gi RWO fast-ssd
# data-postgres-1 Bound pvc-def456 100Gi RWO fast-ssd
# data-postgres-2 Bound pvc-ghi789 100Gi RWO fast-ssd
# wal-postgres-0 Bound pvc-jkl012 20Gi RWO fast-ssd
# wal-postgres-1 Bound pvc-mno345 20Gi RWO fast-ssd
# wal-postgres-2 Bound pvc-pqr678 20Gi RWO fast-ssd
# Scale down — PVCs are preserved
kubectl scale statefulset postgres -n production --replicas=2
kubectl get pvc -n production -l app=postgres
# data-postgres-2 and wal-postgres-2 still exist (not deleted!)
# Scale back up — pod reattaches to existing PVCs
kubectl scale statefulset postgres -n production --replicas=3
# postgres-2 pod gets its original data-postgres-2 and wal-postgres-2 PVCs back
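Keeping PVCs after scale-down is the default, but it can be tuned. Recent Kubernetes releases expose a persistentVolumeClaimRetentionPolicy field on the StatefulSet spec (check that your cluster version supports it before relying on it). The fragment below is a sketch showing only that field; the rest of the postgres StatefulSet spec is assumed to be unchanged.

```yaml
# Sketch: control what happens to per-pod PVCs on scale-down and on StatefulSet deletion.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: production
spec:
  persistentVolumeClaimRetentionPolicy:
    whenScaled: Retain    # keep PVCs of removed replicas (the default behavior)
    whenDeleted: Delete   # remove PVCs when the StatefulSet itself is deleted
  # serviceName, replicas, selector, template, and volumeClaimTemplates as shown earlier
```

Setting whenDeleted to Delete trades safety for tidiness. Combined with a StorageClass whose reclaimPolicy is Delete, removing the StatefulSet also removes the underlying volumes, so keep Retain for anything you cannot recreate.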
Ordered Attach/Detach
StatefulSets guarantee ordering for volume operations:
- Scale up: Pods are created in order (0, 1, 2). Each pod's PVC is bound before the pod starts. Pod N+1 doesn't start until Pod N is Running and Ready.
- Scale down: Pods are terminated in reverse order (2, 1, 0). Each pod must fully terminate (and its volume cleanly unmounted) before the next pod is stopped.
- Updates: Rolling updates proceed in reverse ordinal order. The old pod's volume is unmounted, the new pod re-mounts the same PVC.
Never force-delete StatefulSet pods (kubectl delete pod --force --grace-period=0) unless you're certain the pod is truly dead. Force-deleting a pod that still has an active volume mount can cause split-brain scenarios where two pods write to the same volume simultaneously, corrupting data.
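The ordering rules above map onto two fields of the StatefulSet spec. The fragment below is a sketch that makes the defaults explicit and adds a partition, a value chosen here purely for illustration, to stage a rolling update.

```yaml
# Sketch: the spec fields that govern ordering during scaling and updates.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: production
spec:
  podManagementPolicy: OrderedReady   # default; Parallel creates and deletes pods without waiting
  updateStrategy:
    type: RollingUpdate               # default; updates pods in reverse ordinal order
    rollingUpdate:
      partition: 2                    # illustrative: only pods with ordinal >= 2 are updated
  # serviceName, replicas, selector, template, and volumeClaimTemplates as shown earlier
```

With three replicas and partition set to 2, only postgres-2 receives the new pod template. Lowering the partition step by step rolls the change out to postgres-1 and then postgres-0. Note that podManagementPolicy affects scaling operations only, not rolling updates.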
Headless Services & Stable DNS
StatefulSets require a headless Service (clusterIP: None) for stable DNS names. Each pod gets a predictable DNS entry: {pod-name}.{service-name}.{namespace}.svc.cluster.local.
# Headless Service for StatefulSet DNS
apiVersion: v1
kind: Service
metadata:
  name: postgres-headless
  namespace: production
spec:
  clusterIP: None # Headless: no virtual IP
  selector:
    app: postgres
  ports:
    - port: 5432
      targetPort: 5432
      name: postgres
# Stable DNS entries for StatefulSet pods
# postgres-0.postgres-headless.production.svc.cluster.local
# postgres-1.postgres-headless.production.svc.cluster.local
# postgres-2.postgres-headless.production.svc.cluster.local
# Verify DNS resolution from within the cluster
kubectl run dns-test --image=busybox --rm -it -- nslookup postgres-0.postgres-headless.production.svc.cluster.local
# Address: 10.244.2.15
# Applications connect to specific replicas by DNS name:
# Primary: postgres-0.postgres-headless.production.svc.cluster.local:5432
# Replica: postgres-1.postgres-headless.production.svc.cluster.local:5432
Storage Best Practices
Backup Strategies
| Strategy | Tool | Pros | Cons |
|---|---|---|---|
| Volume Snapshots | CSI VolumeSnapshot API | Fast, storage-native, minimal CPU overhead | Crash-consistent only (not application-consistent) |
| Application Backup | pg_dump, mysqldump, mongodump | Application-consistent, portable, well-tested | CPU/memory overhead, longer backup time |
| Velero | VMware Velero | Full cluster backup (resources + volumes), DR migration | Complex setup, large storage requirements |
| Continuous Replication | WAL-G, Litestream, Debezium | Near-zero RPO, point-in-time recovery | Complex, requires application support |
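As an illustration of the application-backup row, a scheduled pg_dump could be run as a CronJob. This is a sketch under several assumptions: the Secret postgres-credentials, the PVC postgres-backups, the postgres user, and the postgres database name are all hypothetical, and the host is the stable DNS name of the first StatefulSet pod from the previous section.

```yaml
# Sketch: nightly logical backup with pg_dump, written to a separate backup volume.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-logical-backup        # hypothetical name
  namespace: production
spec:
  schedule: "0 2 * * *"                # every day at 02:00
  concurrencyPolicy: Forbid            # never run two backups at once
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: pg-dump
              image: postgres:16       # client version should match the server's major version
              command: ["sh", "-c"]
              args:
                - |
                  pg_dump -h postgres-0.postgres-headless.production.svc.cluster.local \
                    -U postgres -Fc -f /backup/db-`date +%F`.dump postgres
              env:
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: postgres-credentials   # hypothetical Secret
                      key: password
              volumeMounts:
                - name: backup
                  mountPath: /backup
          volumes:
            - name: backup
              persistentVolumeClaim:
                claimName: postgres-backups        # hypothetical PVC, separate from the data volume
```

The dump lands on a different volume than the database itself, so losing the data volume does not also lose the backup. In practice you would also copy the dumps off-cluster and prune old files.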
Capacity Planning & Multi-AZ
Storage capacity planning in Kubernetes requires understanding both the storage backend limits and the pod scheduling topology.
| Consideration | Recommendation |
|---|---|
| Initial sizing | Provision 2-3× expected usage; use allowVolumeExpansion: true for growth |
| IOPS vs throughput | Databases need IOPS (gp3/io2); analytics needs throughput (st1/sc1); choose based on I/O pattern |
| Multi-AZ | Use WaitForFirstConsumer + topology constraints; consider regional PDs for HA |
| Monitoring | Alert on PVC usage >80%; use kubelet_volume_stats_used_bytes metric |
| Encryption | Always enable encryption at rest via StorageClass parameters; use KMS for key management |
| Cleanup | Audit orphaned PVs/PVCs regularly; automate with Retain + cronjob cleanup scripts |
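The monitoring recommendation can be expressed as a Prometheus alerting rule. The sketch below assumes Prometheus is already scraping kubelet metrics, and it pairs kubelet_volume_stats_used_bytes with the companion metric kubelet_volume_stats_capacity_bytes; the group name, alert name, severity label, and 10-minute window are illustrative choices.

```yaml
# Sketch: Prometheus rule file that fires when a PVC is more than 80% full.
groups:
  - name: pvc-capacity                             # hypothetical group name
    rules:
      - alert: PersistentVolumeClaimAlmostFull     # hypothetical alert name
        expr: |
          kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes > 0.80
        for: 10m                                   # ignore short spikes
        labels:
          severity: warning
        annotations:
          summary: "PVC {{ $labels.namespace }}/{{ $labels.persistentvolumeclaim }} is over 80% full"
```

These metrics are reported only for volumes that are currently mounted by a running pod, so an unused PVC produces no series and no alert.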
# Monitor PVC usage with kubectl
kubectl get pvc -A -o custom-columns=\
NAMESPACE:.metadata.namespace,\
NAME:.metadata.name,\
STATUS:.status.phase,\
CAPACITY:.status.capacity.storage,\
STORAGECLASS:.spec.storageClassName
# Find orphaned PVs (Released state: not bound to any PVC), filtering client-side on status.phase
kubectl get pv -o jsonpath='{range .items[?(@.status.phase=="Released")]}{.metadata.name}{"\n"}{end}'
# Check actual disk usage inside a pod
kubectl exec postgres-0 -n production -- df -h /var/lib/postgresql/data
# Filesystem Size Used Avail Use% Mounted on
# /dev/xvdf 100G 45G 55G 45% /var/lib/postgresql/data
# Expand a PVC (StorageClass must have allowVolumeExpansion: true)
kubectl patch pvc data-postgres-0 -n production -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'
# Note: Some CSI drivers require pod restart for filesystem resize
Exercises
Objective: Create a static NFS PersistentVolume, bind it to a PVC, and mount it in a pod.
- Create a PV with 10Gi capacity, ReadWriteMany access mode, and Retain reclaim policy pointing to an NFS share
- Create a PVC requesting 5Gi with a label selector matching the PV
- Deploy a pod that writes a timestamp file every 5 seconds to the mounted volume
- Delete the pod, recreate it, and verify the timestamps persisted
- Delete the PVC and observe the PV's status change to Released
Objective: Configure dynamic provisioning and observe automatic PV creation.
- Create two StorageClasses: standard (HDD, Delete policy) and premium (SSD, Retain policy)
- Create PVCs using each StorageClass and observe PVs being dynamically created
- Deploy pods using each PVC and verify mount points
- Delete the standard PVC and verify the underlying volume is deleted
- Delete the premium PVC and verify the PV is retained
Objective: Deploy a StatefulSet with volumeClaimTemplates and verify per-pod storage persistence.
- Create a 3-replica StatefulSet with a volumeClaimTemplate (10Gi each)
- Write unique data to each pod's volume (pod identity as content)
- Scale down to 1 replica and verify PVCs for pods 1 and 2 still exist
- Scale back to 3 replicas and verify each pod reattaches to its original data
- Delete pod-1 and observe it automatically recovering with the same PVC
Objective: Create volume snapshots and restore data from them.
- Deploy a pod writing incrementing counters to a PVC
- Create a VolumeSnapshotClass for your CSI driver
- Take a snapshot when counter reaches 100
- Continue writing until counter reaches 200
- Create a new PVC from the snapshot and mount it — verify it contains data only up to 100
- Compare the original PVC (counter=200) with the restored PVC (counter=100)
Conclusion
Kubernetes storage transforms the ephemeral nature of containers into a platform capable of running the most demanding stateful workloads. The layered abstraction — Volumes for pod-level storage, PVs/PVCs for persistent lifecycle management, StorageClasses for automated provisioning, and CSI for pluggable backend support — gives operators fine-grained control while keeping developers focused on their applications.
The key principles to carry forward:
- Match storage to workload: emptyDir for scratch, PVCs for persistence, RWX for shared access
- Always use WaitForFirstConsumer in multi-AZ environments to prevent zone mismatches
- StatefulSets + volumeClaimTemplates = the correct pattern for databases and stateful services
- Backup is not optional: snapshots, application dumps, and cross-region replication form your safety net
- CSI is the future: in-tree plugins are deprecated; migrate to CSI drivers for all storage backends
In Part 11, we'll explore Kubernetes Internals — the control loop architecture, etcd's role as the source of truth, how the scheduler makes placement decisions, and the admission controller pipeline that gates every API request.