The Container Revolution
Containers represent the most significant shift in application packaging and deployment since virtual machines. By providing lightweight, portable, and reproducible environments, containers have fundamentally changed how we build, ship, and run software at scale.
Containers vs Virtual Machines
Understanding the architectural difference between containers and VMs is fundamental to appreciating why containers have become the dominant deployment model:
flowchart LR
subgraph VM["Virtual Machine Stack"]
direction TB
HW1[Physical Hardware]
HV[Hypervisor]
G1[Guest OS 1]
G2[Guest OS 2]
A1[App A + Libs]
A2[App B + Libs]
HW1 --> HV
HV --> G1
HV --> G2
G1 --> A1
G2 --> A2
end
subgraph CT["Container Stack"]
direction TB
HW2[Physical Hardware]
HOS[Host OS]
CR[Container Runtime]
C1[App A + Libs]
C2[App B + Libs]
C3[App C + Libs]
HW2 --> HOS
HOS --> CR
CR --> C1
CR --> C2
CR --> C3
end
| Aspect | Virtual Machines | Containers |
|---|---|---|
| Isolation | Full hardware virtualization | OS-level process isolation |
| Size | GBs (includes full OS) | MBs (shares host kernel) |
| Startup Time | Minutes | Seconds (or less) |
| Density | 10-20 per host | 100s-1000s per host |
| Resource Overhead | High (dedicated OS per VM) | Minimal (shared kernel) |
| Portability | Hypervisor-dependent | Any host running a matching OS kernel (Linux or Windows) |
| Security | Strong isolation (separate kernels) | Shared kernel (namespace isolation) |
| Use Case | Multi-tenant, different OS needs | Microservices, CI/CD, scaling |
Container Ecosystem Overview
flowchart TD
DEV[Developer] --> DF[Dockerfile]
DF --> BUILD[docker build]
BUILD --> IMG[Container Image]
IMG --> REG["Registry<br/>Docker Hub / ECR / ACR / GCR"]
REG --> PULL[docker pull]
PULL --> RUN[Container Runtime]
RUN --> ORCH["Orchestrator<br/>Kubernetes / Swarm / ECS"]
ORCH --> PROD[Production Workloads]
The container ecosystem spans development, building, distribution, and orchestration — each layer with purpose-built tools and standards (OCI specifications) ensuring interoperability.
Docker Fundamentals
Docker remains the most widely used container platform. Understanding its architecture and tooling is essential for working with containers in any environment.
flowchart LR
CLI["Docker CLI<br/>docker build/run/push"] -->|REST API| DAEMON["Docker Daemon<br/>dockerd"]
DAEMON --> IMAGES[Images]
DAEMON --> CONTAINERS[Containers]
DAEMON --> NETWORKS[Networks]
DAEMON --> VOLUMES[Volumes]
DAEMON -->|pull/push| REGISTRY[Container Registry]
Dockerfile Anatomy
A Dockerfile is a text file containing instructions to build a container image. Each instruction creates a layer in the image:
# syntax=docker/dockerfile:1
# Base image - always start with FROM
FROM node:20-alpine
# Set metadata
LABEL maintainer="dev@example.com"
LABEL version="1.0"
# Set working directory inside container
WORKDIR /app
# Copy dependency files first (layer caching optimization)
COPY package.json package-lock.json ./
# Install dependencies
RUN npm ci --omit=dev
# Copy application source code
COPY src/ ./src/
COPY public/ ./public/
# Create non-root user for security
RUN addgroup -g 1001 appuser && \
adduser -u 1001 -G appuser -s /bin/sh -D appuser
USER appuser
# Document the port the app uses
EXPOSE 3000
# Health check
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1
# Define the command to run
CMD ["node", "src/server.js"]
CMD provides the default command or arguments; anything passed after the image name on docker run replaces it. ENTRYPOINT defines the executable that always runs unless it is explicitly overridden with --entrypoint. Combine them: ENTRYPOINT ["node"] with CMD ["server.js"] lets users override just the script name.
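The interaction between ENTRYPOINT, CMD, and runtime arguments can be sketched as a small function. This is a simplified model of the exec form only; the function name `effective_command` is hypothetical, and the `--entrypoint` override is deliberately not modelled.

```python
from typing import List, Optional


def effective_command(
    entrypoint: Optional[List[str]],
    cmd: Optional[List[str]],
    run_args: Optional[List[str]] = None,
) -> List[str]:
    """Return the argv a container would start with (exec form only).

    Arguments given after the image name on `docker run` replace CMD,
    while ENTRYPOINT stays in place.
    """
    args = run_args if run_args else (cmd or [])
    return (entrypoint or []) + list(args)


# Image built with ENTRYPOINT ["node"] and CMD ["server.js"]
print(effective_command(["node"], ["server.js"]))                 # default start
print(effective_command(["node"], ["server.js"], ["worker.js"]))  # CMD replaced
```

Here `worker.js` is an illustrative script name; the point is that only the CMD portion changes while `node` remains the executable.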
Building and Running Containers
# Build an image from a Dockerfile
docker build -t myapp:1.0 .
# Build with a specific Dockerfile and build context
docker build -f Dockerfile.prod -t myapp:1.0-prod ./app
# Run a container in detached mode with port mapping
docker run -d \
--name my-web-app \
-p 8080:3000 \
-e NODE_ENV=production \
-v app-data:/app/data \
--restart unless-stopped \
myapp:1.0
# View running containers
docker ps
# View container logs (follow mode)
docker logs -f my-web-app
# Execute a command inside a running container
docker exec -it my-web-app /bin/sh
# Stop and remove a container
docker stop my-web-app
docker rm my-web-app
# Remove all stopped containers
docker container prune
Container Lifecycle
# Full lifecycle commands
docker create --name app myapp:1.0 # Create (not started)
docker start app # Start a created/stopped container
docker pause app # Pause all processes
docker unpause app # Resume processes
docker stop app # Graceful shutdown (SIGTERM)
docker kill app # Force kill (SIGKILL)
docker restart app # Stop + start
docker rm app # Remove container
docker rm -f app # Force remove (even if running)
# Inspect container details
docker inspect app
docker stats app # Live resource usage
docker top app # Running processes
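The lifecycle commands above can be read as a small state machine. The sketch below is a deliberately simplified model (state names follow what `docker ps -a` reports, but it does not cover every transition Docker allows); `next_state` and the transition table are illustrative, not part of any Docker API.

```python
# Simplified transition table: (current state, command) -> next state.
TRANSITIONS = {
    ("created", "start"): "running",
    ("exited", "start"): "running",
    ("running", "pause"): "paused",
    ("paused", "unpause"): "running",
    ("running", "stop"): "exited",
    ("running", "kill"): "exited",
    ("running", "restart"): "running",
    ("created", "rm"): "removed",
    ("exited", "rm"): "removed",
}


def next_state(state: str, command: str) -> str:
    """Return the state after running a docker lifecycle command."""
    if command == "rm -f":
        # Force removal works regardless of the current state.
        return "removed"
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(f"'{command}' is not valid in this model from state '{state}'")


state = "created"
for command in ["start", "pause", "unpause", "stop", "rm"]:
    state = next_state(state, command)
    print(f"{command:8} -> {state}")
```

Note that plain `rm` is rejected for a running container in this model, which mirrors why the `-f` variant exists.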
Multi-Stage Builds
Multi-stage builds dramatically reduce image size by separating build dependencies from the runtime image:
# Stage 1: Build
FROM golang:1.22-alpine AS builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o /app/server ./cmd/server
# Stage 2: Runtime (minimal image)
FROM alpine:3.19
# Install ca-certificates for HTTPS calls
RUN apk --no-cache add ca-certificates
# Create non-root user
RUN adduser -D -u 1001 appuser
USER appuser
WORKDIR /app
COPY --from=builder /app/server .
EXPOSE 8080
ENTRYPOINT ["./server"]
Dockerfile Best Practices
# ✅ GOOD: Use specific tags, not :latest
FROM node:20.11-alpine3.19
# ✅ GOOD: Combine RUN commands to reduce layers (Debian-based image shown; on Alpine use apk add --no-cache curl)
RUN apt-get update && \
apt-get install -y --no-install-recommends curl && \
rm -rf /var/lib/apt/lists/*
# ✅ GOOD: Copy dependency files before source (caching)
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
COPY . .
# ✅ GOOD: Use .dockerignore to exclude unnecessary files
# .dockerignore content:
# node_modules
# .git
# *.md
# .env
# tests/
# ✅ GOOD: Run as non-root
USER 1001
# ❌ BAD: Running as root (security risk)
# ❌ BAD: Using :latest tag (non-reproducible)
# ❌ BAD: COPY . . before installing dependencies (breaks cache)
# ❌ BAD: Storing secrets in image layers
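Why instruction order matters for caching can be illustrated with a toy model. The sketch below assumes each layer's identity depends on its parent layer, the instruction text, and the content of any copied files; it is not Docker's actual cache algorithm, and all function names are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple

Instruction = Tuple[str, List[str]]  # (instruction text, files it copies)


def layer_ids(instructions: List[Instruction], files: Dict[str, str]) -> List[str]:
    """Compute a chained hash per instruction (toy cache key)."""
    ids, parent = [], ""
    for text, copied in instructions:
        digest = hashlib.sha256()
        digest.update(parent.encode())
        digest.update(text.encode())
        for path in sorted(copied):
            digest.update(path.encode())
            digest.update(files[path].encode())
        parent = digest.hexdigest()
        ids.append(parent)
    return ids


def rebuilt_layers(instructions, before, after) -> List[str]:
    """Return the instructions whose cache key changed between two builds."""
    old, new = layer_ids(instructions, before), layer_ids(instructions, after)
    return [text for (text, _), o, n in zip(instructions, old, new) if o != n]


everything = ["package.json", "src/server.js"]
deps_first = [("COPY package.json ./", ["package.json"]),
              ("RUN npm ci", []),
              ("COPY . .", everything)]
source_first = [("COPY . .", everything),
                ("RUN npm ci", [])]

before = {"package.json": "{}", "src/server.js": "v1"}
after = {"package.json": "{}", "src/server.js": "v2"}  # only source changed

print(rebuilt_layers(deps_first, before, after))    # dependency install stays cached
print(rebuilt_layers(source_first, before, after))  # dependency install re-runs
```

In the dependency-first ordering only the final copy is invalidated by a source edit, which is the effect the caching advice above relies on.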
Container Networking
Docker provides multiple network drivers to support different use cases, from isolated development environments to multi-host production deployments.
flowchart TD
subgraph BRIDGE["Bridge Network (default)"]
B1["Container A<br/>172.17.0.2"] <--> BR[docker0 bridge]
B2["Container B<br/>172.17.0.3"] <--> BR
BR <--> HOST1[Host eth0]
end
subgraph CUSTOM["User-Defined Bridge"]
C1["Container C<br/>app-net"] <--> CBR[custom-bridge]
C2["Container D<br/>app-net"] <--> CBR
CBR <--> HOST2[Host eth0]
end
subgraph HOSTNET["Host Network"]
H1["Container E<br/>shares host network stack"]
end
| Network Type | Use Case | Container-to-Container | External Access |
|---|---|---|---|
| bridge (default) | Standalone containers on same host | Via IP only | Port mapping (-p) |
| User-defined bridge | Application stacks needing DNS | Via container name (DNS) | Port mapping (-p) |
| host | Performance-critical, no NAT | Via localhost | Direct (no mapping needed) |
| overlay | Multi-host (Docker Swarm) | Cross-host communication | Via routing mesh |
| none | Complete network isolation | Not possible | Not possible |
Creating Networks and Connecting Containers
# Create a user-defined bridge network
docker network create --driver bridge app-network
# Run containers on the custom network
docker run -d --name api-server \
--network app-network \
-e DB_HOST=postgres-db \
myapi:latest
docker run -d --name postgres-db \
--network app-network \
-e POSTGRES_PASSWORD=secret \
postgres:16-alpine
# Containers can now communicate by name:
# api-server can reach postgres-db at "postgres-db:5432"
# List networks
docker network ls
# Inspect network (shows connected containers)
docker network inspect app-network
# Connect an existing container to a network
docker network connect app-network existing-container
# Disconnect a container from a network
docker network disconnect app-network existing-container
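On the application side, name-based discovery usually means nothing more than reading the hostname from the environment. A minimal sketch of how the API process might consume `DB_HOST` follows; the `db_address` helper and the `DB_PORT` variable are assumptions added for illustration and do not appear in the commands above.

```python
import os
from typing import Mapping, Tuple


def db_address(env: Mapping[str, str] = os.environ) -> Tuple[str, int]:
    """Return the (host, port) the API should connect to.

    DB_HOST matches the -e flag used above; DB_PORT is a hypothetical
    extra variable with the PostgreSQL default as fallback.
    """
    host = env.get("DB_HOST", "localhost")
    port = int(env.get("DB_PORT", "5432"))
    return host, port


# Inside the api-server container DB_HOST is "postgres-db", a name that
# Docker's embedded DNS resolves on the user-defined network.
print(db_address({"DB_HOST": "postgres-db"}))
```

Because the address is only a name, the database container can be replaced or restarted with a new IP without any change to the API configuration.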
Container Volumes & Storage
Containers are ephemeral by design — when a container is removed, its filesystem is gone. Volumes provide persistent storage that survives container lifecycle events.
| Storage Type | Location | Managed By | Use Case |
|---|---|---|---|
| Named Volumes | /var/lib/docker/volumes/ | Docker | Database data, application state |
| Bind Mounts | Any host path | User | Development (live code reload) |
| tmpfs Mounts | Host memory only | Kernel | Secrets, temp files (non-persistent) |
Data Persistence Patterns
# Create a named volume
docker volume create postgres-data
# Run with named volume (recommended for production)
docker run -d --name db \
-v postgres-data:/var/lib/postgresql/data \
-e POSTGRES_PASSWORD=mysecret \
postgres:16-alpine
# Bind mount for development (host directory mapped into container)
docker run -d --name dev-app \
-v $(pwd)/src:/app/src \
-v /app/node_modules \
-p 3000:3000 \
myapp:dev
# tmpfs mount (in-memory, not persisted)
docker run -d --name secure-app \
--tmpfs /app/secrets:rw,size=64m \
myapp:latest
# List volumes
docker volume ls
# Inspect a volume
docker volume inspect postgres-data
# Backup a volume
docker run --rm \
-v postgres-data:/source:ro \
-v $(pwd):/backup \
alpine tar czf /backup/postgres-backup.tar.gz -C /source .
# Remove unused volumes
docker volume prune
Docker Compose
Docker Compose defines and runs multi-container applications using a declarative YAML file. It solves the problem of coordinating multiple containers that form a single application stack.
Complete 3-Tier Application
# docker-compose.yml - Complete 3-tier application
services:
  # Frontend - React application
  frontend:
    build:
      context: ./frontend
      dockerfile: Dockerfile
    ports:
      - "3000:80"
    depends_on:
      api:
        condition: service_healthy
    environment:
      - REACT_APP_API_URL=http://api:8080
    networks:
      - frontend-net
  # Backend API - Node.js
  api:
    build:
      context: ./api
      dockerfile: Dockerfile
    ports:
      - "8080:8080"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=postgresql://appuser:secret@db:5432/myapp
      - REDIS_URL=redis://cache:6379
    depends_on:
      db:
        condition: service_healthy
      cache:
        condition: service_started
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
    networks:
      - frontend-net
      - backend-net
    restart: unless-stopped
  # Database - PostgreSQL
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: myapp
      POSTGRES_USER: appuser
      POSTGRES_PASSWORD: secret
    volumes:
      - postgres-data:/var/lib/postgresql/data
      - ./init-scripts:/docker-entrypoint-initdb.d
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U appuser -d myapp"]
      interval: 5s
      timeout: 3s
      retries: 5
    networks:
      - backend-net
    restart: unless-stopped
  # Cache - Redis
  cache:
    image: redis:7-alpine
    command: redis-server --maxmemory 128mb --maxmemory-policy allkeys-lru
    volumes:
      - redis-data:/data
    networks:
      - backend-net
    restart: unless-stopped
volumes:
  postgres-data:
    driver: local
  redis-data:
    driver: local
networks:
  frontend-net:
    driver: bridge
  backend-net:
    driver: bridge
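The startup order implied by the `depends_on` entries in this file can be derived with a topological sort. The sketch below uses Python's standard `graphlib` (3.9+) and only models ordering; it does not model the `service_healthy` or `service_started` conditions, and `startup_order` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def startup_order(depends_on: Dict[str, Set[str]]) -> List[str]:
    """Return one valid start order in which dependencies come first."""
    return list(TopologicalSorter(depends_on).static_order())


# Each service maps to the services it depends on, as in the Compose file.
services = {
    "frontend": {"api"},
    "api": {"db", "cache"},
    "db": set(),
    "cache": set(),
}

print(startup_order(services))
```

The relative order of `db` and `cache` is not fixed because neither depends on the other; what is guaranteed is that both precede `api`, which precedes `frontend`.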
Essential Compose Commands
# Start all services (build if needed)
docker compose up -d --build
# View running services
docker compose ps
# View logs for all services (follow)
docker compose logs -f
# View logs for specific service
docker compose logs -f api
# Scale a service (remove the fixed "8080:8080" host port mapping first, or the replicas will conflict)
docker compose up -d --scale api=3
# Execute command in a running service
docker compose exec api npm run migrate
# Stop all services (preserve volumes)
docker compose down
# Stop and remove volumes (DESTRUCTIVE)
docker compose down -v
# Rebuild a single service
docker compose build api
docker compose up -d api
# View the processes running in each service
docker compose top
Container Registries
Container registries store and distribute container images. Choosing the right registry depends on your cloud provider, security requirements, and team workflow.
| Registry | Provider | Free Tier | Scanning | Best For |
|---|---|---|---|---|
| Docker Hub | Docker | 1 private repo | Basic | Public images, OSS |
| Amazon ECR | AWS | 500 MB/month | Built-in | AWS workloads |
| Azure ACR | Microsoft | None (paid) | Microsoft Defender | Azure workloads |
| Google Artifact Registry | GCP | 500 MB/month | Built-in | GCP workloads |
| GitHub Container Registry | GitHub | Public unlimited | Dependabot | GitHub Actions CI/CD |
Image Tagging Strategies
# Tag with semantic version
docker tag myapp:latest registry.example.com/myapp:1.2.3
docker tag myapp:latest registry.example.com/myapp:1.2
docker tag myapp:latest registry.example.com/myapp:1
# Tag with git SHA (immutable, traceable)
GIT_SHA=$(git rev-parse --short HEAD)
docker tag myapp:latest registry.example.com/myapp:${GIT_SHA}
# Tag with build metadata
docker tag myapp:latest registry.example.com/myapp:1.2.3-build.456
# Push to registry
docker push registry.example.com/myapp:1.2.3
docker push registry.example.com/myapp:${GIT_SHA}
# Pull from registry
docker pull registry.example.com/myapp:1.2.3
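Generating the full tag set for a release is easy to automate in a build script. The following is a minimal sketch assuming a plain MAJOR.MINOR.PATCH version string; `image_tags` is a hypothetical helper, and the SHA value in the example is made up.

```python
from typing import List, Optional


def image_tags(repository: str, version: str, git_sha: Optional[str] = None) -> List[str]:
    """Expand a semantic version into the image references to push.

    Produces the exact version plus the floating MAJOR.MINOR and MAJOR
    tags, and optionally an immutable tag for the git commit.
    """
    major, minor, patch = version.split(".")
    tags = [f"{major}.{minor}.{patch}", f"{major}.{minor}", major]
    if git_sha:
        tags.append(git_sha)
    return [f"{repository}:{tag}" for tag in tags]


for ref in image_tags("registry.example.com/myapp", "1.2.3", git_sha="a1b2c3d"):
    print(ref)
```

Each printed reference could then be passed to `docker tag` and `docker push` in turn.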
Image Scanning for Vulnerabilities
# Scan with Docker Scout (built-in)
docker scout cves myapp:latest
# Scan with Trivy (open-source, comprehensive)
trivy image myapp:latest
# Scan with Trivy - fail on HIGH/CRITICAL
trivy image --severity HIGH,CRITICAL --exit-code 1 myapp:latest
# Scan in CI pipeline (example output)
# myapp:latest (alpine 3.19.1)
# Total: 2 (HIGH: 1, CRITICAL: 1)
# ┌───────────────┬────────────────┬──────────┬─────────────────┐
# │ Library │ Vulnerability │ Severity │ Fixed Version │
# ├───────────────┼────────────────┼──────────┼─────────────────┤
# │ libssl3 │ CVE-2024-XXXX │ CRITICAL │ 3.1.5-r0 │
# │ curl │ CVE-2024-YYYY │ HIGH │ 8.5.0-r0 │
# └───────────────┴────────────────┴──────────┴─────────────────┘
Kubernetes Architecture
Kubernetes (K8s) is the industry-standard container orchestration platform. It automates deployment, scaling, and management of containerized applications across clusters of machines.
flowchart TD
subgraph CP["Control Plane"]
API[API Server]
ETCD[("etcd<br/>cluster state")]
SCHED[Scheduler]
CM[Controller Manager]
API <--> ETCD
API <--> SCHED
API <--> CM
end
subgraph W1["Worker Node 1"]
KL1[kubelet]
KP1[kube-proxy]
CR1[Container Runtime]
P1[Pod A]
P2[Pod B]
KL1 --> CR1
CR1 --> P1
CR1 --> P2
end
subgraph W2["Worker Node 2"]
KL2[kubelet]
KP2[kube-proxy]
CR2[Container Runtime]
P3[Pod C]
P4[Pod D]
KL2 --> CR2
CR2 --> P3
CR2 --> P4
end
API --> KL1
API --> KL2
Control Plane Components
• API Server — Front door for all operations (REST API, kubectl)
• etcd — Distributed key-value store holding all cluster state
• Scheduler — Assigns pods to nodes based on resources and constraints
• Controller Manager — Runs control loops (ReplicaSet, Deployment, Node controllers)
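The control loops run by the Controller Manager follow a common pattern: compare desired state with observed state and emit the actions needed to converge. A heavily simplified sketch of that idea for a replica count is shown below; `reconcile` and the action tuples are illustrative and do not correspond to any Kubernetes API.

```python
from typing import List, Optional, Tuple

Action = Tuple[str, Optional[str]]  # ("create", None) or ("delete", pod_name)


def reconcile(desired_replicas: int, running_pods: List[str]) -> List[Action]:
    """Return the actions that move the observed state toward the desired state."""
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods: request the missing ones.
        return [("create", None)] * diff
    if diff < 0:
        # Too many pods: pick surplus pods to delete (sorted for determinism).
        return [("delete", name) for name in sorted(running_pods)[:-diff]]
    return []  # Already converged; nothing to do.


print(reconcile(3, ["web-1"]))
print(reconcile(1, ["web-1", "web-2", "web-3"]))
print(reconcile(2, ["web-1", "web-2"]))
```

A real controller runs this comparison continuously, so a pod that crashes is replaced on the next pass without anyone issuing a command.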
Worker Node Components
• kubelet — Agent ensuring containers run in pods as specified
• kube-proxy — Network proxy implementing Service abstractions (iptables/IPVS)
• Container Runtime — containerd or CRI-O (dockershim, the built-in Docker Engine integration, was removed in Kubernetes 1.24; images built with Docker still run unchanged)
kubectl Essentials
# Cluster info
kubectl cluster-info
kubectl get nodes -o wide
# Namespace operations
kubectl get namespaces
kubectl create namespace staging
# Get resources (pods, deployments, services)
kubectl get pods -n default
kubectl get deployments -o wide
kubectl get services --all-namespaces
# Describe a resource (detailed info + events)
kubectl describe pod my-pod-name
kubectl describe node worker-1
# Apply a manifest (create or update)
kubectl apply -f deployment.yaml
kubectl apply -f ./k8s/ # Apply all YAML in directory
# Delete resources
kubectl delete -f deployment.yaml
kubectl delete pod my-pod-name
# View logs
kubectl logs my-pod-name
kubectl logs -f my-pod-name --tail=100 # Follow with tail
kubectl logs my-pod-name -c sidecar # Specific container
# Execute command in pod
kubectl exec -it my-pod-name -- /bin/sh
# Port forward (local debugging)
kubectl port-forward svc/my-service 8080:80
# Watch resources in real-time
kubectl get pods -w
Kubernetes Core Objects
Pods & Deployments
A Pod is the smallest deployable unit — one or more containers sharing network and storage. A Deployment manages pod replicas and rolling updates.
# deployment.yaml - Complete Deployment manifest
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api
  namespace: production
  labels:
    app: web-api
    version: v1.2.3
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1 # Max pods above desired during update
      maxUnavailable: 0 # Zero downtime
  template:
    metadata:
      labels:
        app: web-api
        version: v1.2.3
    spec:
      containers:
        - name: api
          image: registry.example.com/web-api:1.2.3
          ports:
            - containerPort: 8080
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: url
            - name: LOG_LEVEL
              valueFrom:
                configMapKeyRef:
                  name: app-config
                  key: log-level
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
      restartPolicy: Always
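How maxSurge and maxUnavailable bound a rolling update can be seen in a small simulation. The sketch below assumes absolute values (not percentages) and that every new pod becomes ready immediately, which real clusters do not guarantee; `rolling_update_steps` is a hypothetical name.

```python
from typing import List, Tuple


def rolling_update_steps(
    replicas: int, max_surge: int, max_unavailable: int
) -> List[Tuple[int, int]]:
    """Return the (old_pods, new_pods) counts after each scaling step."""
    if max_surge == 0 and max_unavailable == 0:
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    ceiling = replicas + max_surge        # most pods allowed at once
    floor = replicas - max_unavailable    # fewest pods allowed at once
    old, new = replicas, 0
    steps = [(old, new)]
    while old > 0 or new < replicas:
        # Start as many new pods as the surge budget allows.
        add = min(replicas - new, ceiling - (old + new))
        if add > 0:
            new += add
            steps.append((old, new))
        # Stop as many old pods as the availability floor allows.
        remove = min(old, (old + new) - floor)
        if remove > 0:
            old -= remove
            steps.append((old, new))
    return steps


for old, new in rolling_update_steps(replicas=3, max_surge=1, max_unavailable=0):
    print(f"old={old} new={new} total={old + new}")
```

With the values from the manifest above, the total never drops below 3 and never exceeds 4, which is what makes the update zero-downtime under the stated assumptions.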
Services & Ingress
flowchart TD
CLIENT[External Client] --> ING["Ingress Controller<br/>nginx / ALB"]
ING -->|/api| SVC1["Service: api<br/>ClusterIP"]
ING -->|/web| SVC2["Service: frontend<br/>ClusterIP"]
SVC1 --> P1[Pod api-1]
SVC1 --> P2[Pod api-2]
SVC1 --> P3[Pod api-3]
SVC2 --> P4[Pod web-1]
SVC2 --> P5[Pod web-2]
# service.yaml - ClusterIP Service (internal)
apiVersion: v1
kind: Service
metadata:
  name: web-api
  namespace: production
spec:
  type: ClusterIP
  selector:
    app: web-api
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
---
# service-lb.yaml - LoadBalancer Service (external)
apiVersion: v1
kind: Service
metadata:
  name: web-api-public
  namespace: production
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  type: LoadBalancer
  selector:
    app: web-api
  ports:
    - port: 443
      targetPort: 8080
      protocol: TCP
---
# ingress.yaml - Ingress resource (path-based routing)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  namespace: production
  annotations:
    # Note: rewrite-target "/" rewrites every matched request path to "/";
    # drop it or use regex capture groups if backends need the original path
    nginx.ingress.kubernetes.io/rewrite-target: /
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.example.com
      secretName: api-tls
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: web-api
                port:
                  number: 80
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend
                port:
                  number: 80
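The way Prefix paths select a backend can be sketched in a few lines. This is a simplified model of the documented behaviour (matching on whole path segments, with the longest matching path winning); it ignores hosts, Exact paths, and controller-specific rewrites, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple


def matches_prefix(rule_path: str, request_path: str) -> bool:
    """Prefix match on whole path segments: /api matches /api/x but not /apiv2."""
    if rule_path == "/":
        return True
    rule = rule_path.rstrip("/")
    return request_path == rule or request_path.startswith(rule + "/")


def route(request_path: str, rules: List[Tuple[str, str]]) -> Optional[str]:
    """Return the service name of the longest matching rule, if any."""
    candidates = [(path, svc) for path, svc in rules if matches_prefix(path, request_path)]
    if not candidates:
        return None
    return max(candidates, key=lambda rule: len(rule[0]))[1]


# The two paths from the Ingress above.
rules = [("/api", "web-api"), ("/", "frontend")]

for path in ["/api/users", "/api", "/apiv2", "/index.html"]:
    print(path, "->", route(path, rules))
```

The `/apiv2` case is the one that tends to surprise: it does not share a full segment with `/api`, so it falls through to the catch-all rule.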
ConfigMaps & Secrets
# configmap.yaml - Non-sensitive configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: production
data:
  log-level: "info"
  max-connections: "100"
  feature-flags: |
    {
      "new-ui": true,
      "beta-api": false
    }
---
# secret.yaml - Sensitive data (base64 encoded)
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
  namespace: production
type: Opaque
data:
  url: cG9zdGdyZXNxbDovL3VzZXI6cGFzc0BkYjo1NDMyL215YXBw # base64 encoded
  password: c3VwZXJzZWNyZXQ= # base64 encoded
# Create secret from command line (easier than manual base64)
kubectl create secret generic db-credentials \
--from-literal=url='postgresql://user:pass@db:5432/myapp' \
--from-literal=password='supersecret' \
-n production
# Create configmap from file
kubectl create configmap nginx-config \
--from-file=nginx.conf \
-n production
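It is worth seeing that the values under `data:` in a Secret are only base64-encoded, which is a reversible encoding and not encryption. The sketch below reproduces the two values from the manifest above using Python's standard library; the helper names are hypothetical.

```python
import base64


def encode_secret_value(plaintext: str) -> str:
    """Encode a string the way values appear under `data:` in a Secret."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse the encoding; no key is needed, so this is not encryption."""
    return base64.b64decode(encoded).decode("utf-8")


print(encode_secret_value("supersecret"))
# → c3VwZXJzZWNyZXQ=
print(decode_secret_value("cG9zdGdyZXNxbDovL3VzZXI6cGFzc0BkYjo1NDMyL215YXBw"))
# → postgresql://user:pass@db:5432/myapp
```

Anyone who can read the Secret object can recover the plaintext this way, so access to Secrets should be restricted through RBAC rather than relying on the encoding.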
Managed Kubernetes Services
Managed Kubernetes services abstract away control plane management, letting teams focus on deploying workloads rather than maintaining cluster infrastructure.
| Feature | AWS EKS | Azure AKS | GCP GKE |
|---|---|---|---|
| Control Plane Cost | $0.10/hr (~$73/mo) | Free tier available; Standard tier $0.10/hr | $0.10/hr (free-tier credit covers one zonal or Autopilot cluster) |
| Node Types | EC2, Fargate (serverless) | VMs, Virtual Nodes (ACI) | VMs, Autopilot (serverless) |
| Auto-scaling | Cluster Autoscaler, Karpenter | Cluster Autoscaler, KEDA | Node Auto-Provisioning |
| Networking | VPC CNI, Calico | Azure CNI, Kubenet | VPC-native, Dataplane V2 |
| Service Mesh | App Mesh, Istio | Istio, Open Service Mesh | Anthos Service Mesh |
| Registry | ECR | ACR | Artifact Registry |
| Max Nodes | 5,000 | 5,000 | 15,000 |
| GPU Support | Yes (P4, A100) | Yes (T4, A100) | Yes (T4, A100, H100) |
Terraform Deployment — AWS EKS
# providers.tf
terraform {
required_version = ">= 1.5"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
provider "aws" {
region = "us-east-1"
}
# eks-cluster.tf
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "~> 20.0"
cluster_name = "production-cluster"
cluster_version = "1.29"
# Assumes a VPC module named "vpc" is declared elsewhere in the configuration
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
cluster_endpoint_public_access = true
eks_managed_node_groups = {
general = {
instance_types = ["m6i.large"]
min_size = 2
max_size = 10
desired_size = 3
labels = {
workload-type = "general"
}
}
spot = {
instance_types = ["m6i.large", "m5.large", "m5a.large"]
capacity_type = "SPOT"
min_size = 0
max_size = 20
desired_size = 2
labels = {
workload-type = "batch"
}
taints = [{
key = "spot"
value = "true"
effect = "NO_SCHEDULE"
}]
}
}
tags = {
Environment = "production"
ManagedBy = "terraform"
}
}
Terraform — Azure AKS
# aks-cluster.tf
resource "azurerm_kubernetes_cluster" "main" {
name = "production-aks"
location = azurerm_resource_group.main.location
resource_group_name = azurerm_resource_group.main.name
dns_prefix = "prod-aks"
kubernetes_version = "1.29"
default_node_pool {
name = "system"
vm_size = "Standard_D4s_v3"
node_count = 3
min_count = 2
max_count = 10
enable_auto_scaling = true
os_disk_size_gb = 100
vnet_subnet_id = azurerm_subnet.aks.id
}
identity {
type = "SystemAssigned"
}
network_profile {
network_plugin = "azure"
load_balancer_sku = "standard"
service_cidr = "10.0.0.0/16"
dns_service_ip = "10.0.0.10"
}
tags = {
Environment = "production"
ManagedBy = "terraform"
}
}
resource "azurerm_kubernetes_cluster_node_pool" "worker" {
name = "worker"
kubernetes_cluster_id = azurerm_kubernetes_cluster.main.id
vm_size = "Standard_D8s_v3"
min_count = 1
max_count = 20
enable_auto_scaling = true
os_disk_size_gb = 200
node_labels = {
"workload-type" = "application"
}
}
Production Patterns
Horizontal Pod Autoscaler (HPA)
# hpa.yaml - Auto-scale based on CPU and memory utilization
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-api-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-api
  minReplicas: 3
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
        - type: Percent
          value: 100
          periodSeconds: 30
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
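The replica count the HPA aims for follows the formula given in the Kubernetes documentation: desired = ceil(current replicas × current metric / target metric). The sketch below applies it to a single metric with the default 10% tolerance and the min/max bounds; it does not model the `behavior` policies or the multi-metric case (where the largest result is used), and the function name is hypothetical.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int,
    max_replicas: int,
    tolerance: float = 0.1,
) -> int:
    """Apply the HPA scaling formula for one metric, then clamp to the bounds."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas  # close enough to target: no change
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# Values mirror the manifest above: target 70% CPU, 3 to 50 replicas.
print(desired_replicas(3, 140, 70, 3, 50))   # CPU at twice the target
print(desired_replicas(10, 35, 70, 3, 50))   # CPU at half the target
print(desired_replicas(3, 72, 70, 3, 50))    # within tolerance
```

The first call doubles the replica count to 6, the second halves it to 5, and the third leaves it at 3 because the ratio is inside the tolerance band.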
Liveness & Readiness Probes
# Comprehensive probe configuration
spec:
  containers:
    - name: api
      image: myapp:latest
      # Liveness: Is the container alive? Restart if not
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 30 # Wait for app startup
        periodSeconds: 10 # Check every 10s
        timeoutSeconds: 3 # Timeout per check
        failureThreshold: 3 # 3 failures = restart
      # Readiness: Can it serve traffic? Remove from LB if not
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 5
        timeoutSeconds: 2
        failureThreshold: 3
      # Startup: Is it still starting? Protect slow-starting containers
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        failureThreshold: 30 # 30 * 10s = 5 min max startup
        periodSeconds: 10
• Liveness — Detects deadlocks. Keep simple (don't check dependencies).
• Readiness — Checks dependencies (DB, cache). Removes pod from service load balancer.
• Startup — Use for slow-starting apps to prevent premature liveness kills.
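On the application side, the distinction between the probes comes down to what each endpoint checks. A minimal sketch using only Python's standard library is shown below: `/healthz` answers as long as the process can respond, while `/ready` also depends on a flag representing dependency health. The handler, the `dependencies_ready` flag, and the `serve` helper are illustrative assumptions, not a prescribed implementation.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def probe_response(path: str, dependencies_ready: bool) -> Tuple[int, str]:
    """Decide the HTTP status and body for a probe request."""
    if path == "/healthz":
        # Liveness: kept simple, no dependency checks.
        return 200, "ok"
    if path == "/ready":
        # Readiness: only report ready once dependencies are reachable.
        return (200, "ready") if dependencies_ready else (503, "not ready")
    return 404, "not found"


class ProbeHandler(BaseHTTPRequestHandler):
    # Flip to True once DB and cache connections are established.
    dependencies_ready = False

    def do_GET(self) -> None:
        status, body = probe_response(self.path, self.dependencies_ready)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Start the probe server; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), ProbeHandler).serve_forever()
```

Calling `serve()` starts the server on port 8080, matching the probe configuration above. A failing `/ready` only removes the pod from Service endpoints, whereas a failing `/healthz` leads to a container restart, which is why the liveness path avoids dependency checks.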
Rolling Updates and Rollbacks
# Update deployment image (triggers rolling update)
kubectl set image deployment/web-api \
api=registry.example.com/web-api:1.3.0 \
-n production
# Watch rollout status
kubectl rollout status deployment/web-api -n production
# View rollout history
kubectl rollout history deployment/web-api -n production
# Rollback to previous version
kubectl rollout undo deployment/web-api -n production
# Rollback to specific revision
kubectl rollout undo deployment/web-api --to-revision=3 -n production
# Pause/resume rollout (for canary testing)
kubectl rollout pause deployment/web-api -n production
kubectl rollout resume deployment/web-api -n production
Helm Charts & GitOps
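# Add chart repositories and refresh the local index
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update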
# Install nginx ingress controller
helm install ingress-nginx ingress-nginx/ingress-nginx \
--namespace ingress-nginx \
--create-namespace \
--set controller.replicaCount=2
# Install with custom values file
helm install my-app ./charts/my-app \
-f values-production.yaml \
--namespace production
# Upgrade an existing release
helm upgrade my-app ./charts/my-app \
-f values-production.yaml \
--namespace production
# List releases
helm list --all-namespaces
# Rollback a release
helm rollback my-app 1 --namespace production
# Chart.yaml - Helm chart metadata
apiVersion: v2
name: web-api
description: Production web API deployment
version: 1.2.3
appVersion: "1.2.3"
# values.yaml - Default values
replicaCount: 3
image:
  repository: registry.example.com/web-api
  tag: "1.2.3"
  pullPolicy: IfNotPresent
service:
  type: ClusterIP
  port: 80
ingress:
  enabled: true
  hosts:
    - host: api.example.com
      paths:
        - path: /
          pathType: Prefix
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 50
  targetCPUUtilization: 70
Hands-On Exercises
Build and Run a Multi-Stage Docker Image
Create a Node.js application with a multi-stage Dockerfile that separates build and runtime dependencies.
- Create a simple Express.js API with a /health endpoint
- Write a multi-stage Dockerfile: Stage 1 installs all deps + builds; Stage 2 copies only production artifacts
- Build the image and compare sizes: docker images
- Run the container with port mapping and verify the health endpoint
- Add a .dockerignore file and rebuild; note the smaller build context
Success Criteria: Final image is under 150MB. Health check passes. Container runs as non-root user.
Deploy a 3-Tier Application with Docker Compose
Build a complete application stack with frontend, API, database, and cache using Docker Compose.
- Create a docker-compose.yml with 4 services: React frontend, Node.js API, PostgreSQL, Redis
- Configure health checks for the API and database services
- Use depends_on with conditions to enforce startup order
- Create named volumes for database persistence
- Use separate networks: frontend-net (frontend ↔ API) and backend-net (API ↔ DB/Redis)
- Test with docker compose up -d and verify all services are healthy
Success Criteria: All 4 services start in correct order. Database survives docker compose down && docker compose up. Frontend cannot directly access database.
Deploy to Kubernetes with kubectl
Deploy a containerized application to Kubernetes with proper production patterns.
- Create a Deployment manifest with 3 replicas, resource limits, and rolling update strategy
- Add liveness, readiness, and startup probes
- Create a ClusterIP Service and an Ingress resource for external access
- Store configuration in a ConfigMap and credentials in a Secret
- Apply all manifests: kubectl apply -f k8s/
- Test rolling update: change image tag and watch rollout
- Practice rollback: kubectl rollout undo
Success Criteria: Zero-downtime rolling update. Rollback completes in under 30s. Pod restarts when liveness fails.
Create Terraform for a Managed Kubernetes Cluster
Provision a production-ready managed Kubernetes cluster using Terraform.
- Choose a cloud provider (EKS, AKS, or GKE)
- Write Terraform for: VPC/networking, cluster control plane, managed node group (general purpose), spot/preemptible node pool (cost optimization)
- Configure cluster autoscaler with min/max node counts
- Enable RBAC and integrate with cloud IAM
- Output kubeconfig and verify cluster access: kubectl get nodes
- Deploy a sample workload using the Kubernetes Terraform provider
Success Criteria: terraform apply creates a functional cluster. Nodes auto-scale when load increases. Spot nodes have appropriate taints.
Conclusion & Coming Next
Containers and Kubernetes have become the foundation of modern application deployment. In this article, we covered the full journey from Docker fundamentals — images, networking, volumes, and Compose — through Kubernetes orchestration with its declarative object model, managed services, and production patterns like autoscaling, health probes, and Helm packaging.
• Containers provide lightweight, portable, reproducible application environments
• Docker is the foundation: master Dockerfiles, networking, volumes, and Compose
• Kubernetes orchestrates containers at scale with declarative desired-state management
• Managed services (EKS/AKS/GKE) eliminate control plane operations overhead
• Production requires proper probes, resource management, autoscaling, and GitOps workflows
Next in the Series
In Part 12: CI/CD Pipelines for Infrastructure, we explore GitHub Actions, GitLab CI, and Jenkins pipelines purpose-built for infrastructure automation. Learn how to lint Terraform, run security scans, deploy with approval gates, and implement full GitOps workflows for infrastructure changes.