Lab Environment Setup
Before deploying Prometheus, we need a Kubernetes cluster. For this track, we’ll use kind (Kubernetes in Docker) as our primary lab environment — it’s lightweight, fast to create, and closely mirrors production clusters. All examples in Parts 2–12 are tested against this lab setup.
Creating a kind Cluster
Our lab cluster needs multiple nodes to demonstrate real-world scenarios like node-level metrics, anti-affinity, and pod distribution:
# kind-cluster-config.yaml
# A multi-node cluster for Prometheus lab exercises
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: prometheus-lab
nodes:
  - role: control-plane
    kubeadmConfigPatches:
      - |
        kind: InitConfiguration
        nodeRegistration:
          kubeletExtraArgs:
            node-labels: "topology.kubernetes.io/zone=us-east-1a"
    extraPortMappings:
      - containerPort: 30090
        hostPort: 9090
        protocol: TCP
      - containerPort: 30093
        hostPort: 9093
        protocol: TCP
      - containerPort: 30030
        hostPort: 3000
        protocol: TCP
  - role: worker
    kubeadmConfigPatches:
      - |
        kind: JoinConfiguration
        nodeRegistration:
          kubeletExtraArgs:
            node-labels: "topology.kubernetes.io/zone=us-east-1a"
  - role: worker
    kubeadmConfigPatches:
      - |
        kind: JoinConfiguration
        nodeRegistration:
          kubeletExtraArgs:
            node-labels: "topology.kubernetes.io/zone=us-east-1b"
  - role: worker
    kubeadmConfigPatches:
      - |
        kind: JoinConfiguration
        nodeRegistration:
          kubeletExtraArgs:
            node-labels: "topology.kubernetes.io/zone=us-east-1b"
# Create the cluster
kind create cluster --config kind-cluster-config.yaml
# Verify nodes are ready
kubectl get nodes -o wide
# NAME STATUS ROLES AGE VERSION
# prometheus-lab-control-plane Ready control-plane 45s v1.30.0
# prometheus-lab-worker Ready <none> 30s v1.30.0
# prometheus-lab-worker2 Ready <none> 30s v1.30.0
# prometheus-lab-worker3 Ready <none> 30s v1.30.0
# Install metrics-server for resource metrics
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl patch deployment metrics-server -n kube-system --type='json' \
-p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'
Minikube Alternative
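If you prefer minikube, the equivalent setup is shown below. Note that --cpus and --memory apply to each node, so on a resource-constrained machine reduce --nodes or the per-node values: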
# Start minikube with sufficient resources for the Prometheus stack
minikube start \
--cpus=4 \
--memory=8192 \
--disk-size=40g \
--kubernetes-version=v1.30.0 \
--nodes=3 \
--driver=docker
# Enable required addons
minikube addons enable metrics-server
minikube addons enable default-storageclass
minikube addons enable storage-provisioner
Prerequisites & Tools
Required Tools for This Track
| Tool | Version | Purpose | Install |
|---|---|---|---|
| kubectl | ≥ 1.28 | Kubernetes CLI | brew install kubectl |
| helm | ≥ 3.14 | Helm chart management | brew install helm |
| kind | ≥ 0.22 | Local K8s clusters | brew install kind |
| jq | ≥ 1.7 | JSON processing | brew install jq |
| promtool | ≥ 2.53 | Prometheus rule validation | Bundled with Prometheus binary |
| Docker | ≥ 24.0 | Container runtime for kind | brew install --cask docker |
Understanding the Prometheus Operator
The Operator Pattern
The Prometheus Operator extends Kubernetes with Custom Resource Definitions (CRDs) that let you declare your monitoring configuration as Kubernetes-native YAML. Instead of manually editing prometheus.yml, you create Kubernetes resources that the operator watches and automatically translates into Prometheus configuration.
flowchart TD
subgraph CRDs["Custom Resources (You Write)"]
SM["ServiceMonitor"]
PM["PodMonitor"]
PR["PrometheusRule"]
P["Prometheus"]
AM["AlertmanagerConfig"]
end
subgraph Operator["Prometheus Operator (Watches)"]
RC["Reconciliation Controller"]
end
subgraph Generated["Generated Artifacts"]
CFG["prometheus.yml<br/>(scrape configs)"]
RULES["rule_files/<br/>(alert + recording rules)"]
SEC["secrets/<br/>(TLS certs, bearer tokens)"]
end
subgraph Running["Running Workloads"]
PROM["Prometheus StatefulSet"]
AMGR["Alertmanager StatefulSet"]
end
SM --> RC
PM --> RC
PR --> RC
P --> RC
AM --> RC
RC --> CFG
RC --> RULES
RC --> SEC
CFG --> PROM
RULES --> PROM
SEC --> PROM
AM --> AMGR
Custom Resource Definitions
The Prometheus Operator introduces these CRDs to your cluster:
| CRD | Purpose | Generates |
|---|---|---|
| Prometheus | Defines a Prometheus server instance | StatefulSet + ConfigMap + Service |
| ServiceMonitor | Declares scrape targets via Kubernetes Services | scrape_configs entries |
| PodMonitor | Declares scrape targets directly from Pods | scrape_configs entries |
| PrometheusRule | Recording rules and alerting rules | Rule files mounted into Prometheus |
| Alertmanager | Defines an Alertmanager instance | StatefulSet + ConfigMap |
| AlertmanagerConfig | Per-namespace alert routing config | Alertmanager configuration sections |
| ScrapeConfig | Generic scrape targets (static, DNS, HTTP SD) | scrape_configs entries |
| PrometheusAgent | Prometheus in agent mode (remote-write only) | StatefulSet in agent mode |
Reconciliation Loop
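When you create, update, or delete one of these custom resources, the operator regenerates the Prometheus configuration, and the config-reloader sidecar applies it through Prometheus's /-/reload HTTP endpoint, all without restarting the Prometheus pod. Configuration reload typically takes 1–3 seconds.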
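To make the reconciliation idea concrete, here is a toy sketch in Python (not the operator's actual Go code). It assumes simplified, hypothetical dictionaries standing in for ServiceMonitor objects, renders a configuration from them, and signals a reload only when the rendered output changes. The job name format follows the serviceMonitor/<namespace>/<name>/<endpoint index> pattern the operator generates; the function names, defaults, and checksum approach are illustrative assumptions.

```python
import hashlib
import json


def render_config(service_monitors):
    """Render a simplified scrape config from ServiceMonitor-like dicts."""
    jobs = []
    for sm in sorted(service_monitors, key=lambda s: (s["namespace"], s["name"])):
        for index, endpoint in enumerate(sm["endpoints"]):
            jobs.append({
                "job_name": f"serviceMonitor/{sm['namespace']}/{sm['name']}/{index}",
                "metrics_path": endpoint.get("path", "/metrics"),
                "scrape_interval": endpoint.get("interval", "30s"),
            })
    return {"scrape_configs": jobs}


def checksum(config):
    """Stable digest of the rendered config, used to detect changes."""
    encoded = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def reconcile(service_monitors, last_checksum):
    """Return (config, new_checksum, reload_needed) for the desired state."""
    config = render_config(service_monitors)
    new_checksum = checksum(config)
    return config, new_checksum, new_checksum != last_checksum


# Hypothetical desired state: one ServiceMonitor with a single endpoint.
monitors = [{
    "namespace": "monitoring",
    "name": "my-app-metrics",
    "endpoints": [{"path": "/metrics", "interval": "30s"}],
}]

config, digest, reload_needed = reconcile(monitors, last_checksum=None)
print(reload_needed)  # first pass: the config is new, so a reload is needed

_, digest_again, reload_again = reconcile(monitors, last_checksum=digest)
print(reload_again)   # nothing changed, so no reload is triggered
```

The point of comparing digests is that reconciliation is driven by the desired state rather than by individual events: running it again with unchanged inputs is a no-op.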
Deploying with Helm
Key Helm Values
The kube-prometheus-stack chart bundles the Prometheus Operator, Prometheus, Alertmanager, Grafana, Node Exporter, and kube-state-metrics into a single deployment. Here’s a production-oriented values file:
# values-prometheus-lab.yaml
# Production-oriented values for kube-prometheus-stack
# Adjust resource limits based on your cluster size
prometheus:
  prometheusSpec:
    # Retention and storage
    retention: 15d
    retentionSize: "45GB"
    # Resource limits - sized for ~50,000 active series
    resources:
      requests:
        cpu: 500m
        memory: 2Gi
      limits:
        cpu: 2000m
        memory: 4Gi
    # Persistent storage
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: standard
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 50Gi
    # Select all ServiceMonitors, PodMonitors, and rules,
    # not only those labeled with this Helm release
    serviceMonitorSelectorNilUsesHelmValues: false
    podMonitorSelectorNilUsesHelmValues: false
    ruleSelectorNilUsesHelmValues: false
    # Scrape interval and timeout
    scrapeInterval: "30s"
    scrapeTimeout: "10s"
    evaluationInterval: "30s"
    # Enable remote write for future long-term storage
    # remoteWrite:
    #   - url: "http://mimir-distributor:8080/api/v1/push"
    # Additional scrape configs (for targets without ServiceMonitors)
    additionalScrapeConfigs: []
  # Expose via NodePort for lab access
  service:
    type: NodePort
    nodePort: 30090
alertmanager:
  alertmanagerSpec:
    resources:
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 500m
        memory: 512Mi
    storage:
      volumeClaimTemplate:
        spec:
          storageClassName: standard
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 5Gi
  service:
    type: NodePort
    nodePort: 30093
grafana:
  adminPassword: "prom-lab-2026"
  persistence:
    enabled: true
    size: 5Gi
  service:
    type: NodePort
    nodePort: 30030
  sidecar:
    dashboards:
      enabled: true
      searchNamespace: ALL
    datasources:
      enabled: true
# Node Exporter - deploy on all nodes
nodeExporter:
  enabled: true
# kube-state-metrics - Kubernetes object metrics
kubeStateMetrics:
  enabled: true
# Component scraping configuration
kubeApiServer:
  enabled: true
kubeControllerManager:
  enabled: true
kubeScheduler:
  enabled: true
kubeEtcd:
  enabled: true
kubelet:
  enabled: true
  serviceMonitor:
    metricRelabelings:
      # Drop high-cardinality kubelet metrics in lab environment
      - sourceLabels: [__name__]
        regex: 'kubelet_runtime_operations_duration_seconds_bucket'
        action: drop
Installation Steps
# Add the Prometheus community Helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Create a dedicated monitoring namespace
kubectl create namespace monitoring
# Install kube-prometheus-stack with our custom values
helm install prometheus-stack prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--values values-prometheus-lab.yaml \
--version 62.3.0 \
--wait --timeout 10m
# Verify all pods are running
kubectl get pods -n monitoring -o wide
# NAME READY STATUS RESTARTS
# alertmanager-prometheus-stack-kube-prom-alertmanager-0 2/2 Running 0
# prometheus-prometheus-stack-kube-prom-prometheus-0 2/2 Running 0
# prometheus-stack-grafana-7d9f4c8b4-x2k9m 3/3 Running 0
# prometheus-stack-kube-prom-operator-5c8b9f6d4-p7h2n 1/1 Running 0
# prometheus-stack-kube-state-metrics-6c8b5d6f4-r9t3k 1/1 Running 0
# prometheus-stack-prometheus-node-exporter-abcde 1/1 Running 0
# prometheus-stack-prometheus-node-exporter-fghij 1/1 Running 0
# prometheus-stack-prometheus-node-exporter-klmno 1/1 Running 0
# (node-exporter runs as a DaemonSet, so expect one pod per node: four on the kind lab cluster)
What Gets Deployed
flowchart TD
subgraph Helm["kube-prometheus-stack Chart"]
subgraph Core["Core Components"]
OP["Prometheus Operator<br/>(Deployment)"]
PROM["Prometheus Server<br/>(StatefulSet, 1 replica)"]
AM["Alertmanager<br/>(StatefulSet, 1 replica)"]
end
subgraph Collectors["Metric Collectors"]
NE["Node Exporter<br/>(DaemonSet, all nodes)"]
KSM["kube-state-metrics<br/>(Deployment, 1 replica)"]
end
subgraph Viz["Visualization"]
GF["Grafana<br/>(Deployment + dashboards)"]
end
subgraph CRDS["Pre-configured CRDs"]
SM1["ServiceMonitors<br/>(API server, kubelet,<br/>etcd, node-exporter, KSM)"]
PR1["PrometheusRules<br/>(K8s alerts, node alerts,<br/>Prometheus self-monitoring)"]
end
end
OP -->|manages| PROM
OP -->|manages| AM
SM1 -->|discovered by| PROM
PR1 -->|loaded by| PROM
NE -->|scraped by| PROM
KSM -->|scraped by| PROM
PROM -->|data source| GF
PROM -->|fires alerts| AM
ServiceMonitor & PodMonitor CRDs
ServiceMonitor Specification
A ServiceMonitor tells Prometheus to scrape pods behind a Kubernetes Service. It’s the most common way to add custom scrape targets:
# servicemonitor-example.yaml
# Monitor a custom application exposing metrics on port 8080
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app-metrics
  namespace: monitoring
  labels:
    # Needed only when Prometheus selects ServiceMonitors by label
    # (the chart default); optional with serviceMonitorSelectorNilUsesHelmValues: false
    release: prometheus-stack
spec:
  # Which namespaces to find the target Services in
  namespaceSelector:
    matchNames:
      - production
      - staging
  # Select Services with these labels
  selector:
    matchLabels:
      app.kubernetes.io/name: my-app
      metrics: "enabled"
  # Endpoint configuration (how to scrape)
  endpoints:
    - port: metrics          # Named port on the Service
      path: /metrics         # Metrics endpoint path
      interval: 30s          # Override global scrape interval
      scrapeTimeout: 10s     # Timeout per scrape
      scheme: https          # Use HTTPS
      tlsConfig:
        insecureSkipVerify: false
        caFile: /etc/prometheus/certs/ca.crt
      bearerTokenSecret:
        name: prometheus-token
        key: token
      # Relabeling - add custom labels to all metrics from this target
      relabelings:
        - sourceLabels: [__meta_kubernetes_namespace]
          targetLabel: namespace
        - sourceLabels: [__meta_kubernetes_pod_name]
          targetLabel: pod
        - sourceLabels: [__meta_kubernetes_pod_label_version]
          targetLabel: app_version
      # Metric relabeling - filter or modify metrics after scraping
      metricRelabelings:
        - sourceLabels: [__name__]
          regex: 'go_gc_.*'
          action: drop
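The metricRelabelings rule above is easy to misread, because Prometheus anchors relabel regexes at both ends: the pattern has to match the whole metric name, not just a substring. The following Python sketch imitates that behavior for a drop action. It is an approximation under stated assumptions: Python's re module stands in for the RE2 syntax Prometheus uses, and the function name and sample metric names are hypothetical.

```python
import re


def apply_metric_relabel_drop(metric_names, regex):
    """Drop metric names that fully match the regex and keep the rest.

    fullmatch() mimics the implicit ^...$ anchoring of relabel regexes.
    """
    pattern = re.compile(regex)
    return [name for name in metric_names if pattern.fullmatch(name) is None]


# Hypothetical metric names returned by one scrape.
scraped = [
    "go_gc_duration_seconds",
    "go_goroutines",
    "my_go_gc_custom",
    "http_requests_total",
]

kept = apply_metric_relabel_drop(scraped, r"go_gc_.*")
print(kept)
# ['go_goroutines', 'my_go_gc_custom', 'http_requests_total']
```

Note that my_go_gc_custom survives: it contains go_gc_ but does not start with it, so the anchored pattern does not match.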
PodMonitor Specification
Use PodMonitor when your pods don’t have a Service in front of them (sidecar containers, batch jobs, DaemonSets with host-network):
# podmonitor-example.yaml
# Monitor pods directly without a Service
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: envoy-sidecar-metrics
  namespace: monitoring
  labels:
    release: prometheus-stack
spec:
  namespaceSelector:
    matchNames:
      - production
  # Select pods directly by label
  selector:
    matchLabels:
      sidecar.istio.io/inject: "true"
  podMetricsEndpoints:
    - port: http-envoy-prom   # Named port on the Pod spec
      path: /stats/prometheus
      interval: 15s
      relabelings:
        - sourceLabels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          regex: "true"
          action: keep
Namespace Selection Strategies
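Set serviceMonitorSelectorNilUsesHelmValues: false in the Helm values so that Prometheus selects every ServiceMonitor regardless of its labels. Without this, the chart configures Prometheus to select only ServiceMonitors labeled with the Helm release (release: prometheus-stack in this lab), and ServiceMonitors created in application namespaces without that label are silently ignored.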
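# Three namespace selection strategies:
# Strategy 1: Monitor ALL namespaces (recommended for most clusters)
spec:
  namespaceSelector:
    any: true
# Strategy 2: Specific namespaces (multi-tenant isolation)
spec:
  namespaceSelector:
    matchNames:
      - team-a-production
      - team-a-staging
# Strategy 3: Namespace labels (dynamic, scales with new namespaces)
# A ServiceMonitor's namespaceSelector accepts only `any` and `matchNames`.
# Label-based selection is set on the Prometheus resource instead
# (prometheus.prometheusSpec in the Helm values), where it controls which
# namespaces Prometheus reads ServiceMonitors from:
prometheus:
  prometheusSpec:
    serviceMonitorNamespaceSelector:
      matchLabels:
        monitoring: enabled
# Then label your namespaces:
# kubectl label namespace production monitoring=enabled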
Storage & Resource Sizing
Persistent Volume Configuration
Without persistent storage, Prometheus loses all data when the pod restarts. For any environment beyond throwaway testing, configure a PersistentVolumeClaim:
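# The storageSpec in the Prometheus CRD
prometheus:
  prometheusSpec:
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: gp3   # AWS EBS gp3 for production
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 100Gi
          # Optional: bind only to pre-provisioned PersistentVolumes with this label.
          # A PVC selector matches PVs, not nodes, and it prevents dynamic
          # provisioning, so leave it out when the StorageClass creates volumes on demand.
          # selector:
          #   matchLabels:
          #     prometheus-storage: "true"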
Resource Calculation Formula
- Memory: ~3KB per active time series × 2 (headroom) = active_series × 6KB
- Disk (per day): ~1.5 bytes per sample × samples/day = series × (86400 / scrape_interval) × 1.5B
- Disk (total): daily_bytes × retention_days × 1.2 (compaction overhead)
- CPU: Driven by scrape frequency, PromQL query complexity, and rule evaluation
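As a worked example, the rules of thumb above can be turned into a small calculator. This Python sketch plugs in the lab values used in this part (about 50,000 active series, a 30s scrape interval, 15 days of retention). The constants are the approximations from the list, with 1KB taken as 1,000 bytes; the function and class names are illustrative, and real usage depends on label cardinality, churn, and query load.

```python
from dataclasses import dataclass

BYTES_PER_SERIES_IN_MEMORY = 3_000   # ~3KB per active series (rule of thumb)
MEMORY_HEADROOM = 2                  # x2 headroom
BYTES_PER_SAMPLE_ON_DISK = 1.5       # ~1.5 bytes per sample after compression
COMPACTION_OVERHEAD = 1.2            # extra space needed during compaction
SECONDS_PER_DAY = 86_400


@dataclass
class SizingEstimate:
    memory_bytes: float
    disk_bytes_per_day: float
    disk_bytes_total: float


def estimate(active_series, scrape_interval_seconds, retention_days):
    """Apply the memory and disk rules of thumb to one Prometheus instance."""
    memory = active_series * BYTES_PER_SERIES_IN_MEMORY * MEMORY_HEADROOM
    samples_per_day = active_series * (SECONDS_PER_DAY / scrape_interval_seconds)
    daily = samples_per_day * BYTES_PER_SAMPLE_ON_DISK
    total = daily * retention_days * COMPACTION_OVERHEAD
    return SizingEstimate(memory, daily, total)


lab = estimate(active_series=50_000, scrape_interval_seconds=30, retention_days=15)
print(f"memory:     {lab.memory_bytes / 1e9:.2f} GB")        # memory:     0.30 GB
print(f"disk/day:   {lab.disk_bytes_per_day / 1e9:.3f} GB")  # disk/day:   0.216 GB
print(f"disk total: {lab.disk_bytes_total / 1e9:.2f} GB")    # disk total: 3.89 GB
```

Treat the output as a rough estimate and compare it against observed usage (for example prometheus_tsdb_head_series and the size of the data volume) before committing to limits.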
Sizing Guidelines by Scale
Resource Sizing Guidelines
| Scale | Active Series | Memory | CPU | Disk (15d) | Scrape Interval |
|---|---|---|---|---|---|
| Small (dev/staging) | < 50K | 2–4 GB | 0.5–1 core | 20–50 GB | 30s |
| Medium (single team) | 50K–500K | 4–16 GB | 1–4 cores | 50–200 GB | 15–30s |
| Large (platform) | 500K–5M | 16–64 GB | 4–16 cores | 200 GB–1 TB | 15s |
| XL (requires sharding) | > 5M | 64+ GB per shard | 16+ cores | 1+ TB per shard | 15s |
RBAC Configuration
Prometheus RBAC Requirements
Prometheus needs specific RBAC permissions to discover and scrape targets via the Kubernetes API. The Helm chart creates these automatically, but understanding them is critical for troubleshooting:
# ClusterRole created by kube-prometheus-stack
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-stack-kube-prom-prometheus
rules:
  # Service discovery - find endpoints to scrape
  - apiGroups: [""]
    resources: ["nodes", "nodes/metrics", "services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
  # Read configmaps for service discovery
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get"]
  # Access to networking resources for ingress SD
  - apiGroups: ["networking.k8s.io"]
    resources: ["ingresses"]
    verbs: ["get", "list", "watch"]
  # Non-resource URLs (kubelet /metrics, API server /metrics)
  - nonResourceURLs: ["/metrics", "/metrics/cadvisor"]
    verbs: ["get"]
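When troubleshooting, it helps to read these rules the way the API server does: a request is allowed if at least one rule matches its API group, resource, and verb. The Python sketch below is a deliberately simplified model of that check, using the resource rules from the ClusterRole above. It ignores resourceNames, nonResourceURLs, and role aggregation, and the helper names are hypothetical.

```python
# Resource rules copied from the ClusterRole above (nonResourceURLs omitted).
RULES = [
    {"apiGroups": [""],
     "resources": ["nodes", "nodes/metrics", "services", "endpoints", "pods"],
     "verbs": ["get", "list", "watch"]},
    {"apiGroups": [""], "resources": ["configmaps"], "verbs": ["get"]},
    {"apiGroups": ["networking.k8s.io"], "resources": ["ingresses"],
     "verbs": ["get", "list", "watch"]},
]


def _matches(allowed_values, requested):
    """A rule field matches if it lists the value or the '*' wildcard."""
    return "*" in allowed_values or requested in allowed_values


def is_allowed(rules, api_group, resource, verb):
    """Return True if any single rule grants the verb on the resource."""
    return any(
        _matches(rule.get("apiGroups", []), api_group)
        and _matches(rule.get("resources", []), resource)
        and _matches(rule.get("verbs", []), verb)
        for rule in rules
    )


print(is_allowed(RULES, "", "pods", "list"))            # True: needed for discovery
print(is_allowed(RULES, "", "configmaps", "list"))      # False: only "get" is granted
print(is_allowed(RULES, "apps", "deployments", "get"))  # False: no rule covers "apps"
```

Rules are purely additive, so a missing permission always means a missing rule or binding, never a conflicting deny.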
Cross-Namespace Scraping
If Prometheus runs in the monitoring namespace but needs to scrape pods in production, it needs a ClusterRoleBinding (not a namespace-scoped RoleBinding):
# Verify the ClusterRoleBinding exists
kubectl get clusterrolebinding | grep prometheus
# If scrape targets show "403 Forbidden" in Prometheus targets page:
kubectl auth can-i get pods --as=system:serviceaccount:monitoring:prometheus-stack-kube-prom-prometheus -n production
# Should return: yes
Accessing & Verifying
Port Forwarding
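# On the kind lab cluster, the extraPortMappings and NodePort services already
# expose the UIs on localhost:9090, 9093, and 3000, so these port-forwards are
# unnecessary there and will typically fail with "address already in use".
# Use them on minikube or on any cluster without those host port mappings.
# Port-forward Prometheus UI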
kubectl port-forward -n monitoring svc/prometheus-stack-kube-prom-prometheus 9090:9090 &
# Port-forward Alertmanager UI
kubectl port-forward -n monitoring svc/prometheus-stack-kube-prom-alertmanager 9093:9093 &
# Port-forward Grafana
kubectl port-forward -n monitoring svc/prometheus-stack-grafana 3000:80 &
# Access in browser:
# Prometheus: http://localhost:9090
# Alertmanager: http://localhost:9093
# Grafana: http://localhost:3000 (admin / prom-lab-2026)
Verifying Scrape Targets
Navigate to Status → Targets in the Prometheus UI. You should see all configured targets with their state:
flowchart LR
subgraph Healthy["UP (Healthy Targets)"]
N["node-exporter<br/>4/4 targets"]
K["kubelet<br/>4/4 targets"]
KSM["kube-state-metrics<br/>1/1 targets"]
API["apiserver<br/>1/1 targets"]
AM["alertmanager<br/>1/1 targets"]
P["prometheus<br/>1/1 targets"]
end
subgraph Issues["Common Issues"]
ETD["etcd<br/>0/1 DOWN"]
SCH["scheduler<br/>0/1 DOWN"]
CM["controller-mgr<br/>0/1 DOWN"]
end
Issues -.->|"kind/minikube:<br/>bind to 127.0.0.1"| FIX["Fix: expose on<br/>0.0.0.0 or skip"]
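In kind and minikube, etcd, kube-scheduler, and kube-controller-manager bind their metrics endpoints to 127.0.0.1 by default, so Prometheus cannot reach them and reports these targets as DOWN. This is normal for lab environments. In self-managed production clusters where these endpoints are exposed to Prometheus, the targets will be UP.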
Your First PromQL Query
# Verify Prometheus is collecting data - run these in the Prometheus UI Expression Browser
# Count total active time series
prometheus_tsdb_head_series
# List all scrape jobs
count by (job) (up)
# Check scrape durations
scrape_duration_seconds{job="node-exporter"}
# Node memory usage in percent (from node-exporter)
(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100
# Container CPU usage (from kubelet/cAdvisor)
rate(container_cpu_usage_seconds_total{container!=""}[5m])
If these queries return data, your Prometheus deployment is working correctly. We’ll explore PromQL in much greater depth in Part 4.
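The same checks can be scripted against the Prometheus HTTP API, which serves instant queries at /api/v1/query. The sketch below uses only the Python standard library and assumes Prometheus is reachable at http://localhost:9090 as in this lab. The helper names are hypothetical, and the sample payload is illustrative data in the documented response shape, not output captured from a real cluster.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, expr):
    """Build the instant-query URL for a PromQL expression."""
    query = urllib.parse.urlencode({"query": expr})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def parse_instant_vector(payload):
    """Turn an instant-vector API response into (labels, value) pairs."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each sample's value is a [timestamp, "string value"] pair.
    return [(sample["metric"], float(sample["value"][1])) for sample in data["result"]]


def instant_query(base_url, expr, timeout=10):
    """Run an instant query against a live Prometheus server."""
    with urllib.request.urlopen(build_query_url(base_url, expr), timeout=timeout) as response:
        return parse_instant_vector(json.load(response))


# Illustrative payload for: count by (job) (up)
sample_payload = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"job": "node-exporter"}, "value": [1700000000.0, "4"]},
            {"metric": {"job": "kubelet"}, "value": [1700000000.0, "4"]},
        ],
    },
}

for labels, value in parse_instant_vector(sample_payload):
    print(f"{labels['job']}: {value:.0f} targets")
```

Against the running lab, calling instant_query("http://localhost:9090", "count by (job) (up)") should return one entry per scrape job.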
Conclusion & What’s Next
You now have a fully operational Prometheus stack running in Kubernetes with:
- The Prometheus Operator managing configuration as Kubernetes-native CRDs
- Persistent storage protecting metrics data across pod restarts
- ServiceMonitors auto-discovering workloads across all namespaces
- Grafana pre-configured with Kubernetes dashboards
- Node Exporter and kube-state-metrics providing infrastructure and object-level metrics
This lab environment will be our foundation for the rest of the Prometheus deep dive track. Keep it running — we’ll add to it incrementally.
Next in the Series
In Part 3: The Prometheus Data Model & TSDB, we’ll open the hood on the Prometheus Time Series Database — understanding the WAL, head blocks, chunk encoding, compaction, and the index structure that makes PromQL queries fast.