Prometheus Deep Dive Part 2: Deploying Prometheus to Kubernetes

June 15, 2026 Wasil Zafar 30 min read

From a blank Kubernetes cluster to a fully operational Prometheus stack in under 30 minutes — deploy the kube-prometheus-stack Helm chart, configure ServiceMonitor and PodMonitor CRDs, set up persistent storage, and build the lab environment we’ll use throughout this deep dive track.

Table of Contents

  1. Lab Environment Setup
  2. Understanding the Prometheus Operator
  3. Deploying with Helm
  4. ServiceMonitor & PodMonitor CRDs
  5. Storage & Resource Sizing
  6. RBAC Configuration
  7. Accessing & Verifying
  8. Conclusion & What’s Next

Lab Environment Setup

Before deploying Prometheus, we need a Kubernetes cluster. For this track, we’ll use kind (Kubernetes in Docker) as our primary lab environment — it’s lightweight, fast to create, and closely mirrors production clusters. All examples in Parts 2–12 are tested against this lab setup.

Creating a kind Cluster

Our lab cluster needs multiple nodes to demonstrate real-world scenarios like node-level metrics, anti-affinity, and pod distribution:

# kind-cluster-config.yaml
# A multi-node cluster for Prometheus lab exercises
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: prometheus-lab
nodes:
  - role: control-plane
    kubeadmConfigPatches:
      - |
        kind: InitConfiguration
        nodeRegistration:
          kubeletExtraArgs:
            node-labels: "topology.kubernetes.io/zone=us-east-1a"
    extraPortMappings:
      - containerPort: 30090
        hostPort: 9090
        protocol: TCP
      - containerPort: 30093
        hostPort: 9093
        protocol: TCP
      - containerPort: 30030
        hostPort: 3000
        protocol: TCP
  - role: worker
    kubeadmConfigPatches:
      - |
        kind: JoinConfiguration
        nodeRegistration:
          kubeletExtraArgs:
            node-labels: "topology.kubernetes.io/zone=us-east-1a"
  - role: worker
    kubeadmConfigPatches:
      - |
        kind: JoinConfiguration
        nodeRegistration:
          kubeletExtraArgs:
            node-labels: "topology.kubernetes.io/zone=us-east-1b"
  - role: worker
    kubeadmConfigPatches:
      - |
        kind: JoinConfiguration
        nodeRegistration:
          kubeletExtraArgs:
            node-labels: "topology.kubernetes.io/zone=us-east-1b"

# Create the cluster
kind create cluster --config kind-cluster-config.yaml

# Verify nodes are ready
kubectl get nodes -o wide
# NAME                          STATUS   ROLES           AGE   VERSION
# prometheus-lab-control-plane  Ready    control-plane   45s   v1.30.0
# prometheus-lab-worker         Ready    <none>          30s   v1.30.0
# prometheus-lab-worker2        Ready    <none>          30s   v1.30.0
# prometheus-lab-worker3        Ready    <none>          30s   v1.30.0

# Install metrics-server for resource metrics
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl patch deployment metrics-server -n kube-system --type='json' \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'

Minikube Alternative

If you prefer minikube or are on a resource-constrained machine:

# Start minikube with sufficient resources for the Prometheus stack
# (--cpus and --memory apply to each node; lower --nodes on a constrained machine)
minikube start \
  --cpus=4 \
  --memory=8192 \
  --disk-size=40g \
  --kubernetes-version=v1.30.0 \
  --nodes=3 \
  --driver=docker

# Enable required addons
minikube addons enable metrics-server
minikube addons enable default-storageclass
minikube addons enable storage-provisioner

Prerequisites & Tools

Required Tools for This Track

| Tool | Version | Purpose | Install |
|---|---|---|---|
| kubectl | ≥ 1.28 | Kubernetes CLI | brew install kubectl |
| helm | ≥ 3.14 | Helm chart management | brew install helm |
| kind | ≥ 0.22 | Local K8s clusters | brew install kind |
| jq | ≥ 1.7 | JSON processing | brew install jq |
| promtool | ≥ 2.53 | Prometheus rule validation | Bundled with Prometheus binary |
| Docker | ≥ 24.0 | Container runtime for kind | brew install --cask docker |

Understanding the Prometheus Operator

The Operator Pattern

The Prometheus Operator extends Kubernetes with Custom Resource Definitions (CRDs) that let you declare your monitoring configuration as Kubernetes-native YAML. Instead of manually editing prometheus.yml, you create Kubernetes resources that the operator watches and automatically translates into Prometheus configuration.

Prometheus Operator Reconciliation Flow
flowchart TD
    subgraph CRDs["Custom Resources (You Write)"]
        SM["ServiceMonitor"]
        PM["PodMonitor"]
        PR["PrometheusRule"]
        P["Prometheus"]
        AM["AlertmanagerConfig"]
    end

    subgraph Operator["Prometheus Operator (Watches)"]
        RC["Reconciliation Controller"]
    end

    subgraph Generated["Generated Artifacts"]
        CFG["prometheus.yml<br/>(scrape configs)"]
        RULES["rule_files/<br/>(alert + recording rules)"]
        SEC["secrets/<br/>(TLS certs, bearer tokens)"]
    end

    subgraph Running["Running Workloads"]
        PROM["Prometheus StatefulSet"]
        AMGR["Alertmanager StatefulSet"]
    end

    SM --> RC
    PM --> RC
    PR --> RC
    P --> RC
    AM --> RC
    RC --> CFG
    RC --> RULES
    RC --> SEC
    CFG --> PROM
    RULES --> PROM
    SEC --> PROM
    AM --> AMGR

Custom Resource Definitions

The Prometheus Operator introduces these CRDs to your cluster:

| CRD | Purpose | Generates |
|---|---|---|
| Prometheus | Defines a Prometheus server instance | StatefulSet + Secret (generated config) + Service |
| ServiceMonitor | Declares scrape targets via Kubernetes Services | scrape_configs entries |
| PodMonitor | Declares scrape targets directly from Pods | scrape_configs entries |
| PrometheusRule | Recording rules and alerting rules | Rule files mounted into Prometheus |
| Alertmanager | Defines an Alertmanager instance | StatefulSet + Secret (generated config) |
| AlertmanagerConfig | Per-namespace alert routing config | Alertmanager configuration sections |
| ScrapeConfig | Generic scrape targets (static, DNS, HTTP SD) | scrape_configs entries |
| PrometheusAgent | Prometheus in agent mode (remote-write only) | StatefulSet in agent mode |

Reconciliation Loop

How the Operator Works: The operator runs a continuous reconciliation loop. When you create, update, or delete a ServiceMonitor, the operator detects the change and regenerates the Prometheus configuration. A config-reloader sidecar in the Prometheus pod then triggers a reload via Prometheus’ /-/reload HTTP endpoint, all without restarting the pod. The reload itself typically takes 1–3 seconds, though the regenerated configuration can take longer to reach the pod.
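
To see what the reconciliation loop actually produced, it can help to read the generated configuration back out of the cluster. The sketch below assumes the operator stores it in a Secret named `prometheus-<Prometheus resource name>` under the key `prometheus.yaml.gz`; the helper names are hypothetical, and the resource name is inferred from this lab's Helm release name.

```shell
# Assumption: the operator stores the rendered configuration in a Secret named
# "prometheus-<Prometheus resource name>" under the key "prometheus.yaml.gz".
generated_config_secret() {
  printf 'prometheus-%s\n' "$1"
}

# Print the generated prometheus.yml for a Prometheus resource.
# Usage: dump_generated_config <namespace> <prometheus-resource-name>
dump_generated_config() {
  kubectl get secret "$(generated_config_secret "$2")" -n "$1" \
    -o 'jsonpath={.data.prometheus\.yaml\.gz}' | base64 -d | gunzip
}

# Example (requires a running cluster):
#   dump_generated_config monitoring prometheus-stack-kube-prom-prometheus | grep job_name
```

Each ServiceMonitor or PodMonitor that Prometheus selects should show up as one or more `job_name` entries in the output, which makes this a quick way to confirm that a new monitor was picked up.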

Deploying with Helm

Key Helm Values

The kube-prometheus-stack chart bundles the Prometheus Operator, Prometheus, Alertmanager, Grafana, Node Exporter, and kube-state-metrics into a single deployment. Here’s a production-oriented values file:

# values-prometheus-lab.yaml
# Production-oriented values for kube-prometheus-stack
# Adjust resource limits based on your cluster size

prometheus:
  prometheusSpec:
    # Retention and storage
    retention: 15d
    retentionSize: "45GB"

    # Resource limits - sized for ~50,000 active series
    resources:
      requests:
        cpu: 500m
        memory: 2Gi
      limits:
        cpu: 2000m
        memory: 4Gi

    # Persistent storage
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: standard
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 50Gi

    # Select all ServiceMonitors, PodMonitors, and rules, not only those
    # labeled with this Helm release
    serviceMonitorSelectorNilUsesHelmValues: false
    podMonitorSelectorNilUsesHelmValues: false
    ruleSelectorNilUsesHelmValues: false

    # Scrape interval and timeout
    scrapeInterval: "30s"
    scrapeTimeout: "10s"
    evaluationInterval: "30s"

    # Enable remote write for future long-term storage
    # remoteWrite:
    #   - url: "http://mimir-distributor:8080/api/v1/push"

    # Additional scrape configs (for targets without ServiceMonitors)
    additionalScrapeConfigs: []

  # Expose via NodePort for lab access
  service:
    type: NodePort
    nodePort: 30090

alertmanager:
  alertmanagerSpec:
    resources:
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 500m
        memory: 512Mi
    storage:
      volumeClaimTemplate:
        spec:
          storageClassName: standard
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 5Gi
  service:
    type: NodePort
    nodePort: 30093

grafana:
  adminPassword: "prom-lab-2026"
  persistence:
    enabled: true
    size: 5Gi
  service:
    type: NodePort
    nodePort: 30030
  sidecar:
    dashboards:
      enabled: true
      searchNamespace: ALL
    datasources:
      enabled: true

# Node Exporter - deploy on all nodes
nodeExporter:
  enabled: true

# kube-state-metrics - Kubernetes object metrics
kubeStateMetrics:
  enabled: true

# Component scraping configuration
kubeApiServer:
  enabled: true
kubeControllerManager:
  enabled: true
kubeScheduler:
  enabled: true
kubeEtcd:
  enabled: true
kubelet:
  enabled: true
  serviceMonitor:
    metricRelabelings:
      # Drop high-cardinality kubelet metrics in lab environment
      - sourceLabels: [__name__]
        regex: 'kubelet_runtime_operations_duration_seconds_bucket'
        action: drop

Installation Steps

# Add the Prometheus community Helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Create a dedicated monitoring namespace
kubectl create namespace monitoring

# Install kube-prometheus-stack with our custom values
helm install prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --values values-prometheus-lab.yaml \
  --version 62.3.0 \
  --wait --timeout 10m

# Verify all pods are running
kubectl get pods -n monitoring -o wide
# NAME                                                     READY   STATUS    RESTARTS
# alertmanager-prometheus-stack-kube-prom-alertmanager-0    2/2     Running   0
# prometheus-prometheus-stack-kube-prom-prometheus-0        2/2     Running   0
# prometheus-stack-grafana-7d9f4c8b4-x2k9m                 3/3     Running   0
# prometheus-stack-kube-prom-operator-5c8b9f6d4-p7h2n      1/1     Running   0
# prometheus-stack-kube-state-metrics-6c8b5d6f4-r9t3k      1/1     Running   0
# prometheus-stack-prometheus-node-exporter-abcde           1/1     Running   0
# prometheus-stack-prometheus-node-exporter-fghij           1/1     Running   0
# prometheus-stack-prometheus-node-exporter-klmno           1/1     Running   0
# ... (one node-exporter pod per node, so expect four in the kind lab)

What Gets Deployed

kube-prometheus-stack Components
flowchart TD
    subgraph Helm["kube-prometheus-stack Chart"]
        subgraph Core["Core Components"]
            OP["Prometheus Operator<br/>(Deployment)"]
            PROM["Prometheus Server<br/>(StatefulSet, 1 replica)"]
            AM["Alertmanager<br/>(StatefulSet, 1 replica)"]
        end

        subgraph Collectors["Metric Collectors"]
            NE["Node Exporter<br/>(DaemonSet, all nodes)"]
            KSM["kube-state-metrics<br/>(Deployment, 1 replica)"]
        end

        subgraph Viz["Visualization"]
            GF["Grafana<br/>(Deployment + dashboards)"]
        end

        subgraph CRDS["Pre-configured CRDs"]
            SM1["ServiceMonitors<br/>(API server, kubelet,<br/>etcd, node-exporter, KSM)"]
            PR1["PrometheusRules<br/>(K8s alerts, node alerts,<br/>Prometheus self-monitoring)"]
        end
    end

    OP -->|manages| PROM
    OP -->|manages| AM
    SM1 -->|discovered by| PROM
    PR1 -->|loaded by| PROM
    NE -->|scraped by| PROM
    KSM -->|scraped by| PROM
    PROM -->|data source| GF
    PROM -->|fires alerts| AM

ServiceMonitor & PodMonitor CRDs

ServiceMonitor Specification

A ServiceMonitor tells Prometheus to scrape pods behind a Kubernetes Service. It’s the most common way to add custom scrape targets:

# servicemonitor-example.yaml
# Monitor a custom application exposing metrics on port 8080
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app-metrics
  namespace: monitoring
  labels:
    # Required only when Prometheus selects ServiceMonitors by label
    # (the chart default selects on release: <Helm release name>)
    release: prometheus-stack
spec:
  # Which namespaces to find the target Services in
  namespaceSelector:
    matchNames:
      - production
      - staging

  # Select Services with these labels
  selector:
    matchLabels:
      app.kubernetes.io/name: my-app
      metrics: "enabled"

  # Endpoint configuration (how to scrape)
  endpoints:
    - port: metrics          # Named port on the Service
      path: /metrics         # Metrics endpoint path
      interval: 30s          # Override global scrape interval
      scrapeTimeout: 10s     # Timeout per scrape
      scheme: https          # Use HTTPS
      tlsConfig:
        insecureSkipVerify: false
        caFile: /etc/prometheus/certs/ca.crt
      bearerTokenSecret:
        name: prometheus-token
        key: token
      # Relabeling - add custom labels to all metrics from this target
      relabelings:
        - sourceLabels: [__meta_kubernetes_namespace]
          targetLabel: namespace
        - sourceLabels: [__meta_kubernetes_pod_name]
          targetLabel: pod
        - sourceLabels: [__meta_kubernetes_pod_label_version]
          targetLabel: app_version
      # Metric relabeling - filter or modify metrics after scraping
      metricRelabelings:
        - sourceLabels: [__name__]
          regex: 'go_gc_.*'
          action: drop
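
A ServiceMonitor like the one above yields targets only if a Service in a selected namespace carries the matching labels and exposes a port with the referenced name. The following is a minimal sketch for checking that; the function name is hypothetical, and the labels and namespaces are the ones used in the example.

```shell
# List Services in a namespace that carry the labels the ServiceMonitor selects on,
# together with their port names, so a missing "metrics" port is easy to spot.
# Usage: matching_services <namespace>
matching_services() {
  kubectl get services -n "$1" \
    -l 'app.kubernetes.io/name=my-app,metrics=enabled' \
    -o 'custom-columns=NAME:.metadata.name,PORTS:.spec.ports[*].name'
}

# Example (requires a running cluster):
#   for ns in production staging; do matching_services "$ns"; done
```

If a namespace returns no rows, or none of the listed ports is named `metrics`, the ServiceMonitor has nothing to scrape there.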

PodMonitor Specification

Use PodMonitor when your pods don’t have a Service in front of them (sidecar containers, batch jobs, DaemonSets with host-network):

# podmonitor-example.yaml
# Monitor pods directly without a Service
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: envoy-sidecar-metrics
  namespace: monitoring
  labels:
    release: prometheus-stack
spec:
  namespaceSelector:
    matchNames:
      - production

  # Select pods directly by label
  selector:
    matchLabels:
      sidecar.istio.io/inject: "true"

  podMetricsEndpoints:
    - port: http-envoy-prom   # Named port on the Pod spec
      path: /stats/prometheus
      interval: 15s
      relabelings:
        - sourceLabels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          regex: "true"
          action: keep

Namespace Selection Strategies

Common Pitfall: With the chart defaults, Prometheus already looks for ServiceMonitors in every namespace, but it selects only those labeled release: <Helm release name>. Setting serviceMonitorSelectorNilUsesHelmValues: false in the Helm values removes that label requirement. Without it, ServiceMonitors that lack the release label are silently ignored, wherever they are created.

# Three namespace selection strategies:

# Strategy 1: Monitor ALL namespaces (recommended for most clusters)
spec:
  namespaceSelector:
    any: true

# Strategy 2: Specific namespaces (multi-tenant isolation)
spec:
  namespaceSelector:
    matchNames:
      - team-a-production
      - team-a-staging

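# Strategy 3: Namespace labels (dynamic, scales with new namespaces)
# The namespaceSelector of a ServiceMonitor or PodMonitor accepts only `any` and
# `matchNames`. Label-based selection is configured on the Prometheus resource
# instead, where it controls which namespaces are searched for ServiceMonitors
# (shown here as Helm values):
prometheus:
  prometheusSpec:
    serviceMonitorNamespaceSelector:
      matchLabels:
        monitoring: enabled
# Then label your namespaces:
# kubectl label namespace production monitoring=enabled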
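
When a ServiceMonitor is being ignored, one way to narrow it down is to read the selectors from the running Prometheus resource. This is a sketch only; the function name is hypothetical and the resource name in the example is inferred from this lab's Helm release name.

```shell
# Print the selectors the Prometheus resource uses to pick up ServiceMonitors.
# An empty {} selector matches everything; a matchLabels entry restricts selection.
# Usage: show_monitor_selectors <namespace> <prometheus-resource-name>
show_monitor_selectors() {
  kubectl get prometheus "$2" -n "$1" -o \
    'jsonpath={"serviceMonitorSelector: "}{.spec.serviceMonitorSelector}{"\n"}{"serviceMonitorNamespaceSelector: "}{.spec.serviceMonitorNamespaceSelector}{"\n"}'
}

# Example (requires a running cluster):
#   show_monitor_selectors monitoring prometheus-stack-kube-prom-prometheus
```

Comparing the printed selectors with the labels on the ServiceMonitor and on its namespace usually shows which of the two is filtering it out.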

Storage & Resource Sizing

Persistent Volume Configuration

Without persistent storage, Prometheus keeps its data in an emptyDir volume, so all metrics are lost whenever the pod is deleted or rescheduled. For any environment beyond throwaway testing, configure a PersistentVolumeClaim:

# The storageSpec in the Prometheus CRD
prometheus:
  prometheusSpec:
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: gp3    # AWS EBS gp3 for production
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 100Gi
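          # Optional: bind to pre-provisioned PersistentVolumes by label.
          # Leave this out when the StorageClass provisions volumes dynamically,
          # because a claim with a selector cannot be dynamically provisioned.
          # selector:
          #   matchLabels:
          #     prometheus-storage: "true"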

Resource Calculation Formula

Prometheus Resource Sizing Formulas:
  • Memory: ~3KB per active time series × 2 (headroom) = active_series × 6KB
  • Disk (per day): ~1.5 bytes per sample × samples/day = series × (86400 / scrape_interval) × 1.5B
  • Disk (total): daily_bytes × retention_days × 1.2 (compaction overhead)
  • CPU: Driven by scrape frequency, PromQL query complexity, and rule evaluation
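
As a worked example of the formulas above, the sketch below plugs in illustrative lab numbers (50,000 active series, a 30s scrape interval, 15 days of retention). The function names are hypothetical, and the results are rule-of-thumb estimates rather than measurements.

```shell
# Rule-of-thumb Prometheus sizing, using the approximations quoted above.
# Integer arithmetic only, so use scrape intervals that divide 86400 evenly.

# Memory in KB: active_series x 6 KB (3 KB per series, doubled for headroom).
# Usage: estimate_memory_kb <active_series>
estimate_memory_kb() {
  echo $(( $1 * 6 ))
}

# Disk per day in bytes: series x (86400 / scrape_interval) x 1.5 bytes.
# The x1.5 factor is written as x3/2 to stay in integer arithmetic.
# Usage: estimate_disk_bytes_per_day <active_series> <scrape_interval_seconds>
estimate_disk_bytes_per_day() {
  echo $(( $1 * (86400 / $2) * 3 / 2 ))
}

# Total disk in bytes: daily_bytes x retention_days x 1.2 (compaction overhead).
# Usage: estimate_disk_bytes_total <active_series> <scrape_interval_seconds> <retention_days>
estimate_disk_bytes_total() {
  daily=$(estimate_disk_bytes_per_day "$1" "$2")
  echo $(( daily * $3 * 12 / 10 ))
}

SERIES=50000
INTERVAL=30
RETENTION_DAYS=15
echo "memory:     $(estimate_memory_kb "$SERIES") KB"
echo "disk/day:   $(estimate_disk_bytes_per_day "$SERIES" "$INTERVAL") bytes"
echo "disk total: $(estimate_disk_bytes_total "$SERIES" "$INTERVAL" "$RETENTION_DAYS") bytes"
```

For these inputs the estimates come to 300,000 KB of memory (about 300 MB), 216,000,000 bytes per day (about 216 MB), and 3,888,000,000 bytes over 15 days (about 3.9 GB).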

Sizing Guidelines by Scale

Resource Sizing Guidelines

| Scale | Active Series | Memory | CPU | Disk (15d) | Scrape Interval |
|---|---|---|---|---|---|
| Small (dev/staging) | < 50K | 2–4 GB | 0.5–1 core | 20–50 GB | 30s |
| Medium (single team) | 50K–500K | 4–16 GB | 1–4 cores | 50–200 GB | 15–30s |
| Large (platform) | 500K–5M | 16–64 GB | 4–16 cores | 200 GB–1 TB | 15s |
| XL (requires sharding) | > 5M | 64+ GB per shard | 16+ cores | 1+ TB per shard | 15s |

RBAC Configuration

Prometheus RBAC Requirements

Prometheus needs specific RBAC permissions to discover and scrape targets via the Kubernetes API. The Helm chart creates these automatically, but understanding them is critical for troubleshooting:

# ClusterRole created by kube-prometheus-stack (representative; exact rules vary by chart version)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-stack-kube-prom-prometheus
rules:
  # Service discovery - find endpoints to scrape
  - apiGroups: [""]
    resources: ["nodes", "nodes/metrics", "services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
  # Read configmaps for service discovery
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get"]
  # Access to networking resources for ingress SD
  - apiGroups: ["networking.k8s.io"]
    resources: ["ingresses"]
    verbs: ["get", "list", "watch"]
  # Non-resource URLs (kubelet /metrics, API server /metrics)
  - nonResourceURLs: ["/metrics", "/metrics/cadvisor"]
    verbs: ["get"]

Cross-Namespace Scraping

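If Prometheus runs in the monitoring namespace but needs to scrape pods in production, its ServiceAccount needs permissions in that namespace too. A ClusterRoleBinding grants them cluster-wide; a RoleBinding in monitoring alone does not cover production: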

# Verify the ClusterRoleBinding exists
kubectl get clusterrolebinding | grep prometheus

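# If targets in production never appear on the Prometheus targets page, check that the
# Prometheus ServiceAccount can list pods there (discovery needs get, list, and watch):
kubectl auth can-i list pods --as=system:serviceaccount:monitoring:prometheus-stack-kube-prom-prometheus -n production
# Should return: yes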
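
The single check above can be extended to every verb and resource that service discovery relies on. The loop below is a rough sketch; the function name is hypothetical, and the resource list mirrors the core-group rule in the ClusterRole shown earlier.

```shell
# Check the discovery permissions of the Prometheus ServiceAccount in a namespace.
# Prints one "verb resource: yes|no" line per check.
# Usage: check_discovery_rbac <target-namespace> <serviceaccount-namespace> <serviceaccount-name>
check_discovery_rbac() {
  sa="system:serviceaccount:$2:$3"
  for resource in services endpoints pods; do
    for verb in get list watch; do
      # "kubectl auth can-i" exits non-zero on "no", so ignore the exit status
      answer=$(kubectl auth can-i "$verb" "$resource" --as="$sa" -n "$1") || true
      echo "$verb $resource: $answer"
    done
  done
}

# Example (requires a running cluster):
#   check_discovery_rbac production monitoring prometheus-stack-kube-prom-prometheus
```

Any line ending in `no` points at a missing rule or binding for that verb and resource in the target namespace.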

Accessing & Verifying

Port Forwarding

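# On the kind lab, the NodePort mappings in kind-cluster-config.yaml already publish
# these UIs on localhost:9090, 9093, and 3000, so the port-forwards below are only
# needed on clusters without those mappings (for example minikube). On the kind lab
# they fail because the host ports are already in use.

# Port-forward Prometheus UI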
kubectl port-forward -n monitoring svc/prometheus-stack-kube-prom-prometheus 9090:9090 &

# Port-forward Alertmanager UI
kubectl port-forward -n monitoring svc/prometheus-stack-kube-prom-alertmanager 9093:9093 &

# Port-forward Grafana
kubectl port-forward -n monitoring svc/prometheus-stack-grafana 3000:80 &

# Access in browser:
# Prometheus: http://localhost:9090
# Alertmanager: http://localhost:9093
# Grafana: http://localhost:3000 (admin / prom-lab-2026)

Verifying Scrape Targets

Navigate to Status → Targets in the Prometheus UI. You should see all configured targets with their state:

Expected Target States After Deployment
flowchart LR
    subgraph Healthy["UP (Healthy Targets)"]
        N["node-exporter<br/>4/4 targets"]
        K["kubelet<br/>4/4 targets"]
        KSM["kube-state-metrics<br/>1/1 targets"]
        API["apiserver<br/>1/1 targets"]
        AM["alertmanager<br/>1/1 targets"]
        P["prometheus<br/>1/1 targets"]
    end

    subgraph Issues["Common Issues"]
        ETD["etcd<br/>0/1 DOWN"]
        SCH["scheduler<br/>0/1 DOWN"]
        CM["controller-mgr<br/>0/1 DOWN"]
    end

    Issues -.->|"kind/minikube:<br/>bind to 127.0.0.1"| FIX["Fix: expose on<br/>0.0.0.0 or skip"]
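
Expected in kind/minikube: The etcd, scheduler, and controller-manager targets may show as DOWN because kubeadm-based clusters bind these components to 127.0.0.1 by default. This is normal for lab environments. In production, these targets come UP only if the control plane exposes their metrics endpoints to Prometheus. On managed control planes (EKS, GKE, AKS) the endpoints are usually unreachable, and the corresponding kubeEtcd, kubeScheduler, and kubeControllerManager values are typically disabled.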

Your First PromQL Query

# Verify Prometheus is collecting data - run these in the Prometheus UI Expression Browser

# Count total active time series
prometheus_tsdb_head_series

# List all scrape jobs
count by (job) (up)

# Check scrape durations
scrape_duration_seconds{job="node-exporter"}

# Node memory usage in percent (from node-exporter)
(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100

# Container CPU usage (from kubelet/cAdvisor)
rate(container_cpu_usage_seconds_total{container!=""}[5m])

If these queries return data, your Prometheus deployment is working correctly. We’ll explore PromQL in much greater depth in Part 4.
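
The same checks can be scripted against the Prometheus HTTP API instead of the UI. The sketch below assumes Prometheus is reachable at http://localhost:9090, as in this lab; the `prom_query` helper and the `PROM_URL` variable are hypothetical names introduced for illustration.

```shell
# Assumption: Prometheus is reachable at PROM_URL (the lab's localhost address).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Run an instant query against the Prometheus HTTP API and print the raw JSON.
# Usage: prom_query '<promql expression>'
prom_query() {
  curl -fsS --get "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}

# Example (requires the lab to be running and jq installed):
#   prom_query 'count by (job) (up)' | jq -r '.data.result[] | "\(.metric.job) \(.value[1])"'
```

Passing the expression through `--data-urlencode` avoids having to escape braces, quotes, and spaces in PromQL by hand.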

Conclusion & What’s Next

You now have a fully operational Prometheus stack running in Kubernetes with:

  • The Prometheus Operator managing configuration as Kubernetes-native CRDs
  • Persistent storage protecting metrics data across pod restarts
  • ServiceMonitors auto-discovering workloads across all namespaces
  • Grafana pre-configured with Kubernetes dashboards
  • Node Exporter and kube-state-metrics providing infrastructure and object-level metrics

This lab environment will be our foundation for the rest of the Prometheus deep dive track. Keep it running — we’ll add to it incrementally.

Next in the Series

In Part 3: The Prometheus Data Model & TSDB, we’ll open the hood on the Prometheus Time Series Database — understanding the WAL, head blocks, chunk encoding, compaction, and the index structure that makes PromQL queries fast.