Prometheus Deep Dive Part 11: Extending Prometheus with Thanos

June 15, 2026 Wasil Zafar 30 min read

Thanos extends existing Prometheus deployments with unlimited retention, a global query view across clusters, and automatic downsampling — all without replacing Prometheus. Learn the sidecar model, deploy every Thanos component, and architect multi-cluster observability.

Table of Contents

  1. Thanos Overview & Philosophy
  2. Architecture
  3. Production Deployment
  4. Multi-Cluster Architecture
  5. Thanos vs Mimir
  6. Conclusion

Thanos Overview & Philosophy

Thanos is a CNCF Incubating project that extends Prometheus with long-term storage and global querying capabilities. Unlike Mimir or VictoriaMetrics, Thanos doesn’t replace Prometheus — it augments existing Prometheus deployments by:

  • Uploading TSDB blocks from Prometheus to object storage (S3/GCS/Azure) via a sidecar
  • Querying across multiple Prometheus instances transparently through a single endpoint
  • Downsampling historical data automatically (5m and 1h resolutions)
  • Deduplicating HA pair data at query time

Core Philosophy: Thanos treats Prometheus as the source of truth for recent data. It layers on top of existing Prometheus instances without requiring changes to your scrape configuration, alerting rules, or operational procedures. Prometheus continues to work exactly as before; Thanos adds capabilities around it.

Architecture

Thanos Component Architecture
flowchart TD
    subgraph Cluster1["Cluster: US-East"]
        P1[Prometheus + Sidecar]
        P2[Prometheus + Sidecar]
    end

    subgraph Cluster2["Cluster: EU-West"]
        P3[Prometheus + Sidecar]
        P4[Prometheus + Sidecar]
    end

    subgraph Thanos["Thanos Global Layer"]
        TQ[Thanos Query]
        SG[Store Gateway]
        TC[Compactor]
        TR[Ruler]
    end

    subgraph Storage["Object Store"]
        S3[(S3 Bucket)]
    end

    P1 & P2 -->|"upload blocks"| S3
    P3 & P4 -->|"upload blocks"| S3
    P1 & P2 & P3 & P4 -->|"StoreAPI gRPC"| TQ
    SG -->|"serves historical"| TQ
    SG --> S3
    TC -->|"compact + downsample"| S3
    TR --> TQ
    GF[Grafana] --> TQ
                            

Thanos Sidecar

The Sidecar runs alongside each Prometheus instance as a container in the same pod. It has two responsibilities:

  1. Block Upload: Watches Prometheus’s data directory and uploads completed TSDB blocks to object storage
  2. StoreAPI: Exposes a gRPC StoreAPI endpoint that Thanos Query can connect to for real-time data from Prometheus’s head block
# Thanos Sidecar container — added to Prometheus pod
containers:
  - name: thanos-sidecar
    image: quay.io/thanos/thanos:v0.35.1
    args:
      - sidecar
      - --tsdb.path=/prometheus
      - --prometheus.url=http://localhost:9090
      - --objstore.config-file=/etc/thanos/objstore.yaml
      - --grpc-address=0.0.0.0:10901
      - --http-address=0.0.0.0:10902
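      # Optional, for one-time migration only: also upload blocks that Prometheus
      # has already compacted. Remove this flag once historical blocks are shipped;
      # new 2h blocks are uploaded automatically as Prometheus finishes them.
      - --shipper.upload-compacted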
    ports:
      - containerPort: 10901
        name: grpc
      - containerPort: 10902
        name: http
    volumeMounts:
      - name: prometheus-storage
        mountPath: /prometheus
      - name: thanos-objstore-config
        mountPath: /etc/thanos
    resources:
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        memory: 512Mi

Critical Requirement: When using the sidecar, set both --storage.tsdb.min-block-duration and --storage.tsdb.max-block-duration to 2h on Prometheus. Equal values disable Prometheus’s local compaction, so the sidecar uploads untouched 2h blocks and the Thanos Compactor does all compaction in object storage. If Prometheus compacts locally as well, blocks can be skipped or uploaded with overlapping time ranges, and the Compactor treats overlaps as an error and halts.

Thanos Query

Thanos Query implements the Prometheus HTTP API and fans out queries to multiple StoreAPI endpoints (sidecars, store gateways, other query instances). It merges results, deduplicates HA replicas, and returns a unified response:

# Thanos Query deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: thanos-query
  namespace: monitoring
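spec:
  replicas: 2
  # apps/v1 requires a selector that matches the pod template labels;
  # the label name used here is illustrative.
  selector:
    matchLabels:
      app: thanos-query
  template:
    metadata:
      labels:
        app: thanos-query
    spec: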
      containers:
        - name: thanos-query
          image: quay.io/thanos/thanos:v0.35.1
          args:
            - query
            - --log.level=info
            - --query.replica-label=__replica__
            - --query.replica-label=prometheus_replica
            - --query.auto-downsampling
            - --query.max-concurrent=20
            - --query.timeout=2m
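            # Discover StoreAPI endpoints via DNS SRV records.
            # Note: newer Thanos releases deprecate --store in favor of --endpoint.
            - --store=dnssrv+_grpc._tcp.thanos-sidecar.monitoring.svc
            - --store=dnssrv+_grpc._tcp.thanos-store-gateway.monitoring.svc
            # Alternative: static endpoints (use instead of DNS discovery for the same targets)
            # - --store=thanos-sidecar-us-east:10901
            # - --store=thanos-sidecar-eu-west:10901
            # - --store=thanos-store-gateway:10901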
          ports:
            - containerPort: 10902
              name: http
            - containerPort: 10901
              name: grpc
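
The Ruler configuration later in this article looks up Query through dnssrv+_http._tcp.thanos-query.monitoring.svc, which only resolves if a Service with that name publishes a port named http. A minimal sketch of such a Service follows. It assumes the Query pods carry an app: thanos-query label, which is an illustrative name rather than something the original manifest defines.

```yaml
# Headless Service for Thanos Query (sketch).
# Assumption: Query pods are labeled app: thanos-query.
apiVersion: v1
kind: Service
metadata:
  name: thanos-query
  namespace: monitoring
spec:
  clusterIP: None          # headless, so DNS SRV returns one record per pod
  ports:
    - name: http           # matches _http._tcp in the dnssrv+ lookup
      port: 10902
      targetPort: http
    - name: grpc           # lets other Query instances reach this one over StoreAPI
      port: 10901
      targetPort: grpc
  selector:
    app: thanos-query
```

Grafana can point at the same Service on port 10902, since Thanos Query serves the Prometheus HTTP API there.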

Store Gateway

The Store Gateway serves historical TSDB blocks from object storage. It indexes block metadata and serves queries against them through the StoreAPI. It caches index data locally for fast lookups:

# Thanos Store Gateway — serves historical data from object store
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: thanos-store-gateway
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: thanos-store
          image: quay.io/thanos/thanos:v0.35.1
          args:
            - store
            - --data-dir=/data
            - --objstore.config-file=/etc/thanos/objstore.yaml
            - --index-cache-size=2GB
            - --chunk-pool-size=4GB
            - --grpc-address=0.0.0.0:10901
            - --http-address=0.0.0.0:10902
            # Time-based partitioning (optional for very large stores)
            # - --min-time=-720h  # Only serve last 30 days
            # - --max-time=-2h    # Don't serve very recent (sidecar handles it)
          volumeMounts:
            - name: data
              mountPath: /data
            - name: objstore-config
              mountPath: /etc/thanos
          resources:
            requests:
              cpu: "1"
              memory: 4Gi
            limits:
              memory: 8Gi
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 50Gi    # For index cache

Compactor & Downsampling

The Compactor runs as a singleton, continuously processing blocks in object storage:

  • Compaction: Merges small 2h blocks into progressively larger blocks (up to two weeks by default), reducing object count and improving query performance
  • Downsampling: Creates 5-minute resolution data for blocks older than 40 hours and 1-hour resolution data for blocks older than 10 days
  • Retention: Deletes blocks older than the retention period configured for each resolution

# Thanos Compactor — singleton deployment
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: thanos-compactor
spec:
  replicas: 1    # MUST be 1 — singleton
  template:
    spec:
      containers:
        - name: thanos-compact
          image: quay.io/thanos/thanos:v0.35.1
          args:
            - compact
            - --data-dir=/data
            - --objstore.config-file=/etc/thanos/objstore.yaml
            - --http-address=0.0.0.0:10902
            - --wait                    # Run continuously (not one-shot)
            - --wait-interval=5m
            # Retention configuration
            - --retention.resolution-raw=90d      # Keep raw data 90 days
            - --retention.resolution-5m=365d      # Keep 5m downsampled 1 year
            - --retention.resolution-1h=1825d     # Keep 1h downsampled 5 years
            # Downsampling
            - --downsample.concurrency=4
            # Compaction
            - --compact.concurrency=2
          volumeMounts:
            - name: data
              mountPath: /data
            - name: objstore-config
              mountPath: /etc/thanos
          resources:
            requests:
              cpu: "2"
              memory: 4Gi
            limits:
              memory: 8Gi
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi    # Scratch space for compaction

Thanos Ruler

Thanos Ruler evaluates recording and alerting rules against the global Thanos Query endpoint, enabling rules that span multiple Prometheus instances:

# Thanos Ruler — evaluates rules against global view
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: thanos-ruler
spec:
  replicas: 2    # HA pair
  template:
    spec:
      containers:
        - name: thanos-rule
          image: quay.io/thanos/thanos:v0.35.1
          args:
            - rule
            - --data-dir=/data
            - --objstore.config-file=/etc/thanos/objstore.yaml
            - --rule-file=/etc/thanos-rules/*.yaml
            - --query=dnssrv+_http._tcp.thanos-query.monitoring.svc
            - --alertmanagers.url=http://alertmanager:9093
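            - --alert.label-drop=__replica__
            - --label=ruler_cluster="global"
            # Give each HA replica a unique label so uploaded blocks do not collide
            # and Query can deduplicate them; POD_NAME is injected below.
            - --label=__replica__="$(POD_NAME)"
            - --grpc-address=0.0.0.0:10901
            - --http-address=0.0.0.0:10902
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name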
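
To make the "rules that span multiple Prometheus instances" idea concrete, here is a hedged sketch of a rule file the Ruler could load from /etc/thanos-rules/. The file name, group name, rule names, and thresholds are all hypothetical. It relies on the cluster external label that each Prometheus attaches, and on the Thanos-specific partial_response_strategy field for rule groups.

```yaml
# /etc/thanos-rules/global.yaml (hypothetical file name and rules)
groups:
  - name: global-availability
    interval: 1m
    # Thanos-specific: fail the evaluation rather than act on incomplete data
    partial_response_strategy: abort
    rules:
      # Recording rule computed across every cluster visible to Thanos Query
      - record: cluster:up:ratio
        expr: avg by (cluster) (up)
      # Alert when a cluster reports fewer than 90% of its targets as up
      - alert: ClusterTargetsDown
        expr: cluster:up:ratio < 0.9
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Fewer than 90% of targets are up in {{ $labels.cluster }}"
```

Rules that only need data from a single Prometheus are usually better left in that Prometheus, because Ruler evaluation depends on the Query layer being reachable.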

Production Deployment

Object Store Configuration

# objstore.yaml — S3 configuration
type: S3
config:
  bucket: thanos-metrics-prod
  endpoint: s3.us-east-1.amazonaws.com
  region: us-east-1
  # Use IAM role (IRSA) rather than static credentials
  # access_key: ""
  # secret_key: ""
  insecure: false
  signature_version2: false
  http_config:
    idle_conn_timeout: 90s
    response_header_timeout: 2m
    tls_config:
      insecure_skip_verify: false

# objstore.yaml: GCS alternative (a separate file; use it instead of the S3 config, not alongside it)
type: GCS
config:
  bucket: thanos-metrics-prod
  # Uses workload identity or GOOGLE_APPLICATION_CREDENTIALS
  service_account: ""
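
The component manifests in this article mount the object store configuration at /etc/thanos. One common way to provide it is a Kubernetes Secret, sketched below. The Secret name is an assumption chosen to match the thanos-objstore-config volume name used by the sidecar snippet, and the bucket settings are copied from the S3 example.

```yaml
# Secret holding objstore.yaml (sketch). The Secret name is an assumption.
apiVersion: v1
kind: Secret
metadata:
  name: thanos-objstore-config
  namespace: monitoring
type: Opaque
stringData:
  objstore.yaml: |
    type: S3
    config:
      bucket: thanos-metrics-prod
      endpoint: s3.us-east-1.amazonaws.com
      region: us-east-1

---
# Pod spec fragment: expose the Secret as the volume mounted at /etc/thanos.
volumes:
  - name: thanos-objstore-config
    secret:
      secretName: thanos-objstore-config
```

A Secret rather than a ConfigMap keeps the file safe to extend with static credentials later, although the IAM-role approach noted in the S3 example avoids storing credentials at all.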

Sidecar Deployment

# Complete Prometheus + Thanos Sidecar pod
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
  namespace: monitoring
spec:
  replicas: 2    # HA pair
  template:
    metadata:
      labels:
        app: prometheus
        thanos-store-api: "true"
    spec:
      containers:
        # Prometheus container
        - name: prometheus
          image: prom/prometheus:v2.53.0
          args:
            - --config.file=/etc/prometheus/prometheus.yml
            - --storage.tsdb.path=/prometheus
            - --storage.tsdb.retention.time=48h     # Short local retention
            - --storage.tsdb.min-block-duration=2h  # Required for Thanos
            - --storage.tsdb.max-block-duration=2h  # Required for Thanos
            - --web.enable-lifecycle
            - --web.enable-admin-api
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: storage
              mountPath: /prometheus
            - name: config
              mountPath: /etc/prometheus

        # Thanos Sidecar container
        - name: thanos-sidecar
          image: quay.io/thanos/thanos:v0.35.1
          args:
            - sidecar
            - --tsdb.path=/prometheus
            - --prometheus.url=http://localhost:9090
            - --objstore.config-file=/etc/thanos/objstore.yaml
            - --grpc-address=0.0.0.0:10901
            - --http-address=0.0.0.0:10902
          ports:
            - containerPort: 10901
              name: grpc
            - containerPort: 10902
              name: http
          volumeMounts:
            - name: storage
              mountPath: /prometheus
              readOnly: false
            - name: thanos-config
              mountPath: /etc/thanos

Query Layer

Query Fanout: Thanos Query discovers stores via three mechanisms: static --store flags, DNS SRV records (dnssrv+), or file-based SD (--store.sd-files). For Kubernetes, DNS SRV is the cleanest approach: create a headless Service that selects pods carrying the thanos-store-api: "true" label.

# Headless Service for Store API discovery
apiVersion: v1
kind: Service
metadata:
  name: thanos-sidecar
  namespace: monitoring
spec:
  type: ClusterIP
  clusterIP: None
  ports:
    - name: grpc
      port: 10901
      targetPort: grpc
  selector:
    thanos-store-api: "true"
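
For the file-based option mentioned above, a minimal sketch of the discovery file is shown here. The path is hypothetical, and the target names are reused from the static endpoints in the Query example. The layout is the same targets list that Prometheus file_sd uses.

```yaml
# /etc/thanos/stores.yaml (hypothetical path), passed to Thanos Query as
#   --store.sd-files=/etc/thanos/stores.yaml
# Same targets-list layout as Prometheus file_sd.
- targets:
    - thanos-sidecar-us-east:10901
    - thanos-sidecar-eu-west:10901
    - thanos-store-gateway:10901
```

This form is mainly useful outside Kubernetes, or for remote clusters whose endpoints are not resolvable through in-cluster DNS, since the file can be rewritten without restarting Query.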

Store Gateway Deployment

Store Gateway Sizing: Memory scales with index size (the number of series and labels) rather than with the volume of sample data. With 10M unique series in object storage, expect roughly 2–4 GiB of memory for index caching. The --index-cache-size and --chunk-pool-size flags bound the two largest memory consumers. The local disk holds downloaded index headers so that restarts do not have to fetch them from object storage again.

Compactor Deployment

Operations

Compactor Retention Strategy

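| Resolution | Retention | Use Case | Storage Impact |
|---|---|---|---|
| Raw | 90 days | Recent dashboards, debugging | ~1.5 bytes/sample |
| 5-minute | 1 year | Monthly reports, trend analysis | ~1/20th the data points of raw (at a 15s scrape interval) |
| 1-hour | 5 years | Yearly capacity planning | ~1/240th the data points of raw (at a 15s scrape interval) |

Note: each downsampled point stores several aggregates (count, sum, min, max, counter), so the bytes saved are smaller than the point-count ratios suggest.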
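
As a rough worked example for the raw row, assume 1,000,000 active series scraped every 15 seconds (illustrative numbers, not taken from the article) and the ~1.5 bytes/sample figure from the table. Index and metadata overhead are ignored.

```latex
\begin{aligned}
\text{samples per series per day} &= \frac{86\,400\ \text{s}}{15\ \text{s}} = 5\,760 \\
\text{bytes per day} &\approx 10^{6} \times 5\,760 \times 1.5\ \text{B} = 8.64 \times 10^{9}\ \text{B} \approx 8.64\ \text{GB} \\
\text{raw storage for 90 days} &\approx 90 \times 8.64\ \text{GB} \approx 778\ \text{GB}
\end{aligned}
```

Under these assumptions the 90-day raw tier is on the order of 0.8 TB, which is the baseline the downsampled tiers should be compared against.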

Multi-Cluster Architecture

Cross-Cluster Querying

Multi-Cluster Thanos Architecture
flowchart TD
    subgraph US["US-East Cluster"]
        PUS["Prometheus HA Pair<br/>+ Sidecar"]
    end

    subgraph EU["EU-West Cluster"]
        PEU["Prometheus HA Pair<br/>+ Sidecar"]
    end

    subgraph AP["AP-Southeast Cluster"]
        PAP["Prometheus HA Pair<br/>+ Sidecar"]
    end

    subgraph Global["Global Observability (Central)"]
        TQ["Thanos Query<br/>Global"]
        SG[Store Gateway]
        TC[Compactor]
    end

    subgraph OBJ["Object Store"]
        S3[(Shared S3 Bucket)]
    end

    PUS & PEU & PAP -->|"upload blocks"| S3
    PUS & PEU & PAP -->|"StoreAPI<br/>(cross-cluster gRPC)"| TQ
    SG --> S3
    SG -->|"StoreAPI"| TQ
    TC --> S3
    GF[Grafana] --> TQ

# Prometheus external_labels — MUST be unique per cluster + replica
global:
  external_labels:
    cluster: us-east-1       # Unique per cluster
    region: us-east
    __replica__: prom-0      # Unique per HA replica
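
Both replicas of a StatefulSet usually share one prometheus.yml, so a hard-coded __replica__ value would end up identical on each pod. One way around this, sketched below, is Prometheus’s expand-external-labels feature flag, which substitutes environment variables in external_labels values. The POD_NAME variable and its injection through the Downward API are assumptions added for illustration.

```yaml
# prometheus.yml fragment (sketch). Assumes Prometheus is started with
# --enable-feature=expand-external-labels and that POD_NAME is set on the
# container; neither appears in the original manifests.
global:
  external_labels:
    cluster: us-east-1
    region: us-east
    __replica__: "${POD_NAME}"   # expands to prometheus-0, prometheus-1, ...

---
# Container fragment for the Prometheus StatefulSet: inject the pod name.
env:
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
```

With this in place the replicas differ only in the __replica__ label, which is the condition Thanos Query needs to deduplicate them.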

Deduplication Strategies

# Thanos Query deduplicates by replica labels
# Both HA replicas produce nearly identical data — Query picks one

# Configure replica labels (can specify multiple)
thanos query \
  --query.replica-label=__replica__ \
  --query.replica-label=prometheus_replica

# Dedup algorithm (default, penalty-based): Query follows one replica's
# samples and switches to another replica only when it detects a gap
# (missing scrapes), applying a penalty so it does not flap back and forth.

# Partial response handling, if one sidecar is down:
# --query.partial-response   (enabled by default)
# Returns the available data with warnings in the response instead of failing
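
Deduplication and partial responses can also be set per request through the dedup and partial_response query parameters of the Thanos Query API. The sketch below shows one way a Grafana datasource might pin them, for example on a dashboard where silently incomplete data is worse than an error. The Service URL and the choice of strict responses are assumptions.

```yaml
# Grafana datasource provisioning file (sketch).
# Assumptions: a Service named thanos-query in the monitoring namespace,
# and that strict (non-partial) results are wanted for this datasource.
apiVersion: 1
datasources:
  - name: Thanos
    type: prometheus
    access: proxy
    url: http://thanos-query.monitoring.svc:10902
    jsonData:
      # Appended to every query; Thanos Query reads dedup and partial_response
      customQueryParameters: "dedup=true&partial_response=false"
```

Leaving partial_response at its default keeps dashboards rendering during a regional outage, so the strict setting is best reserved for datasources that feed reports or billing.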

Thanos vs Mimir

Decision Guide

Thanos vs Grafana Mimir

| Aspect | Thanos | Grafana Mimir |
|---|---|---|
| Data flow | Sidecar uploads TSDB blocks | remote_write pushes samples |
| Prometheus changes | Add sidecar + disable compaction | Add remote_write config only |
| Recent data path | StoreAPI from sidecar (real-time) | Ingester (near real-time) |
| Multi-tenancy | By external_labels (manual) | Native per-request header |
| Operational model | Distributed components you deploy | Distributed components you deploy |
| HA dedup | Query-time (replica label) | Ingest-time (distributor HA tracker, cluster and replica labels) |
| Maturity | CNCF Incubating, proven at scale | Grafana-backed, Cortex successor |
| License | Apache 2.0 | AGPLv3 |

When to Choose Thanos

Choose Thanos when:
  • You want to keep Prometheus local TSDB as the primary store (sidecar model)
  • You need cross-cluster querying over gRPC without a central ingest path
  • Apache 2.0 licensing is required
  • You prefer block-based object storage over stream-based ingestion
  • Downsampling with configurable resolution retention is important
  • You’re already familiar with the Thanos ecosystem

Conclusion

Key Takeaways:
  • Thanos is additive — it extends Prometheus without replacing any component
  • Sidecar is the bridge — uploads blocks and serves real-time data via StoreAPI
  • Disable Prometheus compaction — set min/max block duration to 2h when using sidecar
  • Compactor is a singleton — never run more than one instance per bucket
  • Downsampling speeds up long-range queries: it adds 5m and 1h blocks alongside raw data, so storage shrinks only when raw data gets a shorter retention than the downsampled resolutions
  • External labels matter — they’re how Thanos identifies clusters, replicas, and tenants
  • Partial responses are ok — better to show available data than fail entirely