Remote Storage Overview
Why Remote Storage?
Prometheus’s local TSDB is optimized for recent data queries. It excels at last-few-hours dashboards but has inherent limitations for enterprise use cases:
Limitations of Local Storage:
- Single-node capacity — limited by one server’s disk and memory
- No global view — each Prometheus instance sees only its own data
- No multi-tenancy — cannot isolate teams’ data or apply per-tenant limits
- Retention limited — practically 30–90 days before disk/compaction issues
- No deduplication — HA pairs write duplicate data locally
- Backup complexity — snapshots are large and node-specific
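To make the retention limit concrete, here is a back-of-the-envelope estimate based on the sizing rule in the Prometheus storage documentation (retention time × ingested samples per second × bytes per sample). The series count, scrape interval, and 2 bytes/sample figure are illustrative assumptions, not measurements from a real deployment.

```python
def local_disk_bytes(active_series: int, scrape_interval_s: float,
                     retention_days: float, bytes_per_sample: float = 2.0) -> float:
    """Rough TSDB disk estimate: retention_seconds * samples_per_second * bytes_per_sample."""
    samples_per_second = active_series / scrape_interval_s
    retention_seconds = retention_days * 24 * 60 * 60
    return retention_seconds * samples_per_second * bytes_per_sample


if __name__ == "__main__":
    # Assumed workload: 1M active series scraped every 15 seconds.
    for days in (15, 90, 365):
        size = local_disk_bytes(active_series=1_000_000, scrape_interval_s=15,
                                retention_days=days)
        print(f"{days:>3} days: {size / 1e9:,.0f} GB")
```

The estimate grows linearly with retention, which is why a single node's disk becomes the limiting factor well before a year of data.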
Remote Write vs Remote Read
Remote Write vs Remote Read Data Flow
flowchart LR
    subgraph Prometheus
        TSDB[Local TSDB]
        WAL[WAL]
    end
    subgraph RemoteWrite["Remote Write (push)"]
        RW["Samples pushed<br/>as they're ingested"]
    end
    subgraph RemoteRead["Remote Read (pull)"]
        RR["Queries proxied<br/>at query time"]
    end
    subgraph LTS["Long-Term Store"]
        MIMIR[Grafana Mimir]
    end
    WAL -->|"continuous push"| RW --> MIMIR
    TSDB -.->|"query-time proxy"| RR -.-> MIMIR
Remote Write vs Remote Read
| Aspect | Remote Write | Remote Read |
|---|---|---|
| Direction | Prometheus pushes to backend | Prometheus queries backend at query time |
| Latency | Near real-time (seconds) | Adds query latency (network round-trip) |
| Data scope | All ingested samples (or filtered) | Only what’s queried |
| Local retention | Can reduce to hours (data is in backend) | Still need local data for recent queries |
| Query path | Query backend directly (Grafana→Mimir) | Query Prometheus (merges local + remote) |
| Recommended | Yes — standard pattern | Rarely — adds complexity with little benefit |
Solution Landscape
Prometheus-Compatible Remote Storage Solutions
| Solution | License | Backed By | Key Differentiator |
|---|---|---|---|
| Grafana Mimir | AGPLv3 | Grafana Labs | Multi-tenant, object-store native, Cortex successor |
| VictoriaMetrics | Apache 2.0 | VictoriaMetrics Inc | High compression, MetricsQL, simple operations |
| Thanos | Apache 2.0 | CNCF | Sidecar model, uses existing Prometheus TSDB blocks |
| Cortex | Apache 2.0 | CNCF (incubating) | Mimir’s predecessor (Mimir began as a fork of Cortex); this guide uses Mimir instead |
| M3DB | Apache 2.0 | Uber | Originally for Uber scale; declining community |
Grafana Mimir
Architecture & Components
Grafana Mimir Architecture
flowchart TD
    subgraph Write["Write Path"]
        DIST[Distributor]
        ING["Ingester<br/>x3 replicas"]
    end
    subgraph Read["Read Path"]
        QF[Query Frontend]
        QS[Query Scheduler]
        Q[Querier]
    end
    subgraph Storage["Storage"]
        OBJ[("Object Store<br/>S3/GCS/Azure")]
        SC[Store Gateway]
    end
    subgraph Ops["Operations"]
        COMP[Compactor]
        RUL[Ruler]
    end
    P["Prometheus<br/>remote_write"] -->|"push"| DIST
    DIST -->|"hash ring"| ING
    ING -->|"flush blocks"| OBJ
    GF[Grafana] --> QF --> QS --> Q
    Q --> ING
    Q --> SC --> OBJ
    COMP --> OBJ
    RUL --> Q
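The "hash ring" edge in the diagram can be illustrated with a small sketch. This is a simplified model of token-based consistent hashing with a replication factor of 3, not Mimir's actual implementation: the hash function, token count, and ingester names are assumptions chosen for illustration.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Stable 32-bit hash (illustrative only; not the hash Mimir uses)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:4], "big")


class Ring:
    def __init__(self, ingesters, tokens_per_ingester=64):
        # Each ingester owns several pseudo-random tokens on the ring.
        self._tokens = sorted(
            (_hash(f"{name}-{i}"), name)
            for name in ingesters
            for i in range(tokens_per_ingester)
        )
        self._keys = [token for token, _ in self._tokens]

    def replicas_for(self, tenant, series_labels, replication_factor=3):
        """Walk clockwise from the series hash, collecting distinct ingesters."""
        start = bisect.bisect_right(self._keys, _hash(f"{tenant}/{series_labels}"))
        owners = []
        for offset in range(len(self._tokens)):
            _, name = self._tokens[(start + offset) % len(self._tokens)]
            if name not in owners:
                owners.append(name)
            if len(owners) == replication_factor:
                break
        return owners


if __name__ == "__main__":
    ring = Ring([f"ingester-{i}" for i in range(4)])
    print(ring.replicas_for("payments-team", '{__name__="up",job="api"}'))
```

Because the lookup depends only on the tenant and the series labels, every distributor that shares the same ring state picks the same ingesters for a given series.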
Deployment (Monolithic & Microservices)
# Mimir monolithic mode: simplest deployment for <1M active series
# All components run in a single binary
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mimir
  namespace: monitoring
spec:
  replicas: 3
  serviceName: mimir  # assumes a headless Service named "mimir" so mimir-0.mimir resolves
  selector:           # required by apps/v1; must match the template labels
    matchLabels:
      app: mimir      # label name chosen for this example
  template:
    metadata:
      labels:
        app: mimir
    spec:
      containers:
        - name: mimir
          image: grafana/mimir:2.13.0
          args:
            - -target=all
            - -config.file=/etc/mimir/mimir.yaml
          ports:
            - containerPort: 8080  # HTTP API
            - containerPort: 9095  # gRPC (internal)
            - containerPort: 7946  # Memberlist gossip
          volumeMounts:
            - name: config
              mountPath: /etc/mimir
            - name: data
              mountPath: /data
          resources:
            requests:
              cpu: "2"
              memory: 8Gi
            limits:
              memory: 12Gi
      volumes:
        - name: config
          configMap:
            name: mimir-config  # assumed ConfigMap that holds mimir.yaml
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 50Gi
# mimir.yaml: monolithic configuration
multitenancy_enabled: true
server:
  http_listen_port: 8080
  grpc_listen_port: 9095
distributor:
  ring:
    kvstore:
      store: memberlist
ingester:
  ring:
    kvstore:
      store: memberlist
    replication_factor: 3
blocks_storage:
  backend: s3
  s3:
    endpoint: s3.us-east-1.amazonaws.com
    bucket_name: mimir-blocks-prod
    region: us-east-1
  tsdb:
    dir: /data/tsdb
  bucket_store:
    sync_dir: /data/tsdb-sync
compactor:
  data_dir: /data/compactor
  sharding_ring:
    kvstore:
      store: memberlist
store_gateway:
  sharding_ring:
    kvstore:
      store: memberlist
ruler:
  alertmanager_url: http://alertmanager:9093
  rule_path: /data/rules
limits:
  # Per-tenant limits
  ingestion_rate: 100000                   # samples/sec per tenant
  ingestion_burst_size: 200000
  max_global_series_per_user: 5000000      # 5M series per tenant
  max_global_series_per_metric: 100000
  compactor_blocks_retention_period: 365d  # 1 year retention
memberlist:
  join_members:
    - mimir-0.mimir:7946
    - mimir-1.mimir:7946
    - mimir-2.mimir:7946
Multi-Tenancy
# Prometheus remote_write with tenant header
remote_write:
  - url: https://mimir.internal/api/v1/push
    headers:
      X-Scope-OrgID: payments-team  # Tenant identifier
    queue_config:
      max_samples_per_send: 2000
      batch_send_deadline: 5s
# Grafana datasource configuration per tenant
# Each team queries only their own data
apiVersion: 1
datasources:
  - name: Mimir-Payments
    type: prometheus
    url: https://mimir.internal/prometheus
    jsonData:
      httpHeaderName1: X-Scope-OrgID
    secureJsonData:
      httpHeaderValue1: payments-team
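The same tenant header applies to any client that queries Mimir directly, not only to Grafana. The sketch below builds (but does not send) an instant-query request for one tenant. The helper name `build_tenant_query` is hypothetical; the base URL and tenant ID are the ones used in the examples above.

```python
import urllib.parse
import urllib.request


def build_tenant_query(base_url: str, tenant: str, promql: str) -> urllib.request.Request:
    """Build (but do not send) an instant-query request scoped to one tenant."""
    params = urllib.parse.urlencode({"query": promql})
    return urllib.request.Request(
        f"{base_url}/api/v1/query?{params}",
        # urllib normalizes the header name's case; HTTP header names are case-insensitive.
        headers={"X-Scope-OrgID": tenant},
    )


if __name__ == "__main__":
    req = build_tenant_query("https://mimir.internal/prometheus", "payments-team", "up")
    print(req.full_url)
    print(req.header_items())
```

Passing the request to `urllib.request.urlopen` would execute it. With multi-tenancy enabled, a request without the header is expected to be rejected rather than served from a default tenant.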
Configuration Deep Dive
# Per-tenant overrides (runtime configuration)
# File: /etc/mimir/runtime.yaml, passed via -runtime-config.file and re-read periodically without a restart
# Keys under `overrides` are tenant IDs. Defaults for all tenants are normally set in the
# `limits:` block of mimir.yaml, for example:
#   ingestion_rate: 50000
#   max_global_series_per_user: 2000000
#   compactor_blocks_retention_period: 90d
overrides:
  # Override for high-volume tenant
  platform-team:
    ingestion_rate: 200000
    max_global_series_per_user: 10000000
    compactor_blocks_retention_period: 365d
  # Restricted tenant (dev environment)
  dev-team:
    ingestion_rate: 10000
    max_global_series_per_user: 500000
    compactor_blocks_retention_period: 14d
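How defaults and overrides combine can be sketched as a simple dictionary merge: a tenant's effective limits are the defaults with any per-tenant values layered on top. This is a conceptual model only, not Mimir's code. The values mirror the configuration above, and `effective_limits` is a hypothetical helper.

```python
DEFAULTS = {
    "ingestion_rate": 50_000,
    "max_global_series_per_user": 2_000_000,
    "compactor_blocks_retention_period": "90d",
}

OVERRIDES = {
    "platform-team": {
        "ingestion_rate": 200_000,
        "max_global_series_per_user": 10_000_000,
        "compactor_blocks_retention_period": "365d",
    },
    "dev-team": {
        "ingestion_rate": 10_000,
        "max_global_series_per_user": 500_000,
        "compactor_blocks_retention_period": "14d",
    },
}


def effective_limits(tenant: str, defaults=DEFAULTS, overrides=OVERRIDES) -> dict:
    """Return the defaults with the tenant's overrides (if any) applied on top."""
    return {**defaults, **overrides.get(tenant, {})}


if __name__ == "__main__":
    for tenant in ("platform-team", "dev-team", "payments-team"):
        print(tenant, effective_limits(tenant))
```

A tenant with no entry, such as `payments-team` here, simply falls back to the defaults.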
VictoriaMetrics
Architecture (Single & Cluster)
VictoriaMetrics Cluster Architecture
flowchart TD
    subgraph Insert["Insert Path"]
        VMI["vminsert<br/>Stateless"]
    end
    subgraph Store["Storage"]
        VMS1[vmstorage-0]
        VMS2[vmstorage-1]
        VMS3[vmstorage-2]
    end
    subgraph Query["Query Path"]
        VMSL["vmselect<br/>Stateless"]
    end
    P["Prometheus<br/>remote_write"] --> VMI
    VMI -->|"consistent hashing"| VMS1 & VMS2 & VMS3
    GF[Grafana] --> VMSL
    VMSL --> VMS1 & VMS2 & VMS3
Deployment
# VictoriaMetrics single-node: sized here for up to ~10M active series
# (actual capacity depends on hardware and series churn)
# Simplest possible deployment
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: victoriametrics
  namespace: monitoring
spec:
  replicas: 1
  serviceName: victoriametrics  # required for StatefulSets; assumes a matching headless Service
  selector:                     # required by apps/v1; must match the template labels
    matchLabels:
      app: victoriametrics      # label name chosen for this example
  template:
    metadata:
      labels:
        app: victoriametrics
    spec:
      containers:
        - name: victoriametrics
          image: victoriametrics/victoria-metrics:v1.101.0
          args:
            - -storageDataPath=/storage
            - -retentionPeriod=12  # 12 months
            - -httpListenAddr=:8428
            - -search.maxUniqueTimeseries=10000000
            - -dedup.minScrapeInterval=15s  # Dedup HA pairs
          ports:
            - containerPort: 8428
          volumeMounts:
            - name: storage
              mountPath: /storage
          resources:
            requests:
              cpu: "4"
              memory: 16Gi
  volumeClaimTemplates:
    - metadata:
        name: storage
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 500Gi
# Prometheus remote_write to VictoriaMetrics
remote_write:
  - url: http://victoriametrics:8428/api/v1/write
    queue_config:
      max_samples_per_send: 10000
      capacity: 20000
      max_shards: 30
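The `-dedup.minScrapeInterval=15s` flag in the deployment above is what collapses the duplicate samples written by an HA pair. The VictoriaMetrics documentation describes it as keeping, per series, a single sample with the largest timestamp in each interval. The sketch below models that idea for one series; the bucket alignment and the function name are simplifying assumptions, not the server's implementation.

```python
def deduplicate(samples, min_scrape_interval_ms):
    """Keep one sample per interval bucket: the one with the largest timestamp.

    `samples` is a list of (timestamp_ms, value) tuples for a single series.
    """
    kept = {}
    for ts, value in sorted(samples):
        bucket = ts // min_scrape_interval_ms  # assumed bucket alignment
        kept[bucket] = (ts, value)             # later timestamps overwrite earlier ones
    return [kept[bucket] for bucket in sorted(kept)]


if __name__ == "__main__":
    # Two replicas scrape the same target every 15s, offset by 5s from each other.
    replica_a = [(0, 1.0), (15_000, 2.0), (30_000, 3.0)]
    replica_b = [(5_000, 1.0), (20_000, 2.0), (35_000, 3.0)]
    print(deduplicate(replica_a + replica_b, min_scrape_interval_ms=15_000))
```

Setting the interval equal to the scrape interval is what makes this work: each bucket then contains at most one sample per replica.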
Unique Features (MetricsQL, Downsampling)
# MetricsQL: VictoriaMetrics' query language, designed to be backwards compatible with PromQL
# Supports additional functions not in standard PromQL
# rate() and increase() do not extrapolate at window edges in MetricsQL,
# so increase() over an integer counter returns an integer result
rate(http_requests_total[5m])      # Valid in both PromQL and MetricsQL
increase(http_requests_total[5m])  # No extrapolation in MetricsQL
# Rollup functions: min, max and avg of the per-second rate over the window
rollup_rate(http_requests_total[5m])
# Label manipulation
label_set(metric, "env", "prod")
label_del(metric, "instance")
label_copy(metric, "pod", "pod_name")
# Running aggregations over the selected time range
running_sum(rate(requests_total[1m]))
running_avg(cpu_usage)
# Long-range queries with a coarse subquery step
# This is not stored downsampling (that is a VictoriaMetrics Enterprise feature);
# the subquery step only limits how many points are evaluated:
avg_over_time(cpu_usage[1y:1h])  # 1-hour step over 1 year
Remote Write Configuration
Queue Tuning & Reliability
# Production remote_write configuration
remote_write:
  - url: https://mimir.internal/api/v1/push
    headers:
      X-Scope-OrgID: production
    # Queue tuning for high-volume (>500K samples/sec)
    # Defaults have changed across Prometheus releases; check the docs for your version
    queue_config:
      capacity: 10000             # Per-shard sample buffer
      max_shards: 50              # Max parallel shards
      min_shards: 10              # Min parallel shards (faster startup)
      max_samples_per_send: 5000  # Batch size
      batch_send_deadline: 5s     # Max wait before partial batch send
      min_backoff: 30ms           # Initial retry backoff
      max_backoff: 5s             # Max retry backoff
      retry_on_http_429: true     # Retry on rate limiting
    # Metadata configuration
    metadata_config:
      send: true
      send_interval: 5m
    # TLS for encrypted transport
    tls_config:
      cert_file: /etc/certs/client.crt
      key_file: /etc/certs/client.key
      ca_file: /etc/certs/ca.crt
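A rough way to sanity-check these queue settings is to estimate the throughput ceiling they imply. The sketch assumes each shard sends one full batch at a time and that every request takes a fixed round-trip time. The 200 ms latency is an assumed figure, and both helper names are hypothetical.

```python
def max_throughput(shards: int, max_samples_per_send: int, request_latency_s: float) -> float:
    """Upper bound on samples/sec if every shard sends full batches back to back."""
    return shards * max_samples_per_send / request_latency_s


def buffered_samples(shards: int, capacity: int) -> int:
    """Samples that can wait in memory across all shard queues."""
    return shards * capacity


if __name__ == "__main__":
    latency = 0.2  # assumed round-trip time per push request, in seconds
    for shards in (10, 50):  # min_shards and max_shards from the config above
        rate = max_throughput(shards, max_samples_per_send=5000, request_latency_s=latency)
        print(f"{shards} shards: up to {rate:,.0f} samples/sec, "
              f"{buffered_samples(shards, capacity=10000):,} samples buffered")
```

If the estimated ceiling at `max_shards` sits below the actual ingestion rate, the queue cannot keep up and lag will grow. Raising the batch size or reducing backend latency moves the ceiling more cheaply than adding shards.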
Write-Path Relabeling
# Selective remote write: reduce cost by filtering
remote_write:
  - url: https://mimir.internal/api/v1/push
    write_relabel_configs:
      # Only send recording rules (names prefixed with an aggregation level)
      - source_labels: [__name__]
        regex: '(namespace|job|cluster|slo):.*'
        action: keep
      # Drop high-cardinality debug metrics
      # (redundant after the keep rule above; useful when raw metrics are also kept)
      - source_labels: [__name__]
        regex: 'go_(gc|memstats)_.*'
        action: drop
      # Remove labels that only matter locally
      # (keep the replica label if the backend relies on it for HA deduplication)
      - regex: '__replica__|prometheus_replica'
        action: labeldrop
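Before deploying relabel rules, it can help to check which metric names survive them. The sketch below mimics only the `keep` and `drop` actions on `__name__` from the configuration above, applying the rules in order with fully anchored regexes as Prometheus does. It is a test aid under those assumptions, not a reimplementation of Prometheus relabeling.

```python
import re

# (action, regex) pairs in the same order as write_relabel_configs above.
RULES = [
    ("keep", re.compile(r"(namespace|job|cluster|slo):.*")),
    ("drop", re.compile(r"go_(gc|memstats)_.*")),
]


def is_sent(metric_name: str) -> bool:
    """Return True if a series with this name would pass every rule."""
    for action, regex in RULES:
        matched = regex.fullmatch(metric_name) is not None  # Prometheus anchors regexes
        if action == "keep" and not matched:
            return False
        if action == "drop" and matched:
            return False
    return True


if __name__ == "__main__":
    for name in ("job:http_requests:rate5m", "http_requests_total", "go_gc_duration_seconds"):
        print(f"{name}: {'sent' if is_sent(name) else 'dropped'}")
```

Running it shows that the `keep` rule alone already filters out the `go_*` metrics, so the `drop` rule only matters if the `keep` pattern is widened to include raw metrics.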
Exemplars Support
# Enable exemplars in remote write (links metrics to traces)
# Prometheus must also run with --enable-feature=exemplar-storage
remote_write:
  - url: https://mimir.internal/api/v1/push
    send_exemplars: true  # Forward exemplars to backend
# Mimir configuration to accept exemplars
limits:
  max_global_exemplars_per_user: 100000  # Per-tenant exemplar limit (0 disables exemplar ingestion)
Mimir vs VictoriaMetrics
Feature Comparison
Grafana Mimir vs VictoriaMetrics
| Feature | Grafana Mimir | VictoriaMetrics |
|---|---|---|
| Multi-tenancy | Native (per-request header) | Cluster version only (tenant ID in the URL path); single-node needs separate instances |
| Storage backend | Object store (S3/GCS/Azure) | Local disk (cluster mode distributes) |
| Query language | Standard PromQL | MetricsQL (PromQL superset) |
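| HA deduplication | Write-time in the distributor (HA tracker, using cluster and replica labels) | -dedup.minScrapeInterval, applied at query time and during background merges |
| Downsampling | Not built in (5m/1h compactor downsampling is a Thanos feature) | Enterprise only (-downsampling.period) |
| Compression ratio | ~1.5 bytes/sample (workload dependent) | ~0.7 bytes/sample (vendor-reported, workload dependent) |
| Min viable deployment | 3 replicas (monolithic, replication factor 3) | 1 instance (single-node) |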
| Grafana integration | Native (same company) | Full PromQL datasource compatible |
| License | AGPLv3 | Apache 2.0 (enterprise features paid) |
Operational Comparison
Operational Reality:
- Mimir: More components to manage, but designed for very large scale (Grafana Labs reports testing it at 1 billion active series). Requires an object store (S3/GCS/Azure). Better for large organizations with multi-tenant requirements.
- VictoriaMetrics: Operationally simpler (a single binary is possible). Better compression generally means lower storage cost. Better for single-tenant or small-team deployments where simplicity matters.
Decision Guide
Choose Grafana Mimir when:
- You need native multi-tenancy with per-tenant limits
- You already use Grafana Cloud or the LGTM stack
- Object storage (S3/GCS) is your preferred backend
- You need built-in alerting rules evaluation (Ruler component)
- Scale exceeds 100M active series across many teams
Choose VictoriaMetrics when:
- Operational simplicity is the top priority
- Storage cost optimization matters (best compression)
- Single-tenant or separated-instance multi-tenancy is acceptable
- You want MetricsQL’s extended query capabilities
- Local disk is preferred over object store
- You want fast queries from a single-node deployment
Conclusion
Key Takeaways:
- Remote write is the standard — use it for all production Prometheus deployments
- Filter at write time — use write_relabel_configs to reduce storage costs
- Mimir for multi-tenant enterprises — native isolation, object store, Grafana ecosystem
- VictoriaMetrics for simplicity — best compression, single binary, MetricsQL
- Thanos for existing deployments — if you already have TSDB blocks on disk (covered in Part 11)
- Monitor your remote write — track pending samples, lag, and failed sends