
Part 3: Time Series Data, Prometheus & PromQL

May 14, 2026 Wasil Zafar 22 min read

Prometheus is the de facto standard for metrics collection in cloud-native environments. In this part you will understand how time series databases store data, master Prometheus architecture and configuration, and learn PromQL — the query language that turns raw metrics into operational insights.

Table of Contents

  1. Time Series Databases
  2. Prometheus Architecture
  3. Prometheus Configuration
  4. PromQL Deep Dive
  5. Alertmanager
  6. Conclusion & Next Steps

Time Series Databases

A time series database (TSDB) is a database optimised for storing and querying time-indexed data — sequences of (timestamp, value) pairs. Unlike general-purpose databases, TSDBs make aggressive trade-offs: they are optimised for high write throughput of sequential data and time-range queries, at the cost of flexibility in querying arbitrary dimensions.

Core TSDB Concepts

Understanding these concepts is essential for reasoning about metrics at scale:

Concept | Definition | Implication
Time Series | A sequence of (timestamp, value) pairs with a unique metric name + label set | Each unique label combination is a separate series; cardinality = series count
Sample | A single (timestamp, value) data point in a time series | At a 15s scrape interval (a common setting; the built-in default is 1m): 4 samples/minute per series
Chunk | A compressed block of consecutive samples for one series | Prometheus compresses timestamps with delta-of-delta encoding and values with XOR encoding: ~1.37 bytes/sample on average
Block | A time-bounded directory of chunks + index (default 2h) | Compaction merges blocks; queries read only the blocks that overlap the requested time range

Sampling, Retention & Downsampling

Every metrics system must make decisions about how long to keep data at what resolution. The trade-off: finer resolution consumes more storage and memory; coarser resolution loses detail.

Storage Estimation: With 10,000 active time series scraped every 15 seconds, Prometheus ingests 10,000 × 4 samples/min = 40,000 samples/min. At ~1.37 bytes/sample after compression, that is ~55 KB/minute, or ~79 MB/day. For 30 days of retention: ~2.4 GB of sample data, before index and WAL overhead. This scales linearly with series count.
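
The arithmetic above can be wrapped in a small helper for trying other series counts or scrape intervals. This is a back-of-the-envelope sketch: the function name and parameters are made up for illustration, and the 1.37 bytes/sample figure is the average quoted in this article, not a guarantee for your workload.

```python
def estimate_storage_bytes(active_series: int,
                           scrape_interval_s: float,
                           retention_days: float,
                           bytes_per_sample: float = 1.37) -> float:
    """Rough size of compressed sample data; ignores index, WAL and series churn."""
    samples_per_day = active_series * (86_400 / scrape_interval_s)
    return samples_per_day * bytes_per_sample * retention_days


if __name__ == "__main__":
    total = estimate_storage_bytes(active_series=10_000,
                                   scrape_interval_s=15,
                                   retention_days=30)
    print(f"{total / 1e9:.2f} GB")  # about 2.37 GB for the scenario above
```

Doubling the series count or halving the scrape interval doubles the result, which is why cardinality is usually the first thing to check when disk usage grows.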

Downsampling strategies:

  • Long-term storage (Thanos, Mimir): Keep full-resolution data in Prometheus and ship it to long-term storage via remote_write or a sidecar; Thanos, for example, can downsample it to 5m and 1h resolutions
  • Recording rules: Pre-aggregate high-cardinality queries into lower-cardinality pre-computed metrics
  • Retention policies: Prometheus keeps 15 days by default (--storage.tsdb.retention.time); shorten it to control cost

Prometheus Architecture

flowchart TD
    A[Targets\nApps & Infrastructure] -->|/metrics endpoint| B[Prometheus Server\nScrape Engine]
    B -->|stores| C[TSDB\nLocal Storage]
    B -->|evaluates| D[Alerting Rules]
    D -->|sends alerts| E[Alertmanager]
    E -->|routes| F[PagerDuty / Slack / Email]
    C -->|PromQL queries| G[Grafana]
    C -->|remote_write| H[Long-Term Storage\nThanos / Mimir / Cortex]
    I[Service Discovery\nK8s / Consul / EC2] -->|target list| B

The Pull Model — Why Prometheus Scrapes

Many monitoring systems use a push model: applications send metrics to the monitoring system. Prometheus does the opposite with a pull model: it periodically scrapes (issues an HTTP GET against) a /metrics endpoint on each target.

Advantages of the pull model:

  • Simple target health check: If Prometheus cannot scrape a target, it records up == 0 for that target, so no separate health check is needed
  • Config in one place: Scrape targets are defined in Prometheus config, not scattered across every application
  • No backpressure: Prometheus controls the scrape rate; a misbehaving application cannot flood the metrics backend
  • Debugging: You can manually curl the /metrics endpoint to see exactly what data Prometheus is collecting

# Manually curl a Prometheus metrics endpoint
curl http://localhost:8080/metrics

# Example output:
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 42
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 2.84
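
To see how little there is to the text format, here is a minimal sketch that reads the sample lines from output like the one above. It is deliberately naive: `parse_samples` is a hypothetical helper, it assumes no trailing timestamps, and it keeps any `{label="value"}` part as part of the key instead of parsing it. For real work, use an official client library.

```python
EXAMPLE = """\
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 42
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 2.84
"""


def parse_samples(text: str) -> dict[str, float]:
    """Map each sample line to its value, skipping blank lines and # comments."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP and TYPE metadata is ignored in this sketch
        series, value = line.rsplit(" ", 1)  # the value is the last field
        samples[series] = float(value)
    return samples


if __name__ == "__main__":
    print(parse_samples(EXAMPLE))
```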

TSDB Storage Internals

Prometheus's TSDB is a custom storage engine with excellent write performance. Key design decisions:

  • In-memory head block: Roughly the most recent 2 hours of samples are kept in RAM for fast writes and reads
  • WAL (Write-Ahead Log): New samples are written to the WAL first so the head block can be rebuilt after a crash
  • Chunk compression: Delta-of-delta encoding for timestamps and XOR encoding for values achieve ~1.37 bytes/sample on average (vs 16 bytes raw: an 8-byte timestamp plus an 8-byte float)
  • Compaction: About every 2 hours the head block is flushed to disk as a block; compaction later merges smaller blocks into larger ones
  • Inverted index: Label-to-series mapping enables fast label-based queries
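
The compression numbers make more sense once you see what the encoder is fed. The sketch below is not Prometheus's encoder (the real one packs variable-length bit fields); it only shows why regular scrapes and slowly changing values turn into mostly zeros and tiny integers. Both function names are illustrative.

```python
import struct


def delta_of_delta(timestamps: list[int]) -> list[int]:
    """Return [first timestamp, first delta, then each change in delta]."""
    if len(timestamps) < 2:
        return list(timestamps)
    first_delta = timestamps[1] - timestamps[0]
    encoded = [timestamps[0], first_delta]
    prev_delta = first_delta
    for earlier, later in zip(timestamps[1:], timestamps[2:]):
        delta = later - earlier
        encoded.append(delta - prev_delta)
        prev_delta = delta
    return encoded


def xor_with_previous(values: list[float]) -> list[int]:
    """Return the first value's raw 64 bits, then each value XORed with its predecessor."""
    bits = [struct.unpack(">Q", struct.pack(">d", v))[0] for v in values]
    return bits[:1] + [prev ^ cur for prev, cur in zip(bits, bits[1:])]


if __name__ == "__main__":
    # Scrapes every 15,000 ms with a little jitter
    print(delta_of_delta([0, 15_000, 30_000, 45_001, 60_000]))  # [0, 15000, 0, 1, -2]
    # An unchanged value XORs to 0, which can be stored in a single bit
    print(xor_with_previous([42.0, 42.0, 43.0]))
```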

Service Discovery

In dynamic environments (Kubernetes, cloud), targets come and go constantly. Prometheus supports automatic service discovery from many sources:

# prometheus.yml — Kubernetes service discovery
scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Only scrape pods with the annotation prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
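      # Use the annotation prometheus.io/port if specified: join the current
      # address and the port annotation, then rewrite the address as host:port
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2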
      # Add namespace and pod name as labels
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
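
Relabeling is easier to debug once you know its mechanics: the source label values are joined with ";", the regex must match the whole joined string, and capture groups are expanded into the target label. The sketch below imitates the keep and replace actions in Python. The helper names are hypothetical, the regex that joins `__address__` with the port annotation is one common pattern used here as an example, and Python writes group references as `\g<1>` where Prometheus uses `$1`.

```python
import re

PORT_ANNOTATION = "__meta_kubernetes_pod_annotation_prometheus_io_port"
SCRAPE_ANNOTATION = "__meta_kubernetes_pod_annotation_prometheus_io_scrape"


def joined(labels: dict[str, str], source_labels: list[str], separator: str = ";") -> str:
    """Concatenate source label values; a missing label counts as an empty string."""
    return separator.join(labels.get(name, "") for name in source_labels)


def relabel_keep(labels: dict[str, str], source_labels: list[str], regex: str) -> bool:
    """'keep' action: the target survives only if the regex matches the whole string."""
    return re.fullmatch(regex, joined(labels, source_labels)) is not None


def relabel_replace(labels: dict[str, str], source_labels: list[str],
                    regex: str, replacement: str, target_label: str) -> dict[str, str]:
    """'replace' action: on a full match, write the expanded replacement to the target."""
    match = re.fullmatch(regex, joined(labels, source_labels))
    if match is None:
        return labels  # no match leaves the label set unchanged
    updated = dict(labels)
    updated[target_label] = match.expand(replacement)
    return updated


if __name__ == "__main__":
    pod = {"__address__": "10.0.0.7:8080", SCRAPE_ANNOTATION: "true", PORT_ANNOTATION: "9102"}
    print(relabel_keep(pod, [SCRAPE_ANNOTATION], "true"))
    print(relabel_replace(pod, ["__address__", PORT_ANNOTATION],
                          r"([^:]+)(?::\d+)?;(\d+)", r"\g<1>:\g<2>", "__address__"))
```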

Prometheus Configuration

Scrape Configuration

A complete Prometheus configuration file structure:

# prometheus.yml — complete configuration example
global:
  scrape_interval: 15s         # Default scrape interval
  evaluation_interval: 15s     # How often to evaluate rules
  external_labels:
    environment: production
    region: us-east-1

# Alertmanager targets
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']

# Rule files (recording rules + alerting rules)
rule_files:
  - "recording_rules.yml"
  - "alerting_rules.yml"

scrape_configs:
  # Scrape Prometheus itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  # Scrape Node Exporter on all servers
  - job_name: 'node'
    static_configs:
      - targets:
          - 'server1:9100'
          - 'server2:9100'
          - 'server3:9100'
    scrape_interval: 30s  # Override global for this job

  # Scrape an application with custom path
  - job_name: 'myapp'
    metrics_path: /internal/metrics
    scheme: https
    static_configs:
      - targets: ['myapp.example.com:443']
    tls_config:
      insecure_skip_verify: false

Recording Rules

Recording rules pre-compute expensive queries and store the result as a new time series. This is essential for:

  • Dashboard queries that would otherwise be too slow
  • Reducing cardinality of frequently-queried aggregations
  • Creating "summary" metrics that cross multiple jobs or services

# recording_rules.yml
groups:
  - name: http_request_rates
    interval: 30s
    rules:
      # Pre-compute 5-minute request rate by job and status code
      # (rule names follow level:metric:operations, so the level lists both labels)
      - record: job_status_code:http_requests:rate5m
        expr: sum(rate(http_requests_total[5m])) by (job, status_code)

      # Pre-compute p99 latency per job
      - record: job:http_request_duration_p99:rate5m
        expr: |
          histogram_quantile(0.99,
            sum(rate(http_request_duration_seconds_bucket[5m])) by (job, le)
          )

      # Pre-compute error rate per job
      - record: job:http_error_rate:rate5m
        expr: |
          sum(rate(http_requests_total{status_code=~"5.."}[5m])) by (job)
          /
          sum(rate(http_requests_total[5m])) by (job)
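
The p99 rule above leans on histogram_quantile(), which estimates a quantile from cumulative bucket counts by interpolating linearly inside the bucket where the target rank falls. Here is a simplified sketch of that idea. It assumes the lowest bucket starts at 0 and returns the highest finite bound when the rank lands in the +Inf bucket; the real function handles more edge cases.

```python
import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate quantile q from (upper_bound, cumulative_count) pairs.

    Buckets must be sorted by upper bound and end with the +Inf bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("this sketch only handles 0 < q <= 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                return prev_bound  # cannot interpolate into the +Inf bucket
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


if __name__ == "__main__":
    # 100 requests: 50 under 0.1s, 90 under 0.5s, 99 under 1s
    latency = [(0.1, 50), (0.5, 90), (1.0, 99), (math.inf, 100)]
    print(histogram_quantile(0.95, latency))
```

Because the result is interpolated, its accuracy depends on bucket boundaries: a p99 that falls inside a wide bucket is only an estimate.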

Alerting Rules

Alerting rules evaluate PromQL expressions and fire alerts to Alertmanager when conditions are met:

# alerting_rules.yml
groups:
  - name: slo_alerts
    rules:
      # High error rate alert
      - alert: HighErrorRate
        expr: job:http_error_rate:rate5m > 0.05
        for: 5m
        labels:
          severity: critical
          team: platform
        annotations:
          summary: "High error rate on {{ $labels.job }}"
          description: "Error rate is {{ $value | humanizePercentage }} on {{ $labels.job }} (threshold: 5%)"
          runbook_url: "https://runbooks.example.com/high-error-rate"

      # High latency alert
      - alert: HighP99Latency
        expr: job:http_request_duration_p99:rate5m > 0.5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High p99 latency on {{ $labels.job }}"
          description: "p99 latency is {{ $value | humanizeDuration }} on {{ $labels.job }}"

      # Instance down alert
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.job }} on {{ $labels.instance }} has been down for more than 1 minute"
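
The `for:` clause is what keeps a brief blip from paging anyone: an alert stays pending until its expression has been true for the whole duration, and a single false evaluation resets it. A small sketch of that state machine, assuming one boolean result per evaluation interval (the function and its inputs are illustrative, not Prometheus code):

```python
def alert_states(condition_true: list[bool], interval_s: int, for_s: int) -> list[str]:
    """Return the alert state after each rule evaluation."""
    states = []
    active_since = None  # evaluation time at which the condition last became true
    for i, is_true in enumerate(condition_true):
        now = i * interval_s
        if not is_true:
            active_since = None  # any false evaluation resets the timer
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        states.append("firing" if now - active_since >= for_s else "pending")
    return states


if __name__ == "__main__":
    # up == 0 for five evaluations in a row, evaluated every 15s, with for: 1m
    print(alert_states([True] * 5, interval_s=15, for_s=60))
    # ['pending', 'pending', 'pending', 'pending', 'firing']
```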

PromQL Deep Dive

PromQL (Prometheus Query Language) is a functional query language designed specifically for time series data. Understanding it deeply is what separates Prometheus beginners from practitioners.

Selectors & Matchers

Selectors specify which time series to query. Matchers filter by label values:

# Exact match — select the metric with this exact label
http_requests_total{job="api", status_code="200"}

# Regex match — select all 5xx status codes
http_requests_total{status_code=~"5.."}

# Negative match — exclude 2xx codes
http_requests_total{status_code!~"2.."}

# Range vector selector — get data over last 5 minutes
http_requests_total{job="api"}[5m]

# Instant vector with offset — query value 1 hour ago
http_requests_total offset 1h
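
Two matcher details trip people up: regex matchers are anchored to the whole label value, and a label that is absent behaves like an empty string. The sketch below mimics the four matcher operators in Python. `matches` is a hypothetical helper, and Python's `re` syntax is not identical to the RE2 syntax Prometheus uses, so treat it as an approximation.

```python
import re


def matches(labels: dict[str, str], name: str, op: str, value: str) -> bool:
    """Evaluate one label matcher against a series' label set."""
    actual = labels.get(name, "")  # an absent label is treated as ""
    if op == "=":
        return actual == value
    if op == "!=":
        return actual != value
    if op == "=~":
        return re.fullmatch(value, actual) is not None  # fully anchored
    if op == "!~":
        return re.fullmatch(value, actual) is None
    raise ValueError(f"unknown matcher operator: {op}")


if __name__ == "__main__":
    series = {"job": "api", "status_code": "503"}
    print(matches(series, "status_code", "=~", "5.."))  # True
    print(matches(series, "status_code", "=~", "5"))    # False: "5" must match all of "503"
    print(matches(series, "status_code", "!~", "2.."))  # True
```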

Key PromQL Functions

# rate() — per-second rate of increase for counters
# Use for counters. Handles counter resets.
rate(http_requests_total[5m])

# irate() — instant rate (last two samples only)
# More responsive but noisier than rate()
irate(http_requests_total[5m])

# increase() — total increase over a time range
# Useful for "requests in the last hour"
increase(http_requests_total[1h])

# delta() — change in value over range (for gauges)
delta(node_memory_Active_bytes[1h])

# avg_over_time() — average of a gauge over range
avg_over_time(node_cpu_utilization[30m])  # node_cpu_utilization is an illustrative gauge name

# max_over_time() / min_over_time()
max_over_time(node_memory_Active_bytes[1h])

# histogram_quantile() — calculate percentiles from histograms
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))
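
It helps to know what rate() does with a counter that restarts. The sketch below is a simplified version: it adds up the increases between consecutive samples, treats any drop as a reset to zero, and divides by the time between the first and last sample. The real rate() also extrapolates to the edges of the range window, which this version skips, so the numbers will differ slightly.

```python
def simple_rate(samples: list[tuple[float, float]]) -> float:
    """Per-second increase over (timestamp_seconds, counter_value) samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples in the window")
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter restarted from zero, so the new value is all increase
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


if __name__ == "__main__":
    # The process restarts between t=30 and t=45, so the counter drops from 160 to 10
    window = [(0, 100), (15, 130), (30, 160), (45, 10), (60, 40)]
    print(simple_rate(window))  # 100 requests over 60 seconds
```

This is also why rate() should only be applied to counters: on a gauge, every decrease would be misread as a reset.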

Aggregation Operators

Aggregations reduce the dimensionality of your query results by grouping and combining series:

# sum() — add up values across dimensions
# Total requests per second across all instances
sum(rate(http_requests_total[5m]))

# sum by() — sum but keep specified labels
# Requests per second, broken down by job
sum by (job) (rate(http_requests_total[5m]))

# sum without() — sum but drop specified labels
# Sum everything except the instance label
sum without (instance) (rate(http_requests_total[5m]))

# avg() — average across dimensions
avg(node_cpu_utilization) by (datacenter)

# max() and min()
max by (job) (http_request_duration_p99)

# count() — count number of time series
count(up == 1) by (job)  # How many instances are up per job?

# topk() and bottomk() — top/bottom N series
topk(5, sum by (job) (rate(http_requests_total[5m])))  # Top 5 highest-traffic jobs
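
The difference between by and without is only in how the grouping key is built: by keeps the labels you list, without keeps everything except them. A small sketch over in-memory series makes that concrete (the data and helper names are made up for illustration):

```python
from collections import defaultdict

Series = tuple[dict[str, str], float]  # (label set, value)


def sum_by(series: list[Series], by: list[str]) -> dict[tuple, float]:
    """Group on the listed labels only, like `sum by (...)`."""
    totals = defaultdict(float)
    for labels, value in series:
        key = tuple((name, labels.get(name, "")) for name in by)
        totals[key] += value
    return dict(totals)


def sum_without(series: list[Series], without: list[str]) -> dict[tuple, float]:
    """Group on every label except the listed ones, like `sum without (...)`."""
    totals = defaultdict(float)
    for labels, value in series:
        key = tuple(sorted((k, v) for k, v in labels.items() if k not in without))
        totals[key] += value
    return dict(totals)


if __name__ == "__main__":
    rates = [
        ({"job": "api", "instance": "a"}, 2.0),
        ({"job": "api", "instance": "b"}, 3.0),
        ({"job": "web", "instance": "a"}, 1.0),
    ]
    print(sum_by(rates, ["job"]))
    print(sum_without(rates, ["instance"]))
```

With only two labels the results are the same, but without is usually the safer choice in shared rules because labels added later are preserved instead of silently dropped.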

Practical Query Patterns

Real-world PromQL queries you will use in production:

# SLO compliance: is error rate below 1%? (returns a result only while the condition holds)
sum(rate(http_requests_total{status_code=~"5.."}[30m]))
  /
sum(rate(http_requests_total[30m]))
  < 0.01

# Availability: what % of requests succeeded?
sum(rate(http_requests_total{status_code!~"5.."}[1h]))
  /
sum(rate(http_requests_total[1h]))
  * 100

# CPU utilisation (%) per instance, a common saturation signal
100 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance) * 100

# Memory pressure — warn when available is low
node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 < 20

# Rate of change in error rate (is it getting worse?)
deriv(rate(http_requests_total{status_code=~"5.."}[10m])[30m:1m])

# Comparing current vs 1 week ago (week-over-week traffic)
rate(http_requests_total[5m])
  /
rate(http_requests_total[5m] offset 7d)

PromQL Learning Path: Start with basic selectors and rate(). Then learn aggregations (sum by, avg). Then recording rules to pre-compute expensive queries. Then histogram_quantile() for percentiles. Then multi-step calculations for SLO compliance. Each step builds on the previous.

Alertmanager — Routing, Grouping & Silencing

Alertmanager receives alerts from Prometheus, applies routing logic, and delivers notifications to the appropriate channels. Its three key features:

1. Grouping

Multiple related alerts are bundled into a single notification. Without grouping, a single database outage could fire 500 separate "instance down" alerts. With grouping, they are delivered as one notification: "500 instances in us-east-1 are down."

2. Inhibition

A critical alert can suppress related lower-priority alerts. If a cluster node is down (critical), Alertmanager inhibits all the "service not responding" alerts from pods on that node — since the root cause is known.

3. Silencing

During planned maintenance, silence alerts matching specific labels for a defined window. This prevents alert fatigue from notifications about expected downtime.

# alertmanager.yml — routing configuration
global:
  resolve_timeout: 5m

route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 30s       # Wait 30s to collect related alerts before sending
  group_interval: 5m    # How long to wait before sending updates
  repeat_interval: 12h  # How long before re-alerting
  receiver: 'slack-default'
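  routes:
    # Critical alerts go to PagerDuty immediately. The first matching route wins,
    # so a critical platform alert stops here unless you add `continue: true`.
    - matchers:
        - severity="critical"
      receiver: 'pagerduty'
      group_wait: 0s
    # Platform team alerts to their Slack channel
    - matchers:
        - team="platform"
      receiver: 'slack-platform'

receivers:
  - name: 'slack-default'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/...'
        channel: '#alerts'
        title: '{{ range .Alerts }}{{ .Annotations.summary }}{{ end }}'
  # Every receiver named in a route must be defined, or the config fails to load
  - name: 'slack-platform'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/...'
        channel: '#platform-alerts'  # hypothetical channel name
  - name: 'pagerduty'
    pagerduty_configs:
      - routing_key: 'your-pagerduty-routing-key'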
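
To get a feel for how routing and grouping interact, here is a simplified model of the configuration above: each alert goes to the first child route whose labels all match (or to the default receiver), and alerts that share a receiver and the same group_by label values land in one notification. It ignores timing (group_wait, group_interval), nested routes and `continue`, and the names are illustrative.

```python
from collections import defaultdict

# Child routes in order; the first match wins
ROUTES = [
    ({"severity": "critical"}, "pagerduty"),
    ({"team": "platform"}, "slack-platform"),
]
DEFAULT_RECEIVER = "slack-default"
GROUP_BY = ("alertname", "cluster", "service")


def pick_receiver(labels: dict[str, str]) -> str:
    """Return the receiver of the first route whose matchers all match."""
    for matchers, receiver in ROUTES:
        if all(labels.get(name) == value for name, value in matchers.items()):
            return receiver
    return DEFAULT_RECEIVER


def group_alerts(alerts: list[dict[str, str]]) -> dict[tuple, list[dict[str, str]]]:
    """Bundle alerts that share a receiver and identical group_by label values."""
    groups = defaultdict(list)
    for labels in alerts:
        key = (pick_receiver(labels),) + tuple(labels.get(name, "") for name in GROUP_BY)
        groups[key].append(labels)
    return dict(groups)


if __name__ == "__main__":
    alerts = [
        {"alertname": "InstanceDown", "cluster": "us-east-1", "instance": "a", "severity": "critical"},
        {"alertname": "InstanceDown", "cluster": "us-east-1", "instance": "b", "severity": "critical"},
        {"alertname": "HighP99Latency", "cluster": "us-east-1", "severity": "warning"},
    ]
    for key, members in group_alerts(alerts).items():
        print(key, len(members))
```

Note that `instance` is not in group_by, which is exactly why the two InstanceDown alerts collapse into a single notification.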

Conclusion & Next Steps

Prometheus and PromQL form the bedrock of modern metrics systems. You now understand:

  • How time series databases store data efficiently using compression and block-based storage
  • Prometheus architecture: pull model, TSDB, Alertmanager, service discovery
  • How to configure Prometheus scrape jobs, recording rules, and alerting rules
  • PromQL fundamentals: selectors, functions (rate, histogram_quantile), aggregations
  • Practical query patterns for SLO compliance, availability, saturation, and anomaly detection