Part 3: Time Series Data, Prometheus & PromQL

Time Series Databases

A time series database (TSDB) is a database optimised for storing and querying time-indexed data — sequences of (timestamp, value) pairs. Unlike general-purpose databases, TSDBs make aggressive trade-offs: they are optimised for high write throughput of sequential data and time-range queries, at the cost of flexibility in querying arbitrary dimensions.

Core TSDB Concepts

Understanding these concepts is essential for reasoning about metrics at scale:

Concept	Definition	Implication
Time Series	A sequence of (timestamp, value) pairs with a unique metric name + label set	Each unique label combination is a separate series; cardinality = series count
Sample	A single (timestamp, value) data point in a time series	Prometheus default scrape interval: 15s → 4 samples/minute per series
Chunk	A compressed block of consecutive samples for one series	Prometheus encodes timestamps as delta-of-delta and values with XOR (Gorilla-style) compression: ~1.37 bytes/sample on average
Block	A time-bounded directory of chunks + index (default 2h)	Compaction merges blocks; read operations scan blocks in range

Sampling, Retention & Downsampling

Every metrics system must make decisions about how long to keep data at what resolution. The trade-off: finer resolution consumes more storage and memory; coarser resolution loses detail.

                            
Storage Estimation: With 10,000 active time series scraped every 15 seconds, Prometheus ingests 10,000 × 4 samples/min = 40,000 samples/min. At ~1.37 bytes/sample after compression, that is ~55 KB/minute, or ~79 MB/day. For 30 days of retention: ~2.4 GB, excluding index and WAL overhead. The estimate scales linearly with series count.
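The arithmetic above can be wrapped in a small helper to try other series counts or intervals. This is a rough sketch: the function name is hypothetical, and it counts compressed sample bytes only, ignoring index and WAL overhead.

```python
def estimate_storage_bytes(series: int, scrape_interval_s: float,
                           bytes_per_sample: float, retention_days: float) -> float:
    """Estimated bytes for compressed samples only (no index or WAL overhead)."""
    samples_per_series = retention_days * 86_400 / scrape_interval_s
    return series * samples_per_series * bytes_per_sample


per_day = estimate_storage_bytes(10_000, 15, 1.37, 1)
per_month = estimate_storage_bytes(10_000, 15, 1.37, 30)
print(f"{per_day / 1e6:.1f} MB/day, {per_month / 1e9:.2f} GB per 30 days")
# prints: 78.9 MB/day, 2.37 GB per 30 days
```

Doubling the series count or halving the scrape interval doubles the result, which is why cardinality is usually the first thing to check when storage grows unexpectedly.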
                        

Downsampling strategies:

Long-term storage (Thanos, Mimir): Keep full-resolution data in Prometheus and ship it out via remote_write or the Thanos sidecar; Thanos's compactor can then downsample older data to 5m and 1h resolutions
Recording rules: Pre-aggregate expensive, high-cardinality queries into lower-cardinality series
Retention policies: Prometheus keeps 15 days by default (--storage.tsdb.retention.time); shorten it for cost control
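To make the idea of downsampling concrete, here is a minimal sketch that collapses raw samples into fixed windows. It keeps several aggregates per window rather than a single average, so that min, max and mean can still be answered later. The `Aggregate` type and `downsample` function are illustrative names, not part of any Prometheus or Thanos API.

```python
from collections import defaultdict
from typing import NamedTuple


class Aggregate(NamedTuple):
    count: int
    total: float
    minimum: float
    maximum: float


def downsample(samples: list[tuple[int, float]], window_s: int) -> dict[int, Aggregate]:
    """Group (unix_ts, value) samples into fixed windows keyed by window start."""
    buckets: dict[int, list[float]] = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % window_s].append(value)
    return {
        start: Aggregate(len(vs), sum(vs), min(vs), max(vs))
        for start, vs in sorted(buckets.items())
    }


# 40 raw samples at a 15s interval (10 minutes) become two 5-minute aggregates.
raw = [(t, float(t // 15)) for t in range(0, 600, 15)]
five_min = downsample(raw, 300)
print(len(raw), "->", len(five_min))  # prints: 40 -> 2
```

The trade-off is visible in the output: a 20x reduction in points, at the cost of no longer knowing what happened inside each window.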

Prometheus Architecture

                                flowchart TD
                                    A[Targets\nApps & Infrastructure] -->|/metrics endpoint| B[Prometheus Server\nScrape Engine]
                                    B -->|stores| C[TSDB\nLocal Storage]
                                    B -->|evaluates| D[Alerting Rules]
                                    D -->|sends alerts| E[Alertmanager]
                                    E -->|routes| F[PagerDuty / Slack / Email]
                                    C -->|PromQL queries| G[Grafana]
                                    C -->|remote_write| H[Long-Term Storage\nThanos / Mimir / Cortex]
                                    I[Service Discovery\nK8s / Consul / EC2] -->|target list| B

The Pull Model — Why Prometheus Scrapes

Most legacy monitoring systems use a push model: applications send metrics to the monitoring system. Prometheus uses the opposite — a pull model: Prometheus periodically scrapes (HTTP GETs) a /metrics endpoint on each target.

Advantages of the pull model:

Simple target health check: If Prometheus cannot scrape a target, it knows the target is down — no separate health check needed
Config in one place: Scrape targets are defined in Prometheus config, not scattered across every application
No backpressure: Prometheus controls the scrape rate; a misbehaving application cannot flood the metrics backend
Debugging: You can manually curl the /metrics endpoint to see exactly what data Prometheus is collecting

# Manually curl a Prometheus metrics endpoint
curl http://localhost:8080/metrics

# Example output:
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 42
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 2.84
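Because the exposition format is plain text, it is easy to inspect programmatically as well. The sketch below parses output like the example above; it is deliberately simplified (it assumes one sample per line with no trailing timestamp) and `parse_metrics` is a hypothetical helper, not a client-library function.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse samples from Prometheus text exposition output (no timestamps)."""
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        name, value = line.rsplit(" ", 1)  # value is the last space-separated field
        samples[name] = float(value)
    return samples


EXAMPLE = """\
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 42
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 2.84
"""
parsed = parse_metrics(EXAMPLE)
print(parsed)
```

For anything beyond quick debugging, an official client library's parser is the safer choice, since it handles labels, escaping and timestamps.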

TSDB Storage Internals

Prometheus's TSDB is a custom storage engine with excellent write performance. Key design decisions:

In-memory head block: Recent 2 hours of data kept in RAM for fast writes and reads
WAL (Write-Ahead Log): New samples written to WAL first for crash recovery
Chunk compression: Timestamps use delta-of-delta encoding and values use XOR encoding, averaging ~1.37 bytes/sample (vs 16 bytes raw)
Compaction: Every 2 hours, head block flushed to disk; compaction merges smaller blocks into larger ones
Inverted index: Label-to-series mapping enables fast label-based queries
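The compression figures make more sense once you see what the two transforms produce. The following sketch shows only the transform step, not the variable-length bit packing that follows it in a real chunk; the function names are illustrative, and `xor_values` assumes a non-empty input.

```python
import struct


def delta_of_delta(timestamps: list[int]) -> list[int]:
    """First timestamp raw, then the first delta, then delta-of-delta for the rest."""
    if len(timestamps) < 2:
        return list(timestamps)
    out = [timestamps[0], timestamps[1] - timestamps[0]]
    for i in range(2, len(timestamps)):
        prev_delta = timestamps[i - 1] - timestamps[i - 2]
        out.append((timestamps[i] - timestamps[i - 1]) - prev_delta)
    return out


def float_bits(x: float) -> int:
    """Reinterpret a float64 as a 64-bit unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_values(values: list[float]) -> list[int]:
    """First value's raw bits, then the XOR of each value with its predecessor."""
    bits = [float_bits(v) for v in values]
    return [bits[0]] + [bits[i] ^ bits[i - 1] for i in range(1, len(bits))]


ts = [1000, 1015, 1030, 1045, 1061]  # a regular 15s scrape with one second of jitter
print(delta_of_delta(ts))                 # prints: [1000, 15, 0, 0, 1]
print(xor_values([42.0, 42.0, 42.0])[1:])  # prints: [0, 0]
```

Regular scrape intervals and slowly changing values turn into long runs of zeros and small numbers, which is what lets the encoder spend only a bit or two on most samples.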

Service Discovery

In dynamic environments (Kubernetes, cloud), targets come and go constantly. Prometheus supports automatic service discovery from many sources:

# prometheus.yml — Kubernetes service discovery
scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Only scrape pods with the annotation prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      # Scrape the port from the prometheus.io/port annotation, if specified
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      # Add namespace and pod name as labels
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
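Relabeling rules that rewrite __address__ are a common source of silent scrape failures, so it helps to test the regex outside Prometheus. A widely used pattern joins the discovered address and the port annotation, then keeps the host and swaps in the annotated port. The sketch below mimics that behaviour under two assumptions worth stating: source label values are joined with ";" (the default separator) and the regex must match the whole joined string. `rewrite_address` is a hypothetical helper for experimentation only.

```python
import re

# Host, optional existing port, the ';' separator, then the annotated port.
ADDRESS_PORT = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def rewrite_address(address: str, port_annotation: str) -> str:
    """Return host:annotated_port, or the address unchanged if the regex does not match."""
    joined = f"{address};{port_annotation}"  # source label values joined with ';'
    match = ADDRESS_PORT.fullmatch(joined)   # relabel regexes are anchored at both ends
    if match is None:
        return address                       # no match means no replacement
    return f"{match.group(1)}:{match.group(2)}"


print(rewrite_address("10.0.0.5:8080", "9102"))  # prints: 10.0.0.5:9102
print(rewrite_address("10.0.0.5:8080", ""))      # prints: 10.0.0.5:8080
```

The second call shows why the pattern is safe for pods without the annotation: an empty port fails to match, so the discovered address is left alone.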

Prometheus Configuration

Scrape Configuration

A complete Prometheus configuration file structure:

# prometheus.yml — complete configuration example
global:
  scrape_interval: 15s         # Default scrape interval
  evaluation_interval: 15s     # How often to evaluate rules
  external_labels:
    environment: production
    region: us-east-1

# Alertmanager targets
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']

# Rule files (recording rules + alerting rules)
rule_files:
  - "recording_rules.yml"
  - "alerting_rules.yml"

scrape_configs:
  # Scrape Prometheus itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  # Scrape Node Exporter on all servers
  - job_name: 'node'
    static_configs:
      - targets:
          - 'server1:9100'
          - 'server2:9100'
          - 'server3:9100'
    scrape_interval: 30s  # Override global for this job

  # Scrape an application with custom path
  - job_name: 'myapp'
    metrics_path: /internal/metrics
    scheme: https
    static_configs:
      - targets: ['myapp.example.com:443']
    tls_config:
      insecure_skip_verify: false

Recording Rules

Recording rules pre-compute expensive queries and store the result as a new time series. This is essential for:

Dashboard queries that would otherwise be too slow
Reducing cardinality of frequently-queried aggregations
Creating "summary" metrics that cross multiple jobs or services

# recording_rules.yml
groups:
  - name: http_request_rates
    interval: 30s
    rules:
      # Pre-compute 5-minute request rate by job and status
      - record: job:http_requests:rate5m
        expr: sum(rate(http_requests_total[5m])) by (job, status_code)

      # Pre-compute p99 latency per job
      - record: job:http_request_duration_p99:rate5m
        expr: |
          histogram_quantile(0.99,
            sum(rate(http_request_duration_seconds_bucket[5m])) by (job, le)
          )

      # Pre-compute error rate per job
      - record: job:http_error_rate:rate5m
        expr: |
          sum(rate(http_requests_total{status_code=~"5.."}[5m])) by (job)
          /
          sum(rate(http_requests_total[5m])) by (job)
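The p99 rule relies on histogram_quantile(), which estimates a percentile from cumulative bucket counts rather than from raw observations. A simplified sketch of that estimate is shown below. It assumes classic histograms with buckets sorted by upper bound and ending in +Inf, a first bucket that starts at 0, and 0 < q <= 1; it is an approximation of the idea, not Prometheus's implementation.

```python
import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """buckets: (upper_bound, cumulative_count), sorted by bound, ending with +Inf."""
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                return lower_bound  # falls in the +Inf bucket: report the highest finite bound
            # Linear interpolation: assume observations are spread evenly in the bucket.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return math.nan


latency_buckets = [(0.1, 50), (0.25, 90), (0.5, 98), (1.0, 100), (math.inf, 100)]
p99 = histogram_quantile(0.99, latency_buckets)
print(round(p99, 3))  # the 99th of 100 observations lies halfway through the 0.5-1.0 bucket
```

The interpolation step is why bucket boundaries matter: the estimate can never be more precise than the width of the bucket the quantile lands in.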

Alerting Rules

Alerting rules evaluate PromQL expressions and fire alerts to Alertmanager when conditions are met:

# alerting_rules.yml
groups:
  - name: slo_alerts
    rules:
      # High error rate alert
      - alert: HighErrorRate
        expr: job:http_error_rate:rate5m > 0.05
        for: 5m
        labels:
          severity: critical
          team: platform
        annotations:
          summary: "High error rate on {{ $labels.job }}"
          description: "Error rate is {{ $value | humanizePercentage }} on {{ $labels.job }} (threshold: 5%)"
          runbook_url: "https://runbooks.example.com/high-error-rate"

      # High latency alert
      - alert: HighP99Latency
        expr: job:http_request_duration_p99:rate5m > 0.5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High p99 latency on {{ $labels.job }}"
          description: "p99 latency is {{ $value | humanizeDuration }} on {{ $labels.job }}"

      # Instance down alert
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.job }} on {{ $labels.instance }} has been down for more than 1 minute"
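The for: clause is easiest to understand as a small state machine: an alert is pending while its expression has been true for less than the hold duration, and firing once it has been true for at least that long. The sketch below simulates that for a single series, assuming evenly spaced rule evaluations; `alert_states` is an illustrative name.

```python
def alert_states(condition: list[bool], eval_interval_s: int, for_s: int) -> list[str]:
    """State after each rule evaluation: inactive, pending, or firing."""
    states = []
    active_since = None  # evaluation time at which the condition first became true
    for i, is_true in enumerate(condition):
        now = i * eval_interval_s
        if not is_true:
            active_since = None  # any false evaluation resets the timer
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        states.append("firing" if now - active_since >= for_s else "pending")
    return states


# An `up == 0` style condition with for: 1m and a 15s evaluation interval.
down = [False, True, True, True, True, True, False, True]
print(alert_states(down, eval_interval_s=15, for_s=60))
```

Note how a single healthy evaluation sends the alert back to inactive and restarts the clock, which is what makes for: an effective filter against flapping targets.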

PromQL Deep Dive

PromQL (Prometheus Query Language) is a functional query language designed specifically for time series data. Understanding it deeply is what separates Prometheus beginners from practitioners.

Selectors & Matchers

Selectors specify which time series to query. Matchers filter by label values:

# Exact match — select the metric with this exact label
http_requests_total{job="api", status_code="200"}

# Regex match — select all 5xx status codes
http_requests_total{status_code=~"5.."}

# Negative match — exclude 2xx codes
http_requests_total{status_code!~"2.."}

# Range vector selector — get data over last 5 minutes
http_requests_total{job="api"}[5m]

# Instant vector with offset — query value 1 hour ago
http_requests_total offset 1h

Key PromQL Functions

# rate() — per-second rate of increase for counters
# Use for counters. Handles counter resets.
rate(http_requests_total[5m])

# irate() — instant rate (last two samples only)
# More responsive but noisier than rate()
irate(http_requests_total[5m])

# increase() — total increase over a time range
# Useful for "requests in the last hour"
increase(http_requests_total[1h])

# delta() — change in value over range (for gauges)
delta(node_memory_Active_bytes[1h])

# avg_over_time() — average of a gauge over range
avg_over_time(node_cpu_utilization[30m])

# max_over_time() / min_over_time()
max_over_time(node_memory_Active_bytes[1h])

# histogram_quantile() — calculate percentiles from histograms
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))
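To see why rate() is safe to use across process restarts, here is a simplified reimplementation of its core idea. It sums the increases between consecutive samples and treats any drop as a counter reset. Prometheus's real rate() additionally extrapolates to the edges of the range window, which this sketch omits; `simple_rate` is a hypothetical name.

```python
def simple_rate(samples: list[tuple[float, float]]) -> float:
    """Per-second increase over (timestamp_s, counter_value) samples, reset-aware."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter restarted from zero, so the whole new value counts.
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


# The counter resets between t=30 and t=45 (160 -> 10).
window = [(0, 100), (15, 130), (30, 160), (45, 10), (60, 40)]
per_second = simple_rate(window)
print(round(per_second, 3))
```

A naive (last - first) / elapsed calculation on the same data would return a negative rate, which is the failure mode rate() exists to avoid.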

Aggregation Operators

Aggregations reduce the dimensionality of your query results by grouping and combining series:

# sum() — add up values across dimensions
# Total requests per second across all instances
sum(rate(http_requests_total[5m]))

# sum by() — sum but keep specified labels
# Requests per second, broken down by job
sum by (job) (rate(http_requests_total[5m]))

# sum without() — sum but drop specified labels
# Sum everything except the instance label
sum without (instance) (rate(http_requests_total[5m]))

# avg() — average across dimensions
avg(node_cpu_utilization) by (datacenter)

# max() and min()
max by (job) (http_request_duration_p99)

# count() — count number of time series
count(up == 1) by (job)  # How many instances are up per job?

# topk() and bottomk() — top/bottom N series
topk(5, rate(http_requests_total[5m]))  # Top 5 series by request rate (wrap in sum by (...) first to rank services)
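Conceptually, an aggregation with by (...) buckets series by the labels you keep and combines the values in each bucket. The sketch below models that for sum; series are represented as (labels, value) pairs, and `sum_by` is an illustrative function rather than anything Prometheus exposes.

```python
from collections import defaultdict

Labels = dict[str, str]
GroupKey = tuple[tuple[str, str], ...]


def sum_by(series: list[tuple[Labels, float]], keep: tuple[str, ...]) -> dict[GroupKey, float]:
    """Sum values across series, grouping by the labels named in `keep`."""
    totals: dict[GroupKey, float] = defaultdict(float)
    for labels, value in series:
        key = tuple((name, labels.get(name, "")) for name in keep)
        totals[key] += value
    return dict(totals)


rates = [
    ({"job": "api", "instance": "a"}, 10.0),
    ({"job": "api", "instance": "b"}, 5.0),
    ({"job": "web", "instance": "c"}, 2.5),
]
print(sum_by(rates, ("job",)))  # like: sum by (job) (...)
print(sum_by(rates, ()))        # like: sum(...), a single series with no labels
```

Three input series become two with by (job) and one with no grouping labels, which is the reduction in dimensionality described above.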

Practical Query Patterns

Real-world PromQL queries you will use in production:

# SLO compliance: is error rate below 1%?
sum(rate(http_requests_total{status_code=~"5.."}[30m]))
  /
sum(rate(http_requests_total[30m]))
  < 0.01

# Availability: what % of requests succeeded?
sum(rate(http_requests_total{status_code!~"5.."}[1h]))
  /
sum(rate(http_requests_total[1h]))
  * 100

# CPU utilisation (%) per instance
100 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance) * 100

# Memory pressure — warn when available is low
node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 < 20

# Rate of change in error rate (is it getting worse?)
deriv(rate(http_requests_total{status_code=~"5.."}[10m])[30m:1m])

# Comparing current vs 1 week ago (week-over-week traffic)
rate(http_requests_total[5m])
  /
rate(http_requests_total[5m] offset 7d)

                            
PromQL Learning Path: Start with basic selectors and rate(). Then learn aggregations (sum by, avg). Then recording rules to pre-compute expensive queries. Then histogram_quantile() for percentiles. Then multi-step calculations for SLO compliance. Each step builds on the previous.
                        

Alertmanager — Routing, Grouping & Silencing

Alertmanager receives alerts from Prometheus, applies routing logic, and delivers notifications to the appropriate channels. Its three key features:

1. Grouping

Multiple related alerts are bundled into a single notification. Without grouping, a single database outage could fire 500 separate "instance down" alerts. With grouping, they are delivered as one notification: "500 instances in us-east-1 are down."
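Grouping boils down to computing a key from a chosen set of labels and bundling alerts that share it. A minimal sketch, with alerts modelled as plain label dictionaries and `group_alerts` as a hypothetical helper:

```python
from collections import defaultdict

Alert = dict[str, str]


def group_alerts(alerts: list[Alert], group_by: list[str]) -> dict[tuple[str, ...], list[Alert]]:
    """Bundle alerts whose values for the group_by labels are identical."""
    groups: dict[tuple[str, ...], list[Alert]] = defaultdict(list)
    for labels in alerts:
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(labels)
    return dict(groups)


# 500 firing alerts that differ only in the instance label.
alerts = [
    {"alertname": "InstanceDown", "cluster": "us-east-1", "instance": f"db-{i}"}
    for i in range(500)
]
groups = group_alerts(alerts, ["alertname", "cluster"])
print(len(alerts), "alerts ->", len(groups), "notification group")
# prints: 500 alerts -> 1 notification group
```

Adding instance to the group_by list would bring back 500 separate groups, so the choice of grouping labels directly controls notification volume.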

2. Inhibition

A critical alert can suppress related lower-priority alerts. If a cluster node is down (critical), Alertmanager inhibits all the "service not responding" alerts from pods on that node — since the root cause is known.

3. Silencing

During planned maintenance, silence alerts matching specific labels for a defined window. This prevents false alarm fatigue during expected downtime.

# alertmanager.yml — routing configuration
global:
  resolve_timeout: 5m

route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 30s       # Wait 30s to collect related alerts before the first notification
  group_interval: 5m    # Minimum wait before notifying about new alerts added to an existing group
  repeat_interval: 12h  # Minimum wait before re-sending a notification for alerts still firing
  receiver: 'slack-default'
  routes:
    # Critical alerts go to PagerDuty immediately
    - match:
        severity: critical
      receiver: 'pagerduty'
      group_wait: 0s
    # Platform team alerts to their Slack channel
    - match:
        team: platform
      receiver: 'slack-platform'

receivers:
  - name: 'slack-default'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/...'
        channel: '#alerts'
        title: '{{ range .Alerts }}{{ .Annotations.summary }}{{ end }}'
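  - name: 'pagerduty'
    pagerduty_configs:
      - routing_key: 'your-pagerduty-routing-key'
  # Receiver referenced by the platform route above; the channel name is a placeholder
  - name: 'slack-platform'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/...'
        channel: '#platform-alerts'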
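Route order matters in this configuration: child routes are checked top to bottom, and the first match wins unless a route sets continue: true. The sketch below mirrors the two child routes and the default receiver from the config to show the consequence. It models equality matching only, and `pick_receiver` is an illustrative function, not Alertmanager code.

```python
# (labels to match, receiver), checked in order, mirroring the routes in the config.
ROUTES = [
    ({"severity": "critical"}, "pagerduty"),
    ({"team": "platform"}, "slack-platform"),
]
DEFAULT_RECEIVER = "slack-default"


def pick_receiver(labels: dict[str, str]) -> str:
    """Return the receiver of the first child route whose labels all match."""
    for match, receiver in ROUTES:
        if all(labels.get(name) == value for name, value in match.items()):
            return receiver  # first match wins because no route sets continue: true
    return DEFAULT_RECEIVER


print(pick_receiver({"severity": "critical", "team": "platform"}))  # prints: pagerduty
print(pick_receiver({"severity": "warning", "team": "platform"}))   # prints: slack-platform
print(pick_receiver({"severity": "warning"}))                       # prints: slack-default
```

A critical alert owned by the platform team therefore pages but never reaches the platform Slack channel. If both notifications are wanted, the critical route would need continue: true.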

Conclusion & Next Steps

Prometheus and PromQL form the bedrock of modern metrics systems. You now understand:

How time series databases store data efficiently using compression and block-based storage
Prometheus architecture: pull model, TSDB, Alertmanager, service discovery
How to configure Prometheus scrape jobs, recording rules, and alerting rules
PromQL fundamentals: selectors, functions (rate, histogram_quantile), aggregations
Practical query patterns for SLO compliance, availability, saturation, and anomaly detection
