Prometheus Deep Dive Part 1: Observability, Monitoring & Prometheus

The Evolution of Monitoring

To understand where Prometheus fits in the monitoring landscape, we need to trace the lineage of monitoring systems from their earliest forms to the cloud-native era. Each generation solved the problems of its time while creating the constraints that the next generation would overcome.

Early Monitoring Systems

The history of system monitoring stretches back to the earliest networked computers, but the modern era begins with a few pivotal systems:

Timeline

Monitoring System Evolution

Era	System	Paradigm	Limitations
1988	SNMP	Agent-based polling	Limited to network devices, complex MIBs
1999	Nagios	Check-based (up/down)	No time series, configuration explosion at scale
2006	Graphite	Push metrics + dashboards	No labels/dimensions, hierarchical naming
2008	Borgmon (Google internal)	Pull-based, label-dimensional, rules	Proprietary, never open-sourced
2012	Prometheus (SoundCloud)	Pull-based, multi-dimensional, PromQL	Single-node TSDB, 15-day default retention
2013	InfluxDB	Push-based time series DB	Clustering complexity, commercial features gated
2017	Thanos / Cortex	Prometheus long-term storage	Operational complexity
2022	Grafana Mimir	Horizontally scalable Prometheus	Requires object storage infrastructure
2023	OpenTelemetry Metrics	Vendor-neutral telemetry pipeline	Still maturing, ecosystem fragmentation


The Nagios era (late 1990s–2010s) defined monitoring as “Is it up or down?” Nagios and its forks (Icinga, Shinken) excelled at host and service checks but lacked time-series storage. You knew something was broken, but understanding why required SSH-ing into machines and reading logs manually.

The Graphite era (2006–2015) introduced time-series metrics collection via StatsD and carbon. Teams could finally graph metric values over time. But Graphite used hierarchical dot-notation naming (servers.web01.cpu.user) which made ad-hoc querying across dimensions nearly impossible. Want CPU usage by region? You needed a completely different metric path.
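To make the contrast concrete, here is a minimal sketch of why labels help with ad-hoc grouping. The metric name `cpu_user`, the `host` and `region` labels, and the sample values are all hypothetical, and `avg_by` is only an illustration of the idea, not how Graphite or Prometheus is implemented.

```python
from collections import defaultdict

# Hypothetical samples. In a hierarchical path such as "servers.web01.cpu.user",
# the region is not part of the name, so grouping by region needs a new path.
# With labels, region is just one more key on the same series.
LABELED_SAMPLES = [
    ({"__name__": "cpu_user", "host": "web01", "region": "eu"}, 0.25),
    ({"__name__": "cpu_user", "host": "web02", "region": "eu"}, 0.75),
    ({"__name__": "cpu_user", "host": "web03", "region": "us"}, 0.125),
]


def avg_by(samples, label):
    """Average sample values grouped by any one label, chosen at query time."""
    groups = defaultdict(list)
    for labels, value in samples:
        groups[labels[label]].append(value)
    return {key: sum(values) / len(values) for key, values in groups.items()}
```

Calling `avg_by(LABELED_SAMPLES, "region")` and `avg_by(LABELED_SAMPLES, "host")` uses the same data and the same function; only the grouping label changes.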

Google’s Borgmon

Inside Google, the Borg cluster manager (predecessor to Kubernetes) had its own monitoring system: Borgmon. First described publicly in Google’s 2016 SRE book, Borgmon introduced several revolutionary concepts:

                            
Borgmon’s Key Innovations:
Multi-dimensional data model: metrics identified by name + key-value label pairs, not hierarchical paths
Pull-based collection: Borgmon scrapes targets rather than targets pushing to it
Powerful query language: algebraic expressions over time series for dashboards and alerting
Rules-based alerting: alerts defined as expressions, not threshold checks
Service discovery integration: automatically finds targets from the cluster scheduler

                        

These same principles became the foundation of Prometheus. Matt Proud and Julius Volz, both ex-Googlers who joined SoundCloud, brought Borgmon’s philosophy to the open-source world.

Birth at SoundCloud

In 2012, SoundCloud was rapidly adopting microservices and found existing monitoring tools inadequate. Nagios couldn’t handle the dynamic nature of containerized services, and Graphite’s hierarchical naming was too rigid for multi-dimensional queries.

Matt Proud and Julius Volz began building Prometheus as an internal project, drawing directly from their experience with Borgmon at Google. The key design decisions made at SoundCloud:

Written in Go — single binary, easy deployment, no external dependencies
Pull-based scraping — Prometheus actively fetches metrics from instrumented targets
Local TSDB storage — no external database required; everything self-contained
Label-based data model — every metric has arbitrary key-value pairs for dimensional queries
PromQL — a functional query language purpose-built for time series aggregation
Alerting via expressions — alerts are PromQL queries that evaluate to true/false

SoundCloud publicly announced Prometheus in January 2015, and it quickly gained traction as the Kubernetes ecosystem was forming. The timing was fortunate: Kubernetes needed a monitoring system that understood dynamic service discovery, and Prometheus needed a platform that generated the kind of ephemeral, labeled workloads it was designed to monitor.

CNCF Graduation & Ecosystem

Prometheus joined the Cloud Native Computing Foundation (CNCF) in May 2016 as its second hosted project after Kubernetes itself. It graduated in August 2018, signifying production readiness and a healthy governance model.

Prometheus Project Timeline

timeline
    title Prometheus Milestones
    2012 : Development begins at SoundCloud
    2015 : Open-sourced (v0.1)
    2016 : Joins CNCF
         : Prometheus 1.0 release
    2017 : Prometheus 2.0 (new TSDB)
         : Thanos project announced
    2018 : CNCF Graduation
         : Remote write protocol formalized
    2019 : Cortex donated to CNCF
    2020 : OpenMetrics standardization
    2022 : Grafana Mimir open-sourced
         : Native histograms introduced
    2023 : Prometheus 2.47+ (UTF-8 metrics)
    2024 : OpenTelemetry Prometheus receiver GA
         : Prometheus 3.0 (OTLP native ingestion)

Today, Prometheus has over 55,000 GitHub stars, 900+ contributors, and forms the metrics backbone for the majority of Kubernetes deployments worldwide. Its exposition format became the basis for the OpenMetrics standard (RFC draft), and its remote-write protocol is the de facto standard for metrics ingestion across the ecosystem.

Observability Terminology

Monitoring vs Observability

These terms are often used interchangeably, but they represent fundamentally different philosophies:

                            
Monitoring tells you when something is wrong. It answers predefined questions: “Is the service up?”, “Is latency above threshold?”, “Are errors increasing?” You must know what to ask in advance.

Observability lets you ask arbitrary questions about your system’s internal state by examining its external outputs. It answers questions you didn’t anticipate: “Why are requests from region X, for user segment Y, using API version Z, experiencing 3x normal latency?”
                        

Prometheus primarily enables monitoring (predefined metric collection and alerting), but its multi-dimensional data model and PromQL make it significantly more powerful than traditional monitoring tools. Combined with logs (Loki) and traces (Tempo), it forms a complete observability system.

The Three Pillars + Profiles

Modern observability is built on four telemetry signal types, each offering a different lens into system behavior:

The Four Signals of Observability

flowchart LR
    subgraph Signals["Telemetry Signals"]
        M["Metrics<br/>What happened?<br/>Numeric aggregates<br/>over time"]
        L["Logs<br/>What details?<br/>Discrete events<br/>with context"]
        T["Traces<br/>Where specifically?<br/>Request flow<br/>across services"]
        P["Profiles<br/>Why resource usage?<br/>CPU/memory<br/>at code level"]
    end

    M -->|"Prometheus<br/>Mimir"| D["Dashboards<br/>& Alerts"]
    L -->|"Loki<br/>Elasticsearch"| D
    T -->|"Tempo<br/>Jaeger"| D
    P -->|"Pyroscope<br/>pprof"| D

Signal	Prometheus Role	Example	Cardinality
Metrics	Primary: collection, storage, querying, alerting	http_requests_total{method="GET", status="200"}	Low (aggregated)
Logs	Indirect: Loki uses PromQL-like LogQL	{job="api"} |= "error" | json	High (per-event)
Traces	Indirect — exemplars link metrics to traces	Trace ID embedded in histogram bucket	Very high (per-request)
Profiles	Complementary — Pyroscope correlates with metrics	CPU flame graph for high-latency period	Very high (per-function)

Metric Types & Semantics

Prometheus defines four core metric types, each with distinct semantics that determine how they should be queried:

# Counter - monotonically increasing value (resets on restart)
# USE: request counts, bytes sent, errors total
# QUERY: Always use rate() or increase() - raw value is meaningless
http_requests_total{method="GET", handler="/api/users", status="200"} 142857

# Gauge - value that can go up and down
# USE: temperature, memory usage, active connections, queue depth
# QUERY: Direct value is meaningful; use avg_over_time(), max_over_time()
node_memory_MemAvailable_bytes 8589934592

# Histogram - samples observations into configurable buckets
# USE: request duration, response size - anything where distribution matters
# QUERY: histogram_quantile(0.99, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))
http_request_duration_seconds_bucket{le="0.1"} 24054
http_request_duration_seconds_bucket{le="0.25"} 100392
http_request_duration_seconds_bucket{le="0.5"} 129389
http_request_duration_seconds_bucket{le="1.0"} 133988
http_request_duration_seconds_bucket{le="+Inf"} 144320
http_request_duration_seconds_sum 53423.67
http_request_duration_seconds_count 144320

# Summary - client-side calculated quantiles (pre-aggregated)
# USE: When you need accurate client-side quantiles and don't need to aggregate across instances
# LIMITATION: Cannot aggregate quantiles across multiple instances
go_gc_duration_seconds{quantile="0.5"} 0.000235
go_gc_duration_seconds{quantile="0.9"} 0.000892
go_gc_duration_seconds{quantile="0.99"} 0.003401
go_gc_duration_seconds_sum 4.293820
go_gc_duration_seconds_count 18232
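To show how a quantile can be estimated from cumulative histogram buckets, here is a rough sketch of the linear-interpolation idea behind `histogram_quantile()`. It is a simplified illustration, not Prometheus' actual implementation: it works on the raw cumulative counts from the sample above rather than on `rate()` results, and the function and variable names are assumptions made for this example.

```python
import math

# Cumulative (upper_bound, count) pairs copied from the histogram sample above.
BUCKETS = [
    (0.1, 24054),
    (0.25, 100392),
    (0.5, 129389),
    (1.0, 133988),
    (math.inf, 144320),
]


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 < q < 1) by interpolating inside one bucket."""
    if not 0 < q < 1:
        raise ValueError("q must be strictly between 0 and 1")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total  # how many observations lie at or below the quantile
    lower_bound, count_below = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # The quantile falls in the +Inf bucket, so the best available
                # answer is the highest finite bucket boundary.
                return lower_bound
            in_bucket = count - count_below
            fraction = (rank - count_below) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, count_below = upper_bound, count
    return lower_bound
```

With these buckets the median lands between 0.1 and 0.25 seconds, while the 99th percentile falls in the `+Inf` bucket and is reported as 1.0. That is why bucket boundaries should cover the latencies you care about.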

                            
Common Mistake: Using a summary where a histogram is needed. Summary quantiles cannot be aggregated across instances: the P99 of P99s is NOT the global P99. Prefer histograms for request latency in distributed systems. A summary is reasonable mainly when you have a single instance and want quantiles computed client-side, with no server-side computation at query time.
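A small worked example may help show why per-instance quantiles do not combine. The two instances and their latencies are hypothetical, and `nearest_rank_percentile` is a plain nearest-rank calculation chosen for clarity, not the estimator any Prometheus client library uses.

```python
import math


def nearest_rank_percentile(values, percent):
    """Return the nearest-rank percentile (percent is an integer from 1 to 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in seconds: a busy, fast instance and a quiet, slow one.
instance_a = [0.1] * 1000
instance_b = [2.0] * 10

p99_a = nearest_rank_percentile(instance_a, 99)
p99_b = nearest_rank_percentile(instance_b, 99)

# Averaging per-instance quantiles weights both instances equally,
# even though instance B served only about 1% of the traffic.
average_of_p99s = (p99_a + p99_b) / 2

# The real global P99 needs every observation (or mergeable bucket counts).
global_p99 = nearest_rank_percentile(instance_a + instance_b, 99)
```

Here `average_of_p99s` comes out near 1.05 seconds while `global_p99` is 0.1 seconds. Histogram buckets avoid this because bucket counts from different instances can be summed before the quantile is estimated.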
                        

Labels & Dimensions

Labels are the heart of Prometheus’ power. Every unique combination of metric name + label key-value pairs creates a distinct time series:

# These are FOUR distinct time series:
http_requests_total{method="GET", handler="/api/users", status="200"}   → series 1
http_requests_total{method="GET", handler="/api/users", status="500"}   → series 2
http_requests_total{method="POST", handler="/api/users", status="201"}  → series 3
http_requests_total{method="GET", handler="/api/orders", status="200"}  → series 4

# Cardinality = unique combinations of all label values
# For this metric: methods(4) × handlers(50) × statuses(10) = 2,000 series
# Add an instance label with 100 pods: 2,000 × 100 = 200,000 series!

# GOOD labels (bounded cardinality):
http_requests_total{method="GET", status="200", service="user-api", environment="production"}

# BAD labels (unbounded cardinality - will kill your Prometheus):
http_requests_total{user_id="abc123", request_id="req-789", trace_id="..."}

The cardinality of a metric is the total number of unique time series it creates. High cardinality is the primary scaling challenge in Prometheus deployments. We’ll explore cardinality management in depth in Part 8 (Optimizing & Debugging).
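The arithmetic above can be sketched in a few lines. The label values below are placeholders chosen only to match the counts in the example (4 methods, 50 handlers, 10 statuses, 100 pods), and both helper functions are hypothetical names for this illustration.

```python
from math import prod


def max_cardinality(label_values):
    """Upper bound on series count: the product of distinct values per label."""
    return prod(len(values) for values in label_values.values())


def observed_cardinality(label_sets):
    """Number of distinct label combinations actually seen."""
    return len({frozenset(labels.items()) for labels in label_sets})


# Placeholder label values sized to match the example in the text.
LABEL_VALUES = {
    "method": ["GET", "POST", "PUT", "DELETE"],
    "handler": [f"/api/resource{i}" for i in range(50)],
    "status": ["200", "201", "204", "301", "400", "401", "403", "404", "500", "503"],
}

# The same labels with a per-pod instance label added.
WITH_INSTANCE = dict(LABEL_VALUES, instance=[f"pod-{i}" for i in range(100)])
```

`max_cardinality` gives the worst case. Real deployments usually see fewer series because not every combination occurs, which is what `observed_cardinality` counts.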

Prometheus’ Role in Observability

The Pull-Based Model

Prometheus’ most distinctive architectural choice is its pull-based (scrape) model. Instead of applications pushing metrics to a central collector, Prometheus actively fetches metrics from HTTP endpoints exposed by targets:

Pull vs Push Models

flowchart TD
    subgraph Pull["Pull Model (Prometheus)"]
        P[Prometheus Server]
        T1[Target: /metrics]
        T2[Target: /metrics]
        T3[Target: /metrics]
        P -->|"GET /metrics<br/>every 15s"| T1
        P -->|"GET /metrics<br/>every 15s"| T2
        P -->|"GET /metrics<br/>every 15s"| T3
    end

    subgraph Push["Push Model (StatsD/InfluxDB)"]
        C[Collector/DB]
        A1[App]
        A2[App]
        A3[App]
        A1 -->|"Push UDP/TCP"| C
        A2 -->|"Push UDP/TCP"| C
        A3 -->|"Push UDP/TCP"| C
    end

Advantages of pull:

Easier to detect “target down” — if a scrape fails, Prometheus knows immediately; with push, silence is ambiguous
No backpressure on applications — targets serve metrics on demand, they don’t need retry/buffer logic
Central control of scrape frequency — one configuration change affects all targets
Simpler firewall rules — only Prometheus needs outbound access to targets
Development convenience — curl the /metrics endpoint directly to debug instrumentation

Disadvantages of pull:

Short-lived jobs — batch jobs that finish before the next scrape miss data (solved by Pushgateway)
Network boundaries — Prometheus must reach all targets (solved by federation/remote-write)
Event-based metrics — not ideal for high-frequency events where every occurrence matters
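The pull model described above can be sketched with the Python standard library alone. This is a toy illustration, not how Prometheus or its client libraries are implemented: the metric `demo_requests_total`, the helper names, and the timeout are assumptions for the example. The returned flag mirrors the `up` series that Prometheus records for each scrape (1 on success, 0 on failure).

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical metric in the text exposition format.
METRICS_BODY = 'demo_requests_total{method="GET"} 42\n'


class MetricsHandler(BaseHTTPRequestHandler):
    """A target: it only answers when asked, with no push or retry logic."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = METRICS_BODY.encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def start_target():
    """Start a target on a free loopback port and return the server object."""
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def scrape(url, timeout=2.0):
    """Fetch one target and return (up, body): up is 1 on success, 0 on failure."""
    # An empty ProxyHandler ignores proxy environment variables for this local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return 1, response.read().decode()
    except OSError:
        return 0, ""
```

A failed scrape is itself a signal: the scraper knows the target was unreachable at a specific moment, whereas a push-based collector only sees silence.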

Architecture Overview

Prometheus Core Architecture

flowchart TD
    subgraph Targets["Monitored Targets"]
        App1["Application<br/>/metrics endpoint"]
        App2["Node Exporter<br/>/metrics"]
        App3["Kubernetes API<br/>Service Discovery"]
    end

    subgraph Prom["Prometheus Server"]
        SD["Service Discovery"]
        SC["Scrape Manager"]
        TSDB["Local TSDB<br/>(time series storage)"]
        RE["Rule Engine<br/>(recording + alerting rules)"]
        API["HTTP API<br/>(PromQL queries)"]
    end

    subgraph Downstream["Downstream"]
        AM["Alertmanager<br/>(dedup, routing, silencing)"]
        GF["Grafana<br/>(dashboards)"]
        RW["Remote Write<br/>(Mimir, Thanos, VictoriaMetrics)"]
    end

    App3 --> SD
    SD --> SC
    SC -->|"scrape /metrics"| App1
    SC -->|"scrape /metrics"| App2
    SC --> TSDB
    TSDB --> RE
    TSDB --> API
    RE -->|"fire alerts"| AM
    API -->|"query"| GF
    TSDB -->|"remote_write"| RW

Ecosystem Components

The Prometheus server itself is a single binary, but a working setup usually draws on an ecosystem of purpose-built components:

Component	Purpose	When You Need It
Prometheus Server	Scraping, storage, query, rules	Always — the core
Alertmanager	Alert routing, deduplication, silencing	Any production alerting
Pushgateway	Metrics bridge for batch/short-lived jobs	Cron jobs, CI builds, Lambda functions
Node Exporter	Linux host metrics (CPU, memory, disk, network)	Any Linux infrastructure
Blackbox Exporter	Probing endpoints (HTTP, TCP, DNS, ICMP)	Synthetic monitoring, uptime checks
Client Libraries	Instrument application code (Go, Java, Python, etc.)	Custom application metrics
Exporters (100+)	Translate third-party metrics to Prometheus format	MySQL, Redis, Kafka, AWS, etc.

Design Philosophy & Tradeoffs

Reliability Over Accuracy

Prometheus makes a deliberate tradeoff: reliability of the monitoring system itself over perfect accuracy of every data point. This manifests in several ways:

                            
Prometheus Design Principles:
Each Prometheus server is independent: no clustering is required for basic operation. If one server fails, others continue working
Eventual consistency is acceptable: a missed scrape or small gap in data is preferable to a monitoring system that crashes under load
Simple over complex: the core binary has zero external dependencies (no ZooKeeper, no Kafka, no Cassandra)
Local decision-making: alerting rules evaluate locally; alerts fire even if the network to downstream is degraded

                        

This philosophy stems from a core insight: your monitoring system must be more reliable than the systems it monitors. A monitoring system with complex distributed consensus requirements will fail in exactly the scenarios where you need it most — during network partitions, infrastructure failures, and cascading outages.

Local Storage by Default

Prometheus stores all time series data in a local Time Series Database (TSDB) on disk. This is simultaneously its greatest strength and primary limitation:

Strength	Limitation
Zero external dependencies	Limited by local disk capacity
Fast queries (local SSD)	Data loss if disk fails (mitigate with RAID/replication)
Simple operations	No global query view across instances
Predictable performance	Typically 15–30 day retention

For most teams, the local TSDB is sufficient. When you outgrow it, the remote write protocol lets you replicate data to long-term storage systems (Mimir, Thanos, VictoriaMetrics) — covered in Parts 10 and 11 of this track.
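To judge whether local storage is enough, a back-of-the-envelope estimate is often sufficient. The sketch below multiplies retention time by ingestion rate and bytes per sample, which is the rough rule given in the Prometheus storage documentation. The figure of about 2 bytes per sample is an approximation, and the series count reuses the hypothetical 200,000-series example from the labels section.

```python
def estimate_disk_bytes(active_series, scrape_interval_s, retention_days,
                        bytes_per_sample=2.0):
    """Rough TSDB size: retention seconds x samples per second x bytes per sample."""
    samples_per_second = active_series / scrape_interval_s
    retention_seconds = retention_days * 24 * 60 * 60
    return retention_seconds * samples_per_second * bytes_per_sample


# Hypothetical workload: 200,000 series scraped every 15s, kept for 15 days.
estimate = estimate_disk_bytes(
    active_series=200_000,
    scrape_interval_s=15,
    retention_days=15,
)
estimate_gb = estimate / 1e9
```

For this workload the estimate is roughly 35 GB, comfortably within a single disk. Treat the result as an order-of-magnitude guide, since compression depends on the data and on series churn.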

What Prometheus Is Not

Understanding Prometheus’ boundaries prevents misuse and disappointment:

                            
Prometheus is NOT:
An event logging system: it stores aggregated metrics, not individual events. Use Loki for logs
A long-term storage system: default retention is 15 days. Use Mimir/Thanos for years of data
A system for 100% accurate per-request billing: values are sampled at each scrape, and functions such as rate() extrapolate from those samples, so results are approximate
A distributed database: each server is independent. Use federation or remote storage for global views
An anomaly detection engine: it provides the data; ML-based detection requires additional tooling
A dashboarding tool: Grafana is the standard visualization layer

                        

Prometheus vs Other Systems

vs Graphite

Aspect	Prometheus	Graphite
Data Model	Multi-dimensional (labels)	Hierarchical (dot-notation)
Collection	Pull (scrape)	Push (StatsD/carbon)
Query Language	PromQL (functional)	Graphite functions (pipe-based)
Storage	Local TSDB (compressed)	Whisper files (fixed-size)
Service Discovery	Native (K8s, Consul, DNS, etc.)	None (manual configuration)
Alerting	Built-in rules + Alertmanager	Requires external (Grafana alerts)
Scalability	Single server; scale via sharding/remote-write	Relay + carbon-cache clustering

vs InfluxDB

Aspect	Prometheus	InfluxDB
License	Apache 2.0 (fully open)	MIT (OSS) / Proprietary (Cloud)
Data Model	Metrics only (labels)	Tags + fields (richer but complex)
Collection	Pull	Push (line protocol)
Query Language	PromQL	Flux / InfluxQL
Use Case	Monitoring & alerting	General time series (IoT, analytics)
Ecosystem	Massive (CNCF, K8s native)	Smaller, self-contained
Clustering	External (Mimir, Thanos)	Enterprise only (proprietary)

vs Datadog & Commercial APM

Aspect	Prometheus	Datadog / New Relic / Dynatrace
Cost	Free (infrastructure costs only)	Per-host/per-metric pricing ($15-$50/host/month)
Data Residency	Your infrastructure	Vendor’s cloud (compliance concern)
Operational Burden	You manage it	Fully managed SaaS
Customization	Complete control	Limited to vendor features
Integration Depth	Deep K8s/cloud-native	Broad but shallower per-tool
Vendor Lock-in	None (OpenMetrics standard)	High (proprietary agents, query languages)

                            
When to Choose Prometheus: It fits organizations with Kubernetes infrastructure, engineering teams comfortable with open-source operations, cost-sensitive environments, or strict data residency requirements. Prometheus is the right choice when you want full control and want to avoid vendor lock-in over the long term.
                        

Conclusion & What’s Next

Prometheus didn’t emerge in a vacuum. It’s the open-source crystallization of Google’s decade-long internal monitoring experience, adapted for the cloud-native era. Its design philosophy — reliability over accuracy, simplicity over features, pull over push — makes it uniquely suited to monitoring dynamic, containerized infrastructure.

Key takeaways from this foundational part:

Prometheus descends directly from Google’s Borgmon via ex-Googlers at SoundCloud
The multi-dimensional label model replaced rigid hierarchical naming
Pull-based collection makes “target down” detection trivial
Four metric types (counter, gauge, histogram, summary) have distinct query semantics
Cardinality (unique label combinations) is the primary scaling constraint
Prometheus is one component in a larger observability ecosystem (metrics, logs, traces, profiles)

Next in the Series

In Part 2: Deploying Prometheus to Kubernetes, we’ll set up a complete Prometheus deployment using the kube-prometheus-stack Helm chart, configure service discovery for Kubernetes workloads, and build the lab environment we’ll use throughout the rest of this track.
