Prometheus Deep Dive Part 9: Systems Monitoring with the Node Exporter

June 15, 2026 Wasil Zafar 32 min read

The Node Exporter is the most widely deployed Prometheus exporter, providing hundreds of system-level metrics from Linux hosts. Master its collectors — CPU, memory, disk, network, filesystem, hardware — plus the textfile collector for custom business metrics, and build production-grade alerting rules for infrastructure monitoring.

Table of Contents

  1. Node Exporter Overview
  2. CPU Collector
  3. Memory Collector
  4. Disk & Filesystem Collectors
  5. Network Collector
  6. Textfile Collector
  7. Advanced Collectors
  8. Conclusion

Node Exporter Overview

The Prometheus Node Exporter exposes hardware and OS-level metrics from *nix kernels. It reads from /proc, /sys, and other kernel pseudo-filesystems to provide hundreds of metrics covering CPU, memory, disk, network, filesystem, and more.
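
To make the "pseudo-file in, metrics out" idea concrete, here is a minimal Python sketch of what a collector conceptually does with /proc/loadavg. The real exporter is written in Go and does considerably more; the function name render_loadavg and the sample contents are assumptions made for illustration only.

```python
def render_loadavg(proc_loadavg_text: str) -> list[str]:
    """Turn the contents of /proc/loadavg into exposition-format sample lines."""
    fields = proc_loadavg_text.split()
    # The first three fields are the 1, 5, and 15 minute load averages.
    names = ("node_load1", "node_load5", "node_load15")
    lines = []
    for name, raw in zip(names, fields[:3]):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {float(raw)}")
    return lines


# Sample contents; on a Linux host you could read open("/proc/loadavg").read() instead.
sample = "0.52 0.58 0.59 1/389 12345\n"
loadavg_lines = render_loadavg(sample)
print("\n".join(loadavg_lines))
```

Each collector follows roughly this pattern: read a kernel interface, convert units where needed, and emit named samples for Prometheus to scrape.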

Architecture & Deployment

# Kubernetes DaemonSet — runs on every node
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '9100'
    spec:
      hostPID: true
      hostNetwork: true    # Access host network metrics
      containers:
        - name: node-exporter
          image: prom/node-exporter:v1.8.1
          args:
            - '--path.rootfs=/host'
            - '--path.procfs=/host/proc'
            - '--path.sysfs=/host/sys'
            - '--collector.textfile.directory=/host/var/lib/node_exporter/textfile'
            - '--collector.filesystem.mount-points-exclude=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)'
            - '--collector.netclass.ignored-devices=^(veth.*|docker.*|br-.*)$'
            - '--collector.systemd'          # Needs the host's D-Bus socket reachable from the container
            - '--no-collector.mdadm'         # Disable unused collectors
            - '--no-collector.infiniband'
          ports:
            - containerPort: 9100
              hostPort: 9100
          volumeMounts:
            - name: rootfs
              mountPath: /host
              readOnly: true
              mountPropagation: HostToContainer
          resources:
            limits:
              cpu: 250m
              memory: 180Mi
            requests:
              cpu: 100m
              memory: 128Mi
      volumes:
        - name: rootfs
          hostPath:
            path: /
      tolerations:
        - effect: NoSchedule
          operator: Exists

Enabling/Disabling Collectors

Default Collectors (Enabled)

Collector    Metrics Prefix        Source
cpu          node_cpu_*            /proc/stat
meminfo      node_memory_*         /proc/meminfo
diskstats    node_disk_*           /proc/diskstats
filesystem   node_filesystem_*     statfs()
netdev       node_network_*        /proc/net/dev
loadavg      node_load*            /proc/loadavg
textfile     (custom)              *.prom files
uname        node_uname_info       uname()
time         node_time_*           clock_gettime()
conntrack    node_nf_conntrack*    /proc/sys/net/netfilter

CPU Collector

Key Metrics & Modes

The CPU collector exposes node_cpu_seconds_total — a counter tracking cumulative CPU time spent in each mode per CPU core:

# CPU modes exposed by node_cpu_seconds_total{mode="..."}
# user     — Time in user space (applications)
# system   — Time in kernel space (syscalls, drivers)
# idle     — Idle time (waiting for work)
# iowait   — Waiting for I/O completion
# irq      — Servicing hardware interrupts
# softirq  — Servicing software interrupts
# steal    — Time stolen by hypervisor (VMs)
# nice     — Low-priority user space processes
# Guest time (running virtual CPUs for guests) is not a mode of this metric;
# it is exposed separately as node_cpu_guest_seconds_total
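
As a rough illustration of where these per-mode counters come from, the sketch below converts one per-CPU line of /proc/stat into seconds per mode. It assumes USER_HZ is 100 (the common Linux value) and uses a made-up sample line; parse_cpu_line is a hypothetical helper, not the exporter's code.

```python
USER_HZ = 100  # assumption: typical Linux value; check with `getconf CLK_TCK`

# Field order of a per-CPU line in /proc/stat (after the "cpuN" label).
MODES = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> tuple[str, dict[str, float]]:
    """Convert one 'cpuN ...' line from /proc/stat into seconds per mode."""
    parts = line.split()
    cpu = parts[0][len("cpu"):]  # "cpu0" -> "0"
    ticks = [int(x) for x in parts[1:1 + len(MODES)]]
    return cpu, {mode: t / USER_HZ for mode, t in zip(MODES, ticks)}


# Hypothetical sample line; the trailing two fields are guest and guest_nice.
sample_line = "cpu0 4705 150 1120 16250 520 0 30 0 0 0"
cpu, seconds = parse_cpu_line(sample_line)
for mode, value in seconds.items():
    print(f'node_cpu_seconds_total{{cpu="{cpu}",mode="{mode}"}} {value}')
```

Because every value is a cumulative counter, the raw numbers are only useful once you take a rate over time.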

Essential PromQL Queries

# Overall CPU utilization (all cores, all modes except idle)
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Per-mode breakdown (useful for identifying bottleneck type)
avg by (instance, mode) (rate(node_cpu_seconds_total[5m])) * 100

# CPU saturation — load average vs CPU count
node_load1 / count without (cpu, mode) (node_cpu_seconds_total{mode="idle"})

# iowait specifically (indicates disk bottleneck)
avg by (instance) (rate(node_cpu_seconds_total{mode="iowait"}[5m])) * 100

# Steal time (VM neighbor noise / overcommit)
avg by (instance) (rate(node_cpu_seconds_total{mode="steal"}[5m])) * 100

# Number of CPUs per node
count without (cpu, mode) (node_cpu_seconds_total{mode="idle"})
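
The arithmetic behind the first query can be sketched in a few lines of Python. This is a simplification under stated assumptions: PromQL's rate() also handles counter resets and extrapolates to the window edges, which this sketch ignores, and the counter values below are invented.

```python
def rate(first: float, last: float, elapsed_seconds: float) -> float:
    """Per-second increase of a counter between two samples (no reset handling)."""
    return (last - first) / elapsed_seconds


def cpu_utilization_percent(idle_t0: dict[str, float],
                            idle_t1: dict[str, float],
                            elapsed_seconds: float) -> float:
    """100 - avg(rate(idle)) * 100, averaged over cores."""
    idle_rates = [rate(idle_t0[cpu], idle_t1[cpu], elapsed_seconds) for cpu in idle_t0]
    return 100 - (sum(idle_rates) / len(idle_rates)) * 100


# Hypothetical idle-mode counter values (seconds) for two cores, sampled 300 s apart.
t0 = {"0": 1000.0, "1": 2000.0}
t1 = {"0": 1240.0, "1": 2270.0}
utilization = cpu_utilization_percent(t0, t1, 300.0)
print(f"{utilization:.1f}%")
```

Here core 0 was idle 80% of the window and core 1 was idle 90%, so the host as a whole was about 15% busy.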

Alerting Rules

# Production alerting rules for CPU
groups:
  - name: node_cpu_alerts
    rules:
      - alert: HighCpuUsage
        expr: |
          100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[10m])) * 100) > 85
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "High CPU on {{ $labels.instance }}"
          description: "CPU usage above 85% for 15 minutes (current: {{ $value | printf \"%.1f\" }}%)"

      - alert: CpuSaturation
        expr: |
          node_load15 / count without (cpu, mode) (node_cpu_seconds_total{mode="idle"}) > 2
        for: 30m
        labels:
          severity: critical
        annotations:
          summary: "CPU saturated on {{ $labels.instance }}"
          description: "15-min load average is {{ $value | printf \"%.1f\" }}x the CPU count"

      - alert: HighStealTime
        expr: |
          avg by (instance) (rate(node_cpu_seconds_total{mode="steal"}[5m])) * 100 > 10
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High steal time on {{ $labels.instance }}"
          description: "{{ $value | printf \"%.1f\" }}% steal — noisy neighbor or overcommitted host"

Memory Collector

Key Metrics

# Memory metrics from /proc/meminfo
node_memory_MemTotal_bytes        # Total physical RAM
node_memory_MemFree_bytes         # Completely free (unused)
node_memory_MemAvailable_bytes    # Available for allocation (includes reclaimable)
node_memory_Buffers_bytes         # Disk buffer cache
node_memory_Cached_bytes          # Page cache
node_memory_SwapTotal_bytes       # Total swap space
node_memory_SwapFree_bytes        # Free swap
node_memory_Slab_bytes            # Kernel slab allocator
node_memory_SReclaimable_bytes    # Reclaimable slab memory
node_memory_CommitLimit_bytes     # Overcommit limit
node_memory_Committed_AS_bytes    # Memory committed by all processes

PromQL Patterns

# Actual memory usage (most accurate)
# Uses MemAvailable which accounts for reclaimable cache
(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100

# Memory used by applications: total minus free, buffers, page cache, and reclaimable slab
node_memory_MemTotal_bytes
  - node_memory_MemFree_bytes
  - node_memory_Buffers_bytes
  - node_memory_Cached_bytes
  - node_memory_SReclaimable_bytes

# Swap usage (any swap usage may indicate memory pressure)
(1 - node_memory_SwapFree_bytes / node_memory_SwapTotal_bytes) * 100

# OOM kills per second (requires kernel 4.13+)
rate(node_vmstat_oom_kill[5m])

# Memory pressure — major page faults (require disk I/O)
rate(node_vmstat_pgmajfault[5m])
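
To show how the MemAvailable-based percentage falls out of /proc/meminfo, here is a small Python sketch. The sample text is made up, and parse_meminfo is a hypothetical helper rather than anything shipped with the exporter.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo text into bytes, keyed by field name."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if not parts:
            continue
        amount = int(parts[0])
        # Most fields are labelled kB (really KiB); convert those to bytes.
        if len(parts) > 1 and parts[1] == "kB":
            amount *= 1024
        values[name.strip()] = amount
    return values


sample = """\
MemTotal:       16384000 kB
MemFree:         1024000 kB
MemAvailable:    4096000 kB
Buffers:          512000 kB
Cached:          3072000 kB
"""
mem = parse_meminfo(sample)
used_percent = (1 - mem["MemAvailable"] / mem["MemTotal"]) * 100
print(f"node_memory_MemTotal_bytes {mem['MemTotal']}")
print(f"memory used: {used_percent:.1f}%")
```

In this sample MemFree is only about 6% of RAM, yet a quarter of memory is still available, which is why MemAvailable is the better basis for alerts.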

Alerting Rules

# Memory alerting rules
groups:
  - name: node_memory_alerts
    rules:
      - alert: HighMemoryUsage
        expr: |
          (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High memory on {{ $labels.instance }}"
          description: "Memory usage {{ $value | printf \"%.1f\" }}% — available: {{ with printf \"node_memory_MemAvailable_bytes{instance='%s'}\" $labels.instance | query }}{{ . | first | value | humanize1024 }}{{ end }}"

      - alert: SwapUsageHigh
        expr: |
          (1 - node_memory_SwapFree_bytes / node_memory_SwapTotal_bytes) * 100 > 50
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Swap usage on {{ $labels.instance }}"
          description: "{{ $value | printf \"%.0f\" }}% swap in use — memory pressure likely"

Disk & Filesystem Collectors

Disk I/O Metrics

# Disk I/O metrics from /proc/diskstats
node_disk_reads_completed_total      # Completed read operations
node_disk_writes_completed_total     # Completed write operations
node_disk_read_bytes_total           # Bytes read
node_disk_written_bytes_total        # Bytes written
node_disk_read_time_seconds_total    # Time spent reading
node_disk_write_time_seconds_total   # Time spent writing
node_disk_io_time_seconds_total      # Time spent doing I/O (utilization)
node_disk_io_time_weighted_seconds_total  # Weighted I/O time (queue depth)

# Disk utilization (% of time doing I/O)
rate(node_disk_io_time_seconds_total{device!~"dm-.*"}[5m]) * 100

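# Average read latency in seconds per operation
# (for writes, use node_disk_write_time_seconds_total / node_disk_writes_completed_total)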
rate(node_disk_read_time_seconds_total[5m])
  / rate(node_disk_reads_completed_total[5m])

# IOPS
rate(node_disk_reads_completed_total[5m])
  + rate(node_disk_writes_completed_total[5m])

# Throughput (bytes/second)
rate(node_disk_read_bytes_total[5m]) + rate(node_disk_written_bytes_total[5m])

# Average queue depth (saturation indicator)
rate(node_disk_io_time_weighted_seconds_total[5m])
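
The disk queries above all reduce to differences between two counter samples divided by elapsed time. The following Python sketch works through that arithmetic for a single device; the DiskSample type and the numbers are assumptions for illustration, and counter resets are not handled.

```python
from dataclasses import dataclass


@dataclass
class DiskSample:
    """Subset of node_disk_* counters for one device at one instant."""
    reads_completed: float
    read_time_seconds: float
    io_time_seconds: float
    io_time_weighted_seconds: float


def disk_stats(a: DiskSample, b: DiskSample, elapsed: float) -> dict[str, float]:
    """Derive IOPS, latency, utilization, and queue depth from two samples."""
    reads = b.reads_completed - a.reads_completed
    read_time = b.read_time_seconds - a.read_time_seconds
    return {
        "read_iops": reads / elapsed,
        # Guard against division by zero when no reads completed in the window.
        "avg_read_latency_seconds": read_time / reads if reads else float("nan"),
        "utilization_percent": (b.io_time_seconds - a.io_time_seconds) / elapsed * 100,
        "avg_queue_depth": (b.io_time_weighted_seconds - a.io_time_weighted_seconds) / elapsed,
    }


# Hypothetical samples taken 60 s apart.
before = DiskSample(10_000, 50.0, 400.0, 900.0)
after = DiskSample(16_000, 62.0, 430.0, 990.0)
stats = disk_stats(before, after, 60.0)
print(stats)
```

With these numbers the device served 100 reads per second at 2 ms each, was busy half the time, and had 1.5 requests in flight on average.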

Filesystem Metrics

# Filesystem metrics from statfs()
node_filesystem_size_bytes          # Total filesystem size
node_filesystem_avail_bytes         # Available space (non-root)
node_filesystem_free_bytes          # Free space (includes root reserved)
node_filesystem_files               # Total inodes
node_filesystem_files_free          # Free inodes
node_filesystem_readonly            # Read-only flag

# Filesystem usage percentage
(1 - node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100

# Predict when filesystem will be full (linear extrapolation)
predict_linear(node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"}[6h], 24*3600) < 0

# Inode usage (often overlooked until 100%)
(1 - node_filesystem_files_free / node_filesystem_files) * 100
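
For intuition about predict_linear, the Python sketch below fits a least-squares line through a set of samples and evaluates it at a future time. It is an approximation of the idea, not Prometheus's implementation, and the sample series is invented.

```python
def predict_linear(samples: list[tuple[float, float]],
                   eval_time: float, horizon: float) -> float:
    """Least-squares line through (timestamp, value) samples, evaluated at eval_time + horizon."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    slope = cov / var
    intercept = mean_v - slope * mean_t
    return intercept + slope * (eval_time + horizon)


# Hypothetical node_filesystem_avail_bytes samples: one per hour over 6 h,
# shrinking by 2 GiB per hour from 20 GiB.
GIB = 1024 ** 3
samples = [(h * 3600.0, (20 - 2 * h) * GIB) for h in range(7)]
predicted = predict_linear(samples, eval_time=6 * 3600.0, horizon=24 * 3600.0)
print(f"predicted avail in 24h: {predicted / GIB:.1f} GiB")
```

The filesystem has 8 GiB left and is losing 2 GiB per hour, so the extrapolated value 24 hours out is about -40 GiB and the `< 0` condition holds. A bursty write pattern can make a 6-hour window misleading, which is one reason to pair the prediction with a threshold.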

Alerting Rules

# Disk & filesystem alerting
groups:
  - name: node_disk_alerts
    rules:
      - alert: DiskWillFillIn24h
        expr: |
          predict_linear(node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"}[6h], 24*3600) < 0
          and node_filesystem_avail_bytes / node_filesystem_size_bytes < 0.2
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Disk filling on {{ $labels.instance }}:{{ $labels.mountpoint }}"
          description: "Filesystem {{ $labels.mountpoint }} predicted to fill within 24 hours"

      - alert: DiskSpaceCritical
        expr: |
          (1 - node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes) > 0.95
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Disk 95%+ full on {{ $labels.instance }}:{{ $labels.mountpoint }}"

      - alert: InodeExhaustion
        expr: |
          (1 - node_filesystem_files_free / node_filesystem_files) > 0.90
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Inode exhaustion on {{ $labels.instance }}:{{ $labels.mountpoint }}"

Network Collector

Network Device Metrics

# Network interface metrics from /proc/net/dev
node_network_receive_bytes_total        # Bytes received
node_network_transmit_bytes_total       # Bytes transmitted
node_network_receive_packets_total      # Packets received
node_network_transmit_packets_total     # Packets transmitted
node_network_receive_errs_total         # Receive errors
node_network_transmit_errs_total        # Transmit errors
node_network_receive_drop_total         # Dropped incoming
node_network_transmit_drop_total        # Dropped outgoing

# Throughput in bits/sec (for utilization, divide the byte rate by node_network_speed_bytes)
rate(node_network_receive_bytes_total{device!~"lo|veth.*|docker.*"}[5m]) * 8
rate(node_network_transmit_bytes_total{device!~"lo|veth.*|docker.*"}[5m]) * 8

# Packet error rate
rate(node_network_receive_errs_total[5m])
  / rate(node_network_receive_packets_total[5m]) * 100

# Network interface speed and state (from the netclass collector, /sys/class/net)
node_network_speed_bytes    # Negotiated link speed in bytes/sec
node_network_up             # Interface operational state (1=up)
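
The sketch below shows, in Python, how the throughput and error-rate queries map onto /proc/net/dev. The column positions follow the standard Linux layout; the helper name parse_net_dev and the two sample snapshots are assumptions for illustration.

```python
def parse_net_dev(text: str) -> dict[str, dict[str, int]]:
    """Parse /proc/net/dev into per-device receive/transmit counters."""
    devices = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        name, _, rest = line.partition(":")
        if not rest.strip():
            continue
        f = [int(x) for x in rest.split()]
        devices[name.strip()] = {
            "receive_bytes": f[0], "receive_packets": f[1], "receive_errs": f[2],
            "transmit_bytes": f[8], "transmit_packets": f[9], "transmit_errs": f[10],
        }
    return devices


HEADER = (
    "Inter-|   Receive                            |  Transmit\n"
    " face |bytes packets errs drop fifo frame compressed multicast|"
    "bytes packets errs drop fifo colls carrier compressed\n"
)
# Hypothetical snapshots of the same interface taken 10 s apart.
t0 = parse_net_dev(HEADER + "  eth0: 1000000 2000 0 0 0 0 0 0 500000 1000 0 0 0 0 0 0\n")
t1 = parse_net_dev(HEADER + "  eth0: 3500000 7000 5 0 0 0 0 0 900000 1800 0 0 0 0 0 0\n")

elapsed = 10.0
delta = {k: t1["eth0"][k] - t0["eth0"][k] for k in t0["eth0"]}
rx_bits_per_sec = delta["receive_bytes"] / elapsed * 8
rx_error_percent = delta["receive_errs"] / delta["receive_packets"] * 100
print(f"rx: {rx_bits_per_sec:.0f} bit/s, errors: {rx_error_percent:.2f}%")
```

In this example the interface received 2 Mbit/s with 0.1% of packets in error. Note that the error-rate division is undefined on an idle interface, which in PromQL shows up as NaN.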

Conntrack & Sockets

# Connection tracking (critical for firewalls/load balancers)
node_nf_conntrack_entries            # Current tracked connections
node_nf_conntrack_entries_limit      # Maximum connections allowed

# Conntrack utilization (approaching 100% = dropped connections)
node_nf_conntrack_entries / node_nf_conntrack_entries_limit * 100

# TCP socket state (from /proc/net/sockstat)
node_sockstat_TCP_tw         # TIME_WAIT sockets
node_sockstat_TCP_alloc      # Allocated sockets
node_sockstat_sockets_used   # Total sockets in use

Textfile Collector

The textfile collector reads .prom files from a configured directory, exposing their contents as Prometheus metrics. This is the primary mechanism for exposing custom metrics from cron jobs, scripts, or applications that can’t serve an HTTP endpoint.

Setup & Configuration

# Enable with directory flag
--collector.textfile.directory=/var/lib/node_exporter/textfile

# Create the directory
mkdir -p /var/lib/node_exporter/textfile

# Write metrics in Prometheus exposition format
# File MUST have .prom extension
# Note: a label named "job" would collide with the scrape target's job label
# and be renamed to exported_job by default, so this example uses backup_job
cat > /var/lib/node_exporter/textfile/backup_status.prom << 'EOF'
# HELP backup_last_success_timestamp_seconds Unix timestamp of last successful backup
# TYPE backup_last_success_timestamp_seconds gauge
backup_last_success_timestamp_seconds{backup_job="database",target="postgres-main"} 1718452800
# HELP backup_size_bytes Size of last backup in bytes
# TYPE backup_size_bytes gauge
backup_size_bytes{backup_job="database",target="postgres-main"} 5368709120
# HELP backup_duration_seconds Duration of last backup
# TYPE backup_duration_seconds gauge
backup_duration_seconds{backup_job="database",target="postgres-main"} 342.5
EOF

Common Patterns

#!/bin/bash
# Script run from cron every 5 minutes, e.g. via an entry in
# /etc/cron.d/node-exporter-textfile
# Each file is written under a temp name and renamed, so the exporter never reads a partial file

TEXTFILE_DIR=/var/lib/node_exporter/textfile

# Read metrics on stdin and write them atomically to ${TEXTFILE_DIR}/<name>.prom
write_prom() {
  local tmp
  tmp=$(mktemp "${TEXTFILE_DIR}/.$1.XXXXXX") || return 1
  cat > "${tmp}"
  chmod 644 "${tmp}"   # mktemp creates the file with mode 0600
  mv "${tmp}" "${TEXTFILE_DIR}/$1.prom"
}

# SSL certificate expiry
CERT_EXPIRY=$(echo | openssl s_client -connect myapp.example.com:443 2>/dev/null | \
  openssl x509 -noout -enddate | cut -d= -f2)
CERT_EPOCH=$(date -d "${CERT_EXPIRY}" +%s)

write_prom ssl_expiry << EOF
# HELP ssl_certificate_expiry_seconds Unix timestamp when cert expires
# TYPE ssl_certificate_expiry_seconds gauge
ssl_certificate_expiry_seconds{domain="myapp.example.com"} ${CERT_EPOCH}
EOF

# Package update count
UPDATES=$(apt list --upgradable 2>/dev/null | grep -c upgradable)
write_prom apt_updates << EOF
# HELP node_apt_upgradable_packages Number of packages with available updates
# TYPE node_apt_upgradable_packages gauge
node_apt_upgradable_packages ${UPDATES}
EOF

# Custom application health (from script/API call)
HTTP_CODE=$(curl -s -o /dev/null -w '%{http_code}' http://localhost:8080/health)
write_prom app_health << EOF
# HELP app_health_check_status HTTP status code from health endpoint
# TYPE app_health_check_status gauge
app_health_check_status{app="my-service"} ${HTTP_CODE}
EOF
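
Textfile Gotchas: Always write to a temp file and mv it into place atomically so the exporter never reads a partial file. Do not append explicit timestamps to sample lines; the textfile collector does not support them, and Prometheus assigns the scrape time. (A gauge whose value is a Unix timestamp, such as backup_last_success_timestamp_seconds, is fine.) Values persist until the file is rewritten, so a broken cron job leaves stale data behind: alert on the age of node_textfile_mtime_seconds, which records each file's last modification time, rather than relying on the metric value alone.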
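
For jobs written in Python rather than shell, the same write-then-rename pattern might look like the sketch below. The function name write_textfile_metrics and the demo directory are assumptions; in production the directory would be whatever --collector.textfile.directory points at.

```python
import os
import tempfile


def write_textfile_metrics(directory: str, name: str, body: str) -> str:
    """Write body to <directory>/<name>.prom via a temp file and an atomic rename."""
    final_path = os.path.join(directory, f"{name}.prom")
    # Create the temp file in the same directory so os.replace stays on one filesystem.
    # It has no .prom extension, so the textfile collector ignores it.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=f".{name}.", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(body)
        os.chmod(tmp_path, 0o644)  # mkstemp creates 0600; the exporter may run as another user
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path


# Demo against a throwaway directory (a stand-in for the real textfile directory).
demo_dir = tempfile.mkdtemp()
demo_body = (
    "# HELP backup_duration_seconds Duration of last backup\n"
    "# TYPE backup_duration_seconds gauge\n"
    "backup_duration_seconds 342.5\n"
)
written = write_textfile_metrics(demo_dir, "backup_status", demo_body)
print(written)
```

The rename is atomic only when the temp file and the target live on the same filesystem, which is why the temp file is created inside the textfile directory itself.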

Advanced Collectors

systemd Collector

# Enable systemd collector
--collector.systemd
--collector.systemd.unit-include="(nginx|postgresql|redis|docker)\.service"

# Metrics exposed:
node_systemd_unit_state{name="nginx.service", state="active"}    # 1 if in this state
node_systemd_unit_state{name="nginx.service", state="failed"}    # 1 if failed
node_systemd_timer_last_trigger_seconds                          # Last timer trigger time

# Alert on service failure
node_systemd_unit_state{state="failed"} == 1

Hardware (IPMI, hwmon, thermal)

# Hardware temperature monitoring
node_hwmon_temp_celsius                          # Temperature sensors
node_thermal_zone_temp                           # CPU thermal zones
node_cooling_device_cur_state                    # Cooling device state

# Power supply (laptop/UPS)
node_power_supply_energy_watthour
node_power_supply_online

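# IPMI: Node Exporter has no built-in IPMI collector (there is no --collector.ipmi flag)
# Metrics like these typically come from a textfile-collector script wrapping ipmitool
# (usually needs root), or from the separate ipmi_exporter, which uses its own metric names
node_ipmi_temperature_celsius{name="CPU1 Temp"}
node_ipmi_fan_speed_rpm{name="FAN1"}
node_ipmi_power_watts{name="System Board"}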

Conclusion

Node Exporter Best Practices:
  • Deploy as DaemonSet with hostNetwork: true and hostPID: true for complete visibility
  • Filter filesystem mounts — exclude tmpfs, overlay, and container-internal mounts
  • Use textfile collector for custom metrics (backup status, cert expiry, package updates)
  • Enable systemd collector to monitor critical services
  • Disable unused collectors to reduce scrape time and cardinality
  • Use recording rules for common dashboard queries (CPU %, memory %, disk predictions)
  • Alert on predictions (predict_linear) not just thresholds for disk and memory