What is FinOps?
FinOps (a portmanteau of "Finance" and "DevOps") is the practice of bringing financial accountability to cloud spending. It combines people, processes, and tools so that organisations can make informed trade-offs between speed, cost, and quality. FinOps doesn't mean spending less; it means getting maximum value from every dollar spent on cloud infrastructure.
Think of FinOps like fuel efficiency in a fleet of vehicles. The goal isn't to drive less — it's to ensure every trip delivers value, routes are optimised, engines are tuned, and vehicles are right-sized for their cargo. Some trucks need to be large; the waste is running a 40-tonne truck for a 500kg delivery.
The FinOps Framework
flowchart LR
    Inform["Inform<br/>Visibility & Allocation"] --> Optimize["Optimise<br/>Right-Size & Rates"]
    Optimize --> Operate["Operate<br/>Governance & Automation"]
    Operate -->|"Continuous"| Inform
    style Inform fill:#e8f4f4,stroke:#3B9797,color:#132440
    style Optimize fill:#f0f4f8,stroke:#16476A,color:#132440
    style Operate fill:#f0f4f8,stroke:#16476A,color:#132440
| Phase | Activities | Key Metrics |
|---|---|---|
| Inform | Tag resources, allocate costs to teams, build dashboards | Cost per team, cost per service, untagged spend % |
| Optimise | Right-size instances, purchase reservations, eliminate waste | CPU/memory utilisation, savings vs on-demand, idle resource cost |
| Operate | Budget alerts, automated scaling, policy enforcement | Budget variance, anomaly detection rate, cost per deployment |
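Most of the key metrics in the table are simple ratios over cost records. The following is a minimal sketch, assuming cost records are plain dicts with hypothetical `team` and `cost` fields (this is not the schema of any particular billing export), and the sample figures are invented for illustration.

```python
from collections import defaultdict


def cost_per_team(records):
    """Sum cost by team; records without a team are grouped as 'untagged'."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec.get("team") or "untagged"] += rec["cost"]
    return dict(totals)


def untagged_spend_pct(records):
    """Share of total spend that carries no team tag, as a percentage."""
    total = sum(rec["cost"] for rec in records)
    if total == 0:
        return 0.0
    untagged = sum(rec["cost"] for rec in records if not rec.get("team"))
    return 100.0 * untagged / total


def budget_variance_pct(actual, budget):
    """Positive means over budget, negative means under budget."""
    return 100.0 * (actual - budget) / budget


# Hypothetical sample data, for illustration only
records = [
    {"team": "payments", "cost": 300.0},
    {"team": "frontend", "cost": 150.0},
    {"team": None, "cost": 50.0},
]

print(cost_per_team(records))
print(f"Untagged spend: {untagged_spend_pct(records):.1f}%")
print(f"Budget variance: {budget_variance_pct(500.0, 400.0):+.1f}%")
```

Tracking untagged spend as a percentage rather than an absolute figure keeps the metric comparable as the overall bill grows.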
Cost Visibility & Allocation
Tagging Strategy
Tags are the foundation of cost allocation (in Kubernetes, they take the form of labels). Without consistent tagging, cloud bills are unattributable: just a large number that nobody owns. A robust tagging strategy requires mandatory tags enforced by policy.
# Kyverno policy: enforce cost allocation labels on workload resources
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-cost-labels
  annotations:
    policies.kyverno.io/title: Require Cost Allocation Labels
    policies.kyverno.io/severity: high
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: require-cost-labels
      match:
        any:
          - resources:
              kinds: ["Deployment", "StatefulSet", "Job", "CronJob"]
      validate:
        message: >-
          Cost allocation labels are required: 'team', 'cost-center',
          'environment', and 'service'. Add these labels to metadata.labels.
        pattern:
          metadata:
            labels:
              team: "?*"
              cost-center: "?*"
              environment: "?*"
              service: "?*"
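The same label requirement can also be checked earlier, for example in CI, before a manifest ever reaches the cluster. The sketch below assumes manifests have already been parsed into Python dicts; `missing_cost_labels` and the sample manifest are hypothetical and are not part of Kyverno.

```python
REQUIRED_LABELS = ("team", "cost-center", "environment", "service")


def missing_cost_labels(manifest, required=REQUIRED_LABELS):
    """Return the required label keys that are absent or empty on a manifest."""
    labels = (manifest.get("metadata") or {}).get("labels") or {}
    return [key for key in required if not labels.get(key)]


# Hypothetical Deployment manifest with two of the four labels set
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {
        "name": "payment-service",
        "labels": {"team": "payments", "environment": "production"},
    },
}

missing = missing_cost_labels(deployment)
if missing:
    print(f"Missing cost labels: {', '.join(missing)}")
```

A pre-merge check like this complements the admission policy rather than replacing it: the policy remains the enforcement point, while the check gives developers faster feedback.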
Kubernetes Cost Attribution
Kubernetes makes cost attribution challenging because multiple workloads share node resources. A single EC2 instance might run pods from five different teams. Kubecost and OpenCost address this by tracking the CPU, memory, and storage that each pod requests and uses, then attributing node costs proportionally.
# Install Kubecost via Helm
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm repo update
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost \
  --create-namespace \
  --set kubecostToken="YOUR_TOKEN" \
  --set prometheus.server.retention=30d

# Verify installation
kubectl get pods -n kubecost
# NAME                              READY   STATUS    RESTARTS   AGE
# kubecost-cost-analyzer-xxx        1/1     Running   0          2m
# kubecost-prometheus-server-xxx    1/1     Running   0          2m

# Access the dashboard (backgrounded so the shell stays usable)
kubectl port-forward -n kubecost svc/kubecost-cost-analyzer 9090:9090 &
echo "Kubecost dashboard at http://localhost:9090"

# Query Kubecost API for team-level cost allocation
# accumulate=true returns one summed allocation set for the whole 7-day window
curl -s "http://localhost:9090/model/allocation?window=7d&aggregate=label:team&accumulate=true" | \
  python3 -c "
import json, sys
data = json.load(sys.stdin)
for alloc in data.get('data', [{}])[0].values():
    name = alloc.get('name', 'unallocated')
    total = alloc.get('totalCost', 0)
    cpu = alloc.get('cpuCost', 0)
    mem = alloc.get('ramCost', 0)
    print(f'{name:20s} Total: \${total:8.2f} CPU: \${cpu:6.2f} Memory: \${mem:6.2f}')
"
# team-payments        Total: $  342.50 CPU: $187.20 Memory: $155.30
# team-frontend        Total: $  218.75 CPU: $120.40 Memory: $ 98.35
# team-data            Total: $  567.80 CPU: $312.50 Memory: $255.30
# __unallocated__      Total: $   89.20 CPU: $ 45.10 Memory: $ 44.10
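To make "attributing node costs proportionally" concrete, here is a deliberately simplified sketch. It is not the algorithm Kubecost or OpenCost actually use. It assumes a fixed split of node cost between CPU and memory (`cpu_weight`, an invented parameter) and attributes each part by a pod's share of node capacity. The node price and pod sizes are hypothetical.

```python
def attribute_node_cost(node_hourly_cost, node_cpu, node_mem_gib, pods, cpu_weight=0.5):
    """Split one node's hourly cost across pods by resource share.

    pods maps pod name -> (cpu_cores, mem_gib). cpu_weight is the assumed
    fraction of node cost attributed to CPU; the remainder goes to memory.
    Capacity that no pod claims is reported under '__idle__'.
    """
    cpu_cost = node_hourly_cost * cpu_weight
    mem_cost = node_hourly_cost - cpu_cost
    shares = {}
    for name, (cpu, mem) in pods.items():
        shares[name] = cpu_cost * cpu / node_cpu + mem_cost * mem / node_mem_gib
    shares["__idle__"] = node_hourly_cost - sum(shares.values())
    return shares


# Hypothetical node: $0.40/hour, 8 cores, 32 GiB, shared by three teams' pods
pods = {
    "payments": (2, 8),
    "frontend": (1, 4),
    "data": (4, 16),
}
costs = attribute_node_cost(0.40, node_cpu=8, node_mem_gib=32, pods=pods)
for name, cost in costs.items():
    print(f"{name:10s} ${cost:.3f}/hour")
```

Reporting idle capacity as its own line, rather than silently spreading it across teams, keeps the cost of over-provisioned nodes visible.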
Resource Right-Sizing
Right-sizing is the single highest-impact FinOps optimisation. Most Kubernetes workloads request 2-5× more CPU and memory than they actually use — because developers set requests based on peak load estimates rather than actual consumption data.
Lyft's Right-Sizing Programme
Lyft discovered that their Kubernetes clusters averaged 12% CPU utilisation — meaning 88% of compute capacity was paid for but unused. By deploying Vertical Pod Autoscaler (VPA) in recommendation mode, they identified that 73% of workloads had requests set 3× higher than p99 usage. A phased right-sizing campaign (automated recommendations → team review → gradual adjustment) reduced their compute bill by $4.2M annually while maintaining the same performance SLOs.
Vertical Pod Autoscaler (VPA)
# VPA in recommendation mode: analyse first, resize later
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: payment-service-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payment-service
  updatePolicy:
    updateMode: "Off" # Recommendation only, no automatic resizing
  resourcePolicy:
    containerPolicies:
      - containerName: payment-service
        minAllowed:
          cpu: 50m
          memory: 64Mi
        maxAllowed:
          cpu: 2000m
          memory: 4Gi
# View VPA recommendations
kubectl describe vpa payment-service-vpa -n production
# Status:
#   Recommendation:
#     Container Recommendations:
#       Container Name:   payment-service
#       Lower Bound:      Cpu: 85m, Memory: 128Mi
#       Target:           Cpu: 150m, Memory: 256Mi   <- Use this
#       Upper Bound:      Cpu: 400m, Memory: 512Mi
#       Uncapped Target:  Cpu: 150m, Memory: 256Mi
#
# Current requests:  Cpu: 500m, Memory: 1Gi
# Recommendation:    Cpu: 150m, Memory: 256Mi
# Potential savings: 70% CPU, 75% memory
echo "Right-sizing analysis complete"
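The "potential savings" figures in the output above follow directly from the current requests and the VPA target. The sketch below reproduces that arithmetic. The parsers are hypothetical helpers that handle only millicore/whole-core CPU values and binary memory suffixes (`Ki`, `Mi`, `Gi`, `Ti`); they do not cover the full Kubernetes quantity syntax.

```python
def parse_cpu(value):
    """Convert a Kubernetes CPU quantity ('500m' or '2') to millicores."""
    if value.endswith("m"):
        return float(value[:-1])
    return float(value) * 1000


_MEM_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_memory(value):
    """Convert a binary-suffixed memory quantity ('256Mi', '1Gi') to bytes."""
    for suffix, factor in _MEM_UNITS.items():
        if value.endswith(suffix):
            return float(value[: -len(suffix)]) * factor
    return float(value)  # no suffix: assume plain bytes


def reduction_pct(current, target):
    """Percentage by which the request shrinks when moved to the target."""
    return 100.0 * (current - target) / current


# Values taken from the VPA output above
cpu_saving = reduction_pct(parse_cpu("500m"), parse_cpu("150m"))
mem_saving = reduction_pct(parse_memory("1Gi"), parse_memory("256Mi"))
print(f"CPU: {cpu_saving:.0f}%  memory: {mem_saving:.0f}%")
```

Note that a reduction in requests only becomes a reduction in spend once nodes are consolidated or removed; until then the freed capacity is simply idle.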
Rate Optimisation
Reserved Instances & Savings Plans
For workloads with predictable, steady-state resource consumption (databases, control planes, core services), reserved capacity offers 30-72% savings over on-demand pricing. The key is matching commitment levels to actual baseline usage — not peak capacity.
| Strategy | Commitment | Discount | Flexibility | Best For |
|---|---|---|---|---|
| On-Demand | None | 0% | Maximum | Unpredictable workloads |
| 1-Year Reserved | 1 year | 30-40% | Low | Stable, predictable services |
| 3-Year Reserved | 3 years | 55-72% | Very low | Databases, infrastructure |
| Savings Plans | $/hr spend | 20-66% | Medium | Flexible compute commitment |
| Spot/Preemptible | None | 60-90% | Interruptible | Batch, CI, fault-tolerant |
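The advice to commit to baseline rather than peak usage can be illustrated with a toy model. The sketch below assumes a single flat discount on committed capacity, on-demand pricing for anything above the commitment, and that committed capacity is paid for whether or not it is used. The usage series, rate, and 40% discount are hypothetical.

```python
def hourly_cost(usage, committed, on_demand_rate, discount):
    """Cost of one hour: committed units are always paid at the discounted
    rate, and usage above the commitment is paid at the on-demand rate."""
    committed_cost = committed * on_demand_rate * (1 - discount)
    overflow = max(0, usage - committed)
    return committed_cost + overflow * on_demand_rate


def total_cost(usage_series, committed, on_demand_rate=1.0, discount=0.4):
    """Total cost of a usage series at a given commitment level."""
    return sum(hourly_cost(u, committed, on_demand_rate, discount) for u in usage_series)


def best_commitment(usage_series, on_demand_rate=1.0, discount=0.4):
    """Try every whole-unit commitment from 0 to peak and return the cheapest."""
    candidates = range(0, max(usage_series) + 1)
    return min(candidates, key=lambda c: total_cost(usage_series, c, on_demand_rate, discount))


# Hypothetical instance counts per hour: a baseline of 10 with short peaks
usage = [10, 10, 10, 10, 12, 20, 30, 12]

best = best_commitment(usage)
print(f"Cheapest commitment: {best} instances")
for level in (0, best, max(usage)):
    print(f"commit {level:2d}: cost {total_cost(usage, level):.1f}")
```

With these numbers, committing to the baseline of 10 instances costs 82.0 units, compared with 114.0 fully on-demand and 144.0 when committing to the peak of 30, which is why over-committing can cost more than not committing at all.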
Spot Instances for Kubernetes
# Karpenter NodePool: mix of on-demand and spot capacity
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["5"]
      nodeClassRef:
        name: default
  limits:
    cpu: "1000"
    memory: 4000Gi
  disruption:
    consolidationPolicy: WhenUnderutilized
    # consolidateAfter is omitted: in v1beta1 it is only valid with WhenEmpty
Showback & Chargeback Models
Showback presents cost data to teams without financial consequences, creating awareness. Chargeback goes further and charges each team's budget for the resources it consumed. Most organisations start with showback and graduate to chargeback as cost visibility matures.
Kubecost-Driven Showback Reports
# Generate weekly cost report per team using Kubecost API
curl -s "http://kubecost:9090/model/allocation?window=7d&aggregate=label:team&accumulate=true" | \
  python3 -c "
import json, sys
data = json.load(sys.stdin)
print('Weekly Kubernetes Cost Report')
print('=' * 60)
total = 0
for name, alloc in sorted(data.get('data', [{}])[0].items()):
    cost = alloc.get('totalCost', 0)
    eff = alloc.get('totalEfficiency', 0) * 100
    total += cost
    bar = '#' * int(eff / 5)
    print(f'{name:20s} \${cost:8.2f} Efficiency: {eff:5.1f}% {bar}')
print('=' * 60)
print(f'{\"TOTAL\":20s} \${total:8.2f}')
"
# Weekly Kubernetes Cost Report
# ============================================================
# __unallocated__      $   89.20 Efficiency:   0.0%
# team-data            $  567.80 Efficiency:  72.3% ##############
# team-frontend        $  218.75 Efficiency:  45.2% #########
# team-payments        $  342.50 Efficiency:  68.1% #############
# ============================================================
# TOTAL                $ 1218.25
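Moving from showback to chargeback forces a decision about shared costs such as the `__unallocated__` line in the report. One common convention, shown here only as a sketch, is to spread the shared bucket across teams in proportion to their direct spend. Other organisations split it evenly or leave it on a platform budget. `distribute_shared_cost` is a hypothetical helper, and the figures are the sample values from the report above.

```python
def distribute_shared_cost(costs, shared_key="__unallocated__"):
    """Spread the shared bucket across teams in proportion to direct cost."""
    shared = costs.get(shared_key, 0.0)
    direct = {k: v for k, v in costs.items() if k != shared_key}
    direct_total = sum(direct.values())
    if direct_total == 0:
        return dict(direct)  # nothing to weight by; leave teams unchanged
    return {k: v + shared * v / direct_total for k, v in direct.items()}


# Sample figures from the weekly report
weekly = {
    "team-data": 567.80,
    "team-frontend": 218.75,
    "team-payments": 342.50,
    "__unallocated__": 89.20,
}

charged = distribute_shared_cost(weekly)
for team, cost in sorted(charged.items()):
    print(f"{team:20s} ${cost:8.2f}")
print(f"{'TOTAL':20s} ${sum(charged.values()):8.2f}")
```

Whatever rule is chosen, the charged amounts should still sum to the full bill so that nothing is left without an owner.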
Cost Automation & Governance
Automated Idle Resource Cleanup
# CronJob to identify and report idle Kubernetes resources
# Assumes the pod's ServiceAccount may list deployments, PVCs, and pods cluster-wide
apiVersion: batch/v1
kind: CronJob
metadata:
  name: idle-resource-reporter
  namespace: finops
spec:
  schedule: "0 8 * * 1" # Every Monday at 8am
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: reporter
              image: bitnami/kubectl:latest
              command:
                - /bin/sh
                - -c
                - |
                  echo "=== Idle Resource Report ==="
                  echo ""
                  echo "Deployments scaled to 0 replicas:"
                  kubectl get deployments -A -o json | \
                    jq -r '.items[] | select(.spec.replicas == 0) |
                      "\(.metadata.namespace)/\(.metadata.name)"'
                  echo ""
                  echo "PVCs not mounted to any pod:"
                  kubectl get pvc -A -o json | \
                    jq -r '.items[] | select(.status.phase == "Bound") |
                      "\(.metadata.namespace)/\(.metadata.name)"' | \
                    while read -r pvc; do
                      ns=$(echo "$pvc" | cut -d/ -f1)
                      name=$(echo "$pvc" | cut -d/ -f2)
                      mounted=$(kubectl get pods -n "$ns" -o json | \
                        jq -r ".items[].spec.volumes[]? | select(.persistentVolumeClaim.claimName == \"$name\")" | wc -l)
                      # if/fi keeps the loop's exit status at 0 so the Job is not marked failed
                      if [ "$mounted" -eq 0 ]; then echo "  UNUSED: $pvc"; fi
                    done
          restartPolicy: Never
Conclusion & Next Steps
FinOps is not a one-time exercise — it's a continuous practice of visibility, optimisation, and governance. The organisations that succeed at FinOps make cost a first-class engineering metric alongside latency, availability, and throughput.
- Start with visibility — Tag everything, deploy Kubecost, and build dashboards that every team sees weekly.
- Right-size first — VPA recommendations are the lowest-hanging fruit. Most workloads are 2-5× over-provisioned.
- Commit strategically — Reserve baseline capacity with 1-year plans. Use spot for burst and batch workloads.
- Automate governance — Budget alerts, idle resource cleanup, and cost policies prevent drift.
- Measure efficiency, not just cost — Cost per request, cost per transaction, and utilisation percentages matter more than total spend.
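The last point is easiest to see with numbers. The sketch below computes cost per million requests for two hypothetical months. The figures are invented for illustration and `unit_cost` is not tied to any particular tool.

```python
def unit_cost(total_cost, units):
    """Cost per unit of work (request, transaction, deployment)."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total_cost / units


# Hypothetical monthly figures, for illustration only
months = [
    {"month": "Jan", "cost": 10_000.0, "requests": 50_000_000},
    {"month": "Feb", "cost": 12_000.0, "requests": 80_000_000},
]

for m in months:
    per_million = unit_cost(m["cost"], m["requests"] / 1_000_000)
    print(f"{m['month']}: total ${m['cost']:,.0f}, ${per_million:.2f} per million requests")
```

In this example total spend rises by 20% while cost per million requests falls from $200 to $150, so a dashboard showing only total spend would report a regression where efficiency actually improved.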
Next in the Series
In Part 15: AIOps & Intelligent Automation, we'll explore ML-driven operations — anomaly detection, predictive alerting, automated incident response, chaos engineering, and intelligent runbook automation.