
Managed Kubernetes Services — EKS, AKS & GKE

May 15, 2026 Wasil Zafar 22 min read

Managed Kubernetes applies control/data plane separation as a service: the cloud vendor owns the control plane entirely, and you own the data plane where your workloads run. Understanding this split is the key to choosing the right managed offering.

Table of Contents

  1. The Managed K8s Responsibility Split
  2. Amazon EKS Architecture
  3. Azure AKS Architecture
  4. Google GKE Architecture
  5. Three-Way Comparison
  6. Shared Responsibility Model
  7. Cost Implications
  8. Control Plane Access Limitations
  9. Multi-Cluster Management
  10. When to Use Managed vs Self-Managed

The Managed K8s Responsibility Split

Managed Kubernetes services take the control/data plane separation pattern and draw a clear operational boundary: the cloud provider owns, operates, and maintains the entire control plane, while you manage the data plane (worker nodes and workloads). This is the reason managed Kubernetes can exist as a service — the planes are architecturally independent enough to be operated by different teams or organizations.

Core Insight: Without the clean separation between control plane (API server, etcd, scheduler, controller-manager) and data plane (kubelet, kube-proxy, container runtime, your pods), it would be impossible to offer "Kubernetes as a Service." The separation is what makes managed services possible.
Managed vs Self-Managed Kubernetes — Responsibility Split
flowchart TB
    subgraph SM["Self-Managed (You Own Everything)"]
        direction TB
        SM_CP["Control Plane\nAPI Server, etcd, Scheduler\nController Manager"]
        SM_DP["Data Plane\nkubelet, kube-proxy\nContainer Runtime, Pods"]
        SM_CP --> SM_DP
    end
    subgraph MK["Managed K8s (Split Ownership)"]
        direction TB
        subgraph VENDOR["Cloud Vendor Manages"]
            MK_CP["Control Plane\nAPI Server, etcd, Scheduler\nController Manager"]
        end
        subgraph YOU["You Manage"]
            MK_DP["Data Plane\nWorker Nodes, Pods\nNetworking, Storage"]
        end
        MK_CP --> MK_DP
    end
                            

What the cloud vendor handles in the control plane:

  • etcd management — backups, encryption at rest, high availability, version upgrades
  • API server — scaling, TLS termination, authentication integration, audit logging
  • Scheduler & controllers — patching, upgrades, monitoring, restart on failure
  • Control plane HA — multi-AZ deployment, automatic failover, SLA guarantees
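
The ownership boundary described above can be written down as a small lookup table. This is a minimal Python sketch, not a provider API: the names `CONTROL_PLANE`, `DATA_PLANE`, and `owner_of` are illustrative, and the component lists are the ones named in this section.

```python
# Components named in this section, grouped by plane.
CONTROL_PLANE = {"kube-apiserver", "etcd", "kube-scheduler", "kube-controller-manager"}
DATA_PLANE = {"kubelet", "kube-proxy", "container-runtime", "pods"}


def owner_of(component: str, managed: bool) -> str:
    """Return who operates a component: 'vendor' or 'you'."""
    if component in CONTROL_PLANE:
        # Only the control plane changes hands when you move to a managed service.
        return "vendor" if managed else "you"
    if component in DATA_PLANE:
        return "you"
    raise ValueError(f"unknown component: {component}")


if __name__ == "__main__":
    for name in sorted(CONTROL_PLANE | DATA_PLANE):
        print(f"{name:26} managed={owner_of(name, True):7} self-managed={owner_of(name, False)}")
```

The sketch deliberately ignores serverless data planes such as Fargate or Autopilot, which would move part of the data-plane set toward the vendor.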

Amazon EKS Architecture

Amazon Elastic Kubernetes Service (EKS) runs the Kubernetes control plane across multiple AWS Availability Zones. The control plane components (API server, etcd) run in an AWS-managed VPC, separate from your workload VPC — connected via cross-account elastic network interfaces (ENIs).

EKS Architecture — AWS-Managed Control Plane
flowchart TB
    subgraph AWS_VPC["AWS-Managed VPC (Hidden)"]
        ETCD["etcd\n(3 nodes, encrypted)"]
        API["API Server\n(NLB fronted, multi-AZ)"]
        SCHED["Scheduler"]
        CM["Controller Manager"]
        API --> ETCD
        SCHED --> API
        CM --> API
    end
    subgraph YOUR_VPC["Your VPC"]
        subgraph MNG["Managed Node Group"]
            N1["EC2 Node 1\nkubelet + pods"]
            N2["EC2 Node 2\nkubelet + pods"]
        end
        subgraph FG["Fargate Profile"]
            F1["Fargate Pod 1"]
            F2["Fargate Pod 2"]
        end
    end
    API -->|"ENI Bridge"| N1
    API -->|"ENI Bridge"| N2
    API -->|"ENI Bridge"| F1
                            

EKS Data Plane Options

  • Managed Node Groups — AWS provisions and manages EC2 instances; you choose instance type and scaling policies
  • Self-Managed Nodes — you control the EC2 launch template, AMI, and lifecycle entirely
  • Fargate — serverless compute per pod; no nodes to manage at all (maximum data plane abstraction)
  • EKS Auto Mode — AWS manages compute, storage, and networking; Karpenter provisions nodes automatically
# Create an EKS cluster with managed node group
eksctl create cluster \
  --name production-cluster \
  --region us-west-2 \
  --version 1.29 \
  --nodegroup-name standard-workers \
  --node-type m5.xlarge \
  --nodes 3 \
  --nodes-min 2 \
  --nodes-max 5 \
  --managed

# Verify control plane is accessible
kubectl cluster-info

# Check node status (data plane)
kubectl get nodes -o wide
EKS Insight
EKS ENI Bridge — The Control/Data Plane Connection

EKS uses cross-account Elastic Network Interfaces (ENIs) placed in your VPC subnets to connect the managed control plane to your worker nodes. This means the API server can communicate with kubelets without traversing the public internet, while keeping the control plane infrastructure completely isolated in AWS's account. You never see the EC2 instances running etcd or the API server.


Azure AKS Architecture

Azure Kubernetes Service (AKS) offers a Free tier with no control plane charge (you pay only for worker nodes); the paid Standard tier adds a financially backed uptime SLA. The control plane runs in an Azure-managed subscription, while worker nodes run as VMs in your subscription and VNET. AKS exposes several networking models, so the network plugin is one of the main design decisions when creating a cluster.

AKS Networking Models

  • Azure CNI — every pod gets a real VNET IP address (routable within your network)
  • Azure CNI Overlay — pods get IPs from an overlay network, reducing VNET IP consumption
  • kubenet — basic networking with UDR-based routing (simpler, fewer IP requirements)
  • Azure CNI powered by Cilium — eBPF data plane with Azure-native networking
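# AKS cluster with managed identity and Azure CNI
# Illustrative declarative spec in the style of Azure Service Operator;
# the apiVersion is a placeholder, so match it to the CRD version installed in your cluster
apiVersion: containerservice.azure.com/v1
kind: ManagedCluster
metadata:
  name: production-aks
spec:
  location: eastus2
  kubernetesVersion: "1.29"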
  identity:
    type: SystemAssigned
  networkProfile:
    networkPlugin: azure
    networkPolicy: calico
    serviceCidr: "10.0.0.0/16"
    dnsServiceIP: "10.0.0.10"
  agentPoolProfiles:
    - name: systempool
      count: 3
      vmSize: Standard_D4s_v5
      mode: System
      availabilityZones: ["1", "2", "3"]
    - name: userpool
      count: 5
      vmSize: Standard_D8s_v5
      mode: User
      enableAutoScaling: true
      minCount: 3
      maxCount: 10
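
To make the difference between flat Azure CNI and Azure CNI Overlay concrete, the sketch below estimates how many VNET addresses each model consumes for the node counts in the spec above. It is a simplified model under stated assumptions: the per-node pod limit of 30 is an assumed value, the function names are hypothetical, and surge nodes during upgrades and Azure's reserved subnet addresses are ignored.

```python
def vnet_ips_flat(nodes: int, max_pods_per_node: int) -> int:
    """Flat Azure CNI: each node takes one VNET IP plus one per potential pod."""
    return nodes * (max_pods_per_node + 1)


def vnet_ips_overlay(nodes: int) -> int:
    """Azure CNI Overlay: only nodes take VNET IPs; pod IPs come from a separate overlay range."""
    return nodes


if __name__ == "__main__":
    nodes = 3 + 10   # systempool count + userpool maxCount from the spec above
    max_pods = 30    # assumed per-node pod limit
    print("flat Azure CNI:", vnet_ips_flat(nodes, max_pods))
    print("overlay:       ", vnet_ips_overlay(nodes))
```

Under these assumptions the flat model needs a few hundred routable addresses where the overlay needs about a dozen, which is why overlay is attractive when VNET address space is scarce.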

AKS Unique Features

  • Virtual Nodes (ACI): serverless burst capacity via Azure Container Instances (similar to Fargate)
  • Free control plane tier: no charge for control plane operation unless you opt into the paid SLA tier
  • Azure Policy for Kubernetes: Azure Policy extends into the cluster via Gatekeeper
  • Microsoft Entra integration: native Microsoft Entra ID (formerly Azure AD) authentication and RBAC

Google GKE Architecture

Google Kubernetes Engine (GKE) is the longest-running of the three services, and Kubernetes itself originated at Google. GKE pushes the abstraction furthest with Autopilot mode, where Google manages both the control plane and the data plane infrastructure.

GKE Standard vs Autopilot — Abstraction Levels
flowchart LR
    subgraph STD["GKE Standard"]
        STD_G["Google Manages:\nControl Plane"]
        STD_Y["You Manage:\nNode Pools, Scaling\nOS Patches, Security"]
    end
    subgraph AP["GKE Autopilot"]
        AP_G["Google Manages:\nControl Plane +\nNode Infrastructure +\nOS + Scaling + Security"]
        AP_Y["You Manage:\nPod Specs Only"]
    end
    STD -->|"More Abstraction"| AP
                            

GKE Autopilot Mode

Autopilot represents the logical extreme of managed Kubernetes: Google manages everything except your pod specifications. You define what to run; Google handles where and how.

  • No node management — Google provisions, scales, and patches nodes automatically
  • Per-pod billing — pay for requested CPU/memory per pod, not for node capacity
  • Built-in security — hardened node OS, no SSH access, enforced security policies
  • Automatic scaling — nodes scale based on pending pod resource requests
# Create GKE Autopilot cluster (maximum managed experience)
gcloud container clusters create-auto production-autopilot \
  --region us-central1 \
  --release-channel regular \
  --enable-master-authorized-networks \
  --master-authorized-networks 10.0.0.0/8

# The cluster is ready — no node configuration needed
# Just deploy workloads
kubectl apply -f deployment.yaml

# Google handles node provisioning automatically
kubectl get nodes  # Nodes appear as pods are scheduled
GKE Advantage: GKE's control plane auto-scaling is transparent: it automatically adjusts API server capacity based on cluster size and request volume. EKS also scales its control plane behind the scenes, while AKS ties control plane capacity and SLA to the pricing tier you select. GKE provides a 99.95% SLA for regional clusters (zonal: 99.5%).
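
The gap between the regional and zonal SLA figures is easier to judge as a downtime budget. The following is a small arithmetic sketch; the 730-hour month and the function name `allowed_downtime_minutes` are assumptions made for illustration, and real SLA documents define their own measurement windows.

```python
HOURS_PER_MONTH = 730  # average month: 365 days * 24 h / 12


def allowed_downtime_minutes(sla_percent: float, hours: float = HOURS_PER_MONTH) -> float:
    """Minutes of downtime per month that still satisfy the given availability SLA."""
    return (1 - sla_percent / 100) * hours * 60


if __name__ == "__main__":
    for label, sla in [("regional", 99.95), ("zonal", 99.5)]:
        print(f"{label}: {allowed_downtime_minutes(sla):.1f} min/month")
```

With these assumptions, 99.95% allows roughly 22 minutes of control plane downtime per month, while 99.5% allows about 3.6 hours.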

Three-Way Comparison

Comparison Matrix
EKS vs AKS vs GKE — Control & Data Plane Features
| Feature | EKS | AKS | GKE |
|---|---|---|---|
| Control plane cost | $0.10/hr (~$73/mo) | Free (Standard tier with SLA: $0.10/hr) | $0.10/hr per cluster (free-tier credit covers one zonal or Autopilot cluster per billing account) |
| Control plane SLA | 99.95% | 99.95% (Standard tier, with AZ) | 99.95% (regional) |
| etcd access | None | None | None |
| Serverless option | Fargate | Virtual Nodes (ACI) | Autopilot |
| Upgrade strategy | In-place, manual trigger | In-place, auto or manual | Auto with release channels |
| Max nodes/cluster | 5,000 | 5,000 | 15,000 |
| Networking model | VPC CNI (pod = ENI IP) | Azure CNI / kubenet / Cilium | VPC-native (alias IPs) |
| Identity integration | IAM Roles for Service Accounts | Workload Identity (Entra) | Workload Identity Federation |

Shared Responsibility Model

The shared responsibility model in managed Kubernetes maps directly to the control/data plane split — with some gray areas that differ by provider.

Shared Responsibility — Control Plane (Vendor) vs Data Plane (You)
flowchart TB
    subgraph VENDOR["Cloud Vendor Responsibility (Control Plane)"]
        V1["etcd availability & backups"]
        V2["API server patching & scaling"]
        V3["Control plane HA & failover"]
        V4["Kubernetes version security patches"]
        V5["Control plane monitoring"]
    end
    subgraph SHARED["Shared Responsibility"]
        S1["Kubernetes version upgrades (trigger)"]
        S2["Network policy configuration"]
        S3["RBAC policy definition"]
        S4["Add-on management"]
    end
    subgraph YOU["Your Responsibility (Data Plane)"]
        Y1["Worker node OS patching"]
        Y2["Pod security & image scanning"]
        Y3["Application configuration"]
        Y4["Network segmentation"]
        Y5["Data encryption & secrets"]
        Y6["Workload scaling policies"]
    end
                            
Common Misconception: "Managed Kubernetes means I don't have to worry about security." Wrong. The vendor secures the control plane infrastructure, but you are responsible for everything running ON the cluster — pod security standards, network policies, RBAC, image vulnerability scanning, secrets management, and supply chain security. The data plane is YOUR attack surface.

Cost Implications

The control/data plane split creates distinct cost models across providers:

  • EKS: $0.10/hr control plane fee + EC2/Fargate compute. Fargate is billed per vCPU-hour and GB-hour (higher unit cost, no idle node capacity)
  • AKS: free control plane, pay only for VM compute. Optionally pay for the Standard tier, which adds the uptime SLA ($0.10/hr/cluster)
  • GKE Autopilot: $0.10/hr cluster management fee + per-pod resource billing (CPU/memory/ephemeral storage, per second). No node idle waste
  • GKE Standard: $0.10/hr control plane + VM compute (same model as EKS)
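
The flat $0.10/hr fee is small for one cluster but scales linearly with cluster count. The sketch below shows the arithmetic; the 730-hour month and the function name `monthly_control_plane_fee` are illustrative assumptions, and compute, storage, and network charges are excluded.

```python
HOURS_PER_MONTH = 730  # average month: 365 days * 24 h / 12


def monthly_control_plane_fee(hourly_fee: float, clusters: int = 1) -> float:
    """Flat control plane charge per month, before any compute cost."""
    return hourly_fee * HOURS_PER_MONTH * clusters


if __name__ == "__main__":
    print(f"1 cluster:   ${monthly_control_plane_fee(0.10):.2f}")
    print(f"50 clusters: ${monthly_control_plane_fee(0.10, clusters=50):.2f}")
```

One cluster comes to about $73 per month, matching the figure in the comparison above; a fleet of 50 clusters pays about $3,650 per month in control plane fees alone.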
Cost Analysis
When Serverless Data Planes Win

Fargate (EKS) and Autopilot (GKE) eliminate data plane idle capacity: you pay only for the resources your pods request. This wins for bursty or variable workloads where traditional node groups would be over-provisioned. For steady-state workloads at scale, however, reserved or committed-use instances on managed node groups are often 40-60% cheaper. The choice maps to your workload's variance: high variance favors serverless, low variance favors committed nodes.
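
The variance argument can be turned into a break-even utilization. In the sketch below, committed nodes cost `capacity * price * (1 - discount)` regardless of use, while serverless costs `usage * price * premium`, so committed nodes win once average utilization exceeds `(1 - discount) / premium`. The premium and discount values are hypothetical inputs, not provider prices, and the function names are illustrative.

```python
def break_even_utilization(serverless_premium: float, committed_discount: float) -> float:
    """Average utilization above which committed nodes beat per-pod billing.

    serverless_premium: serverless unit price / on-demand node unit price (e.g. 1.25)
    committed_discount: fraction saved by committing to nodes (e.g. 0.5 for 50%)
    """
    return (1 - committed_discount) / serverless_premium


def cheaper_option(avg_utilization: float, serverless_premium: float, committed_discount: float) -> str:
    """Pick the cheaper data plane for a workload with the given average utilization."""
    threshold = break_even_utilization(serverless_premium, committed_discount)
    return "committed nodes" if avg_utilization > threshold else "serverless"


if __name__ == "__main__":
    # Hypothetical inputs: 25% serverless premium, 50% committed discount.
    print(break_even_utilization(1.25, 0.5))
    print(cheaper_option(0.7, 1.25, 0.5))  # steady workload
    print(cheaper_option(0.2, 1.25, 0.5))  # bursty workload
```

With those example inputs the break-even sits at 40% utilization: a node pool that averages 70% busy is cheaper on committed nodes, while one averaging 20% is cheaper on a serverless data plane.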


Control Plane Access Limitations

The trade-off of a managed control plane: you gain operational simplicity but lose fine-grained control. Key limitations across all providers:

  • No etcd access: you cannot query etcd directly, tune compaction settings, or read raw data
  • Limited API server flags: you cannot change API server flags or the set of built-in admission plugins; custom admission logic must go through dynamic admission webhooks
  • No replacement of the default scheduler: custom schedulers must run alongside it as secondary schedulers
  • Audit log limitations: control plane logs may have retention limits or extra cost
  • Version support constraints: you must upgrade within the provider's supported version window
  • Add-on compatibility: some open-source tools conflict with vendor-managed add-ons
# Managed node group configuration (EKS)
# You control instance types and scaling — vendor manages node lifecycle
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: production
  region: us-west-2
managedNodeGroups:
  - name: critical-workloads
    instanceType: m5.2xlarge
    desiredCapacity: 5
    minSize: 3
    maxSize: 10
    volumeSize: 100
    volumeType: gp3
    labels:
      workload-type: critical
    taints:
      - key: dedicated
        value: critical
        effect: NoSchedule
    iam:
      withAddonPolicies:
        autoScaler: true
        ebs: true
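
Because API server flags are off limits, custom admission logic on a managed cluster runs as a dynamic admission webhook. The sketch below shows only the decision function of such a webhook, using the `admission.k8s.io/v1` `AdmissionReview` request and response shape. The policy (pods must carry a `team` label) and the function name `review_pod` are hypothetical, and a real webhook would also need an HTTPS server and a `ValidatingWebhookConfiguration` that points at it.

```python
import json


def review_pod(admission_review: dict) -> dict:
    """Build an AdmissionReview response that rejects pods without a 'team' label."""
    request = admission_review["request"]
    labels = request["object"].get("metadata", {}).get("labels") or {}
    allowed = "team" in labels
    # The response must echo the request uid so the API server can match it.
    response = {"uid": request["uid"], "allowed": allowed}
    if not allowed:
        response["status"] = {"code": 403, "message": "pods must carry a 'team' label"}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


if __name__ == "__main__":
    sample = {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "request": {"uid": "1234", "object": {"metadata": {"name": "demo"}}},
    }
    print(json.dumps(review_pod(sample), indent=2))
```

Keeping the decision logic in a pure function like this makes it testable without a cluster, which matters when you cannot inspect the managed API server directly.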

Multi-Cluster Management

As organizations scale, they run many clusters — leading to a "fleet management" problem that each provider addresses differently:

  • EKS: EKS Connector for external clusters; AWS Organizations for cross-account; no native fleet management (use Rancher, Crossplane)
  • AKS: Azure Arc-enabled Kubernetes extends Azure management to any cluster (on-premises, other clouds); Fleet Manager for multi-cluster orchestration
  • GKE: fleets (part of GKE Enterprise, formerly Anthos) provide unified multi-cluster management, Config Sync, and service mesh across GKE and external clusters
Fleet Pattern: Multi-cluster management introduces a "meta-control-plane": a higher-level control plane that manages individual cluster control planes. GKE's Config Sync applies policies from a Git repo to every cluster in a fleet. AKS Fleet Manager orchestrates rolling upgrades across a fleet. This is control/data plane separation applied recursively.
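
As a toy illustration of what a meta-control-plane decides, the sketch below orders a fleet into upgrade waves by environment. It does not reflect how AKS Fleet Manager or GKE fleets are implemented; the cluster names, stage labels, and the function `plan_upgrade_waves` are all hypothetical.

```python
from collections import defaultdict


def plan_upgrade_waves(clusters: dict[str, str], stage_order: list[str]) -> list[list[str]]:
    """Group clusters into upgrade waves that follow stage_order.

    clusters maps cluster name -> stage label (for example "dev", "staging", "prod").
    """
    by_stage = defaultdict(list)
    for name, stage in clusters.items():
        by_stage[stage].append(name)
    unknown = set(by_stage) - set(stage_order)
    if unknown:
        raise ValueError(f"clusters with unknown stage: {sorted(unknown)}")
    # One wave per stage, skipping stages that have no clusters.
    return [sorted(by_stage[stage]) for stage in stage_order if by_stage[stage]]


if __name__ == "__main__":
    fleet = {"prod-eu": "prod", "dev-1": "dev", "prod-us": "prod", "stage-1": "staging"}
    for i, wave in enumerate(plan_upgrade_waves(fleet, ["dev", "staging", "prod"]), start=1):
        print(f"wave {i}: {wave}")
```

The fleet-level planner only decides ordering; each cluster's own control plane still performs the upgrade, which is the recursion the fleet pattern describes.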

When to Use Managed vs Self-Managed

Choose Managed Kubernetes When:

  • Your team's core competency is application development, not infrastructure operations
  • You want SLA guarantees for control plane availability
  • You need tight integration with cloud provider IAM, networking, and storage
  • You're running fewer than 50 clusters and don't need exotic customizations
  • Compliance requirements are met by the provider's shared responsibility model

Choose Self-Managed Kubernetes When:

  • You need custom API server configurations (admission controllers, audit policies)
  • You require direct etcd access for backup/restore control
  • You're running in air-gapped or on-premises environments
  • Your organization has a dedicated platform team with deep K8s expertise
  • Cost optimization at massive scale justifies operational overhead
  • Regulatory requirements demand full control of all infrastructure layers
Decision Framework
The 80/20 Rule of Managed K8s

As a rule of thumb, managed Kubernetes is the right choice for roughly 80% of organizations. The remaining 20% who need self-managed typically fall into three categories: (1) hyperscalers running thousands of clusters where per-cluster fees add up, (2) regulated industries with strict data residency and audit requirements, and (3) platform engineering teams building opinionated developer platforms that need deep control plane customization. If you're not in one of these categories, managed K8s saves you from operational toil that doesn't differentiate your product.
