
Part 14: Kubernetes Security

May 14, 2026 Wasil Zafar 44 min read

Kubernetes security requires defence in depth: authentication, authorisation, pod hardening, network segmentation, secrets management, and supply chain verification. In a default cluster, a compromised pod can reach every other pod over the network. Let's fix that.

Table of Contents

  1. The 4C Security Model
  2. Authentication
  3. RBAC (Role-Based Access Control)
  4. Pod Security
  5. Secrets Management
  6. Policy Enforcement
  7. Supply Chain Security
  8. Exercises
  9. Conclusion

The 4C Security Model

Cloud Native Security — 4C Model (Defence in Depth)
flowchart TD
    subgraph Cloud ["Cloud (Infrastructure)"]
        subgraph Cluster ["Cluster (Kubernetes)"]
            subgraph Container ["Container (Runtime)"]
                subgraph Code ["Code (Application)"]
                    APP[Your Application]
                end
            end
        end
    end
    
    C1[Network security<br/>IAM roles<br/>Encryption at rest] --> Cloud
    C2[RBAC<br/>Network Policies<br/>Pod Security<br/>Admission Control] --> Cluster
    C3[Minimal images<br/>Read-only FS<br/>Non-root user<br/>No capabilities] --> Container
    C4[Input validation<br/>TLS<br/>Dependency scanning<br/>SAST/DAST] --> Code

Attack Surface

Default Kubernetes Is Not Secure: Out of the box, a Kubernetes cluster has no network policies (every pod can reach every other pod), a service account token mounted into every pod automatically, Secrets stored unencrypted in etcd, and no pod security restrictions (containers can run as root or privileged). Each of these must be hardened for production.
| Attack Vector | Risk | Mitigation |
|---|---|---|
| Exposed API Server | Full cluster compromise | Private API endpoint, RBAC, audit logging |
| Compromised pod | Lateral movement to other services | Network policies, pod security |
| Leaked service account token | API access with pod's permissions | Minimal RBAC, disable token auto-mount |
| Vulnerable container image | Known CVE exploitation | Image scanning, signed images, minimal base |
| Secrets in etcd | etcd access exposes all secrets | etcd encryption, external secrets store |
| Privileged container | Container escape → node access | Pod Security Standards, no privileged pods |

Authentication

Service Accounts

# Service accounts authenticate pods to the API server
# Every namespace has a "default" SA — don't use it for production!

# Create a dedicated service account:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: payment-service
  namespace: production
  annotations:
    # AWS: IAM Roles for Service Accounts (IRSA)
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/payment-s3-access
    # GCP: Workload Identity
    iam.gke.io/gcp-service-account: payment@project.iam.gserviceaccount.com
automountServiceAccountToken: false  # Don't mount token unless needed
---
# Use in a Pod:
apiVersion: v1
kind: Pod
metadata:
  name: payment-pod
spec:
  serviceAccountName: payment-service
  automountServiceAccountToken: false   # Explicitly opt-out of token mount
  containers:
  - name: payment
    image: payment:v2.0
# Best practices for service accounts:
# 1. Create dedicated SAs per workload (not default)
# 2. Disable auto-mount unless pod needs API access
# 3. Use short-lived tokens (bound service account tokens, K8s 1.22+)
# 4. Use cloud IAM integration (IRSA/Workload Identity) for cloud resources
# 5. Audit which SAs have cluster-admin or broad permissions

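# List service accounts and any legacy long-lived token Secrets
# (these are no longer created automatically since K8s 1.24):
kubectl get serviceaccounts -A
kubectl get secrets -A --field-selector type=kubernetes.io/service-account-token

# Identify overprivileged service accounts:
kubectl auth can-i --list --as=system:serviceaccount:default:default
# Resources         Verbs
# *.*               [*]    ← DANGER if you see this: the SA has been granted cluster-admin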
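
To make best practice 3 above concrete, the sketch below decodes the payload of a JWT and reports its lifetime and subject. The token is a fabricated, unsigned stand-in built inside the script; the claim names (`exp`, `iat`, `aud`, `sub`) are standard JWT claims, and the sample values are assumptions rather than output from a real cluster. The helper names are hypothetical, and nothing here verifies a signature, so treat it as an inspection aid only.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, the way JWT segments are encoded."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def decode_claims(token: str) -> dict:
    """Return the (unverified) payload claims of a JWT."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore the stripped padding
    return json.loads(base64.urlsafe_b64decode(payload))


def lifetime_seconds(claims: dict) -> int:
    """Token lifetime: expiry minus issued-at."""
    return claims["exp"] - claims["iat"]


# Hypothetical claims resembling a short-lived service account token.
sample_claims = {
    "aud": ["https://kubernetes.default.svc"],
    "iat": 1_700_000_000,
    "exp": 1_700_003_600,  # one hour after iat
    "sub": "system:serviceaccount:production:payment-service",
}
header = b64url(json.dumps({"alg": "none"}).encode())
body = b64url(json.dumps(sample_claims).encode())
sample_token = f"{header}.{body}.signature-placeholder"

claims = decode_claims(sample_token)
print(claims["sub"], lifetime_seconds(claims))
```

A token whose lifetime is measured in hours and whose audience is restricted limits the damage if it leaks, which is the point of preferring bound tokens over long-lived token Secrets.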

User Authentication

# Kubernetes does NOT manage user accounts — it delegates to:
# - X.509 client certificates (kubeadm default)
# - OIDC tokens (Dex, Keycloak, Azure AD, Google)
# - Webhook token authentication
# - Service account tokens (for automation)

# OIDC integration example (Azure AD):
# In API server config:
# --oidc-issuer-url=https://login.microsoftonline.com/TENANT_ID/v2.0
# --oidc-client-id=CLIENT_ID
# --oidc-username-claim=email
# --oidc-groups-claim=groups

# kubeconfig with OIDC:
# users:
# - name: developer@company.com
#   user:
#     exec:
#       apiVersion: client.authentication.k8s.io/v1beta1
#       command: kubelogin
#       args: ["get-token", "--environment", "AzurePublicCloud"]

# Check who you're authenticated as:
kubectl auth whoami
# Username: developer@company.com
# Groups:  [backend-team system:authenticated]

RBAC (Role-Based Access Control)

RBAC Model

RBAC: Who Can Do What Where
flowchart LR
    subgraph WHO ["Subject (Who)"]
        U[User]
        G[Group]
        SA[ServiceAccount]
    end
    subgraph BINDING ["Binding (Connects Who to What)"]
        RB[RoleBinding<br/>namespace-scoped]
        CRB[ClusterRoleBinding<br/>cluster-wide]
    end
    subgraph WHAT ["Role (What + Where)"]
        R[Role<br/>namespace-scoped]
        CR[ClusterRole<br/>cluster-wide]
    end
    U --> RB
    G --> CRB
    SA --> RB
    RB --> R
    CRB --> CR

Practical Examples

# Role: defines WHAT actions on WHICH resources (namespace-scoped)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: production
rules:
- apiGroups: [""]           # "" = core API group
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["services", "endpoints"]
  verbs: ["get", "list"]
---
# RoleBinding: binds WHO to the Role
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-pod-reader
  namespace: production
subjects:
- kind: Group
  name: backend-developers    # OIDC group
  apiGroup: rbac.authorization.k8s.io
- kind: ServiceAccount
  name: ci-pipeline
  namespace: cicd
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
---
# ClusterRole: cluster-wide permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: namespace-admin
rules:
- apiGroups: [""]
  resources: ["pods", "services", "configmaps", "secrets"]
  verbs: ["*"]
- apiGroups: ["apps"]
  resources: ["deployments", "statefulsets", "daemonsets"]
  verbs: ["*"]
- apiGroups: ["networking.k8s.io"]
  resources: ["networkpolicies", "ingresses"]
  verbs: ["get", "list", "create", "update", "delete"]
# Note: no access to nodes, namespaces, clusterroles, PVs
---
# ClusterRoleBinding for SRE team (all namespaces):
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: sre-namespace-admin
subjects:
- kind: Group
  name: sre-team
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: namespace-admin
  apiGroup: rbac.authorization.k8s.io
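
RBAC rules are purely additive: a request is allowed if any rule in any bound role matches its API group, resource, and verb, and there are no deny rules. The sketch below models that matching for the `pod-reader` Role above. The `Rule` class and `allowed` function are illustrative names, not part of any Kubernetes client, and the model deliberately ignores `resourceNames` and non-resource URLs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    api_groups: tuple
    resources: tuple
    verbs: tuple


def _matches(values: tuple, wanted: str) -> bool:
    """A field matches if it lists the value or the '*' wildcard."""
    return "*" in values or wanted in values


def allowed(rules, api_group: str, resource: str, verb: str) -> bool:
    """Allow if at least one rule matches all three fields."""
    return any(
        _matches(r.api_groups, api_group)
        and _matches(r.resources, resource)
        and _matches(r.verbs, verb)
        for r in rules
    )


# Mirrors the pod-reader Role defined above ("" is the core API group).
pod_reader = [
    Rule(("",), ("pods", "pods/log"), ("get", "list", "watch")),
    Rule(("",), ("services", "endpoints"), ("get", "list")),
]

print(allowed(pod_reader, "", "pods", "get"))
print(allowed(pod_reader, "", "pods", "delete"))
print(allowed(pod_reader, "apps", "deployments", "create"))
```

Because nothing can subtract a permission, the only way to narrow access is to grant less in the first place, which is why wildcard verbs and resources deserve scrutiny.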

RBAC Best Practices

# RBAC auditing and debugging:

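# Check what a user can do. --as alone carries no group membership, so
# impersonate the group too (assuming the user belongs to backend-developers):
kubectl auth can-i --list --as=developer@company.com --as-group=backend-developers -n production
# Resources         Verbs
# pods              [get list watch]
# pods/log          [get list watch]
# services          [get list]
# endpoints         [get list]

# Test specific permissions:
kubectl auth can-i create deployments -n production --as=developer@company.com --as-group=backend-developers
# no

kubectl auth can-i delete pods -n production --as=system:serviceaccount:cicd:ci-pipeline
# no   (pod-reader grants only get, list, watch on pods)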

# Find all ClusterRoleBindings granting cluster-admin:
kubectl get clusterrolebindings -o json | \
  jq '.items[] | select(.roleRef.name=="cluster-admin") | .subjects[]?'

# Best practices:
# 1. Least privilege: grant minimum required permissions
# 2. Use Groups not Users in bindings (easier to manage)
# 3. Namespace-scoped Roles over ClusterRoles when possible
# 4. Avoid wildcards: specify exact verbs and resources
# 5. Audit regularly: who has cluster-admin?
# 6. Separate CI/CD SAs from application SAs
# 7. No cluster-admin for human users (use namespace-admin)

Pod Security

Security Context

# Security context: harden individual pods/containers
apiVersion: v1
kind: Pod
metadata:
  name: hardened-pod
spec:
  securityContext:              # Pod-level security
    runAsNonRoot: true          # Reject containers trying to run as root
    runAsUser: 1000             # Run as UID 1000
    runAsGroup: 3000            # Primary group GID 3000
    fsGroup: 2000              # Volume ownership GID
    seccompProfile:
      type: RuntimeDefault      # Enable seccomp syscall filtering
  containers:
  - name: app
    image: my-app:v2.0
    securityContext:            # Container-level security
      allowPrivilegeEscalation: false  # No setuid/setgid
      readOnlyRootFilesystem: true     # Immutable filesystem
      capabilities:
        drop: ["ALL"]                   # Drop all Linux capabilities
        # add: ["NET_BIND_SERVICE"]     # Only add what's needed
    volumeMounts:
    - name: tmp
      mountPath: /tmp           # Writable tmp (since rootfs is read-only)
  volumes:
  - name: tmp
    emptyDir: {}                # Ephemeral writable volume

Pod Security Standards

Pod Security Standards (PSS) replace PodSecurityPolicy, which was deprecated in Kubernetes 1.21 and removed in 1.25. They define three security levels, which the built-in Pod Security Admission controller enforces per namespace via labels:

| Level | Restriction | Use Case |
|---|---|---|
| Privileged | Unrestricted (no checks) | System infrastructure (CNI, logging) |
| Baseline | Prevents known privilege escalations | Most workloads |
| Restricted | Full hardening (non-root, no capabilities, seccomp) | Security-critical, multi-tenant |
# Enforce Pod Security Standards via namespace labels:
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    # Enforce restricted: reject non-compliant pods
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    # Warn on restricted violations (client sees a warning, pod is still admitted)
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
    # Audit: log violations to audit log
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: latest
# Test what would happen:
kubectl label --dry-run=server --overwrite ns production \
  pod-security.kubernetes.io/enforce=restricted
# Warning: existing pods in namespace "production" violate the new
# PodSecurity enforce level "restricted":
#   payment-abc12: allowPrivilegeEscalation != false,
#                  runAsNonRoot != true

# Common fixes for restricted compliance:
# ✓ Set runAsNonRoot: true
# ✓ Set allowPrivilegeEscalation: false
# ✓ Drop ALL capabilities
# ✓ Set seccompProfile: RuntimeDefault
# ✓ Don't use hostNetwork, hostPID, hostIPC
# ✓ Don't mount hostPath volumes

Secrets Management

Kubernetes Secrets

Secrets Aren't Encrypted by Default: Kubernetes Secrets are base64-encoded (NOT encrypted) and stored in etcd in plain text by default. Anyone with etcd access or API read permissions on secrets can read them. Enable encryption at rest and consider external secrets stores for production.
# Kubernetes secrets are base64, NOT encrypted:
kubectl create secret generic db-creds \
  --from-literal=username=admin \
  --from-literal=password=s3cur3P@ss

kubectl get secret db-creds -o jsonpath='{.data.password}' | base64 -d
# s3cur3P@ss  ← anyone with secret read access can decode

# Enable encryption at rest (API server configuration):
# --encryption-provider-config=/etc/kubernetes/encryption-config.yaml

# encryption-config.yaml:
# apiVersion: apiserver.config.k8s.io/v1
# kind: EncryptionConfiguration
# resources:
# - resources: ["secrets"]
#   providers:
#   - aescbc:
#       keys:
#       - name: key1
#         secret: <base64-encoded-32-byte-key>
#   - identity: {}   # Fallback for reading unencrypted secrets
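
Two points from this section are easy to check in a few lines: base64 is reversible without any key, and the encryption config expects a base64-encoded 32-byte key. The sketch below is a minimal illustration using only the standard library; the variable names are arbitrary, and how the key is stored and rotated is out of scope.

```python
import base64
import secrets

# Encoding is reversible by anyone: no key is involved.
encoded = base64.b64encode(b"s3cur3P@ss").decode()
decoded = base64.b64decode(encoded)
print(encoded, decoded)

# A fresh random 32-byte key, base64-encoded for the `secret:` field
# of the EncryptionConfiguration.
key = base64.b64encode(secrets.token_bytes(32)).decode()
print(len(base64.b64decode(key)))
```

The key itself then becomes the sensitive artefact, so the encryption config file needs tight file permissions on the control plane nodes.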

External Secrets

# External Secrets Operator: sync secrets from external stores
# Supports: AWS Secrets Manager, Azure Key Vault, HashiCorp Vault,
#           GCP Secret Manager, 1Password, and more

# SecretStore: connection to external provider
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secrets
  namespace: production
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa
---
# ExternalSecret: what to sync
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: production
spec:
  refreshInterval: 1h          # Re-sync every hour
  secretStoreRef:
    name: aws-secrets
    kind: SecretStore
  target:
    name: db-creds             # K8s Secret to create
    creationPolicy: Owner
  data:
  - secretKey: username        # Key in K8s Secret
    remoteRef:
      key: production/database  # AWS Secret name
      property: username        # JSON property in AWS Secret
  - secretKey: password
    remoteRef:
      key: production/database
      property: password

Policy Enforcement

OPA Gatekeeper

# OPA Gatekeeper: policy-as-code with Rego language

# ConstraintTemplate: define the policy logic
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
  - target: admission.k8s.gatekeeper.sh
    rego: |
      package k8srequiredlabels
      violation[{"msg": msg}] {
        provided := {label | input.review.object.metadata.labels[label]}
        required := {label | label := input.parameters.labels[_]}
        missing := required - provided
        count(missing) > 0
        msg := sprintf("Missing required labels: %v", [missing])
      }
---
# Constraint: apply the policy
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-team-label
spec:
  match:
    kinds:
    - apiGroups: [""]
      kinds: ["Namespace"]
    - apiGroups: ["apps"]
      kinds: ["Deployment"]
  parameters:
    labels: ["team", "environment"]

# Result: any Namespace or Deployment missing either the "team" or the "environment" label is REJECTED
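
The Rego rule above is a set difference: required labels minus provided labels, with a violation raised when anything is left over. For readers less familiar with Rego, the same logic looks roughly like this in Python. The function names and sample objects are hypothetical, and this is a reading aid rather than something Gatekeeper runs.

```python
def missing_labels(obj: dict, required: list) -> set:
    """Required label keys that the object does not carry."""
    provided = set(obj.get("metadata", {}).get("labels") or {})
    return set(required) - provided


def violation(obj: dict, required: list):
    """Return a message when labels are missing, otherwise None."""
    missing = missing_labels(obj, required)
    if missing:
        return f"Missing required labels: {sorted(missing)}"
    return None


required = ["team", "environment"]
labelled = {"metadata": {"labels": {"team": "payments", "environment": "prod"}}}
partial = {"metadata": {"labels": {"team": "payments"}}}
unlabelled = {"metadata": {}}

print(violation(labelled, required))
print(violation(partial, required))
print(violation(unlabelled, required))
```

Only label keys are compared, so an empty label value still satisfies the constraint, which matches the Rego comprehension over `metadata.labels[label]`.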

Kyverno

# Kyverno: Kubernetes-native policies (no Rego needed)
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  validationFailureAction: Enforce   # Enforce or Audit
  rules:
  - name: require-image-tag
    match:
      any:
      - resources:
          kinds: ["Pod"]
    validate:
      message: "Images must use a specific tag, not 'latest' or empty."
      pattern:
        spec:
          containers:
          - image: "!*:latest & *:*"  # Must have tag, not "latest"
---
# Mutating policy: add default labels
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-labels
spec:
  rules:
  - name: add-managed-by
    match:
      any:
      - resources:
          kinds: ["Deployment", "StatefulSet"]
    mutate:
      patchStrategicMerge:
        metadata:
          labels:
            +(app.kubernetes.io/managed-by): "kyverno"

Supply Chain Security

Image Scanning

# Scan images for known vulnerabilities:

# Trivy (open-source, CNCF):
trivy image payment-service:v2.0
# payment-service:v2.0 (alpine 3.19.1)
# Total: 2 (HIGH: 1, CRITICAL: 1)
# ┌─────────────────┬──────────────┬──────────┬─────────────────────────┐
# │     Library     │ Vulnerability│ Severity │     Fixed Version       │
# ├─────────────────┼──────────────┼──────────┼─────────────────────────┤
# │ libcrypto3      │ CVE-2024-XXX │ CRITICAL │ 3.1.4-r5               │
# │ libssl3         │ CVE-2024-YYY │ HIGH     │ 3.1.4-r5               │
# └─────────────────┴──────────────┴──────────┴─────────────────────────┘

# Scan in CI/CD pipeline:
trivy image --exit-code 1 --severity CRITICAL,HIGH payment-service:v2.0
# Exit code 1 = vulnerabilities found → fail the pipeline

# Scan Kubernetes manifests:
trivy config ./k8s-manifests/
# Checks for misconfigurations (running as root, no limits, etc.)

# Best practices:
# 1. Use minimal base images (distroless, alpine, scratch)
# 2. Multi-stage builds (build deps don't ship to production)
# 3. Pin image digests, not just tags
# 4. Scan in CI/CD AND continuously in production
# 5. Set up admission policy to reject images with CRITICAL CVEs
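
Best practice 3 is straightforward to lint for. A tag can be moved to different content, while a `sha256` digest identifies one exact image. The sketch below flags references that are not pinned by digest; `is_pinned` and the sample references are hypothetical, and the regular expression only recognises `sha256` digests.

```python
import re

# A pinned reference ends in "@sha256:" followed by 64 lowercase hex characters.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned(image: str) -> bool:
    """True if the image reference is pinned to a sha256 digest."""
    return DIGEST_RE.search(image) is not None


fake_digest = "a" * 64  # placeholder, not a real image digest
images = [
    f"myregistry.io/payment@sha256:{fake_digest}",
    f"myregistry.io/payment:v2.0@sha256:{fake_digest}",
    "myregistry.io/payment:v2.0",
]
unpinned = [image for image in images if not is_pinned(image)]
print(unpinned)
```

Running a check like this over rendered manifests in CI catches tag-only references before they reach the cluster.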

Image Signing

# Cosign (Sigstore): sign and verify container images

# Sign an image:
cosign sign --key cosign.key myregistry.io/payment:v2.0
# Pushing signature to: myregistry.io/payment:sha256-abc123.sig

# Verify an image:
cosign verify --key cosign.pub myregistry.io/payment:v2.0
# Verification for myregistry.io/payment:v2.0 --
# The following checks were performed:
# - The cosign claims were validated
# - The signatures were verified against the specified public key

# Keyless signing (with OIDC identity):
cosign sign myregistry.io/payment:v2.0
# Uses your identity provider (GitHub, Google) — no key management!

# Enforce signed images with admission control:
# Kyverno policy to verify signatures:
# apiVersion: kyverno.io/v1
# kind: ClusterPolicy
# metadata:
#   name: verify-image-signature
# spec:
#   validationFailureAction: Enforce
#   rules:
#   - name: verify-image-signature
#     match:
#       any:
#       - resources:
#           kinds: ["Pod"]
#     verifyImages:
#     - imageReferences: ["myregistry.io/*"]
#       attestors:
#       - entries:
#         - keys:
#             publicKeys: |-
#               -----BEGIN PUBLIC KEY-----
#               ...
#               -----END PUBLIC KEY-----

Exercises

Exercise 1 — RBAC Lab: Create a namespace "dev-team". Create a Role that allows only reading pods and viewing logs. Bind it to a ServiceAccount "dev-reader". Use kubectl auth can-i to verify the SA can read pods but cannot delete them or access secrets.
Exercise 2 — Pod Hardening: Take an existing Deployment running as root with no security context. Harden it: set runAsNonRoot, drop all capabilities, enable read-only filesystem, add seccomp profile. Apply "restricted" PSS to the namespace and verify the pod is compliant.
Exercise 3 — Policy Enforcement: Install Kyverno (or OPA Gatekeeper). Create policies that: (a) reject pods using "latest" tag, (b) require resource limits on all containers, (c) require a "team" label on all Deployments. Test by trying to deploy non-compliant resources.
Exercise 4 — Secrets Rotation: Create a Kubernetes Secret. Demonstrate that it's base64-encoded (not encrypted). Set up the External Secrets Operator (or Sealed Secrets) to manage a secret from an external store. Simulate a rotation — update the external secret and verify the K8s Secret updates.

Conclusion

Kubernetes security is a continuous practice, not a one-time setup. The key layers:

  • Authentication: Dedicated service accounts, OIDC for users, disable auto-mount
  • Authorisation: RBAC with least privilege, no cluster-admin for humans
  • Pod Security: Non-root, read-only FS, dropped capabilities, PSS enforcement
  • Network: Default-deny network policies per namespace (covered in Part 8)
  • Secrets: External secrets stores, encryption at rest, rotation
  • Policy: Admission controllers (Kyverno/Gatekeeper) to prevent misconfigurations
  • Supply Chain: Minimal images, vulnerability scanning, image signing

In Part 15, we'll cover Observability & Troubleshooting — Prometheus metrics, Grafana dashboards, distributed tracing, structured logging, and systematic approaches to debugging Kubernetes issues.