
Part 22: Kubernetes Networking Foundations

May 13, 2026 | Wasil Zafar | 20 min read

How Kubernetes networking works — the flat pod network, Service abstractions, CNI plugins, Ingress routing, NetworkPolicies, and DNS service discovery.

Table of Contents

  1. The Kubernetes Network Model
  2. Pod-to-Pod Communication
  3. Services
  4. kube-proxy & IPVS
  5. Ingress & Ingress Controllers
  6. NetworkPolicies
  7. CoreDNS & Service Discovery
  8. Exercises
  9. Conclusion

The Kubernetes Network Model

Kubernetes imposes a flat network model — every Pod gets its own unique IP address, and all Pods can communicate directly without NAT. This simplifies application networking enormously: containers behave as if they're on the same LAN, regardless of which node they're scheduled on.

The 3 Kubernetes Networking Rules:
  1. Every Pod gets its own IP address — no need to map ports between Pods.
  2. Pods can communicate with all other Pods without NAT — the IP a Pod sees for itself is the same IP others use to reach it.
  3. Agents (kubelet, kube-proxy) on a node can communicate with all Pods on that node — host-to-Pod networking works both directions.

Kubernetes does not implement networking itself — it delegates to CNI (Container Network Interface) plugins. The CNI plugin is responsible for assigning Pod IPs, configuring network interfaces, and establishing cross-node connectivity (via overlay networks, BGP, or cloud provider routing).

Pod-to-Pod Communication

Same Node

When two Pods run on the same node, they communicate via a Linux bridge (typically cbr0 or cni0). Each Pod's network namespace connects to the bridge via a veth pair — one end inside the Pod (appears as eth0), the other end attached to the bridge on the host.

Pod-to-Pod Networking (Same Node & Cross-Node)
flowchart LR
    subgraph Node1["Node 1"]
        PA["Pod A\neth0\n10.244.1.5"] -->|veth pair| BR1["Bridge\ncni0\n10.244.1.1"]
        PB["Pod B\neth0\n10.244.1.8"] -->|veth pair| BR1
    end
    subgraph Node2["Node 2"]
        PC["Pod C\neth0\n10.244.2.3"] -->|veth pair| BR2["Bridge\ncni0\n10.244.2.1"]
    end
    BR1 -->|"CNI Overlay\n(VXLAN/BGP/\nCloud Routes)"| BR2
            

Cross-Node

Cross-node communication depends on the CNI plugin's strategy. Three common approaches:

  • Overlay (VXLAN/Geneve): Encapsulates Pod traffic in outer UDP packets between nodes. Works anywhere, but adds ~50 bytes of overhead per packet (Flannel VXLAN, Calico VXLAN mode).
  • BGP Routing: Advertises Pod CIDR routes between nodes via BGP. No encapsulation overhead, but requires L3 network support (Calico BGP mode).
  • Cloud Provider Routes: Uses the cloud's VPC routing table to route Pod CIDRs to the correct node. Zero overlay overhead, native performance (AWS VPC CNI, GKE native routing).
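The ~50 bytes of VXLAN overhead quoted above can be reproduced with simple arithmetic. The sketch below assumes VXLAN over IPv4 with no VLAN tag and no IP options; the constant and function names are illustrative, not taken from any CNI plugin.

```python
# Header sizes in bytes for VXLAN over IPv4 (assumed: no VLAN tag, no IP options).
INNER_ETHERNET = 14  # Ethernet header of the Pod's original frame
VXLAN_HEADER = 8
OUTER_UDP = 8
OUTER_IPV4 = 20


def vxlan_overhead() -> int:
    """Bytes added to each packet by VXLAN encapsulation over IPv4."""
    return INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4


def pod_mtu(node_mtu: int) -> int:
    """Largest Pod-side MTU whose encapsulated packet still fits the node MTU."""
    return node_mtu - vxlan_overhead()


print(vxlan_overhead())  # 50
print(pod_mtu(1500))     # 1450
```

With a standard 1500-byte node MTU, this gives a Pod-side MTU of 1450, which is the value an overlay CNI commonly sets on its bridge.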
# View Pod IPs and which node they're on
kubectl get pods -o wide
# NAME          READY   STATUS    IP            NODE
# nginx-abc     1/1     Running   10.244.1.5    node-1
# redis-xyz     1/1     Running   10.244.2.3    node-2

# View the CNI bridge on a node
ssh node-1 "ip link show type bridge"
# cni0: <BROADCAST,MULTICAST,UP> mtu 1450 state UP

# View veth pairs connecting Pods to the bridge
ssh node-1 "bridge link show"
# veth1234@if2: <BROADCAST,MULTICAST,UP> master cni0

# View Pod CIDR allocation per node
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'
# node-1    10.244.1.0/24
# node-2    10.244.2.0/24
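
Because each node owns one Pod CIDR, finding the node that hosts a given Pod IP is a subnet membership check. The following sketch hard-codes the two allocations from the sample output above; `POD_CIDRS` and `node_for_pod_ip` are hypothetical names used only for illustration.

```python
import ipaddress

# Per-node Pod CIDRs, copied from the sample kubectl output (illustrative values).
POD_CIDRS = {
    "node-1": ipaddress.ip_network("10.244.1.0/24"),
    "node-2": ipaddress.ip_network("10.244.2.0/24"),
}


def node_for_pod_ip(pod_ip: str):
    """Return the node whose Pod CIDR contains pod_ip, or None if no CIDR matches."""
    ip = ipaddress.ip_address(pod_ip)
    for node, cidr in POD_CIDRS.items():
        if ip in cidr:
            return node
    return None


print(node_for_pod_ip("10.244.1.5"))        # node-1
print(node_for_pod_ip("10.244.2.3"))        # node-2
print(node_for_pod_ip("10.244.9.9"))        # None
print(POD_CIDRS["node-1"].num_addresses)    # 256
```

This is roughly the decision a routing table makes when BGP or cloud routes map each Pod CIDR to a node as its next hop.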

Services

Pods are ephemeral — they come and go, and their IPs change. A Service provides a stable virtual IP (ClusterIP) and DNS name that load-balances across a set of Pods selected by label. Services are the primary abstraction for service discovery in Kubernetes.

Service Types — Progressive Exposure
flowchart LR
    CI["ClusterIP\n(internal only)"] --> NP["NodePort\n(+ node IP:port)"]
    NP --> LB["LoadBalancer\n(+ external IP)"]
    style CI fill:#3B9797,color:#fff
    style NP fill:#16476A,color:#fff
    style LB fill:#BF092F,color:#fff
            

ClusterIP

The default Service type. Assigns a virtual IP reachable only within the cluster. kube-proxy programs iptables/IPVS rules to DNAT traffic to one of the backend Pod IPs.

# ClusterIP Service manifest
apiVersion: v1
kind: Service
metadata:
  name: my-api
  namespace: default
spec:
  type: ClusterIP
  selector:
    app: my-api        # Matches Pods with label app=my-api
  ports:
    - port: 80         # Service port (what clients connect to)
      targetPort: 8080 # Pod port (where the app listens)
      protocol: TCP
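
To make the selector semantics concrete, here is a minimal sketch of how a label selector and a targetPort turn into a list of backend endpoints. It is a simplified model, not the actual controller code; the `Pod` class, the Pod names, and `select_endpoints` are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    ip: str
    labels: dict = field(default_factory=dict)
    ready: bool = True


def select_endpoints(pods, selector, target_port):
    """Return "ip:port" for each ready Pod whose labels include every selector pair."""
    return [
        f"{pod.ip}:{target_port}"
        for pod in pods
        if pod.ready and all(pod.labels.get(k) == v for k, v in selector.items())
    ]


# Hypothetical Pods; only the first two are ready and labelled app=my-api.
pods = [
    Pod("my-api-1", "10.244.1.5", {"app": "my-api"}),
    Pod("my-api-2", "10.244.2.3", {"app": "my-api", "tier": "web"}),
    Pod("redis-1", "10.244.2.9", {"app": "redis"}),
    Pod("my-api-3", "10.244.1.7", {"app": "my-api"}, ready=False),
]

endpoints = select_endpoints(pods, {"app": "my-api"}, 8080)
print(endpoints)  # ['10.244.1.5:8080', '10.244.2.3:8080']
```

Extra labels on a Pod do not prevent a match, and a Pod that is not ready is left out of the backend list, so clients of the Service never need to track individual Pod IPs.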

NodePort

Exposes the Service on a static port (by default from the range 30000–32767) on every node's IP. External traffic that hits <NodeIP>:<NodePort> is routed to a backend Pod. Useful for development; in production, prefer LoadBalancer or Ingress.

LoadBalancer

On cloud providers, creates an external load balancer (AWS ELB/NLB, GCP LB, Azure LB) that routes traffic to NodePorts. The LB gets a public IP/DNS. This is the simplest way to expose a service to the internet — but each Service gets its own LB (expensive at scale; prefer Ingress for HTTP).

ExternalName

Maps a Service to an external DNS name (CNAME record). No proxying — just DNS resolution. Useful for abstracting external dependencies (e.g., a managed database) behind a Kubernetes-native DNS name.

Service Type               | Scope                | Port Range  | Use Case
---------------------------|----------------------|-------------|-------------------------------------------
ClusterIP                  | Internal only        | Any         | Inter-service communication within cluster
NodePort                   | Internal + Node IP   | 30000–32767 | Development, on-prem without LB
LoadBalancer               | External (public IP) | Any         | Exposing single service to internet
ExternalName               | DNS alias            | N/A         | Abstracting external dependencies
Headless (clusterIP: None) | Internal (no VIP)    | Any         | StatefulSets, direct Pod discovery via DNS

# List Services and their ClusterIPs
kubectl get svc
# NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)
# kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP
# my-api       ClusterIP   10.96.45.123    <none>        80/TCP

# Describe a Service — shows endpoints (backend Pod IPs)
kubectl describe svc my-api
# Endpoints: 10.244.1.5:8080, 10.244.2.3:8080

# View Endpoints resource directly
kubectl get endpoints my-api
# NAME     ENDPOINTS                          AGE
# my-api   10.244.1.5:8080,10.244.2.3:8080   5m

kube-proxy & IPVS

kube-proxy runs on every node and implements the Service abstraction by programming packet-forwarding rules. It watches the API server for Service/Endpoint changes and updates rules in real time.

Two modes:

  • iptables mode (default): Creates DNAT rules that pick a backend Pod at random. Rules are evaluated sequentially, so matching cost grows linearly with the number of Services and endpoints, and it struggles above ~5,000 Services.
  • IPVS mode: Uses Linux IPVS (IP Virtual Server) kernel module for L4 load balancing. O(1) lookup via hash tables, supports multiple algorithms (round-robin, least connections, source hashing). Recommended for large clusters.
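
The random selection in iptables mode is built from a chain of rules tried in order: each rule matches with some probability, and the last rule always matches. A sketch of the arithmetic follows, assuming the common scheme where rule i of n matches with probability 1/(n - i); the function names are illustrative.

```python
from fractions import Fraction


def rule_probabilities(n: int):
    """Per-rule match probability for n endpoints: 1/n, 1/(n-1), ..., 1."""
    return [Fraction(1, n - i) for i in range(n)]


def overall_probabilities(n: int):
    """Chance that each endpoint is finally chosen when rules are tried in order."""
    result = []
    remaining = Fraction(1)  # probability that no earlier rule matched
    for p in rule_probabilities(n):
        result.append(remaining * p)
        remaining *= 1 - p
    return result


print([str(p) for p in rule_probabilities(3)])     # ['1/3', '1/2', '1']
print([str(p) for p in overall_probabilities(3)])  # ['1/3', '1/3', '1/3']
```

For two endpoints the first rule gets probability 1/2 and the second is unconditional, so each backend still receives an equal share even though the per-rule probabilities differ.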
# View iptables rules created by kube-proxy for a Service
sudo iptables -t nat -L KUBE-SERVICES -n | grep my-api
# -d 10.96.45.123/32 -p tcp --dport 80 -j KUBE-SVC-XXXXXX

# Follow the chain to see backend selection
sudo iptables -t nat -L KUBE-SVC-XXXXXX -n
# -m statistic --mode random --probability 0.5 -j KUBE-SEP-AAAAAA
# -j KUBE-SEP-BBBBBB

# Each KUBE-SEP chain DNATs to a Pod IP
sudo iptables -t nat -L KUBE-SEP-AAAAAA -n
# -p tcp -j DNAT --to-destination 10.244.1.5:8080

# Check kube-proxy mode
kubectl -n kube-system get cm kube-proxy -o yaml | grep mode
# mode: "iptables"  (or "ipvs")

# IPVS mode: view virtual servers
sudo ipvsadm -Ln | head -20
# TCP  10.96.45.123:80 rr
#   -> 10.244.1.5:8080    Masq    1
#   -> 10.244.2.3:8080    Masq    1

Ingress & Ingress Controllers

An Ingress is a Kubernetes resource that defines HTTP/HTTPS routing rules — mapping hostnames and paths to backend Services. Unlike LoadBalancer Services (one LB per service), a single Ingress can route to many Services, making it cost-effective for HTTP workloads.

Ingress resources are declarative: they do nothing until an Ingress Controller (ingress-nginx, Traefik, HAProxy, AWS Load Balancer Controller, formerly the AWS ALB Ingress Controller, etc.) watches them and implements the rules.

# Ingress manifest — route by host and path
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
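    # Caution (ingress-nginx): without capture groups, this rewrites every
    # matched path to "/", so /users/42 reaches users-svc as "/".
    # Remove the annotation if backends expect the original path.
    nginx.ingress.kubernetes.io/rewrite-target: /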
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /users
            pathType: Prefix
            backend:
              service:
                name: users-svc
                port:
                  number: 80
          - path: /orders
            pathType: Prefix
            backend:
              service:
                name: orders-svc
                port:
                  number: 80
  tls:
    - hosts:
        - api.example.com
      secretName: api-tls-cert
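
The host and path rules in this manifest can be modelled as a small routing function. The sketch below is a simplified model of pathType: Prefix matching (whole path segments, longest match wins), not any controller's real implementation; `RULES`, `prefix_matches`, and `route` are names invented for the example.

```python
# Host -> list of (path prefix, backend Service), mirroring the manifest above.
RULES = {
    "api.example.com": [("/users", "users-svc"), ("/orders", "orders-svc")],
}


def prefix_matches(rule_path: str, request_path: str) -> bool:
    """Prefix matching compares whole path segments, not raw characters."""
    rule_parts = [p for p in rule_path.split("/") if p]
    request_parts = [p for p in request_path.split("/") if p]
    return request_parts[: len(rule_parts)] == rule_parts


def route(host: str, path: str):
    """Return the backend Service for a request, or None if no rule matches."""
    candidates = [
        (rule_path, service)
        for rule_path, service in RULES.get(host, [])
        if prefix_matches(rule_path, path)
    ]
    if not candidates:
        return None  # what happens next (default backend, 404) depends on the controller
    # The longest matching path takes precedence.
    return max(candidates, key=lambda c: len(c[0]))[1]


print(route("api.example.com", "/users/42"))  # users-svc
print(route("api.example.com", "/orders"))    # orders-svc
print(route("api.example.com", "/usersx"))    # None
print(route("other.example.com", "/users"))   # None
```

Note that /usersx does not match the /users rule, because the comparison is segment by segment rather than a plain string prefix.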
# List Ingress resources
kubectl get ingress
# NAME          CLASS   HOSTS             ADDRESS        PORTS     AGE
# app-ingress   nginx   api.example.com   34.56.78.90    80, 443   10m

# View Ingress controller pods
kubectl -n ingress-nginx get pods
# ingress-nginx-controller-xxxx   1/1   Running

# Check the Ingress controller's configuration
kubectl -n ingress-nginx exec deploy/ingress-nginx-controller -- cat /etc/nginx/nginx.conf | grep -A 5 "api.example.com"

NetworkPolicies

A NetworkPolicy is a firewall rule for Pods. It selects Pods via labels and defines allowed ingress/egress traffic. NetworkPolicies are enforced by the CNI plugin (Calico, Cilium, Weave — but NOT Flannel alone).

Default Behaviour: When no NetworkPolicy selects a Pod, all traffic to and from it is allowed (fully open). Once any NetworkPolicy selects a Pod, traffic in each direction that policy covers (Ingress, Egress, or both, as listed in policyTypes) is denied unless some policy explicitly allows it. Applying your first NetworkPolicy is therefore a breaking change if you haven't allowed the traffic you need. For a zero-trust posture, start with a "deny all" policy and add explicit allows.

# NetworkPolicy: deny all ingress to Pods in namespace "production"
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
  namespace: production
spec:
  podSelector: {}          # Selects ALL Pods in this namespace
  policyTypes:
    - Ingress              # Only affects ingress (egress unchanged)
  ingress: []              # Empty = no ingress allowed
# NetworkPolicy: allow traffic from frontend to backend on port 8080
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend         # Apply to Pods with app=backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend   # Allow from Pods with app=frontend
      ports:
        - protocol: TCP
          port: 8080
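
The way these two policies combine can be sketched as a small evaluation function. This is a deliberately simplified model: it assumes all Pods live in the production namespace, considers only ingress over TCP, and uses a flattened policy structure (`POLICIES`, `matches`, `ingress_allowed`) invented for this example rather than the real API schema.

```python
def matches(selector: dict, labels: dict) -> bool:
    """An empty selector ({}) matches every Pod."""
    return all(labels.get(k) == v for k, v in selector.items())


# Simplified stand-ins for the two manifests above.
POLICIES = [
    {"name": "deny-all-ingress", "pod_selector": {}, "ingress": []},
    {
        "name": "allow-frontend-to-backend",
        "pod_selector": {"app": "backend"},
        "ingress": [{"from": {"app": "frontend"}, "port": 8080}],
    },
]


def ingress_allowed(policies, src_labels, dst_labels, port) -> bool:
    """Policies are additive: traffic passes if any selecting policy allows it."""
    selecting = [p for p in policies if matches(p["pod_selector"], dst_labels)]
    if not selecting:
        return True  # no policy selects the destination Pod, so it is not isolated
    return any(
        matches(rule["from"], src_labels) and rule["port"] == port
        for policy in selecting
        for rule in policy["ingress"]
    )


backend = {"app": "backend"}
print(ingress_allowed(POLICIES, {"app": "frontend"}, backend, 8080))    # True
print(ingress_allowed(POLICIES, {"app": "monitoring"}, backend, 8080))  # False
print(ingress_allowed(POLICIES, {"app": "frontend"}, backend, 9090))    # False
print(ingress_allowed([], {"app": "monitoring"}, backend, 8080))        # True
```

There is no explicit deny rule anywhere: the deny-all policy only makes every Pod isolated, and the second policy adds one allowed path on top of it.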
# List NetworkPolicies in a namespace
kubectl get networkpolicies -n production
# NAME                        POD-SELECTOR   AGE
# deny-all-ingress            <none>         5m
# allow-frontend-to-backend   app=backend    3m

# Test connectivity — exec into a frontend pod and curl backend
kubectl exec -n production deploy/frontend -- curl -s --max-time 3 http://backend:8080/health
# {"status":"ok"}

# Test that other pods are blocked
kubectl exec -n production deploy/monitoring -- curl -s --max-time 3 http://backend:8080/health
# curl: (28) Connection timed out  (BLOCKED by NetworkPolicy)

CoreDNS & Service Discovery

CoreDNS is the cluster DNS server (runs as a Deployment in kube-system). Every Pod's /etc/resolv.conf points to the CoreDNS ClusterIP. Service discovery works via DNS — any Service is reachable by its name.

DNS record format: <service>.<namespace>.svc.cluster.local

# View CoreDNS pods and service
kubectl -n kube-system get pods -l k8s-app=kube-dns
kubectl -n kube-system get svc kube-dns
# NAME       TYPE        CLUSTER-IP   PORT(S)
# kube-dns   ClusterIP   10.96.0.10   53/UDP,53/TCP

# Resolve a Service name from inside a Pod
kubectl exec -it deploy/debug -- nslookup my-api
# Name:    my-api.default.svc.cluster.local
# Address: 10.96.45.123

# Resolve a Service in another namespace
kubectl exec -it deploy/debug -- nslookup redis.cache.svc.cluster.local
# Address: 10.96.78.90

# View a Pod's DNS configuration
kubectl exec deploy/debug -- cat /etc/resolv.conf
# nameserver 10.96.0.10
# search default.svc.cluster.local svc.cluster.local cluster.local
# options ndots:5

# Headless Service — returns individual Pod IPs (no ClusterIP)
kubectl exec deploy/debug -- nslookup my-statefulset-headless
# Name:    my-statefulset-headless.default.svc.cluster.local
# Address: 10.244.1.5
# Address: 10.244.2.3
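
The search list and ndots:5 option shown in resolv.conf explain why a short name such as my-api resolves. The sketch below approximates the order in which a glibc-style resolver tries candidate names; other resolver libraries may behave differently, and `candidate_names` is a hypothetical helper written for this example.

```python
# Values copied from the sample /etc/resolv.conf above.
SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]
NDOTS = 5


def candidate_names(name: str, search=SEARCH, ndots: int = NDOTS):
    """Approximate order of names tried for a lookup (glibc-style behaviour assumed)."""
    if name.endswith("."):
        return [name]  # a trailing dot marks the name as fully qualified
    expanded = [f"{name}.{suffix}." for suffix in search]
    absolute = f"{name}."
    if name.count(".") >= ndots:
        return [absolute] + expanded  # enough dots: try the name as-is first
    return expanded + [absolute]      # too few dots: walk the search list first


for candidate in candidate_names("my-api"):
    print(candidate)
# my-api.default.svc.cluster.local.
# my-api.svc.cluster.local.
# my-api.cluster.local.
# my-api.

print(candidate_names("redis.cache.svc.cluster.local")[-1])
# redis.cache.svc.cluster.local.
```

Under this model, redis.cache.svc.cluster.local has only four dots, so the search suffixes are tried before the name itself; appending a trailing dot skips the search list entirely.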

CNI Plugin Comparison

CNI Plugin  | Network Mode       | NetworkPolicy          | Key Features
------------|--------------------|------------------------|-----------------------------------------------------------------
Calico      | BGP, VXLAN, IPIP   | Yes (full)             | Most popular, eBPF dataplane option, WireGuard encryption
Cilium      | eBPF (no iptables) | Yes (L3-L7)            | eBPF-native, L7 policies, Hubble observability, service mesh
Flannel     | VXLAN, host-gw     | No                     | Simplest setup, no NetworkPolicy (pair with Calico for policies)
Weave Net   | VXLAN, sleeve      | Yes (basic)            | Encryption built-in, mesh topology, multicast support
AWS VPC CNI | Native VPC routing | Yes (via Calico addon) | Pods get real VPC IPs, no overlay overhead, security groups per Pod

Advanced Architecture

Service Mesh — When Kubernetes Networking Isn't Enough

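Kubernetes Services provide L4 load balancing and basic discovery. A service mesh (Istio, Linkerd, Cilium Service Mesh) adds L7 capabilities, traditionally via a sidecar proxy injected into every Pod (Envoy in Istio, linkerd2-proxy in Linkerd); some meshes, such as Cilium's, also offer sidecarless data planes. A mesh typically provides: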

  • mTLS — automatic mutual TLS between all services (zero-trust networking without app changes)
  • Traffic management — canary releases, circuit breaking, retries, timeouts at the mesh level
  • Observability — distributed tracing, golden metrics (latency, traffic, errors, saturation) per service
  • Policy — L7 authorization (allow GET /api/users but deny DELETE /api/users)

The trade-off: added latency (~1-2ms per hop), memory overhead (~50MB per sidecar), and operational complexity. Use a service mesh when you have 10+ microservices and need consistent security/observability without modifying application code.

Tags: Istio, Linkerd, Envoy, mTLS

Exercises

# Exercise 1: Inspect your cluster's Service networking
kubectl get svc --all-namespaces -o wide
kubectl get endpoints --all-namespaces | head -20

# Exercise 2: Trace DNS resolution inside a Pod
kubectl run debug --image=busybox:1.36 --rm -it --restart=Never -- nslookup kubernetes.default

# Exercise 3: View kube-proxy mode
kubectl -n kube-system get cm kube-proxy -o jsonpath='{.data.config\.conf}' | grep mode

# Exercise 4: Check Pod CIDR allocations
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{" → "}{.spec.podCIDR}{"\n"}{end}'

# Exercise 5: List NetworkPolicies and their selectors
kubectl get networkpolicies --all-namespaces -o wide

Conclusion & Next Steps

Kubernetes networking is built on a simple contract — every Pod gets a routable IP, and CNI plugins handle the implementation. Services provide stable endpoints and load balancing via kube-proxy (iptables/IPVS). Ingress consolidates HTTP routing behind a single entry point. NetworkPolicies enforce microsegmentation (deny-by-default once applied). CoreDNS makes everything discoverable by name. Understanding these layers — and which component is responsible for each — is the key to debugging connectivity issues in production clusters.