The Kubernetes Networking Model
Four Fundamental Requirements
Kubernetes imposes a simple but strict networking model. Every implementation (CNI plugin) must satisfy these four requirements:
- Every pod gets its own unique IP address
- Pods on the same node can communicate without NAT
- Pods on different nodes can communicate without NAT
- Agents on a node (kubelet, kube-proxy) can communicate with all pods on that node
flowchart TD
subgraph Node-1 [Node 1 — 192.168.1.10]
P1[Pod A<br/>10.244.1.2]
P2[Pod B<br/>10.244.1.3]
end
subgraph Node-2 [Node 2 — 192.168.1.11]
P3[Pod C<br/>10.244.2.2]
P4[Pod D<br/>10.244.2.3]
end
P1 <-->|Direct IP, no NAT| P2
P1 <-->|Direct IP, no NAT| P3
P2 <-->|Direct IP, no NAT| P4
P3 <-->|Direct IP, no NAT| P4
IP Address Allocation
Kubernetes uses three separate IP ranges (CIDRs) that must not overlap:
| Network | Typical CIDR | Purpose | Managed By |
|---|---|---|---|
| Node network | 192.168.0.0/16 | Physical/VM node IPs | Infrastructure (DHCP, cloud) |
| Pod network | 10.244.0.0/16 | Pod IP addresses | CNI plugin |
| Service network | 10.96.0.0/12 | ClusterIP Services (virtual) | kube-apiserver (allocation), kube-proxy (iptables/IPVS rules) |
# View the three network ranges in a cluster:
# Node network:
kubectl get nodes -o wide
# NAME INTERNAL-IP STATUS
# master-1 192.168.1.10 Ready
# worker-1 192.168.1.11 Ready
# worker-2 192.168.1.12 Ready
# Pod network (CIDR per node):
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'
# master-1 10.244.0.0/24 (256 addresses for this node; a few are reserved, e.g. the bridge gateway)
# worker-1 10.244.1.0/24
# worker-2 10.244.2.0/24
# Service network:
kubectl cluster-info dump | grep -m 1 service-cluster-ip-range
# --service-cluster-ip-range=10.96.0.0/12
# Pod IPs:
kubectl get pods -o wide
# NAME IP NODE
# nginx-abc12 10.244.1.15 worker-1
# redis-def34 10.244.2.22 worker-2
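The non-overlap rule can be checked mechanically before a cluster is built. The following is a minimal sketch using Python's standard `ipaddress` module; the CIDRs are the typical values from the table above, and the helper names `find_overlaps` and `node_pod_cidrs` are made up for this illustration. The /24-per-node split mirrors the sample output above and is an assumption, since the per-node block size is configurable.

```python
import ipaddress
from itertools import combinations

# CIDRs taken from the table above; substitute your cluster's real ranges.
ranges = {
    "node": ipaddress.ip_network("192.168.0.0/16"),
    "pod": ipaddress.ip_network("10.244.0.0/16"),
    "service": ipaddress.ip_network("10.96.0.0/12"),
}

def find_overlaps(networks):
    """Return pairs of range names whose address spaces intersect."""
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(sorted(networks.items()), 2)
        if net_a.overlaps(net_b)
    ]

def node_pod_cidrs(pod_network, prefix=24):
    """Split the pod network into per-node blocks of the given prefix length."""
    return list(pod_network.subnets(new_prefix=prefix))

overlaps = find_overlaps(ranges)
per_node = node_pod_cidrs(ranges["pod"])

print("overlaps:", overlaps)
print("first node blocks:", [str(block) for block in per_node[:3]])
print("node blocks available:", len(per_node))
```

With a /16 pod network and /24 blocks, the split yields 256 node blocks, which puts an upper bound on cluster size for this particular layout.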
Pod-to-Pod Communication
Same Node Communication
When two pods on the same node communicate, traffic never leaves the host. With bridge-based CNI plugins it flows through a virtual bridge (typically cbr0 or cni0):
flowchart TD
subgraph Node [Worker Node]
subgraph NS1 [Pod A Network Namespace]
E1[eth0: 10.244.1.2]
end
subgraph NS2 [Pod B Network Namespace]
E2[eth0: 10.244.1.3]
end
V1[veth-pod-a] --- BR[Bridge cni0<br/>10.244.1.1]
V2[veth-pod-b] --- BR
E1 --- V1
E2 --- V2
end
# Observe same-node networking from inside a node:
# List bridge interfaces:
ip link show type bridge
# 3: cni0: mtu 1500 state UP
# List virtual ethernet pairs connected to the bridge:
bridge link show
# 5: vethXXXXXX@if4: master cni0
# 7: vethYYYYYY@if6: master cni0
# See the bridge's IP (gateway for pods on this node):
ip addr show cni0
# inet 10.244.1.1/24 scope global cni0
# From inside Pod A (10.244.1.2), reach Pod B (10.244.1.3):
# Traffic path: Pod A eth0 → veth → cni0 bridge → veth → Pod B eth0
# All within the kernel — extremely fast (no network hops)
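Whether a packet stays on the bridge comes down to one question: does the destination fall inside this node's pod CIDR? The sketch below illustrates that decision for a bridge-based setup, using the sample addresses from the output above. `LOCAL_POD_CIDR`, `BRIDGE_NAME`, and `delivery_path` are hypothetical names for this example, not part of any CNI plugin.

```python
import ipaddress

# Hypothetical values matching the sample output above.
LOCAL_POD_CIDR = ipaddress.ip_network("10.244.1.0/24")  # this node's pod block
BRIDGE_NAME = "cni0"

def delivery_path(dst_ip, local_cidr=LOCAL_POD_CIDR):
    """Describe where a packet sent by a local pod goes next."""
    dst = ipaddress.ip_address(dst_ip)
    if dst in local_cidr:
        # Same subnet: the pod resolves the peer with ARP and the bridge switches the frame.
        return f"same node: switched on {BRIDGE_NAME}"
    # Different subnet: the packet goes to the gateway and the host routing table takes over.
    return "other node: handed to the host routing table"

print(delivery_path("10.244.1.3"))  # Pod B, same node
print(delivery_path("10.244.2.2"))  # a pod on another node
```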
Cross-Node Communication
When pods on different nodes communicate, the CNI plugin must route traffic across the physical network. Different CNI plugins use different strategies:
| Strategy | How It Works | CNI Example | Overhead |
|---|---|---|---|
| Overlay (VXLAN) | Encapsulate pod packets inside UDP between nodes | Flannel, Calico (optional) | ~50 bytes per packet |
| Direct routing (BGP) | Program host routes so nodes know pod CIDRs | Calico (default), Cilium | Zero (native routing) |
| Cloud native | Use cloud VPC routing tables | AWS VPC CNI, Azure CNI | Zero (VPC-native) |
| eBPF | Kernel-level packet processing | Cilium | Minimal (bypass iptables) |
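The roughly 50 bytes of VXLAN overhead in the table can be derived from the headers that wrap each pod packet. The following sketch does the arithmetic for an IPv4 underlay, which is an assumption (an IPv6 underlay adds 20 more bytes); the constant and function names are illustrative.

```python
# Bytes added around each pod packet by VXLAN over an IPv4 underlay.
OUTER_IPV4 = 20      # outer IP header between the two nodes
OUTER_UDP = 8        # VXLAN is carried in UDP
VXLAN_HEADER = 8     # holds the VXLAN network identifier (VNI)
INNER_ETHERNET = 14  # the pod's original Ethernet header travels inside the tunnel

ENCAP_OVERHEAD = OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET

def overlay_mtu(underlay_mtu):
    """Largest pod IP packet that fits in one underlay packet without fragmentation."""
    return underlay_mtu - ENCAP_OVERHEAD

def overhead_percent(pod_packet_bytes):
    """Encapsulation overhead relative to the size of the pod packet."""
    return 100 * ENCAP_OVERHEAD / pod_packet_bytes

print("overhead bytes:", ENCAP_OVERHEAD)
print("overlay MTU on a 1500-byte underlay:", overlay_mtu(1500))
print("overhead for a full-size packet (%):", round(overhead_percent(1450), 1))
print("overhead for a 100-byte packet (%):", round(overhead_percent(100), 1))
```

The fixed cost matters most for small packets: it is a few percent of a full-size packet but half the size of a 100-byte one.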
Network Namespaces & veth Pairs
Each pod gets its own Linux network namespace — an isolated network stack with its own interfaces, routing table, and IP addresses. The pod's namespace connects to the host via a virtual ethernet (veth) pair:
# Inspect pod network namespace:
# Find the pod's container ID:
CONTAINER_ID=$(kubectl get pod nginx-pod -o jsonpath='{.status.containerStatuses[0].containerID}' | cut -d'/' -f3)
# Get the network namespace path:
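# (.info.pid is where containerd reports the PID; other runtimes may differ)
PID=$(crictl inspect "$CONTAINER_ID" | jq -r .info.pid)
ls -la "/proc/$PID/ns/net"
# Enter the pod's network namespace (requires root on the node):
nsenter -t "$PID" -n ip addr show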
# 1: lo: inet 127.0.0.1/8
# 3: eth0@if7: inet 10.244.1.15/24
# ↑ This "if7" means the other end is interface index 7 on the host
# From the host, find the matching veth:
ip link show | grep "^7:"
# 7: veth12345678@if3: master cni0
# The veth pair:
# Pod namespace: eth0 (index 3) ←→ Host: veth12345678 (index 7)
# Connected to bridge cni0 on the host side
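The index matching described above can be automated. Here is a small sketch that parses `ip link`-style output from both namespaces and pairs the pod's `eth0` with its host-side veth. The sample text is modeled on the output shown above, with interface flags added for realism; `parse_links` and `host_peer` are hypothetical helper names.

```python
import re

# Matches the start of an ip(8) record such as "3: eth0@if7:" or "1: lo:".
LINK_RE = re.compile(r"^(?P<index>\d+):\s+(?P<name>[^@:\s]+)(?:@if(?P<peer>\d+))?:")

def parse_links(output):
    """Return {ifindex: (name, peer_ifindex or None)} parsed from ip(8) output."""
    links = {}
    for line in output.splitlines():
        match = LINK_RE.match(line)
        if match:
            peer = int(match.group("peer")) if match.group("peer") else None
            links[int(match.group("index"))] = (match.group("name"), peer)
    return links

def host_peer(pod_output, host_output, pod_iface="eth0"):
    """Find the host-side interface paired with pod_iface, or None."""
    pod_links = parse_links(pod_output)
    host_links = parse_links(host_output)
    for index, (name, peer) in pod_links.items():
        if name == pod_iface and peer in host_links:
            host_name, host_peer_index = host_links[peer]
            # The two ends of a veth pair point back at each other by index.
            if host_peer_index == index:
                return host_name
    return None

pod_side = """1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536
3: eth0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500"""
host_side = """2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
7: veth12345678@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master cni0"""

print(host_peer(pod_side, host_side))
```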
CNI Plugins
The CNI Interface
The Container Network Interface (CNI) is a specification that defines how network plugins interact with container runtimes. When kubelet creates a pod, it calls the CNI plugin to set up networking:
# CNI plugin lifecycle:
# 1. kubelet creates pod sandbox (pause container)
# 2. kubelet calls CNI ADD — plugin assigns IP, creates veth, sets routes
# 3. Pod runs with networking configured
# 4. Pod terminates → kubelet calls CNI DEL — plugin cleans up
# CNI configuration location:
ls /etc/cni/net.d/
# 10-calico.conflist (or 10-flannel.conflist, etc.)
# CNI binary location:
ls /opt/cni/bin/
# bandwidth bridge calico calico-ipam flannel host-local loopback portmap
# Example CNI config (Calico):
cat /etc/cni/net.d/10-calico.conflist
# {
# "name": "k8s-pod-network",
# "cniVersion": "0.3.1",
# "plugins": [
# { "type": "calico", "datastore_type": "kubernetes", ... },
# { "type": "bandwidth", ... },
# { "type": "portmap", ... }
# ]
# }
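A conflist is a chain: the runtime invokes each plugin's binary in list order on ADD and in reverse order on DEL, and each `type` value is the name of an executable in the CNI binary directory. The sketch below reads a trimmed-down, hypothetical conflist (the per-plugin settings are placeholders, not Calico's real defaults) and checks that every plugin in the chain has a binary installed. The `installed` set is copied from the `ls /opt/cni/bin/` listing above.

```python
import json

# Hypothetical, trimmed-down conflist; real files carry more per-plugin settings.
CONFLIST = """
{
  "name": "k8s-pod-network",
  "cniVersion": "0.3.1",
  "plugins": [
    {"type": "calico", "datastore_type": "kubernetes"},
    {"type": "bandwidth", "capabilities": {"bandwidth": true}},
    {"type": "portmap", "capabilities": {"portMappings": true}}
  ]
}
"""

def plugin_chain(conflist_text):
    """Return plugin types in the order the runtime would call them on ADD."""
    conf = json.loads(conflist_text)
    return [plugin["type"] for plugin in conf.get("plugins", [])]

def missing_binaries(chain, installed):
    """List plugin types that have no matching executable installed."""
    return [name for name in chain if name not in installed]

installed = {"bandwidth", "bridge", "calico", "calico-ipam",
             "flannel", "host-local", "loopback", "portmap"}

chain = plugin_chain(CONFLIST)
print("ADD order:", chain)
print("DEL order:", list(reversed(chain)))
print("missing binaries:", missing_binaries(chain, installed))
```

A missing binary is one common reason for pods hanging in ContainerCreating, so a check like this can be a useful first step when sandbox creation fails.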
Calico
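Calico is one of the most widely used CNI plugins for production Kubernetes. It distributes pod routes with BGP and can forward traffic natively with no encapsulation overhead (IP-in-IP or VXLAN overlays are available where the underlying network cannot route pod IPs), and it provides powerful network policies: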
# Calico architecture:
# - calico-node (DaemonSet): runs on every node
# - Felix: programs routes and network policies into iptables/eBPF
# - BIRD: BGP daemon, advertises pod routes to other nodes
# - calico-kube-controllers: watches Kubernetes API for policy changes
# - calico-typha (optional): fan-out proxy between the datastore and Felix, recommended for clusters beyond roughly 50 nodes
# Check Calico status:
kubectl get pods -n calico-system
# NAME READY STATUS
# calico-node-abc12 1/1 Running
# calico-node-def34 1/1 Running
# calico-kube-controllers-6f7g8h9j-xyz99 1/1 Running
# Calico node status:
kubectl exec -n calico-system calico-node-abc12 -- calico-node -bird-ready
# calico/node is ready.
# View BGP peering:
kubectl exec -n calico-system calico-node-abc12 -- birdcl show protocols
# BIRD 2.0.8 ready.
# Name Proto State Info
# node_192_168_1_11 BGP up Established
# node_192_168_1_12 BGP up Established
Cilium
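Cilium uses eBPF (extended Berkeley Packet Filter) to implement networking and security inside the Linux kernel. With its kube-proxy replacement enabled, it can bypass iptables for Service handling, which avoids long sequential rule chains and scales better as the number of Services grows: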
# Cilium provides:
# - eBPF-based networking (no iptables, O(1) routing)
# - L3/L4/L7 network policies (filter by HTTP path, gRPC method)
# - Transparent encryption (WireGuard or IPsec)
# - Hubble: distributed observability (network flow logs)
# - Service mesh (without sidecars)
# Check Cilium status:
cilium status
# KVStore: Ok Disabled
# Kubernetes: Ok 1.30 (v1.30.0)
# Nodes: 5/5 reachable
# IPAM: 10.0.0.0/8 available
# Encryption: WireGuard (Enabled)
# View network flows with Hubble:
hubble observe --namespace production --protocol http
# TIMESTAMP SOURCE DESTINATION TYPE VERDICT
# May 14 10:23:45 payment/pod-abc inventory/pod-def HTTP FORWARDED
# GET /api/v1/stock → 200 OK (4.2ms)
Flannel
Flannel is the simplest CNI plugin — it only provides basic connectivity (no network policies). It uses VXLAN overlay by default, encapsulating pod traffic in UDP packets between nodes:
# Flannel: simplicity over features
# - VXLAN overlay (encapsulates packets)
# - No network policy support (use with Calico for policies)
# - Minimal resource usage
# - Best for: learning, small clusters, simple requirements
# Flannel components:
# - flanneld DaemonSet on each node
# - Stores subnet allocation in Kubernetes API (or etcd)
# - Creates flannel.1 VXLAN interface on each node
# Check Flannel:
kubectl get pods -n kube-flannel
# NAME READY STATUS
# kube-flannel-ds-abc12 1/1 Running
# kube-flannel-ds-def34 1/1 Running
# VXLAN interface on each node:
ip link show flannel.1
# flannel.1: mtu 1450 type vxlan
CNI Comparison
| Feature | Calico | Cilium | Flannel | AWS VPC CNI |
|---|---|---|---|---|
| Routing | BGP (native) | eBPF / VXLAN | VXLAN overlay | VPC native |
| Network Policies | L3/L4 + L7 (enterprise) | L3/L4/L7 (native) | None | L3/L4 (with Calico) |
| Encryption | WireGuard | WireGuard / IPsec | None | VPC encryption |
| Performance | Excellent | Best (eBPF) | Good | Excellent |
| Complexity | Medium | Medium-High | Low | Low (AWS only) |
| Best for | General production | Large scale, security | Learning, simple needs | AWS EKS |
Network Policies
Default Allow & Zero Trust
By default, Kubernetes allows all traffic between all pods: any pod can talk to any other pod in the cluster. Network Policies let you restrict this to implement zero-trust networking, provided the CNI plugin enforces them (Flannel on its own does not):
# Default deny all ingress traffic in a namespace:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}   # Applies to ALL pods in namespace
  policyTypes:
    - Ingress       # Block all incoming traffic
  # No ingress rules = deny everything
Ingress & Egress Rules
# Allow specific traffic to the payment service:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-payment-traffic
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: payment   # This policy applies to payment pods
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow traffic FROM the API gateway only
    - from:
        - podSelector:
            matchLabels:
              app: api-gateway
      ports:
        - protocol: TCP
          port: 8080
    # Allow traffic FROM monitoring (different namespace)
    - from:
        - namespaceSelector:
            matchLabels:
              team: platform
          podSelector:
            matchLabels:
              app: prometheus
      ports:
        - protocol: TCP
          port: 9090
  egress:
    # Allow connections TO the database
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
    # Allow DNS (required for service discovery)
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
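The selector semantics in this policy are easy to misread: rules in the `ingress` list are ORed, while a `namespaceSelector` and `podSelector` placed in the same `from` entry are ANDed. The following sketch models only the ingress half of the policy above, to make that concrete. It is a simplified evaluator written for illustration, not how any CNI plugin implements enforcement: it handles `matchLabels` only (no `matchExpressions`, `ipBlock`, or named ports) and assumes the destination pod is already selected by the policy.

```python
def labels_match(selector, labels):
    """matchLabels semantics: every selector pair must be present; {} matches all."""
    wanted = selector.get("matchLabels", {})
    return all(labels.get(key) == value for key, value in wanted.items())

def peer_matches(peer, src, policy_namespace):
    """Evaluate one entry of a from: list. Selectors in the same entry are ANDed."""
    ns_sel = peer.get("namespaceSelector")
    pod_sel = peer.get("podSelector")
    if ns_sel is None:
        # A podSelector on its own only sees pods in the policy's own namespace.
        ns_ok = src["namespace"] == policy_namespace
    else:
        ns_ok = labels_match(ns_sel, src["namespace_labels"])
    pod_ok = pod_sel is None or labels_match(pod_sel, src["pod_labels"])
    return ns_ok and pod_ok

def port_matches(ports, protocol, port):
    """No ports listed means every port is allowed."""
    if not ports:
        return True
    return any(
        entry.get("protocol", "TCP") == protocol
        and ("port" not in entry or entry["port"] == port)
        for entry in ports
    )

def ingress_allowed(policy, src, protocol, port):
    """True if any ingress rule admits the connection (rules are ORed)."""
    namespace = policy["metadata"]["namespace"]
    for rule in policy["spec"].get("ingress", []):
        peers = rule.get("from")
        peer_ok = not peers or any(peer_matches(p, src, namespace) for p in peers)
        if peer_ok and port_matches(rule.get("ports"), protocol, port):
            return True
    return False

# Ingress section of the allow-payment-traffic policy, as a Python dict.
POLICY = {
    "metadata": {"name": "allow-payment-traffic", "namespace": "production"},
    "spec": {"ingress": [
        {"from": [{"podSelector": {"matchLabels": {"app": "api-gateway"}}}],
         "ports": [{"protocol": "TCP", "port": 8080}]},
        {"from": [{"namespaceSelector": {"matchLabels": {"team": "platform"}},
                   "podSelector": {"matchLabels": {"app": "prometheus"}}}],
         "ports": [{"protocol": "TCP", "port": 9090}]},
    ]},
}

def source(namespace, namespace_labels, pod_labels):
    return {"namespace": namespace, "namespace_labels": namespace_labels,
            "pod_labels": pod_labels}

# Hypothetical source pods; the "monitoring" namespace and "grafana" label are made up.
gateway = source("production", {}, {"app": "api-gateway"})
prometheus = source("monitoring", {"team": "platform"}, {"app": "prometheus"})
grafana = source("monitoring", {"team": "platform"}, {"app": "grafana"})

print(ingress_allowed(POLICY, gateway, "TCP", 8080))
print(ingress_allowed(POLICY, prometheus, "TCP", 9090))
print(ingress_allowed(POLICY, grafana, "TCP", 9090))
```

The third call is denied because the pod sits in a matching namespace but lacks the `app: prometheus` label. Had `podSelector` been written as a separate list item (with its own leading dash), the two selectors would be ORed and the rule would admit far more traffic.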
Common Policy Patterns
Production Network Policy Approach
- Default deny: Apply deny-all ingress and egress per namespace
- Allow DNS: Every pod needs DNS — allow egress to kube-dns
- Allow required paths: Explicitly allow each service-to-service path
- Allow monitoring: Prometheus scraping from monitoring namespace
- Allow ingress controller: External traffic from ingress namespace
This creates an allowlist model: only explicitly allowed traffic flows. Any new service must have its policies defined before it can communicate.
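The first two steps of this approach are identical for every namespace, so they lend themselves to generation. Below is a minimal sketch that emits a default-deny policy and a DNS egress allowance per namespace as JSON, which `kubectl apply -f` accepts alongside YAML. The policy names, the `staging` namespace, and the helper functions are assumptions for illustration.

```python
import json

def default_deny(namespace):
    """NetworkPolicy that selects every pod and allows no ingress or egress."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }

def allow_dns_egress(namespace):
    """NetworkPolicy that lets every pod in the namespace query kube-dns."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-dns-egress", "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Egress"],
            "egress": [{
                "to": [{
                    "namespaceSelector": {},
                    "podSelector": {"matchLabels": {"k8s-app": "kube-dns"}},
                }],
                "ports": [
                    {"protocol": "UDP", "port": 53},
                    {"protocol": "TCP", "port": 53},
                ],
            }],
        },
    }

def baseline(namespaces):
    """Bundle the baseline policies for all namespaces into one List object."""
    items = []
    for namespace in namespaces:
        items.append(default_deny(namespace))
        items.append(allow_dns_egress(namespace))
    return {"apiVersion": "v1", "kind": "List", "items": items}

manifest = baseline(["production", "staging"])
print(json.dumps(manifest, indent=2))
```

Redirecting the output to a file and applying it gives each namespace the same starting point. The service-specific allow rules still have to be written by hand, because they depend on each application's actual traffic paths.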
Cluster DNS (CoreDNS)
DNS Resolution Flow
CoreDNS runs as a Deployment in the kube-system namespace and provides DNS resolution for all Services and Pods in the cluster:
sequenceDiagram
participant Pod as Application Pod
participant Res as Pod's /etc/resolv.conf
participant DNS as CoreDNS (10.96.0.10)
participant API as Kubernetes API
Pod->>Res: Resolve "payment-svc"
Res->>DNS: Query payment-svc.default.svc.cluster.local
DNS->>API: Look up Service endpoints
API->>DNS: ClusterIP: 10.96.45.12
DNS->>Pod: A record: 10.96.45.12
Pod->>Pod: Connect to 10.96.45.12:80
# Every pod gets DNS configured automatically:
kubectl exec nginx-pod -- cat /etc/resolv.conf
# nameserver 10.96.0.10 ← CoreDNS ClusterIP
# search default.svc.cluster.local svc.cluster.local cluster.local
# options ndots:5
# The "search" line enables short names:
# "payment-svc" → tries payment-svc.default.svc.cluster.local first
# "payment-svc.production" → tries payment-svc.production.svc.cluster.local
# DNS record types Kubernetes creates:
# A record: payment-svc.default.svc.cluster.local → 10.96.45.12
# SRV record: _http._tcp.payment-svc.default.svc.cluster.local → port 80
# PTR record: 12.45.96.10.in-addr.arpa → payment-svc.default.svc.cluster.local
# Headless Service DNS (returns pod IPs directly):
# A record: postgres-headless.default.svc.cluster.local → 10.244.1.5, 10.244.2.8
# A record: postgres-0.postgres-headless.default.svc.cluster.local → 10.244.1.5
# CoreDNS configuration:
kubectl get configmap coredns -n kube-system -o yaml
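The interaction between the `search` list and `ndots` determines how many queries a single lookup can generate. The sketch below approximates the order in which a glibc-style resolver tries candidate names. It is a simplified model: other resolvers, such as musl in Alpine-based images, behave differently, and `candidate_queries` is a hypothetical helper rather than a real resolver API.

```python
def candidate_queries(name, search, ndots=5):
    """Return the fully qualified names a resolver would try, in order."""
    if name.endswith("."):
        return [name]  # already fully qualified, search list is skipped
    expanded = [f"{name}.{domain}." for domain in search]
    absolute = f"{name}."
    if name.count(".") >= ndots:
        # Enough dots: try the name as written first, then the search list.
        return [absolute] + expanded
    # Too few dots: walk the search list first, the literal name comes last.
    return expanded + [absolute]

SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

for query in candidate_queries("payment-svc", SEARCH):
    print(query)

external_default = candidate_queries("api.example.com", SEARCH, ndots=5)
external_tuned = candidate_queries("api.example.com", SEARCH, ndots=2)
print("queries before the real name, ndots=5:", external_default.index("api.example.com."))
print("queries before the real name, ndots=2:", external_tuned.index("api.example.com."))
```

With the default `ndots:5`, an external name such as `api.example.com` is first tried against all three cluster search domains. That is why lowering `ndots`, or writing external names with a trailing dot, reduces DNS load.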
DNS Debugging
# DNS debugging toolkit:
# 1. Test DNS resolution from inside a pod:
kubectl run dns-test --image=busybox:1.36 --rm -it -- nslookup payment-svc
# Server: 10.96.0.10
# Address: 10.96.0.10:53
# Name: payment-svc.default.svc.cluster.local
# Address: 10.96.45.12
# 2. Test with dig for detailed output:
kubectl run dig-test --image=tutum/dnsutils --rm -it -- dig payment-svc.default.svc.cluster.local
# 3. Check CoreDNS pod health:
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=20
# 4. Common DNS and related networking issues:
# - Pod stuck in ContainerCreating → CNI plugin failing
# - DNS resolution timeout → CoreDNS pods not running or overloaded
# - "NXDOMAIN" → Service name typo or wrong namespace
# - ndots:5 causing slow resolution → set dnsConfig in pod spec
# 5. Override DNS config for specific pods:
# spec:
#   dnsPolicy: ClusterFirst
#   dnsConfig:
#     options:
#       - name: ndots
#         value: "2"   # Reduce DNS lookups for external names
Exercises
Use kubectl exec to: (a) Check the pod's IP and default route, (b) Ping the other pod by IP, (c) Trace the route between them, (d) Identify which CNI plugin your cluster uses and how it routes cross-node traffic.
(a) What does nslookup my-service return? (b) What does nslookup my-headless-service return? (c) How would a client use each?
Conclusion
Kubernetes networking creates an elegant abstraction — a flat network where every pod can reach every other pod — but the implementation underneath is sophisticated. Key takeaways:
- Flat network model: Every pod gets a unique IP, no NAT between pods
- CNI plugins handle the complexity: Choose based on your needs — Calico for general use, Cilium for scale/security, Flannel for simplicity
- Network Policies are essential: Default-deny + explicit allow = zero-trust networking
- DNS is automatic: CoreDNS provides service discovery without manual configuration
- Three separate CIDRs: Node, Pod, and Service networks must not overlap
In Part 9, we'll build on this networking foundation with Services, Ingress controllers, and Service Mesh — the higher-level abstractions that expose your applications to the outside world and manage inter-service communication.