Docker Networking Is Linux Networking
Every "Docker network" you create is really a combination of standard Linux kernel features orchestrated by the Docker daemon. There's no magic — just automation of primitives that network engineers have used for years:
- Network namespaces — isolated network stacks (interfaces, routing tables, iptables rules)
- Virtual Ethernet pairs (veth) — virtual cables connecting namespaces
- Linux bridges — software switches connecting multiple veth endpoints
- iptables — packet filtering, NAT, and port forwarding
- VXLAN tunnels — encapsulation for multi-host overlay networks
When docker network create fails, when containers can't reach each other, or when port mapping doesn't work, the debugging happens at the Linux level. Knowing these primitives lets you diagnose issues that Docker's error messages never explain.
Let's build a container network from scratch using only Linux commands, then see how Docker automates the same process.
Virtual Ethernet Pairs (veth)
A veth pair is like a virtual Ethernet cable with a connector at each end. Whatever goes in one end comes out the other. Docker places one end inside the container's network namespace (as eth0) and the other end on the host's bridge (as vethXXXXXX).
flowchart LR
subgraph CNS["Container Network Namespace"]
ETH0["eth0<br/>172.17.0.2/16"]
end
subgraph HNS["Host Network Namespace"]
VETH["veth7a3b2c1<br/>(no IP)"]
BR["docker0 bridge<br/>172.17.0.1/16"]
HOST_ETH["eth0<br/>192.168.1.10"]
end
ETH0 <-->|"veth pair"| VETH
VETH --- BR
BR --- HOST_ETH
Creating veth Pairs Manually
Let's recreate what Docker does behind the scenes — manually creating a network namespace, veth pair, and bridge connection:
# Create a network namespace (what Docker does for each container)
sudo ip netns add container1
# Create a veth pair
sudo ip link add veth-host type veth peer name veth-container
# Move one end into the container's namespace
sudo ip link set veth-container netns container1
# Assign an IP inside the namespace (container's perspective)
sudo ip netns exec container1 ip addr add 172.18.0.2/24 dev veth-container
sudo ip netns exec container1 ip link set veth-container up
sudo ip netns exec container1 ip link set lo up
# Set up the host end
sudo ip link set veth-host up
# Verify — from inside the namespace
sudo ip netns exec container1 ip addr show
# => veth-container: inet 172.18.0.2/24
# Verify — from the host
ip link show veth-host
# => veth-host:
# See Docker's veth pairs for running containers
# Find the veth pair for a specific container
CONTAINER_PID=$(docker inspect --format '{{.State.Pid}}' my-container)
sudo nsenter -t $CONTAINER_PID -n ip link show eth0
# => eth0@if42:
# The "if42" tells you the host-side interface index is 42
# Find which host interface has index 42
ip link | grep "^42:"
# => 42: veth7a3b2c1@if41:
# Shortcut: list all veth pairs on the host
ip link show type veth
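The "@ifN" lookup above can be scripted. The following is a minimal sketch, assuming the interface name is passed in the NAME@ifN form that ip link prints; peer_ifindex is a hypothetical helper name, not a Docker or iproute2 command.

```shell
#!/usr/bin/env bash
# peer_ifindex: extract the peer's interface index from a name such as "eth0@if42".
# (Hypothetical helper; the input is the NAME@ifN token printed by `ip link`.)
peer_ifindex() {
  local name="${1%:}"            # drop a trailing colon if present
  case "$name" in
    *@if[0-9]*) printf '%s\n' "${name##*@if}" ;;
    *) return 1 ;;               # no peer index in this name
  esac
}

peer_ifindex "eth0@if42:"        # prints 42
```

The printed index can then be fed to the grep shown above (ip link | grep "^42:") to find the host-side veth.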
Linux Bridges
A Linux bridge is a software switch — it learns MAC addresses and forwards Ethernet frames between connected interfaces. Docker's docker0 is just a Linux bridge with veth pairs plugged into it.
# Show all bridge ports (interfaces attached to a bridge) on the system
bridge link show
# => veth7a3b2c1 state forwarding ... master docker0
# => vethf4e5d6c state forwarding ... master docker0
# Alternative: use brctl (bridge-utils package)
brctl show
# bridge name bridge id STP enabled interfaces
# docker0 8000.0242c0a80001 no veth7a3b2c1
# vethf4e5d6c
# Show the bridge's MAC address table (learned addresses)
bridge fdb show br docker0
# => 02:42:ac:11:00:02 dev veth7a3b2c1 master docker0
# => 02:42:ac:11:00:03 dev vethf4e5d6c master docker0
# Bridge details including STP state
ip -d link show docker0
# => docker0:
# => bridge forward_delay 1500 hello_time 200 max_age 2000
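In the sample FDB output above, the last four octets of each MAC (ac:11:00:02) match the container's IPv4 address (172.17.0.2). The sketch below decodes that pattern, assuming the MAC was derived from the IP in this way; that may not hold on every Docker version or when a MAC is set explicitly. mac_to_ipv4 is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# mac_to_ipv4: interpret the last four octets of a MAC address as an IPv4 address.
# Assumption: the MAC follows the 02:42:<ip in hex> pattern seen in the sample output.
mac_to_ipv4() {
  # Split on ":" inside a subshell so the caller's IFS is left untouched.
  (
    IFS=:
    set -- $1
    [ "$#" -eq 6 ] || exit 1
    printf '%d.%d.%d.%d\n' "0x$3" "0x$4" "0x$5" "0x$6"
  )
}

mac_to_ipv4 "02:42:ac:11:00:02"   # prints 172.17.0.2
```

This is only a convenience for mapping FDB entries back to containers; docker inspect remains the authoritative source for a container's addresses.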
flowchart TD
subgraph NETNS1["Container A netns"]
A_ETH["eth0 172.17.0.2"]
end
subgraph NETNS2["Container B netns"]
B_ETH["eth0 172.17.0.3"]
end
subgraph NETNS3["Container C netns"]
C_ETH["eth0 172.17.0.4"]
end
subgraph HOST["Host Network Namespace"]
VETHA["vethAAA"]
VETHB["vethBBB"]
VETHC["vethCCC"]
BRIDGE["docker0 bridge<br/>172.17.0.1/16"]
IPT["iptables (MASQUERADE + DNAT)"]
ETH0["eth0 192.168.1.10"]
end
INTERNET["Internet / LAN"]
A_ETH <--> VETHA
B_ETH <--> VETHB
C_ETH <--> VETHC
VETHA --- BRIDGE
VETHB --- BRIDGE
VETHC --- BRIDGE
BRIDGE --- IPT
IPT --- ETH0
ETH0 --- INTERNET
Creating a Bridge Manually (What Docker Does)
# Create a bridge (equivalent of docker network create)
sudo ip link add my-bridge type bridge
sudo ip addr add 172.18.0.1/24 dev my-bridge
sudo ip link set my-bridge up
# Attach the veth host-end to the bridge
sudo ip link set veth-host master my-bridge
# Add a default route inside the container namespace
sudo ip netns exec container1 ip route add default via 172.18.0.1
# Now the container can reach the host via the bridge gateway
sudo ip netns exec container1 ping -c 1 172.18.0.1
# => 64 bytes from 172.18.0.1: icmp_seq=1 ttl=64
# Enable IP forwarding on the host (required for internet access)
sudo sysctl net.ipv4.ip_forward=1
# Add masquerade rule for outbound traffic
sudo iptables -t nat -A POSTROUTING -s 172.18.0.0/24 ! -o my-bridge -j MASQUERADE
# Container can now reach the internet (if the filter FORWARD policy is DROP, as Docker sets it, also add ACCEPT rules for my-bridge)
sudo ip netns exec container1 ping -c 1 8.8.8.8
# => 64 bytes from 8.8.8.8: icmp_seq=1 ttl=117
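The manual walkthrough leaves a namespace, a bridge, and a NAT rule behind. A minimal cleanup sketch follows, assuming the exact names used above (container1, my-bridge, 172.18.0.0/24); run and cleanup_lab are hypothetical helper names, and DRY_RUN=1 prints the commands instead of executing them.

```shell
#!/usr/bin/env bash
# run: execute a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# cleanup_lab: tear down the resources created in the manual walkthrough.
cleanup_lab() {
  # Deleting the namespace also destroys veth-container and its peer veth-host.
  run sudo ip netns del container1
  run sudo ip link del my-bridge
  # Remove the exact MASQUERADE rule that was appended earlier.
  run sudo iptables -t nat -D POSTROUTING -s 172.18.0.0/24 ! -o my-bridge -j MASQUERADE
}

DRY_RUN=1 cleanup_lab   # preview only; call cleanup_lab without DRY_RUN to apply
```

The sketch leaves net.ipv4.ip_forward untouched, since other software on the host (Docker included) may depend on it.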
iptables & NAT
Docker relies heavily on iptables for two purposes: port forwarding (DNAT, which makes container services reachable from outside) and masquerading (SNAT, which lets containers reach the internet using the host's IP).
# View Docker's NAT rules
sudo iptables -t nat -L -n -v --line-numbers
# DOCKER chain (port forwarding / DNAT)
sudo iptables -t nat -L DOCKER -n -v
# Chain DOCKER (2 references)
# pkts bytes target prot opt in out source destination
# 125 7500 DNAT tcp -- !docker0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8080 to:172.17.0.2:80
# 42 2520 DNAT tcp -- !docker0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:3000 to:172.17.0.3:3000
# POSTROUTING chain (masquerade / SNAT for outbound)
sudo iptables -t nat -L POSTROUTING -n -v
# Chain POSTROUTING (policy ACCEPT)
# pkts bytes target prot opt in out source destination
# 892 53520 MASQUERADE all -- * !docker0 172.17.0.0/16 0.0.0.0/0
# DOCKER-ISOLATION chain (inter-network isolation)
sudo iptables -L DOCKER-ISOLATION-STAGE-1 -n -v
# Drops traffic between different Docker bridge networks
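Reading the DOCKER chain by eye gets tedious once many ports are published. The sketch below condenses the listing into one line per mapping, assuming the dpt: and to: fields appear as in the sample output above; list_published_ports is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# list_published_ports: read `iptables -t nat -L DOCKER -n [-v]` output on stdin
# and print one "HOST_PORT -> CONTAINER_IP:PORT" line per DNAT rule.
list_published_ports() {
  awk '
    /DNAT/ {
      port = ""; dest = ""
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^dpt:/) port = substr($i, 5)   # text after "dpt:"
        if ($i ~ /^to:/)  dest = substr($i, 4)   # text after "to:"
      }
      if (port != "" && dest != "") print port " -> " dest
    }'
}

# Usage (requires root and a running Docker daemon):
#   sudo iptables -t nat -L DOCKER -n -v | list_published_ports
```

Because it only parses text, the function can also be run against a saved rule dump when debugging a host you cannot log in to.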
Tracing a Packet Through iptables
When an external client sends a request to host:8080, here's the exact packet path:
# Use TRACE target to follow a packet through all chains
sudo iptables -t raw -A PREROUTING -p tcp --dport 8080 -j TRACE
sudo iptables -t raw -A OUTPUT -p tcp --dport 8080 -j TRACE
# Watch the trace in the kernel log (with the nft-based iptables backend, use: sudo xtables-monitor --trace)
sudo dmesg -w | grep TRACE
# Packet path for inbound traffic to published port:
# 1. PREROUTING (raw) → packet enters host
# 2. PREROUTING (nat) → DOCKER chain → DNAT to 172.17.0.2:80
# 3. FORWARD (filter) → DOCKER-USER → DOCKER-ISOLATION → DOCKER chain
# 4. POSTROUTING (nat) → packet leaves to container
# Remove trace rules when done
sudo iptables -t raw -D PREROUTING -p tcp --dport 8080 -j TRACE
sudo iptables -t raw -D OUTPUT -p tcp --dport 8080 -j TRACE
flowchart TD
CLIENT["Client Request<br/>→ host:8080"]
PRE["PREROUTING chain"]
DNAT["DOCKER chain<br/>DNAT → 172.17.0.2:80"]
FWD["FORWARD chain<br/>DOCKER-ISOLATION check"]
POST["POSTROUTING chain"]
CONTAINER["Container<br/>172.17.0.2:80"]
CLIENT --> PRE
PRE --> DNAT
DNAT --> FWD
FWD --> POST
POST --> CONTAINER
Network Address Translation
Docker uses two forms of NAT to enable container connectivity:
| NAT Type | Direction | iptables Chain | Purpose |
|---|---|---|---|
| SNAT / MASQUERADE | Outbound (container → internet) | POSTROUTING | Replace container's private IP with the host's IP on the outgoing interface |
| DNAT | Inbound (internet → container) | PREROUTING (OUTPUT for host-local clients) | Rewrite destination from host:port to container:port |
# SNAT in action — container reaches the internet
# Inside container: src=172.17.0.2 dst=8.8.8.8
# After MASQUERADE: src=192.168.1.10 dst=8.8.8.8 (host IP substituted)
# Verify with conntrack (connection tracking)
sudo conntrack -L -p tcp --dport 80 | head -5
# tcp 6 117 TIME_WAIT src=172.17.0.2 dst=93.184.216.34 sport=45678 dport=80
# src=93.184.216.34 dst=192.168.1.10 sport=80 dport=45678 [ASSURED]
# The conntrack entry shows:
# - Original: container IP (172.17.0.2) → external
# - Reply: external → host IP (192.168.1.10) — then kernel reverses NAT
# DNAT in action — external traffic reaches container
# Incoming: src=client dst=192.168.1.10:8080
# After DNAT: src=client dst=172.17.0.2:80 (container IP substituted)
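The MASQUERADE rule only rewrites packets whose source address falls inside the bridge subnet (172.17.0.0/16 in the examples). The sketch below models that subnet test in plain shell arithmetic; it illustrates the matching logic only and is not how the kernel implements it. ip_to_int and in_cidr are hypothetical helper names.

```shell
#!/usr/bin/env bash
# ip_to_int: convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  (
    IFS=.
    set -- $1
    [ "$#" -eq 4 ] || exit 1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
  )
}

# in_cidr IP NET/BITS: succeed if IP lies inside the given network.
in_cidr() {
  local ip="$1" net="${2%/*}" bits="${2#*/}" mask
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

in_cidr 172.17.0.2 172.17.0.0/16   && echo "source matches: would be masqueraded"
in_cidr 192.168.1.10 172.17.0.0/16 || echo "source outside subnet: left untouched"
```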
Overlay Network Internals
Overlay networks use VXLAN (Virtual eXtensible LAN) to create a Layer 2 network that spans multiple physical hosts. Each host runs a VTEP (VXLAN Tunnel Endpoint) that encapsulates container frames inside UDP packets (port 4789).
# Inspect overlay network internals on a Swarm node
docker network inspect my-overlay --verbose
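# See VXLAN interfaces created by Docker (Docker usually places them in a per-network namespace under /run/docker/netns, so enter that namespace with nsenter if the host shows nothing)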
ip -d link show type vxlan
# => vxlan1: ... id 4097 ... dstport 4789
# Check the FDB (forwarding database) for remote MAC entries
bridge fdb show dev vxlan1
# => 02:42:0a:00:09:03 dst 192.168.1.11 self permanent
# This means: to reach MAC 02:42:0a:00:09:03, tunnel to host 192.168.1.11
# Capture VXLAN traffic between hosts
sudo tcpdump -i eth0 -n port 4789
# => 192.168.1.10.47652 > 192.168.1.11.4789: VXLAN, flags [I] (0x08), vni 4097
# => 02:42:0a:00:09:02 > 02:42:0a:00:09:03, ethertype IPv4 (0x0800)
# => 10.0.9.2 > 10.0.9.3: ICMP echo request
VXLAN encapsulation adds 50 bytes of overhead, so Docker sets an MTU of 1450 on overlay network interfaces. Mismatched MTUs between overlay and underlay cause hard-to-diagnose packet drops for large payloads.
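The 1450 figure follows from the VXLAN encapsulation headers. Here is a minimal sketch of the arithmetic, assuming an IPv4 underlay with no IP options and no VLAN tag on the inner frame; the function and variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Header sizes in bytes that VXLAN adds inside the underlay's IP MTU.
OUTER_IP=20      # outer IPv4 header
OUTER_UDP=8      # outer UDP header (destination port 4789)
VXLAN_HDR=8      # VXLAN header carrying the VNI
INNER_ETH=14     # Ethernet header of the encapsulated container frame

# overlay_mtu UNDERLAY_MTU: largest inner IP packet that fits without fragmentation.
overlay_mtu() {
  echo $(( $1 - OUTER_IP - OUTER_UDP - VXLAN_HDR - INNER_ETH ))
}

# max_icmp_payload MTU: largest `ping -s` size, after the IPv4 (20) and ICMP (8) headers.
max_icmp_payload() {
  echo $(( $1 - 20 - 8 ))
}

overlay_mtu 1500          # prints 1450
max_icmp_payload 1450     # prints 1422
```

On an underlay with jumbo frames (for example MTU 9000), the same function gives the larger overlay MTU that could be configured.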
DNS-Based Service Discovery
Docker exposes an embedded DNS server at 127.0.0.11 to every container on a user-defined network. This server resolves container names, service names, and network aliases to IP addresses.
# Check DNS configuration inside a container
docker run --rm --network mynet alpine cat /etc/resolv.conf
# => nameserver 127.0.0.11
# => options ndots:0
# Resolve a container name
docker run --rm --network mynet alpine nslookup api
# => Name: api
# => Address 1: 192.168.100.3
# Resolve every task IP of a service (Swarm mode; the bare service name returns the service's virtual IP by default)
docker exec my-container nslookup tasks.web
# => Name: tasks.web
# => Address 1: 10.0.9.3
# => Address 2: 10.0.9.4
# => Address 3: 10.0.9.5 (all replicas)
# DNS lookup with dig (more detail)
docker run --rm --network mynet nicolaka/netshoot dig api
# => ;; ANSWER SECTION:
# => api. 600 IN A 192.168.100.3
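# How Docker intercepts DNS: iptables redirect inside the container's network namespace (not on the host)
# CONTAINER_PID comes from: docker inspect --format '{{.State.Pid}}' my-container
sudo nsenter -t $CONTAINER_PID -n iptables -t nat -L DOCKER_OUTPUT -n
# => DNAT udp -- 0.0.0.0/0 127.0.0.11 udp dpt:53 to:127.0.0.11:41245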
Docker's Embedded DNS Architecture
Docker doesn't actually run a DNS server at 127.0.0.11:53. Instead, it uses iptables to intercept DNS queries sent to that address and redirects them to a random high port where the Docker daemon's DNS resolver listens. This resolver handles container name lookups locally and forwards external queries to the host's configured DNS servers (from /etc/resolv.conf).
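A quick way to tell whether a container goes through this resolver is to look at its resolv.conf. The sketch below parses resolv.conf content from stdin; nameservers and uses_embedded_dns are hypothetical helper names.

```shell
#!/usr/bin/env bash
# nameservers: print the address from each "nameserver" line read on stdin.
nameservers() {
  awk '$1 == "nameserver" { print $2 }'
}

# uses_embedded_dns: succeed if 127.0.0.11 is among the nameservers on stdin.
uses_embedded_dns() {
  nameservers | grep -qx '127\.0\.0\.11'
}

# Demo with literal content; against a real container (name is an assumption):
#   docker exec my-container cat /etc/resolv.conf | uses_embedded_dns
printf 'nameserver 127.0.0.11\noptions ndots:0\n' | uses_embedded_dns \
  && echo "embedded DNS in use"
```

A container on the default bridge would normally fail this check, which matches the "DNS resolution fails" case in the troubleshooting table.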
DNS Round-Robin
When multiple containers share the same network alias, Docker's DNS returns all their IPs in round-robin order — providing basic client-side load distribution without any external load balancer.
# Create a network
docker network create lb-test
# Start 3 containers with the same network alias
docker run -d --name web1 --network lb-test --network-alias web alpine sleep 3600
docker run -d --name web2 --network lb-test --network-alias web alpine sleep 3600
docker run -d --name web3 --network lb-test --network-alias web alpine sleep 3600
# DNS returns all IPs (order rotates)
docker run --rm --network lb-test alpine nslookup web
# => Name: web
# => Address 1: 192.168.100.2
# => Address 2: 192.168.100.3
# => Address 3: 192.168.100.4
# Each lookup may return a different order
docker run --rm --network lb-test alpine nslookup web
# => Address 1: 192.168.100.3 (rotated)
# => Address 2: 192.168.100.4
# => Address 3: 192.168.100.2
# Cleanup
docker rm -f web1 web2 web3
docker network rm lb-test
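To judge how evenly round-robin spreads requests, count how often each backend answers. A minimal sketch follows, assuming each response is reduced to one backend name per line; tally_backends is a hypothetical helper, and the HTTP loop in the comment assumes backends that return their hostname, which the sleep containers above do not.

```shell
#!/usr/bin/env bash
# tally_backends: read one backend name per line on stdin and
# print "NAME COUNT" pairs sorted by name. Blank lines are ignored.
tally_backends() {
  awk 'NF { count[$1]++ } END { for (name in count) print name, count[name] }' | sort
}

# Hypothetical usage from a client container on the same network:
#   for i in $(seq 1 100); do wget -qO- http://web/; echo; done | tally_backends
printf 'web1\nweb2\nweb1\n' | tally_backends
# prints:
#   web1 2
#   web2 1
```

Clients that cache DNS answers or reuse connections will skew the counts, which is one reason DNS round-robin gives only basic load distribution.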
Network Namespaces in Practice
Every running container has its own network namespace. You can enter it from the host to inspect and debug networking without modifying the container:
# Find container's PID
CONTAINER_PID=$(docker inspect --format '{{.State.Pid}}' my-container)
echo "Container PID: $CONTAINER_PID"
# Enter the container's network namespace with nsenter
sudo nsenter -t $CONTAINER_PID -n ip addr show
# => 1: lo: inet 127.0.0.1/8
# => 42: eth0@if43: inet 172.17.0.2/16
# Check routing table
sudo nsenter -t $CONTAINER_PID -n ip route
# => default via 172.17.0.1 dev eth0
# => 172.17.0.0/16 dev eth0 scope link
# Check iptables inside container namespace
sudo nsenter -t $CONTAINER_PID -n iptables -L -n
# => Filter table is usually empty (Docker manages forwarding rules on the host); on user-defined networks the nat table holds the DNS redirect
# Run tcpdump inside container's namespace (even if container has no tcpdump)
sudo nsenter -t $CONTAINER_PID -n tcpdump -i eth0 -n -c 10
# ip netns list won't show Docker's namespaces (Docker doesn't symlink them into /var/run/netns)
# But you can find a container's namespace via /proc
ls -la /proc/$CONTAINER_PID/ns/net
# => lrwxrwxrwx 1 root root 0 ... /proc/12345/ns/net -> net:[4026532456]
# Create a symlink so ip netns commands work (create /var/run/netns first if it doesn't exist)
sudo mkdir -p /var/run/netns
sudo ln -sf /proc/$CONTAINER_PID/ns/net /var/run/netns/my-container
sudo ip netns exec my-container ss -tlnp
sudo rm /var/run/netns/my-container
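The net:[...] value in the /proc link is the namespace's inode number, and two processes share a network namespace exactly when those values match. The sketch below uses that to compare processes; netns_id and same_netns are hypothetical helper names, and reading another user's /proc entries may require root.

```shell
#!/usr/bin/env bash
# netns_id: extract the inode number from a link target such as "net:[4026532456]".
netns_id() {
  local target="$1"
  target="${target#net:\[}"
  target="${target%\]}"
  printf '%s\n' "$target"
}

# same_netns PID1 PID2: succeed if both processes share a network namespace.
# Returns 2 if either /proc entry cannot be read.
same_netns() {
  local a b
  a=$(readlink "/proc/$1/ns/net") || return 2
  b=$(readlink "/proc/$2/ns/net") || return 2
  [ "$a" = "$b" ]
}

netns_id 'net:[4026532456]'   # prints 4026532456
```

For example, same_netns 1 "$CONTAINER_PID" should fail for a container on a bridge network and succeed for one started with --network host.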
Common Networking Issues
| Issue | Symptom | Root Cause | Solution |
|---|---|---|---|
| DNS resolution fails | ping: bad address 'api' | Using default bridge (no DNS) or container not on same network | Use user-defined bridge; verify both containers on same network |
| Port already in use | bind: address already in use | Host port conflicts with another service or container | ss -tlnp \| grep :PORT to find conflict; change port mapping |
| Cannot reach internet | Timeout on external requests | IP forwarding disabled or iptables MASQUERADE missing | sysctl net.ipv4.ip_forward=1; restart Docker daemon |
| Intermittent packet drops | Large HTTP responses fail, small ones work | MTU mismatch (overlay MTU 1450 vs underlay 1500) | Set --opt com.docker.network.driver.mtu=1450 on network |
| Firewall blocks containers | External traffic can't reach published ports | Host firewall (ufw/firewalld) drops forwarded traffic in the FORWARD chain | Allow Docker's FORWARD rules; ufw allow from 172.17.0.0/16 |
| Container IP changed | Hard-coded IPs stop working after restart | IPs are dynamic; containers get new IP on recreation | Use container names (DNS) not IP addresses |
| Slow overlay performance | High latency between containers on different hosts | VXLAN encapsulation overhead; checksum offload issues | Verify NIC offload settings; consider host networking for hot paths |
# Debugging workflow for container networking
# Step 1: Check container is on expected network
docker inspect --format='{{json .NetworkSettings.Networks}}' my-container | jq
# Step 2: Check DNS resolution
docker exec my-container nslookup target-service
# Step 3: Check route to target
docker exec my-container ip route get 172.18.0.5
# Step 4: Check if port is listening on target
docker exec target-container ss -tlnp | grep 8080
# Step 5: Check iptables aren't blocking
sudo iptables -L DOCKER-USER -n -v
sudo iptables -L FORWARD -n -v | grep DROP
# Step 6: Capture packets to verify traffic arrives
docker run --rm --net=container:target-container nicolaka/netshoot \
tcpdump -i eth0 -n port 8080 -c 5
Exercises
Build a Network from Scratch
Without using Docker networking commands, create a network namespace, veth pair, bridge, and iptables rules to connect a namespace to the internet. Verify with ping 8.8.8.8 from inside the namespace. Then compare your commands with what Docker creates automatically.
iptables Forensics
Start 3 containers with different port mappings. Use iptables -t nat -L DOCKER -n -v to identify which rule belongs to which container. Then stop one container and verify the rule disappears. Document the complete iptables chain for a published port.
DNS Round-Robin Load Test
Create 5 containers with the same network alias, each running a simple HTTP server that returns its hostname. From a client container, make 100 requests to the alias and count how many went to each backend. Is the distribution uniform? Why or why not?
MTU Troubleshooting
Create an overlay network with default settings. Between containers on different hosts, send ping with fragmentation prohibited (for example, -M do with iputils ping) and -s 1472, which fills a 1500-byte MTU once the 28 bytes of IP and ICMP headers are added. It should fail. Then try -s 1422 (fits in the 1450 overlay MTU). Explain the 50-byte difference by accounting for each encapsulation header.
Conclusion & Next Steps
Docker networking is built entirely on standard Linux primitives — there's nothing proprietary or magical about it. The key internals to remember:
- veth pairs — virtual cables connecting container namespaces to host bridges
- Linux bridges — software switches (docker0) that forward frames between veth endpoints
- iptables NAT — DNAT for port forwarding, MASQUERADE for outbound internet access
- VXLAN — UDP tunneling (port 4789) that creates overlay networks across hosts
- Embedded DNS (127.0.0.11) — resolves container names on user-defined networks via iptables interception
- nsenter — your most powerful debugging tool for inspecting container network state from the host
With networking and its Linux internals understood, we've covered how containers communicate. The next critical piece is how containers persist data — because by default, everything in a container is ephemeral.
Next in the Series
In Part 12: Storage & Data Persistence, we conquer the ephemeral nature of containers with Docker volumes, bind mounts, tmpfs, storage drivers, and patterns for running stateful applications like databases inside containers.