🌐 Kubernetes Networking Fundamentals

Reference: Kubernetes Official Documentation - Services, Load Balancing, and Networking

📋 Table of Contents

Core Network Model
Understanding Kubernetes Networks
Container-to-Container Communication
Pod-to-Pod Communication
Service Discovery and Load Balancing
External Access Patterns
Network Policies
DNS Resolution
Summary Reference

Core Network Model

Fundamental Principles

Kubernetes networking is built on these core requirements:

Every Pod gets its own unique cluster-wide IP address
All Pods can communicate with all other Pods without NAT (Network Address Translation)
Agents on a node can communicate with all Pods on that node
Pods share their network namespace with their containers

�� This flat networking model simplifies application deployment compared to traditional NAT-based container systems.

Architecture Components

┌────────────────────────────────────────────────────────────────────────────────┐
│                       KUBERNETES CLUSTER ARCHITECTURE                          │
└────────────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────────────┐
│             k8s-cp1.tux2lab.internal - CONTROL PLANE NODE (10.28.28.3)                │
│                                                                                │
│  ┌───────────────┐  ┌──────────────┐  ┌────────────────┐  ┌────────────────┐   │
│  │ kube-apiserver│  │     etcd     │  │ kube-scheduler │  │ kube-controller│   │
│  └───────┬───────┘  └──────────────┘  └────────────────┘  └────────────────┘   │
│          │                                                                     │
│          │ ◄─────── Watch & Report ──────────────────────────────┐             │
│          │                                                       │             │
│  ┌───────▼───────┐  ┌──────────────┐                             │             │
│  │    CoreDNS    │  │  CNI Plugin  │                             │             │
│  └───────────────┘  └──────────────┘                             │             │
│                                                                  │             │
└──────────────────────────────────────────────────────────────────┼─────────────┘
                                                                   │
                                                               API Calls
                                                                   │
                                                                   │
┌───────────────────────────────────────────────────────────────── ┼─────────────┐
│        k8s-w1.tux2lab.internal - WORKER NODE (10.28.28.4)               │             │
│                                                                  │             │
│  ┌─────────────────── ───────────────────────────────────────────┼──────┐      │
│  │     kubelet                                                   │      │      │
│  │  Reports to API                                                      │      │
│  └──────┬────────────────────────┬──────────────────────────────────────┘      │
│         │                        │                                             │
│  ┌──────▼──────┐         ┌───────▼────────┐                                    │
│  │ kube-proxy  │         │  CNI Plugin    │                                    │
│  └──────┬──────┘         └───────┬────────┘                                    │
│         │                        │                                             │
│         │ Programs               │ Configures                                  │
│         ▼                        ▼                                             │
│  ┌─────────────────────────────────────────────────────────┐                   │
│  │           Linux Kernel Network Stack                    │                   │
│  │  ┌─────────┐ ┌────────┐ ┌────────┐ ┌──────┐ ┌──────┐    │                   │
│  │  │iptables │ │routing │ │  cni0  │ │ veth │ │ eth0 │    │                   │
│  │  │  rules  │ │ tables │ │ bridge │ │pairs │ │      │    │                   │
│  │  └─────────┘ └────────┘ └───┬────┘ └──┬───┘ └──────┘    │                   │
│  └────────────────────────────────┼─────────┼──────────────┘                   │
│                                   │         │                                  │
│                                   ▼         ▼                                  │
│  ┌────────────────────┐         ┌────────────────────┐                         │
│  │  POD 1 (10.8.0.5)  │         │  POD 2 (10.8.0.8)  │                         │
│  │  ┌──────────────┐  │         │  ┌──────────────┐  │                         │
│  │  │    Pause     │  │         │  │     app      │  │                         │
│  │  │  Container   │  │         │  │  container   │  │                         │
│  │  └──────────────┘  │         │  ├──────────────┤  │                         │
│  │  ┌──────────────┐  │         │  │   sidecar    │  │                         │
│  │  │    nginx     │  │         │  │  container   │  │                         │
│  │  │  container   │  │         │  └──────────────┘  │                         │
│  │  └──────────────┘  │         │                    │                         │
│  │  eth0@vethXXX      │         │  eth0@vethYYY      │                         │
│  └────────────────────┘         └────────────────────┘                         │
│                                                                                │
│  ┌───────────────────────────────────────────────────┐                         │
│  │  Container Runtime (containerd / CRI-O)           │                         │
│  └───────────────────────────────────────────────────┘                         │
│                                                                                │
│  Physical NIC: eth0 (10.28.28.4)                                               │
└────────────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────────────┐
│            k8s-w2.tux2lab.internal - WORKER NODE (10.28.28.5)                         │
│                                                                                │
│  Similar structure: kubelet, kube-proxy, CNI Plugin, Container Runtime         │
│  Pod IPs allocated from cluster Pod network range                              │
│  Physical NIC: eth0 (10.28.28.5)                                               │
└────────────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────────────┐
│                              NETWORK LAYERS                                    │
│                                                                                │
│  📦 Node Network (10.28.28.0/22)                                               │
│     Physical/VM network for node-to-node communication                         │
│                                                                                │
│  🌐 Pod Network (10.8.0.0/16)                                                  │
│     CNI-managed overlay/routed network for pod-to-pod communication            │
│     IP allocation method depends on CNI plugin (see IPAM section below)        │
│                                                                                │
│  ⚖️  Service Network (10.96.0.0/12)                                             │
│     Virtual IPs managed by kube-proxy for service discovery                    │
│     Traffic flow: Client → Service VIP → kube-proxy → Pod IP                   │
│                                                                                │
└────────────────────────────────────────────────────────────────────────────────┘

Architecture Overview:

Key Components

Component	Responsibility	Implementation
Container Runtime	Pod network namespace setup	CRI-compatible runtime (containerd, CRI-O)
Pause Container	Maintains Pod network namespace	Automatically created per Pod
CNI Plugin	Pod network implementation	Calico, Flannel, Cilium, Weave, etc.
kube-proxy	Service proxy and load balancing	iptables, IPVS, or eBPF modes
CoreDNS	Service discovery via DNS	DNS server for cluster
Network Policy	Traffic filtering rules	Implemented by CNI plugin

Understanding Kubernetes Networks

Kubernetes uses three distinct network address spaces that work together:

1. Node Network (Underlay/Physical Network)

The Node Network is the physical or VM network where Kubernetes nodes reside.

┌───────────────────────────────────────────────────────────┐
│                    Node Network (Underlay)                │
│                    Example: 10.28.28.0/22                 │
│                    (Your cluster configuration)           │
│                                                           │
│   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐    │
│   │ Control Node │  │  Worker-1    │  │  Worker-2    │    │
│   │ 10.28.28.3   │  │ 10.28.28.4   │  │ 10.28.28.5   │    │
│   └──────────────┘  └──────────────┘  └──────────────┘    │
│                                                           │
│   • Physical/VM network interfaces                        │
│   • SSH access, node-to-node communication                │
│   • External connectivity                                 │
└───────────────────────────────────────────────────────────┘

Characteristics:

Purpose: Node-to-node communication, external access
CIDR Example: 10.28.28.0/22
- Control plane: 10.28.28.3
- Worker 1: 10.28.28.4
- Worker 2: 10.28.28.5
Assignment: By infrastructure (DHCP, static, cloud provider)
Visibility: Routable within your data center/VPC
Used For:
- SSH into nodes
- kubelet → API server communication
- Node-to-node CNI traffic
- External client → NodePort/LoadBalancer

2. Pod Network (Overlay Network)

The Pod Network is a virtual network managed by the CNI plugin where Pods get their IP addresses.

📖 Deep Dive: See overlay-networks.md for detailed explanation of overlay networks, VXLAN, and encapsulation methods.

┌──────────────────────────────────────────────────────────────┐
│                    Pod Network (Overlay)                     │
│                   Example: 10.8.0.0/16                       │
│                                                              │
│  ┌─────────────────────┐           ┌─────────────────────┐   │
│  │   Worker-1          │           │   Worker-2          │   │
│  │   10.8.0.0/24       │           │   10.8.1.0/24       │   │
│  │                     │           │                     │   │
│  │  ┌──────────────┐   │           │  ┌──────────────┐   │   │
│  │  │ Pod A        │   │           │  │ Pod D        │   │   │
│  │  │ 10.8.0.5     │   │           │  │ 10.8.1.10    │   │   │
│  │  └──────────────┘   │           │  └──────────────┘   │   │
│  │                     │           │                     │   │
│  │  ┌──────────────┐   │           │  ┌──────────────┐   │   │
│  │  │ Pod B        │   │           │  │ Pod E        │   │   │
│  │  │ 10.8.0.8     │   │           │  │ 10.8.1.15    │   │   │
│  │  └──────────────┘   │           │  └──────────────┘   │   │
│  │                     │           │                     │   │
│  │  ┌──────────────┐   │           │  ┌──────────────┐   │   │
│  │  │ Pod C        │   │           │  │ Pod F        │   │   │
│  │  │ 10.8.0.12    │   │           │  │ 10.8.1.20    │   │   │
│  │  └──────────────┘   │           │  └──────────────┘   │   │
│  └─────────────────────┘           └─────────────────────┘   │
│                                                              │
│   • Each Pod gets unique IP from this range                  │
│   • Managed by CNI plugin (Calico, Flannel, etc.)            │
│   • Typically not routable outside cluster                   │
└──────────────────────────────────────────────────────────────┘

Characteristics:

Purpose: Pod-to-Pod communication across the cluster
CIDR Example: 10.8.0.0/16

Configuration: Set during cluster initialization

# Example kubeadm init
kubeadm init --pod-network-cidr=10.8.0.0/16

Visibility: Cluster-wide, not exposed externally
Used For:
- Direct Pod-to-Pod communication
- Container-to-container within Pod via localhost
- Kubernetes internal networking

IPAM (IP Address Management) - How CNI Plugins Handle IP Allocation:

📖 Detailed Guide: See ipam.md for comprehensive explanation of IPAM and different CNI behaviors

Different CNI plugins handle IP allocation differently:

1. Kubernetes-Managed IPAM (Flannel, kube-router):

Kubernetes controller-manager allocates podCIDR per node from --pod-network-cidr
Node gets assigned a sequential subnet (e.g., /24) in node.spec.podCIDR
CNI plugin reads this podCIDR and assigns Pod IPs from it
Allocation is sequential and predictable

# Example: Node with Kubernetes-assigned Pod CIDR
apiVersion: v1
kind: Node
metadata:
  name: k8s-w1
spec:
  podCIDR: 10.8.0.0/24      # Assigned by K8s controller-manager
  podCIDRs:
  - 10.8.0.0/24

2. CNI-Managed IPAM (Calico, Cilium):

CNI plugin has its own IPAM system - ignores node.spec.podCIDR
Uses custom resources (Calico: IPPools, Cilium: CiliumNode)
Allocates IP blocks dynamically as nodes join and Pods are created
Calico:
- Default block size is /26 (62 IPs per block)
- Allocates blocks on-demand from IPPool
- Multiple blocks can be allocated to same node if needed
- Allocation is dynamic, not sequential
Cilium:
- Configurable via CiliumNode resource
- Similar dynamic allocation

# Calico IPPool (defines available IP ranges)
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 10.8.0.0/16        # Total available range
  blockSize: 26            # Default: /26 blocks (62 IPs each)
  ipipMode: Always
  natOutgoing: true

Key Differences:

Aspect	Kubernetes IPAM (Flannel)	CNI IPAM (Calico)
Allocation Source	K8s controller-manager	CNI plugin's IPAM
Uses node.spec.podCIDR	✅ Yes	❌ No (ignores it)
Allocation Pattern	Sequential per node	Dynamic on-demand
Default Block Size	`/24` (configurable)	`/26` (Calico default)
Flexibility	Fixed per node	Multiple blocks per node

Checking Allocations:

# View Kubernetes-assigned Pod CIDR (Flannel uses this)
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'

# Calico-specific: View IP pools and allocations
kubectl get ippools.crd.projectcalico.org -o wide

# Cilium-specific: View node allocations
kubectl get ciliumnodes -o wide

3. Service Network (ClusterIP Range)

The Service Network is a virtual network for Service objects - these IPs don't actually exist on any interface.

┌──────────────────────────────────────────────────────────┐
│                   Service Network (Virtual)              │
│                   Example: 10.96.0.0/12                  │
│                   (Default Kubernetes range)             │
│                                                          │
│                   ┌─────────────────────┐                │
│                   │  Service: frontend  │                │
│                   │  ClusterIP:         │                │
│                   │  10.96.10.50:80     │                │
│                   └──────────┬──────────┘                │
│                              │                           │
│                   kube-proxy creates rules               │
│                              │                           │
│                ┌─────────────┼─────────────┐             │
│                │             │             │             │
│           ┌────▼───┐    ┌───▼────┐   ┌───▼────┐          │
│           │ Pod A  │    │ Pod B  │   │ Pod C  │          │
│           │10.10.  │    │10.10.  │   │10.10.  │          │
│           │20.5:80 │    │20.8:80 │   │21.10:80│          │
│           └────────┘    └────────┘   └────────┘          │
│                                                          │
│   • Virtual IPs - no actual network interface            │
│   • Managed by kube-proxy using iptables/IPVS            │
│   • Load balances to Pod IPs                             │
└──────────────────────────────────────────────────────────┘

Characteristics:

Purpose: Stable endpoints for Services (load balancing)
CIDR Example: 10.96.0.0/12 (default Kubernetes)
- Provides 1,048,576 IP addresses
- Can be customized: 10.32.0.0/16, 172.20.0.0/16, etc.
Assignment: API server assigns IPs to Service objects
Visibility: Only within cluster, virtual/non-routable

Configuration: Set during API server startup

# API server flag
--service-cluster-ip-range=10.96.0.0/12

Used For:
- Service discovery via DNS
- Load balancing traffic to Pods
- Stable virtual IPs that don't change

Network Relationships and Data Flow

Here's how all three networks interact:

┌────────────────────────────────────────────────────────┐
│                    Complete Network Architecture       │
│                                                        │
│  External Client (Internet/Corporate Network)          │
│         │                                              │
│         │ (1) Access via NodePort/LoadBalancer         │
│         ▼                                              │
│  ┌────────────────────────────────────────────────┐    │
│  │  Node Network: 10.28.28.0/22                   │    │
│  │                                                │    │
│  │  ┌────────────┐         ┌─────────────┐        │    │
│  │  │  Worker-1  │         │  Worker-2   │        │    │
│  │  │10.28.28.4  │         │10.28.28.5   │        │    │
│  │  └─────┬──────┘         └──────┬──────┘        │    │
│  └────────┼───────────────────────┼───────────────┘    │
│           │                       │                    │
│           │ (2) kube-proxy routes to Pod IPs           │
│           │                       │                    │
│  ┌────────┼───────────────────────┼───────────────┐    │
│  │        │  Pod Network: 10.8.0.0/16             │    │
│  │        │                       │               │    │
│  │  ┌─────▼──────┐         ┌─────▼──────┐         │    │
│  │  │ 10.8.0.0/24│         │ 10.8.1.0/24│         │    │
│  │  │            │         │            │         │    │
│  │  │ ┌────────┐ │         │ ┌─────────┐│         │    │
│  │  │ │Pod A   │ │         │ │Pod D    ││         │    │
│  │  │ │10.8.   │ │         │ │10.8.    ││         │    │
│  │  │ │0.5     │ │         │ │1.10     ││         │    │
│  │  │ └────────┘ │         │ └─────────┘│         │    │
│  │  └────────────┘         └────────────┘         │    │
│  └────────────────────────────────────────────────     │
│           ▲                                            │
│           │                                            │
│           │ (3) Service maps ClusterIP → Pod IPs       │
│           │                                            │
│  ┌────────┴──────────────────────────────────────┐     │
│  │  Service Network: 10.96.0.0/12 (Virtual)      │     │
│  │  Service: frontend                            │     │
│  │  ClusterIP: 10.96.10.50:80                    │     │
│  │  Endpoints: 10.8.0.5:80, 10.8.1.10:80         │     │
│  └───────────────────────────────────────────────      │
└────────────────────────────────────────────────────────┘

Traffic Flow Example:
1. Client accesses: http://10.28.28.4:30080 (NodePort on Worker-1)
2. kube-proxy on node routes to Service ClusterIP 10.96.10.50:80
3. Service load-balances to Pod IPs: 10.8.0.5:80 or 10.8.1.10:80
4. Pod receives and processes request
5. Response flows back through the same path

Network Configuration Summary

Network Type	Example CIDR	Purpose	Routable	Configured By
Node Network	`10.28.28.0/22`	Physical node connectivity	Yes (within infra)	Infrastructure/Cloud
Pod Network	`10.8.0.0/16`	Pod-to-Pod communication	No (cluster internal)	CNI Plugin
Service Network	`10.96.0.0/12`	Service virtual IPs	No (virtual only)	Kubernetes API

Important Network Rules

✅ Networks must NOT overlap:

# ❌ BAD - Overlapping ranges
Node Network:    10.8.0.0/16
Pod Network:     10.8.0.0/16    # Overlaps!
Service Network: 10.96.0.0/12   # OK

# ✅ GOOD - Non-overlapping ranges (Example Configuration)
Node Network:    10.28.28.0/22
Pod Network:     10.8.0.0/16
Service Network: 10.96.0.0/12

✅ Typical CIDR Sizing:

Node Network: Match your existing infrastructure
Pod Network:
- Small cluster: /24 per node, /16 total
- Medium cluster: /24 per node, /16 or /12 total
- Large cluster: /24 per node, /8 or /12 total
Service Network:
- Small cluster: /16
- Medium/Large cluster: /12

Checking Your Network Configuration

# View Pod network CIDR
kubectl cluster-info dump | grep -m 1 cluster-cidr
kubectl describe node <node-name> | grep PodCIDR

# View Service network CIDR
kubectl cluster-info dump | grep -m 1 service-cluster-ip-range

# View Node IPs
kubectl get nodes -o wide

# View Pod IPs
kubectl get pods -o wide --all-namespaces

# View Service IPs
kubectl get services --all-namespaces

Real-World Configuration Example

# Cluster Configuration
Node Network:    10.28.28.0/22
  - Control Plane: 10.28.28.3
  - Worker 1:      10.28.28.4
  - Worker 2:      10.28.28.5

Pod Network:     10.8.0.0/16
  - Worker 1 CIDR: 10.8.0.0/24
  - Worker 2 CIDR: 10.8.1.0/24
  - Worker 3 CIDR: 10.8.2.0/24

Service Network: 10.96.0.0/12
  - kubernetes service:    10.96.0.1
  - kube-dns service:      10.96.0.10
  - Your app services:     10.96.x.x

↑ Back to Table of Contents

Container-to-Container Communication

Within the Same Pod

Containers in the same Pod share a network namespace, enabling localhost communication. This network namespace is maintained by a special pause container that Kubernetes automatically creates for every Pod.

📖 Deep Dive: See pause-containers.md for detailed explanation of how pause containers maintain network stability across container restarts.

┌─────────────────────────────────────────────────┐
│                    Pod: web-app                 │
│               IP: 10.8.0.5                      │
│                                                 │
│  ┌──────────────────┐      ┌─────────────────┐  │
│  │  Container 1     │      │  Container 2    │  │
│  │  (nginx)         │◄────►│  (redis)        │  │
│  │                  │      │                 │  │
│  │  Port: 8080      │      │  Port: 6379     │  │
│  └──────────────────┘      └─────────────────┘  │
│           │                         │           │
│           └─────────┬───────────────┘           │
│                     │                           │
│          ┌──────────▼──────────┐                │
│          │   localhost (lo)    │                │
│          │   127.0.0.1         │                │
│          └─────────────────────┘                │
└─────────────────────────────────────────────────┘

Characteristics

Shared Resources: Network namespace, IP address, loopback interface
Communication Method: via localhost or 127.0.0.1
Port Conflicts: Containers must use different ports (can't both use port 80)
Use Cases:
- Sidecar containers (logging, monitoring)
- Service mesh proxies (Envoy, Linkerd)
- Init containers for setup tasks

Example Communication

# NGINX container talks to Redis container in same pod
redis-cli -h localhost -p 6379

# Sidecar logging agent reads from main container
curl http://localhost:8080/metrics

↑ Back to Table of Contents

Pod-to-Pod Communication

Same Node Communication

Pods on the same node communicate through a virtual bridge and veth pairs.

┌──────────────────────────────────────────────────┐
│                        Node 1                    │
│                     10.28.28.4                   │
│                                                  │
│  ┌─────────────┐              ┌─────────────┐    │
│  │   Pod A     │              │   Pod B     │    │
│  │  10.8.0.2   │              │  10.8.0.3   │    │
│  │             │              │             │    │
│  │  ┌───────┐  │              │  ┌───────┐  │    │
│  │  │ eth0  │  │              │  │ eth0  │  │    │
│  │  └───┬───┘  │              │  └───┬───┘  │    │
│  └──────┼──────┘              └──────┼──────┘    │
│         │                            │           │
│      veth0                         veth1         │
│         │                            │           │
│         └────────────┬───────────────┘           │
│                      │                           │
│              ┌───────▼───────┐                   │
│              │  cni0 bridge  │                   │
│              │               │                   │
│              └───────┬───────┘                   │
│                      │                           │
│              ┌───────▼───────┐                   │
│              │   eth0 (NIC)  │                   │
│              └───────────────┘                   │
└──────────────────────────────────────────────────┘

How It Works

Pod A sends packet to Pod B (10.8.0.2 → 10.8.0.3)
Packet goes through Pod A's eth0 (actually a veth pair)
Other end of veth pair connected to cni0 bridge
Bridge forwards packet to Pod B's veth pair
Packet arrives at Pod B's eth0

Key Points:

✅ No NAT required
✅ Direct IP-to-IP communication
✅ Layer 2 switching within node

Cross-Node Communication

Pods on different nodes communicate through the CNI network implementation.

┌───────────────────────────────────────────────────────────────────┐
│                         Kubernetes Cluster                        │
│                                                                   │
│  ┌──────────────────────────┐      ┌──────────────────────────┐   │
│  │       Node 1             │      │       Node 2             │   │
│  │    10.28.28.4            │      │    10.28.28.5            │   │
│  │    Pod CIDR: 10.8.0.0/24 │      │    Pod CIDR: 10.8.1.0/24 │   │
│  │                          │      │                          │   │
│  │  ┌─────────────┐         │      │  ┌─────────────┐         │   │
│  │  │   Pod A     │         │      │  │   Pod B     │         │   │
│  │  │  10.8.0.5   │         │      │  │  10.8.1.8   │         │   │
│  │  └──────┬──────┘         │      │  └──────▲──────┘         │   │
│  │         │                │      │         │                │   │
│  │    ┌────▼─────┐          │      │    ┌────┴─────┐          │   │
│  │    │cni0      │          │      │    │cni0      │          │   │
│  │    └────┬─────┘          │      │    └────▲─────┘          │   │
│  │         │                │      │         │                │   │
│  │    ┌────▼─────┐          │      │    ┌────┴─────┐          │   │
│  │    │  CNI     │          │      │    │  CNI     │          │   │
│  │    │  Plugin  │          │      │    │  Plugin  │          │   │
│  │    └────┬─────┘          │      │    └────▲─────┘          │   │
│  │         │                │      │         │                │   │
│  │    ┌────▼─────┐          │      │    ┌────┴─────┐          │   │
│  │    │ eth0     │──────────┼──────┼───►│ eth0     │          │   │
│  │    └──────────┘          │      │    └──────────┘          │   │
│  │                          │      │                          │   │
│  └──────────────────────────┘      └──────────────────────────┘   │
│                                                                   │
│  Packet Flow:                                                     │
│  1. Pod A (10.8.0.5) → Pod B (10.8.1.8)                           │
│  2. cni0 bridge on Node 1 → CNI plugin routes                     │
│  3. Encapsulated (VXLAN/IP-in-IP) or routed (BGP)                 │
│  4. Physical network → Node 2 eth0                                │
│  5. CNI plugin decapsulates → cni0 → Pod B                        │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘

CNI Implementation Methods

Method	Example CNI	How It Works	Use Case
Overlay Network	Flannel (VXLAN), Weave	Encapsulates packets in UDP/VXLAN tunnel	Simple setup, works across any network
IP-in-IP	Calico	Encapsulates IP packet in another IP packet	Better performance than VXLAN
BGP Routing	Calico (BGP mode)	Uses BGP to advertise Pod routes	Best performance, requires BGP support
eBPF	Cilium	Uses eBPF for kernel-level packet processing	Modern, high-performance, observability

Example: Cross-Node Pod Communication

# From Pod A on Node 1, ping Pod B on Node 2
ping 10.8.1.8

# Trace the route
traceroute 10.8.1.8

# Check routing table on node
ip route show
# Output might show:
# 10.8.1.0/24 via 10.28.28.5 dev eth0

↑ Back to Table of Contents

Service Discovery and Load Balancing

kube-proxy: How Services Work

kube-proxy runs on every node and makes Kubernetes Services work by translating Service virtual IPs into actual Pod IPs. It watches the API server for Service and Endpoint changes, then programs network rules (iptables/IPVS/eBPF) to route traffic to the correct Pods.

kube-proxy Modes:

Mode	How It Works	Performance	Use Case
iptables	Creates iptables rules for each service	Good	Default, most compatible
IPVS	Uses Linux IPVS for load balancing	Better	Large clusters (>1000 services)
eBPF	Uses eBPF programs in kernel	Best	Modern, requires newer kernels

📖 Detailed Guide: See kube-proxy.md for complete explanation of how kube-proxy works, detailed mode comparisons, traffic flow diagrams, configuration examples, and troubleshooting

The Service Abstraction

Services provide stable endpoints for accessing a group of Pods, even as Pods are created/destroyed. When you create a Service, it gets a stable ClusterIP. kube-proxy on each node translates traffic to this ClusterIP into connections to actual Pod IPs, providing load balancing across healthy Pods.

┌──────────────────────────────────────────────────────────────┐
│                    Kubernetes Service                        │
│                                                              │
│                  ┌─────────────────────┐                     │
│                  │   Service: web      │                     │
│                  │   Type: ClusterIP   │                     │
│                  │   IP: 10.96.0.100   │                     │
│                  │   Port: 80          │                     │
│                  └──────────┬──────────┘                     │
│                             │                                │
│                    (kube-proxy manages)                      │
│                             │                                │
│       ┌─────────────────────┼───────────────────┐            │
│       │                     │                   │            │
│       │                     │                   │            │
│  ┌────▼────┐          ┌────▼────┐          ┌────▼────┐       │
│  │  Pod 1  │          │  Pod 2  │          │  Pod 3  │       │
│  │ 10.8.0. │          │ 10.8.0. │          │ 10.8.1. │       │
│  │   10    │          │   11    │          │    5    │       │
│  │         │          │         │          │         │       │
│  │label:   │          │label:   │          │label:   │       │
│  │app=web  │          │app=web  │          │app=web  │       │
│  └─────────┘          └─────────┘          └─────────┘       │
│                                                              │
│  EndpointSlice tracks healthy Pod IPs                        │
└──────────────────────────────────────────────────────────────┘

Service Types

1. ClusterIP (Default)

Internal-only service with a stable cluster IP.

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: ClusterIP
  selector:
    app: web
  ports:
    - protocol: TCP
      port: 80        # Service port
      targetPort: 8080 # Pod port

Characteristics:

✅ Only accessible within the cluster
✅ Stable virtual IP (ClusterIP)
✅ Load balances to backend Pods
✅ Default service type

2. NodePort

Exposes service on each node's IP at a static port.

┌───────────────────────────────────────────────────────────┐
│                   Kubernetes Cluster                      │
│                                                           │
│  External Client                                          │
│       │                                                   │
│       │  http://10.28.28.3:30080                          │
│       │  http://10.28.28.4:30080                          │
│       │  http://10.28.28.5:30080                          │
│       │                                                   │
│  ┌────▼──────┐     ┌───────────┐     ┌───────────┐        │
│  │  Node 1   │     │  Node 2   │     │  Node 3   │        │
│  │ :30080    │     │ :30080    │     │ :30080    │        │
│  └────┬──────┘     └─────┬─────┘     └─────┬─────┘        │
│       │                  │                 │              │
│       └──────────────────┼─────────────────┘              │
│                          │                                │
│                   ┌──────▼─────┐                          │
│                   │   Service  │                          │
│                   │ ClusterIP  │                          │
│                   └─────┬──────┘                          │
│                         │                                 │
│                ┌────────┼─────────┐                       │
│                │        │         │                       │
│           ┌────▼───┐ ┌──▼────┐ ┌──▼────┐                  │
│           │ Pod 1  │ │ Pod 2 │ │ Pod 3 │                  │
│           └────────┘ └───────┘ └───────┘                  │
└───────────────────────────────────────────────────────────┘

apiVersion: v1
kind: Service
metadata:
  name: my-nodeport-service
spec:
  type: NodePort
  selector:
    app: web
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
      nodePort: 30080  # Port range: 30000-32767

Characteristics:

✅ Accessible from outside cluster
⚠️ Exposes port on all nodes
⚠️ Limited port range (30000-32767)
⚠️ Need to manage node IPs

3. LoadBalancer

Provisions an external load balancer (cloud provider).

┌────────────────────────────────────────────────────┐
│                  Cloud Environment                 │
│                                                    │
│                   External Clients                 │
│                          │                         │
│                          │  http://lb-ip:80        │
│                          │                         │
│                  ┌───────▼───────┐                 │
│                  │  Cloud Load   │                 │
│                  │  Balancer     │                 │
│                  │  (AWS ELB/    │                 │
│                  │   GCP LB/     │                 │
│                  │   Azure LB)   │                 │
│                  └──────┬────────┘                 │
│                         │                          │
│  ┌──────────────────────┼───────────────────────┐  │
│  │        Kubernetes Cluster                    │  │
│  │                      │                       │  │
│  │  ┌────────┐      ┌───▼────┐      ┌────────┐  │  │
│  │  │ Node 1 │      │ Node 2 │      │ Node 3 │  │  │
│  │  │:30080  │      │:30080  │      │:30080  │  │  │
│  │  └───┬────┘      └────┬───┘      └────┬───┘  │  │
│  │      │                │               │      │  │
│  │      └────────────────┼───────────────┘      │  │
│  │                       │                      │  │
│  │                ┌──────▼─────┐                │  │
│  │                │   Service  │                │  │
│  │                │ ClusterIP  │                │  │
│  │                └─────┬──────┘                │  │
│  │                      │                       │  │
│  │             ┌────────┼─────────┐             │  │
│  │             │        │         │             │  │
│  │        ┌────▼───┐ ┌──▼────┐ ┌──▼────┐        │  │
│  │        │ Pod 1  │ │ Pod 2 │ │ Pod 3 │        │  │
│  │        └────────┘ └───────┘ └───────┘        │  │
│  └──────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────┘

apiVersion: v1
kind: Service
metadata:
  name: my-loadbalancer-service
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

Characteristics:

✅ Automatic external IP provisioning
✅ Managed by cloud provider
✅ High availability
⚠️ Requires cloud environment or MetalLB (bare metal)
⚠️ Cost per LoadBalancer

Load Balancer Types

Kubernetes LoadBalancer services behave differently depending on the environment:

Environment	Implementation	How It Works
Cloud (AWS)	AWS ELB/NLB/ALB	Cloud provider provisions load balancer automatically
Cloud (GCP)	Google Cloud Load Balancer	GCP creates external load balancer with health checks
Cloud (Azure)	Azure Load Balancer	Azure provisions L4 load balancer with public IP
Bare Metal	MetalLB	Software-based load balancer using L2 (ARP) or L3 (BGP)
On-Premises	MetalLB / F5 / HAProxy	Requires manual configuration or MetalLB

Cloud vs On-Premises Load Balancing

Cloud Environment:

# Cloud LoadBalancer - Automatic provisioning
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080

# Cloud provider automatically:
# 1. Provisions external load balancer
# 2. Assigns public IP address
# 3. Configures health checks
# 4. Routes traffic to nodes

Result: Service gets external IP from cloud provider (e.g., 34.123.45.67)

On-Premises/Bare Metal:

Without MetalLB:

$ kubectl get svc my-app
NAME     TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)
my-app   LoadBalancer   10.106.190.47   <pending>     80:30681/TCP

With MetalLB:

$ kubectl get svc my-app
NAME     TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)
my-app   LoadBalancer   10.106.190.47   10.28.31.101   80:30681/TCP

MetalLB provides two modes:

Layer 2 Mode: Uses ARP, one node handles traffic (simple setup)
Layer 3 Mode: Uses BGP, true load balancing across nodes (production)

📖 Detailed Guide: See metallb.md for comprehensive MetalLB setup, Layer 2 vs Layer 3 comparison, and configuration examples

EndpointSlices

Kubernetes uses EndpointSlices to track which Pods are backing a Service:

apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: my-service-abc123
  labels:
    kubernetes.io/service-name: my-service
addressType: IPv4
ports:
  - name: http
    protocol: TCP
    port: 8080
endpoints:
  - addresses:
      - "10.8.0.10"
    conditions:
      ready: true
  - addresses:
      - "10.8.0.11"
    conditions:
      ready: true

↑ Back to Table of Contents

External Access Patterns

Ingress Architecture

Ingress provides HTTP/HTTPS routing to services based on rules.

┌──────────────────────────────────────────────────────┐
│                        Internet                      │
│                            │                         │
│                  ┌─────────▼─────────┐               │
│                  │   DNS Resolution  │               │
│                  │  app.example.com  │               │
│                  └─────────┬─────────┘               │
│                            │                         │
│  ┌─────────────────────────┼──────────────────────┐  │
│  │         Kubernetes Cluster                     │  │
│  │                         │                      │  │
│  │             ┌───────────▼──────────┐           │  │
│  │             │  Ingress Controller  │           │  │
│  │             │  (NGINX/Traefik/     │           │  │
│  │             │   HAProxy)           │           │  │
│  │             │  LoadBalancer:80,443 │           │  │
│  │             └──────────┬───────────┘           │  │
│  │                        │                       │  │
│  │            Ingress Rules (routes)              │  │
│  │                        │                       │  │
│  │       ┌────────────────┼───────────────┐       │  │
│  │       │                │               │       │  │
│  │  ┌────▼─────┐    ┌─────▼─────┐   ┌─────▼────┐  │  │
│  │  │ Service  │    │  Service  │   │  Service │  │  │
│  │  │   web    │    │    api    │   │   admin  │  │  │
│  │  │ :80      │    │   :80     │   │   :80    │  │  │
│  │  └────┬─────┘    └────┬──────┘   └────┬─────┘  │  │
│  │       │               │               │        │  │
│  │  ┌────▼────┐     ┌────▼────┐     ┌────▼────┐   │  │
│  │  │  Pods   │     │  Pods   │     │  Pods   │   │  │
│  │  │  (web)  │     │  (api)  │     │ (admin) │   │  │
│  │  └─────────┘     └─────────┘     └─────────┘   │  │
│  └────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────┘

Routing Rules:
  app.example.com/        → web service
  app.example.com/api     → api service
  admin.example.com/      → admin service

Ingress Resource Example

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-service
            port:
              number: 80
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
  - host: admin.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: admin-service
            port:
              number: 80
  tls:
  - hosts:
    - app.example.com
    - admin.example.com
    secretName: tls-secret

Gateway API (Next Generation)

The Gateway API is the evolution of Ingress, providing more expressive and extensible routing.

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: example-gateway
spec:
  gatewayClassName: example-gateway-class
  listeners:
  - name: http
    protocol: HTTP
    port: 80

↑ Back to Table of Contents

Network Policies

Network Policies control traffic flow between Pods (Layer 3/4 firewall rules).

Default Behavior (No Network Policy)

┌──────────────────────────────────────────┐
│         All Pods Can Talk to All         │
│                                          │
│  ┌────────┐    ┌────────┐    ┌────────┐  │
│  │Frontend│◄──►│Backend │◄──►│  DB    │  │
│  │  Pod   │    │  Pod   │    │  Pod   │  │
│  └────────┘    └────────┘    └────────┘  │
│      ▲             ▲             ▲       │
│      └─────────────┴─────────────┘       │
│           All traffic allowed            │
└──────────────────────────────────────────┘

With Network Policy (Restricted)

┌──────────────────────────────────────────┐
│       Network Policy Applied             │
│                                          │
│  ┌────────┐    ┌────────┐    ┌────────┐  │
│  │Frontend│───►│Backend │───►│  DB    │  │
│  │  Pod   │    │  Pod   │    │  Pod   │  │
│  └────────┘    └────────┘    └────────┘  │
│                                          │
│  ✓ Frontend → Backend allowed            │
│  ✓ Backend → DB allowed                  │
│  ✗ Frontend → DB blocked                 │
│  ✗ External → DB blocked                 │
└──────────────────────────────────────────┘

Network Policy Example

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-network-policy
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432

This policy:

Applies to Pods with label app=backend
Ingress: Only allows traffic from Pods with app=frontend on port 8080
Egress: Only allows traffic to Pods with app=database on port 5432

Important Notes

⚠️ Network Policies require CNI support: Not all CNI plugins implement NetworkPolicy

✅ Supports: Calico, Cilium, Weave
❌ Doesn't support: Flannel (basic mode)

↑ Back to Table of Contents

DNS Resolution

CoreDNS Architecture

┌─────────────────────────────────────────────────────────────┐
│                   Kubernetes Cluster                        │
│                                                             │
│  ┌──────────┐                                               │
│  │   Pod    │                                               │
│  │          │   1. DNS Query: my-service.default.svc        │
│  │          │   ────────────────────┐                       │
│  └──────────┘                       │                       │
│                                     ▼                       │
│                          ┌──────────────────┐               │
│                          │    CoreDNS       │               │
│                          │   (kube-system)  │               │
│                          │   ClusterIP:     │               │
│                          │   10.96.0.10     │               │
│                          └────────┬─────────┘               │
│                                   │                         │
│                    2. Resolves to ClusterIP                 │
│                                   │                         │
│                          ┌────────▼─────────┐               │
│                          │   Service        │               │
│                          │   my-service     │               │
│                          │   10.96.0.100    │               │
│                          └──────────────────┘               │
└─────────────────────────────────────────────────────────────┘

DNS Naming Convention

Kubernetes DNS follows this format:

<service-name>.<namespace>.svc.<cluster-domain>

Examples:

my-service - Same namespace (short form)
my-service.default - Specific namespace
my-service.default.svc.cluster.local - Fully qualified (FQDN)

Service DNS Records

# From a pod, resolve a service
nslookup my-service.default.svc.cluster.local

# Output:
# Name: my-service.default.svc.cluster.local
# Address: 10.96.0.100

Pod DNS Records

Pods get DNS records in this format:

<pod-ip-with-dashes>.<namespace>.pod.<cluster-domain>

Example: Pod with IP 10.8.0.5 in default namespace:

10-8-0-5.default.pod.cluster.local

Headless Services

For direct Pod-to-Pod discovery without load balancing:

apiVersion: v1
kind: Service
metadata:
  name: my-headless-service
spec:
  clusterIP: None  # Headless
  selector:
    app: database
  ports:
  - port: 5432

DNS returns all Pod IPs directly:

nslookup my-headless-service.default.svc.cluster.local

# Returns:
# Name: my-headless-service.default.svc.cluster.local
# Address: 10.8.0.10
# Address: 10.8.0.11
# Address: 10.8.1.5

Pod DNS Configuration

Pods can customize DNS settings:

apiVersion: v1
kind: Pod
metadata:
  name: custom-dns-pod
spec:
  containers:
  - name: app
    image: nginx
  dnsPolicy: "None"
  dnsConfig:
    nameservers:
      - 1.1.1.1
    searches:
      - default.svc.cluster.local
      - svc.cluster.local
      - cluster.local
    options:
      - name: ndots
        value: "5"

↑ Back to Table of Contents

Summary Reference

Communication Matrix

Source	Destination	Method	NAT Required	Component
Container	Container (same Pod)	`localhost`	❌ No	Shared namespace
Pod	Pod (same node)	Virtual bridge/veth	❌ No	CNI plugin
Pod	Pod (different node)	Overlay/BGP routing	❌ No	CNI plugin
Pod	Service	ClusterIP	❌ No	kube-proxy
External	Service (NodePort)	NodeIP:NodePort	✅ Yes	kube-proxy
External	Service (LoadBalancer)	External LB IP	✅ Yes	Cloud LB + kube-proxy
External	Service (Ingress)	Ingress rules (L7)	✅ Yes	Ingress Controller
Pod	Internet	Default gateway	✅ Yes (SNAT)	Node iptables

Port Ranges

Type	Port Range	Usage
Container Ports	1-65535	Application listens
Service Ports	1-65535	Service exposes
NodePort	30000-32767	External access via node

Key Concepts Summary

┌────────────────────────────────────────────────────────────┐
│ Networking Layer Stack                                     │
├────────────────────────────────────────────────────────────┤
│ Layer 7 (Application)                                      │
│   └─► Ingress / Gateway API (HTTP/HTTPS routing)           │
│                                                            │
│ Layer 4 (Transport)                                        │
│   └─► Service (TCP/UDP load balancing)                     │
│                                                            │
│ Layer 3 (Network)                                          │
│   └─► Pod IPs, CNI routing, NetworkPolicy                  │
│                                                            │
│ Layer 2 (Data Link)                                        │
│   └─► Virtual bridges, veth pairs                          │
└────────────────────────────────────────────────────────────┘

Quick Command Reference

# Check pod IP and network namespace
kubectl get pod <pod-name> -o wide

# Get service endpoints
kubectl get endpoints <service-name>
kubectl get endpointslices

# Test service connectivity from a pod
kubectl exec -it <pod-name> -- curl <service-name>

# Check DNS resolution
kubectl exec -it <pod-name> -- nslookup <service-name>

# View kube-proxy mode
kubectl logs -n kube-system <kube-proxy-pod>

# Check network policies
kubectl get networkpolicies

# Describe service for details
kubectl describe service <service-name>

# Test external access (NodePort)
curl http://<node-ip>:<node-port>

Best Practices

✅ DO:

Use Services for stable endpoints
Implement NetworkPolicies for security
Use Ingress for HTTP/HTTPS routing
Choose appropriate Service type for your use case
Use DNS names instead of hardcoded IPs
Monitor CNI plugin health
Use headless services for StatefulSets

❌ DON'T:

Hardcode Pod IPs (they change)
Expose NodePort in production (use LoadBalancer/Ingress)
Forget to implement NetworkPolicies
Use host network mode unless necessary
Bypass Services for Pod-to-Pod communication

↑ Back to Table of Contents

📚 Additional Resources

Last Updated: December 2025 Kubernetes Version: v1.28+

Uh oh!

FilesExpand file tree

k8s-networking-fundamentals.md

Latest commit

History

k8s-networking-fundamentals.md

File metadata and controls

🌐 Kubernetes Networking Fundamentals

📋 Table of Contents

Core Network Model

Fundamental Principles

Architecture Components

Key Components

Understanding Kubernetes Networks

1. Node Network (Underlay/Physical Network)

2. Pod Network (Overlay Network)

3. Service Network (ClusterIP Range)

Network Relationships and Data Flow

Network Configuration Summary

Important Network Rules

Checking Your Network Configuration

Real-World Configuration Example

Container-to-Container Communication

Within the Same Pod

Characteristics

Example Communication

Pod-to-Pod Communication

Same Node Communication

How It Works

Cross-Node Communication

CNI Implementation Methods

Example: Cross-Node Pod Communication

Service Discovery and Load Balancing

kube-proxy: How Services Work

The Service Abstraction

Service Types

1. ClusterIP (Default)

2. NodePort

3. LoadBalancer

Load Balancer Types

Cloud vs On-Premises Load Balancing

EndpointSlices

External Access Patterns

Ingress Architecture

Ingress Resource Example

Gateway API (Next Generation)

Network Policies

Default Behavior (No Network Policy)

With Network Policy (Restricted)

Network Policy Example

Important Notes

DNS Resolution

CoreDNS Architecture

DNS Naming Convention

Service DNS Records

Pod DNS Records

Headless Services

Pod DNS Configuration

Summary Reference

Communication Matrix

Port Ranges

Key Concepts Summary

Quick Command Reference

Best Practices

📚 Additional Resources