22 commits
0ac8470
fix(docs): install docs dependencies using venv
riccardotornesello Nov 29, 2025
c998360
doc(network): migrate Francesco Cheinasso's documents to the main doc…
riccardotornesello Nov 29, 2025
b331c98
Update Path
MircoBarone Dec 2, 2025
a6de227
Update paths
MircoBarone Dec 2, 2025
2d1d727
docs: improving networking overview
MircoBarone Dec 2, 2025
e2ade8d
docs:Example Added
MircoBarone Dec 2, 2025
1452da0
docs:Images Changed
MircoBarone Dec 5, 2025
beb1a80
Docs: images-minor changes
MircoBarone Dec 5, 2025
810b103
docs:Added image S-DNAT
MircoBarone Dec 5, 2025
93a5f50
doc:Added Wireguard/Geneve pictures
MircoBarone Dec 5, 2025
6538451
docs: Wireguard/Geneve links updated
MircoBarone Dec 5, 2025
6470a65
Docs:Minor Changes
MircoBarone Dec 5, 2025
311e0b8
docs:Rewrite route-example
MircoBarone Dec 13, 2025
f3b0027
docs:Updated Links
MircoBarone Dec 13, 2025
1f3c593
docs:Updated Basic Example
MircoBarone Dec 14, 2025
1f9823d
docs: improve FirewallConfiguration examples
MircoBarone Dec 18, 2025
5c1b202
fix(docs): remove committed submodule
riccardotornesello Dec 18, 2025
6653eeb
docs: remove leftovers and format documents
riccardotornesello Dec 18, 2025
4a0bf03
docs: suggest documentation autoreload in readme
riccardotornesello Dec 18, 2025
8078d6b
docs: fix small text
riccardotornesello Dec 18, 2025
014a8d1
fix(docs): fix makefile catchall installation and format markdown
riccardotornesello Dec 18, 2025
7f8861a
docs: add known limitations for Geneve and Kubernetes Services
Gabbe64 Jan 12, 2026
1 change: 1 addition & 0 deletions .gitignore
@@ -42,6 +42,7 @@ liqo_kubeconf*

hack/code-generator
docs/_build
docs/.venv

# kubebuilder generated files
/config
20 changes: 14 additions & 6 deletions docs/Makefile
@@ -5,8 +5,11 @@
# from the environment for the first two.
SPHINXOPTS ?= -W -n --keep-going
SPHINXAUTOOPTS ?= -W -n
SPHINXBUILD ?= sphinx-build
SPHINXAUTOBUILD ?= sphinx-autobuild
VENV ?= .venv
PYTHON ?= $(VENV)/bin/python
PIP ?= $(VENV)/bin/pip
SPHINXBUILD ?= $(VENV)/bin/sphinx-build
SPHINXAUTOBUILD ?= $(VENV)/bin/sphinx-autobuild
SOURCEDIR = .
BUILDDIR = _build

@@ -18,17 +21,22 @@ help:

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
%: install Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

install:
pip3 install -r requirements.txt
@if [ ! -d "$(VENV)" ]; then python3 -m venv $(VENV); fi
@$(PIP) install --upgrade pip setuptools wheel
@$(PIP) install -r requirements.txt

serve: install html
docker run --rm -it -v $(PWD)/_build/html:/usr/share/nginx/html -p 8080:80 nginx
@$(PYTHON) -m http.server 8080 --directory $(BUILDDIR)/html

autoserve: install
@$(SPHINXAUTOBUILD) "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXAUTOOPTS) --port 8081 $(O)
@$(SPHINXAUTOBUILD) "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXAUTOOPTS) --port 8080 $(O)

lint:
markdownlint lint -c ../.markdownlint.yml -i usage/liqoctl **/*.md

clean-venv:
@rm -rf $(VENV)
2 changes: 1 addition & 1 deletion docs/README.md
@@ -33,5 +33,5 @@ Images should be put in the [docs/_static/images folder](docs/_static/images).
In case you want to browse the documentation website on your local computer, the Makefile provides a simple way to do so and make it available at [http://localhost:8080](http://localhost:8080).

```bash
make serve
make autoserve
```
1 change: 1 addition & 0 deletions docs/_static/images/architecture/network/SNAT-DNAT.svg
_(Binary image files: not rendered in the diff view.)_
13 changes: 13 additions & 0 deletions docs/_toc.yml
@@ -56,6 +56,19 @@ subtrees:
- file: advanced/external-ip-remapping.md
- file: advanced/k8s-api-server-proxy.md

- caption: Architecture
entries:
- file: architecture/network/main.md
entries:
- file: architecture/network/basic.md
- file: architecture/network/basic_example.md
- file: architecture/network/leaftoleaf.md
- file: architecture/network/nodeport.md
- file: architecture/network/firewallconfiguration.md
- file: architecture/network/routeconfiguration.md
- file: architecture/network/ip.md
- file: architecture/network/internalnode.md

- caption: Contributing
entries:
- file: contributing/contributing.md
106 changes: 106 additions & 0 deletions docs/architecture/network/basic.md
@@ -0,0 +1,106 @@
# Basic Concepts

The Liqo network is a cloud-native solution that enables seamless communication between Kubernetes clusters. In particular, it provides pod-to-pod communication across clusters. Note that the reflection of **services** and **endpointslices** is not handled by the networking module; it is managed by the virtual kubelet.

The two main areas of the Liqo network are:

- **Internal network**: it manages the traffic inside the cluster and forwards it between nodes and gateways. It uses the Geneve protocol to encapsulate the traffic (the green tunnels in the image below).
- **External network**: it manages the traffic between different clusters. It uses WireGuard to create a secure tunnel between gateways (the orange tunnels in the image below).

![basic](../../_static/images/architecture/network/basev2.excalidraw.png)

## Internal network

In the internal network, every node is connected to each gateway (one for each peering) using a dedicated Geneve tunnel. This architecture ensures that:

- For each node, there is a dedicated Geneve interface on every gateway.
- For each gateway, there is a dedicated Geneve interface on every node.

Each Geneve interface is assigned an IP address from the **internal CIDR**. This allows the interfaces to be reachable and enables you to ping the Geneve interfaces both from the node and from the gateway, facilitating connectivity checks and troubleshooting.
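
For instance, a quick connectivity check from a node might look like the following (the interface name and addresses are illustrative; they match the example later in these docs):

```bash
# List the Geneve interfaces created on this node
ip link show type geneve

# Show the internal-CIDR address assigned to one of them
ip addr show dev liqo.00000

# Ping the gateway-side end of the tunnel (an address from the internal CIDR)
ping -c 3 10.71.0.3
```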

### Details on Geneve

[Geneve](https://github.com/torvalds/linux/blob/master/drivers/net/geneve.c) is used to traverse the CNI and reach the Gateway efficiently, without altering the original traffic generated by the pod.
The Geneve interface encapsulates Ethernet frames into UDP datagrams without encryption (Ethernet over UDP/IP).
![basic](../../_static/images/architecture/network/doc-Geneve.drawio.png)
The outer IP packet has the node's address as its source and the gateway's address as its destination. In Liqo, the destination UDP port defaults to 6091.
Once encapsulated, the packet can be processed by the CNI without issues, as its destination is clearly defined.
The encapsulated traffic is reintroduced into the CNI and routed to the gateway. If the gateway resides on a different node, the traffic may undergo an additional, nested encapsulation; this behavior depends on how the CNI handles inter-node communication and is beyond the scope of Liqo.
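
As a standalone illustration of the mechanism (shown purely for demonstration; Liqo manages its own tunnel interfaces), a Geneve interface can be created manually with iproute2:

```bash
# Create a Geneve interface that encapsulates Ethernet frames in UDP towards a
# peer, using the Liqo default port (addresses and IDs are illustrative)
ip link add name geneve-demo type geneve id 100 remote 192.0.2.10 dstport 6091
ip addr add 10.71.0.100/24 dev geneve-demo
ip link set geneve-demo up
```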

#### Known Limitations: Geneve and Kubernetes Services

The use of Kubernetes Services (ClusterIP, NodePort, or LoadBalancer) as endpoints for Geneve tunnels is **not supported**. Kubernetes Services do not represent physical network interfaces but are implemented via DNAT rules and Connection Tracking.
This architecture is incompatible with the Geneve protocol, as the modification of IP headers and the lack of a direct physical binding prevent proper tunnel establishment and packet decapsulation. **Geneve Tunnels must be established directly on Pod or Node IPs.**

This is the reason why internal networking does not use ClusterIPs, but instead relies on dedicated controllers that dynamically track the actual Gateway IP, with specific implementations for the [client](https://github.com/liqotech/liqo/blob/8543b7ff7c34e601b60cde3f1666f3146ed29b71/pkg/liqo-controller-manager/networking/internal-network/client-controller/client_controller.go#L130) and [server](https://github.com/liqotech/liqo/blob/8543b7ff7c34e601b60cde3f1666f3146ed29b71/pkg/liqo-controller-manager/networking/internal-network/server-controller/server_controller.go#L130) roles.

## External network

The external network manages the traffic flowing between different clusters.

Each WireGuard tunnel is created inside a dedicated gateway pod. For every peering, a dedicated gateway pod is deployed in both clusters involved in the connection. Each gateway pod establishes a secure WireGuard tunnel that connects only to a single remote cluster. This means that every gateway can point to just one remote cluster, ensuring isolation and security for each inter-cluster connection.

There are two types of gateway pods: **gateway-client** and **gateway-server**.

- The **gateway-client** acts as a WireGuard client, initiating the connection to the remote cluster.
- The **gateway-server** acts as a WireGuard server, accepting incoming connections from remote clients. The gateway-server must expose itself using a Kubernetes Service, which can be of type LoadBalancer, NodePort, or ClusterIP, depending on the cluster's networking setup and requirements (a quick way to check this is sketched below).
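
A quick way to check how the gateway-server is exposed is to list the Services in the tenant namespace created for the peering (the namespace name below is illustrative; it depends on your setup):

```bash
# Inspect how the gateway-server is exposed (LoadBalancer, NodePort, or ClusterIP)
kubectl get svc -n liqo-tenant-milan -o wide
```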

### Gateway Details

Each gateway in Liqo is equipped with at least three distinct interfaces:

- The default interface, which serves as the direct link to external networks.
- A GENEVE interface for each node within the cluster (at least one in the case of a single-node cluster).
- A tunnel interface for communication with the remote Gateway, WireGuard by default.

The gateway is also responsible for performing DNAT and SNAT to support the remapping of the remote cluster's CIDRs.
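
A minimal way to inspect the interfaces described above, assuming the gateway pod is reachable via `kubectl exec` (the namespace and deployment names are illustrative):

```bash
# List the interfaces inside a gateway pod
kubectl exec -n liqo-tenant-milan deploy/gateway -- ip link show
# Expected kinds of interfaces:
#   eth0        -> default interface towards the external network
#   liqo.XXXXX  -> one Geneve interface per node in the cluster
#   liqo-tunnel -> WireGuard tunnel towards the remote gateway
```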

#### Further details about Geneve

GENEVE-encapsulated traffic arrives at the gateway via the default interface. A GENEVE socket listens on port 6091. Here, the traffic is processed, decapsulated, and reintroduced into the gateway through the GENEVE interface itself. Once the packets are ready for transmission to the remote cluster, they are directed through the WireGuard interface using policy routing.

_Note_: Within the Gateway, a route is also configured to direct all traffic arriving from the tunnel and destined for a local pod back to the appropriate GENEVE interface.
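
From inside the gateway pod, this can be verified directly; a sketch (the grep patterns are illustrative):

```bash
# Check that the Geneve socket is listening on UDP port 6091
ss -uln | grep 6091

# Show the routes steering traffic from the tunnel back to the Geneve interfaces
ip route show table all | grep liqo
```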

#### Details on WireGuard

[WireGuard](https://www.wireguard.com/) is a secure protocol that encapsulates IP packets within UDP, ensuring encryption of the inner content. Unlike GENEVE, WireGuard operates at Layer 3 (L3), meaning its interface does not possess a MAC address.
In Liqo, during configuration, the local WireGuard interface is set to contact the appropriate endpoint, which in this case is the address of the node hosting the remote Gateway.
Once encapsulated, the traffic is reintroduced into the Gateway and exits via the default interface. It then traverses the node and exits through its primary interface to reach the remote endpoint.
![basic](../../_static/images/architecture/network/doc-Wireguard.drawio.png)
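
Inside the gateway pod, the tunnel can be inspected with the standard WireGuard tooling (the interface name matches the one used later in these docs; the listed fields are illustrative):

```bash
# Show the state of the WireGuard tunnel towards the remote gateway
wg show liqo-tunnel
# Reports the peer public key, the remote endpoint (the node hosting the
# remote gateway), the allowed IPs, and the time of the last handshake
```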

### Recap of the packet flow

The flow of a packet from a pod in one cluster to a pod in another cluster involves several steps:

1. The packet originates from a pod and reaches its node through a virtual Ethernet interface. This is managed by the CNI and is standard Kubernetes networking behavior (Liqo is not responsible for this part).
2. From the node, the packet enters a Geneve tunnel using a specific [route](routeconfiguration.md#remote-cluster-id-node-gw-node). The packet arrives at the other end of the Geneve tunnel, inside the gateway pod.
3. The gateway performs DNAT and routes the packet into the WireGuard tunnel (see [this route](routeconfiguration.md#remote-cluster-id-gw-ext-gateway)).
4. The traffic reaches the gateway pod in the remote cluster. It is decrypted by the WireGuard driver and reinserted into the network stack, then routed to the correct Geneve interface to reach the node hosting the target pod. This is achieved using these [routes](routeconfiguration.md#local-cluster-id-node-name-gw-node-gateway). After routing, SNAT is applied.
5. The packet is then forwarded from the node to the target pod, leveraging the CNI.

The return traffic follows the same path in reverse, ensuring symmetric routing between the two pods across clusters.

![Packets flow](../../_static/images/architecture/network/baseflow-v2.excalidraw.png)
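
To observe these steps live, one might capture traffic at the different hops; a sketch, assuming the interface names used above:

```bash
# On the source node: Geneve-encapsulated traffic leaving towards the gateway
tcpdump -ni eth0 'udp port 6091'

# Inside a gateway pod: the inner (decrypted) traffic on the WireGuard interface
tcpdump -ni liqo-tunnel icmp
```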

## CIDR Remapping

Liqo enables two clusters to peer even if they have the same pod CIDR. Each cluster can independently remap the remote cluster’s pod CIDR to a different, locally chosen CIDR. This remapped CIDR is decided by each cluster on its own and is not shared with other clusters.

For example, if both Cluster A and Cluster B use `10.0.0.0/16` as their pod CIDR, Cluster A can remap Cluster B’s pod CIDR (`10.0.0.0/16`) to a new CIDR, such as `11.0.0.0/16`. As a result, Cluster A will be able to reach a pod in Cluster B with IP `10.0.0.6` using the remapped IP `11.0.0.6`. The same applies in the opposite direction if Cluster B chooses to remap Cluster A’s CIDR.

When a packet directed to a remote pod enters the gateway, its destination IP is the remapped one. The gateway performs a DNAT operation to replace this remapped address (used only within the local cluster) with the corresponding real IP address used by the remote cluster.
![Packets flow](../../_static/images/architecture/network/SNAT-DNAT.svg)

In the example above, a packet originating from a local pod and destined for a remote pod enters the gateway with destination address `11.0.0.6`, which the gateway translates to `10.0.0.6` before sending the packet to the remote cluster.
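
To see this translation in place, one might list the NAT rules from inside the gateway pod; a sketch (the exact table and chain names may differ across Liqo versions):

```bash
# List the gateway's nftables rules and look for the remapping entries
nft list ruleset | grep -i -A 2 'nat'
# A DNAT remapping rule for the example above may look like (illustrative):
#   ip daddr 11.0.0.0/16 dnat prefix to 10.0.0.0/16
```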

Similarly, when the gateway receives incoming traffic, its source IP is the real one used by the remote cluster, so it is SNATed to the remapped address before the packet is forwarded to the appropriate pod.

All the remapping logic is managed by the IPAM component of Liqo, which tracks the used IPs and CIDRs inside the cluster.

All the firewall rules implementing this remapping are applied inside the gateway pods, ensuring seamless communication even in the presence of overlapping CIDRs.

The same logic also applies to the **external CIDR**: every cluster can remap its neighboring clusters' external CIDRs to different, locally chosen CIDRs. Note that if you do not change the default external CIDR used by Liqo (see the Helm values), every cluster will use the same external CIDR, and the neighbors' external CIDRs will therefore always appear remapped.

## Example

A full example of the entire path can be found in the [Pod-to-Pod example](./basic_example.md).
115 changes: 115 additions & 0 deletions docs/architecture/network/basic_example.md
@@ -0,0 +1,115 @@
# Pod-to-Pod Example

Let's analyze and debug the complete path of an ICMP packet sent from a pod in the consumer cluster to a pod in the provider cluster.

## Setup

For this example, we will use two Kubernetes-in-Docker (KinD) clusters with only one worker each and Calico as the CNI.

The entire flow of the packet is depicted in the image below:

![setup](../../_static/images/architecture/network/doc-example.drawio.png)

Here are the details of the two clusters:

- **Cluster Rome (Local):**
- **Pod:** `nginx-local` (`10.200.1.2`)
- **Gateway:** `gw-local` (`10.200.1.3`)
- **Node:** `rome-worker` (`172.18.0.2`)
- **Cluster Milan (Remote):**
- **Pod:** `nginx-remote` (`10.200.1.4`)
- **Gateway:** `gw-remote` (`10.200.1.5`)
- **Node:** `milan-worker` (`172.18.0.3`)

Since both clusters use `10.200.0.0/16` as the pod CIDR, Liqo remaps the pod CIDR of the remote cluster to `10.201.0.0/16` on the local cluster.

This means that, from the perspective of the local cluster, the remote pod `nginx-remote` will appear to have the IP address `10.201.1.4`.

This example follows the path of an ICMP Echo Request packet sent from `nginx-local` to `nginx-remote` using the command: `ping 10.201.1.4`.
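
Assuming `kubectl` access to the consumer cluster and a pod image that ships `ping`, the test traffic can be generated as follows:

```bash
# Send three ICMP Echo Requests from the local pod to the remapped remote IP
kubectl exec -it nginx-local -- ping -c 3 10.201.1.4
```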

## The packet flow

### Local Node

The ICMP packet exits the pod's default interface (e.g. `eth0`) and immediately enters the worker node's root network namespace via the other end of the veth pair (e.g., `caliaaaaa`). The node must now decide where to send this packet.

- **Policy Routing:** A custom policy [routing rule](routeconfiguration.md#remote-cluster-id-node-gw-node) matches the destination range `10.201.0.0/16` and selects a dedicated table containing the route `10.201.0.0/16 via 10.71.0.3 dev liqo.00000`. This route sends the packet to a special interface named `liqo.00000`, which is a **GENEVE tunnel**.
- **Encapsulation:** This interface encapsulates the _entire L2 Ethernet frame_ (not just the IP packet) inside a new UDP packet.
- **Inner Packet (Original):** `10.200.1.2` > `10.201.1.4`
- **Outer Packet (New):**
- **Source IP:** `172.18.0.2` (the `rome-worker` node)
- **Destination IP:** `10.200.1.3` (the `gw-local` pod)
- **Destination Port:** `6091` (the GENEVE port)

This new, larger UDP packet is then sent via the standard CNI network to the `gw-local` pod.

In this case, with just one node, the packet is sent directly to the `gw-local` pod; in a multi-node cluster, however, the CNI may route it to another node first.

```{admonition} TIP
To inspect the policy routing, use the command `ip rule show all` to find the correct table and `ip route show table <table_id>`.
```
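
For instance, on `rome-worker` the lookup might proceed as follows (the table ID is a placeholder; it varies per deployment):

```bash
# Locate the Liqo-managed policy routing rule for the remapped CIDR
ip rule list | grep 10.201.0.0/16

# Inspect the table referenced by that rule
ip route show table <table_id>
# Expected entry: 10.201.0.0/16 via 10.71.0.3 dev liqo.00000
```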

### Local Gateway

The encapsulated packet arrives at the gateway's default interface (e.g. `eth0`).

- **Decapsulation:** The gateway's networking stack recognizes the `6091` destination port and directs the packet to its own GENEVE interface (e.g., `liqo.11111`), which is paired with the worker's interface. The gateway strips the outer UDP/IP headers, extracting the original inner packet:
> `SRC: 10.200.1.2` > `DST: 10.201.1.4`
- **DNAT (Destination NAT):** This packet's destination (`10.201.1.4`) is meaningless to the remote cluster. The gateway must translate it back to the pod's _real_ IP. This is done using **`nftables`**. A rule in the `remap-podcidr` table performs a Destination NAT:
> `ip daddr 10.201.0.0/16 ... dnat prefix to 10.200.0.0/16`
- **Packet Transformation:**
- **Before DNAT:** `10.200.1.2` > `10.201.1.4`
- **After DNAT:** `10.200.1.2` > `10.200.1.4`
- **Routing to WireGuard:** Now that the packet has its _real_ destination, it's routed to the inter-cluster tunnel. The [route](routeconfiguration.md#remote-cluster-id-gw-ext-gateway) is: `10.200.0.0/16 via 169.254.18.1 dev liqo-tunnel`

```{admonition} TIP
To find the other end of a GENEVE interface, use the command `ip -d link`; both ends must have the same ID.
```
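
For example, the two ends of the tunnel used in this walkthrough can be compared (interface names as above):

```bash
# On rome-worker: details of the node-side Geneve interface
ip -d link show liqo.00000

# Inside gw-local: details of the gateway-side Geneve interface
ip -d link show liqo.11111

# Both outputs should report the same "geneve id" for paired interfaces
```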

### WireGuard Tunnel

The packet is sent to the `liqo-tunnel` interface, which is (by default) a **WireGuard** interface.

- **Encapsulation:** WireGuard encrypts the IP packet and wraps it in a new UDP packet for transport.
- **Inner Packet (NATTed):** `10.200.1.2` > `10.200.1.4`
- **Outer Packet:**
- **Source IP:** `10.200.1.3` (the `gw-local` pod)
- **Destination IP:** `172.18.0.3` (the `milan-worker` node, where `gw-remote` runs)
- **Node-Level SNAT:** This encrypted packet exits the `gw-local` pod and goes back to the `rome-worker` node. The node performs _another_ SNAT (Source NAT) to make the packet routable on the external network.
- **Before SNAT:** `10.200.1.3:36252` > `172.18.0.3:31864`
- **After SNAT:** `172.18.0.2:17806` > `172.18.0.3:31864`

This final, encrypted, and twice-NATted packet now travels "across the internet" (or in this case, the Docker network) from `rome-worker` to `milan-worker`.
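
This leg of the path can be observed on either node with a capture like the following (the NodePort matches the example above; it varies per deployment):

```bash
# Capture the encrypted, twice-NATted WireGuard traffic on the Docker network
tcpdump -ni eth0 'udp port 31864'
```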

### Remote Node

The encrypted packet arrives at the `milan-worker`'s default interface (e.g. `eth0`).

- **Node-Level DNAT:** Since the packet is addressed to the node itself, the node's kube-proxy intercepts it and performs a Destination NAT to send it to the `gw-remote` pod (see the conntrack sketch after this list).
- **Before DNAT:** `172.18.0.2:17806` > `172.18.0.3:31864`
- **After DNAT:** `172.18.0.3:44207` > `10.200.1.5:51840`
- **Routing:** The packet is sent to the pod according to the CNI routing rules. In our example the rule is: `10.200.1.5 dev caliccccc scope link`
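
To see this node-level DNAT in action, one might inspect the connection tracking table on `milan-worker`; a sketch (`conntrack-tools` may need to be installed first):

```bash
# Show the conntrack entry created by the kube-proxy DNAT towards gw-remote
conntrack -L -p udp | grep 31864
```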

### Remote Gateway

The encrypted packet arrives at the `gw-remote`'s default interface (e.g. `eth0`).

- **Decapsulation (WireGuard):** The gateway listens on port `51840`.
> **Decrypted Packet:** `SRC: 10.200.1.2` > `DST: 10.200.1.4`
- **SNAT (Source NAT):** This packet is almost ready, but the _source_ IP (`10.200.1.2`) is unknown to the `nginx-remote` pod and overlaps with the cluster's own pod CIDR. It must be NATted to the _remapped_ IP that Cluster Milan expects. The gateway performs a Source NAT.
- **Before SNAT:** `10.200.1.2` > `10.200.1.4`
- **After SNAT:** `10.201.1.2` > `10.200.1.4`
- **Routing to GENEVE:** The packet is now [routed](routeconfiguration.md#local-cluster-id-node-name-gw-node-gateway) to the destination pod via another GENEVE tunnel that connects the remote gateway to its worker. In our example the rule is: `10.200.1.4 via 10.71.0.2 dev liqo.22222`
- **Encapsulation (GENEVE):** The packet is encapsulated one last time for delivery to the worker node.
- **Inner Packet:** `10.201.1.2` > `10.200.1.4`
- **Outer Packet:** `10.200.1.5` > `172.18.0.3` (Port `6091`)

### Final Delivery

1. The GENEVE packet exits the `gw-remote` pod and enters the `milan-worker` node, which in our case is also the node hosting the target pod.
2. The node's GENEVE interface (`liqo.33333`) decapsulates the packet, retrieving:
> `SRC: 10.201.1.2` > `DST: 10.200.1.4`
3. The node's routing table (created by the CNI plugin) directs this packet to the final pod via the appropriate end of the `veth` pair.
4. The packet travels across the `veth` and arrives at the default interface of the `nginx-remote` pod.

The `nginx-remote` pod successfully receives the ICMP Echo Request from the (remapped) source `10.201.1.2` and sends an Echo Reply, which follows the entire path in reverse.