
Networking

This document explains how networking works in Nightshift — from how chicklets get IP addresses, to how users connect over SSH, to how HTTPS URLs route traffic to individual chicklets.

Overview

There are four network paths into a chicklet:

                          INTERNET
                             │
             ┌───────────────┼───────────────┐
             │               │               │
          :443 (TLS)      :2222 (SSH)    :30000-32767
             │               │            (NodePort)
        ┌────┴────┐    ┌─────┴─────┐         │
        │  Caddy  │    │ SSH Proxy │         │
        └────┬────┘    └─────┬─────┘         │
             │               │               │
        localhost:8080       │               │
             │               │               │
        ┌────┴────┐          │               │
        │   API   │          │               │
        │ Server  │          │               │
        └────┬────┘          │               │
             │               │               │
        reverse proxy   SSH tunnel      K8s Service
             │               │           (NodePort)
             │               │               │
             └───────────────┼───────────────┘
                             │
                     Flannel CNI (10.42.0.0/16)
                             │
                  ┌──────────┼──────────┐
                  │          │          │
              Pod A      Pod B      Pod C
            10.42.x.a  10.42.x.b  10.42.x.c
                  │          │          │
              Kata VM    Kata VM    Kata VM
| Path | Port | Purpose | Auth |
|------|------|---------|------|
| Caddy → API reverse proxy | 443 | HTTPS access to chicklet URLs (myapp-myorg.chicklet.io) | URL auth mode (public or API key) |
| SSH proxy | 2222 | Interactive shell, chicklet console, chicklet exec | SSH public key |
| NodePort | 30000-32767 | Direct TCP access to exposed ports | None |
| Caddy → API | 443 | API requests (api.chicklet.io) | API key bearer token |

Pod Networking (Flannel CNI)

K3s ships with Flannel as its default CNI. Every chicklet pod gets a unique IP from the 10.42.0.0/16 subnet.

How a chicklet gets its IP

  1. The API creates a Chicklet CRD in a K8s namespace — user-<id> for personal chicklets, org-<slug> for org chicklets
  2. The Chicklet controller (internal/controller/workspace.go) creates a pod named cl-<name> in that namespace with RuntimeClassName: kata-clh
  3. K3s schedules the pod and Flannel assigns an IP from 10.42.0.0/16
  4. The Kata runtime (io.containerd.kata.v2) starts a Cloud Hypervisor VM, creates a TAP device on the Flannel bridge, and passes it to the guest kernel
  5. The controller polls pod.Status.PodIP and writes it to Chicklet.Status.PodIP

The pod IP is reachable from the host and from other pods (subject to NetworkPolicy). Since all chicklets run on a single node, Flannel doesn't need cross-node tunneling — traffic stays on the local bridge.

K8s namespaces

Chicklets are isolated into namespaces by owner:

| Context | Namespace | Example |
|---------|-----------|---------|
| Personal chicklet | user-<user-id> | user-5 |
| Org chicklet | org-<org-slug> | org-myteam |

Namespaces are created automatically when users register or create orgs. This allows chicklets in different orgs (or from different users) to share the same name without collision.
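The naming rule above can be sketched as a small helper. This is illustrative only — the function name and signature are assumptions, not the actual API server code:

```go
package main

import "fmt"

// namespaceFor returns the K8s namespace a chicklet is placed in:
// user-<id> for personal chicklets, org-<slug> for org chicklets.
// Hypothetical helper mirroring the documented naming scheme.
func namespaceFor(userID int, orgSlug string) string {
	if orgSlug != "" {
		return "org-" + orgSlug
	}
	return fmt.Sprintf("user-%d", userID)
}
```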

Kata VM networking

Each chicklet pod runs inside a Kata Containers VM using Cloud Hypervisor. The network path is:

Flannel bridge (cni0)
       │
   veth pair
       │
   TAP device
       │
  virtio-net (guest)
       │
  eth0 inside VM
       │
  chicklet process

Flannel creates a veth pair: one end goes on cni0 (the bridge), the other is handed to the Kata shim. The shim creates a TAP device and attaches it to the VM via virtio-net. Inside the VM, the guest kernel sees a normal eth0 interface with the pod IP assigned. From the network's perspective, the VM behaves exactly like a regular container.


SSH Proxy

The SSH proxy (cmd/proxy/main.go, internal/sshproxy/proxy.go) multiplexes SSH connections from users to individual chicklet pods. It listens on port 2222.

Connection flow

User: ssh -p 2222 myapp@ssh.chicklet.io
                    │
                    ▼
          ┌─────────────────┐
          │   SSH Proxy     │
          │   :2222         │
          │                 │
          │ 1. SSH handshake│
          │ 2. Username =   │
          │    chicklet name│
          │ 3. Validate key │
          │    against K8s  │
          │    secret       │
          └────────┬────────┘
                   │
              4. Look up
              pod IP
                   │
                   ▼
          ┌─────────────────┐
          │ Pod sshd (:22)  │
          │ 10.42.x.x       │
          │                 │
          │ User: chicklet  │
          │ Auth: proxy key │
          └─────────────────┘
  1. User connects to the proxy on port 2222. The SSH username is the chicklet name (e.g., myapp).

  2. Public key validation: The proxy reads the K8s secret cl-<name>-ssh which contains the chicklet's authorized_keys (user SSH keys + proxy's own public key). The user's key must match one of the stored keys.

  3. Pod IP lookup: The proxy queries the Chicklet K8s resource to get status.podIP. Results are cached for 30 seconds to avoid excessive K8s API calls.

  4. Upstream SSH connection: The proxy opens an SSH connection to <podIP>:22 as user chicklet, authenticating with its own Ed25519 host key. The chicklet's sshd trusts this key because the controller added the proxy's public key to the chicklet's authorized_keys.

  5. Channel piping: All SSH channels (session, port forwarding, subsystem) and requests (pty-req, shell, window-change, etc.) are bidirectionally forwarded between the client and the upstream connection.

SSH key trust chain

User's private key  ──authenticates──▶  SSH Proxy
                                            │
                                        Proxy's host key
                                            │
                                    ──authenticates──▶  Pod sshd

The controller ensures this works by building the authorized_keys in the SSH secret:

// internal/controller/workspace.go — ensureSSHSecret()
keys := cl.Spec.SSHKeys                          // user's SSH public keys
if proxyKey := r.getProxyPublicKey(); proxyKey != "" {
    keys = append(keys, proxyKey)                 // proxy's public key
}

The secret is mounted at /ssh-keys/authorized_keys inside the pod. An init script copies it to /home/chicklet/.ssh/authorized_keys, and a background loop re-syncs every 30 seconds to pick up key additions/removals.
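The secret payload assembled by the controller snippet above amounts to joining the keys into one authorized_keys file. A self-contained sketch (the helper name is hypothetical; the real logic is inside ensureSSHSecret()):

```go
package main

import "strings"

// buildAuthorizedKeys assembles the authorized_keys content stored in the
// cl-<name>-ssh secret: the user's public keys plus the proxy's own key.
// Illustrative sketch of the documented behavior.
func buildAuthorizedKeys(userKeys []string, proxyKey string) string {
	keys := append([]string{}, userKeys...)
	if proxyKey != "" {
		keys = append(keys, proxyKey)
	}
	return strings.Join(keys, "\n") + "\n"
}
```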

Host key generation

On first startup, the proxy generates an Ed25519 key pair and saves it to /var/lib/chicklet/ssh_proxy_host_key. This key is stable across restarts — users won't see host key warnings unless the proxy is reinstalled.


Port Exposure (NodePort Services)

When a user runs chicklet cl ports myapp --add 3000, the API updates Chicklet.Spec.Ports on the K8s resource. The controller reconciles this into a K8s Service.

How it works

  1. API sets spec.ports: [3000] on the Chicklet resource
  2. Controller calls ensureService(), which creates or updates a Service:

apiVersion: v1
kind: Service
metadata:
  name: cl-myapp-svc
  namespace: org-myteam
spec:
  type: NodePort
  selector:
    chicklet: myapp
  ports:
    - name: port-3000
      port: 3000
      targetPort: 3000
      protocol: TCP
      # nodePort: assigned by K8s (30000-32767)

  3. K8s assigns a NodePort from the 30000-32767 range
  4. Controller reads back the assigned port and writes it to Chicklet.Status.NodePorts:

NodePorts: { 3000: 31960 }   // container port → node port

  5. Traffic to <node-ip>:31960 is routed by kube-proxy to the pod's port 3000
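Given the Status.NodePorts map, resolving the externally reachable address for a container port is a single lookup. A sketch with an assumed helper name:

```go
package main

import "fmt"

// nodePortFor resolves the external address for a container port using the
// Status.NodePorts map (container port → node port). Illustrative only.
func nodePortFor(nodeIP string, nodePorts map[int]int, containerPort int) (string, bool) {
	np, ok := nodePorts[containerPort]
	if !ok {
		return "", false
	}
	return fmt.Sprintf("%s:%d", nodeIP, np), true
}
```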

NodePort vs pod IP

| Access method | Address | Use case |
|---------------|---------|----------|
| NodePort | <public-ip>:31960 | Direct external access, any TCP protocol |
| Pod IP | 10.42.x.x:3000 | Internal access (API reverse proxy, SSH proxy) |

The HTTPS reverse proxy (described below) uses the pod IP directly, not the NodePort, since it runs on the same host.

Lifecycle

  • Stop: When a chicklet is stopped, the controller deletes both the pod and the service. NodePorts are released.
  • Start: A new pod is created. If ports are still in the spec, a new service is created. NodePort numbers may change.
  • Delete: The controller's finalizer cleans up the pod, service, PV, PVC, and SSH secret.

HTTPS Reverse Proxy (Chicklet URLs)

Chicklets belonging to an organization get URLs like https://myapp-myorg.chicklet.io/. This is implemented as a three-layer stack: DNS, TLS, and HTTP reverse proxy.

Request flow

curl https://myapp-myorg.chicklet.io/hello
                    │
                    ▼
             ┌─────────────┐
             │   Caddy     │  1. TLS termination (on-demand cert)
             │   :443      │  2. Forward to localhost:8080
             └──────┬──────┘
                    │
                    ▼
             ┌─────────────┐
             │ API Server  │  3. Host header: myapp-myorg.chicklet.io
             │ RootHandler │  4. Parse DNS label: "myapp-myorg"
             │             │  5. DB lookup: name="myapp", org slug="myorg"
             │             │  6. Check URL auth mode
             │             │  7. K8s lookup: get pod IP + ports
             └──────┬──────┘
                    │
                    ▼
             ┌─────────────┐
             │ Chicklet pod│  8. httputil.ReverseProxy →
             │ 10.42.x.x   │     http://10.42.x.x:3000/hello
             │ :3000       │
             └─────────────┘

Layer 1: DNS

When a chicklet with an org is created, the API creates a DNS A record:

myapp-myorg.chicklet.io  →  A  →  44.216.161.189  (TTL: 60s)

The Vercel Domains API is used for record management. The record ID is stored in the chicklets.dns_record_id column so it can be deleted when the chicklet is removed.

If CHICKLET_VERCEL_TOKEN is not set, a no-op provider is used — no records are created, but the rest of the system still works (URLs are shown in API responses but won't resolve).
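The no-op fallback suggests record management sits behind a small interface. A sketch — the interface and method names are assumptions, not the actual code:

```go
package main

// DNSProvider abstracts DNS record management so the system can run with or
// without a real backend (Vercel Domains API in production). Names are
// illustrative.
type DNSProvider interface {
	CreateARecord(label, ip string) (recordID string, err error)
	DeleteRecord(recordID string) error
}

// noopDNS is the fallback when CHICKLET_VERCEL_TOKEN is unset: every call
// succeeds without creating records, so URLs appear in API responses but
// don't resolve.
type noopDNS struct{}

func (noopDNS) CreateARecord(label, ip string) (string, error) { return "", nil }
func (noopDNS) DeleteRecord(recordID string) error             { return nil }
```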

Layer 2: TLS (Caddy on-demand)

Caddy handles TLS termination with on-demand certificate issuance:

{
    on_demand_tls {
        ask http://localhost:8080/v1/internal/check-domain
    }
}

api.chicklet.io {
    reverse_proxy localhost:8080
}

:443 {
    tls {
        on_demand
    }
    reverse_proxy localhost:8080
}

When a new subdomain is requested for the first time:

  1. Caddy calls /v1/internal/check-domain?domain=myapp-myorg.chicklet.io
  2. The API parses the DNS label and queries the DB: SELECT ... FROM chicklets JOIN orgs WHERE name || '-' || slug = ?
  3. If the chicklet exists → 200 (proceed with cert issuance). If not → 404 (reject).
  4. Caddy obtains a certificate from Let's Encrypt via HTTP-01 challenge
  5. The cert is cached for future requests
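The ask-endpoint decision in steps 1-3 can be sketched as a pure function: accept only direct subdomains of the base domain whose label matches an existing chicklet. The `exists` callback stands in for the real DB lookup, and the function name is hypothetical:

```go
package main

import (
	"net/http"
	"strings"
)

// checkDomain returns the HTTP status the check-domain endpoint would send:
// 200 to allow cert issuance, 404 to reject. Sketch of the documented logic.
func checkDomain(domain, base string, exists func(label string) bool) int {
	suffix := "." + base
	if !strings.HasSuffix(domain, suffix) {
		return http.StatusNotFound
	}
	label := strings.TrimSuffix(domain, suffix)
	// Reject empty or nested labels, and labels with no matching chicklet.
	if label == "" || strings.Contains(label, ".") || !exists(label) {
		return http.StatusNotFound
	}
	return http.StatusOK
}
```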

Layer 3: HTTP reverse proxy

The API server's RootHandler() inspects the Host header on every request:

  • api.chicklet.io → normal API routing (chi router)
  • *.chicklet.io → ProxyChickletRequest()
  • Everything else → normal API routing

ProxyChickletRequest does the following:

  1. Extracts the DNS label from the host (e.g., myapp-myorg from myapp-myorg.chicklet.io)
  2. Looks up the chicklet by DNS label in the DB (GetChickletByDNSLabel) — returns the chicklet ID, name, namespace, and org ID
  3. Checks the URL auth mode:
    • public: no auth required
    • chicklet: requires Authorization: Bearer <api-key> header, validated against the chicklet owner
  4. Fetches the K8s Chicklet resource from the correct namespace to get status.podIP and status.nodePorts
  5. Picks the first exposed container port from the NodePorts map
  6. Creates an httputil.SingleHostReverseProxy targeting http://<podIP>:<port>
  7. Forwards the request
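Steps 6-7 use the standard library's reverse proxy. A minimal sketch, with assumed helper names wrapping httputil.NewSingleHostReverseProxy:

```go
package main

import (
	"fmt"
	"net/http/httputil"
	"net/url"
)

// podTarget builds the upstream URL from the pod IP and the first exposed
// container port (step 5-6). Helper names are illustrative.
func podTarget(podIP string, port int) (*url.URL, error) {
	return url.Parse(fmt.Sprintf("http://%s:%d", podIP, port))
}

// newChickletProxy wraps the target in the stdlib single-host reverse proxy;
// its ServeHTTP forwards each incoming request to the pod.
func newChickletProxy(podIP string, port int) (*httputil.ReverseProxy, error) {
	target, err := podTarget(podIP, port)
	if err != nil {
		return nil, err
	}
	return httputil.NewSingleHostReverseProxy(target), nil
}
```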

Error responses

| Condition | HTTP status | Message |
|-----------|-------------|---------|
| DNS label doesn't match any chicklet | 404 | not found |
| Auth mode is chicklet and no Bearer token | 403 | requires authentication |
| Invalid API key | 401 | unauthorized |
| Pod not found in K8s | 502 | chicklet not available |
| Chicklet is stopped | 503 | chicklet is not running |
| No ports exposed | 502 | no ports exposed |
| Pod unreachable | 502 | bad gateway |

Network Isolation

A Kubernetes NetworkPolicy (deploy/networkpolicy.yaml) enforces isolation between chicklets.

Policy rules

Ingress (traffic into chicklets):

| Source | Ports | Purpose |
|--------|-------|---------|
| Any (0.0.0.0/0) | TCP 22 | SSH from proxy (runs on host network) |
| Any (0.0.0.0/0) | TCP 1-65535 | NodePort service traffic and API reverse proxy |

Egress (traffic out of chicklets):

| Destination | Ports | Purpose |
|-------------|-------|---------|
| Any | UDP/TCP 53 | DNS resolution |
| Any | TCP 80, 443 | HTTP/HTTPS (apt, pip, npm, curl, git, etc.) |
| Any except 10.42.0.0/16 | All | Allow outbound but block pod-to-pod |

The except: 10.42.0.0/16 rule is the key isolation mechanism. It allows chicklets to reach the internet but prevents them from reaching other chicklet pods. This means:

  • Chicklet A cannot connect to Chicklet B on any port
  • Chicklet A can reach external services (GitHub, PyPI, npm, etc.)
  • The SSH proxy can reach any chicklet (it runs on the host, outside 10.42.0.0/16)
  • The API server can reach any chicklet (same reason — reverse proxy uses pod IP from the host)
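The effect of the except rule can be expressed as a simple predicate over destination IPs. A sketch (not the NetworkPolicy itself, which kube-proxy and the CNI enforce at the packet level):

```go
package main

import "net"

// egressAllowed mirrors the policy's "allow all except the pod CIDR" egress
// rule: chicklets may reach the internet but not each other's pod IPs.
func egressAllowed(dst string) bool {
	_, podCIDR, _ := net.ParseCIDR("10.42.0.0/16")
	ip := net.ParseIP(dst)
	if ip == nil {
		return false
	}
	return !podCIDR.Contains(ip)
}
```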

DNS Label Format

Chicklet URLs use the format <chicklet-name>-<org-slug>.<domain>. The full DNS label (the part before the first dot) must be at most 63 characters per the DNS specification.

The API enforces this at creation time:

dnsLabel := req.Name + "-" + orgSlug
if len(dnsLabel) > 63 {
    // rejected
}

The DB lookup for reverse proxying joins chicklet name and org slug:

SELECT c.id, c.name, c.namespace, c.org_id FROM chicklets c
JOIN orgs o ON c.org_id = o.id
WHERE c.name || '-' || o.slug = ?

Since both chicklet names and org slugs can contain hyphens, the label is not parsed — it's looked up as a whole string against the concatenation in the DB.
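A small sketch makes the ambiguity concrete: every interior hyphen in a label is a possible name/slug boundary, so a parser cannot recover the original pair. The helper name is hypothetical:

```go
package main

// splitCandidates lists every possible (name, slug) split of a DNS label,
// showing why a label like "my-app-team" is ambiguous: it could be
// ("my", "app-team") or ("my-app", "team"). Illustrative only.
func splitCandidates(label string) [][2]string {
	var out [][2]string
	for i, r := range label {
		if r == '-' && i > 0 && i < len(label)-1 {
			out = append(out, [2]string{label[:i], label[i+1:]})
		}
	}
	return out
}
```

This is why the proxy matches the whole label against the concatenation stored in the DB instead of splitting it.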


K8s Resources Per Chicklet

Each running chicklet creates these K8s resources in its namespace (user-<id> or org-<slug>):

| Resource | Name | Namespace | Purpose |
|----------|------|-----------|---------|
| Namespace | user-<id> or org-<slug> | — | Created on user registration / org creation |
| Pod | cl-<name> | chicklet's ns | The Kata VM running the chicklet workspace |
| Secret | cl-<name>-ssh | chicklet's ns | SSH authorized_keys (user keys + proxy key) |
| PersistentVolume | cl-<name>-pv | cluster-wide | Host path at /var/lib/chicklets/<ns>/<name> |
| PersistentVolumeClaim | cl-<name>-pvc | chicklet's ns | Binds to the PV |
| Service | cl-<name>-svc | chicklet's ns | NodePort service (only if ports are exposed) |

The RuntimeClass kata-clh adds 160 Mi memory and 250m CPU overhead per pod to account for the VM itself.


Subnet Map

| Subnet | Used by | Notes |
|--------|---------|-------|
| 10.42.0.0/16 | K3s/Flannel pod network | Default K3s pod CIDR. Chicklet pods get IPs here. |
| 10.43.0.0/16 | K3s service network | Default K3s service CIDR. NodePort services use ClusterIPs here. |
| 172.17.0.0/16 | Docker default bridge | Avoid this — Docker installs it by default. Not used by Nightshift. |

Listening Ports (Host)

| Port | Process | Protocol | Purpose |
|------|---------|----------|---------|
| 80 | Caddy | HTTP | ACME HTTP-01 challenge, redirect to HTTPS |
| 443 | Caddy | HTTPS | TLS termination for API and chicklet URLs |
| 2222 | chicklet-proxy | SSH | SSH multiplexer for all chicklets |
| 8080 | chicklet-api | HTTP | API server (Caddy upstream, not exposed directly) |
| 8081 | chicklet-api | HTTP | Prometheus metrics (controller-runtime) |
| 30000-32767 | kube-proxy | TCP | NodePort range for exposed chicklet ports |