anthropics
diff --git a/‎claude_agent_sdk/hosting/kubernetes/.gitignore‎
Lines changed: 2 additions & 0 deletions b/‎claude_agent_sdk/hosting/kubernetes/.gitignore‎
Lines changed: 2 additions & 0 deletions
diff --git a/‎claude_agent_sdk/hosting/kubernetes/README.md‎
Lines changed: 269 additions & 0 deletions b/‎claude_agent_sdk/hosting/kubernetes/README.md‎
Lines changed: 269 additions & 0 deletions
diff --git a/‎claude_agent_sdk/hosting/kubernetes/egress-proxy/Dockerfile‎
Lines changed: 9 additions & 0 deletions b/‎claude_agent_sdk/hosting/kubernetes/egress-proxy/Dockerfile‎
Lines changed: 9 additions & 0 deletions
diff --git a/‎claude_agent_sdk/hosting/kubernetes/egress-proxy/nginx.conf‎
Lines changed: 106 additions & 0 deletions b/‎claude_agent_sdk/hosting/kubernetes/egress-proxy/nginx.conf‎
Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,2 @@
+# generated by generate-certs.sh — local-only, never commit
+certs/
@@ -0,0 +1,269 @@
+# Tier 3 — Kubernetes (pod-per-session)
+
+> Part of the [Agent SDK hosting cookbook](../../07_Hosting_the_agent.ipynb).
+> If you haven't picked a hosting tier yet, start there — it covers when a
+> managed option is the better fit and when you actually need this.
+
+Run the agent on a Kubernetes cluster where every session gets its own
+isolated pod, with network-level controls ensuring agent pods can only reach
+the Anthropic API.
+
+```
+                     ┌──────────────────────────────────────────────────┐
+                     │                  Kubernetes                      │
+                     │                                                  │
+  curl / SDK ──────► Gateway (FastAPI)                                  │
+                     │  ├─ creates/deletes agent pods via K8s API       │
+                     │  ├─ routes /sessions/{id}/messages to right pod  │
+                     │  └─ session → pod mapping stored in Redis        │
+                     │                                                  │
+              ┌──────┴──────┐                                           │
+              │             │                                           │
+          Agent Pod    Agent Pod ──► Egress Proxy ──► api.anthropic.com │
+          (session A)  (session B)      ▲                               │
+              │             │            │                               │
+              │     NetworkPolicy: pods can ONLY reach egress-proxy     │
+              │                                                         │
+            Redis (session → pod-IP mapping)                            │
+                     │                                                  │
+                     └──────────────────────────────────────────────────┘
+```
+
+The agent image is the **same one** Tier 1 builds from
+[`hosting/Dockerfile`](../Dockerfile). Same image, different machinery: instead
+of a single container or a Modal sandbox, the gateway gives each session its
+own pod and the cluster enforces what that pod can reach.
+
+> **Before you self-host:** if you just want a hosted agent without running
+> infrastructure, use Anthropic's managed option — see the
+> [Hosting overview](../README.md). This guide is for teams that need the
+> agent on their own Kubernetes cluster (regulated environments, existing
+> platform, custom networking).
+
+
+## Why each piece exists
+
+**Gateway** — Each user session gets its own agent pod. Something has to create
+those pods on demand, route traffic to the right one, and clean them up when
+sessions go idle. That's the gateway. It talks to the Kubernetes API to manage
+pod lifecycles and uses Redis to remember which session maps to which pod IP.
+
+**Egress proxy + NetworkPolicy** — Agents run arbitrary code. This pair ensures
+agent pods can reach `api.anthropic.com` and *nothing else*. The NetworkPolicy
+blocks all outbound traffic except to the egress proxy (port 443) and DNS
+(port 53). The egress proxy terminates TLS from the agent, then re-encrypts the
+request to Anthropic's API. Any attempt to reach the internet, other services,
+or other namespaces is dropped at the network level.
+
+**Redis** — The gateway needs to remember which pod is handling which session.
+When a request arrives, it looks up the session ID in Redis to find the pod IP
+and routes traffic there. Redis persists to disk so mappings survive gateway
+restarts.
+
+**Standby pool** — Pods take 10–30 seconds to start (image pull + container
+boot). The gateway pre-warms a configurable number of standby pods so new
+sessions can claim one instantly instead of waiting. After a pod is claimed,
+the pool replenishes in the background.
+
+## Prerequisites
+
+| Tool | What it's for |
+|------|---------------|
+| [kind](https://kind.sigs.k8s.io/) | Local Kubernetes cluster in Docker |
+| [kubectl](https://kubernetes.io/docs/tasks/tools/) | Applying manifests, inspecting the cluster |
+| [docker](https://docs.docker.com/get-docker/) | Building container images |
+| `openssl` | Generating the egress proxy's TLS certificate |
+| `ANTHROPIC_API_KEY` | Set as env var |
+
+## Quickstart (local, with kind)
+
+```bash
+cd hosting/kubernetes
+export ANTHROPIC_API_KEY=sk-ant-...
+./kind-quickstart.sh
+```
+
+This builds the three images, loads them into a local `kind` cluster, applies
+every manifest, and port-forwards the gateway to `localhost:8080`.
+
+## Talk to it
+
+Same path and shape as Tier 1/2 — only the base URL changes:
+
+```bash
+curl -N -X POST http://localhost:8080/sessions/demo/messages \
+  -H 'Content-Type: application/json' \
+  -d '{"prompt": "What tools do you have?"}'
+```
+
+The first request on a new `session_id` claims a standby pod (or spawns one if
+the pool is empty). Subsequent requests with the same `session_id` route to the
+same pod, so the agent sees a continuous conversation.
+
+Watch the machinery work:
+
+```bash
+kubectl -n claude-agent get pods -w
+# you'll see agent-standby-* pods appear, then one flip to active when you curl
+```
+
+To end a session, go through the gateway so the Redis mapping is cleaned up:
+
+```bash
+curl -X DELETE http://localhost:8080/sessions/demo
+```
+
+(`kubectl delete pod` works too, but leaves a stale `session → pod-IP` entry
+in Redis until the next request on that session 502s.)
+
+## Verify the egress lockdown
+
+The agent runs code the model decides to run. The egress proxy + NetworkPolicy
+mean a prompt-injected agent still can't reach arbitrary hosts. Prove it:
+
+> `kind-quickstart.sh` installs Calico because kind's default CNI (kindnet)
+> doesn't enforce NetworkPolicy. On GKE/EKS/AKS or any Calico/Cilium cluster,
+> enforcement is on by default and this section works unchanged.
+
+```bash
+AGENT_POD=$(kubectl -n claude-agent get pods -l role=agent \
+  -o jsonpath='{.items[0].metadata.name}')
+
+# This should FAIL — Calico drops the route to anything except egress-proxy.
+# (The agent image is slim and has no curl, so we use Python's socket.)
+kubectl -n claude-agent exec "$AGENT_POD" -- python3 -c \
+  "import socket; socket.setdefaulttimeout(5); socket.create_connection(('example.com',443)); print('REACHED — policy NOT enforcing')"
+```
+
+Expected: `OSError: [Errno 101] Network is unreachable` (or a timeout) and a
+non-zero exit. The positive control — that the egress-proxy path *is* open —
+was already proven by the curl above returning model output.
+
+## Standby pool
+
+`STANDBY_POOL_SIZE` (in the `agent-config` ConfigMap) controls how many warm
+pods the gateway keeps ready. Check current state:
+
+```bash
+curl http://localhost:8080/api/pool
+```
+
+## Persistence
+
+`server.py` persists transcripts (and its caller-ID → SDK-ID map) to
+`CLAUDE_CONFIG_DIR=/data`. In this tier that's the pod's ephemeral filesystem,
+so:
+
+- **While the pod is alive** (within the idle-timeout window), follow-up
+  messages resume the conversation exactly as in Tiers 1 and 2.
+- **After the pod is reaped**, `/data` is gone. The next message on that
+  `session_id` gets a fresh pod with no history.
+
+For a cookbook demo this is fine — sessions outlive the curl, not the cluster.
+For production you need durable storage that survives pod recycle. Two options:
+
+1. **Mount a PersistentVolumeClaim** at `/data` instead of the pod's local
+   disk, and have the gateway reattach the same PVC when a session returns.
+   Works with `server.py` as-is, but couples each session to a volume in one
+   zone.
+2. **Mirror `/data` to external storage** with the SDK's
+   [`SessionStore`](https://code.claude.com/docs/en/agent-sdk/session-storage):
+   the local-disk write still happens first; the store is a mirror, and
+   `mirror_error` is non-fatal. This is the approach the notebook's
+   *Making it production-ready* section describes — it needs a small hook in
+   `server.py` that the cookbook hasn't grown yet.
+
+## Deploying to your own cluster
+
+`kind` proves the topology; the manifests are cloud-agnostic. To run on EKS,
+AKS, GKE, OpenShift, or bare metal, swap the image registry and the front door:
+
+```bash
+REG=your.registry.example.com/claude-agent     # ECR, ACR, GHCR, Artifact Registry, ...
+
+# 1. Build and push the three images
+docker build -t $REG/agent:latest -f ../Dockerfile ..
+docker build -t $REG/gateway:latest ./gateway
+docker build -t $REG/egress-proxy:latest ./egress-proxy
+docker push $REG/agent:latest $REG/gateway:latest $REG/egress-proxy:latest
+
+# 2. TLS certs for the egress proxy
+./generate-certs.sh
+
+# 3. Namespace + secrets + config
+kubectl apply -f manifests/namespace.yaml
+kubectl -n claude-agent create secret generic anthropic-api-key \
+    --from-literal=ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY"
+kubectl -n claude-agent create secret generic egress-proxy-tls \
+    --from-file=ca.crt=certs/ca.crt \
+    --from-file=proxy.crt=certs/proxy.crt \
+    --from-file=proxy.key=certs/proxy.key
+kubectl -n claude-agent create configmap agent-config \
+    --from-literal=AGENT_IMAGE=$REG/agent:latest \
+    --from-literal=STANDBY_POOL_SIZE=2
+
+# 4. Apply manifests with your registry substituted
+for f in manifests/*.yaml; do
+  sed "s|REGISTRY_URL|$REG|g" "$f" | kubectl apply -f -
+done
+```
+
+Then expose the `gateway` Service through whatever your cluster uses for
+ingress — a cloud LoadBalancer, an Ingress controller, or a service mesh
+gateway. Three things vary by environment:
+
+- **Registry auth** — your nodes need pull credentials for `$REG`
+  (`imagePullSecrets`, IRSA/Workload Identity, or a public registry).
+- **NetworkPolicy enforcement** — the egress lockdown only works if your CNI
+  enforces `NetworkPolicy` (Cilium, Calico, GKE Dataplane V2, EKS with the
+  VPC CNI policy add-on). On a CNI that ignores it, agent pods can reach the
+  internet.
+- **TLS + auth in front of the gateway** — `GATEWAY_AUTH_TOKEN` is a
+  placeholder. Put your IdP / API gateway in front before exposing this
+  publicly.
+
+## What this doesn't give you
+
+- Real authentication or multi-tenancy (the `authenticate()` stub returns one
+  hard-coded tenant)
+- Durable session storage (see [Persistence](#persistence))
+- Gateway autoscaling or multi-region routing
+- Observability beyond what
+  [`OTEL_EXPORTER_OTLP_ENDPOINT`](../README.md#observability) gives you for free
+
+## Teardown
+
+```bash
+./teardown.sh        # kind delete cluster + remove certs/
+```
+
+## Layout
+
+```
+kubernetes/
+├── README.md
+├── kind-quickstart.sh         # local end-to-end on kind
+├── teardown.sh
+├── generate-certs.sh          # self-signed CA + proxy cert for egress-proxy
+├── gateway/
+│   ├── main.py                # FastAPI: route + reap
+│   ├── k8s.py                 # pod lifecycle + standby pool
+│   ├── proxy.py               # SSE relay
+│   ├── requirements.txt
+│   └── Dockerfile
+├── egress-proxy/
+│   ├── nginx.conf
+│   └── Dockerfile
+└── manifests/
+    ├── namespace.yaml
+    ├── redis.yaml
+    ├── egress-proxy.yaml
+    ├── gateway.yaml           # SA + RBAC + Deployment + Service
+    └── network-policy.yaml
+```
+
+---
+
+The pod lifecycle management (`k8s.py`), egress proxy, and network policy are
+adapted from Anthropic's internal `create-claude-agent` harness by Joe Shamon
+and Ben Lehrburger.
@@ -0,0 +1,9 @@
+# Egress proxy: TLS-terminating nginx that ONLY forwards to api.anthropic.com.
+# Plain http{} reverse-proxy — no stream module needed.
+FROM nginx:alpine
+
+COPY nginx.conf /etc/nginx/nginx.conf
+
+EXPOSE 443 80
+
+CMD ["nginx", "-g", "daemon off;"]
@@ -0,0 +1,106 @@
+# =============================================================================
+# Egress Proxy — nginx configuration
+# =============================================================================
+#
+# WHY THIS EXISTS:
+#   Agent pods can execute arbitrary code (that's the whole point of a coding
+#   agent). Without network controls, a compromised or misbehaving agent could
+#   reach any host on the internet — exfiltrating data, attacking internal
+#   services, or abusing external APIs.
+#
+#   This proxy, combined with a Kubernetes NetworkPolicy, ensures that agent
+#   pods can ONLY reach api.anthropic.com and nothing else:
+#
+#     1. The K8s NetworkPolicy blocks all egress from agent pods except to
+#        this proxy (and DNS).
+#     2. This proxy only forwards requests to api.anthropic.com.
+#
+#   Together, they form a strict allowlist for agent network access.
+#
+# HOW TLS WORKS HERE:
+#   The proxy terminates TLS using a self-signed certificate (generated by
+#   generate-certs.sh). Agent pods trust this certificate via the
+#   NODE_EXTRA_CA_CERTS environment variable, which points to the CA cert
+#   that signed the proxy's certificate. The proxy then makes its own TLS
+#   connection to the real api.anthropic.com upstream.
+#
+#   Traffic flow:
+#     Agent pod --[TLS with self-signed cert]--> egress-proxy --[TLS]--> api.anthropic.com
+#
+# EXTENDING TO MORE ENDPOINTS:
+#   This demo only proxies the Claude API via ANTHROPIC_BASE_URL. For full
+#   telemetry and error reporting (statsig.anthropic.com, sentry.io, claude.ai),
+#   you would use HTTPS_PROXY instead. See:
+#   https://code.claude.com/docs/en/corporate-proxy
+# =============================================================================
+
+user nginx;
+worker_processes auto;
+error_log /var/log/nginx/error.log info;
+pid /run/nginx.pid;
+
+events {
+    worker_connections 1024;
+}
+
+http {
+    access_log /var/log/nginx/access.log;
+
+    # DNS resolver (IPv4 only to avoid Docker network issues)
+    resolver 8.8.8.8 ipv6=off valid=30s;
+
+    # Anthropic API upstream
+    upstream anthropic_api {
+        server api.anthropic.com:443;
+        keepalive 32;
+    }
+
+    # HTTPS server - terminates TLS from agent, proxies to Anthropic API
+    server {
+        listen 443 ssl;
+        server_name egress-proxy;
+
+        # Our self-signed certificate (signed by demo CA)
+        ssl_certificate /etc/nginx/certs/proxy.crt;
+        ssl_certificate_key /etc/nginx/certs/proxy.key;
+        ssl_protocols TLSv1.2 TLSv1.3;
+        ssl_ciphers HIGH:!aNULL:!MD5;
+
+        # Proxy all requests to Anthropic API
+        location / {
+            proxy_pass https://anthropic_api;
+            proxy_ssl_server_name on;
+            proxy_ssl_name api.anthropic.com;
+
+            # Set correct Host header for Cloudflare
+            proxy_set_header Host api.anthropic.com;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_set_header X-Forwarded-Proto $scheme;
+
+            # Timeouts for long API calls
+            proxy_connect_timeout 60s;
+            proxy_send_timeout 300s;
+            proxy_read_timeout 300s;
+
+            # For streaming responses
+            proxy_buffering off;
+            proxy_http_version 1.1;
+            proxy_set_header Connection "";
+        }
+    }
+
+    # HTTP health check
+    server {
+        listen 80;
+
+        location /health {
+            return 200 'OK';
+            add_header Content-Type text/plain;
+        }
+
+        location / {
+            return 403 'Use HTTPS';
+        }
+    }
+}
Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,2 @@`
	`1`	`+# generated by generate-certs.sh — local-only, never commit`
	`2`	`+certs/`