feat: migrate DefenseClaw POC from Docker to Fly.io #2066
Provisions an OpenClaw agent inside Docker, pre-configured with a Gram tenant's MCP servers. LLM calls route through Gram's completions endpoint, and network egress is locked down via iptables to allowed hosts only. Verified end-to-end: MCP tool calls, chat completions, network allow/block.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace manual iptables-based network policy with OpenShell sandbox, providing kernel-level isolation via Linux namespaces, Landlock LSM, seccomp-BPF, and OPA/Rego network policy enforcement.
- Add openshell-sandbox binary install from NVIDIA OCI image
- Add OPA Rego policy from DefenseClaw for per-CONNECT evaluation
- Route all sandbox HTTP(S) through OpenShell's CONNECT proxy
- Rewrite entrypoint as start.py with proper sandbox lifecycle
- Add start-openclaw.sh that runs inside the sandbox namespace
- Container now requires --privileged for namespace creation
The previous tests used a non-existent web_fetch tool, causing the LLM to hallucinate responses and bypass the proxy entirely. Now tests use curl via nsenter into the gateway's network namespace, where traffic goes through the OpenShell CONNECT proxy with OPA policy enforcement.
- Fix HOME not set in start-openclaw.sh (openshell-sandbox doesn't set it)
- Fix .env.local values not overriding mise env vars in config.py
- Add sandbox_curl() that runs curl inside sandbox network namespace
- asdf.com now properly blocked by OPA policy (CONNECT denied)
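As an illustration of the nsenter approach described in this commit, here is a minimal sketch of what a helper like sandbox_curl() could look like. It is not the PR's actual implementation: build_sandbox_curl, the gateway_pid parameter, and the injectable run argument are hypothetical names, and the sketch assumes the caller runs as root and that the sandbox namespace already routes traffic through the CONNECT proxy.

```python
import subprocess
from typing import Callable, List, Tuple


def build_sandbox_curl(gateway_pid: int, url: str, timeout_s: int = 10) -> List[str]:
    """Build an nsenter + curl command line (hypothetical helper).

    nsenter --target PID --net enters only the network namespace of the
    given process, so curl sees the sandbox's routes instead of the host's.
    """
    return [
        "nsenter", "--target", str(gateway_pid), "--net",
        "curl", "--silent", "--show-error",
        "--max-time", str(timeout_s),
        "--output", "/dev/null",
        "--write-out", "%{http_code}",  # print only the HTTP status code
        url,
    ]


def sandbox_curl(gateway_pid: int, url: str,
                 run: Callable = subprocess.run) -> Tuple[int, str]:
    """Run curl inside the gateway's network namespace.

    Returns (curl exit code, HTTP status as text). A denied CONNECT is
    expected to surface as a non-zero exit code or a non-2xx status.
    """
    result = run(build_sandbox_curl(gateway_pid, url),
                 capture_output=True, text=True)
    return result.returncode, result.stdout.strip()
```

Passing run as a parameter keeps the command construction testable without root or a live sandbox.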
…iency
- Remove unused OPENSHELL_NETWORK_POLICY and GUARDRAIL dicts from config.py (start.py generates the policy YAML directly, these were never consumed)
- Fix README and config.py referencing deleted start.sh (now start.py)
- Add warning when gateway health check times out in start.py
- Remove unconditional sleep before sandbox PID lookup
- Check-then-sleep instead of sleep-then-check in poc.py gateway wait loop
- Reuse DefenseClaw clone from dc-builder stage instead of cloning twice
- Remove git from runtime image (no longer needed)
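The check-then-sleep change mentioned in this commit can be sketched as below. This is only an illustration under assumptions: wait_until and its parameters are hypothetical names, not the loop in poc.py. The point is that a gateway that is already healthy returns immediately, and the loop never sleeps after its final check.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool],
               timeout_s: float = 60.0,
               interval_s: float = 1.0,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll check() until it returns True or the timeout elapses.

    The check runs before any sleep, so the fast path costs no delay.
    Returns True on success and False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A caller could then log a warning when the function returns False, matching the health-check timeout warning added in the same commit.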
Policy files (/etc/defenseclaw/) are now root:root 644, matching DefenseClaw's production architecture. The sandbox user can read them but cannot modify them to weaken its own network restrictions. The openshell-sandbox binary loads policy at startup on the host side of the namespace boundary — even if the file were somehow modified, the running proxy would not reload it.
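One way to sanity-check the ownership and mode described above is a small stat-based audit. This is a sketch only: policy_file_problems is a hypothetical helper, not part of the PR, and the defaults (uid 0, gid 0, mode 644) simply restate the root:root 644 requirement from the commit message.

```python
import os
import stat
from typing import List


def policy_file_problems(path: str,
                         expected_uid: int = 0,
                         expected_gid: int = 0,
                         expected_mode: int = 0o644) -> List[str]:
    """Return a list of human-readable problems; empty means the file is OK."""
    st = os.stat(path)
    problems = []
    mode = stat.S_IMODE(st.st_mode)  # permission bits only, without file type
    if mode != expected_mode:
        problems.append(f"mode is {mode:o}, expected {expected_mode:o}")
    if st.st_uid != expected_uid:
        problems.append(f"owner uid is {st.st_uid}, expected {expected_uid}")
    if st.st_gid != expected_gid:
        problems.append(f"group gid is {st.st_gid}, expected {expected_gid}")
    return problems
```

Running it over each file under /etc/defenseclaw/ at startup would flag a policy file that the sandbox user could write to.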
Replace Docker container runtime with Fly.io Firecracker microVMs. OpenShell runs natively on a real VM: no --privileged flag, no nested namespace hacks, no Docker-specific iptables bridging.
- Add fly.toml for Fly app configuration
- Simplify start.py: remove unshare --mount, use VM's DNS directly
- Rewrite poc.py: use fly CLI (deploy, ssh, logs) instead of docker CLI
- Update README with Fly architecture and setup instructions
OpenShell's OPA policy checks the full ancestor chain of every process making network requests. Commands run via fly ssh have /.fly/hallpass in their ancestry, which fails the integrity check. Fix: run network tests from start-openclaw.sh (inside the sandbox), where the process tree is bash -> python3 -> curl. poc.py reads the test results from Fly logs instead of running curl remotely. Also add grant-scopes.py and test-network-policy.py as baked-in scripts to avoid shell quoting issues with fly ssh -C.
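Reading test results back from logs, as described above, might look like the following sketch. The log line format is an assumption made up for illustration: it supposes the in-sandbox test script prints lines such as "NETTEST <host> ALLOWED" or "NETTEST <host> BLOCKED". The real scripts may emit something different, and parse_network_results is a hypothetical name.

```python
import re
from typing import Dict, Iterable

# Hypothetical marker format; matched anywhere in the line so that any
# timestamp or instance prefix added by the log stream is ignored.
RESULT_RE = re.compile(r"NETTEST\s+(?P<host>\S+)\s+(?P<verdict>ALLOWED|BLOCKED)")


def parse_network_results(log_lines: Iterable[str]) -> Dict[str, bool]:
    """Map each tested host to True (allowed) or False (blocked).

    Lines without the marker are skipped; a later result for the same
    host overwrites an earlier one.
    """
    results: Dict[str, bool] = {}
    for line in log_lines:
        match = RESULT_RE.search(line)
        if match:
            results[match.group("host")] = match.group("verdict") == "ALLOWED"
    return results
```

Because the tests run inside the sandbox and only their output crosses the boundary, the process ancestry seen by the policy stays bash -> python3 -> curl.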
- Merge fly_ssh() and fly_ssh_ok() into one function with check param
- Move inline import json to top-level
- Extract FLY_ORG constant (was hardcoded speakeasy-lab)
- Fix config.py docstring referencing Docker container
- Add grant-scopes.py and test-network-policy.py to README files table
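The merged helper with a check parameter could be shaped roughly like this. It is a sketch, not the code in poc.py: the FLY_APP constant (taken from the app name in the test plan), the run parameter, and the choice of RuntimeError are assumptions. The fly ssh console -a and -C flags are the standard app and command options.

```python
import subprocess
from typing import Callable

FLY_APP = "defenseclaw-gram-poc"  # app name as it appears in the test plan


def fly_ssh(command: str,
            check: bool = True,
            app: str = FLY_APP,
            run: Callable = subprocess.run) -> subprocess.CompletedProcess:
    """Run a command on the Fly machine over SSH.

    With check=True a non-zero exit raises RuntimeError; with check=False
    the caller inspects returncode itself (the old fly_ssh_ok() use case).
    """
    result = run(["fly", "ssh", "console", "-a", app, "-C", command],
                 capture_output=True, text=True)
    if check and result.returncode != 0:
        raise RuntimeError(
            f"fly ssh failed ({result.returncode}): {result.stderr.strip()}"
        )
    return result
```

Usage would be fly_ssh("some command") for steps that must succeed, and fly_ssh("some command", check=False).returncode == 0 for probes.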
Summary
- No --privileged, no nested namespace hacks
- fly secrets set instead of command-line env vars
What changed
- docker run --privileged -> fly deploy (Firecracker VM)
- docker build/run/exec/logs -> fly deploy/ssh/logs
- unshare --mount, mount --bind /etc/resolv.conf -> removed; the VM's DNS is used directly
- -e GRAM_API_KEY=... on CLI -> fly secrets set (encrypted)
Test plan
- fly auth login succeeds
- uv run python poc.py deploys to Fly, starts OpenShell sandbox, passes all 3 tests
- fly apps destroy defenseclaw-gram-poc --yes tears down cleanly