docs: rewrite README around intent-based JIT policy and open integration surface
Position against Conseca citation landscape: human approves intent (not code),
LLM drafts scoped-fetch specs from trusted context only, deterministic enforcement.
Move TEE deploy instructions to dstack/DEPLOY.md.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
# OAuth3 Enclave

An API gateway where every integration is defined just-in-time. There's no fixed set of tools — an agent states its **intent** (a goal, API doc URLs, and which secrets it needs), a trusted LLM inside the enclave drafts a minimal scoped-fetch spec from only that trusted context, a human approves the intent, and the spec compiles into a deterministic sandbox. Credentials never leave the TEE. The agent never sees them.

This is the [Conseca](https://arxiv.org/abs/2501.17070) architecture (Google, HotOS 2025) made concrete: separate trusted policy generation from untrusted execution. The LLM drafts policy using **only trusted inputs** (the intent spec + authoritative API docs). The policy compiles to a locked-down fetch function — specific URL globs, HTTP methods, body fields, rate limits. No LLM in the enforcement path. No predefined tool registry. No server-side changes.
```
│         │──── orchestration code ────────►│ runs in SES sandbox          │
│         │                                 │ github('GET', '/repos/...')  │
│         │◄──── result ──────────────────── │ credentials injected by TEE  │
└─────────┘                                 └──────────────────────────────┘

 untrusted ▲                                 trusted ▲
 prompt injection can't                      attested, deterministic
 change the approved spec                    no LLM at enforcement time
```

### Intent, not code review

The human approves a **goal** — "create issues on owner/repo using the GitHub API" — not a page of code. The trusted LLM inside the enclave translates that into a scoped-fetch spec: which base URL, which path globs, which HTTP methods, which body fields, which secrets to inject. That spec is the policy. It compiles to a sandbox function that can't do anything outside its scope.

For the GitHub example, the spec is locked to `POST /repos/owner/repo/issues`, with `Authorization: Bearer {GITHUB_TOKEN}` injected, the body restricted to `title` and `body` fields, and a rate limit applied. That's what runs. Nothing else exists in the sandbox.
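As an illustration, the approved spec for this GitHub-issues goal might look like the following. Every field name here (`baseUrl`, `pathGlobs`, `bodyFields`, and so on) is an assumption for the sketch, not the project's actual schema:

```typescript
// Hypothetical scoped-fetch spec for "create issues on owner/repo".
// Field names are illustrative; the real schema may differ.
const issueSpec = {
  baseUrl: "https://api.github.com",
  pathGlobs: ["/repos/owner/repo/issues"],              // only this path
  methods: ["POST"],                                    // only this verb
  headers: { Authorization: "Bearer {GITHUB_TOKEN}" },  // secret injected inside the TEE
  bodyFields: ["title", "body"],                        // request body restricted
  rateLimit: { requests: 10, windowSeconds: 60 },       // throttle per window
};
```

Because the spec is plain data, it can be diffed, logged, and attested; nothing in it depends on an LLM at enforcement time.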
### Why this matters

Every other agent security framework assumes the set of integrations is **known in advance**. [Progent](https://arxiv.org/abs/2504.11703), [SEAgent](https://arxiv.org/abs/2601.11893), [MiniScope](https://arxiv.org/abs/2512.11147), [AgentArmor](https://arxiv.org/abs/2508.01249) — they all gate access to predefined tools. Their policy languages (regex, DSL, Cedar) reference specific tool names and endpoints. This works for closed systems, but agents operating in the real world need to hit APIs nobody anticipated at design time.

OAuth3 is open-ended: the agent proposes an intent, the human approves a goal, the enclave enforces a spec. The set of possible integrations is unbounded — limited only by what a human is willing to approve.

This sidesteps the main critiques the field levels at contextual policy generation:

- **"Regex policies can't handle complex attacks"** ([ControlValve](https://arxiv.org/abs/2510.17276)) — the spec compiles to URL globs + method + body schema, enforced deterministically
- **"LLM-generated policies are unreliable"** ([MiniScope](https://arxiv.org/abs/2512.11147), [CSAgent](https://arxiv.org/abs/2509.22256)) — the LLM only drafts from trusted context; a human approves; enforcement has no LLM
- **"Domain-specific rules can't cover open-domain tasks"** ([PSG-Agent](https://arxiv.org/abs/2509.23614)) — intents are generated per-task for any API

### Features

- **Open integration surface** — any HTTP API, no predefined tools. Agents propose intents; the enclave drafts specs.
- **Credential custody in hardware** — secrets live in a TEE (dstack CVM). Remote attestation proves what code runs.
- **No server changes** — to external services it looks like a normal user session.
- **Deterministic sandbox** — each approved spec compiles to a named function locked to specific URL globs, methods, body fields, and rate limits. No `fetch()`, no escape.
- **Account encumbrance** — the password can be rotated *inside* the TEE so even the user can't bypass policies without visibly destroying the encumbrance.
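A minimal sketch of how such an approved spec could compile into a deterministic guard. The names (`Spec`, `compileCapability`, `globToRegExp`) and the simple glob handling are assumptions for illustration, not the project's implementation:

```typescript
type Spec = {
  baseUrl: string;
  pathGlobs: string[]; // e.g. "/repos/*/issues"
  methods: string[];
  bodyFields: string[];
};

// Anchored RegExp from a simple glob: "*" matches one path segment.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, "[^/]+");
  return new RegExp(`^${escaped}$`);
}

// Compile a spec into a capability function that throws on anything
// outside its scope, before any credentials would be attached.
function compileCapability(spec: Spec) {
  const patterns = spec.pathGlobs.map(globToRegExp);
  return (method: string, path: string, body: Record<string, unknown>) => {
    if (!spec.methods.includes(method)) throw new Error(`method not allowed: ${method}`);
    if (!patterns.some((p) => p.test(path))) throw new Error(`path not allowed: ${path}`);
    for (const key of Object.keys(body)) {
      if (!spec.bodyFields.includes(key)) throw new Error(`body field not allowed: ${key}`);
    }
    // The real enclave would inject secrets and perform the fetch here;
    // this sketch just returns the vetted request.
    return { url: spec.baseUrl + path, method, body };
  };
}

const github = compileCapability({
  baseUrl: "https://api.github.com",
  pathGlobs: ["/repos/owner/repo/issues"],
  methods: ["POST"],
  bodyFields: ["title", "body"],
});
```

There is nothing for prompt injection to influence here: the guard is ordinary data-driven code, so a request either matches the approved scope or throws.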
## Quick start

```bash
cd proxy && npm install && npm run dev
```

Zero config for local dev — JWT secrets auto-generate, SQLite is embedded. Set `ANTHROPIC_API_KEY` in `.env` for LLM-drafted capabilities.

```bash
AGENT=$(curl -s -X POST localhost:3737/signup -H 'Content-Type: application/json' \
  -d '{"name":"my-agent"}' | jq -r .token)
OWNER=$(curl -s -X POST localhost:3737/signup -H 'Content-Type: application/json' \