docs: rewrite README around intent-based JIT policy and open integration surface
Position against Conseca citation landscape: human approves intent (not code),
LLM drafts scoped-fetch specs from trusted context only, deterministic enforcement.
Move TEE deploy instructions to dstack/DEPLOY.md.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
# OAuth3 Enclave

An API gateway where every integration is defined just-in-time. There's no fixed set of tools — an agent states its **intent** (a goal, API doc URLs, and which secrets it needs), a trusted LLM inside the enclave drafts a minimal scoped-fetch spec from only that trusted context, a human approves the intent, and the spec compiles into a deterministic sandbox. Credentials never leave the TEE. The agent never sees them.

This is the [Conseca](https://arxiv.org/abs/2501.17070) architecture (Google, HotOS 2025) made concrete: separate trusted policy generation from untrusted execution. The LLM drafts policy using **only trusted inputs** (the intent spec + authoritative API docs). The policy compiles to a locked-down fetch function — specific URL globs, HTTP methods, body fields, rate limits. No LLM in the enforcement path. No predefined tool registry. No server-side changes.
```
│         │──── orchestration code ────────►│ runs in SES sandbox          │
│         │                                 │ github('GET', '/repos/...')  │
│         │◄──── result ──────────────────── │ credentials injected by TEE  │
└─────────┘                                 └──────────────────────────────┘

 untrusted ▲                                 trusted ▲
 prompt injection can't                      attested, deterministic
 change the approved spec                    no LLM at enforcement time
```

### Intent, not code review

The human approves a **goal** — "create issues on owner/repo using the GitHub API" — not a page of code. The trusted LLM inside the enclave translates that into a scoped-fetch spec: which base URL, which path globs, which HTTP methods, which body fields, which secrets to inject. That spec is the policy. It compiles to a sandbox function that can't do anything outside its scope.

For the GitHub example, the spec is locked to `POST /repos/owner/repo/issues`, with `Authorization: Bearer {GITHUB_TOKEN}` injected, the body restricted to `title` and `body` fields, and a rate limit applied. That's what runs. Nothing else exists in the sandbox.
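As an illustration, the approved spec for this GitHub-issues goal might look like the following. Every field name here (`baseUrl`, `pathGlobs`, `bodyFields`, and so on) is an assumption for the sketch, not the project's actual schema:

```typescript
// Hypothetical scoped-fetch spec for "create issues on owner/repo".
// Field names are illustrative; the real schema may differ.
const issueSpec = {
  baseUrl: "https://api.github.com",
  pathGlobs: ["/repos/owner/repo/issues"],              // only this path
  methods: ["POST"],                                    // only this verb
  headers: { Authorization: "Bearer {GITHUB_TOKEN}" },  // secret injected inside the TEE
  bodyFields: ["title", "body"],                        // request body restricted
  rateLimit: { requests: 10, windowSeconds: 60 },       // throttle per window
};
```

Because the spec is plain data, it can be diffed, logged, and attested; nothing in it depends on an LLM at enforcement time.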
### Why this matters

Every other agent security framework assumes the set of integrations is **known in advance**. [Progent](https://arxiv.org/abs/2504.11703), [SEAgent](https://arxiv.org/abs/2601.11893), [MiniScope](https://arxiv.org/abs/2512.11147), [AgentArmor](https://arxiv.org/abs/2508.01249) — they all gate access to predefined tools. Their policy languages (regex, DSL, Cedar) reference specific tool names and endpoints. This works for closed systems, but agents operating in the real world need to hit APIs nobody anticipated at design time.

OAuth3 is open-ended: the agent proposes an intent, the human approves a goal, the enclave enforces a spec. The set of possible integrations is unbounded — limited only by what a human is willing to approve.

This sidesteps the main critiques the field levels at contextual policy generation:

- **"Regex policies can't handle complex attacks"** ([ControlValve](https://arxiv.org/abs/2510.17276)) — the spec compiles to URL globs + method + body schema, enforced deterministically
- **"LLM-generated policies are unreliable"** ([MiniScope](https://arxiv.org/abs/2512.11147), [CSAgent](https://arxiv.org/abs/2509.22256)) — the LLM only drafts from trusted context; a human approves; enforcement has no LLM
- **"Domain-specific rules can't cover open-domain tasks"** ([PSG-Agent](https://arxiv.org/abs/2509.23614)) — intents are generated per-task for any API

### Features

- **Open integration surface** — any HTTP API, no predefined tools. Agents propose intents; the enclave drafts specs.
- **Credential custody in hardware** — secrets live in a TEE (dstack CVM). Remote attestation proves what code runs.
- **No server changes** — to external services it looks like a normal user session.
- **Deterministic sandbox** — each approved spec compiles to a named function locked to specific URL globs, methods, body fields, and rate limits. No `fetch()`, no escape.
- **Account encumbrance** — the password can be rotated *inside* the TEE so even the user can't bypass policies without visibly destroying the encumbrance.
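A minimal sketch of how such an approved spec could compile into a deterministic guard. The names (`Spec`, `compileCapability`, `globToRegExp`) and the simple glob handling are assumptions for illustration, not the project's implementation:

```typescript
type Spec = {
  baseUrl: string;
  pathGlobs: string[]; // e.g. "/repos/*/issues"
  methods: string[];
  bodyFields: string[];
};

// Anchored RegExp from a simple glob: "*" matches one path segment.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, "[^/]+");
  return new RegExp(`^${escaped}$`);
}

// Compile a spec into a capability function that throws on anything
// outside its scope, before any credentials would be attached.
function compileCapability(spec: Spec) {
  const patterns = spec.pathGlobs.map(globToRegExp);
  return (method: string, path: string, body: Record<string, unknown>) => {
    if (!spec.methods.includes(method)) throw new Error(`method not allowed: ${method}`);
    if (!patterns.some((p) => p.test(path))) throw new Error(`path not allowed: ${path}`);
    for (const key of Object.keys(body)) {
      if (!spec.bodyFields.includes(key)) throw new Error(`body field not allowed: ${key}`);
    }
    // The real enclave would inject secrets and perform the fetch here;
    // this sketch just returns the vetted request.
    return { url: spec.baseUrl + path, method, body };
  };
}

const github = compileCapability({
  baseUrl: "https://api.github.com",
  pathGlobs: ["/repos/owner/repo/issues"],
  methods: ["POST"],
  bodyFields: ["title", "body"],
});
```

There is nothing for prompt injection to influence here: the guard is ordinary data-driven code, so a request either matches the approved scope or throws.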
## Quick start

```bash
cd proxy && npm install && npm run dev
```

Zero config for local dev — JWT secrets auto-generate, SQLite is embedded. Set `ANTHROPIC_API_KEY` in `.env` for LLM-drafted capabilities.

```bash
AGENT=$(curl -s -X POST localhost:3737/signup -H 'Content-Type: application/json' \
  -d '{"name":"my-agent"}' | jq -r .token)
OWNER=$(curl -s -X POST localhost:3737/signup -H 'Content-Type: application/json' \