feat(basilica): mint scoped operator key for tenant runtime traffic (#239)
Splits per-tenant proxy auth into two roles so a compromised tenant app
no longer holds admin scope on its own LLMTrace pod:
- admin_key (bootstrap): retained by the caller. Used by the lifecycle
layer to mint per-tenant keys and by the self-service / admin portal.
- api_key (operator): minted on the live proxy via POST /api/v1/auth/keys
after readiness, returned in TenantInstances.api_key. This is the
bearer the tenant's runtime apps use.
provision() now bootstraps the per-pod tenant row via POST /api/v1/tenants,
mints the operator key, and injects LLMTRACE_AUTH_RUNTIME_KEY into the
dashboard env (informational; dashboard wiring is a follow-up).
update(strategy="restart") rediscovers the tenant by label, lists keys,
and re-mints only when the operator record is missing. update(strategy=
"recreate") always re-mints since the DB volume is destroyed.
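
The restart / recreate rule above can be sketched as a small decision function. This is an illustration only: `needs_remint`, the dict-shaped key records, and their `name` and `revoked` fields are assumptions and not the library's confirmed API. Only the `tenant-runtime` key name and the rule itself come from this change.

```python
from typing import Iterable, Mapping

RUNTIME_KEY_NAME = "tenant-runtime"  # operator key name used in the README


def needs_remint(strategy: str, keys: Iterable[Mapping[str, object]]) -> bool:
    """Return True when update() should mint a fresh operator key.

    `keys` stands in for the listing returned by GET /api/v1/auth/keys;
    the `name` and `revoked` fields are assumed, not confirmed.
    """
    if strategy == "recreate":
        # The DB volume is destroyed, so no stored key record survives.
        return True
    if strategy == "restart":
        # Re-mint only if no live (non-revoked) operator record is found.
        return not any(
            k.get("name") == RUNTIME_KEY_NAME and not k.get("revoked", False)
            for k in keys
        )
    raise ValueError(f"unknown strategy: {strategy!r}")
```

When this returns `False` on a restart, the caller keeps using the plaintext it stored earlier, since the proxy holds only a hash of the key.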
cli.py emits admin_key alongside api_key. The tenant-lifecycle workflow
masks BOTH keys via ::add-mask:: before any cat result.json and exposes
both as step outputs.
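
The mask-before-print ordering can be sketched in Python. The actual workflow step is not shown in this commit message, so the helper name `mask_commands` and the sample key values below are hypothetical; `::add-mask::` itself is the standard GitHub Actions workflow command.

```python
from typing import List, Mapping, Optional


def mask_commands(result: Mapping[str, Optional[str]]) -> List[str]:
    """Build ::add-mask:: workflow commands for every key present in result."""
    commands = []
    for field in ("admin_key", "api_key"):
        value = result.get(field)
        if value:  # skip keys that are absent or null in the result
            commands.append(f"::add-mask::{value}")
    return commands


# Placeholder values for illustration; real keys come from result.json.
result = {"admin_key": "llmt_admin_example", "api_key": "llmt_operator_example"}

# Emit the masks first; only after that is it safe to echo the JSON.
for line in mask_commands(result):
    print(line)
```

Emitting the mask commands before any step prints the file means later log output has both values redacted by the runner.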
Adds deployments/basilica/tests/ with 19 unit tests that exercise the
real urllib admin-API client against an in-process http.server, plus a
provision() integration test using a fake Basilica client.
Both demonstrate `${VAR}` substitution and the optional override pattern.
 
-## Per-tenant API key auth (secure by default)
+## Per-tenant API key auth (two-tier, scoped runtime key)
 
 LLMTrace's proxy has built-in API-key auth (`crates/llmtrace-proxy/src/auth.rs`).
 Without it, the public Basilica URL accepts any request: anyone with the
 URL can burn the tenant's upstream quota and pollute their traces. With it
 on, every non-`/health` request must carry `Authorization: Bearer llmt_<key>`
 or get a 401.
 
-The lifecycle library enables this **by default** at provision time:
+The lifecycle library enables auth **by default** at provision time, and
+issues TWO keys with distinct scopes:
 
-1. Resolves a key, in priority order:
+| Key | Role | Who holds it | What it can do |
+|---|---|---|---|
+| `admin_key` | bootstrap admin | the **caller** (your app / portal) | Mint / list / revoke per-tenant keys, manage tenants, read audit logs, change feature flags. Used by the lifecycle layer itself and by your self-service / admin portal pages |
-A 401 means they used the wrong key. A 200 means LLMTrace authenticated
-them, then forwarded upstream (with whatever upstream auth was configured
-in `tenant_secrets`).
+A 401 means they used the wrong key. A 403 with "Insufficient permissions"
+means they tried an admin-only endpoint (`/api/v1/auth/keys`,
+`/api/v1/tenants`, etc.) with the operator key, by design.
+
+### Update semantics
+
+| Strategy | DB persistence | Operator key behaviour |
+|---|---|---|
+| `recreate` | DB volume is destroyed alongside the pods | Always re-minted. `result.api_key` carries the new operator key; caller must overwrite the stored value |
+| `restart` | Same DB volume, so rows survive | The library lists `GET /api/v1/tenants` to rediscover the tenant by label, then lists `GET /api/v1/auth/keys` for that tenant. If a non-revoked `tenant-runtime` operator key is found, `result.api_key` is `None` (carry forward the previously-stored plaintext from your DB; the proxy stores only a hash, so re-fetching the plaintext is impossible). If the record is missing (e.g. DB wipe), a fresh operator key is minted and returned |
+
+The admin key follows the same recreate / restart split: pin it via
+`spec.api_key` (or `api_key:` in YAML) if you need it stable across
+recreates; on restart it is simply re-derived from spec and not re-minted.
 
 ### Disabling auth
 
@@ -371,43 +402,41 @@ mesh, a dev sandbox, etc.), turn auth off:
 enable_proxy_auth: false
 ```
 
-The library skips key generation, doesn't inject `LLMTRACE_AUTH_*`, and
-returns `api_key: null` in the result. The proxy URL is then wide open;
-own the consequences.
-
-### Preserving the key across recreates
-
-`update(..., strategy="recreate")` calls `provision()` under the hood,
-which will auto-generate a fresh key unless you pass the existing one.
-For plan upgrades where the tenant's apps shouldn't break, fetch the