Commit 1034670

docs(deployments/basilica): embedding the lifecycle in your app (#232)
Adds an "Embedding in your app (skip GitHub Actions)" section covering the alternative integration mode for apps that already have a worker / queue and don't want the Actions wrapper. Covers:

- Why skip the workflow (no public log surface, lower latency, native error handling, app-owned trigger semantics)
- Distribution options (vendor, submodule, future pip-installable, reimplement against basilica-sdk directly)
- Library API at a glance with a complete provision example, subsequent status / update / deprovision shapes
- Subprocess invocation for language-agnostic callers (Node / Go / Rust / Ruby)
- Async wrapping (asyncio.to_thread adapter for FastAPI / Starlette)
- Background-worker pattern (Celery / RQ / dramatiq / etc.), the recommended production shape with idempotent re-entry, retries, state persistence
- Error handling table (ValueError / RuntimeError / TimeoutError semantics + when to retry)
- Secret handling without the workflow (direct dict-passing, never serialised to logs)
- BASILICA_API_TOKEN auth + rotation
- When to still use the workflow (no worker yet, audit, ops, prototype)

No code changes; documentation-only.
1 parent 738da08 commit 1034670

1 file changed

Lines changed: 264 additions & 0 deletions

deployments/basilica/README.md

@@ -290,6 +290,270 @@ tenant-supplied values).
Basilica. Treat the Basilica account as a trust boundary; revoke + rotate
any tenant key that leaks if the account is compromised.

## Embedding in your app (skip GitHub Actions)

The GitHub Actions workflow is a convenience wrapper around `lifecycle.py`.
If your app already has a worker / queue / scheduler, you can drive
provisioning directly and skip the workflow entirely. Reasons you might
want this:

- **Zero public log surface**: workflow runs in a public repo leak
  tenant IDs, URLs, and timestamps; an in-process call leaves no public trace.
- **Lower latency**: no Actions runner cold-start (typically 10–30s of
  startup before the CLI even runs).
- **Tighter secret handling**: bearer tokens never leave the app's process or
  its secret store; nothing rides through GitHub's webhook surface.
- **Native error handling**: Python exceptions instead of parsing
  step-output JSON.
- **Your app's auth/RBAC/idempotency** wraps the trigger, so there are no
  duplicate-dispatch issues.

### Dependency footprint

Just two pip packages:

```bash
pip install basilica-sdk PyYAML
```

`PyYAML` is only needed if you load configs from YAML files; if you
construct `TenantSpec` directly in code, you can drop it.

### Distribution options

| Option | When |
|---|---|
| **Vendor** `deployments/basilica/{__init__,lifecycle,cli}.py` + `configs/examples/*.yaml` into your app's repo | Simplest; ~700 LoC total. Lets your app evolve the library independently. Pin the upstream commit in a comment for traceability |
| **Git submodule / subtree** | If you want upstream changes to flow in semi-automatically |
| **`pip install git+https://github.com/techlab-innov/llmtrace.git`** | Not currently set up: the repo isn't packaged as a pip-installable; it would need a top-level `pyproject.toml` exposing `deployments.basilica`. Open an issue if you want this |
| **Re-implement using `basilica-sdk` directly** in your app's language (Node, Go, Rust, etc.) | If your app isn't Python. The Python library is ~400 LoC of wrapping; the underlying SDK is what does the work. See `lifecycle.py:213-249` (`_create_component`) for the minimum shape |

### Library API at a glance

```python
from deployments.basilica.lifecycle import (
    ComponentSpec, TenantSpec, TenantInstances, InstanceInfo,
    provision, update, deprovision, status, make_client,
)

# Make a client once (uses BASILICA_API_TOKEN env, or pass api_key explicitly).
client = make_client(api_key="basilica_...")

# Build a spec for the tenant (everything is required from you; no defaults).
# `tenant_record` is your app's own DB row for the tenant.
spec = TenantSpec(
    tenant_id="acme",
    proxy=ComponentSpec(
        image="ghcr.io/techlab-innov/llmtrace-proxy:latest",
        port=8080,
        cpu="2",
        memory="4Gi",
        replicas=1,
        env={
            "LLMTRACE_UPSTREAM_URL": "https://api.openai.com",
            "OPENAI_API_KEY": tenant_record.openai_api_key,  # from your DB
            "LLMTRACE_STORAGE_PROFILE": "memory",
            "LLMTRACE_ML_ENABLED": "1",
            "LLMTRACE_LOG_LEVEL": "info",
            "LLMTRACE_LOG_FORMAT": "json",
            "RUST_LOG": "info",
        },
        startup_timeout_seconds=600,
    ),
    dashboard=ComponentSpec(
        image="ghcr.io/techlab-innov/llmtrace-dashboard:latest",
        port=3000,
        cpu="1",
        memory="1Gi",
        replicas=1,
        env={
            "HOSTNAME": "0.0.0.0",
            "NODE_ENV": "production",
            "LLMTRACE_AUTH_ADMIN_KEY": tenant_record.dashboard_admin_key,
        },
        startup_timeout_seconds=300,
    ),
    # inject_proxy_url_into_dashboard defaults to True
)

# Provision (blocking; see below for the async pattern).
instances: TenantInstances = provision(spec, client=client)

# Persist the UUIDs to your DB; they're the only handle for future ops.
tenant_record.proxy_instance_id = instances.proxy.instance_id
tenant_record.proxy_url = instances.proxy.url
tenant_record.dashboard_instance_id = instances.dashboard.instance_id
tenant_record.dashboard_url = instances.dashboard.url
tenant_record.save()
```

Subsequent ops use the persisted UUIDs:

```python
# Health check before sending traffic
current = status(
    tenant_id=tenant_record.id,
    proxy_instance_id=tenant_record.proxy_instance_id,
    dashboard_instance_id=tenant_record.dashboard_instance_id,
    client=client,
)
if current.proxy is None or current.proxy.state != "Active":
    ...  # tenant's proxy went missing: recover (re-provision, alert, etc.)

# Plan upgrade: recreates with the new spec; the URL changes
new_instances = update(
    spec=upgraded_spec,  # a TenantSpec built the same way as for provision
    proxy_instance_id=tenant_record.proxy_instance_id,
    dashboard_instance_id=tenant_record.dashboard_instance_id,
    strategy="recreate",
    client=client,
)
tenant_record.proxy_instance_id = new_instances.proxy.instance_id  # overwrite!
tenant_record.proxy_url = new_instances.proxy.url
# ... same for dashboard, then save

# Stripe cancellation → deprovision
deprovision(
    tenant_id=tenant_record.id,
    proxy_instance_id=tenant_record.proxy_instance_id,
    dashboard_instance_id=tenant_record.dashboard_instance_id,
    client=client,
)
```

### Subprocess invocation (language-agnostic)

If your app isn't Python, run the CLI as a subprocess with the secrets in
the child's environment. The CLI emits JSON to stdout; parse and persist:

```bash
# From a Node / Go / Rust / Ruby worker
OPENAI_API_KEY="$tenant_openai_key" \
LLMTRACE_UPSTREAM_URL="https://api.openai.com" \
BASILICA_API_TOKEN="$platform_token" \
python -m deployments.basilica.cli provision \
  --tenant-id "$tenant_id" \
  --config /etc/llmtrace/tenant-config.yaml
# stdout is JSON:
# { "tenant_id": "...", "proxy_instance_id": "...", "proxy_url": "...", ... }
```

Exit codes: `0` success, `2` usage error, `3` lifecycle error. The CLI's
`${VAR}` substitution reads from the child's env: the same secret-injection
shape as the workflow, just driven by your app instead of the
`Inject tenant secrets` workflow step.

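Whatever the caller's language, the wrapper has the same three jobs: put the secrets in the child's environment, branch on the exit code, and parse stdout as JSON. A minimal sketch of that shape, written in Python only to keep the examples in one language; `provision_argv`, `run_lifecycle_cli`, and `LifecycleCliError` are hypothetical names, not part of the library, and the flags and exit codes are the ones quoted above:

```python
import json
import os
import subprocess
import sys


class LifecycleCliError(Exception):
    """Hypothetical wrapper error carrying the CLI's exit code and output."""

    def __init__(self, returncode: int, stdout: str, stderr: str) -> None:
        super().__init__(f"lifecycle CLI exited with code {returncode}")
        self.returncode = returncode
        self.stdout = stdout
        self.stderr = stderr


def provision_argv(tenant_id: str, config_path: str) -> list[str]:
    """Build the argv for the provision command shown in the bash example."""
    return [sys.executable, "-m", "deployments.basilica.cli", "provision",
            "--tenant-id", tenant_id, "--config", config_path]


def run_lifecycle_cli(argv: list[str], secrets: dict[str, str]) -> dict:
    """Run the command with `secrets` only in the child's env; parse JSON stdout."""
    env = {**os.environ, **secrets}  # secrets stay out of argv / the process list
    proc = subprocess.run(argv, env=env, capture_output=True, text=True)
    if proc.returncode != 0:
        # Key on the exit code (2 = usage, 3 = lifecycle), not the message text.
        raise LifecycleCliError(proc.returncode, proc.stdout, proc.stderr)
    return json.loads(proc.stdout)
```

A caller would pass `provision_argv(tenant_id, "/etc/llmtrace/tenant-config.yaml")` plus a dict holding `OPENAI_API_KEY` and `BASILICA_API_TOKEN`, then persist the returned instance IDs. A Node, Go, Rust, or Ruby worker would follow the same structure with its own process API.
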
### Async wrapping (FastAPI / Starlette / aiohttp)

The library is synchronous and blocks on `_wait_until_ready` (up to
`startup_timeout_seconds`, default 600s for the proxy). Don't call
`provision` from a request handler; hand it to a background worker instead.

Quick adapter for async code:

```python
import asyncio
from deployments.basilica import lifecycle

async def provision_async(spec: lifecycle.TenantSpec) -> lifecycle.TenantInstances:
    return await asyncio.to_thread(lifecycle.provision, spec)
```

This runs the blocking call in a thread pool so the event loop stays free
(`asyncio.to_thread` requires Python 3.9+). The same pattern works for
`update`, `deprovision`, and `status`.

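Because the adapter is identical for every blocking call, it can be written once and reused. A minimal sketch, assuming Python 3.9+; `run_blocking` is a hypothetical helper name, and `slow_status` is a stand-in for a real `lifecycle` call so the example runs without a Basilica account:

```python
import asyncio
import time
from typing import Any, Callable, TypeVar

T = TypeVar("T")


async def run_blocking(fn: Callable[..., T], *args: Any, **kwargs: Any) -> T:
    """Run any synchronous function in a worker thread without blocking the loop."""
    return await asyncio.to_thread(fn, *args, **kwargs)


def slow_status(tenant_id: str) -> str:
    """Stand-in for a blocking lifecycle call (hypothetical, for illustration)."""
    time.sleep(0.05)
    return f"{tenant_id}:Active"


async def main() -> list[str]:
    # The two blocking calls overlap because each runs in its own thread.
    return await asyncio.gather(
        run_blocking(slow_status, "acme"),
        run_blocking(slow_status, tenant_id="globex"),
    )
```

With the real library this would read `await run_blocking(lifecycle.provision, spec, client=client)`. Keyword arguments pass through unchanged, so the `status` / `update` / `deprovision` call shapes shown earlier need no per-function wrapper.
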
### Background-worker pattern (recommended)

The recommended shape for production:

```python
from deployments.basilica import lifecycle

# `worker`, `db`, `Tenant`, and `build_spec_from_tenant_record` are your app's own.
# Worker task (Celery / RQ / dramatiq / Hatchet / Temporal / etc.)
@worker.task(bind=True, max_retries=3, soft_time_limit=900)
def provision_tenant(self, tenant_id: str) -> None:
    tenant = db.session.query(Tenant).get(tenant_id)
    if tenant.proxy_instance_id:
        return  # idempotent: already provisioned

    spec = build_spec_from_tenant_record(tenant)
    try:
        result = lifecycle.provision(spec)
    except RuntimeError as exc:  # the built-in; there is no lifecycle.RuntimeError
        tenant.last_error = str(exc)
        tenant.state = "provision_failed"
        db.session.commit()
        raise self.retry(exc=exc, countdown=60)

    tenant.proxy_instance_id = result.proxy.instance_id
    tenant.proxy_url = result.proxy.url
    tenant.dashboard_instance_id = result.dashboard.instance_id
    tenant.dashboard_url = result.dashboard.url
    tenant.state = "active"
    db.session.commit()
```

Flow: Stripe webhook → enqueue task → worker pulls → calls
`lifecycle.provision()` → persists UUIDs → updates tenant state. Request
handlers stay sub-100ms; the actual provision happens out-of-band.

Note that `TimeoutError` is not a subclass of `RuntimeError`, so a startup
timeout propagates out of this task without being retried.

### Error handling

The library raises three exception classes that your worker should map:

| Exception | Meaning | Recommended action |
|---|---|---|
| `ValueError` | Bad input (invalid `tenant_id` slug, unknown `strategy`, missing required spec field) | 4xx to caller; don't retry |
| `RuntimeError` | Basilica-side failure (deployment entered terminal `Failed` state, `BASILICA_API_TOKEN` missing, etc.) | 5xx + alert; retry with backoff if transient |
| `TimeoutError` | `startup_timeout_seconds` elapsed before the deployment became ready | 5xx; check Basilica's UI for a stuck deployment; consider a manual `deprovision` to clean up |

The CLI maps both `RuntimeError` and `TimeoutError` to exit code 3 with
the message in the JSON output. From a subprocess caller, key on the exit
code rather than parsing the error string.

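One way to keep the table's mapping in a single place is a small classifier that the worker and any HTTP layer both call. A minimal sketch: `Outcome` and `classify_lifecycle_error` are hypothetical names, the specific status codes (400 / 502 / 504) are assumptions since the table only says 4xx and 5xx, and treating every `RuntimeError` as retryable simplifies the table's "if transient" condition:

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    http_status: int
    retry: bool
    note: str


def classify_lifecycle_error(exc: BaseException) -> Outcome:
    """Map the three exception classes from the table to a worker decision."""
    if isinstance(exc, ValueError):
        return Outcome(400, False, "bad input; fix the request")
    if isinstance(exc, TimeoutError):
        # Not retried automatically: a stuck deployment may need cleanup first.
        return Outcome(504, False, "startup timed out; inspect and clean up")
    if isinstance(exc, RuntimeError):
        return Outcome(502, True, "Basilica-side failure; retry with backoff")
    raise exc  # anything else is unexpected and should not be swallowed
```

The three classes are unrelated in Python's built-in hierarchy (`TimeoutError` derives from `OSError`), so the order of the checks does not change the result. They are listed in the table's order of severity for readability.
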
### Secret handling without the workflow

When the workflow runs, the `Inject tenant secrets` step + `::add-mask::`
machinery exists to keep per-tenant values out of public logs. In your
app, you can do better: keep the values in process memory, pass them
directly into the `ComponentSpec.env` dict, and never serialise them
anywhere except the Basilica API call itself.

Recommended pattern:

1. Stripe webhook arrives with `tenant_id` + plan.
2. App fetches per-tenant secrets from its store (Vault / AWS Secrets
   Manager / encrypted Postgres column).
3. Worker builds `ComponentSpec.env` with the resolved values.
4. `lifecycle.provision()` ships them once to Basilica's
   `create_deployment`.
5. The values are never logged. (Basilica stores them as deployment
   env; that's the boundary the Basilica account owns.)

No environment variables, no `${VAR}` substitution layer, no GitHub
secrets. Just dict-passing, which keeps the secret flow to a single hop
from your store to the Basilica API.

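Steps 2, 3, and 5 can be sketched as two small functions: one that builds the env dict from resolved secrets, and one that produces a log-safe view of it. This is an illustration under stated assumptions, not library code: `build_proxy_env` and `loggable_env` are hypothetical names, and the secret store is modelled as a plain mapping standing in for Vault, AWS Secrets Manager, or a database column.

```python
from typing import Mapping

# Non-secret settings, taken from the example spec earlier in this section.
STATIC_ENV = {
    "LLMTRACE_UPSTREAM_URL": "https://api.openai.com",
    "LLMTRACE_LOG_FORMAT": "json",
}


def build_proxy_env(
    tenant_id: str,
    secret_store: Mapping[str, Mapping[str, str]],
) -> dict[str, str]:
    """Resolve per-tenant secrets and merge them into the env for ComponentSpec."""
    secrets = secret_store[tenant_id]  # KeyError if the tenant has no secrets
    return {**STATIC_ENV, "OPENAI_API_KEY": secrets["openai_api_key"]}


def loggable_env(env: Mapping[str, str], secret_keys: frozenset[str]) -> dict[str, str]:
    """Return a copy that is safe to log: secret values are masked, keys kept."""
    return {k: ("***" if k in secret_keys else v) for k, v in env.items()}
```

The dict returned by `build_proxy_env` goes straight into `ComponentSpec(env=...)`. If the worker needs to log what it is about to provision, it logs `loggable_env(...)` instead of the real dict, so the raw values only ever appear in the Basilica API call.
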
### Auth: how the app holds `BASILICA_API_TOKEN`

The token is the only platform-side credential. Store it the same way
your app stores its other infrastructure secrets (env var, secret
manager, IAM-role-derived). Pass it to `make_client(api_key=...)`
explicitly, or set `BASILICA_API_TOKEN` in the worker's env and let
`make_client()` pick it up.

Rotate periodically. A compromised token lets whoever holds it list /
read / modify / delete every tenant's deployment. Treat it like a root
key.

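A worker can validate the token at startup rather than discovering a missing one on the first provision. The sketch below mirrors the precedence described above (explicit value first, then the environment) and raises `RuntimeError`, the class the error table lists for a missing token. `resolve_basilica_token` is a hypothetical app-side helper, not the library's `make_client` implementation.

```python
import os
from typing import Mapping, Optional


def resolve_basilica_token(
    explicit: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Prefer an explicit token, fall back to BASILICA_API_TOKEN, else fail loudly."""
    token = explicit or env.get("BASILICA_API_TOKEN")
    if not token:
        raise RuntimeError("BASILICA_API_TOKEN is not set and no api_key was passed")
    return token
```

The result would then be passed as `make_client(api_key=resolve_basilica_token())`. Taking `env` as a parameter keeps the function testable without mutating the process environment.
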
### When to still use the workflow

- You don't have a worker infrastructure yet and want to get a first
  tenant up without writing one.
- You want GitHub's audit log of every dispatch (in addition to your
  app's own log), which is useful for compliance.
- You want operators to fire ad-hoc lifecycle ops via the `gh` CLI from
  their laptop.
- You're in early prototyping and the workflow's 25-minute timeout is a
  useful safety net.

## Provider examples

Per-tenant `tenant_secrets` shape for each provider. The proxy forwards
