Commit 13310cf

committed
API channel
1 parent ac132ed commit 13310cf

42 files changed

Lines changed: 2986 additions & 239 deletions

docs/appendices/api-reference.md

Lines changed: 147 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -4,7 +4,10 @@ The REST API serves two roles: the web channel interface (browser SPA talks dire
44

55
Base URL: `/v1`
66

7-
All endpoints require JWT authentication unless noted otherwise. The JWT is sent as `Authorization: Bearer <token>`.
7+
All endpoints require authentication unless noted otherwise. Two token types are accepted:
8+
9+
- **JWT access tokens** (`Authorization: Bearer eyJ...`) -- for interactive users. Required on everything except `/v1/api/*`.
10+
- **Service-account tokens** (`Authorization: Bearer surg_sk_...`) -- for programmatic clients. Accepted **only** on `/v1/api/*`; refused elsewhere. See [Service-Account Admin CRUD](#service-accounts-admin).
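
The routing rule for the two token types can be sketched as a small predicate. This is a minimal illustration, not the server's actual middleware: the function names are hypothetical, and treating any bearer token without the `surg_sk_` prefix as a JWT is an assumption made for brevity.

```python
SERVICE_ACCOUNT_PREFIX = "surg_sk_"
API_CHANNEL_PREFIX = "/v1/api/"


def token_kind(authorization_header: str) -> str:
    """Classify a bearer token by its prefix (hypothetical helper)."""
    scheme, _, token = authorization_header.partition(" ")
    if scheme != "Bearer" or not token:
        raise ValueError("expected 'Authorization: Bearer <token>'")
    # Assumption: anything that is not a service-account token is a JWT.
    return "service_account" if token.startswith(SERVICE_ACCOUNT_PREFIX) else "jwt"


def is_accepted(path: str, authorization_header: str) -> bool:
    """Service-account tokens only on /v1/api/*, JWTs only elsewhere."""
    on_api_channel = path.startswith(API_CHANNEL_PREFIX)
    if token_kind(authorization_header) == "service_account":
        return on_api_channel
    return not on_api_channel
```

For example, `is_accepted("/v1/api/prompts", "Bearer surg_sk_abc")` is true, while the same token on `/v1/admin/service-accounts` is refused.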
811

912
## Auth Endpoints
1013

@@ -323,6 +326,112 @@ Add an MCP server configuration.
323326

324327
Remove an MCP server.
325328

329+
## Feedback (API Channel)
330+
331+
Service-account clients — typically an automated judge grading pipeline
332+
output — record feedback against an `llm.response` or `expert.result`
333+
event through the same handler that serves the web UI, mounted under
334+
the `/v1/api/*` prefix so SA tokens can reach it.
335+
336+
### `POST /v1/api/sessions/{session_id}/events/{event_id}/feedback`
337+
338+
**Request:**
339+
```json
340+
{
341+
"rating": "up",
342+
"score": 0.87,
343+
"criteria": {"correctness": 0.9, "relevance": 0.85},
344+
"rationale": "Matches the reference; arithmetic is correct."
345+
}
346+
```
347+
348+
- `rating` (required, `"up"` or `"down"`) — binary bucket used by
349+
training-data selectors.
350+
- `score` (optional, `0.0-1.0`) — numeric grade when the principal is a
351+
judge; ignored by bucket-oriented selectors.
352+
- `criteria` (optional dict of string → float) — per-axis grades.
353+
- `rationale` (optional, max 10,000 chars) — free-form text the judge
354+
produced.
355+
- `reason` (optional, max 500 chars) — the shorter, human-UI-friendly
356+
explanation; interchangeable with `rationale` on the server side.
357+
358+
**Response (201):**
359+
```json
360+
{
361+
"event_id": 42,
362+
"event_type": "user.feedback",
363+
"source": "judge"
364+
}
365+
```
366+
367+
`source` is `"judge"` when the caller presented a service-account token
368+
and `"user"` when the caller presented an interactive JWT. The value is stored on
369+
the event's JSONB payload so downstream training-data selection and
370+
dashboards can weight the two independently.
371+
372+
**Idempotency.** Dedupe is keyed on `(session_id, event_id, principal)`
373+
where `principal` is the caller's `user_id` for JWT callers and
374+
`service_account_id` for SA callers. A retry from the same principal
375+
returns the original feedback event unchanged; feedback from a user
376+
and from a judge on the same turn coexist as two independent events.
377+
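
The dedupe rule can be illustrated with an in-memory sketch. The `FeedbackStore` class and its field names are hypothetical, and the real handler presumably persists events in PostgreSQL; only the `(session_id, event_id, principal)` key and the "retry returns the original" behaviour come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackStore:
    """Hypothetical in-memory stand-in for the feedback dedupe logic."""

    _by_key: dict = field(default_factory=dict)
    _next_id: int = 1

    def record(self, session_id: str, event_id: int, principal: str, rating: str) -> dict:
        if rating not in ("up", "down"):
            raise ValueError("rating must be 'up' or 'down'")
        key = (session_id, event_id, principal)
        existing = self._by_key.get(key)
        if existing is not None:
            # Retry from the same principal: return the original event unchanged.
            return existing
        event = {"event_id": self._next_id, "event_type": "user.feedback", "rating": rating}
        self._next_id += 1
        self._by_key[key] = event
        return event
```

A second `record` call with the same principal returns the first event even if the rating differs, while a call from a different principal on the same turn creates a separate event.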
378+
## Prompts (API Channel)
379+
380+
Fire-and-forget prompt submission for non-interactive clients. Requires a service-account token. Results are read back from the `events` table by `session_id`. See [Channels / API](../channels/api.md) for the end-to-end pipeline workflow.
381+
382+
### `POST /v1/api/prompts`
383+
384+
Submit a single prompt.
385+
386+
**Request:**
387+
```json
388+
{
389+
"prompt": "Write a haiku about distributed systems.",
390+
"idempotency_key": "dataset-42/row-1337",
391+
"metadata": {"dataset_id": "ds_123", "row_index": 1337}
392+
}
393+
```
394+
395+
- `prompt` (required, max 200,000 chars).
396+
- `idempotency_key` (optional, max 200 chars) -- two submissions with the same key from the same org resolve to the same session; the second returns `deduplicated: true` and enqueues no new work.
397+
- `metadata` (optional dict) -- stored on `sessions.config['pipeline_metadata']`; the pipeline joins results back to its dataset via this field.
398+
399+
**Response (202):**
400+
```json
401+
{
402+
"session_id": "8f...",
403+
"event_id": 42,
404+
"deduplicated": false,
405+
"error": null
406+
}
407+
```
408+
409+
### `POST /v1/api/prompts:batch`
410+
411+
Submit up to 100 prompts in one round-trip. Each entry is processed independently; partial failures surface per-slot, not as a whole-request 500 (unless every entry fails).
412+
413+
**Request:**
414+
```json
415+
{
416+
"prompts": [
417+
{"prompt": "...", "idempotency_key": "row-1"},
418+
{"prompt": "...", "idempotency_key": "row-2"}
419+
]
420+
}
421+
```
422+
423+
**Response (202):**
424+
```json
425+
{
426+
"results": [
427+
{"session_id": "...", "event_id": 1, "deduplicated": false, "error": null},
428+
{"session_id": "...", "event_id": 2, "deduplicated": true, "error": null}
429+
]
430+
}
431+
```
432+
433+
Input order is preserved so the caller can zip results back to its input rows.
434+
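
Because input order is preserved, a pipeline can split its rows into batches of at most 100 and zip each response back onto the rows it sent. The helpers below are a hypothetical client-side sketch; treating a non-null `error` as a failed slot is an assumption based on the response shape shown above.

```python
from typing import Iterator

MAX_BATCH_SIZE = 100


def chunked(rows: list, size: int = MAX_BATCH_SIZE) -> Iterator[list]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def join_results(rows: list, results: list) -> tuple:
    """Pair each input row with its result slot; split accepted from failed."""
    if len(rows) != len(results):
        raise ValueError("results must have one slot per submitted row")
    accepted, failed = [], []
    for row, result in zip(rows, results):
        # Assumption: a non-null "error" marks a per-slot failure.
        (failed if result["error"] else accepted).append((row, result))
    return accepted, failed
```

Failed slots can then be resubmitted with the same `idempotency_key` without risking duplicate sessions for the slots that succeeded.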
326435
## Admin
327436

328437
These endpoints require admin permissions.
@@ -347,6 +456,43 @@ Create a user in an organization.
347456

348457
Install the Slack bot for an organization.
349458

459+
### Service Accounts (Admin) {#service-accounts-admin}
460+
461+
Issue and revoke service-account tokens that authenticate the API channel. All endpoints require the `admin` permission. Tokens have the prefix `surg_sk_`; the raw value is returned once on creation and is not recoverable.
462+
463+
#### `POST /v1/admin/service-accounts`
464+
465+
Issue a new token.
466+
467+
**Request:**
468+
```json
469+
{"org_id": "00000000-...", "name": "dataset-gen-v1"}
470+
```
471+
472+
**Response (201):**
473+
```json
474+
{
475+
"id": "uuid",
476+
"org_id": "00000000-...",
477+
"name": "dataset-gen-v1",
478+
"token_prefix": "surg_sk_abcd1234",
479+
"created_at": "2025-01-01T00:00:00Z",
480+
"last_used_at": null,
481+
"revoked_at": null,
482+
"token": "surg_sk_<44 chars>"
483+
}
484+
```
485+
486+
Store the `token` immediately -- it is returned only in this response; afterwards only the `token_prefix` can be read back.
487+
488+
#### `GET /v1/admin/service-accounts?org_id={id}`
489+
490+
List service accounts for an org. `token` is never returned.
491+
492+
#### `DELETE /v1/admin/service-accounts/{id}`
493+
494+
Revoke a service account. Revocation takes effect immediately in the process that handles the request; peer API and worker processes converge within 60 seconds (the TTL of the in-memory auth cache). A second delete on the same id returns 404.
495+
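
The convergence window follows from how a TTL cache behaves. The sketch below is a simplified model, not the actual cache: `AuthCache`, its methods, and the injected `lookup` callback are hypothetical, and only the 60-second TTL comes from the text above.

```python
class AuthCache:
    """Hypothetical per-process cache of token validity with a fixed TTL."""

    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self._entries = {}  # token_hash -> (is_valid, cached_at)

    def check(self, token_hash: str, now: float, lookup) -> bool:
        entry = self._entries.get(token_hash)
        if entry is not None and now - entry[1] < self.ttl:
            # Fresh entry: answer from memory, possibly stale after a revoke elsewhere.
            return entry[0]
        valid = lookup(token_hash)  # fall through to the source of truth
        self._entries[token_hash] = (valid, now)
        return valid

    def evict(self, token_hash: str) -> None:
        """Called by the revoking process so its own view is immediately current."""
        self._entries.pop(token_hash, None)
```

In this model the revoking process calls `evict` and sees the revocation on its next check, while a peer keeps answering from its cached entry until that entry is 60 seconds old.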
350496
## Health and Metrics
351497

352498
### `GET /health`

docs/appendices/glossary.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,8 @@
44
|---|---|
55
| **ABAC** | Attribute-Based Access Control. Policy rules that evaluate attributes of the user, session, tool arguments, or environment to make access decisions. Example: "allow `refund_user` only if `amount < 1000`". |
66
| **AGT** | Agent Governance Toolkit. Microsoft's open-source library for agent policy enforcement, MCP security scanning, and capability modeling. Surogates uses AGT's `PolicyEngine`, `MCPSecurityScanner`, and `CapabilityModel`. |
7-
| **Channel** | The user-facing interface. Surogates has no CLI. Users interact through channels: the web chat UI and Slack. Each channel has an adapter that normalizes platform messages into the internal API. |
7+
| **API Channel** | The programmatic channel. Non-interactive clients (synthetic-data pipelines, batch jobs) submit prompts via `POST /v1/api/prompts` with a service-account token. Sessions have `channel="api"` and no user identity; results are read directly from the `events` table. |
8+
| **Channel** | The user-facing interface. Surogates has no CLI. Users interact through channels: web, Slack, Telegram, and the API channel for programmatic clients. Each has an adapter (or, for web/API, a REST endpoint set) that normalizes inbound messages into the internal API. |
89
| **Channel Identity** | A mapping between a platform-specific user ID (e.g., Slack user `U03ABCDEF`) and an internal Surogates user. Enables cross-channel session sharing. |
910
| **Cursor** | The last fully-processed event ID for a session. Used for crash recovery -- the new worker replays events after the cursor. Also used by SSE clients to resume event streams without data loss. |
1011
| **Delivery Outbox** | A PostgreSQL table that acts as a durable queue for outbound messages. Channel adapters claim rows, send messages, and mark them as delivered. Redis nudges are a latency optimization, not the source of truth. |
@@ -20,6 +21,7 @@
2021
| **Org** | Organization. The top-level tenant boundary. Each org has its own users, skills, memory, credentials, MCP servers, and policies. |
2122
| **Saga** | A tracked sequence of tool calls with automatic rollback. When a step fails, previously completed steps are compensated in reverse order -- builtin tools via filesystem checkpoints, MCP tools via declared undo operations. Named after the [saga pattern](https://microservices.io/patterns/data/saga.html) from distributed systems. |
2223
| **Sandbox** | An isolated execution environment where the LLM's generated code runs. In development: a subprocess in a temp directory. In production: a dedicated K8s pod with s3fs-fuse workspace mount. Also called "the hands". |
24+
| **Service Account** | An org-scoped principal used by non-interactive clients to authenticate against the API channel. Issued by an admin via `POST /v1/admin/service-accounts`; produces a long-lived `surg_sk_...` bearer token that is accepted only on `/v1/api/*` routes and carries no user identity. |
2325
| **Session** | A conversation between a user and an agent. Backed by an append-only event log in PostgreSQL. Sessions survive crashes -- any worker can resume from the last event. |
2426
| **Session Source** | Metadata about where a message came from: platform, chat ID, chat type, user ID, thread ID. Used to route messages to the correct session. |
2527
| **Skill** | A reusable, prompt-based behavior defined in a `SKILL.md` file. Skills are loaded from three layers (platform > org > user) with last-wins precedence. |

docs/architecture/index.md

Lines changed: 16 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ Surogates follows the three-component model: decouple the brain from the hands,
77
```
88
+-----------------------------------------------------------------+
99
| Channel Adapters |
10-
| Web Chat UI (SPA) | Slack |
10+
| Web SPA | Slack | Telegram | API (service account) |
1111
+---------------+-------+---------+---------+------------+--------+
1212
|
1313
+---------------v-------------------------------------------------+
@@ -110,6 +110,21 @@ The sandbox runs the full `surogates` Python package. A `tool-executor` script a
110110
5. Adapter formats payload -> sends via platform API -> marks row delivered
111111
```
112112

113+
### API Channel (Programmatic)
114+
115+
```
116+
1. Pipeline: POST /v1/api/prompts with a service-account token (surg_sk_...)
117+
2. API Server: resolve service account -> create session (channel="api",
118+
user_id=NULL) -> emit user.message event -> enqueue to Redis -> 202
119+
3. Worker: dequeue -> wake(session_id) -> harness loop -> events emitted
120+
4. Pipeline: reads results back from the `events` table keyed by session_id
121+
(no streaming, no SSE). `sessions.status` indicates completion.
122+
```
123+
124+
API-channel sessions never appear in the delivery outbox -- pipelines pull
125+
directly from PostgreSQL. See [Channels / API](../channels/api.md) for the
126+
request/response schema and idempotency semantics.
127+
113128
### Crash Recovery
114129

115130
```

docs/audit/views.md

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,12 @@ External consumers should prefer views over raw JSONB queries — adding
1111
new keys to an event's JSONB payload never breaks a view-backed
1212
query, because the view's column list stays fixed.
1313

14+
**A note on `user_id`.** The column is `NULL` for every event in an
15+
API-channel session (sessions submitted via `POST /v1/api/prompts`,
16+
owned by a service account instead of a user). Dashboards that group
17+
by `user_id` should also group by `channel` to avoid collapsing
18+
every service-account session into a single "unknown user" bucket.
19+
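
The effect of the extra grouping key can be seen in a small sketch. The row shape here is hypothetical (dicts with `channel` and `user_id` keys, as a dashboard might fetch them from a view); it only illustrates why grouping by `user_id` alone merges unrelated sessions.

```python
from collections import Counter


def count_by_user(rows: list) -> Counter:
    """Group by user_id only: every NULL user lands in one bucket."""
    return Counter(row["user_id"] for row in rows)


def count_by_channel_and_user(rows: list) -> Counter:
    """Group by (channel, user_id) so API-channel rows stay separate."""
    return Counter((row["channel"], row["user_id"]) for row in rows)
```

With two API-channel rows and one web row, the first function reports a single `None` bucket of size two, while the second reports them under `("api", None)`, clearly attributed to the API channel.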
1420
| View | Driven by | Purpose |
1521
|---|---|---|
1622
| [`v_session_tree`](#v_session_tree) | `sessions.parent_id` | Recursive ancestry for expert-delegation subtrees. |

docs/background-jobs/index.md

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -160,6 +160,11 @@ The training collector extracts successful conversation trajectories from the ev
160160
4. Write to tenant-{org_id}/shared/skills/{expert}/training/
161161
```
162162
163+
Sessions from every channel (web, Slack, Telegram, API) are considered
164+
training candidates. Synthetic-data pipelines that submit prompts via
165+
`POST /v1/api/prompts` feed successful trajectories back into expert
166+
fine-tuning exactly like human-driven sessions.
167+
163168
### Usage
164169
165170
```bash

docs/channels/api.md

Lines changed: 116 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,116 @@
1+
# API Channel
2+
3+
The API channel is a programmatic, fire-and-forget interface for non-interactive clients -- synthetic data generation pipelines, batch evaluation jobs, and any other workload that submits prompts from outside the web or messaging channels. Authentication is by org-scoped API key ("service-account token"); no user identity is involved.
4+
5+
The API channel is not a chat interface -- it accepts a prompt, creates a session, queues it for the worker, and returns the session identifier. Results are read directly from the `events` and `sessions` database tables.
6+
7+
## When to use it
8+
9+
| Use case | Example |
10+
|---|---|
11+
| Synthetic training-data generation | A pipeline iterates over dataset rows, submits each prompt as a session, and later sweeps the `events` table for `llm.response` rows to harvest completions. |
12+
| Automated evaluations | A scorer submits thousands of prompts in parallel and reads `events.data` for downstream metrics. |
13+
| Scheduled bulk work | A cron job dispatches org-wide prompt runs. |
14+
15+
Do **not** use the API channel for interactive experiences -- use the [web channel](web.md) instead, which streams tokens and tool calls live over SSE.
16+
17+
## Authentication
18+
19+
The client presents an API key in the `Authorization: Bearer` header. API keys have the prefix `surg_sk_` and are issued to an org by an admin:
20+
21+
```
22+
POST /v1/admin/service-accounts
23+
Authorization: Bearer <admin-jwt>
24+
25+
{
26+
"org_id": "00000000-...",
27+
"name": "dataset-gen-v1"
28+
}
29+
```
30+
31+
The raw token is returned **exactly once** in the response body (`token`). Store it immediately -- the server keeps only a SHA-256 hash and cannot recover the plaintext. List and revoke endpoints live under the same `/v1/admin/service-accounts` prefix.
32+
33+
API keys may only authenticate requests to routes under `/v1/api/*`. Presenting one anywhere else returns 403. Conversely, the `/v1/api/*` routes reject interactive JWTs so the two principal types stay cleanly separated.
34+
35+
## Submitting a prompt
36+
37+
```
38+
POST /v1/api/prompts
39+
Authorization: Bearer surg_sk_...
40+
41+
{
42+
"prompt": "Write a haiku about distributed systems.",
43+
"idempotency_key": "dataset-42/row-1337",
44+
"metadata": {
45+
"dataset_id": "ds_123",
46+
"row_index": 1337,
47+
"experiment": "baseline-v3"
48+
}
49+
}
50+
```
51+
52+
Response (`202 Accepted`):
53+
54+
```json
55+
{
56+
"session_id": "8f...",
57+
"event_id": 42,
58+
"deduplicated": false
59+
}
60+
```
61+
62+
The worker picks the session off the Redis queue and processes it asynchronously. The pipeline owns the returned `session_id` and uses it to read results from the database.
63+
64+
### Idempotency
65+
66+
`idempotency_key` is an optional client-supplied string scoped per org. Two requests from the same org with the same key resolve to the **same** session:
67+
68+
- first call -> `deduplicated: false`, new session created
69+
- second call -> `deduplicated: true`, original `session_id` returned, no new work queued
70+
71+
Use this to make pipeline retries safe under timeouts or restarts. Keys from different orgs do not collide.
72+
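
Retries are only safe if the same input row always produces the same key. One way to derive such a key is sketched below; the `dataset/row-N` format mirrors the example request, while the function name and the SHA-256 fallback for over-long keys are assumptions, not part of the API.

```python
import hashlib

MAX_KEY_CHARS = 200


def idempotency_key(dataset_id: str, row_index: int) -> str:
    """Derive a deterministic key from the row's identity (hypothetical helper)."""
    key = f"{dataset_id}/row-{row_index}"
    if len(key) <= MAX_KEY_CHARS:
        return key
    # Assumption: hash over-long keys so they fit the 200-char limit.
    return "sha256:" + hashlib.sha256(key.encode("utf-8")).hexdigest()
```

Keys built from random values or timestamps would defeat deduplication, because a retry after a timeout would generate a different key and create a second session.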
73+
### Metadata passthrough
74+
75+
Anything in `metadata` is persisted onto `sessions.config['pipeline_metadata']`. The pipeline joins results back to its source dataset by querying for sessions with specific metadata values -- no side-table required.
76+
77+
## Submitting a batch
78+
79+
```
80+
POST /v1/api/prompts:batch
81+
Authorization: Bearer surg_sk_...
82+
83+
{
84+
"prompts": [
85+
{"prompt": "...", "idempotency_key": "row-1", "metadata": {"i": 1}},
86+
{"prompt": "...", "idempotency_key": "row-2", "metadata": {"i": 2}}
87+
]
88+
}
89+
```
90+
91+
Each entry is accepted independently. The response preserves input order so callers can zip results back to their input rows. Up to 100 prompts per request.
92+
93+
## Reading results
94+
95+
Each submitted prompt becomes a session (`channel='api'`). The pipeline reads:
96+
97+
| Signal | Source |
98+
|---|---|
99+
| Final LLM answer | `events` rows with `type = 'llm.response'` for the session |
100+
| Tool calls / tool results | `events` rows with `type IN ('tool.call', 'tool.result')` |
101+
| Completion status | `sessions.status` (`active`, `idle`, `completed`, `failed`) |
102+
| Cost / token usage | `sessions.input_tokens`, `sessions.output_tokens`, `sessions.estimated_cost_usd` |
103+
| Pipeline metadata | `sessions.config->'pipeline_metadata'` |
104+
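
The read path can be sketched with an in-memory SQLite database standing in for PostgreSQL. Everything about the schema beyond the names in the table above is an assumption: the `id` columns, storing `events.data` as JSON text, and the `content` payload key are illustrative only, and a real PostgreSQL driver would use its own placeholder syntax.

```python
import json
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny stand-in schema (assumed columns, not the real one)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sessions (id TEXT PRIMARY KEY, channel TEXT, status TEXT);
        CREATE TABLE events (
            id INTEGER PRIMARY KEY, session_id TEXT, type TEXT, data TEXT
        );
        """
    )
    conn.execute("INSERT INTO sessions VALUES ('s1', 'api', 'completed')")
    conn.executemany(
        "INSERT INTO events (session_id, type, data) VALUES (?, ?, ?)",
        [
            ("s1", "user.message", json.dumps({"content": "prompt"})),
            ("s1", "llm.response", json.dumps({"content": "haiku"})),
        ],
    )
    return conn


def read_results(conn: sqlite3.Connection, session_id: str) -> tuple:
    """Return (sessions.status, decoded llm.response payloads) for one session."""
    row = conn.execute(
        "SELECT status FROM sessions WHERE id = ?", (session_id,)
    ).fetchone()
    if row is None:
        raise KeyError(session_id)
    events = conn.execute(
        "SELECT data FROM events "
        "WHERE session_id = ? AND type = 'llm.response' ORDER BY id",
        (session_id,),
    ).fetchall()
    return row[0], [json.loads(data) for (data,) in events]
```

A pipeline would typically check the returned status first and harvest the responses only once the session is no longer being processed.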
105+
The `v_session_messages` view returns conversation-shaped events in training-data format; the `v_response_feedback` and `v_tool_invocations` views expose related signals. See [docs/audit/views.md](../audit/views.md) for the full catalog.
106+
107+
## Recording judge feedback
108+
109+
Pipelines that run an automated judge over their outputs record the judge's grade by sending `POST /v1/api/sessions/{session_id}/events/{event_id}/feedback`, authenticated with the same service-account token. The endpoint accepts a binary `rating` (required), a numeric `score`, per-axis `criteria`, and a free-form `rationale`. The stored event carries `source: "judge"` so downstream training-data selection can weight judge feedback independently of human thumbs-up/down ratings. See [Appendix B: Feedback (API Channel)](../appendices/api-reference.md#feedback-api-channel) for the full schema and idempotency semantics.
110+
111+
## Interaction with other subsystems
112+
113+
- **Training data**: API sessions participate in `TrainingDataCollector` exports on the same footing as every other channel -- successful expert delegations and skill invocations from pipeline-submitted prompts are eligible for fine-tuning.
114+
- **Idle reset**: the session-reset CronJob resets API sessions in place without running the memory-flush agent -- service accounts have no per-user memory.
115+
- **Memory**: API sessions use the org-shared memory directory, not user-scoped memory.
116+
- **Permissions**: API keys carry no permissions; access is scoped entirely by org membership. They cannot reach admin, auth, or any other `/v1/` route outside `/v1/api/*`.

docs/channels/index.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,7 @@ There is no CLI. All user interaction happens through channels. A channel is an
99
| **[Web](web.md)** | Browser-based chat UI with real-time streaming, session management, and workspace browsing |
1010
| **[Slack](slack.md)** | Socket Mode integration with DMs, @mentions, threading, file attachments, and multi-workspace support |
1111
| **[Telegram](telegram.md)** | Bot API integration with DMs, groups, forum topics, media handling, and fallback IP transport for restricted networks |
12+
| **[API](api.md)** | Fire-and-forget programmatic channel for synthetic data pipelines and batch jobs. Service-account auth, idempotent submission, results read from database tables. |
1213

1314
## Session Routing
1415

docs/index.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -39,6 +39,7 @@ Built on Kubernetes, Surogates implements the [Managed Agents architecture](http
3939
### [5. Multi-Tenancy](multi-tenancy/index.md)
4040
- Tenant model (orgs, users, channel identities)
4141
- Authentication (database provider)
42+
- Service-account tokens for programmatic access (API channel)
4243
- Per-org provider configuration
4344
- JWT token flow (issuance, refresh, validation)
4445
- Tenant context and credential vault
@@ -48,6 +49,8 @@ Built on Kubernetes, Surogates implements the [Managed Agents architecture](http
4849
- Channel adapter protocol
4950
- [Web](channels/web.md) -- browser chat UI with real-time streaming, session management, workspace browsing
5051
- [Slack](channels/slack.md) -- setup guide, Socket Mode, DMs, @mentions, threading, file attachments, multi-workspace
52+
- [Telegram](channels/telegram.md) -- Bot API, DMs, groups, forum topics, media handling
53+
- [API](channels/api.md) -- fire-and-forget programmatic channel for synthetic-data pipelines and batch jobs
5154
- Session routing and response delivery (durable outbox, Redis nudges)
5255

5356
### [7. Tools](tools/index.md)
