Feature: Runtime provisioning of agents (add/update/remove without restart) #694

## Problem

Multi-agent LettaBot deployments load the `agents:` array from config once at startup. There is no runtime mechanism to add, update, or remove an agent while the server is running — any change to the agent roster requires editing yaml and restarting the process.

For operators provisioning agents programmatically (per-tenant, per-user, dynamic workloads), this forces a full process restart per provisioning event. Restarts are not destructive (state persists via `lettabot-agent.json`, credentials, and the Letta server), but they interrupt in-flight work, break open WebSocket/Socket-Mode connections to channel providers, and make programmatic provisioning awkward.

## Current behavior (source references)

- `src/main.ts` iterates `agents:` once and calls `gateway.addAgent(name, session)` per entry. After startup the gateway is effectively frozen.
- `LettaGateway.addAgent()` (`src/core/gateway.ts`) is only invoked during boot. There is no corresponding `removeAgent()` called from the runtime, no `updateAgent()`, and no public entry point that would rehydrate from config.
- `src/api/server.ts` exposes `/api/v1/messages`, `/api/v1/chat`, `/api/v1/chat/async`, `/api/v1/conversation`, `/api/v1/conversations`, `/api/v1/pairing/:channel`, `/api/v1/status`, `/v1/models`, and `/v1/chat/completions`. Nothing for agent lifecycle.
- No SIGHUP handler, no yaml file watcher for the agents array, no `lettabot reload` CLI command.
- The only reload-like mechanism in the codebase is per-adapter: `reloadConfigAt` in `src/channels/bluesky/adapter.ts` refreshes a Bluesky agent's feed config via a runtime-state JSON file. It is scoped to a specific adapter's feed settings, not agent lifecycle, and does not generalize to add/remove.

## Proposed approach

Primary: a REST API surface for agent lifecycle, consistent with the existing `/api/v1/*` routes and protected by the same auth.

```
POST   /api/v1/agents              # add a new agent from inline config
GET    /api/v1/agents              # list currently loaded agents
GET    /api/v1/agents/:name        # describe one agent
PATCH  /api/v1/agents/:name        # update channels/features/conversations
DELETE /api/v1/agents/:name        # deregister and tear down channel adapters
POST   /api/v1/agents/reload       # re-read config file and diff against live state
```

Request body shape should mirror the existing `agents:` array element so config can be shuttled between yaml and API without transformation.
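
To make the route table concrete, here is a minimal TypeScript sketch of how the add/list/describe/remove routes might map onto an in-memory registry. Everything in it is an assumption for illustration: the `AgentConfig` fields, the `handleAgentRequest` function, and the status codes (201, 400, 404, 405, 409, 204) are hypothetical and not taken from `src/api/server.ts`. Auth, `PATCH`, and `reload` are left out.

```typescript
// Hypothetical shape; the real `agents:` array element carries more fields.
interface AgentConfig {
  name: string;
  channels?: Record<string, unknown>;
}

interface ApiResult {
  status: number;
  body?: unknown;
}

const AGENTS_PREFIX = "/api/v1/agents";

// Dispatch a lifecycle request against an in-memory registry.
// A real handler would also start/stop channel adapters and persist state.
function handleAgentRequest(
  registry: Map<string, AgentConfig>,
  method: string,
  path: string,
  body?: unknown,
): ApiResult {
  if (path === AGENTS_PREFIX) {
    if (method === "GET") {
      return { status: 200, body: Array.from(registry.values()) };
    }
    if (method === "POST") {
      const config = body as Partial<AgentConfig> | undefined;
      if (!config || typeof config.name !== "string" || config.name === "") {
        return { status: 400, body: { error: "name is required" } };
      }
      if (registry.has(config.name)) {
        return { status: 409, body: { error: "agent already exists" } };
      }
      registry.set(config.name, config as AgentConfig);
      return { status: 201, body: config };
    }
    return { status: 405 };
  }

  if (path.startsWith(AGENTS_PREFIX + "/")) {
    const name = decodeURIComponent(path.slice(AGENTS_PREFIX.length + 1));
    const existing = registry.get(name);
    if (!existing) return { status: 404, body: { error: "unknown agent" } };
    if (method === "GET") return { status: 200, body: existing };
    if (method === "DELETE") {
      registry.delete(name);
      return { status: 204 };
    }
    return { status: 405 };
  }

  return { status: 404 };
}
```

One design detail this surfaces: `POST /api/v1/agents/reload` has the same path shape as `/api/v1/agents/:name`, so a router would need to match `reload` before the `:name` pattern, or reserve `reload` as an agent name.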

Lifecycle semantics that need to be correct:
- Channel adapters (Telegram, Slack, Discord, Signal, WhatsApp, Bluesky) must be started/stopped cleanly, including graceful disconnect for stateful adapters.
- `lettabot-agent.json` must be updated in place so added agents persist across restarts.
- Pairing/credential files under `~/.lettabot/credentials/` for the new agent's channels must be created on first pairing as today.
- Cron jobs scoped per agent (`CRON_STORE_PATH`) must be reinitialized when adding, and drained/cancelled when removing.
- Heartbeat loops must start/stop with the agent.
- The `LETTABOT_CONFIG_YAML` path (Docker/cloud deploys) should be reconcilable with runtime state: either "yaml is the frozen boot-time declaration, runtime API layers on top" or "yaml is the source of truth and the reload endpoint diffs yaml against live state." Either is defensible; the behavior needs to be documented.
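
One way to keep the start/stop requirements above manageable is to treat each per-agent resource (channel adapter, cron scheduler, heartbeat loop) as a unit with `start` and `stop`, start them in order, and stop them in reverse. The sketch below assumes that model. `AgentResource` and `AgentRuntime` are hypothetical names, not existing LettaBot APIs.

```typescript
// Hypothetical unit of per-agent state: a channel adapter, cron scheduler, or heartbeat loop.
interface AgentResource {
  name: string;
  start(): Promise<void>;
  stop(): Promise<void>;
}

class AgentRuntime {
  private readonly resources: AgentResource[];
  private started: AgentResource[] = [];

  constructor(resources: AgentResource[]) {
    this.resources = resources;
  }

  // Start resources in order; if one fails, stop those already started and rethrow.
  async start(): Promise<void> {
    try {
      for (const resource of this.resources) {
        await resource.start();
        this.started.push(resource);
      }
    } catch (err) {
      await this.stop();
      throw err;
    }
  }

  // Stop in reverse start order. Keep going after a failure so that one
  // misbehaving adapter cannot leave the others running; return the errors.
  async stop(): Promise<Error[]> {
    const errors: Error[] = [];
    while (this.started.length > 0) {
      const resource = this.started.pop()!;
      try {
        await resource.stop();
      } catch (err) {
        errors.push(err instanceof Error ? err : new Error(String(err)));
      }
    }
    return errors;
  }
}
```

With this shape, a `DELETE /api/v1/agents/:name` handler would call `stop()` on that agent's runtime and leave every other agent's connections untouched, which is the property a full process restart cannot offer.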

Secondary (lower priority, optional): a CLI equivalent (`lettabot agents add|remove|list|reload`) that wraps the API for ops workflows.

## Use cases

- Programmatic provisioning of agents per tenant or per user from a control plane.
- Onboarding flows where a user request creates a new agent, attaches channels, and makes it live without operator intervention.
- Rolling out or rolling back a single agent without affecting the rest of a multi-agent fleet.
- CI/CD pipelines that declare the agent roster in yaml and can trigger a reload after deploy instead of a full process restart.
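
The reload-after-deploy case reduces to comparing the roster declared in yaml with what is currently loaded. Below is a minimal sketch of that comparison. It assumes agents are keyed by `name` and that a changed config can be detected by comparing serialized JSON, which is a simplification that is sensitive to key order. `RosterEntry`, `RosterDiff`, and `diffRoster` are hypothetical names.

```typescript
// Hypothetical roster entry: only `name` is assumed; other fields are opaque.
interface RosterEntry {
  name: string;
  [key: string]: unknown;
}

interface RosterDiff {
  add: RosterEntry[];    // declared in yaml, not loaded
  update: RosterEntry[]; // loaded, but the declared config differs
  remove: string[];      // loaded, no longer declared
}

function diffRoster(
  declared: RosterEntry[],
  live: Map<string, RosterEntry>,
): RosterDiff {
  const diff: RosterDiff = { add: [], update: [], remove: [] };
  const declaredNames = new Set<string>();

  for (const config of declared) {
    declaredNames.add(config.name);
    const current = live.get(config.name);
    if (current === undefined) {
      diff.add.push(config);
    } else if (JSON.stringify(current) !== JSON.stringify(config)) {
      diff.update.push(config);
    }
  }

  live.forEach((_config, name) => {
    if (!declaredNames.has(name)) diff.remove.push(name);
  });

  return diff;
}
```

How `remove` is handled depends on which reconciliation model is chosen: if yaml is the source of truth, every entry in `remove` would be torn down; if the runtime API layers on top of yaml, a reload would presumably need to skip agents that were created through the API.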

## Related

- #653 (Config parity across ADE, yaml, slash commands, API) — addresses runtime configurability of settings on existing agents but does not cover agent lifecycle.
- #107 (unify Letta Code and LettaBot agent management) — adjacent; a runtime agent API would be a natural surface for this unification.

## Reported by

Surfaced in Discord by gerwitz (self-hosted operator provisioning agents at runtime). Current workaround is "edit yaml, restart LettaBot" under a process manager with fast restart (systemd, pm2, Docker restart policy). Restart is non-destructive but disruptive for programmatic provisioning.