Problem
Multi-agent LettaBot deployments load the agents: array from config once at startup. There is no runtime mechanism to add, update, or remove an agent while the server is running — any change to the agent roster requires editing yaml and restarting the process.
For operators provisioning agents programmatically (per-tenant, per-user, dynamic workloads), this forces a full process restart per provisioning event. Restarts are not destructive (state persists via lettabot-agent.json, credentials, and the Letta server), but they interrupt in-flight work, break open WebSocket/Socket-Mode connections to channel providers, and make programmatic provisioning awkward.
Current behavior (source references)
- src/main.ts iterates agents: once and calls gateway.addAgent(name, session) per entry. After startup the gateway is effectively frozen.
- LettaGateway.addAgent() (src/core/gateway.ts) is only invoked during boot. There is no corresponding removeAgent() called from the runtime, no updateAgent(), and no public entry point that would rehydrate from config.
- src/api/server.ts exposes /api/v1/messages, /api/v1/chat, /api/v1/chat/async, /api/v1/conversation, /api/v1/conversations, /api/v1/pairing/:channel, /api/v1/status, /v1/models, and /v1/chat/completions. Nothing for agent lifecycle.
- No SIGHUP handler, no yaml file watcher for the agents array, no lettabot reload CLI command.
- The only reload-like mechanism in the codebase is per-adapter: reloadConfigAt in src/channels/bluesky/adapter.ts refreshes a Bluesky agent's feed config via a runtime-state JSON file. It is scoped to a specific adapter's feed settings, not agent lifecycle, and does not generalize to add/remove.
Proposed approach
Primary: a REST API surface for agent lifecycle, consistent with the existing /api/v1/* routes and protected by the same auth.
POST /api/v1/agents # add a new agent from inline config
GET /api/v1/agents # list currently loaded agents
GET /api/v1/agents/:name # describe one agent
PATCH /api/v1/agents/:name # update channels/features/conversations
DELETE /api/v1/agents/:name # deregister and tear down channel adapters
POST /api/v1/agents/reload # re-read config file and diff against live state
Request body shape should mirror the existing agents: array element so config can be shuttled between yaml and API without transformation.
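To make the reload semantics concrete, here is a minimal sketch of how POST /api/v1/agents/reload might diff the roster in the config file against the live roster. The AgentConfig fields, the diffAgents name, and the canonical helper are illustrative assumptions, not existing LettaBot code; the real agents: element carries more fields.

```typescript
// Hypothetical subset of an agents: array element (assumption, not the real schema).
interface AgentConfig {
  name: string;
  channels?: Record<string, unknown>;
  features?: Record<string, unknown>;
}

interface AgentDiff {
  added: AgentConfig[];   // in config, not live: start these
  removed: string[];      // live, not in config: tear these down
  changed: AgentConfig[]; // same name, different config: update these
}

// Serialize with sorted object keys so key order in yaml does not count as a change.
function canonical(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(canonical).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .filter((k) => obj[k] !== undefined)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonical(obj[k])}`)
      .join(",");
    return `{${body}}`;
  }
  return value === undefined ? "null" : JSON.stringify(value);
}

// Compare the desired roster (from config) with the live roster, keyed by agent name.
function diffAgents(desired: AgentConfig[], live: AgentConfig[]): AgentDiff {
  const liveByName = new Map<string, AgentConfig>();
  for (const agent of live) liveByName.set(agent.name, agent);
  const desiredNames = new Set<string>(desired.map((a) => a.name));

  const diff: AgentDiff = { added: [], removed: [], changed: [] };
  for (const agent of desired) {
    const current = liveByName.get(agent.name);
    if (!current) diff.added.push(agent);
    else if (canonical(current) !== canonical(agent)) diff.changed.push(agent);
  }
  for (const agent of live) {
    if (!desiredNames.has(agent.name)) diff.removed.push(agent.name);
  }
  return diff;
}
```

Keying the diff by name means a rename shows up as one removal plus one addition, which matches how the :name routes identify agents.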
Lifecycle semantics that need to be correct:
- Channel adapters (Telegram, Slack, Discord, Signal, WhatsApp, Bluesky) must be started/stopped cleanly, including graceful disconnect for stateful adapters.
- lettabot-agent.json must be updated in place so added agents persist across restarts.
- Pairing/credential files under ~/.lettabot/credentials/ for the new agent's channels must be created on first pairing as today.
- Cron jobs scoped per agent (CRON_STORE_PATH) must be reinitialized when adding, and drained/cancelled when removing.
- Heartbeat loops must start/stop with the agent.
- The LETTABOT_CONFIG_YAML path (Docker/cloud deploys) should be reconcilable with runtime state: either "yaml is the frozen boot-time declaration, runtime API layers on top" or "yaml is the source of truth and the reload endpoint diffs yaml against live state." Either is defensible; the behavior needs to be documented.
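One way to keep the start/stop requirements above consistent is to treat each agent as an ordered list of components (channel adapters, cron, heartbeat) that are started in order and stopped in reverse. The sketch below assumes a hypothetical Component interface and AgentRegistry class; neither exists in LettaBot today, and persistence to lettabot-agent.json is left out.

```typescript
// Hypothetical lifecycle contract for anything owned by an agent (assumption).
interface Component {
  label: string;
  start(): Promise<void>;
  stop(): Promise<void>;
}

class AgentRegistry {
  private readonly running = new Map<string, Component[]>();

  // Start components in order; if one fails, stop the ones already started.
  async add(name: string, components: Component[]): Promise<void> {
    if (this.running.has(name)) {
      throw new Error(`agent "${name}" is already loaded`);
    }
    const started: Component[] = [];
    try {
      for (const component of components) {
        await component.start();
        started.push(component);
      }
    } catch (err) {
      await this.stopAll(started);
      throw err;
    }
    this.running.set(name, started);
  }

  // Deregister first so no new work is routed to the agent, then tear down.
  async remove(name: string): Promise<boolean> {
    const components = this.running.get(name);
    if (!components) return false;
    this.running.delete(name);
    await this.stopAll(components);
    return true;
  }

  list(): string[] {
    return Array.from(this.running.keys());
  }

  // Stop in reverse start order; one failing stop() does not block the rest.
  private async stopAll(components: Component[]): Promise<void> {
    for (const component of components.slice().reverse()) {
      try {
        await component.stop();
      } catch {
        // A real implementation would log this; the sketch just continues.
      }
    }
  }
}
```

This sketch does not guard against two concurrent add() calls for the same name; a real implementation would likely need a per-name lock or a pending set.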
Secondary (lower priority, optional): a CLI equivalent (lettabot agents add|remove|list|reload) that wraps the API for ops workflows.
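As a rough illustration of how thin such a CLI wrapper could be, the sketch below maps the proposed subcommands onto the proposed routes. The toApiRequest function and ApiRequest type are hypothetical, and how add obtains its request body (for example from a file) is left open.

```typescript
// Hypothetical description of the HTTP call a subcommand would make (assumption).
interface ApiRequest {
  method: "GET" | "POST" | "DELETE";
  path: string;
}

const AGENTS_BASE = "/api/v1/agents";

// args is everything after "lettabot agents", e.g. ["remove", "support-bot"].
function toApiRequest(args: string[]): ApiRequest {
  const command = args[0];
  const name = args[1];
  switch (command) {
    case "list":
      return { method: "GET", path: AGENTS_BASE };
    case "add":
      // The agent config would travel in the request body, not the path.
      return { method: "POST", path: AGENTS_BASE };
    case "reload":
      return { method: "POST", path: `${AGENTS_BASE}/reload` };
    case "remove":
      if (!name) throw new Error("usage: lettabot agents remove <name>");
      return { method: "DELETE", path: `${AGENTS_BASE}/${encodeURIComponent(name)}` };
    default:
      throw new Error(`unknown subcommand: ${String(command)}`);
  }
}
```

Because /api/v1/agents/reload shares a path shape with /api/v1/agents/:name, an implementation would probably want to reserve "reload" as an agent name or match the literal route first.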
Use cases
- Programmatic provisioning of agents per tenant or per user from a control plane.
- Onboarding flows where a user request creates a new agent, attaches channels, and makes it live without operator intervention.
- Rolling out or rolling back a single agent without affecting the rest of a multi-agent fleet.
- CI/CD pipelines that declare the agent roster in yaml and can trigger a reload after deploy instead of a full process restart.
Related
Reported by
Surfaced in Discord by gerwitz (self-hosted operator provisioning agents at runtime). Current workaround is "edit yaml, restart LettaBot" under a process manager with fast restart (systemd, pm2, Docker restart policy). Restart is non-destructive but disruptive for programmatic provisioning.