-
Notifications
You must be signed in to change notification settings - Fork 5
Agent Farm Setup
Agent Farm mode turns enrolled slaves into distributed compute nodes. Master's team hires remote teammates; each teammate's turns execute on a different machine via signed agent.execute envelopes. Results stream back through the same mailbox as local teammates.
- You have multiple machines and want to pool compute
- You want to isolate risky agent work on a dedicated device (sandboxed with different filesystem access)
- You have ACP CLI subscriptions tied to specific devices
- You want to distribute cost across multiple provider accounts
- You're building a team with heterogeneous requirements (one teammate needs GPU, another doesn't)
- Master is set up (see Master Setup Guide)
- At least one slave is enrolled in farm role. The second titlebar icon on a slave lets you pick
WorkforceorFarm— pick Farm. - The slave has ACP runtimes detected (the runtimes you want to farm out — Claude Code, Gemini, etc.)
- Farm slave is reachable from master (normal fleet connectivity)
Regular team creation flow. The team lives on master; farm teammates will be added remotely.
- Team → + Hire Teammate
- Pick a template from the gallery (or Custom)
- In the Backend step:
-
Local(default) — runs on master's machine -
Farm— pick a farm-role device from the dropdown
-
- For Farm: pick the Runtime (slave's detected runtimes are listed; each shows "on device" if confirmed available)
- Set the Tool allowlist — master-enforced. The farm slave will refuse any tool call not in this list.
- Set the Timeout — default 120s per turn. Farm teammates are subject to network latency; plan for 2-3x local timing.
- Hire
The moment you click Hire, master fires team.farm_provision to the selected slave. Slave:
- Creates a mirror team with the same
teamId - Creates a local Lead ACP session (cached, idle timeout 30min)
- Creates a farm teammate slot bound to the remote slot ID
- Shows the team in the slave's Teams UI with a badge:
Mirror of master's farm slot · read-only
The slave operator sees the team appear in real-time. They can view the live messages + activity but can't interact (the farm team is master-orchestrated).
- Master's Lead decides to delegate to the farm teammate (via regular mailbox)
- Master builds an
agent.executeenvelope:{ "jobId": "<uuid>", "slotId": "<remote slot ID>", "messages": [...], "model": "claude-sonnet-3.5", "temperature": 0.7, "toolsAllowlist": ["mcp.team.send_message", "mcp.team.update_task"], "timeoutMs": 120000 } - Master signs with Ed25519, queues in
fleet_commandstable - Slave polls, picks up the command
- Farm executor verifies envelope signature + replay nonce
- Routes the turn through its cached Lead ACP session (or spawns if not cached)
- Streams the output back as a multipart response to master
- On turn completion, master's Lead sees the farm teammate's reply in the team mailbox — identical shape to a local teammate's reply
A team can mix local + farm teammates freely. From the Lead's perspective, it's one team with one mailbox. The farm distinction is invisible to the coordination logic.
When hiring a farm teammate, you set toolsAllowlist[]. The slave's farm executor enforces this list:
- Tools in the list: allowed
- Tools not in the list: blocked with
policy_deniedack reason
Recommended allowlists by role:
Backend engineer on farm:
[
"mcp.filesystem.read:/Users/*/projects/**",
"mcp.filesystem.write:/Users/*/projects/**",
"mcp.team.send_message",
"mcp.team.update_task",
"mcp.shell.exec"
]Research analyst on farm:
[
"mcp.web.fetch",
"mcp.team.send_message",
"mcp.team.update_task"
]Start narrow and widen as needed. Every denied call shows in the activity log on both master + slave sides.
Each farm slave has implicit capacity:
- Concurrent turns — bounded by slave's hardware + ACP runtime concurrency (typically 1-2 concurrent agent turns per device for smooth UX)
- Memory — each cached ACP session uses ~300-500 MB
- Cost — farm slave's ACP runtime auth determines who pays (the subscription is the slave's, though the master can mandate a specific account via the slave's managed config)
Master's farm picker shows each device's concurrent jobs in flight so you can balance.
If the slave goes offline mid-turn:
- Master's timeout fires (default 120s after dispatch)
- Ack comes back as
timeout(or no ack at all after grace) - Master's team sees a failure message in the farm teammate's mailbox
- Lead can re-delegate or assign to a different teammate
Operational guidance: set timeouts generously (2-3x the expected turn duration). Farm slaves under load + network latency add variance.
Master-side auto-resume works: on master restart, Leads wake + re-read the board. Tasks owned by farm teammates continue as normal.
Slave-side: if slave is offline when master's Lead wakes + tries to delegate, the command queues in master's DB. When slave comes back online, it polls + picks up the queued command.
Fleet → Dashboard → [device]: shows farm stats (jobs completed, failed, timed out, avg latency).
Slave's own Observability → Activity Log shows fleet.command.executed for each agent.execute it received.
Team → Settings → Farm stats per team shows per-teammate latency distribution. Outliers (p95/p99) surface network + runtime issues.
See Troubleshooting#farm-teammate-gives-no-response for the full list.
Frequent ones:
-
runtime_unavailable— slave doesn't have the picked runtime. Install it on slave + re-detect. -
policy_denied— tool allowlist blocks a call the agent wants to make. Expand the allowlist. -
runtime_timeout— turn exceededtimeoutMs. Increase on master's team settings. - Slave's mirror team shows "Conversation not found" — slave didn't materialize the mirror properly. Manually un-hire + re-hire from master.
- All envelopes are Ed25519-signed end-to-end; no unauthenticated turn can execute
- Replay nonces prevent duplicate execution
- Tool allowlists cap blast radius even if an agent goes off-script
- Admin re-auth is not required for
agent.execute(performance); it's required for destructive commands only
The threat model: master is trusted. Farm slaves are semi-trusted. If a slave is compromised, the attacker has access to the slave's own workspace + the JWT (which only grants farm-command participation; no access to master's config bundle push authority).
- Fleet Mode Overview — the big picture
- Hiring Agents from the Gallery — the hire flow with Farm option
- Command Center — non-farm command dispatch
- Fleet Command Types — envelope specs
-
/docs/feature/fleet/README.md— canonical protocol spec
TitanX · Enterprise AI Agent Orchestration · Apache-2.0
Docs: Wiki · Technical docs · Releases · Security
Last updated for v2.5.1 — report doc issue · contribute to the wiki
📖 Getting Started
🧩 Core Concepts
- Architecture Overview
- Agents and Teams
- Agent Gallery and Templates
- ACP Runtimes
- MCP Servers
- Workspaces
- Reasoning Bank
👤 End-User Guides
- Hiring Agents from the Gallery
- The Sprint Board
- Conversations and Chat UI
- Using Custom Assistants
- Skills Hub
- Cron and Scheduled Tasks
- Observability
- Caveman Mode
🌐 Fleet Mode
- Fleet Mode Overview
- Master Setup Guide
- Slave Enrollment
- Agent Farm Setup
- Publishing Agent Templates
- Command Center
- Device Forensics and Revocation
🌙 Dream Mode
- Dream Mode Overview
- Enabling Dream Mode
- Dream Pass Internals
- Consolidated Learnings Dashboard
- Privacy and Redaction
🔒 Security
- Security Model
- IAM Policies
- Audit Logging
- Device Identity and Signing
- Secrets Management
- Compliance and Data Residency
🛠 Developer
- Development Setup
- Project Structure
- Code Conventions
- Testing
- Adding an ACP Runtime
- Adding an MCP Server
- Pull Request Workflow
📘 Reference
- Configuration Keys
- Environment Variables
- IPC Channels
- Database Schema
- Fleet Command Types
- Telemetry Shape
- CLI and Keyboard Shortcuts
❓ Help
🔗 Outside the wiki
v2.5.1 · 50+ pages · Contribute