Reduce orchestrator latency by collapsing per-integration delegation tools #1335

@Al629176

Summary

Cut orchestrator request size and tool-selection overhead by collapsing the N synthesised per-integration delegation tools into a single delegate_to_integrations_agent tool, then dedupe the routing guidance that exists in both prompt.md and prompt.rs. Add a numeric tool budget only as a safety net if the structural change doesn't get the visible-tool count low enough on its own.

Problem / Context

The orchestrator is the front-line agent, so its tool schema and prompt size directly affect time to first token (TTFT). Today:

  • src/openhuman/agent/agents/orchestrator/agent.toml declares 22 named direct tools plus subagents = [...].
  • src/openhuman/tools/orchestrator_tools.rs::collect_orchestrator_tools expands SubagentEntry::Skills("*") into one synthesised delegate_<toolkit> tool per connected Composio integration (gmail, github, notion, …). Each carries its own description string.
  • src/openhuman/channels/runtime/dispatch.rs unions named direct tools with synthesised delegation tools and logs visible_tool_count.
  • Routing/delegation guidance is duplicated across prompt.md (168 lines) and prompt.rs, so prompt bytes drift up alongside the tool surface.

The structural problem is the fan-out: as a user connects more integrations, the orchestrator's visible schema grows linearly even though every one of those tools dispatches to the same integrations_agent with a different skill_filter. The orchestrator doesn't actually need to see per-integration tools — it just needs to know "route this to integrations, with toolkit X."

Why a hard "≤ 50 tools" cap is the wrong primary lever

  • It's arbitrary — the 51st integration gets cut off based on registration order, not value.
  • It doesn't address the schema-size problem at the source; it just truncates.
  • It risks regressing the direct-first behavior the original issue calls out (the orchestrator could lose access to a legitimate route).

A budget cap is fine as a safety net, but the structural fix should come first.
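
If the safety net is ever needed, trimming by an explicit priority avoids the registration-order problem described above. The following is a minimal sketch, not the project's implementation: `VisibleTool`, its `priority` field, and `apply_tool_budget` are hypothetical names introduced for illustration, and the real types in dispatch.rs are not shown in this issue.

```rust
/// Hypothetical view of a tool as seen by the dispatch layer.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct VisibleTool {
    pub name: String,
    /// Lower value means more important. Assumed to be assigned by the caller.
    pub priority: u8,
}

/// Keep at most `budget` tools. When trimming is needed, tools are ordered by
/// (priority, name), so the outcome does not depend on registration order.
/// Returns the kept tools and the names of the dropped ones.
pub fn apply_tool_budget(
    mut tools: Vec<VisibleTool>,
    budget: usize,
) -> (Vec<VisibleTool>, Vec<String>) {
    // Under budget: leave the list untouched, including its order.
    if tools.len() <= budget {
        return (tools, Vec::new());
    }
    tools.sort_by(|a, b| a.priority.cmp(&b.priority).then_with(|| a.name.cmp(&b.name)));
    let dropped: Vec<String> = tools
        .split_off(budget)
        .into_iter()
        .map(|t| t.name)
        .collect();
    // Log what was trimmed so the cut is visible rather than silent.
    eprintln!(
        "tool budget {} exceeded; trimmed {} tool(s): {:?}",
        budget,
        dropped.len(),
        dropped
    );
    (tools, dropped)
}
```

With this shape, two runs that register the same tools in a different order trim the same entries, which is the property a plain truncation lacks.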

Approach

  1. Collapse per-integration delegation into one tool (primary win).
    In orchestrator_tools.rs, replace the loop that emits one SkillDelegationTool per connected integration with a single delegation tool (delegate_to_integrations_agent, as named in the Summary) that takes the toolkit slug as an argument. The integrations_agent already owns the per-skill surface; the orchestrator only needs the routing handle. The tool description can enumerate the connected toolkits as a short list, so the orchestrator still knows which integrations are available without each one being a separate schema entry.

    Expected effect: visible tool count drops from 22 + N_integrations + named_archetypes to roughly 22 + named_archetypes + 1, independent of how many integrations the user connects.

  2. Dedupe routing guidance between prompt.md and prompt.rs. Pick one source of truth for direct-first / delegation rules. Trim duplicated phrasing. Preserve memory, current-time, scheduling, clarification, and core specialist delegation guidance.

  3. Measure before/after using openhuman agent dump-prompt --agent orchestrator --json (or scripts/debug-agent-prompts.sh) and report prompt bytes + visible tool count in the PR.

  4. Optional safety-net cap. If after (1) and (2) the visible tool count is still uncomfortably high (e.g. heavy named-archetype growth later), introduce a simple budget in dispatch.rs that logs and trims, with a deterministic priority order. Don't add this preemptively.
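
Step 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual code: `ToolSpec`, `build_delegation_tool`, and `resolve_skill_filter` are hypothetical names, the real tool type in orchestrator_tools.rs is not shown in this issue, and the skill filter is assumed to be the toolkit slug itself.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified tool schema entry.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToolSpec {
    pub name: String,
    pub description: String,
}

/// Tool name taken from the Summary of this issue.
pub const DELEGATE_TOOL_NAME: &str = "delegate_to_integrations_agent";

/// Build the single delegation tool. Connected toolkit slugs are listed in
/// the description instead of becoming separate schema entries.
pub fn build_delegation_tool(connected: &[&str]) -> ToolSpec {
    // BTreeSet dedupes and sorts, so the description is stable across runs.
    let slugs: BTreeSet<&str> = connected.iter().copied().collect();
    let list = if slugs.is_empty() {
        "none".to_string()
    } else {
        slugs.into_iter().collect::<Vec<_>>().join(", ")
    };
    ToolSpec {
        name: DELEGATE_TOOL_NAME.to_string(),
        description: format!(
            "Route external-service work to integrations_agent. \
             Pass the toolkit slug as `toolkit`. Connected toolkits: {}.",
            list
        ),
    }
}

/// Validate the `toolkit` argument and turn it into the skill filter that
/// integrations_agent would receive (assumed here to be the slug itself).
pub fn resolve_skill_filter(toolkit: &str, connected: &[&str]) -> Result<String, String> {
    let slug = toolkit.trim().to_ascii_lowercase();
    if connected.iter().any(|c| c.eq_ignore_ascii_case(&slug)) {
        Ok(slug)
    } else {
        Err(format!("toolkit '{}' is not connected", slug))
    }
}
```

However many toolkits are connected, the orchestrator sees one schema entry; only the description string grows, and it grows by a slug per integration rather than by a full tool definition.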

Acceptance criteria

  • Per-integration fan-out eliminated: collect_orchestrator_tools produces a constant number of delegation tools regardless of how many Composio integrations are connected. The integration list is conveyed via tool description / arguments, not via separate tool entries.
  • Integration routing preserved — external-service work still flows orchestrator → integrations_agent → toolkit; the orchestrator does not gain direct Composio action execution.
  • Essential behavior preserved — memory, current-time, scheduling, clarification, and core specialist delegation remain available.
  • Prompt deduped — routing/delegation guidance lives in a single source between prompt.md and prompt.rs; direct-first behavior is unchanged.
  • Measurement captured — PR description reports before/after orchestrator prompt bytes and visible tool count from the debug prompt dump path.
  • Regression safety — unit tests in orchestrator_tools.rs cover the collapsed delegation path (zero integrations, one, many) and confirm tool count is constant in the integration dimension. Existing direct-first delegation tests still pass.
  • Diff coverage ≥ 80% — meets the changed-lines coverage gate enforced by .github/workflows/coverage.yml.
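
The regression-safety criterion could be covered by tests shaped like the sketch below. `collect_delegation_tool_names` is a hypothetical stand-in that returns names only; the real signature of collect_orchestrator_tools is not shown in this issue, so the actual tests would call that function and filter its output to delegation tools.

```rust
/// Stand-in for the delegation part of collect_orchestrator_tools after the
/// change. Hypothetical: it exists only to give the tests something to call.
pub fn collect_delegation_tool_names(connected_toolkits: &[&str]) -> Vec<String> {
    // The toolkit list affects the description and arguments, never the
    // number of schema entries, so it is intentionally unused here.
    let _ = connected_toolkits;
    vec!["delegate_to_integrations_agent".to_string()]
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn delegation_tool_count_is_constant_in_integrations() {
        let zero: [&str; 0] = [];
        let one = ["gmail"];
        let many = ["gmail", "github", "notion"];

        let baseline = collect_delegation_tool_names(&zero);
        assert_eq!(baseline.len(), 1);
        assert_eq!(collect_delegation_tool_names(&one), baseline);
        assert_eq!(collect_delegation_tool_names(&many), baseline);
    }

    #[test]
    fn no_per_toolkit_tools_are_emitted() {
        let names = collect_delegation_tool_names(&["gmail", "github"]);
        assert!(!names
            .iter()
            .any(|n| n == "delegate_gmail" || n == "delegate_github"));
    }
}
```

The zero, one, and many cases mirror the ones listed in the criterion; the second test guards against the old delegate_<toolkit> entries reappearing.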

Non-goals

  • Hard-capping the orchestrator at 50 tools. A cap may be added as a safety net only if the structural change above leaves the count too high.
  • Restructuring the integrations_agent itself.
  • Changing CLI/RPC surfaces.

Related

Metadata

    Labels

    • agent: Built-in agents, prompts, orchestration, and agent runtime in src/openhuman/agent/.
    • performance: Performance improvements or regressions.
    • rust-core: Core Rust runtime in src/: CLI, core_server, shared infrastructure.
    • task: Work item that is not primarily a bug or a feature.

    Projects

    Status
    Done
