Skip to content

[FEATURE] Investigate per-round role-persistent subagents via wait/send_message #966

@ncrispino

Description

@ncrispino

Background

Currently all subagents in MassGen are task-completion style: launched to do one bounded job, they complete and terminate. The orchestrator rebuilds context from scratch for each new subagent invocation even when the same logical role recurs many times within a round.

This issue investigates a second pattern: role-persistent subagents, scoped per round.

The Two Patterns

Task-completion subagent

  • Launched once for a bounded job with a clear completion point
  • No benefit from knowing about prior invocations of the same type
  • Can be parallelized
  • Examples: validate this SVG, generate this image, count elements matching a criterion

Role-persistent subagent (per-round)

  • Launched once at the start of a round with a role/job spec
  • Sits idle using the wait tool between invocations
  • Woken by a targeted send_message from the main agent when there's work matching its role
  • Does its work, writes output to a shared location, calls wait again
  • Terminated at round end
  • The key benefit: accumulated context within the round — it was there for all prior calls and doesn't need to re-read summaries

What "Accumulated Context" Actually Buys You

Without this pattern, recurring work is either handled inline (consuming main agent's context window) or spawned as fresh task-completion agents each time (expensive setup, no cross-call awareness).

A role-persistent agent can reason like: "The last three validation passes all flagged the same axis label issue — this is a systematic problem in the generation loop, not a one-off." A fresh agent each time would just report the same failure three times without connecting them.

Candidate Roles for Within-Round Persistence

These are roles where the work type genuinely recurs multiple times within a single agent's round and benefits from accumulated context:

  • Incremental validator: During iterative artifact generation (e.g., SVG built section by section), woken after each section is written. Knows all prior sections and can catch cross-section inconsistencies that per-section task agents would miss.
  • Changedoc manager: Woken each time the main agent makes a design decision. Maintains the running decision log, can flag contradictions, knows the full history without re-reading.
  • Shell output interpreter: When a round involves many repeated shell/tool calls with similar output structure (element counts, diffs, render checks), a specialist woken per output can parse and summarize — keeping the main agent's context clean of raw tool noise.
  • Research/fact-lookup specialist: If the main agent needs to validate multiple claims or look up multiple references during a round, a specialist handles each query and maintains the lookup history to avoid re-querying the same thing.

Questions to Investigate

  1. Protocol design: What does the wake message contain? Just a path to new content, or a structured payload? How does the role agent signal completion back to main?
  2. Orchestrator/task-plan integration: How does the main agent decide to launch a role agent vs handle work inline? Explicit task plan annotation? Heuristic based on work type?
  3. Failure handling: What if the role agent crashes mid-round? Fall back to inline, or restart with replayed messages?
  4. Context accumulation ceiling: Role agents grow their context over the round. At what point does accumulated context become a liability? Is there a flush/summarize protocol?
  5. Decision rule for which pattern to use: Hypothesis — if the quality of invocation N depends on knowing results 1..N-1, it's role-persistent. If each invocation is independent, it's task-completion.
  6. Infrastructure gap: wait tool exists. Does targeted send_message (main → specific named agent, not broadcast to all peers) already work, or does it need to be added?

Suggested Starting Point

Pick a run where a single agent's round involves 5+ repeated validation/interpretation passes of the same type. Compare: (a) main agent handles inline, (b) fresh task-completion subagent per pass, (c) role-persistent subagent woken per pass. Measure context saved on main agent and consistency of outputs across the three approaches.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions