
[Feature]: Document Intelligence Agent Team (A2A) #5523

@Schlaflied

Description


Problem Statement

The Hive roadmap explicitly lists Queen Bee (Orchestrator) and Worker Bee (Executor) as unchecked. No existing template demonstrates this pattern in practice.

Every current template in examples/templates/ follows a linear pipeline: nodes execute sequentially, each passing output to the next. This is insufficient for tasks that require parallel specialist analysis followed by synthesis — a pattern fundamental to professional knowledge work (legal review, due diligence, compliance audits, policy analysis).

The reference implementation (AI Legal Agent Team) demonstrates the core pattern: multiple specialist agents each examine the same document independently, then a coordinator agent synthesizes their findings into a unified output. This is A2A coordination, not a pipeline — and it has no equivalent in the current Hive template library.


Proposed Solution

Add a Document Intelligence Agent Team template under examples/templates/document_intelligence_agent_team/:

Architecture: Queen Bee + Worker Bees (A2A)

```
Document Input
      ↓
 [Coordinator]  ←── Queen Bee: task distribution + synthesis
  ↙    ↓    ↘
[R]   [A]   [S]  ←── Worker Bees: parallel specialist analysis
  ↘    ↓    ↙
 [Coordinator]
      ↓
 Structured Report
```

Agent roles:

| Agent | Role | Responsibility |
| --- | --- | --- |
| Coordinator | Queen Bee | Distributes the document to all Worker Bees; collects and synthesizes findings; resolves contradictions across agents |
| Researcher | Worker Bee | Identifies key entities, dates, amounts, and external references; flags missing information |
| Analyst | Worker Bee | Examines internal consistency; cross-references facts within the document; detects contradictions |
| Strategist | Worker Bee | Assesses implications, risks, and recommended actions based on findings |

Node pipeline (within Coordinator agent):

Intake → Parallel Dispatch → Collect Findings → Cross-Reference → Consensus / Outlier Detection → Synthesis → Structured Report
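As a sketch, the Parallel Dispatch and Collect Findings steps could be built on `asyncio.gather`. The agent interface below (a `name` attribute and an async `run(document)` method) is an assumption for illustration, not the Hive API:

```python
import asyncio

async def dispatch_to_workers(document, workers):
    """Queen Bee dispatch step: send the same document to every Worker Bee
    in parallel and collect their findings.

    `workers` is assumed to be a list of agent objects exposing an async
    `run(document)` method -- an illustrative interface, not the Hive API.
    """
    findings = await asyncio.gather(*(w.run(document) for w in workers))
    # Key findings by agent name so the synthesis step can attribute each one.
    return dict(zip((w.name for w in workers), findings))
```

Because each Worker Bee receives the same document and runs concurrently, total latency is bounded by the slowest specialist rather than the sum of all three.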


Key Distinction from Existing Templates

| | Linear pipeline templates | Document Intelligence Agent Team |
| --- | --- | --- |
| Execution pattern | Sequential nodes | Parallel A2A dispatch + synthesis |
| Architecture | Single agent | Queen Bee + Worker Bees |
| Cross-referencing | None | Coordinator reconciles contradictions across agents |
| Roadmap coverage | Worker nodes only | First template to implement the Queen Bee / Worker Bee pattern |
| Input | Web search / structured prompts | Existing document(s); no external retrieval required |
| Domain | Domain-specific | General-purpose (legal, compliance, contracts, research) |

Alternatives Considered

  • Extending deep_research_agent: Designed for web retrieval, not document analysis. Linear architecture cannot represent parallel specialist roles.
  • Extending competitive_intel_agent: Domain-specific to competitive intelligence; A2A coordination pattern not present.
  • Single-agent document summary: Loses specialist depth and cross-referencing capability; does not demonstrate Queen Bee pattern.

Per-Agent Model Configuration

Each Worker Bee agent specifies its own "model" field in agent.json (consistent with the existing node schema, where "model": null inherits the framework default). This means different Worker Bees can run on different underlying models:

| Worker Bee | Example model assignment | Rationale |
| --- | --- | --- |
| Researcher | gpt-4o | Broad knowledge retrieval |
| Analyst | claude-sonnet-4-6 | Careful document reasoning |
| Strategist | gemini-2.0-flash | Structured output and risk framing |
| Coordinator | configurable | Synthesizes across all three |
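A minimal sketch of a Worker Bee's agent.json under the proposed schema. Only the "model" field is described in this proposal; the other fields are assumptions for illustration:

```json
{
  "name": "analyst",
  "role": "worker",
  "model": "claude-sonnet-4-6"
}
```

Setting `"model": null` here would inherit the framework default, consistent with the existing node schema.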

This is not just a multi-role pattern; it is a multi-model cross-reference. The Coordinator synthesizes outputs from agents with genuinely different model architectures, producing a consensus that is more robust than any single model's answer. config.py exposes worker_models as a top-level configuration key so users can swap models without touching agent logic.
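A minimal sketch of the proposed config.py key. The model strings come from the table above; the exact schema and the `coordinator_model` name are illustrative:

```python
# config.py -- sketch of the proposed top-level configuration key.
# Key names follow the issue's proposal; the exact schema is illustrative.

worker_models = {
    "researcher": "gpt-4o",            # broad knowledge retrieval
    "analyst": "claude-sonnet-4-6",    # careful document reasoning
    "strategist": "gemini-2.0-flash",  # structured output and risk framing
}

# None (JSON null) inherits the framework default, per the existing node schema.
coordinator_model = None
```

Swapping a Worker Bee onto a different model is then a one-line config change, with no edits to agent logic.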


Real-World Validation

This proposal is directly motivated by a documented manual workflow: a user preparing a legal evidence package for a tribunal proceeding used three separate LLMs (Gemini, GPT, Claude) as independent specialist analysts — each reviewing the same document set and providing analysis from a different angle. The user then manually cross-referenced the outputs, identifying contradictions (one model recommended an incorrect procedural step that two others flagged as invalid) before synthesizing a final strategy.

A concrete example: to determine defensible compensation amounts for a tribunal filing, the user submitted the same document set to all three models and asked each to independently estimate abatement percentages per incident. The models returned divergent ranges. The user identified the overlapping consensus range across all three outputs — and those figures became the final amounts filed with the tribunal.

This is not just contradiction detection. It is quantitative consensus synthesis: each Worker Bee produces an independent estimate; the Coordinator identifies the intersection and flags outliers. The output is a range with an explicit confidence basis — more defensible than any single model's answer.

| Manual step | Automated equivalent |
| --- | --- |
| User queries three LLMs separately | Coordinator dispatches to Worker Bees in parallel |
| User reads and compares divergent ranges | Coordinator runs the cross-reference node |
| User finds overlapping consensus manually | Coordinator computes the intersection and flags outliers |
| User incorporates the final range into the document | Coordinator outputs a structured report with confidence annotation |

The pattern is validated. The template makes it reproducible.
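A minimal sketch of the Consensus / Outlier Detection step described above. The function name and the `(low, high)` range representation are assumptions, not the final node API:

```python
from itertools import combinations

def consensus_range(estimates):
    """Quantitative consensus synthesis (illustrative sketch).

    `estimates` maps worker name -> (low, high) numeric range. Finds the
    largest subset of workers whose ranges share a nonempty intersection,
    returning that intersection plus the workers outside it (outliers).
    """
    names = list(estimates)
    # Prefer full consensus; fall back to progressively smaller majorities.
    for size in range(len(names), 1, -1):
        for subset in combinations(names, size):
            low = max(estimates[n][0] for n in subset)
            high = min(estimates[n][1] for n in subset)
            if low <= high:
                outliers = [n for n in names if n not in subset]
                return (low, high), outliers
    return None, names  # no two workers agree at all
```

For example, three workers returning (10, 20), (15, 25), and (18, 30) yield the consensus range (18, 20) with no outliers; if one worker instead returns a disjoint (30, 40), it is flagged as the outlier and the remaining two define the range.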


Additional Context

  • Directly implements the Queen Bee (Orchestrator) + Worker Bee (Executor) pattern from the Hive roadmap — the first template to do so.
  • No new tool dependencies: document ingestion (PDF/text) uses existing file tools; optional Tavily for external reference lookup.
  • Generalizes beyond legal: the same A2A pattern applies to compliance review, due diligence, policy analysis, and research synthesis.
  • Reference implementation (awesome-llm-apps) validates real-world demand for this pattern.
  • Happy to implement if assigned.

Implementation Ideas

```
examples/templates/document_intelligence_agent_team/
├── __init__.py
├── __main__.py
├── coordinator/
│   ├── agent.py          # Queen Bee: dispatch + synthesis logic
│   └── agent.json
├── workers/
│   ├── researcher.py     # Worker Bee: entity/fact extraction
│   ├── analyst.py        # Worker Bee: internal consistency check
│   └── strategist.py     # Worker Bee: risk/implication assessment
├── config.py             # analysis_type, output_format, domain_context
├── tools.py              # load_document, save_report
└── README.md
```
