16 changes: 15 additions & 1 deletion CLAUDE.md
@@ -36,6 +36,20 @@ After implementing any feature that involves passing parameters through multiple

MassGen is a multi-agent system that coordinates multiple AI agents to solve complex tasks through parallel processing, intelligence sharing, and consensus building. Agents work simultaneously, observe each other's progress, and vote to converge on the best solution.

### Design Principles: The Quality Matrix

MassGen's strength comes from two orthogonal dimensions working together:

| | **Parallel** (same task, N agents) | **Decomposition** (subtasks, owned) |
|------------------------|------------------------------------|-------------------------------------|
| **Enforcing Refinement** | Agents iterate until quality is genuinely achieved, not just adequate | Each subtask owner refines until their piece meets quality gates |
| **Ensuring Depth in Roles** | Strong personas give agents distinct creative visions, producing diverse high-quality attempts | Persona specialization ensures deep domain fit per subtask |

- **Enforcing refinement** (currently: checklist-gated voting, gap analysis, improvements echo) = controls *how much* agents iterate and ensures each iteration is *worth it*. Multiple agents are key here: each round's evaluator sees all agents' prior answers, can identify unique strengths across them, and synthesizes the best elements into the next attempt. Without refinement, agents settle for "good enough."
- **Ensuring depth in roles** = persona/role generation that gives agents strong opinionated visions. A bare user prompt is rarely enough for quality output — the persona fills in the creative direction. Consider using a preliminary MassGen call to generate rich personas/briefs before the main execution run.

Neither dimension alone is sufficient. Refinement without strong roles produces polished mediocrity. Strong roles without refinement produce ambitious first drafts that never mature.
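
As a toy sketch of the refinement dimension (`refine`, `evaluate`, and `synthesize` are illustrative stand-ins, not MassGen APIs; the real mechanism is checklist-gated voting with gap analysis):

```python
# Toy refinement loop: keep synthesizing from all agents' answers
# until a quality gate passes or the round budget runs out.
def refine(attempts, evaluate, synthesize, gate, max_rounds=3):
    best = max(attempts, key=evaluate)      # start from the strongest draft
    for _ in range(max_rounds):
        if evaluate(best) >= gate:          # quality genuinely achieved
            break
        best = synthesize(best, attempts)   # pull strengths from all answers
    return best
```

With toy scoring, `refine(["draft", "better draft"], len, lambda b, _: b + "!", 15)` keeps improving the strongest draft until the gate or the round cap is hit.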

## Essential Commands

All commands use `uv run` prefix:
@@ -85,7 +99,7 @@ cli.py → orchestrator.py → chat_agent.py → backend/*.py
4. Add capabilities to `backend/capabilities.py`
5. Update `config_validator.py`

**MCP Integration** (`mcp_tools/`): Model Context Protocol for external tools. `client.py` handles multi-server connections, `security.py` validates operations.
**MCP Integration** (`mcp_tools/`): Model Context Protocol for external tools. `client.py` handles multi-server connections, `security.py` validates operations. Some tools have dual paths: SDK (in-process, for ClaudeCode) and stdio (config.toml-based, for Codex). **Stdio MCP servers run inside Docker where `massgen` is NOT installed** — never import from `massgen` in stdio servers. Pre-compute any needed values in the orchestrator and pass via JSON specs files. Also note Codex sometimes sends tool args as JSON strings instead of dicts — always add a `json.loads()` fallback.
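
A minimal sketch of that string-args fallback (the helper name is hypothetical; only the `json.loads()` pattern is the point):

```python
import json

def normalize_tool_args(raw):
    """Accept tool args as either a dict or a JSON string.

    Codex sometimes sends tool arguments as JSON strings instead of
    dicts, so decode before use. Hypothetical helper, not a real
    MassGen function name.
    """
    if isinstance(raw, str):
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            return {}  # caller decides whether empty args should error
    return raw
```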

**Streaming Buffer** (`backend/_streaming_buffer_mixin.py`): Tracks partial responses during streaming for compression recovery.

3 changes: 2 additions & 1 deletion README.md
@@ -258,13 +258,14 @@ uv run massgen --quickstart
The `--setup` command will:
- Configure your API keys (OpenAI, Anthropic, Google, xAI)
- Offer to set up Docker images for code execution
- Offer to install skills (openskills, Anthropic collection)
- Offer to install skills (openskills, Anthropic/OpenAI/Vercel collections, Agent Browser skill, Crawl4AI)

The `--quickstart` command will:
- Ask how many agents you want (1-5, default 3)
- Ask which backend/model for each agent
- For GPT-5x models, ask for `reasoning.effort` (`low|medium|high`; Codex GPT-5 models also include `xhigh`)
- Auto-detect Docker availability and configure execution mode
- If Docker mode is selected, show a Skills step where you can choose package(s) (`openskills`-based Anthropic/OpenAI/Vercel/Agent Browser plus Crawl4AI) and install them in-place with live status
- Create a ready-to-use config and launch into interactive TUI mode

**🖥️ Textual TUI (Default Display Mode):**
3 changes: 2 additions & 1 deletion README_PYPI.md
@@ -257,13 +257,14 @@ uv run massgen --quickstart
The `--setup` command will:
- Configure your API keys (OpenAI, Anthropic, Google, xAI)
- Offer to set up Docker images for code execution
- Offer to install skills (openskills, Anthropic collection)
- Offer to install skills (openskills, Anthropic/OpenAI/Vercel collections, Agent Browser skill, Crawl4AI)

The `--quickstart` command will:
- Ask how many agents you want (1-5, default 3)
- Ask which backend/model for each agent
- For GPT-5x models, ask for `reasoning.effort` (`low|medium|high`; Codex GPT-5 models also include `xhigh`)
- Auto-detect Docker availability and configure execution mode
- If Docker mode is selected, show a Skills step where you can choose package(s) (`openskills`-based Anthropic/OpenAI/Vercel/Agent Browser plus Crawl4AI) and install them in-place with live status
- Create a ready-to-use config and launch into interactive TUI mode

**🖥️ Textual TUI (Default Display Mode):**
1 change: 1 addition & 0 deletions docs/source/quickstart/configuration.rst
@@ -512,6 +512,7 @@ Sensible defaults guidance:
* Usually this is lower than fully parallel voting mode settings.
* Add ``max_new_answers_global`` for deterministic total coordination budget.
* Keep other answer-control parameters at defaults unless you need stricter behavior.
* Keep fairness defaults enabled (``fairness_enabled: true``, ``fairness_lead_cap_answers: 2``, ``max_midstream_injections_per_round: 2``) to prevent fast agents from repeatedly lapping slower peers and causing restart churn.
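
The fairness defaults spelled out as a config fragment (shown flat here; in a real config they sit alongside your other answer-control parameters, and all values below are the defaults):

```yaml
fairness_enabled: true                 # pacing controls on by default
fairness_lead_cap_answers: 2           # max revision lead over slowest active peer
max_midstream_injections_per_round: 2  # cap injected unseen updates per round
```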

Quickstart note:

7 changes: 7 additions & 0 deletions docs/source/quickstart/installation.rst
@@ -184,6 +184,13 @@ For advanced features like isolated code execution:

These are optional - basic MassGen works without them.

.. note::

In ``uv run massgen --quickstart``, when Docker mode is selected the wizard includes
a Skills step where you can select package(s) and install them immediately with
on-page status updates (Anthropic/OpenAI/Vercel collections, Agent Browser skill,
and Crawl4AI). Use ``--setup-skills`` to retry or pre-install manually.
Comment on lines +187 to +192

⚠️ Potential issue | 🟡 Minor

Add a runnable example + expected output for the new Skills step and surface it in Recent Releases.

This note introduces new quickstart behavior; please add a minimal command + sample output snippet for the Skills step, and update docs/source/index.rst “Recent Releases” if this is a new feature.

✏️ Suggested RST snippet
 .. note::

    In ``uv run massgen --quickstart``, when Docker mode is selected the wizard includes
    a Skills step where you can select package(s) and install them immediately with
    on-page status updates (Anthropic/OpenAI/Vercel collections, Agent Browser skill,
    and Crawl4AI). Use ``--setup-skills`` to retry or pre-install manually.
+
+   Example:
+
+   .. code-block:: bash
+
+      uv run massgen --quickstart
+
+   .. code-block:: text
+
+      [Skills] Installing vercel-labs/agent-browser ... done

Based on learnings and coding guidelines: Documentation for new features must include runnable commands with expected output, and docs/source/index.rst "Recent Releases" should be updated.



Development Installation
========================

23 changes: 23 additions & 0 deletions docs/source/reference/yaml_schema.rst
@@ -575,6 +575,9 @@ Full multi-agent configuration demonstrating all 6 configuration levels:
max_new_answers_per_agent: 2 # Cap new answers per agent (null=unlimited)
max_new_answers_global: 8 # Cap total new answers across all agents (null=unlimited)
answer_novelty_requirement: "balanced" # How different new answers must be (lenient/balanced/strict)
fairness_enabled: true # Keep coordination pacing balanced (default: true)
fairness_lead_cap_answers: 2 # Max lead in answer revisions vs slowest active peer
max_midstream_injections_per_round: 2 # Cap injected unseen source updates per round

# Advanced settings
skip_coordination_rounds: false # Normal coordination
@@ -1038,6 +1041,8 @@ Voting and Answer Control

These parameters control coordination behavior to balance quality and duration.

Fairness controls are designed to solve a common multi-agent failure mode: fast agents can repeatedly submit revisions while slower peers are still working, which creates uneven effort, restart churn, and noisy coordination loops. With fairness enabled (default), agents stay within a bounded revision lead and wait for peer updates before terminal decisions.
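
A toy illustration of the lead cap (the name and signature are invented for this sketch; the orchestrator's internals differ):

```python
def can_submit_new_answer(revision_counts, agent_id, lead_cap=2):
    """Block new_answer once this agent's revision lead over the
    slowest active peer exceeds fairness_lead_cap_answers."""
    lead = revision_counts[agent_id] - min(revision_counts.values())
    return lead <= lead_cap  # blocked only when the cap is exceeded
```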

.. list-table::
:header-rows: 1

@@ -1061,6 +1066,18 @@ These parameters control coordination behavior to balance quality and duration.
- string
- No
- Controls how different new answers must be from existing ones to prevent rephrasing. **Options:** ``"lenient"`` (default) - no similarity checks (fastest); ``"balanced"`` - reject if >70% token overlap, requires meaningful differences; ``"strict"`` - reject if >50% token overlap, requires substantially different solutions.
* - ``fairness_enabled``
- boolean
- No
- Enable fairness pacing controls across both ``coordination_mode: voting`` and ``coordination_mode: decomposition``. **Default:** ``true``.
* - ``fairness_lead_cap_answers``
- integer
- No
- Maximum allowed lead in answer revisions over the slowest active peer. When exceeded, ``new_answer`` is blocked until peers catch up. **Default:** ``2`` (set ``0`` for strict lockstep).
* - ``max_midstream_injections_per_round``
- integer
- No
- Maximum unseen source-agent updates injected mid-stream into a single agent during one round. Helps prevent fast models from receiving runaway update fanout. **Default:** ``2``.
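
For intuition, the novelty thresholds could be approximated like this (Jaccard overlap on whitespace tokens is an assumption; the real check may measure overlap differently):

```python
def passes_novelty(new_answer, existing_answers, requirement="lenient"):
    """Toy token-overlap check mirroring the documented thresholds:
    lenient = no check, balanced = reject >70% overlap,
    strict = reject >50% overlap."""
    limit = {"lenient": None, "balanced": 0.70, "strict": 0.50}[requirement]
    if limit is None:
        return True
    new_tokens = set(new_answer.lower().split())
    for prior in existing_answers:
        prior_tokens = set(prior.lower().split())
        union = new_tokens | prior_tokens
        overlap = len(new_tokens & prior_tokens) / len(union) if union else 1.0
        if overlap > limit:
            return False
    return True
```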

**Example Configurations:**

@@ -1073,6 +1090,9 @@ Fast but thorough (recommended for balanced evaluation):
max_new_answers_per_agent: 2 # But cap at 2 tries
max_new_answers_global: 8 # Stop global churn in long runs
answer_novelty_requirement: "balanced" # Must actually improve
fairness_enabled: true
fairness_lead_cap_answers: 2
max_midstream_injections_per_round: 2

Maximum quality with bounded time:

@@ -1107,6 +1127,9 @@ Decomposition mode (recommended defaults):
# Add a global cap for deterministic total coordination budget.
max_new_answers_global: 9
answer_novelty_requirement: "balanced"
fairness_enabled: true
fairness_lead_cap_answers: 2
max_midstream_injections_per_round: 2

Timeout Configuration
~~~~~~~~~~~~~~~~~~~~~
19 changes: 19 additions & 0 deletions massgen/agent_config.py
@@ -235,6 +235,9 @@ class AgentConfig:
max_new_answers_per_agent: Maximum number of new answers each agent can provide (None = unlimited)
max_new_answers_global: Maximum number of new answers across all agents (None = unlimited)
answer_novelty_requirement: How different new answers must be from existing ones ("lenient", "balanced", "strict")
fairness_enabled: Enable fairness controls across all coordination modes (default: True)
fairness_lead_cap_answers: Maximum allowed lead in answer revisions over slowest active peer
max_midstream_injections_per_round: Maximum unseen source updates injected per agent per round
"""

# Core backend configuration (includes tool enablement)
@@ -245,9 +248,13 @@

# Voting behavior configuration
voting_sensitivity: str = "lenient"
voting_threshold: Optional[int] = None # Numeric threshold for ROI-style voting (e.g., 15 = 15% improvement required)
max_new_answers_per_agent: Optional[int] = None
max_new_answers_global: Optional[int] = None
answer_novelty_requirement: str = "lenient"
fairness_enabled: bool = True
fairness_lead_cap_answers: int = 2
max_midstream_injections_per_round: int = 2

# Agent customization
agent_id: Optional[str] = None
@@ -943,9 +950,13 @@ def to_dict(self) -> Dict[str, Any]:
# Access private attribute to avoid deprecation warning
"custom_system_instruction": self._custom_system_instruction,
"voting_sensitivity": self.voting_sensitivity,
"voting_threshold": self.voting_threshold,
"max_new_answers_per_agent": self.max_new_answers_per_agent,
"max_new_answers_global": self.max_new_answers_global,
"answer_novelty_requirement": self.answer_novelty_requirement,
"fairness_enabled": self.fairness_enabled,
"fairness_lead_cap_answers": self.fairness_lead_cap_answers,
"max_midstream_injections_per_round": self.max_midstream_injections_per_round,
"timeout_config": {
"orchestrator_timeout_seconds": self.timeout_config.orchestrator_timeout_seconds,
"initial_round_timeout_seconds": self.timeout_config.initial_round_timeout_seconds,
@@ -988,9 +999,13 @@ def from_dict(cls, data: Dict[str, Any]) -> "AgentConfig":
agent_id = data.get("agent_id")
custom_system_instruction = data.get("custom_system_instruction")
voting_sensitivity = data.get("voting_sensitivity", "lenient")
voting_threshold = data.get("voting_threshold")
max_new_answers_per_agent = data.get("max_new_answers_per_agent")
max_new_answers_global = data.get("max_new_answers_global")
answer_novelty_requirement = data.get("answer_novelty_requirement", "lenient")
fairness_enabled = data.get("fairness_enabled", True)
fairness_lead_cap_answers = data.get("fairness_lead_cap_answers", 2)
max_midstream_injections_per_round = data.get("max_midstream_injections_per_round", 2)

# Handle timeout_config
timeout_config = TimeoutConfig()
@@ -1020,9 +1035,13 @@ def from_dict(cls, data: Dict[str, Any]) -> "AgentConfig":
message_templates=message_templates,
agent_id=agent_id,
voting_sensitivity=voting_sensitivity,
voting_threshold=voting_threshold,
max_new_answers_per_agent=max_new_answers_per_agent,
max_new_answers_global=max_new_answers_global,
answer_novelty_requirement=answer_novelty_requirement,
fairness_enabled=fairness_enabled,
fairness_lead_cap_answers=fairness_lead_cap_answers,
max_midstream_injections_per_round=max_midstream_injections_per_round,
timeout_config=timeout_config,
coordination_config=coordination_config,
)
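
The defaulting pattern in `from_dict` can be exercised standalone with a small stand-in (`FairnessSettings` is hypothetical; the real fields live on `AgentConfig`):

```python
from dataclasses import dataclass
from typing import Any, Dict

@dataclass
class FairnessSettings:
    # Mirrors the fairness defaults added to AgentConfig in this PR.
    fairness_enabled: bool = True
    fairness_lead_cap_answers: int = 2
    max_midstream_injections_per_round: int = 2

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "FairnessSettings":
        # Missing keys fall back to the documented defaults.
        return cls(
            fairness_enabled=data.get("fairness_enabled", True),
            fairness_lead_cap_answers=data.get("fairness_lead_cap_answers", 2),
            max_midstream_injections_per_round=data.get("max_midstream_injections_per_round", 2),
        )
```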
5 changes: 5 additions & 0 deletions massgen/api_params_handler/_api_params_handler_base.py
@@ -124,10 +124,15 @@ def get_base_excluded_params(self) -> Set[str]:
"debug_delay_after_n_tools",
# Per-agent voting sensitivity (coordination config, not API param)
"voting_sensitivity",
"voting_threshold",
# Decomposition mode parameters (handled by orchestrator, not passed to API)
"coordination_mode",
"presenter_agent",
"subtask",
# Fairness controls (handled by orchestrator, not passed to API)
"fairness_enabled",
"fairness_lead_cap_answers",
"max_midstream_injections_per_round",
}

def build_base_api_params(
5 changes: 5 additions & 0 deletions massgen/backend/base.py
@@ -337,10 +337,15 @@ def get_base_excluded_config_params(cls) -> set:
"debug_delay_after_n_tools",
# Per-agent voting sensitivity (coordination config, not API param)
"voting_sensitivity",
"voting_threshold",
# Decomposition mode parameters (handled by orchestrator, not passed to API)
"coordination_mode",
"presenter_agent",
"subtask",
# Fairness controls (handled by orchestrator, not passed to API)
"fairness_enabled",
"fairness_lead_cap_answers",
"max_midstream_injections_per_round",
}

@abstractmethod