16 changes: 15 additions & 1 deletion CLAUDE.md
@@ -36,6 +36,20 @@ After implementing any feature that involves passing parameters through multiple

MassGen is a multi-agent system that coordinates multiple AI agents to solve complex tasks through parallel processing, intelligence sharing, and consensus building. Agents work simultaneously, observe each other's progress, and vote to converge on the best solution.

### Design Principles: The Quality Matrix

MassGen's strength comes from two orthogonal dimensions working together:

| | **Parallel** (same task, N agents) | **Decomposition** (subtasks, owned) |
|------------------------|------------------------------------|-------------------------------------|
| **Enforcing Refinement** | Agents iterate until quality is genuinely achieved, not just adequate | Each subtask owner refines until their piece meets quality gates |
| **Ensuring Depth in Roles** | Strong personas give agents distinct creative visions, producing diverse high-quality attempts | Persona specialization ensures deep domain fit per subtask |

- **Enforcing refinement** (currently: checklist-gated voting, gap analysis, improvements echo) = controls *how much* agents iterate and ensures each iteration is *worth it*. Multiple agents are key here: each round's evaluator sees all agents' prior answers, can identify unique strengths across them, and synthesizes the best elements into the next attempt. Without refinement, agents settle for "good enough."
- **Ensuring depth in roles** = persona/role generation that gives agents strong opinionated visions. A bare user prompt is rarely enough for quality output — the persona fills in the creative direction. Consider using a preliminary MassGen call to generate rich personas/briefs before the main execution run.

Neither dimension alone is sufficient. Refinement without strong roles produces polished mediocrity. Strong roles without refinement produce ambitious first drafts that never mature.
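
As a toy sketch of the refinement dimension (`refine`, `evaluate`, and `synthesize` are illustrative stand-ins, not MassGen APIs; the real mechanism is checklist-gated voting with gap analysis):

```python
# Toy refinement loop: keep synthesizing from all agents' answers
# until a quality gate passes or the round budget runs out.
def refine(attempts, evaluate, synthesize, gate, max_rounds=3):
    best = max(attempts, key=evaluate)      # start from the strongest draft
    for _ in range(max_rounds):
        if evaluate(best) >= gate:          # quality genuinely achieved
            break
        best = synthesize(best, attempts)   # pull strengths from all answers
    return best
```

With toy scoring, `refine(["draft", "better draft"], len, lambda b, _: b + "!", 15)` keeps improving the strongest draft until the gate or the round cap is hit.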

## Essential Commands

All commands use `uv run` prefix:
@@ -85,7 +99,7 @@ cli.py → orchestrator.py → chat_agent.py → backend/*.py
4. Add capabilities to `backend/capabilities.py`
5. Update `config_validator.py`

**MCP Integration** (`mcp_tools/`): Model Context Protocol for external tools. `client.py` handles multi-server connections, `security.py` validates operations.
**MCP Integration** (`mcp_tools/`): Model Context Protocol for external tools. `client.py` handles multi-server connections, `security.py` validates operations. Some tools have dual paths: SDK (in-process, for ClaudeCode) and stdio (config.toml-based, for Codex). **Stdio MCP servers run inside Docker where `massgen` is NOT installed** — never import from `massgen` in stdio servers. Pre-compute any needed values in the orchestrator and pass via JSON specs files. Also note Codex sometimes sends tool args as JSON strings instead of dicts — always add a `json.loads()` fallback.
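
A minimal sketch of that string-args fallback (the helper name is hypothetical; only the `json.loads()` pattern is the point):

```python
import json

def normalize_tool_args(raw):
    """Accept tool args as either a dict or a JSON string.

    Codex sometimes sends tool arguments as JSON strings instead of
    dicts, so decode before use. Hypothetical helper, not a real
    MassGen function name.
    """
    if isinstance(raw, str):
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            return {}  # caller decides whether empty args should error
    return raw
```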

**Streaming Buffer** (`backend/_streaming_buffer_mixin.py`): Tracks partial responses during streaming for compression recovery.

3 changes: 2 additions & 1 deletion README.md
@@ -258,13 +258,14 @@ uv run massgen --quickstart
The `--setup` command will:
- Configure your API keys (OpenAI, Anthropic, Google, xAI)
- Offer to set up Docker images for code execution
- Offer to install skills (openskills, Anthropic collection)
- Offer to install skills (openskills, Anthropic/OpenAI/Vercel collections, Agent Browser skill, Crawl4AI)

The `--quickstart` command will:
- Ask how many agents you want (1-5, default 3)
- Ask which backend/model for each agent
- For GPT-5x models, ask for `reasoning.effort` (`low|medium|high`; Codex GPT-5 models also include `xhigh`)
- Auto-detect Docker availability and configure execution mode
- If Docker mode is selected, show a Skills step where you can choose package(s) (`openskills`-based Anthropic/OpenAI/Vercel/Agent Browser plus Crawl4AI) and install them in-place with live status
- Create a ready-to-use config and launch into interactive TUI mode

**🖥️ Textual TUI (Default Display Mode):**
3 changes: 2 additions & 1 deletion README_PYPI.md
@@ -257,13 +257,14 @@ uv run massgen --quickstart
The `--setup` command will:
- Configure your API keys (OpenAI, Anthropic, Google, xAI)
- Offer to set up Docker images for code execution
- Offer to install skills (openskills, Anthropic collection)
- Offer to install skills (openskills, Anthropic/OpenAI/Vercel collections, Agent Browser skill, Crawl4AI)

The `--quickstart` command will:
- Ask how many agents you want (1-5, default 3)
- Ask which backend/model for each agent
- For GPT-5x models, ask for `reasoning.effort` (`low|medium|high`; Codex GPT-5 models also include `xhigh`)
- Auto-detect Docker availability and configure execution mode
- If Docker mode is selected, show a Skills step where you can choose package(s) (`openskills`-based Anthropic/OpenAI/Vercel/Agent Browser plus Crawl4AI) and install them in-place with live status
- Create a ready-to-use config and launch into interactive TUI mode

**🖥️ Textual TUI (Default Display Mode):**
1 change: 1 addition & 0 deletions docs/source/quickstart/configuration.rst
@@ -512,6 +512,7 @@ Sensible defaults guidance:
* Usually this is lower than fully parallel voting mode settings.
* Add ``max_new_answers_global`` for deterministic total coordination budget.
* Keep other answer-control parameters at defaults unless you need stricter behavior.
* Keep fairness defaults enabled (``fairness_enabled: true``, ``fairness_lead_cap_answers: 2``, ``max_midstream_injections_per_round: 2``) to prevent fast agents from repeatedly lapping slower peers and causing restart churn.
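
The fairness defaults spelled out as a config fragment (shown flat here; in a real config they sit alongside your other answer-control parameters, and all values below are the defaults):

```yaml
fairness_enabled: true                 # pacing controls on by default
fairness_lead_cap_answers: 2           # max revision lead over slowest active peer
max_midstream_injections_per_round: 2  # cap injected unseen updates per round
```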

Quickstart note:

7 changes: 7 additions & 0 deletions docs/source/quickstart/installation.rst
@@ -184,6 +184,13 @@ For advanced features like isolated code execution:

These are optional - basic MassGen works without them.

.. note::

In ``uv run massgen --quickstart``, when Docker mode is selected the wizard includes
a Skills step where you can select package(s) and install them immediately with
on-page status updates (Anthropic/OpenAI/Vercel collections, Agent Browser skill,
and Crawl4AI). Use ``--setup-skills`` to retry or pre-install manually.
Comment on lines +187 to +192

⚠️ Potential issue | 🟡 Minor

Add a runnable example + expected output for the new Skills step and surface it in Recent Releases.

This note introduces new quickstart behavior; please add a minimal command + sample output snippet for the Skills step, and update docs/source/index.rst “Recent Releases” if this is a new feature.

✏️ Suggested RST snippet
 .. note::

    In ``uv run massgen --quickstart``, when Docker mode is selected the wizard includes
    a Skills step where you can select package(s) and install them immediately with
    on-page status updates (Anthropic/OpenAI/Vercel collections, Agent Browser skill,
    and Crawl4AI). Use ``--setup-skills`` to retry or pre-install manually.
+
+   Example:
+
+   .. code-block:: bash
+
+      uv run massgen --quickstart
+
+   .. code-block:: text
+
+      [Skills] Installing vercel-labs/agent-browser ... done

Based on learnings and coding guidelines: Documentation for new features must include runnable commands with expected output, and docs/source/index.rst "Recent Releases" should be updated.



Development Installation
========================

23 changes: 23 additions & 0 deletions docs/source/reference/yaml_schema.rst
@@ -575,6 +575,9 @@ Full multi-agent configuration demonstrating all 6 configuration levels:
max_new_answers_per_agent: 2 # Cap new answers per agent (null=unlimited)
max_new_answers_global: 8 # Cap total new answers across all agents (null=unlimited)
answer_novelty_requirement: "balanced" # How different new answers must be (lenient/balanced/strict)
fairness_enabled: true # Keep coordination pacing balanced (default: true)
fairness_lead_cap_answers: 2 # Max lead in answer revisions vs slowest active peer
max_midstream_injections_per_round: 2 # Cap injected unseen source updates per round

# Advanced settings
skip_coordination_rounds: false # Normal coordination
@@ -1038,6 +1041,8 @@ Voting and Answer Control

These parameters control coordination behavior to balance quality and duration.

Fairness controls are designed to solve a common multi-agent failure mode: fast agents can repeatedly submit revisions while slower peers are still working, which creates uneven effort, restart churn, and noisy coordination loops. With fairness enabled (default), agents stay within a bounded revision lead and wait for peer updates before terminal decisions.
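
A toy illustration of the lead cap (the name and signature are invented for this sketch; the orchestrator's internals differ):

```python
def can_submit_new_answer(revision_counts, agent_id, lead_cap=2):
    """Block new_answer once this agent's revision lead over the
    slowest active peer exceeds fairness_lead_cap_answers."""
    lead = revision_counts[agent_id] - min(revision_counts.values())
    return lead <= lead_cap  # blocked only when the cap is exceeded
```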

.. list-table::
:header-rows: 1

@@ -1061,6 +1066,18 @@ These parameters control coordination behavior to balance quality and duration.
- string
- No
- Controls how different new answers must be from existing ones to prevent rephrasing. **Options:** ``"lenient"`` (default) - no similarity checks (fastest); ``"balanced"`` - reject if >70% token overlap, requires meaningful differences; ``"strict"`` - reject if >50% token overlap, requires substantially different solutions.
* - ``fairness_enabled``
- boolean
- No
- Enable fairness pacing controls across both ``coordination_mode: voting`` and ``coordination_mode: decomposition``. **Default:** ``true``.
* - ``fairness_lead_cap_answers``
- integer
- No
- Maximum allowed lead in answer revisions over the slowest active peer. When exceeded, ``new_answer`` is blocked until peers catch up. **Default:** ``2`` (set ``0`` for strict lockstep).
* - ``max_midstream_injections_per_round``
- integer
- No
- Maximum unseen source-agent updates injected mid-stream into a single agent during one round. Helps prevent fast models from receiving runaway update fanout. **Default:** ``2``.
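
For intuition, the novelty thresholds could be approximated like this (Jaccard overlap on whitespace tokens is an assumption; the real check may measure overlap differently):

```python
def passes_novelty(new_answer, existing_answers, requirement="lenient"):
    """Toy token-overlap check mirroring the documented thresholds:
    lenient = no check, balanced = reject >70% overlap,
    strict = reject >50% overlap."""
    limit = {"lenient": None, "balanced": 0.70, "strict": 0.50}[requirement]
    if limit is None:
        return True
    new_tokens = set(new_answer.lower().split())
    for prior in existing_answers:
        prior_tokens = set(prior.lower().split())
        union = new_tokens | prior_tokens
        overlap = len(new_tokens & prior_tokens) / len(union) if union else 1.0
        if overlap > limit:
            return False
    return True
```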

**Example Configurations:**

@@ -1073,6 +1090,9 @@ Fast but thorough (recommended for balanced evaluation):
max_new_answers_per_agent: 2 # But cap at 2 tries
max_new_answers_global: 8 # Stop global churn in long runs
answer_novelty_requirement: "balanced" # Must actually improve
fairness_enabled: true
fairness_lead_cap_answers: 2
max_midstream_injections_per_round: 2

Maximum quality with bounded time:

@@ -1107,6 +1127,9 @@ Decomposition mode (recommended defaults):
# Add a global cap for deterministic total coordination budget.
max_new_answers_global: 9
answer_novelty_requirement: "balanced"
fairness_enabled: true
fairness_lead_cap_answers: 2
max_midstream_injections_per_round: 2

Timeout Configuration
~~~~~~~~~~~~~~~~~~~~~
19 changes: 19 additions & 0 deletions massgen/agent_config.py
@@ -235,6 +235,9 @@ class AgentConfig:
max_new_answers_per_agent: Maximum number of new answers each agent can provide (None = unlimited)
max_new_answers_global: Maximum number of new answers across all agents (None = unlimited)
answer_novelty_requirement: How different new answers must be from existing ones ("lenient", "balanced", "strict")
fairness_enabled: Enable fairness controls across all coordination modes (default: True)
fairness_lead_cap_answers: Maximum allowed lead in answer revisions over slowest active peer
max_midstream_injections_per_round: Maximum unseen source updates injected per agent per round
"""

# Core backend configuration (includes tool enablement)
@@ -245,9 +248,13 @@

# Voting behavior configuration
voting_sensitivity: str = "lenient"
voting_threshold: Optional[int] = None # Numeric threshold for ROI-style voting (e.g., 15 = 15% improvement required)
max_new_answers_per_agent: Optional[int] = None
max_new_answers_global: Optional[int] = None
answer_novelty_requirement: str = "lenient"
fairness_enabled: bool = True
fairness_lead_cap_answers: int = 2
max_midstream_injections_per_round: int = 2

# Agent customization
agent_id: Optional[str] = None
@@ -943,9 +950,13 @@ def to_dict(self) -> Dict[str, Any]:
# Access private attribute to avoid deprecation warning
"custom_system_instruction": self._custom_system_instruction,
"voting_sensitivity": self.voting_sensitivity,
"voting_threshold": self.voting_threshold,
"max_new_answers_per_agent": self.max_new_answers_per_agent,
"max_new_answers_global": self.max_new_answers_global,
"answer_novelty_requirement": self.answer_novelty_requirement,
"fairness_enabled": self.fairness_enabled,
"fairness_lead_cap_answers": self.fairness_lead_cap_answers,
"max_midstream_injections_per_round": self.max_midstream_injections_per_round,
"timeout_config": {
"orchestrator_timeout_seconds": self.timeout_config.orchestrator_timeout_seconds,
"initial_round_timeout_seconds": self.timeout_config.initial_round_timeout_seconds,
@@ -988,9 +999,13 @@ def from_dict(cls, data: Dict[str, Any]) -> "AgentConfig":
agent_id = data.get("agent_id")
custom_system_instruction = data.get("custom_system_instruction")
voting_sensitivity = data.get("voting_sensitivity", "lenient")
voting_threshold = data.get("voting_threshold")
max_new_answers_per_agent = data.get("max_new_answers_per_agent")
max_new_answers_global = data.get("max_new_answers_global")
answer_novelty_requirement = data.get("answer_novelty_requirement", "lenient")
fairness_enabled = data.get("fairness_enabled", True)
fairness_lead_cap_answers = data.get("fairness_lead_cap_answers", 2)
max_midstream_injections_per_round = data.get("max_midstream_injections_per_round", 2)

# Handle timeout_config
timeout_config = TimeoutConfig()
@@ -1020,9 +1035,13 @@ def from_dict(cls, data: Dict[str, Any]) -> "AgentConfig":
message_templates=message_templates,
agent_id=agent_id,
voting_sensitivity=voting_sensitivity,
voting_threshold=voting_threshold,
max_new_answers_per_agent=max_new_answers_per_agent,
max_new_answers_global=max_new_answers_global,
answer_novelty_requirement=answer_novelty_requirement,
fairness_enabled=fairness_enabled,
fairness_lead_cap_answers=fairness_lead_cap_answers,
max_midstream_injections_per_round=max_midstream_injections_per_round,
timeout_config=timeout_config,
coordination_config=coordination_config,
)
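
The defaulting pattern in `from_dict` can be exercised standalone with a small stand-in (`FairnessSettings` is hypothetical; the real fields live on `AgentConfig`):

```python
from dataclasses import dataclass
from typing import Any, Dict

@dataclass
class FairnessSettings:
    # Mirrors the fairness defaults added to AgentConfig in this PR.
    fairness_enabled: bool = True
    fairness_lead_cap_answers: int = 2
    max_midstream_injections_per_round: int = 2

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "FairnessSettings":
        # Missing keys fall back to the documented defaults.
        return cls(
            fairness_enabled=data.get("fairness_enabled", True),
            fairness_lead_cap_answers=data.get("fairness_lead_cap_answers", 2),
            max_midstream_injections_per_round=data.get("max_midstream_injections_per_round", 2),
        )
```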
5 changes: 5 additions & 0 deletions massgen/api_params_handler/_api_params_handler_base.py
@@ -124,10 +124,15 @@ def get_base_excluded_params(self) -> Set[str]:
"debug_delay_after_n_tools",
# Per-agent voting sensitivity (coordination config, not API param)
"voting_sensitivity",
"voting_threshold",
# Decomposition mode parameters (handled by orchestrator, not passed to API)
"coordination_mode",
"presenter_agent",
"subtask",
# Fairness controls (handled by orchestrator, not passed to API)
"fairness_enabled",
"fairness_lead_cap_answers",
"max_midstream_injections_per_round",
}

def build_base_api_params(
5 changes: 5 additions & 0 deletions massgen/backend/base.py
@@ -337,10 +337,15 @@ def get_base_excluded_config_params(cls) -> set:
"debug_delay_after_n_tools",
# Per-agent voting sensitivity (coordination config, not API param)
"voting_sensitivity",
"voting_threshold",
# Decomposition mode parameters (handled by orchestrator, not passed to API)
"coordination_mode",
"presenter_agent",
"subtask",
# Fairness controls (handled by orchestrator, not passed to API)
"fairness_enabled",
"fairness_lead_cap_answers",
"max_midstream_injections_per_round",
}

@abstractmethod