14 changes: 14 additions & 0 deletions docs/modules/composition.md
Expand Up @@ -131,6 +131,20 @@ Phase 2: Per-dimension parallel analysis (methodology personas per dimension)
Phase 3: Cross-dimension synthesis (integration personas, synthesis eval criteria)
```

### Pattern 5: Ensemble (Produce → Vote → Synthesize)

Use this pattern for tasks where independent diversity matters more than iterative refinement. Each agent produces its best answer in isolation, then the agents vote, and the winner synthesizes insights from all the other answers.

```
Phase 1: Independent parallel production (disable_injection: true)
Phase 2: Vote on best answer (defer_voting_until_all_answered: true)
Phase 3: Winner synthesizes from all (final_answer_strategy: synthesize)
```

This is the default pattern for multi-agent subagent runs. It maximizes answer diversity by preventing agents from anchoring on each other's work, while still producing a high-quality synthesized final answer.

See ``docs/source/reference/yaml_schema.rst`` and ``configs/voting/ensemble_mode.yaml`` for configuration details.
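
The three phases can be sketched in plain Python. This is a minimal illustration under stated assumptions, not MassGen's orchestrator: `run_ensemble` and the producer callables are hypothetical stand-ins for agent calls, and the voting rule (each agent votes for the longest answer other than its own) is an arbitrary placeholder for a real judgment.

```python
from collections import Counter
from typing import Callable

# Hypothetical stand-in for an agent call; not a MassGen type.
Producer = Callable[[str], str]


def run_ensemble(task: str, producers: dict[str, Producer]) -> str:
    """Produce -> vote -> synthesize, for at least two agents."""
    # Phase 1: every agent answers in isolation (no peer answers are injected).
    answers = {name: produce(task) for name, produce in producers.items()}

    # Phase 2: voting starts only after all answers exist.
    # Placeholder rule: each agent votes for the longest answer that is not its own.
    votes = Counter()
    for voter in answers:
        candidates = {n: a for n, a in answers.items() if n != voter}
        votes[max(candidates, key=lambda n: len(candidates[n]))] += 1
    winner = votes.most_common(1)[0][0]

    # Phase 3: the winner's answer is the basis; the other answers are folded in.
    others = [a for n, a in answers.items() if n != winner]
    return answers[winner] + " | also considered: " + "; ".join(others)
```

The point of the sketch is the ordering: no answer is visible to a peer until every answer exists, and synthesis happens once, after the vote.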

Comment on lines +146 to +147

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

fd 'ensemble_mode.yaml'
rg -n "ensemble_mode.yaml" docs/modules/composition.md docs/source/reference/yaml_schema.rst

Repository: massgen/MassGen

Length of output: 205


🏁 Script executed:

fd . configs/ -name "*ensemble*" -o -name "*voting*"
ls -la configs/ 2>/dev/null || echo "No configs directory found"

Repository: massgen/MassGen

Length of output: 333


🏁 Script executed:

find . -name "*.yaml" -o -name "*.yml" | head -20
find . -type d -name "config*" 2>/dev/null
cat docs/source/reference/yaml_schema.rst | head -50

Repository: massgen/MassGen

Length of output: 3065


🏁 Script executed:

rg -i "voting|ensemble" docs/ massgen/
ls -la massgen/configs/
find massgen/configs -type f -name "*.yaml" | sort

Repository: massgen/MassGen

Length of output: 50374


🏁 Script executed:

find . -type d -name "voting"
find . -name "*ensemble*mode*" -type f
# Check what configs actually exist
ls -la massgen/configs/
head -20 docs/modules/composition.md | tail -10

Repository: massgen/MassGen

Length of output: 2438


🏁 Script executed:

ls -la massgen/configs/voting/
find massgen/configs/voting -type f

Repository: massgen/MassGen

Length of output: 314


🏁 Script executed:

sed -n '130,160p' docs/modules/composition.md

Repository: massgen/MassGen

Length of output: 2020


🏁 Script executed:

# Check if ensemble_mode.yaml is mentioned anywhere else in docs
rg "ensemble_mode" docs/ massgen/

Repository: massgen/MassGen

Length of output: 201


The referenced config file ensemble_mode.yaml does not exist.

massgen/configs/voting/ensemble_mode.yaml is not in the repository. The voting directory contains only gemini_gpt_voting_sensitivity.yaml. Either create this example config file or correct the documentation reference to point to an existing config.

Additionally, the path is missing the massgen/ prefix—should be massgen/configs/voting/ not configs/voting/.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/modules/composition.md` around lines 146 - 147, The docs reference an
example config "configs/voting/ensemble_mode.yaml" that doesn't exist and also
omits the project prefix; update the documentation or repo: either add the
missing example file at massgen/configs/voting/ensemble_mode.yaml or change the
reference in docs/modules/composition.md to point to an existing file (e.g.,
massgen/configs/voting/gemini_gpt_voting_sensitivity.yaml) and correct the path
to include the "massgen/" prefix; also ensure the related reference to
docs/source/reference/yaml_schema.rst remains correct.

## Checklist Gate Criteria for Special Primitives

The default checklist items (E1-E4) are designed for general task output. But special primitives — persona generation, task decomposition, evaluation criteria generation, and analytical tasks like prompt crafting or log analysis — have well-defined quality characteristics that don't require another level of prompt generation to specify.
86 changes: 82 additions & 4 deletions docs/modules/coordination_workflow.md
Expand Up @@ -446,13 +446,19 @@ Checklist mode is policy, not the core coordination primitive:
4. in that task-driven branch, the parent uses `get_task_plan` as the source of truth, may open the evaluator artifact paths for rationale, and does not call `submit_checklist` or `propose_improvements`
5. in that task-driven branch, the parent implements, verifies, and submits via `new_answer` directly; for pure text artifacts, the final artifact body goes straight into `new_answer.content`
6. if structured `next_tasks` are missing or invalid, the parent uses the returned critique/spec packet as the diagnostic basis for checklist submission
-7. in that fallback branch, the parent saves or copies that packet into its workspace as the diagnostic report and calls `submit_checklist`
7. in that degraded fallback branch, the parent references the surfaced `critique_packet.md` path directly as `report_path` and calls `submit_checklist`
8. if checklist returns `status=validation_error`, fix payload/report and call `submit_checklist` again
9. if accepted iterate verdict, call `propose_improvements`
10. implement plan (use `improvement_spec` from the evaluator packet as richer guidance when present)
-11. write/update `memory/short_term/verification_latest.md` with replayable verification steps/artifacts
11. write/update `memory/short_term/verification_latest.md` using the replay contract sections
`Verification Contract`, `Inputs and Artifacts`, `Replay Steps`, `Latest Verification Result`,
and `Stale If`
12. submit via `new_answer` (or terminal action)
- round evaluator contract notes:
- support matrix:
- core path: orchestrator-managed stage -> synthesized evaluator packet -> valid `next_tasks.json` -> parent executes one task-driven next-round thesis
- degraded fallback: canonical packet exists but structured handoff is missing or invalid, so checklist submission uses `critique_packet.md` as the diagnostic basis
- advanced / non-default: branches such as `round_evaluator_refine` remain available for specific uses, but they are not the default story
- returns a packet with `criteria_interpretation`, `criterion_findings`, `cross_answer_synthesis`, `preserve`, `improvement_spec`, `verification_plan`, and `evidence_gaps`
- the packet is critique/spec guidance only, not a checklist payload or terminal recommendation
- the inline `verdict_block` is intentionally minimal and carries verdict metadata (`verdict` + `scores`) rather than the full task handoff
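
The split between the core path and the degraded fallback described in this hunk can be sketched as a small decision function. The file names `next_tasks.json` and `critique_packet.md` come from the text; everything else is an assumption. In particular, the real `next_tasks.json` schema is not shown here, so the check below (a JSON object with a non-empty `tasks` list) and the name `choose_branch` are purely illustrative.

```python
import json
from pathlib import Path


def choose_branch(packet_dir: Path) -> str:
    """Return "task_driven" or "checklist_fallback" for one evaluator packet.

    Assumption: a valid handoff is a JSON object with a non-empty "tasks" list.
    The actual next_tasks.json schema is not specified in this document.
    """
    handoff = packet_dir / "next_tasks.json"
    try:
        data = json.loads(handoff.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing, unreadable, or unparseable handoff.
        return "checklist_fallback"
    tasks = data.get("tasks") if isinstance(data, dict) else None
    if isinstance(tasks, list) and tasks:
        # Core path: the parent follows the task plan and submits via new_answer.
        return "task_driven"
    # Degraded fallback: critique_packet.md becomes the diagnostic basis.
    return "checklist_fallback"
```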
Expand All @@ -468,6 +474,7 @@ Checklist mode is policy, not the core coordination primitive:
- generated child YAML for `round_evaluator` always mounts the shared temp-workspace root read-only
- `refine=false` keeps the evaluator child checklist-free; `refine=true` may inherit the parent checklist gate
- checklist-enabled `round_evaluator` child runs use a dedicated default criteria preset for evaluator-packet quality when no child-specific criteria are configured
- `round_evaluator_transformation_pressure` biases how hard the evaluator searches for a larger thesis shift (`gentle`, `balanced`, `aggressive`) while still keeping correctness-first execution and one committed next-round thesis
- checklist result contract:
- accepted path: `status=accepted` + `verdict`
- invalid path: `status=validation_error`, `requires_resubmission=true`, no `verdict`
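
A caller of `submit_checklist` might branch on this result contract roughly as follows. This is a sketch only: the `status` values and the `verdict` key come from the contract above, while the verdict string `"iterate"`, the function name, and the returned labels are assumptions made for illustration.

```python
def next_action(result: dict) -> str:
    """Map a submit_checklist result to the parent's next step (illustrative)."""
    status = result.get("status")
    if status == "validation_error":
        # Invalid path: no verdict is present; fix the payload/report and resubmit.
        return "resubmit_checklist"
    if status == "accepted":
        # Accepted path: a verdict is present.
        # Assumption: an iterate verdict is spelled "iterate".
        if result.get("verdict") == "iterate":
            return "propose_improvements"
        return "finish"
    raise ValueError(f"unexpected checklist status: {status!r}")
```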
Expand Down Expand Up @@ -569,9 +576,16 @@ orchestrator:
checklist_first_answer: false
```

-## Self-Improvement and Evaluator Rescue Cycle
## Round Evaluator Normal Path

-MassGen agents self-improve iteratively within each round, then submit their best answer. This cycle is by design — agents should push themselves to plateau before asking for external feedback. The round evaluator exists to rescue agents from plateaus they cannot break through alone.
MassGen agents self-improve iteratively within each round, then submit their
best answer. The round evaluator is the post-answer stage for **material self-improvement**
that should either surface a materially better next-round thesis or declare
local convergence for the current run.

This is especially useful for open-ended self-improvement loops. The evaluator
should keep helping the system find the next meaningful frontier of work rather
than certifying the current answer after a few cosmetic chores.

### The plateau problem

Expand Down Expand Up @@ -648,6 +662,70 @@ Not every criterion can be driven to 10/10. Distinguish between two cases:

The evaluator's `ceiling_status` signal lets the orchestrator make informed convergence decisions rather than burning rounds on diminishing returns.

## Transformation Pressure

`round_evaluator_transformation_pressure` is the evaluator-specific knob for
how aggressively the managed stage should search for a larger thesis change
before settling for incremental refinement.

Current behavior:

- `gentle`: exploit the current thesis longer; prefer corrective work unless
there is strong ceiling evidence.
- `balanced`: the default middle ground; allow bigger restructuring once the
current line of improvement starts plateauing.
- `aggressive`: treat open-ended tasks as frontier-seeking; push for
transformative shifts sooner when repeated rounds are not producing
step-change improvement.

Machine-learning intuition that may be useful here:

- **Exploration vs. exploitation**: `incremental_refinement` exploits the
current thesis; `thesis_shift` explores a different solution basin.
- **Local minima escape**: repeated small gains can indicate that more polish
is not enough and that a different approach should be tried.
- **Annealing / schedules**: effective transformation pressure could rise after
consecutive incremental rounds instead of remaining fixed.
- **Validation signal**: evaluator verification evidence should ground the push
toward transformation so the system does not chase novelty for novelty's
sake.
- **Catastrophic forgetting prevention**: the `preserve` list and final
preserve/regression verification protect strengths that should survive larger
pivots.
- **Ensemble distillation**: multiple evaluator agents may surface several
promising directions, but the managed stage should still collapse them into
one committed next thesis for the parent to execute.

This knob biases how strongly the evaluator searches for a higher frontier. It
does not override correctness-first fixes, and it does not turn the handoff
into an unresolved menu of incompatible strategies.
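
The "annealing / schedules" idea above is described as something that could happen, not as current behavior. One way such a schedule might look, as a hypothetical sketch: the three level names come from the text, while `effective_pressure`, `step_every`, and the one-level-per-N-rounds rule are invented for illustration.

```python
LEVELS = ["gentle", "balanced", "aggressive"]


def effective_pressure(
    configured: str,
    consecutive_incremental_rounds: int,
    step_every: int = 2,
) -> str:
    """Raise the pressure one level per `step_every` incremental rounds.

    Hypothetical schedule: the configured level is the floor, and the result
    never goes above "aggressive".
    """
    if configured not in LEVELS:
        raise ValueError(f"unknown pressure: {configured!r}")
    bump = max(consecutive_incremental_rounds, 0) // step_every
    return LEVELS[min(LEVELS.index(configured) + bump, len(LEVELS) - 1)]
```

With the default `step_every=2`, a `gentle` configuration stays `gentle` for the first incremental round, becomes `balanced` after two, and reaches `aggressive` after four.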

## Ensemble Pattern

The **ensemble pattern** is a coordination strategy that sits between iterative
voting and decomposition. Agents produce answers independently (no peer
visibility), then vote on the best, and the winner synthesizes insights from all
others.

Set it up with existing orchestrator parameters:

```yaml
orchestrator:
  disable_injection: true
  defer_voting_until_all_answered: true
  max_new_answers_per_agent: 1
  final_answer_strategy: "synthesize"
```

This is also the default pattern for multi-agent subagent runs (via
`SubagentOrchestratorConfig` defaults: `disable_injection: true`,
`defer_voting_until_all_answered: true`).

When voting has occurred, the `synthesize` strategy gives the winner a
winner-biased prompt ("your answer was selected as the best — use it as the
primary basis and incorporate strongest elements from others"). Without voting,
the prompt is neutral ("synthesize the strongest parts across all answers").
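
The choice between the two prompt framings can be pictured as a single branch. This is a sketch, not the project's prompt code: `synthesis_prompt` is a hypothetical name, and the strings paraphrase the wording quoted above.

```python
def synthesis_prompt(voting_occurred: bool) -> str:
    """Pick the synthesize-stage framing (wording paraphrased from the docs)."""
    if voting_occurred:
        # Winner-biased framing: the selected answer is the primary basis.
        return (
            "Your answer was selected as the best. Use it as the primary basis "
            "and incorporate the strongest elements from the others."
        )
    # Neutral framing: no answer is privileged.
    return "Synthesize the strongest parts across all answers."
```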

## Related Docs

- `docs/modules/architecture.md` - core system architecture and backend hierarchy
31 changes: 18 additions & 13 deletions docs/modules/subagents.md
Expand Up @@ -28,6 +28,7 @@ Behavior:
- `inherited` runs subagents in the same runtime boundary as the parent
- `delegated` uses the outbox pattern: the container writes a request file, a trusted host-side `SubagentLaunchWatcher` creates an isolated container per subagent (see [Delegated Mode](#delegated-mode-codex--docker) below)
- Codex+Docker auto-detection: when the backend is Codex with `command_line_execution_mode: docker` and no explicit fallback/prefix is set, the orchestrator automatically switches to `delegated` mode if a delegation directory is available. This replaces the old `inherited` fallback with secure per-subagent container isolation.
- Host-side Codex background MCP clients rewrite delegated subagent server configs back to `isolated` before connecting. That client already runs on the host, so it should launch the child run directly instead of pretending it is still inside the parent container.

## Delegated Mode (Codex + Docker)

Expand Down Expand Up @@ -184,30 +185,34 @@ Built-in profiles:
### Round Evaluator Loop

`coordination.round_evaluator_before_checklist: true` enables the single-parent
-manual/prompt-guided v1 flow:
orchestrator-managed round-evaluator stage:

- round 1: parent builds and submits its first answer normally
-- round 2+: the parent launches one blocking `round_evaluator` subagent before
-  checklist submission unless the separate orchestrator-managed gate is enabled
- round 2+: the orchestrator launches one blocking `round_evaluator` subagent
before checklist submission
- the round evaluator returns a critique/spec packet with `criteria_interpretation`, `criterion_findings`, `cross_answer_synthesis`, `preserve`, `improvement_spec`, `verification_plan`, and `evidence_gaps`
-- the parent saves or copies that packet into its workspace as the diagnostic
-  report used for `submit_checklist`
- core path: if valid `next_tasks.json` is present, the parent treats it as the
one committed next-round thesis, calls `get_task_plan`, implements, verifies,
and submits via `new_answer`
- degraded fallback: if valid `next_tasks.json` is missing or invalid, the
parent uses `critique_packet.md` as the diagnostic basis for `submit_checklist`
- the parent does not run a second full self-evaluation pass; additional
verification is only for explicit `evidence_gaps`
- the parent still owns `submit_checklist`, `propose_improvements`, `new_answer`, and `vote`
- generated child YAML for `round_evaluator` always mounts the shared
temp-workspace root read-only
- with `refine: false`, `round_evaluator` remains a quick critique-only child
run and omits checklist-gated child settings
- with `refine: true`, `round_evaluator` may inherit the parent checklist gate;
when no child-specific criteria are configured, it falls back to a built-in
`round_evaluator` criteria preset for judging the critique packet itself
- if the evaluator child times out before producing `critique_packet.md`, the
orchestrator degrades back to the normal parent-owned checklist flow for that
answer set instead of terminating coordination immediately
- `round_evaluator_refine` remains an advanced/non-default branch
- when the child run is using presenter-stage `synthesize`/`winner_present`, it
keeps `skip_final_presentation: false`
- `round_evaluator_transformation_pressure` controls how aggressively the
evaluator seeks a larger thesis shift: `gentle`, `balanced`, or `aggressive`

-`coordination.orchestrator_managed_round_evaluator: true` is a separate,
-currently gated mode that lets the orchestrator launch that same blocking
-`round_evaluator` before round 2+.
`coordination.orchestrator_managed_round_evaluator: true` remains required for
this stage and keeps the launch reserved for the orchestrator rather than a
manual parent prompt pattern.
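
Because the stage depends on two flags, a pre-flight check on the `coordination` mapping can catch a partial config early. The sketch below is not MassGen's validator: `check_round_evaluator_flags` is a hypothetical helper, both flags are assumed to default to `false` as the schema table in this PR states, and the additional checklist-gated-voting requirement is not checked because its config key is not shown here.

```python
def check_round_evaluator_flags(coordination: dict) -> list[str]:
    """Return human-readable problems with the two round-evaluator flags."""
    problems = []
    before = bool(coordination.get("round_evaluator_before_checklist", False))
    managed = bool(coordination.get("orchestrator_managed_round_evaluator", False))
    if before and not managed:
        # The stage is requested, but the orchestrator-owned launch is not enabled.
        problems.append(
            "round_evaluator_before_checklist requires "
            "orchestrator_managed_round_evaluator: true"
        )
    return problems
```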
Comment on lines +188 to +215

⚠️ Potential issue | 🟡 Minor

Make the dual-flag prerequisite explicit at the top of this section.

This now reads as if round_evaluator_before_checklist enables the stage by itself, but a few lines later orchestrator_managed_round_evaluator is still described as required. Please state both required flags together in the opening sentence or validation list so readers do not land on an invalid partial config.

Based on learnings, module documentation in docs/modules/ is the authoritative source for complex subsystems and should be updated precisely when behavior changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/modules/subagents.md` around lines 188 - 215, Update the opening of this
section to explicitly state that both
coordination.orchestrator_managed_round_evaluator and the
round_evaluator_before_checklist flag are required together to enable the
orchestrator-managed round-evaluator flow; revise the first sentence/validation
list to mention both flags by name (round_evaluator_before_checklist and
orchestrator_managed_round_evaluator) and note that omitting either produces a
degraded fallback, and ensure any subsequent sentences that implied a
single-flag enablement are adjusted to reflect the dual-flag prerequisite.


Validation constraints for this mode:

76 changes: 76 additions & 0 deletions docs/source/reference/yaml_schema.rst
Expand Up @@ -1032,6 +1032,22 @@ Coordination Configuration
     - object
     - No
     - Background subagent configuration (``enabled``, ``injection_strategy``)
   * - ``round_evaluator_before_checklist``
     - boolean
     - No
     - Enable the orchestrator-managed round-evaluator stage before round-2+ checklist decisions (default: ``false``). Requires ``orchestrator_managed_round_evaluator: true`` and checklist-gated voting.
   * - ``orchestrator_managed_round_evaluator``
     - boolean
     - No
     - Treat the synthesized round-evaluator task handoff as the normal post-answer self-improvement path (default: ``false``).
   * - ``round_evaluator_refine``
     - boolean
     - No
     - Advanced/non-default option that lets the evaluator child run iterate before producing its packet (default: ``false``).
   * - ``round_evaluator_transformation_pressure``
Comment on lines +1043 to +1047

⚠️ Potential issue | 🟡 Minor

Fix wording in round_evaluator_refine description.

The phrase “run iterate” reads as a typo and is confusing in the schema docs.

✏️ Suggested wording fix
-  - Advanced/non-default option that lets the evaluator child run iterate before producing its packet (default: ``false``).
+  - Advanced/non-default option that lets the evaluator child iterate before producing its packet (default: ``false``).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/source/reference/yaml_schema.rst` around lines 1043 - 1047, The
description for round_evaluator_refine is unclear due to the phrase "run
iterate"; update the text to a clearer wording such as "allow the evaluator
child to perform iterative refinement before producing its packet (default:
``false``)" so it accurately conveys that the evaluator will execute its
iterative refinement process prior to emitting the packet; modify the schema doc
entry for round_evaluator_refine accordingly.

     - string
     - No
     - Bias on how aggressively the evaluator seeks a larger thesis change. Supported values: ``gentle``, ``balanced``, ``aggressive``. Default: ``balanced``.

.. note::

Expand Down Expand Up @@ -1160,6 +1176,66 @@ Decomposition mode (recommended defaults):
fairness_lead_cap_answers: 2
max_midstream_injections_per_round: 2

Ensemble pattern (recommended defaults):

.. code-block:: yaml

   orchestrator:
     # Agents work independently; no peer answer injection
     disable_injection: true
     # Wait for all agents to finish before voting begins
     defer_voting_until_all_answered: true
     # Each agent produces 1 answer (adjustable)
     max_new_answers_per_agent: 1
     # Winner synthesizes from all answers
     final_answer_strategy: "synthesize"

The **ensemble pattern** is a coordination strategy where agents produce answers
independently (no peer visibility), then vote on the best answer, and the winner
synthesizes insights from all others into a refined final answer.

**When to use ensemble mode:**

- You want diverse, independent perspectives without agents anchoring on each
other's work
- The task benefits from competitive parallel attempts rather than iterative
refinement (e.g., creative writing, design proposals, solution brainstorming)
- You want faster coordination — single round of production + vote, no
multi-round iteration

**Subagent default:** Multi-agent subagent runs use ensemble defaults
automatically (``disable_injection: true``, ``defer_voting_until_all_answered:
true``). Override by setting these fields explicitly in
``subagent_orchestrator`` config.
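
The override rule (defaults apply unless a field is set explicitly) amounts to a dictionary merge in which explicit values win. A minimal sketch, assuming the two defaults named above; `resolve_subagent_orchestrator` and the constant name are hypothetical and do not describe how `SubagentOrchestratorConfig` is actually implemented.

```python
from typing import Optional

# The two ensemble defaults named in the docs for multi-agent subagent runs.
ENSEMBLE_SUBAGENT_DEFAULTS = {
    "disable_injection": True,
    "defer_voting_until_all_answered": True,
}


def resolve_subagent_orchestrator(explicit: Optional[dict] = None) -> dict:
    """Explicit fields win; unspecified fields keep the ensemble defaults."""
    return {**ENSEMBLE_SUBAGENT_DEFAULTS, **(explicit or {})}
```

For example, passing `{"disable_injection": False}` re-enables peer injection while leaving deferred voting on.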
Comment on lines +1206 to +1209

⚠️ Potential issue | 🟠 Major

Avoid splitting inline literals across lines in RST.

The inline literal around defer_voting_until_all_answered: true is broken by a newline, which can cause Sphinx/RST parsing warnings.

🧩 Suggested RST-safe formatting
-**Subagent default:** Multi-agent subagent runs use ensemble defaults
-automatically (``disable_injection: true``, ``defer_voting_until_all_answered:
-true``). Override by setting these fields explicitly in
-``subagent_orchestrator`` config.
+**Subagent default:** Multi-agent subagent runs use ensemble defaults
+automatically (``disable_injection: true``,
+``defer_voting_until_all_answered: true``). Override by setting these fields
+explicitly in ``subagent_orchestrator`` config.
As per coding guidelines: `docs/source/**/*.rst`: “Ensure proper RST syntax”.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/source/reference/yaml_schema.rst` around lines 1206 - 1209, The inline
literal for defer_voting_until_all_answered: true is split across a newline
causing RST parsing warnings; update the sentence so the entire inline literal
(and related literal `disable_injection: true`) stays on a single line or use a
single inline literal span for the whole clause (e.g., ``disable_injection:
true, defer_voting_until_all_answered: true``) and keep the reference to
`subagent_orchestrator` unchanged so Sphinx parses correctly.


.. list-table:: Ensemble vs Standard Voting vs Decomposition
   :header-rows: 1

   * - Aspect
     - Standard voting
     - Ensemble pattern
     - Decomposition
   * - Peer visibility
     - Agents see each other's answers
     - Agents work in isolation
     - Agents see subtask assignments
   * - Iteration
     - Multiple refinement rounds
     - Single round of production
     - Multiple rounds per subtask
   * - Voting
     - After iterative refinement
     - After all answers produced
     - No voting (presenter assembles)
   * - Final answer
     - Winner presents
     - Winner synthesizes from all
     - Presenter integrates subtasks
   * - Best for
     - Deep quality refinement
     - Diverse perspectives, speed
     - Complex multi-part tasks

Comment on lines +1179 to +1238

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify Recent Releases section exists and contains feature keywords.
rg -n "Recent Releases|round_evaluator_transformation_pressure|Ensemble" docs/source/index.rst docs/source/reference/yaml_schema.rst

Repository: massgen/MassGen

Length of output: 449


🏁 Script executed:

cat -n docs/source/index.rst | sed -n '195,250p'

Repository: massgen/MassGen

Length of output: 2605


Update docs/source/index.rst v0.1.61 release notes to include ensemble pattern and transformation_pressure features.

The ensemble pattern coordination strategy with round_evaluator_transformation_pressure and ensemble defaults (documented in yaml_schema.rst lines 1179–1238) is not mentioned in the Recent Releases section. The v0.1.61 entry currently only describes the round evaluator subagent; update it to also reference the new ensemble pattern and coordination options introduced in this release.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/source/reference/yaml_schema.rst` around lines 1179 - 1238, Update the
v0.1.61 release notes in the Recent Releases section to mention the new ensemble
coordination pattern and the related configuration options introduced in this
release: reference the ensemble defaults (disable_injection,
defer_voting_until_all_answered, max_new_answers_per_agent,
final_answer_strategy) and the round evaluator option
round_evaluator_transformation_pressure alongside the existing round
evaluator/subagent notes so the entry covers both the ensemble pattern and the
round evaluator subagent features.

Timeout Configuration
~~~~~~~~~~~~~~~~~~~~~

11 changes: 11 additions & 0 deletions massgen/agent_config.py
Expand Up @@ -212,6 +212,7 @@ class CoordinationConfig:
orchestrator_managed_round_evaluator: bool = False # Gate orchestrator-owned round_evaluator launch; default prompt-guidance only
round_evaluator_skip_synthesis: bool = False # Skip synthesis stage; pass all raw critiques to parent directly
round_evaluator_refine: bool = False # Allow evaluator agents to iterate (multi-round with voting)
round_evaluator_transformation_pressure: str = "balanced" # "gentle" | "balanced" | "aggressive"
enable_quality_rethink_on_iteration: bool = False # Auto-inject quality_rethinking spawn task on iteration 2+
enable_novelty_on_iteration: bool = False # Auto-inject novelty/quality spawn task on iteration 2+
novelty_injection: str = "none" # "none" | "gentle" | "moderate" | "aggressive"
Expand All @@ -227,6 +228,7 @@ def __post_init__(self):
self._validate_subagent_runtime_config()
self._validate_drift_conflict_policy()
self._validate_novelty_injection()
self._validate_round_evaluator_transformation_pressure()
self._validate_learning_capture_mode()
self._validate_pre_collab_voting_threshold()
self._validate_improvements()
Expand Down Expand Up @@ -294,6 +296,14 @@ def _validate_novelty_injection(self):
                f"Invalid novelty_injection: '{self.novelty_injection}'. " f"Must be one of: {sorted(valid_values)}",
            )

    def _validate_round_evaluator_transformation_pressure(self):
        """Validate round_evaluator_transformation_pressure setting."""
        valid_values = {"gentle", "balanced", "aggressive"}
        if self.round_evaluator_transformation_pressure not in valid_values:
            raise ValueError(
                f"Invalid round_evaluator_transformation_pressure: '{self.round_evaluator_transformation_pressure}'. "
                f"Must be one of: {sorted(valid_values)}",
            )

    def _validate_learning_capture_mode(self):
        """Validate learning_capture_mode setting."""
        valid_values = {"round", "verification_and_final_only", "final_only"}
Expand Down Expand Up @@ -1151,6 +1161,7 @@ def to_dict(self) -> dict[str, Any]:
"planning_mode_instruction": self.coordination_config.planning_mode_instruction,
"max_orchestration_restarts": self.coordination_config.max_orchestration_restarts,
"drift_conflict_policy": self.coordination_config.drift_conflict_policy,
"round_evaluator_transformation_pressure": self.coordination_config.round_evaluator_transformation_pressure,
}

# Handle debug fields
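
To see the effect of the new validator in isolation, the following stand-alone sketch mirrors it on a one-field dataclass. `PressureConfig` is a hypothetical stand-in written for this example; it is not the real `CoordinationConfig`, which has many more fields and validators.

```python
from dataclasses import dataclass

VALID_PRESSURES = {"gentle", "balanced", "aggressive"}


@dataclass
class PressureConfig:
    """Hypothetical stand-in for the single CoordinationConfig field added here."""

    round_evaluator_transformation_pressure: str = "balanced"

    def __post_init__(self):
        # Same rule as the validator in the diff: reject anything outside the set.
        value = self.round_evaluator_transformation_pressure
        if value not in VALID_PRESSURES:
            raise ValueError(
                f"Invalid round_evaluator_transformation_pressure: '{value}'. "
                f"Must be one of: {sorted(VALID_PRESSURES)}"
            )
```

Constructing `PressureConfig()` succeeds with the `balanced` default, while `PressureConfig("extreme")` raises `ValueError` at construction time, so a bad value fails when the config is loaded and not later during a run.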
1 change: 1 addition & 0 deletions massgen/api_params_handler/_api_params_handler_base.py
Expand Up @@ -115,6 +115,7 @@ def get_base_excluded_params(self) -> set[str]:
"orchestrator_managed_round_evaluator", # Gate for orchestrator-owned round_evaluator launch
"round_evaluator_skip_synthesis", # Skip synthesis; pass raw critiques to parent directly
"round_evaluator_refine", # Allow evaluator agents to iterate (multi-round with voting)
"round_evaluator_transformation_pressure", # Coordination-only bias for evaluator thesis boldness
"enable_quality_rethink_on_iteration", # Coordination-only quality task injection toggle
"enable_novelty_on_iteration", # Coordination-only novelty task injection toggle
"novelty_injection", # Novelty pressure level (none/gentle/moderate/aggressive)
1 change: 1 addition & 0 deletions massgen/backend/base.py
Expand Up @@ -343,6 +343,7 @@ def get_base_excluded_config_params(cls) -> set:
"orchestrator_managed_round_evaluator", # Gate for orchestrator-owned round_evaluator launch
"round_evaluator_skip_synthesis", # Skip synthesis; pass raw critiques to parent directly
"round_evaluator_refine", # Allow evaluator agents to iterate (multi-round with voting)
"round_evaluator_transformation_pressure", # Coordination-only bias for evaluator thesis boldness
"enable_quality_rethink_on_iteration", # Coordination-only quality task injection toggle
"enable_novelty_on_iteration", # Coordination-only novelty task injection toggle
"novelty_injection", # Novelty pressure level (none/gentle/moderate/aggressive)
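
The reason the new key is added to both exclusion sets is, presumably, that coordination-only settings must not be forwarded to a model API. The sketch below illustrates that filtering step under that assumption; `build_api_params` and `COORDINATION_ONLY` are hypothetical names, and the set lists only three of the keys shown in the diff.

```python
# A small subset of the excluded keys shown in the diff, for illustration only.
COORDINATION_ONLY = {
    "round_evaluator_refine",
    "round_evaluator_transformation_pressure",
    "novelty_injection",
}


def build_api_params(config: dict, excluded: set = COORDINATION_ONLY) -> dict:
    """Drop coordination-only keys so they are never sent to a model API."""
    return {k: v for k, v in config.items() if k not in excluded}
```

If the key were missing from the exclusion set, a config containing `round_evaluator_transformation_pressure` could leak into the request payload as an unknown parameter.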