Commit a8db42c

Merge pull request #987 from massgen/docs_for_v0.1.61
docs: docs for v0.1.61
2 parents 01e5cd1 + a0b1cab commit a8db42c

12 files changed: +286 -136 lines changed

CHANGELOG.md

Lines changed: 24 additions & 3 deletions
@@ -9,6 +9,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ## Recent Releases

+**v0.1.61 (March 9, 2026)** - Round Evaluator Paradigm
+New round evaluator subagent type that delegates evaluation to specialized evaluator subagents for deeper quality assessment. Major orchestrator refactoring with improved evaluation prompts, task plan injection, and subagent fixes.
+
 **v0.1.60 (March 6, 2026)** - Multimodal Tools, Subagent Enhancements & GPT-5.4
 Rewritten read_media with clearer schema and MediaCallLedgerHook for media call tracking. Subagent enhancements: inherit_spawning_agent_backend, final_answer_strategy, per-agent subagent_agents. GPT-5.4 as default OpenAI flagship. Decomp mode cooperates with checklist workflow. Codex prompt caching calculation fix for pricing accuracy.

@@ -18,11 +21,29 @@ Planning improvements with auto-added improvements to task plan and plan review
 **v0.1.58 (March 2, 2026)** - Multimodal Revamp, Nvidia NIM Backend & Quality Rethinking
 Comprehensive multimodal revamp with ElevenLabs TTS/STT, Nano Banana 2 image generation, and Grok multimedia. Nvidia NIM backend for NVIDIA Inference Microservices. Quality rethinking subagent for per-element craft improvements. Smarter checklists with improve/preserve listings. Logging architecture refactor and CLI mode flags.

-**v0.1.57 (February 27, 2026)** - Delegated Subagent Protocol & Builder Subagent
-File-based delegation protocol for container-to-host subagent spawning. New builder subagent type for large artifact generation with fresh context. Claude Code reasoning parameters for updated SDK. Smarter convergence with substantiveness tracking and diagnostic report gating.
-
 ---

+## [0.1.61] - 2026-03-09
+
+### Added
+- **Round Evaluator Subagent Type** ([#986](https://github.com/massgen/MassGen/pull/986)): New `round_evaluator` subagent type that delegates evaluation to specialized evaluator subagents for deeper quality assessment
+- **`round_evaluator_example.yaml` Config** ([#986](https://github.com/massgen/MassGen/pull/986)): New example config for the round evaluator paradigm
+
+### Changed
+- **Orchestrator Refactoring** ([#986](https://github.com/massgen/MassGen/pull/986)): Major orchestrator refactoring (+1,189 lines) to support the round evaluation workflow
+- **Evaluation Prompts** ([#986](https://github.com/massgen/MassGen/pull/986)): Improved evaluation prompts for clearer, more actionable feedback with task plan injection
+- **Simplified Config** ([#986](https://github.com/massgen/MassGen/pull/986)): Simplified config handling for evaluation parameters
+- **SUBAGENT.md Generality** ([#986](https://github.com/massgen/MassGen/pull/986)): Improved SUBAGENT.md for broader subagent compatibility
+
+### Fixed
+- **Session Resumption** ([#986](https://github.com/massgen/MassGen/pull/986)): Fixed resumption from already-resumed logs
+- **Round Evaluation Prompts** ([#986](https://github.com/massgen/MassGen/pull/986)): Improved round evaluation prompt clarity
+
+### Technical Details
+- **Major Focus**: Round evaluator paradigm — delegated evaluation to specialized subagents
+- **PRs Merged**: [#986](https://github.com/massgen/MassGen/pull/986) (improve_verification_time)
+- **Contributors**: @ncrispino (8 commits), @HenryQi (1 commit)
+
 ## [0.1.60] - 2026-03-06

 ### Added

CONTRIBUTING.md

Lines changed: 4 additions & 4 deletions
@@ -359,7 +359,7 @@ Create a `.env` file in the `massgen` directory as described in [README](README.

 ## 🔧 Development Workflow

-> **Important**: Our next version is v0.1.61. If you want to contribute, please contribute to the `dev/v0.1.61` branch (or `main` if dev/v0.1.61 doesn't exist yet).
+> **Important**: Our next version is v0.1.62. If you want to contribute, please contribute to the `dev/v0.1.62` branch (or `main` if dev/v0.1.62 doesn't exist yet).

 ### 1. Create Feature Branch

@@ -368,7 +368,7 @@ Create a `.env` file in the `massgen` directory as described in [README](README.
 git fetch upstream

 # Create feature branch from dev/v0.1.60 (or main if dev branch doesn't exist yet)
-git checkout -b feature/your-feature-name upstream/dev/v0.1.61
+git checkout -b feature/your-feature-name upstream/dev/v0.1.62
 ```

 ### 2. Make Your Changes
@@ -507,7 +507,7 @@ git push origin feature/your-feature-name
 ```

 Then create a pull request on GitHub:
-- Base branch: `dev/v0.1.61` (or `main` if dev branch doesn't exist yet)
+- Base branch: `dev/v0.1.62` (or `main` if dev branch doesn't exist yet)
 - Compare branch: `feature/your-feature-name`
 - Add clear description of changes
 - Link any related issues
@@ -617,7 +617,7 @@ Have a significant feature idea not covered by existing tracks?
 - [ ] Tests pass locally
 - [ ] Documentation is updated if needed
 - [ ] Commit messages follow convention
-- [ ] PR targets `dev/v0.1.61` branch (or `main` if dev branch doesn't exist yet)
+- [ ] PR targets `dev/v0.1.62` branch (or `main` if dev branch doesn't exist yet)

 ### PR Description Should Include

README.md

Lines changed: 30 additions & 35 deletions
@@ -69,7 +69,7 @@ This project started with the "threads of thought" and "iterative refinement" id
 <details open>
 <summary><h3>🆕 Latest Features</h3></summary>

-- [v0.1.59 Features](#-latest-features-v0159)
+- [v0.1.61 Features](#-latest-features-v0161)
 </details>

 <details open>
@@ -122,15 +122,15 @@ This project started with the "threads of thought" and "iterative refinement" id
 <details open>
 <summary><h3>🗺️ Roadmap</h3></summary>

-- [Recent Achievements (v0.1.59)](#recent-achievements-v0159)
-- [Previous Achievements (v0.0.3 - v0.1.58)](#previous-achievements-v003---v0158)
+- [Recent Achievements (v0.1.61)](#recent-achievements-v0161)
+- [Previous Achievements (v0.0.3 - v0.1.60)](#previous-achievements-v003---v0160)
 - [Key Future Enhancements](#key-future-enhancements)
 - Bug Fixes & Backend Improvements
 - Advanced Agent Collaboration
 - Expanded Model, Tool & Agent Integrations
 - Improved Performance & Scalability
 - Enhanced Developer Experience
-- [v0.1.60 Roadmap](#v0160-roadmap)
+- [v0.1.62 Roadmap](#v0162-roadmap)
 </details>

 <details open>
@@ -155,23 +155,22 @@ This project started with the "threads of thought" and "iterative refinement" id

 ---

-## 🆕 Latest Features (v0.1.60)
+## 🆕 Latest Features (v0.1.61)

-**🎉 Released: March 6, 2026**
+**🎉 Released: March 9, 2026**

-**What's New in v0.1.60:**
-- **🛠️ Multimodal Tool Improvements** - Rewritten `read_media` with clearer schema and `MediaCallLedgerHook` for tracking media calls.
-- **🤖 Subagent Enhancements** - `inherit_spawning_agent_backend` for automatic backend inheritance, `final_answer_strategy` for child orchestrator policy, per-agent `subagent_agents` override.
-- **🧠 GPT-5.4** - New default OpenAI flagship model across all coordination modes.
-- **🔄 Decomp + Checklist Cooperation** - Decomp mode works with checklist workflow for quality-gated subtask iteration.
+**What's New in v0.1.61:**
+- **🔄 Round Evaluator Paradigm** - New `round_evaluator` subagent type that delegates evaluation to specialized evaluator subagents for deeper quality assessment.
+- **📝 Evaluation Improvements** - Improved evaluation prompts with task plan injection for context-aware assessment.
+- **🔧 Orchestrator Refactoring** - Major orchestrator refactoring (+1,189 lines) to support the round evaluation workflow.

-**Try v0.1.60 Features:**
+**Try v0.1.61 Features:**
 ```bash
 # Install or upgrade
 pip install --upgrade massgen

-# Choose backend 'openai' with model 'gpt-5.4' in the setup wizard to start using GPT-5.4
-uv run massgen --quickstart
+# Try the round evaluator paradigm
+uv run massgen --config @examples/features/round_evaluator_example.yaml "Create a website for an AI startup with polished visuals and interactive elements"
 ```

 [See full release history and examples](massgen/configs/README.md#release-history--examples)
@@ -1233,31 +1232,27 @@ MassGen is currently in its foundational stage, with a focus on parallel, asynch

 ⚠️ **Early Stage Notice:** As MassGen is in active development, please expect upcoming breaking architecture changes as we continue to refine and improve the system.

-### Recent Achievements (v0.1.60)
-
-**🎉 Released: March 6, 2026**
+### Recent Achievements (v0.1.61)

-#### Multimodal Tools
-- **Rewritten `read_media` Tool** ([#978](https://github.com/massgen/MassGen/pull/978)): Clearer schema, better error handling, and improved naming
-- **`MediaCallLedgerHook`**: New hook for tracking `read_media` and `generate_media` tool calls
+**🎉 Released: March 9, 2026**

-#### Subagent Enhancements
-- **`inherit_spawning_agent_backend`** ([#978](https://github.com/massgen/MassGen/pull/978)): Subagents automatically inherit the spawning agent's backend configuration
-- **`final_answer_strategy`**: Configurable child orchestrator final-answer policy (winner_reuse, winner_present, synthesize)
-- **Per-Agent `subagent_agents`**: Per-agent override for subagent agent configs; orchestrator config file support with robust JSON parsing
+#### Round Evaluator Paradigm
+- **Round Evaluator Subagent Type** ([#986](https://github.com/massgen/MassGen/pull/986)): New `round_evaluator` subagent type that delegates evaluation to specialized evaluator subagents for deeper quality assessment
+- **Orchestrator Refactoring**: Major orchestrator refactoring (+1,189 lines) to support the round evaluation workflow
+- **New Config**: `round_evaluator_example.yaml` for easy adoption

-#### Model & Coordination
-- **GPT-5.4 Support** ([#978](https://github.com/massgen/MassGen/pull/978)): New default OpenAI flagship model added to the model registry
-- **Decomp + Checklist Cooperation**: Decomposition mode works with the checklist workflow for quality-gated subtask iteration
-- **Improved Verification Round Time**: Better `verification_latest` prompts for faster verification rounds
+#### Evaluation Improvements
+- **Improved Evaluation Prompts** ([#986](https://github.com/massgen/MassGen/pull/986)): Clearer, more actionable feedback with task plan injection
+- **Simplified Config**: Simplified config handling for evaluation parameters
+- **SUBAGENT.md Generality**: Improved SUBAGENT.md for broader subagent compatibility

 #### Fixes
-- **Checklist & Proposal Injections**: More reliable checklist behavior with improved proposal injection
-- **Codex Prompt Caching**: Fixed prompt caching calculation for pricing accuracy
-- **Task Plan Refresh**: Fixed task plan refresh during quality rounds
-- **Skill Prefix Handling**: Fixed edge cases in skill prefix resolution
+- **Session Resumption** ([#986](https://github.com/massgen/MassGen/pull/986)): Fixed resumption from already-resumed logs
+- **Round Evaluation Prompts**: Improved round evaluation prompt clarity
+
+### Previous Achievements (v0.0.3 - v0.1.60)

-### Previous Achievements (v0.0.3 - v0.1.59)
+**Multimodal Tools, Subagent Enhancements & GPT-5.4 (v0.1.60)**: Rewritten read_media with clearer schema and MediaCallLedgerHook. Subagent enhancements with inherit_spawning_agent_backend, final_answer_strategy, per-agent subagent_agents. GPT-5.4 as default OpenAI flagship. Decomp mode cooperates with checklist workflow. Codex prompt caching fix.

 **Quality Round Improvements (v0.1.59)**: Auto-add improvements to task plan, plan review enhancements. Better eval gen config, checklist fixes, Gemini tool name normalization for MCP. Subagent behavior adjustments, Docker skill write access fixes. Video gen skill adjustments and impact metric restoration.

@@ -1522,9 +1517,9 @@ MassGen is currently in its foundational stage, with a focus on parallel, asynch

 We welcome community contributions to achieve these goals.

-### v0.1.60 Roadmap
+### v0.1.62 Roadmap

-Version 0.1.60 focuses on improving skill use and exploration:
+Version 0.1.62 focuses on improving skill use and exploration:

 #### Planned Features
 - **Improve Skill Use and Exploration** ([#873](https://github.com/massgen/MassGen/issues/873)): Local skill execution, skill registry with hierarchical organization, and skill consolidation workflow

README_PYPI.md

Lines changed: 30 additions & 35 deletions
@@ -68,7 +68,7 @@ This project started with the "threads of thought" and "iterative refinement" id
 <details open>
 <summary><h3>🆕 Latest Features</h3></summary>

-- [v0.1.59 Features](#-latest-features-v0159)
+- [v0.1.61 Features](#-latest-features-v0161)
 </details>

 <details open>
@@ -121,15 +121,15 @@ This project started with the "threads of thought" and "iterative refinement" id
 <details open>
 <summary><h3>🗺️ Roadmap</h3></summary>

-- [Recent Achievements (v0.1.59)](#recent-achievements-v0159)
-- [Previous Achievements (v0.0.3 - v0.1.58)](#previous-achievements-v003---v0158)
+- [Recent Achievements (v0.1.61)](#recent-achievements-v0161)
+- [Previous Achievements (v0.0.3 - v0.1.60)](#previous-achievements-v003---v0160)
 - [Key Future Enhancements](#key-future-enhancements)
 - Bug Fixes & Backend Improvements
 - Advanced Agent Collaboration
 - Expanded Model, Tool & Agent Integrations
 - Improved Performance & Scalability
 - Enhanced Developer Experience
-- [v0.1.60 Roadmap](#v0160-roadmap)
+- [v0.1.62 Roadmap](#v0162-roadmap)
 </details>

 <details open>
@@ -154,23 +154,22 @@ This project started with the "threads of thought" and "iterative refinement" id

 ---

-## 🆕 Latest Features (v0.1.60)
+## 🆕 Latest Features (v0.1.61)

-**🎉 Released: March 6, 2026**
+**🎉 Released: March 9, 2026**

-**What's New in v0.1.60:**
-- **🛠️ Multimodal Tool Improvements** - Rewritten `read_media` with clearer schema and `MediaCallLedgerHook` for tracking media calls.
-- **🤖 Subagent Enhancements** - `inherit_spawning_agent_backend` for automatic backend inheritance, `final_answer_strategy` for child orchestrator policy, per-agent `subagent_agents` override.
-- **🧠 GPT-5.4** - New default OpenAI flagship model across all coordination modes.
-- **🔄 Decomp + Checklist Cooperation** - Decomp mode works with checklist workflow for quality-gated subtask iteration.
+**What's New in v0.1.61:**
+- **🔄 Round Evaluator Paradigm** - New `round_evaluator` subagent type that delegates evaluation to specialized evaluator subagents for deeper quality assessment.
+- **📝 Evaluation Improvements** - Improved evaluation prompts with task plan injection for context-aware assessment.
+- **🔧 Orchestrator Refactoring** - Major orchestrator refactoring (+1,189 lines) to support the round evaluation workflow.

-**Try v0.1.60 Features:**
+**Try v0.1.61 Features:**
 ```bash
 # Install or upgrade
 pip install --upgrade massgen

-# Choose backend 'openai' with model 'gpt-5.4' in the setup wizard to start using GPT-5.4
-uv run massgen --quickstart
+# Try the round evaluator paradigm
+uv run massgen --config @examples/features/round_evaluator_example.yaml "Create a website for an AI startup with polished visuals and interactive elements"
 ```

 [See full release history and examples](massgen/configs/README.md#release-history--examples)
@@ -1232,31 +1231,27 @@ MassGen is currently in its foundational stage, with a focus on parallel, asynch

 ⚠️ **Early Stage Notice:** As MassGen is in active development, please expect upcoming breaking architecture changes as we continue to refine and improve the system.

-### Recent Achievements (v0.1.60)
-
-**🎉 Released: March 6, 2026**
+### Recent Achievements (v0.1.61)

-#### Multimodal Tools
-- **Rewritten `read_media` Tool** ([#978](https://github.com/massgen/MassGen/pull/978)): Clearer schema, better error handling, and improved naming
-- **`MediaCallLedgerHook`**: New hook for tracking `read_media` and `generate_media` tool calls
+**🎉 Released: March 9, 2026**

-#### Subagent Enhancements
-- **`inherit_spawning_agent_backend`** ([#978](https://github.com/massgen/MassGen/pull/978)): Subagents automatically inherit the spawning agent's backend configuration
-- **`final_answer_strategy`**: Configurable child orchestrator final-answer policy (winner_reuse, winner_present, synthesize)
-- **Per-Agent `subagent_agents`**: Per-agent override for subagent agent configs; orchestrator config file support with robust JSON parsing
+#### Round Evaluator Paradigm
+- **Round Evaluator Subagent Type** ([#986](https://github.com/massgen/MassGen/pull/986)): New `round_evaluator` subagent type that delegates evaluation to specialized evaluator subagents for deeper quality assessment
+- **Orchestrator Refactoring**: Major orchestrator refactoring (+1,189 lines) to support the round evaluation workflow
+- **New Config**: `round_evaluator_example.yaml` for easy adoption

-#### Model & Coordination
-- **GPT-5.4 Support** ([#978](https://github.com/massgen/MassGen/pull/978)): New default OpenAI flagship model added to the model registry
-- **Decomp + Checklist Cooperation**: Decomposition mode works with the checklist workflow for quality-gated subtask iteration
-- **Improved Verification Round Time**: Better `verification_latest` prompts for faster verification rounds
+#### Evaluation Improvements
+- **Improved Evaluation Prompts** ([#986](https://github.com/massgen/MassGen/pull/986)): Clearer, more actionable feedback with task plan injection
+- **Simplified Config**: Simplified config handling for evaluation parameters
+- **SUBAGENT.md Generality**: Improved SUBAGENT.md for broader subagent compatibility

 #### Fixes
-- **Checklist & Proposal Injections**: More reliable checklist behavior with improved proposal injection
-- **Codex Prompt Caching**: Fixed prompt caching calculation for pricing accuracy
-- **Task Plan Refresh**: Fixed task plan refresh during quality rounds
-- **Skill Prefix Handling**: Fixed edge cases in skill prefix resolution
+- **Session Resumption** ([#986](https://github.com/massgen/MassGen/pull/986)): Fixed resumption from already-resumed logs
+- **Round Evaluation Prompts**: Improved round evaluation prompt clarity
+
+### Previous Achievements (v0.0.3 - v0.1.60)

-### Previous Achievements (v0.0.3 - v0.1.59)
+**Multimodal Tools, Subagent Enhancements & GPT-5.4 (v0.1.60)**: Rewritten read_media with clearer schema and MediaCallLedgerHook. Subagent enhancements with inherit_spawning_agent_backend, final_answer_strategy, per-agent subagent_agents. GPT-5.4 as default OpenAI flagship. Decomp mode cooperates with checklist workflow. Codex prompt caching fix.

 **Quality Round Improvements (v0.1.59)**: Auto-add improvements to task plan, plan review enhancements. Better eval gen config, checklist fixes, Gemini tool name normalization for MCP. Subagent behavior adjustments, Docker skill write access fixes. Video gen skill adjustments and impact metric restoration.

@@ -1521,9 +1516,9 @@ MassGen is currently in its foundational stage, with a focus on parallel, asynch

 We welcome community contributions to achieve these goals.

-### v0.1.60 Roadmap
+### v0.1.62 Roadmap

-Version 0.1.60 focuses on improving skill use and exploration:
+Version 0.1.62 focuses on improving skill use and exploration:

 #### Planned Features
 - **Improve Skill Use and Exploration** ([#873](https://github.com/massgen/MassGen/issues/873)): Local skill execution, skill registry with hierarchical organization, and skill consolidation workflow
