Merged
31 changes: 28 additions & 3 deletions CHANGELOG.md
@@ -9,6 +9,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## Recent Releases

**v0.1.62 (March 11, 2026)** - MassGen Skill & Viewer
New general-purpose MassGen Skill with 4 modes (general, evaluate, plan, spec) for use from Claude Code and other AI agents. Session viewer for real-time observation. Backend improvements for Claude Code, Codex, and Copilot. Headless and web quickstart modes.

**v0.1.61 (March 9, 2026)** - Round Evaluator Paradigm
New round evaluator subagent type that delegates evaluation to specialized evaluator subagents for deeper quality assessment. Major orchestrator refactoring with improved evaluation prompts, task plan injection, and subagent fixes.

@@ -18,11 +21,33 @@ Rewritten read_media with clearer schema and MediaCallLedgerHook for media call
**v0.1.59 (March 4, 2026)** - Quality Round Improvements
Planning improvements with auto-added improvements to task plan and plan review enhancements. Checklist and evaluation enhancements with better eval gen config and Gemini tool name normalization. Subagent behavior adjustments and media generation fixes.

**v0.1.58 (March 2, 2026)** - Multimodal Revamp, Nvidia NIM Backend & Quality Rethinking
Comprehensive multimodal revamp with ElevenLabs TTS/STT, Nano Banana 2 image generation, and Grok multimedia. Nvidia NIM backend for NVIDIA Inference Microservices. Quality rethinking subagent for per-element craft improvements. Smarter checklists with improve/preserve listings. Logging architecture refactor and CLI mode flags.

---

## [0.1.62] - 2026-03-11

### Added
- **MassGen Skill** ([#992](https://github.com/massgen/MassGen/pull/992)): New general-purpose multi-agent skill with 4 modes (general, evaluate, plan, spec) for Claude Code and other AI agents
- **Session Viewer** ([#992](https://github.com/massgen/MassGen/pull/992)): New `massgen viewer` command for real-time observation of automation sessions with interactive session picker and web mode
- **Headless Quickstart** ([#992](https://github.com/massgen/MassGen/pull/992)): Non-interactive setup via `--quickstart --headless` for CI/CD integration
- **Web Quickstart** ([#992](https://github.com/massgen/MassGen/pull/992)): Browser-based setup flow via `--web-quickstart`
- **Skill Auto-Sync** ([#992](https://github.com/massgen/MassGen/pull/992)): GitHub Actions workflow to auto-sync MassGen Skill to separate repository for easy installation

### Changed
- **Claude Code Backend** ([#992](https://github.com/massgen/MassGen/pull/992)): Background task execution support and SDK MCP integration
- **Codex Backend** ([#992](https://github.com/massgen/MassGen/pull/992)): Native filesystem access, JSONL event streaming, and MCP tool support
- **Copilot Model Discovery** ([#992](https://github.com/massgen/MassGen/pull/992)): Runtime model fetching with metadata caching
- **Planning & Evaluation** ([#992](https://github.com/massgen/MassGen/pull/992)): Better planning prompts with thoroughness support, removed should/could criteria to reduce output similarity
- **CLI Enhancements** ([#992](https://github.com/massgen/MassGen/pull/992)): `--print-backends` table, viewer subcommand, multi-agent quickstart via `--quickstart-agent`

### Fixed
- **Skill Viewer** ([#992](https://github.com/massgen/MassGen/pull/992)): Fixed skill viewer display and added convenience shell script
- **Correctness Prompts** ([#992](https://github.com/massgen/MassGen/pull/992)): Updated correctness prompts for improved accuracy

### Technical Details
- **Major Focus**: MassGen Skill & Viewer — general-purpose skill, session observation, backend improvements
- **PRs Merged**: [#992](https://github.com/massgen/MassGen/pull/992) (evaluator-skill)
- **Contributors**: @ncrispino (6 commits), @HenryQi (2 commits) and the MassGen team

## [0.1.61] - 2026-03-09

### Added
8 changes: 4 additions & 4 deletions CONTRIBUTING.md
@@ -359,7 +359,7 @@ Create a `.env` file in the `massgen` directory as described in [README](README.

## 🔧 Development Workflow

> **Important**: Our next version is v0.1.62. If you want to contribute, please contribute to the `dev/v0.1.62` branch (or `main` if dev/v0.1.62 doesn't exist yet).
> **Important**: Our next version is v0.1.63. If you want to contribute, please contribute to the `dev/v0.1.63` branch (or `main` if dev/v0.1.63 doesn't exist yet).
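
The fallback rule in the note above (target the dev branch, or `main` if the dev branch doesn't exist yet) can be checked from the command line before branching. The following is a minimal sketch, not part of the MassGen tooling: `choose_base_branch` is a hypothetical helper name, and it assumes your clone has a remote called `upstream` as used elsewhere in this guide.

```shell
# Sketch only: choose_base_branch is a hypothetical helper, not part of MassGen.
# Prints the dev branch name if the remote has it, otherwise prints "main".
# $1 = remote name or URL, $2 = dev branch name (e.g. dev/v0.1.63)
choose_base_branch() {
  # --exit-code makes ls-remote return a non-zero status when no ref matches.
  # Note: an unreachable or misconfigured remote also fails here,
  # so in that case the function falls back to "main" as well.
  if git ls-remote --exit-code "$1" "refs/heads/$2" >/dev/null 2>&1; then
    printf '%s\n' "$2"
  else
    printf '%s\n' "main"
  fi
}

# Example (assumes a remote named `upstream`):
#   git fetch upstream
#   base="$(choose_base_branch upstream dev/v0.1.63)"
#   git checkout -b feature/your-feature-name "upstream/$base"
```

Because a network failure is indistinguishable from a missing branch in this sketch, it is worth confirming the result with `git branch -r` before opening a pull request.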

### 1. Create Feature Branch

@@ -368,7 +368,7 @@ Create a `.env` file in the `massgen` directory as described in [README](README.
git fetch upstream

# Create feature branch from the current dev branch (or main if the dev branch doesn't exist yet)
git checkout -b feature/your-feature-name upstream/dev/v0.1.62
git checkout -b feature/your-feature-name upstream/dev/v0.1.63
```

### 2. Make Your Changes
@@ -507,7 +507,7 @@ git push origin feature/your-feature-name
```

Then create a pull request on GitHub:
- Base branch: `dev/v0.1.62` (or `main` if dev branch doesn't exist yet)
- Base branch: `dev/v0.1.63` (or `main` if dev branch doesn't exist yet)
- Compare branch: `feature/your-feature-name`
- Add clear description of changes
- Link any related issues
@@ -617,7 +617,7 @@ Have a significant feature idea not covered by existing tracks?
- [ ] Tests pass locally
- [ ] Documentation is updated if needed
- [ ] Commit messages follow convention
- [ ] PR targets `dev/v0.1.62` branch (or `main` if dev branch doesn't exist yet)
- [ ] PR targets `dev/v0.1.63` branch (or `main` if dev branch doesn't exist yet)

### PR Description Should Include

71 changes: 38 additions & 33 deletions README.md
@@ -45,7 +45,7 @@ MassGen is a cutting-edge multi-agent framework that coordinates AI agents to so
This project started with the "threads of thought" and "iterative refinement" ideas presented in [The Myth of Reasoning](https://docs.ag2.ai/latest/docs/blog/2025/04/16/Reasoning/), and extends the classic "multi-agent conversation" idea in [AG2](https://github.com/ag2ai/ag2). Here is a [video recording](https://www.youtube.com/watch?v=xM2Uguw1UsQ) of the background context introduction presented at the Berkeley Agentic AI Summit 2025.

<p align="center">
<b>🤖 For LLM Agents:</b> <a href="AI_USAGE.md">AI_USAGE.md</a> - Complete automation guide to run MassGen inside an LLM
<b>🧩 Use MassGen as a Skill:</b> <code>npx skills add massgen/skills --all</code>, then invoke the skill in Claude Code, Cursor, Copilot, or 40+ other agents. <a href="https://github.com/massgen/skills">Learn more →</a>
</p>

<p align="center">
@@ -69,7 +69,7 @@ This project started with the "threads of thought" and "iterative refinement" id
<details open>
<summary><h3>🆕 Latest Features</h3></summary>

- [v0.1.61 Features](#-latest-features-v0161)
- [v0.1.62 Features](#-latest-features-v0162)
</details>

<details open>
@@ -122,15 +122,15 @@ This project started with the "threads of thought" and "iterative refinement" id
<details open>
<summary><h3>🗺️ Roadmap</h3></summary>

- [Recent Achievements (v0.1.61)](#recent-achievements-v0161)
- [Previous Achievements (v0.0.3 - v0.1.60)](#previous-achievements-v003---v0160)
- [Recent Achievements (v0.1.62)](#recent-achievements-v0162)
- [Previous Achievements (v0.0.3 - v0.1.61)](#previous-achievements-v003---v0161)
- [Key Future Enhancements](#key-future-enhancements)
- Bug Fixes & Backend Improvements
- Advanced Agent Collaboration
- Expanded Model, Tool & Agent Integrations
- Improved Performance & Scalability
- Enhanced Developer Experience
- [v0.1.62 Roadmap](#v0162-roadmap)
- [v0.1.63 Roadmap](#v0163-roadmap)
</details>

<details open>
@@ -155,22 +155,24 @@ This project started with the "threads of thought" and "iterative refinement" id

---

## 🆕 Latest Features (v0.1.61)
## 🆕 Latest Features (v0.1.62)

**🎉 Released: March 9, 2026**
**🎉 Released: March 11, 2026**

**What's New in v0.1.61:**
- **🔄 Round Evaluator Paradigm** - New `round_evaluator` subagent type that delegates evaluation to specialized evaluator subagents for deeper quality assessment.
- **📝 Evaluation Improvements** - Improved evaluation prompts with task plan injection for context-aware assessment.
- **🔧 Orchestrator Refactoring** - Major orchestrator refactoring (+1,189 lines) to support the round evaluation workflow.
**What's New in v0.1.62:**
- **🧩 MassGen Skill** - New general-purpose multi-agent skill with 4 modes (general, evaluate, plan, spec) for Claude Code and other AI agents.
- **👁️ Session Viewer** - New `massgen viewer` command for real-time observation of automation sessions with interactive picker and web mode.
- **⚡ Backend & Quickstart** - Claude Code/Codex/Copilot backend improvements, headless and web quickstart modes.

**Try v0.1.61 Features:**
**Try v0.1.62 Features:**
```bash
# Install or upgrade
pip install --upgrade massgen
# Install the MassGen Skill for your AI agent
npx skills add massgen/skills --all
# Then in Claude Code, Cursor, Copilot, etc.:
# /massgen "Your complex task"

# Try the round evaluator paradigm
uv run massgen --config @examples/features/round_evaluator_example.yaml "Create a website for an AI startup with polished visuals and interactive elements"
# Try the Session Viewer
uv run massgen viewer --pick
```

→ [See full release history and examples](massgen/configs/README.md#release-history--examples)
@@ -1242,25 +1244,27 @@ MassGen is currently in its foundational stage, with a focus on parallel, asynch

⚠️ **Early Stage Notice:** As MassGen is in active development, please expect upcoming breaking architecture changes as we continue to refine and improve the system.

### Recent Achievements (v0.1.61)
### Recent Achievements (v0.1.62)

**🎉 Released: March 9, 2026**
**🎉 Released: March 11, 2026**

#### Round Evaluator Paradigm
- **Round Evaluator Subagent Type** ([#986](https://github.com/massgen/MassGen/pull/986)): New `round_evaluator` subagent type that delegates evaluation to specialized evaluator subagents for deeper quality assessment
- **Orchestrator Refactoring**: Major orchestrator refactoring (+1,189 lines) to support the round evaluation workflow
- **New Config**: `round_evaluator_example.yaml` for easy adoption
#### MassGen Skill
- **General-Purpose Skill** ([#992](https://github.com/massgen/MassGen/pull/992)): New multi-agent skill with 4 modes (general, evaluate, plan, spec) for Claude Code and other AI agents
- **Auto-Sync**: GitHub Actions workflow to auto-sync skill to separate repository for easy installation
- **Reference Docs**: Comprehensive workflow guides and prompt templates for each mode

#### Evaluation Improvements
- **Improved Evaluation Prompts** ([#986](https://github.com/massgen/MassGen/pull/986)): Clearer, more actionable feedback with task plan injection
- **Simplified Config**: Simplified config handling for evaluation parameters
- **SUBAGENT.md Generality**: Improved SUBAGENT.md for broader subagent compatibility
#### Session Viewer
- **Viewer Command** ([#992](https://github.com/massgen/MassGen/pull/992)): New `massgen viewer` for real-time observation of automation sessions
- **Interactive Picker**: `--pick` flag for session selection, `--web` for browser-based viewing

#### Fixes
- **Session Resumption** ([#986](https://github.com/massgen/MassGen/pull/986)): Fixed resumption from already-resumed logs
- **Round Evaluation Prompts**: Improved round evaluation prompt clarity
#### Backend & Quickstart
- **Backend Improvements** ([#992](https://github.com/massgen/MassGen/pull/992)): Claude Code background task execution, Codex native filesystem and MCP support, Copilot runtime model discovery
- **Quickstart Modes**: Headless quickstart (`--quickstart --headless`) for CI/CD, web quickstart (`--web-quickstart`) for browser-based setup
- **Evaluation & Planning**: Better planning prompts with thoroughness support, removed should/could criteria

### Previous Achievements (v0.0.3 - v0.1.60)
### Previous Achievements (v0.0.3 - v0.1.61)

✅ **Round Evaluator Paradigm (v0.1.61)**: New round evaluator subagent type that automatically spawns evaluator subagents after each new answer to provide detailed feedback as input to the next round. Major orchestrator refactoring with improved evaluation prompts, task plan injection, and subagent fixes.

✅ **Multimodal Tools, Subagent Enhancements & GPT-5.4 (v0.1.60)**: Rewritten read_media with clearer schema and MediaCallLedgerHook. Subagent enhancements with inherit_spawning_agent_backend, final_answer_strategy, per-agent subagent_agents. GPT-5.4 as default OpenAI flagship. Decomp mode cooperates with checklist workflow. Codex prompt caching fix.

@@ -1527,12 +1531,13 @@ MassGen is currently in its foundational stage, with a focus on parallel, asynch

We welcome community contributions to achieve these goals.

### v0.1.62 Roadmap
### v0.1.63 Roadmap

Version 0.1.62 focuses on improving skill use and exploration:
Version 0.1.63 focuses on adding a Gemini CLI backend and image/video editing capabilities:

#### Planned Features
- **Improve Skill Use and Exploration** ([#873](https://github.com/massgen/MassGen/issues/873)): Local skill execution, skill registry with hierarchical organization, and skill consolidation workflow
- **Gemini CLI Backend** ([#952](https://github.com/massgen/MassGen/issues/952)): Gemini CLI as a first-class backend option
- **Image/Video Edit Capabilities** ([#959](https://github.com/massgen/MassGen/issues/959)): Check and support image/video editing capabilities across providers

---
