Releases: massgen/MassGen
v0.1.64
🚀 Release Highlights — v0.1.64 (2026-03-16)
🔌 Gemini CLI Backend
- Gemini CLI as a native backend: Google's Gemini CLI with subprocess-based streaming
- Session persistence: Multi-turn conversations via CLI session IDs
- MCP tools: Wired through `.gemini/settings.json` with native hook adapter for tool execution
- Docker support: Containerized execution via `gemini_cli_docker.yaml` config
🔍 Execution Trace Analyzer
- New subagent type: Mechanistic analysis of agent execution traces to extract durable learnings
- 7-dimension evaluation: Error learning, effort allocation, approach effectiveness, tool strategy, reasoning patterns, context health, verification completeness
- Output: `process_report.md` (narrative) and `process_verdict.json` (structured scores)
⚡ WebSocket Streaming
- Persistent WebSocket transport: `wss://` connection to OpenAI Response API for real-time event streaming
- Auto-reconnection: Configurable retry logic with exponential backoff
- YAML config: Enable with `websocket_mode: true` on OpenAI backend
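The auto-reconnection bullet above mentions configurable retry logic with exponential backoff. A minimal sketch of how such a delay schedule could be computed is shown below. The parameter names (`base_delay`, `factor`, `max_delay`, `max_retries`) are hypothetical and are not taken from MassGen's configuration.

```python
def backoff_delays(base_delay: float = 1.0, factor: float = 2.0,
                   max_delay: float = 30.0, max_retries: int = 5) -> list:
    """Return the wait in seconds before each reconnection attempt.

    All parameter names and defaults are illustrative assumptions.
    """
    delays = []
    for attempt in range(max_retries):
        # The delay grows geometrically with each attempt and is capped at max_delay.
        delays.append(min(base_delay * factor ** attempt, max_delay))
    return delays


if __name__ == "__main__":
    print(backoff_delays())
```

Capping the delay keeps a long outage from pushing the wait between attempts arbitrarily high; real clients often add random jitter as well, which this sketch omits.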
🐳 Copilot Docker Mode
- Containerized tool execution: `command_line_execution_mode: "docker"` for Copilot backend
- Configuration: Docker sudo support, network mode selection (bridge/host)
✅ Fixes
- Response API duplicates: Prevent duplicate item errors in recursive tool loops (#1000)
📖 Getting Started
- Quick Start Guide
- Try It:
    pip install massgen==0.1.64
    # Try the Gemini CLI backend
    uv run massgen --config @examples/providers/gemini/gemini_cli_local "Explain quantum computing"
What's Changed
- Trace eval sub by @ncrispino in #1002
- feat: add websocket_mode for responses api by @praneeth999 in #990
- fix: Response API duplicate item errors and Windows MCP server crashes by @db-ol in #1000
- feat: Gemini cli clean by @ncrispino in #999
- docs: docs for v0.1.64 by @Henry-811 in #1003
- feat: v0.1.64 by @Henry-811 in #998
Full Changelog: v0.1.63...v0.1.64
v0.1.63
🚀 Release Highlights — v0.1.63 (2026-03-13)
🎯 Ensemble Pattern
- Ensemble defaults for subagents: `disable_injection` and `defer_voting_until_all_answered` now default to true, so subagents work independently before voting for more diverse, higher-quality results
- Automatic ensemble orchestration: Defaults apply when spawning subagent orchestrators without explicit override
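The "defaults apply without explicit override" behavior can be illustrated with a small merge helper. This is only a sketch: the two option names come from the notes above, while the helper `with_ensemble_defaults` and the flat dict layout are assumptions, not MassGen's actual code.

```python
# Option names are from the release notes; the dict layout is assumed.
ENSEMBLE_DEFAULTS = {
    "disable_injection": True,
    "defer_voting_until_all_answered": True,
}


def with_ensemble_defaults(subagent_config: dict) -> dict:
    """Fill in the ensemble defaults while keeping any explicit override."""
    merged = dict(ENSEMBLE_DEFAULTS)
    merged.update(subagent_config)  # explicit values win over defaults
    return merged


if __name__ == "__main__":
    print(with_ensemble_defaults({}))
    print(with_ensemble_defaults({"disable_injection": False}))
```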
🔄 Round Evaluator Improvements
- Transformation pressure: Evaluator pushes agents toward meaningful structural changes rather than surface-level edits
- Success contracts: Explicit quality gates agents must satisfy before the round evaluator allows convergence
- Verification replay: Evaluation consistency across rounds via replayed verification context
⚡ Lighter Refinement
- Reduced subagent overhead: Lighter refinement prompts for subagent workflows cut token usage and latency
- Killed agent handling: Graceful management of agents that time out or fail mid-round
✅ Fixes
- Timeout fallback: More robust coordination when agents hit timeout boundaries
📖 Getting Started
- Quick Start Guide
- Try It:
    pip install massgen==0.1.63
    # Try the round evaluator with ensemble defaults
    uv run massgen --config @examples/features/round_evaluator_example.yaml "Create a polished landing page for an AI product"
What's Changed
- feat: Better subagent contracts; lighter refinement for subagents too by @ncrispino in #996
- docs: docs for v0.1.63 by @Henry-811 in #997
- feat: v0.1.63 by @Henry-811 in #995
Full Changelog: v0.1.62...v0.1.63
v0.1.62
🚀 Release Highlights — v0.1.62 (2026-03-11)
🧩 MassGen Skill
- Multi-agent collaboration as a skill: Install with `npx skills add massgen/skills --all` and use MassGen directly from Claude Code, Cursor, Copilot, and 40+ other AI agents
- Four modes: General (any task), Evaluate (critique existing work), Plan (structured project plans), Spec (requirements specifications)
- Auto-distributed: Skill automatically syncs to a dedicated repository for easy installation
👁️ Session Viewer
- Watch automation runs in real-time: New `massgen viewer` command opens a TUI to observe running or completed sessions
- Session picker: `--pick` flag for browsing and selecting specific sessions, `--web` for browser-based viewing
⚡ Backend & Quickstart Improvements
- Claude Code backend: Background task execution and native MCP support via the SDK
- Codex backend: Native filesystem access and MCP tool integration
- Copilot backend: Runtime model discovery with automatic capability detection
- Headless quickstart: Non-interactive setup via `--quickstart --headless` for CI/CD pipelines
- Web quickstart: Browser-based setup via `--web-quickstart`
✅ Fixes
- Evaluation criteria: Removed should/could criteria that caused agents to produce overly similar outputs
- Planning prompts: Improved planning prompts with configurable thoroughness levels
📖 Getting Started
- Quick Start Guide
- Try It:
    # Install the MassGen Skill for your AI agent
    npx skills add massgen/skills --all
    # Then use MassGen from Claude Code, Cursor, Copilot, etc.

    # Or install MassGen directly and try the Session Viewer
    pip install massgen==0.1.62
    uv run massgen viewer --pick
What's Changed
- feat: MassGen skill by @ncrispino in #992
- docs: docs for v0.1.62 by @Henry-811 in #993
- feat: v0.1.62 by @Henry-811 in #991
Full Changelog: v0.1.61...v0.1.62
v0.1.61
🚀 Release Highlights — v0.1.61 (2026-03-09)
🔄 Round Evaluator Paradigm
- Automatic post-answer evaluation: New `round_evaluator` subagent type that automatically spawns evaluator subagents after each new answer, feeding detailed feedback into the next round
- Configurable evaluation flow: Control whether evaluation runs before or after checklist grading, whether to skip synthesis, and whether evaluators refine their feedback
- Example config: New `round_evaluator_example.yaml`, in which one agent builds while three agents evaluate in parallel
📝 Evaluation & Prompt Improvements
- Task plan injection: Evaluation prompts now include the current task plan for context-aware quality assessment
- Clearer evaluation prompts: Rewritten round evaluation prompts for more actionable, focused feedback
✅ Fixes
- Session resumption: Fixed crash when resuming from an already-resumed log
- Timeout fallback: When coordination times out, the latest answer is used directly without an extra final presentation step
- Subagent compatibility: Improved SUBAGENT.md template for broader subagent type support
- Codex backend: Added Codex backend support for new orchestrator features
📖 Getting Started
- Quick Start Guide
- Try It:
    # Install or upgrade to v0.1.61
    pip install --upgrade massgen
    # One agent builds, 3 agents evaluate: round evaluator in action
    uv run massgen --config @examples/features/round_evaluator_example.yaml "Create a website about a fictional AI product that is visually stunning and has at least one unique interactive element"
What's Changed
- feat: Add round evaluator paradigm by @ncrispino in #986
- docs: docs for v0.1.61 by @Henry-811 in #987
- feat: v0.1.61 by @Henry-811 in #985
Full Changelog: v0.1.60...v0.1.61
v0.1.60
🚀 Release Highlights — v0.1.60 (2026-03-06)
🛠️ Multimodal Tool Improvements
- Rewritten `read_media`: Clearer tool schema, better error handling, and improved naming for more reliable media understanding
- Media Call Ledger: Automatic tracking of all `read_media` and `generate_media` calls for observability
🤖 Subagent Enhancements
- Backend Inheritance: New `inherit_spawning_agent_backend` option lets subagents automatically use the same backend as the agent that spawned them
- Final Answer Strategy: New `final_answer_strategy` option controls how subagent orchestrators select the final answer (reuse winner, have winner present, or synthesize)
- Per-Agent Subagent Configs: Each agent can now define its own `subagent_agents` override for fine-grained subagent control
🧠 GPT-5.4 Support
- New default OpenAI flagship: GPT-5.4 added to the model registry, ready to use across all coordination modes
🔄 Decomposition + Checklist Cooperation
- Unified Quality Workflow: Decomposition mode now cooperates with the checklist workflow, enabling quality-gated subtask iteration
- Faster Verification Rounds: Improved prompts for verification replay, reducing verification round time
✅ Fixes
- Checklist & Prompt Injection: More reliable checklist behavior with improved proposal injection and system prompt refocused on entire output quality
- Codex Pricing Accuracy: Fixed prompt caching calculation for correct cost tracking
- Task Plan Refresh: Fixed plan refresh during quality rounds
- Skill Prefix Handling: Fixed edge cases in skill prefix resolution
📖 Getting Started
- Quick Start Guide
- Try It:
    # Install or upgrade to v0.1.60
    pip install --upgrade massgen
    # Choose backend 'openai' with model 'gpt-5.4' in the setup wizard to start using GPT-5.4
    uv run massgen --quickstart
What's Changed
- feat: Improve verification time by @ncrispino in #978
- docs: docs for v0.1.60 by @Henry-811 in #979
- feat: v0.1.60 by @Henry-811 in #974
Full Changelog: v0.1.59...v0.1.60
v0.1.59
🚀 Release Highlights — v0.1.59 (2026-03-04)
🔄 Smarter Quality Rounds
- Verification Replay Memories: Agents save replayable verification steps (commands, scripts, artifacts) to `verification_latest.md`, auto-injected into future rounds so the next agent can replay the exact verification pipeline
- Plan-Tracked Improvements: Improvements from each round are auto-added to the task plan, so agents build on prior progress instead of starting fresh
- Enhanced Plan Review: More thorough quality evaluation during plan review phases
✅ Checklist & Evaluation Fixes
- More Accurate Evaluations: Improved evaluation generation config for higher-quality assessments
- Consistent Checklist Behavior: Fixed checklist handling across rounds
- Gemini MCP Compatibility: Tool name normalization for Gemini backends using MCP tools
🔧 Infrastructure
- Subagent Enhancements: Improved coordination, task delegation, and Docker skill write access
- Video Generation Fixes: Cleaner error handling (no silent fallback to animated), restored impact metrics
- Bug Fixes: Answer anonymization fix, quickstart updates
📖 Getting Started
- Quick Start Guide
- Try It:
    # Install or upgrade to v0.1.59
    pip install --upgrade massgen
    # Try checklist-gated quality rounds with verification replay
    uv run massgen --config @examples/features/subagent_checklist.yaml \
      "Create a website for an AI company selling a creative sci-fi style product. Ensure polished visuals and cool interactive elements"
What's Changed
- feat: Improve quality rounds by @ncrispino in #969
- docs: docs for v0.1.59 by @Henry-811 in #970
- feat: v0.1.59 by @Henry-811 in #968
Full Changelog: v0.1.58...v0.1.59
v0.1.58
🚀 Release Highlights — v0.1.58 (2026-03-02)
🖼️ Comprehensive Multimodal Revamp
- New Media Providers: ElevenLabs (TTS/STT), Nano Banana 2 (default image gen), and Grok Imagine (image/video) join existing providers
- Media Generation Skills: Reusable skills for image, video, and audio generation workflows
- Multi-Turn Image Editing: Iterative image editing for supported providers — agents can refine images across rounds
🟢 Nvidia NIM Backend
- NVIDIA Inference Microservices: First-class provider for NVIDIA-hosted models via NIM API
🔍 Quality Rethinking Subagent
- Per-Element Craft Improvements: New `quality_rethinking` subagent type that targets specific elements for refinement
- Improve/Preserve Checklists: Checklists now explicitly separate what to improve vs. what to preserve
🖥️ CLI Mode Flags
- New Flags: `--quick`, `--single-agent`, `--coordination-mode`, `--personas` flags mirror TUI toggles
- Plan Mode from CLI: Start plan mode directly from the command line
🔧 Infrastructure
- Logging Refactor: Fixed concurrent logging for parallel multi-agent execution — each agent gets isolated log context
- Subagent Hardening: Better error handling for malformed inputs and repeated tool calls
- Evaluation Criteria Defaults: Sensible defaults when criteria are not explicitly specified
📖 Getting Started
- Quick Start Guide
- Try It:
    # Install or upgrade to v0.1.58
    pip install --upgrade massgen
    # Try checklist-driven refinement with quality rethinking
    uv run massgen --config @examples/features/subagent_checklist.yaml \
      "Create a website for an AI company selling a creative sci-fi style product. Ensure polished visuals and cool interactive elements"
What's Changed
- feat(backend): add Nvidia NIM as a first-class backend provider by @AbhimanyuAryan in #962
- feat: Improve coordination with improvements; improve and expand multimedia generation by @ncrispino in #964
- docs: docs for v0.1.58 by @Henry-811 in #965
- feat: v0.1.58 by @Henry-811 in #957
Full Changelog: v0.1.57...v0.1.58
v0.1.57
🚀 Release Highlights — v0.1.57 (2026-02-27)
🔗 Subagent Delegation Protocol
- Container-to-Host Spawning: File-based delegation via `SubagentLaunchWatcher` with atomic JSON request/response exchange and workspace path validation
- Auto-Mounted Parent Workspace: Subagents get parent workspace (read-only) by default
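The notes above describe file-based delegation with atomic JSON exchange and workspace path validation. The sketch below shows one common way to get both properties in Python. It is not the `SubagentLaunchWatcher` implementation: the helper names `write_json_atomic` and `is_inside`, and the file name `request.json`, are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path: Path, payload: dict) -> None:
    """Write JSON to a temp file in the same directory, then rename it into place."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
        # os.replace swaps the file in one step, so a reader polling the
        # directory sees either the old file or the complete new one.
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def is_inside(workspace: Path, candidate: Path) -> bool:
    """Return True if candidate resolves to a location inside workspace."""
    return candidate.resolve().is_relative_to(workspace.resolve())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp)
        request = workspace / "request.json"  # hypothetical file name
        write_json_atomic(request, {"task": "demo"})
        print(is_inside(workspace, request))
        print(is_inside(workspace, workspace / ".." / "outside.json"))
```

Resolving both paths before comparing them is what rejects `..` segments and symlinks that would otherwise escape the workspace. `Path.is_relative_to` requires Python 3.9 or later.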
🏗️ Builder Subagent
- Fresh-Context Artifact Generation: New subagent type for transformative redesigns and complex multi-file rewrites with prescriptive specs
📊 Smarter Convergence
- Substantiveness Tracking: Planned changes classified as transformative/structural/incremental — triggers builder or novelty subagent accordingly
- Diagnostic Report Gating: Optional quality gate — agents must submit a structured diagnostic report before checklist passes
- Per-Agent Checklist Scoring: Evaluate multiple agents separately with automatic format detection
🔧 Infrastructure
- Claude Code Reasoning: Unified `reasoning` config (type, effort, budget_tokens) replacing deprecated `max_thinking_tokens`
- Bug Fixes: Fixed Codex subagent spawning, subagent synchronization/timeout handling, temp workspace directories
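For configs that still carry the deprecated `max_thinking_tokens` key, a migration to the unified `reasoning` block might look like the sketch below. Only the key names `reasoning`, `type`, `budget_tokens`, and `max_thinking_tokens` come from the notes above. The helper `migrate_reasoning`, the flat dict layout, and the value `"enabled"` for `type` are assumptions made for illustration.

```python
def migrate_reasoning(backend_config: dict) -> dict:
    """Move a legacy max_thinking_tokens value into a reasoning block.

    Hypothetical helper: the real config layout may differ.
    """
    migrated = dict(backend_config)
    legacy = migrated.pop("max_thinking_tokens", None)
    if legacy is not None and "reasoning" not in migrated:
        # "enabled" is an assumed value for the type field.
        migrated["reasoning"] = {"type": "enabled", "budget_tokens": legacy}
    return migrated


if __name__ == "__main__":
    print(migrate_reasoning({"max_thinking_tokens": 2048}))
```

An existing `reasoning` block is left untouched, so an explicit new-style setting takes precedence over the legacy key.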
📖 Getting Started
- Quick Start Guide
- Try It:
    # Install or upgrade to v0.1.57
    pip install --upgrade massgen
    # Try builder subagent with checklist-driven refinement
    uv run massgen --config @examples/features/subagent_checklist.yaml \
      "Create a website for an AI company selling a creative sci-fi style product. Ensure polished visuals and cool interactive elements"
What's Changed
- feat: Improve subagent calling, eval criteria by @ncrispino in #955
- docs: docs for v0.1.57 by @Henry-811 in #956
- feat: v0.1.57 by @Henry-811 in #947
Full Changelog: v0.1.56...v0.1.57
v0.1.56
🚀 Release Highlights — v0.1.56 (2026-02-25)
📋 Spec Plan Mode
- Formal Requirements Before Execution: `plan_mode="spec"` adds spec creation, an approval modal, and an execution pipeline with changedoc integration. TUI spec mode (Shift+Tab twice to enter)
🎯 ask_others Targeted Messaging
- Focused Agent-to-Agent Communication: `target_agents` parameter for directed queries instead of broadcast, with per-target validation and response counting
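The difference between broadcast and directed queries, along with per-target validation and response counting, can be sketched as follows. Only the `target_agents` parameter name comes from the notes above. The function `route_question`, its other arguments, and the returned dict are hypothetical and do not reflect the actual `ask_others` signature.

```python
from typing import Optional


def route_question(question: str, known_agents: set,
                   target_agents: Optional[list] = None) -> dict:
    """Pick recipients for a question: everyone (broadcast) or a validated subset."""
    if target_agents is None:
        recipients = sorted(known_agents)  # broadcast to all agents
    else:
        # Per-target validation: reject names that are not known agents.
        unknown = [name for name in target_agents if name not in known_agents]
        if unknown:
            raise ValueError(f"unknown target agents: {unknown}")
        recipients = list(dict.fromkeys(target_agents))  # drop duplicates, keep order
    # Response counting: the caller knows how many replies to wait for.
    return {"question": question, "recipients": recipients,
            "expected_responses": len(recipients)}


if __name__ == "__main__":
    agents = {"agent_a", "agent_b", "agent_c"}
    print(route_question("Is the layout responsive?", agents, ["agent_b"]))
```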
🔧 Infrastructure
- Critic Subagent: New subagent type for honest quality assessment — detects genuine vs incremental improvement, quality ceiling. Enhanced novelty subagent guidance for growth-oriented refinement
- read_media Conversation Continuity: Follow-up image conversations via `continue_from` conversation_id
- Codex OAuth Login Fix: Codex backend always available in WebUI regardless of OPENAI_API_KEY
- Docker Configuration Mounting: Claude and Codex configuration mounting for Docker containers
📖 Getting Started
- Quick Start Guide
- Try It:
    # Install or upgrade to v0.1.56
    pip install --upgrade massgen
    # Launch MassGen, then press Shift+Tab twice to enter 'spec' mode
    uv run massgen
What's Changed
- fix: adding codex for OAuth login by @MuL1ian in #937
- feat: Add spec mode by @ncrispino in #945
- docs: docs for v0.1.56 by @Henry-811 in #946
- feat: v0.1.56 by @Henry-811 in #944
Full Changelog: v0.1.55...v0.1.56
v0.1.55
🚀 Release Highlights — v0.1.55 (2026-02-23)
🧩 Specialized Subagent Types
- Discovery-Based Roles: Specialized subagent roles via `SUBAGENT.md` frontmatter: evaluator (programmatic verification), explorer (investigation), researcher (deep analysis), novelty (breaks refinement plateaus)
- TUI Visualization: Subagent roles displayed in agent status indicators
📊 Dynamic Evaluation Criteria
- Task-Specific Quality Gates: GEPA-inspired evaluation criteria generation replacing static checklist items, with domain-specific presets (persona, decomposition, evaluation, prompt, analysis)
- Core/Stretch Convergence: Items categorized as core or stretch; the convergence off-ramp triggers when all core items pass. Score scale 0-10. Config: `evaluation_criteria_generator`
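The core/stretch off-ramp described above can be sketched as a simple check over scored criteria. The 0-10 scale and the core/stretch split come from the notes. The `CriterionResult` type, the `can_converge` helper, and the passing threshold of 7 are assumptions, since the notes do not say what score counts as a pass.

```python
from dataclasses import dataclass


@dataclass
class CriterionResult:
    name: str
    category: str  # "core" or "stretch"
    score: int     # 0-10 scale


def can_converge(results: list, pass_score: int = 7) -> bool:
    """Off-ramp: converge once every core criterion reaches pass_score.

    Stretch criteria are ignored here; pass_score is an assumed threshold.
    """
    core = [r for r in results if r.category == "core"]
    return bool(core) and all(r.score >= pass_score for r in core)


if __name__ == "__main__":
    results = [
        CriterionResult("renders without errors", "core", 9),
        CriterionResult("meets stated requirements", "core", 8),
        CriterionResult("distinctive visual identity", "stretch", 4),
    ]
    print(can_converge(results))
```

In this sketch a low stretch score never blocks convergence, which is the point of the split: stretch items can drive further refinement without holding the run open indefinitely.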
🔧 Infrastructure
- Native Backend Image Routing: `understand_image` routes to agent's own backend (Claude, Gemini, Grok, Claude Code, Codex) with OpenAI fallback
- Configurable Video Frame Extraction: Scene-based (PySceneDetect) or uniform modes with `max_frames` cost guardrail (default 30, max 60). Config: `multimodal_config.video`
- Remotion Skill: Video generation/editing skill available in quickstart wizard
- Unified Pre-Collaboration: Persona generation, decomposition, and eval criteria generation unified as composable primitives
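To make the `max_frames` guardrail for uniform frame extraction concrete, here is a minimal sketch of evenly spaced sampling. The default of 30 and ceiling of 60 come from the notes above. The function `uniform_frame_indices` and its clamping behavior are assumptions, and the scene-based (PySceneDetect) mode is not covered.

```python
def uniform_frame_indices(total_frames: int, max_frames: int = 30) -> list:
    """Pick up to max_frames evenly spaced frame indices from a video."""
    # Clamp the request to the 1..60 range (60 is the ceiling in the notes).
    max_frames = max(1, min(max_frames, 60))
    if total_frames <= max_frames:
        return list(range(total_frames))  # short clip: keep every frame
    step = total_frames / max_frames
    return [int(i * step) for i in range(max_frames)]


if __name__ == "__main__":
    indices = uniform_frame_indices(900)
    print(len(indices), indices[:3])
```

Bounding the frame count bounds the number of images sent to the vision model, which is why the notes call it a cost guardrail.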
📖 Getting Started
- Quick Start Guide
- Try It:
    # Install or upgrade to v0.1.55
    pip install --upgrade massgen
    # Multi-agent coordination with specialized subagents
    uv run massgen --config massgen/configs/features/background_subagent_example.yaml --cwd-context ro "Use an explorer subagent to analyze this repo"
What's Changed
- feat: Add subagent roles by @ncrispino in #938
- docs: docs for v0.1.55 by @Henry-811 in #939
- feat: v0.1.55 by @Henry-811 in #935
Full Changelog: v0.1.54...v0.1.55