Merged
12 changes: 12 additions & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
@@ -28,6 +28,10 @@ When planning or creating specs, use AskUserQuestions to ensure you align with t

**IMPORTANT**: When the user asks you to check logs from a MassGen run, assume they ran with the current uncommitted changes unless they explicitly say otherwise. Do NOT assume "the run used an older commit" just because the execution_metadata.yaml shows a different git commit - the user likely ran with local modifications after you suggested changes. Always debug the actual code behavior first.

## Implementation Guidelines

After implementing any feature that involves passing parameters through multiple layers (e.g., backend → manager → component), always verify the full wiring chain end-to-end by tracing the parameter from its origin to its final usage site. Do not rely solely on unit tests passing — add an integration smoke test or assertion that the parameter actually arrives at its destination, not just that the downstream logic works when the parameter is provided.
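
A minimal sketch of such a wiring check, assuming three hypothetical layers (`Backend`, `Manager`, `Component`) that stand in for the real classes and are not MassGen's actual API. It asserts on the value at the final usage site rather than on downstream behavior:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Component:
    """Final usage site of the parameter (hypothetical)."""

    write_mode: Optional[str] = None


class Manager:
    """Middle layer (hypothetical): must forward write_mode or the chain breaks."""

    def __init__(self, write_mode: Optional[str] = None) -> None:
        self.component = Component(write_mode=write_mode)


class Backend:
    """Origin of the parameter (hypothetical): reads it from keyword arguments."""

    def __init__(self, **kwargs) -> None:
        self.manager = Manager(write_mode=kwargs.get("write_mode"))


def smoke_test_write_mode_wiring() -> str:
    """Trace the value from its origin to the final usage site."""
    backend = Backend(write_mode="worktree")
    arrived = backend.manager.component.write_mode
    assert arrived == "worktree", f"write_mode lost in wiring: got {arrived!r}"
    return arrived
```

A unit test that constructs `Component(write_mode="worktree")` directly would pass even if `Manager` forgot to forward the argument, which is the gap this kind of check is meant to close.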

## Project Overview

MassGen is a multi-agent system that coordinates multiple AI agents to solve complex tasks through parallel processing, intelligence sharing, and consensus building. Agents work simultaneously, observe each other's progress, and vote to converge on the best solution.
@@ -97,6 +101,13 @@ base.py (abstract interface)
└── grok.py (xAI)
```

### Agent Statelessness and Anonymity
Agents are STATELESS and ANONYMOUS across coordination rounds. Each round:
- Agent gets a fresh LLM invocation with no memory of previous rounds
- Agent does not know which agent it is (all identities are anonymous)
- Cross-agent information (answers, workspaces) is presented anonymously
- System prompts and branch names must NOT reveal agent identity or round history

### TUI Design Principles

**Timeline Chronology Rule**: Tool batching MUST respect chronological order. Tools should ONLY be batched when they arrive consecutively with no intervening content (thinking, text, status). When non-tool content arrives, any pending batch must be finalized before the content is added, and the next tool starts a fresh batch.
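
The rule can be sketched as a single pass over the event stream. This is an illustration only: the `(kind, payload)` event shape and the `build_timeline` name are assumptions, not the TUI's real data model, and a lone tool is represented here as a batch of one.

```python
from typing import List, Tuple, Union

# Hypothetical event shape: kind is "tool", "thinking", "text", or "status".
Event = Tuple[str, str]
TimelineItem = Union[Event, List[str]]


def build_timeline(events: List[Event]) -> List[TimelineItem]:
    """Batch only consecutive tool events; finalize the batch on any other content."""
    timeline: List[TimelineItem] = []
    batch: List[str] = []
    for kind, payload in events:
        if kind == "tool":
            batch.append(payload)
            continue
        if batch:
            # Non-tool content arrived: close the pending batch first.
            timeline.append(batch)
            batch = []
        timeline.append((kind, payload))
    if batch:
        timeline.append(batch)
    return timeline
```

For the stream `tool, tool, text, tool`, this yields one batch of two tools, the text item, and then a fresh batch holding the last tool.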
@@ -292,6 +303,7 @@ Detailed documentation for specific modules lives in `docs/modules/`. **Always c

- `docs/modules/subagents.md` - Subagent spawning, logging architecture, TUI integration
- `docs/modules/interactive_mode.md` - Interactive mode architecture, launch_run MCP, system prompts, project workspace
- `docs/modules/worktrees.md` - Worktree lifecycle, branch naming, scratch archives, system prompt integration

## MassGen Skills

163 changes: 163 additions & 0 deletions docs/modules/worktrees.md
@@ -0,0 +1,163 @@
# Worktrees Module

## Overview

When `write_mode` is enabled, agents work in git worktrees — isolated checkouts of the user's project. Each coordination round, every agent gets a fresh worktree with its own branch. Branches are preserved across rounds for cross-agent visibility, and scratch files are archived for continuity.

## Lifecycle

```
Round 1                          Round 2                           Final Presentation
─────────────────────────────    ──────────────────────────────    ─────────────────────────
agent1: massgen/a1b2c3d4         agent1: massgen/e5f6a7b8          presenter: presenter
agent2: massgen/9c0d1e2f         agent2: massgen/3a4b5c6d          (based on winner's branch)
        │                                │
        ▼                                ▼
cleanup_round()                  cleanup_round()
├─ auto-commit changes           ├─ auto-commit changes
├─ archive scratch → agent1/     ├─ archive scratch → agent1/
├─ remove worktree               ├─ remove worktree
└─ keep branch                   └─ keep branch
                                                                   cleanup_session()
                                                                   └─ delete all branches
```
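
The lifecycle above can be modeled without git at all. The sketch below is an in-memory illustration under stated assumptions: `BranchLifecycle` and its methods are hypothetical names that mirror the diagram, not the real `IsolationContextManager` API.

```python
import secrets
from typing import Dict, Optional, Set


def _random_name() -> str:
    # 4 random bytes render as 8 hex characters.
    return f"massgen/{secrets.token_hex(4)}"


class BranchLifecycle:
    """In-memory model of the one-branch-per-agent invariant (hypothetical)."""

    def __init__(self) -> None:
        self.branches: Set[str] = set()    # branches that currently exist
        self.current: Dict[str, str] = {}  # agent id -> its live branch

    def start_round(self, agent_id: str) -> str:
        previous: Optional[str] = self.current.get(agent_id)
        if previous is not None:
            # Mirrors previous_branch: the old branch is deleted on init.
            self.branches.discard(previous)
        branch = _random_name()
        while branch == previous or branch in self.branches:
            branch = _random_name()
        self.branches.add(branch)
        self.current[agent_id] = branch
        return branch

    def cleanup_round(self, agent_id: str) -> str:
        # The worktree is removed, but the branch stays visible to other agents.
        return self.current[agent_id]

    def cleanup_session(self) -> None:
        # End of session: every remaining branch is deleted.
        self.branches.clear()
        self.current.clear()
```

Calling `start_round` twice for the same agent leaves exactly one live branch for it, while other agents' branches are untouched until `cleanup_session()`.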

## Branch Naming

| Context | Branch Name | Example |
|---------|------------|---------|
| Regular rounds | `massgen/{8-char hex}` | `massgen/f028d1c7` |
| Final presentation | `branch_label` param (explicit) | `presenter` |
| No `branch_label` | Random hex suffix | `massgen/a1b2c3d4` |

Branch names are intentionally short and anonymous. They do NOT contain agent IDs, round numbers, or session IDs.
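
One way to produce and check names of this shape, as a sketch: `make_branch_name` and `is_anonymous_branch` are hypothetical helpers, and the use of `secrets.token_hex` is an assumption about how the random suffix could be generated, not a statement about the actual implementation.

```python
import re
import secrets
from typing import Optional

# Matches the documented pattern: massgen/ followed by 8 lowercase hex characters.
_ANON_BRANCH = re.compile(r"massgen/[0-9a-f]{8}")


def make_branch_name(branch_label: Optional[str] = None) -> str:
    """Return the explicit label if given, else an anonymous random name."""
    if branch_label:
        return branch_label
    return f"massgen/{secrets.token_hex(4)}"  # 4 random bytes -> 8 hex chars


def is_anonymous_branch(name: str) -> bool:
    """True when the name carries no agent ID, round number, or session ID."""
    return _ANON_BRANCH.fullmatch(name) is not None
```

A check like `is_anonymous_branch` is convenient in tests that guard against identity leaking into branch names.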

### Why not `agent1`, `agent2` as branch names?

An agent's branch gets deleted when it starts a new round (`previous_branch` mechanism). If agent1's branch were named `agent1` in round 1, then in round 2 that branch gets deleted and recreated — meaning other agents lose the reference mid-session. Short random names avoid this collision.

Instead, the **system prompt** maps other agents' branches to readable labels:

```
Other agents' branches:
- agent1: `massgen/f028d1c7`
- agent2: `massgen/a1b2c3d4`
```

## Scratch Directory

Each worktree gets a `.massgen_scratch/` directory:

- Git-excluded (via `info/exclude` in the **common** git dir)
- For experiments, eval scripts, notes
- Invisible to `git status`, `git diff`, and reviewers
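
Linked worktrees share the repository's common git directory (the path printed by `git rev-parse --git-common-dir`), so a single `info/exclude` entry there covers every worktree. A minimal sketch of registering the pattern idempotently, assuming the caller already knows that directory; `exclude_scratch` is a hypothetical helper name:

```python
from pathlib import Path

SCRATCH_PATTERN = ".massgen_scratch/"


def exclude_scratch(common_git_dir: Path, pattern: str = SCRATCH_PATTERN) -> bool:
    """Add pattern to <common_git_dir>/info/exclude; return True if the file changed."""
    exclude_file = common_git_dir / "info" / "exclude"
    exclude_file.parent.mkdir(parents=True, exist_ok=True)
    existing = exclude_file.read_text().splitlines() if exclude_file.exists() else []
    if pattern in existing:
        return False  # already excluded, nothing to do
    exclude_file.write_text("\n".join(existing + [pattern]) + "\n")
    return True
```

Unlike `.gitignore`, `info/exclude` is never committed, so the exclusion does not show up in the agent's diff.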

### Scratch Archive

On `cleanup_round()`, scratch files are moved to the workspace:

```
{workspace}/.scratch_archive/
├── agent1/          # From round N (named by archive_label)
│   └── notes.md
└── agent2/
    └── eval.py
```

The `archive_label` parameter on `move_scratch_to_workspace()` controls the directory name. The orchestrator passes the anonymous agent ID (e.g. `agent1`), making archives human-readable.

Without `archive_label`, the directory name falls back to the hex suffix from the branch name.
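
The naming rule can be sketched as follows. `archive_scratch` is a hypothetical stand-in for `move_scratch_to_workspace()`, whose real signature is not shown here, and replacing an existing archive with the same label is an assumption made for the example.

```python
import shutil
from pathlib import Path
from typing import Optional


def archive_scratch(
    worktree: Path,
    workspace: Path,
    branch_name: str,
    archive_label: Optional[str] = None,
) -> Optional[Path]:
    """Move .massgen_scratch/ into {workspace}/.scratch_archive/<label>/."""
    scratch = worktree / ".massgen_scratch"
    if not scratch.is_dir() or not any(scratch.iterdir()):
        return None  # nothing to archive
    # Fall back to the hex suffix, e.g. "massgen/f028d1c7" -> "f028d1c7".
    label = archive_label or branch_name.rsplit("/", 1)[-1]
    dest = workspace / ".scratch_archive" / label
    if dest.exists():
        shutil.rmtree(dest)  # assumption: a newer round replaces the older archive
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(scratch), str(dest))
    return dest
```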

## Key Components

| Component | Location | Purpose |
|-----------|----------|---------|
| `IsolationContextManager` | `massgen/filesystem_manager/_isolation_context_manager.py` | Creates/manages worktrees, scratch dirs, branch lifecycle |
| `WorktreeManager` | `massgen/infrastructure/` | Low-level GitPython wrapper for worktree operations |
| `WorkspaceStructureSection` | `massgen/system_prompt_sections.py` | System prompt section showing branches and workspace info |

## IsolationContextManager Parameters

| Parameter | Type | Used For |
|-----------|------|----------|
| `session_id` | `str` | Not used in branch names (only for logging) |
| `write_mode` | `str` | `"auto"`, `"worktree"`, `"isolated"`, `"legacy"` |
| `workspace_path` | `str` | Where worktrees are created (`{workspace}/.worktree/`) |
| `previous_branch` | `str` | Branch to delete on init (one-branch-per-agent invariant) |
| `base_commit` | `str` | Starting point for worktree (e.g. winner's branch for final pres) |
| `branch_label` | `str` | Explicit branch name override (e.g. `"presenter"`) |
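
The four `write_mode` values, plus the `None` default described in `CoordinationConfig`, amount to a small decision table. The sketch below follows the mode table in the user guide added by this PR; `resolve_strategy` and the returned strategy names are hypothetical, and treating `"worktree"` on a non-git directory as a silent fallback to a shadow copy is one reading of that table's "Error (falls back to shadow)" cell.

```python
from typing import Optional

VALID_WRITE_MODES = {"auto", "worktree", "isolated", "legacy"}


def resolve_strategy(write_mode: Optional[str], is_git_repo: bool) -> str:
    """Map a write_mode value to 'worktree', 'shadow', or 'direct'."""
    if write_mode is None or write_mode == "legacy":
        return "direct"  # no isolation
    if write_mode not in VALID_WRITE_MODES:
        raise ValueError(f"unknown write_mode: {write_mode!r}")
    if write_mode == "isolated":
        return "shadow"
    # "auto" and "worktree" both use a worktree when a git repo is available.
    return "worktree" if is_git_repo else "shadow"
```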

## System Prompt

The `WorkspaceStructureSection` shows agents:

1. **Their branch**: "Your work is on branch `massgen/f028d1c7`. All changes are auto-committed when your turn ends."
2. **Other agents' branches** (with anonymous labels): `agent1: massgen/abc123`
3. **Scratch archive reminder**: "Check `.scratch_archive/` for experiments from prior rounds."

The prompt does NOT reveal which anonymous ID the agent is — maintaining anonymity. The agent sees its branch name (which is random) but doesn't know it corresponds to any particular agent label.
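
A sketch of how such a section could be rendered, using the wording quoted above. `render_branch_section` is a hypothetical function, not the real `WorkspaceStructureSection` interface; note that the agent's own label is never an input, so it cannot leak into the text.

```python
from typing import Dict


def render_branch_section(own_branch: str, other_branches: Dict[str, str]) -> str:
    """Build the branch portion of the prompt from anonymous data only."""
    lines = [
        f"Your work is on branch `{own_branch}`. "
        "All changes are auto-committed when your turn ends.",
    ]
    if other_branches:
        lines.append("Other agents' branches:")
        lines.extend(
            f"- {label}: `{branch}`"
            for label, branch in sorted(other_branches.items())
        )
    lines.append("Check `.scratch_archive/` for experiments from prior rounds.")
    return "\n".join(lines)
```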

## Auto-Commit

`cleanup_round()` auto-commits all uncommitted changes before removing the worktree:

```python
# In _auto_commit_worktree():
repo.git.add("-A")
repo.index.commit("[ROUND] Auto-commit")
```

This ensures the branch contains the agent's actual work even after the worktree is gone. Without this, the branch would still point at its starting commit, with none of the agent's changes, and cross-agent visibility would find nothing.

## Orchestrator Integration

### Regular Rounds (`_stream_agent_execution`)

```python
round_isolation_mgr = IsolationContextManager(
    session_id=f"{self.session_id}-{round_suffix}",
    write_mode=write_mode,
    workspace_path=workspace_path,
    previous_branch=previous_branch,
    # No branch_label — uses short random name
)
```

Other agents' branches are passed to the system prompt as a `Dict[str, str]`:
```python
other_agent_branches = {
    agent_mapping.get(aid, aid): branch  # {"agent1": "massgen/abc123"}
    for aid, branch in self._agent_current_branches.items()
    if aid != agent_id and branch
}
```
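
The same comprehension as a standalone function, to make the filtering explicit. The function name and the sample agent IDs in the usage note are made up for illustration; only the comprehension itself comes from the snippet above.

```python
from typing import Dict, Optional


def build_other_agent_branches(
    agent_id: str,
    agent_current_branches: Dict[str, Optional[str]],
    agent_mapping: Dict[str, str],
) -> Dict[str, str]:
    """Map anonymous labels to branches, skipping the caller and empty branches."""
    return {
        agent_mapping.get(aid, aid): branch
        for aid, branch in agent_current_branches.items()
        if aid != agent_id and branch
    }
```

With branches `{"gpt-a": "massgen/f028d1c7", "claude-b": "massgen/a1b2c3d4", "gemini-c": None}` and a mapping of those IDs to `agent1`, `agent2`, `agent3`, the call for `"gpt-a"` keeps only `agent2`: the caller is excluded and `gemini-c` has no branch yet.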

### Final Presentation

```python
self._isolation_manager = IsolationContextManager(
    session_id=self.session_id,
    write_mode=write_mode,
    workspace_path=workspace_path,
    base_commit=winner_branch,  # Start from winner's work
    branch_label="presenter",  # Explicit readable name
)
```

## Testing

Tests live in `massgen/tests/test_write_mode_scratch.py`. Key test classes:

| Class | Covers |
|-------|--------|
| `TestScratchDirectory` | `.massgen_scratch/` creation, git exclusion, diff filtering |
| `TestScratchArchiveLabel` | `archive_label` naming, fallback to hex suffix |
| `TestBranchLifecycle` | `cleanup_round` keeps branch, `cleanup_session` deletes, `previous_branch` deletion |
| `TestWorkspaceScratchNoContextPaths` | Workspace mode (no context_paths) branch + scratch lifecycle |
| `TestAutoCommitBeforeCleanup` | Auto-commit on cleanup, no-op when clean |
| `TestWorkspaceStructureBranchInfo` | System prompt shows branch name, other branches with labels, scratch archive mention |
| `TestRestartContextBranchInfo` | Branch info in restart context (dict format) |

```bash
uv run pytest massgen/tests/test_write_mode_scratch.py -v
```
100 changes: 100 additions & 0 deletions docs/source/user_guide/agent_workspaces.rst
@@ -0,0 +1,100 @@
Agent Workspaces and Code Isolation
====================================

How agents interact with your project code during MassGen coordination.

write_mode Configuration
-------------------------

The ``write_mode`` option controls how agents interact with your project files::

    orchestrator:
      coordination:
        write_mode: auto  # auto | worktree | isolated | legacy

.. list-table::
   :header-rows: 1

   * - Mode
     - Git repo
     - Non-git directory
   * - ``auto`` (recommended)
     - Git worktree per round
     - Shadow copy with git init
   * - ``worktree``
     - Git worktree per round
     - Error (falls back to shadow)
   * - ``isolated``
     - Shadow copy
     - Shadow copy
   * - ``legacy``
     - Direct writes (no isolation)
     - Direct writes

Per-Round Worktrees
--------------------

Each coordination round, every agent gets a fresh git checkout of your project.
Agents have full read/write access to experiment with the code. Changes during
coordination rounds are tracked on anonymous git branches but not applied to
your project.

Only the final presentation winner's changes go through a review modal
where you approve which files to apply.

**Branch lifecycle:**

- Each agent has exactly one branch alive at a time
- Old branches are deleted when a new round starts for that agent
- Branch names use random suffixes (no agent IDs or round numbers)
- Branches are visible to other agents via ``git branch`` / ``git diff``

Scratch Space
--------------

Inside each worktree, ``.massgen_scratch/`` provides a git-excluded directory
for experiments, evaluation scripts, and notes. Scratch files can import from
the project naturally since they live inside the checkout.

Scratch is archived to ``.scratch_archive/`` in the workspace between rounds,
so it persists in workspace snapshots shared with other agents.

**Key properties:**

- Git-excluded: invisible to ``git status`` and review modals
- Archived between rounds: previous scratch available in workspace
- Shared via snapshots: other agents can see your scratch archive

Agent Statelessness
--------------------

Agents are stateless and anonymous across rounds. Each round is a fresh
invocation with no memory of previous rounds. All cross-agent information
is presented anonymously.

This means:

- Agents don't know which agent they are
- System prompts and branch names don't reveal identity
- Cross-agent answers and workspaces are presented anonymously
- Each round starts fresh from HEAD (no accumulated state)

Migrating from use_two_tier_workspace
---------------------------------------

``use_two_tier_workspace`` is deprecated. Replace::

    # Old
    coordination:
      use_two_tier_workspace: true

    # New
    coordination:
      write_mode: auto

The new ``write_mode: auto`` provides:

- Git worktree isolation (safe experimentation)
- In-worktree scratch space (replaces ``scratch/`` directory)
- Branch-based cross-agent visibility
- Review modal for final presentation changes
17 changes: 11 additions & 6 deletions massgen/agent_config.py
@@ -115,12 +115,16 @@ class CoordinationConfig:
             - injection_strategy: str (default "tool_result") - How to inject results:
                 - "tool_result": Append result to next tool call output
                 - "user_message": Inject as separate user message
-        use_two_tier_workspace: If True, agent workspaces are structured with two directories:
-            - scratch/: Working files, experiments, intermediate results, evaluation scripts
-            - deliverable/: Final outputs to showcase to voters
-            When enabled, git versioning is automatically initialized in the workspace
-            for audit trails and history tracking. Both directories are shared with
-            other agents during voting/coordination phases.
+        use_two_tier_workspace: DEPRECATED - Use write_mode instead.
+            If True, agent workspaces are structured with scratch/ and deliverable/
+            directories. Superseded by write_mode which provides git worktree
+            isolation with in-worktree scratch space.
+        write_mode: Controls how agent file writes are isolated during coordination.
+            - "auto": Automatically detect (worktree for git repos, shadow for non-git)
+            - "worktree": Use git worktrees for isolation (requires git repo)
+            - "isolated": Use shadow repos for full isolation
+            - "legacy": Use direct writes (no isolation, current behavior)
+            - None: Disabled (default, same as "legacy")
     """

     enable_planning_mode: bool = False
@@ -155,6 +159,7 @@ class CoordinationConfig:
     # Async subagent execution configuration
     async_subagents: Optional[Dict[str, Any]] = None # {enabled: bool, injection_strategy: str}
     use_two_tier_workspace: bool = False # Enable scratch/deliverable structure + git versioning
+    write_mode: Optional[str] = None # "auto" | "worktree" | "isolated" | "legacy"

     def __post_init__(self):
         """Validate configuration after initialization."""
1 change: 1 addition & 0 deletions massgen/api_params_handler/_api_params_handler_base.py
@@ -103,6 +103,7 @@ def get_base_excluded_params(self) -> Set[str]:
             # Coordination parameters (handled by orchestrator, not passed to API)
             "vote_only", # Vote-only mode flag for coordination
             "use_two_tier_workspace", # Two-tier workspace (scratch/deliverable) + git versioning
+            "write_mode", # Isolated write context mode (auto/worktree/isolated/legacy)
             # NLIP configuration belongs to MassGen routing, never provider APIs
             "enable_nlip",
             "nlip",
3 changes: 3 additions & 0 deletions massgen/backend/base.py
@@ -202,6 +202,8 @@ def __init__(self, api_key: Optional[str] = None, **kwargs):
                 "session_storage_base": kwargs.get("session_storage_base"),
                 # Two-tier workspace (scratch/deliverable) + git versioning
                 "use_two_tier_workspace": kwargs.get("use_two_tier_workspace", False),
+                # Write mode for worktree isolation (suppresses Docker context mounts)
+                "write_mode": kwargs.get("write_mode"),
             }

             # Create FilesystemManager
@@ -318,6 +320,7 @@ def get_base_excluded_config_params(cls) -> set:
             # Coordination parameters (handled by orchestrator, not passed to API)
             "vote_only", # Vote-only mode flag for coordination
             "use_two_tier_workspace", # Two-tier workspace (scratch/deliverable) + git versioning
+            "write_mode", # Isolated write context mode (auto/worktree/isolated/legacy)
             # Multimodal tools configuration (handled by CustomToolAndMCPBackend)
             "enable_multimodal_tools",
             "multimodal_config",
6 changes: 0 additions & 6 deletions massgen/backend/codex.py
@@ -160,10 +160,6 @@ def __init__(self, api_key: Optional[str] = None, **kwargs):
         # Verify Codex CLI is available (skip in docker mode — resolved inside container)
         if self._docker_execution:
             self._codex_path = "codex"
-            # Auto-enable mounting ~/.codex/ for OAuth tokens
-            if self.filesystem_manager and self.filesystem_manager.docker_manager:
-                self.filesystem_manager.docker_manager.mount_codex_config = True
-            logger.info("Codex backend: docker execution mode — CLI will be resolved inside container")
         else:
             self._codex_path = self._find_codex_cli()
             if not self._codex_path:
@@ -521,8 +517,6 @@ def _write_workspace_config(self) -> None:
             self._write_toml_fallback(config, config_path)

         self._workspace_config_written = True
-        logger.info(f"Wrote Codex workspace config: {config_path}")
-        logger.info(f"Codex workspace config contents: {config}")

     @staticmethod
     def _toml_value(v: Any) -> str: