4 changes: 4 additions & 0 deletions .gitignore
@@ -13,3 +13,7 @@ temp
# Python cache
__pycache__/
*.pyc

.codex
*.json
plan.md
24 changes: 24 additions & 0 deletions README.md
@@ -15,6 +15,7 @@ A Claude Code plugin that provides iterative development with independent AI rev
- **Iteration over Perfection** -- Instead of expecting perfect output in one shot, Humanize leverages continuous feedback loops where issues are caught early and refined incrementally.
- **One Build + One Review** -- Claude implements, Codex independently reviews. No blind spots.
- **Ralph Loop with Swarm Mode** -- Iterative refinement continues until all acceptance criteria are met. Optionally parallelize with Agent Teams.
- **Manager-Driven Scenario Matrix** -- Humanize keeps a machine-readable task graph in `.humanize/rlcr/<timestamp>/scenario-matrix.json`, lets the top-level manager reconcile task state, projects that state back into the Goal Tracker and checkpoint contract, and can nudge a stuck agent toward a narrower recovery path without replacing the single-mainline rule.
- **Begin with the End in Mind** -- Before the loop starts, Humanize verifies that *you* understand the plan you are about to execute. The human must remain the architect. ([Details](docs/usage.md#begin-with-the-end-in-mind))

## How It Works
@@ -25,6 +26,14 @@ A Claude Code plugin that provides iterative development with independent AI rev

The loop has two phases: **Implementation** (Claude works, Codex reviews summaries) and **Code Review** (Codex checks code quality with severity markers). Issues feed back into implementation until resolved.

New-format loops also run a manager orchestration layer designed to stay compatible with legacy loops:

- `scenario-matrix.json` is the machine-native control plane. It stores authoritative task state, dependency edges, task packets, repair-wave clustering, checkpoint/convergence metadata, and oversight signals.
- The top-level manager is the only authoritative scheduler and matrix reconciler. Execution agents implement code, while the manager assigns bounded task packets, ingests feedback, and keeps exactly one current primary objective.
- Review findings first enter a raw backlog, are deduplicated and normalized into grouped issue backlogs, and only become executable tasks when the manager explicitly promotes them.
- `goal-tracker.md` and `round-N-contract.md` remain human-facing compatibility views, but their mutable task sections are now projected from the matrix.
- Oversight interventions such as `nudge`, `reframe`, `split`, or `resequence` only steer the active agent back onto the current task. They do not create multiple mainlines or take over implementation authority.

## Install

```bash
@@ -69,6 +78,20 @@ Requires [codex CLI](https://github.com/openai/codex) for review. See the full [
humanize monitor gemini # Gemini invocations only
```

The RLCR monitor now shows scenario-matrix readiness, the current mainline projection, the current manager checkpoint, convergence state, repair-wave context, and any active oversight action alongside the existing loop status.

6. **Render the current scenario matrix as an HTML dashboard**:
```bash
source <path/to/humanize>/scripts/humanize.sh
humanize matrix # latest local RLCR session
humanize matrix --input tmp.json # explicit matrix/session/state file
humanize matrix --serve # local browser client with refresh
```

`humanize matrix` generates a local HTML snapshot with the current primary objective, supporting window, dependency graph, feedback queues, recent events, and convergence/oversight status.

`humanize matrix --serve` starts a local HTML client on `http://127.0.0.1:<port>/`. Leave that page open and use the in-page `Refresh Snapshot` button instead of reopening freshly generated files.

## Monitor Dashboard

<p align="center">
@@ -78,6 +101,7 @@ Requires [codex CLI](https://github.com/openai/codex) for review. See the full [
## Documentation

- [Usage Guide](docs/usage.md) -- Commands, options, environment variables
- [Scenario Matrix Guide](scenario-matrix.md) -- Manager role, task packets, repair waves, and convergence flow
- [Install for Claude Code](docs/install-for-claude.md) -- Full installation instructions
- [Install for Codex](docs/install-for-codex.md) -- Codex skill runtime setup
- [Install for Kimi](docs/install-for-kimi.md) -- Kimi CLI skill setup
65 changes: 65 additions & 0 deletions docs/usage.md
@@ -11,6 +11,18 @@ Humanize creates an iterative feedback loop with two phases:

The loop continues until all acceptance criteria are met or no issues remain.

New-format RLCR loops also keep a compatibility-first runtime artifact at `.humanize/rlcr/<timestamp>/scenario-matrix.json`. The matrix is the machine-readable control plane: it records seeded plan tasks, dependency edges, manager authority, task packets, repair waves, checkpoint/convergence state, review-driven state changes, and bounded oversight interventions when the active agent appears stuck.

The top-level orchestrating session acts as the **manager**. It is the only authoritative scheduler and matrix reconciler. Execution agents implement code, while the manager reviews progress, ingests findings, and keeps exactly one current `primary objective` plus a bounded supporting window.

Subagents do not receive the full global loop prompt by default. Instead, Humanize projects a **task packet** from the matrix that includes the current primary objective, local task, direct dependencies, downstream impact, allowed scope, success criteria, stop criteria, and explicit out-of-scope boundaries. This is how the runtime avoids "subagent single-player mode" caused by limited LLM context.

Review findings first enter a raw backlog, are deduplicated, and are normalized into grouped issue backlogs before the manager decides whether any of them should become executable repair tasks. Low-value or out-of-bound findings can stay deferred in a watchlist instead of automatically joining the frontier.
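
The backlog flow described above can be sketched in a few lines of shell. This is only an illustration under stated assumptions: the `category|summary` line format and the function names `dedupe_findings` and `group_findings` are hypothetical and are not taken from Humanize's actual matrix schema. Promotion into executable tasks is deliberately left out, since that remains an explicit manager decision.

```bash
# Hypothetical sketch: the "category|summary" line format and both function
# names are assumptions for illustration, not Humanize's real schema.

# Deduplicate raw findings: exact-duplicate lines collapse to one.
dedupe_findings() {
    sort -u
}

# Group deduplicated findings by category, printing "category: count".
group_findings() {
    awk -F'|' '{ count[$1]++ } END { for (c in count) print c ": " count[c] }' | sort
}

# A raw backlog where the same finding was reported twice.
raw_backlog='build|missing include in parser
style|trailing whitespace
build|missing include in parser
build|linker flag not propagated'

printf '%s\n' "$raw_backlog" | dedupe_findings | group_findings
```

Here the duplicate `build` finding is counted once, so the grouped view reports two `build` issues and one `style` issue for the manager to triage.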

`goal-tracker.md` and `round-N-contract.md` remain the human-facing workflow files. Humanize projects matrix state back into them, so the active checkpoint still has exactly one current mainline objective even when several supporting tasks are queued behind it.

Oversight does not replace the executing agent. It only injects bounded corrections such as `nudge`, `reframe`, `split`, `reclassify`, or `resequence` when repeated failures suggest the current method or task framing is unhealthy.

## Begin with the End in Mind

Before the RLCR loop starts any work, Humanize runs a **Plan Understanding Quiz** -- a brief pre-flight check that verifies you genuinely understand the plan you are about to execute.
@@ -64,6 +76,12 @@ The quiz is advisory, not a gate. You always have the option to proceed. But tha
| `/gen-plan --input <draft.md> --output <plan.md>` | Generate structured plan from draft |
| `/refine-plan --input <annotated-plan.md>` | Refine an annotated plan and generate a QA ledger |
| `/ask-codex [question]` | One-shot consultation with Codex |
| `humanize matrix [--input <path>] [--output <path>] [--serve]` | Render a local HTML dashboard or run a refreshable local matrix client |

For scenario-matrix inspection, there are now two modes:

- `humanize matrix` writes a static HTML snapshot next to the current matrix or session.
- `humanize matrix --serve` starts a localhost HTML client. Keep that browser tab open and use the page's `Refresh Snapshot` button to pull the latest matrix view without reopening generated files.

## Command Reference

@@ -227,6 +245,53 @@ for getting a second opinion, reviewing a design, or asking domain-specific ques
Responses are saved to `.humanize/skill/<timestamp>/` with `input.md`, `output.md`,
and `metadata.md` for reference.

### humanize matrix

After sourcing `scripts/humanize.sh`, you can render a scenario-matrix snapshot into a local HTML dashboard:

```bash
source scripts/humanize.sh

humanize matrix
humanize matrix --input .humanize/rlcr/2026-04-01_20-41-00
humanize matrix --input tmp.json --output /tmp/matrix-view.html
```

Input resolution rules:

- No `--input`: use the latest local RLCR session under `.humanize/rlcr/`
- Session directory: resolve that session's `scenario-matrix.json`
- `state.md` / `*-state.md`: follow `scenario_matrix_file` from the state file
- `.json` file: render that matrix file directly
- Project directory: resolve the latest local RLCR session under that project
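
As a rough illustration of these rules, the resolution order might look like the sketch below. The function names `latest_session` and `resolve_matrix_input` are hypothetical and do not come from `scripts/humanize.sh`. The sketch also assumes that timestamped session directory names sort chronologically, and it prints the `scenario_matrix_file` value as written, without resolving relative paths.

```bash
# Hypothetical sketch of the resolution rules; these function names are
# illustrative and are not taken from scripts/humanize.sh.

# Pick the newest session directory under <project>/.humanize/rlcr/.
# Assumes timestamped directory names sort chronologically.
latest_session() {
    local root="$1/.humanize/rlcr" dir latest=""
    for dir in "$root"/*/; do
        [ -d "$dir" ] && latest="${dir%/}"
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}

resolve_matrix_input() {
    local input="${1:-.}" session value
    case "$input" in
        *.json)
            # A matrix file is rendered directly.
            printf '%s\n' "$input" ;;
        *state.md)
            # Follow scenario_matrix_file, stripping optional double quotes.
            value=$(grep '^scenario_matrix_file:' "$input" \
                | sed 's/scenario_matrix_file: *//; s/^"//; s/"$//')
            [ -n "$value" ] && printf '%s\n' "$value" ;;
        *)
            if [ -f "$input/scenario-matrix.json" ]; then
                # Session directory: use its own matrix file.
                printf '%s\n' "$input/scenario-matrix.json"
            else
                # Project directory (or no --input): latest local session.
                session=$(latest_session "$input") || return 1
                printf '%s\n' "$session/scenario-matrix.json"
            fi ;;
    esac
}
```

Calling `resolve_matrix_input` with no argument treats the current directory as the project directory, which mirrors the "no `--input`" rule.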

The generated HTML snapshot includes:

- current primary objective and supporting window
- task board grouped into primary/supporting/active/done/deferred buckets
- dependency edges between tasks
- checkpoint, convergence, and oversight status
- recent events plus execution/review feedback queues
- per-task detail drill-down without reading raw JSON

## Monitoring

Load the helper script and run the RLCR monitor:

```bash
source scripts/humanize.sh
humanize monitor rlcr
humanize matrix
```

The monitor remains compatible with legacy loops, but for new loops it also surfaces:

- scenario-matrix readiness (`ready`, `legacy`, `missing`, `invalid`, or `not_applicable`)
- the current matrix-derived mainline summary
- the current manager checkpoint and convergence status
- the primary repair-wave or cluster context when one is active
- any active oversight action currently steering the next round

## Configuration

Humanize uses a 4-layer config hierarchy (lowest to highest priority):
6 changes: 6 additions & 0 deletions hooks/lib/loop-common.sh
@@ -42,6 +42,8 @@ readonly FIELD_PRIVACY_MODE="privacy_mode"
readonly FIELD_MAINLINE_STALL_COUNT="mainline_stall_count"
readonly FIELD_LAST_MAINLINE_VERDICT="last_mainline_verdict"
readonly FIELD_DRIFT_STATUS="drift_status"
readonly FIELD_SCENARIO_MATRIX_FILE="scenario_matrix_file"
readonly FIELD_SCENARIO_MATRIX_REQUIRED="scenario_matrix_required"

readonly MAINLINE_VERDICT_ADVANCED="advanced"
readonly MAINLINE_VERDICT_STALLED="stalled"
@@ -407,6 +409,8 @@ _parse_state_fields() {
STATE_MAINLINE_STALL_COUNT=$(echo "$STATE_FRONTMATTER" | grep "^${FIELD_MAINLINE_STALL_COUNT}:" | sed "s/${FIELD_MAINLINE_STALL_COUNT}: *//" | tr -d ' ' || true)
STATE_LAST_MAINLINE_VERDICT=$(echo "$STATE_FRONTMATTER" | grep "^${FIELD_LAST_MAINLINE_VERDICT}:" | sed "s/${FIELD_LAST_MAINLINE_VERDICT}: *//" | tr -d ' ' || true)
STATE_DRIFT_STATUS=$(echo "$STATE_FRONTMATTER" | grep "^${FIELD_DRIFT_STATUS}:" | sed "s/${FIELD_DRIFT_STATUS}: *//" | tr -d ' ' || true)
STATE_SCENARIO_MATRIX_FILE=$(echo "$STATE_FRONTMATTER" | grep "^${FIELD_SCENARIO_MATRIX_FILE}:" | sed "s/${FIELD_SCENARIO_MATRIX_FILE}: *//; s/^\"//; s/\"\$//" || true)
STATE_SCENARIO_MATRIX_REQUIRED=$(echo "$STATE_FRONTMATTER" | grep "^${FIELD_SCENARIO_MATRIX_REQUIRED}:" | sed "s/${FIELD_SCENARIO_MATRIX_REQUIRED}: *//" | tr -d ' ' || true)
}

# Parse state file frontmatter and set variables (tolerant mode with defaults)
@@ -457,6 +461,7 @@ parse_state_file() {
STATE_MAINLINE_STALL_COUNT="${STATE_MAINLINE_STALL_COUNT:-0}"
STATE_LAST_MAINLINE_VERDICT="${STATE_LAST_MAINLINE_VERDICT:-$MAINLINE_VERDICT_UNKNOWN}"
STATE_DRIFT_STATUS="${STATE_DRIFT_STATUS:-$DRIFT_STATUS_NORMAL}"
STATE_SCENARIO_MATRIX_REQUIRED="${STATE_SCENARIO_MATRIX_REQUIRED:-false}"
# STATE_REVIEW_STARTED left as-is (empty if missing, to allow schema validation)

return 0
@@ -536,6 +541,7 @@ parse_state_file_strict() {
STATE_MAINLINE_STALL_COUNT="${STATE_MAINLINE_STALL_COUNT:-0}"
STATE_LAST_MAINLINE_VERDICT="${STATE_LAST_MAINLINE_VERDICT:-$MAINLINE_VERDICT_UNKNOWN}"
STATE_DRIFT_STATUS="${STATE_DRIFT_STATUS:-$DRIFT_STATUS_NORMAL}"
STATE_SCENARIO_MATRIX_REQUIRED="${STATE_SCENARIO_MATRIX_REQUIRED:-false}"

return 0
}
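
To see how the two new fields behave on new-format and legacy state files, the standalone sketch below may help. `parse_matrix_fields` is a hypothetical stand-in that condenses the relevant lines of `_parse_state_fields` and the default applied in `parse_state_file`. It is not the project's function, and the sample frontmatter values are made up.

```bash
# Illustrative sketch only: parse_matrix_fields is a hypothetical stand-in
# for the relevant lines of _parse_state_fields plus the tolerant default.

parse_matrix_fields() {
    local frontmatter="$1"
    # The file path keeps internal spaces; only surrounding quotes are stripped.
    STATE_SCENARIO_MATRIX_FILE=$(echo "$frontmatter" | grep "^scenario_matrix_file:" \
        | sed 's/scenario_matrix_file: *//; s/^"//; s/"$//' || true)
    # The boolean flag has all spaces removed, like the other scalar fields.
    STATE_SCENARIO_MATRIX_REQUIRED=$(echo "$frontmatter" | grep "^scenario_matrix_required:" \
        | sed 's/scenario_matrix_required: *//' | tr -d ' ' || true)
    # Legacy state files lack the field, so fall back to "false".
    STATE_SCENARIO_MATRIX_REQUIRED="${STATE_SCENARIO_MATRIX_REQUIRED:-false}"
}

# New-format frontmatter (made-up values).
parse_matrix_fields 'current_round: 2
scenario_matrix_file: "/tmp/my loop/scenario-matrix.json"
scenario_matrix_required: true'
echo "$STATE_SCENARIO_MATRIX_FILE / $STATE_SCENARIO_MATRIX_REQUIRED"

# Legacy frontmatter without either field.
parse_matrix_fields 'current_round: 2'
echo "[$STATE_SCENARIO_MATRIX_FILE] / $STATE_SCENARIO_MATRIX_REQUIRED"
```

Skipping `tr -d ' '` for the file field preserves paths that contain spaces, while the `:-false` default lets loops created before this change parse as not requiring a matrix.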