Commit 946fcbd
release(2026-04-20): merge develop to main (suite 1.10.0) (#379)
* feat(skill): add /research skill for structured topic investigation (#285)
Add new global skill that conducts structured research on any topic
using web search, codebase analysis, and document synthesis.
Key features:
- Phase-based workflow (context → discovery → analysis → synthesis)
- Context-adaptive output matching existing doc conventions
- Language auto-detection with manual override
- Document index system integration
- Three depth levels (shallow/standard/deep)
Closes #284
* fix(statusline): handle lock and remove block-timer widget (#286)
* fix(statusline): honor rate-limit lock
When ccstatusline's OAuth usage API returns 429, it writes
{"blockedUntil":<epoch>} into ~/.cache/ccstatusline/usage.lock and
turns subsequent spawns into no-ops until the timestamp expires.
The previous script kept spawning a background refresh every 30s
regardless, wasting processes, and the Extra line showed stale
values with no visible cause.
Read usage.lock, skip the background refresh while the lock is
active, and append a dim "(locked Nm)" suffix to the Extra line
so the stale display is self-explanatory. Mirrored in the
PowerShell counterpart.
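A minimal Python sketch of the lock check described above. The shipped scripts are bash and PowerShell, so this is an illustration and not their code. It assumes `blockedUntil` is a Unix epoch in seconds and that the remaining time is rounded up to whole minutes; the function names are hypothetical.

```python
import json
import time
from pathlib import Path

# Lock file location taken from the commit message.
LOCK_PATH = Path.home() / ".cache" / "ccstatusline" / "usage.lock"


def lock_remaining_seconds(lock_path=LOCK_PATH, now=None):
    """Return seconds until the lock expires, or 0 when no lock is active."""
    now = time.time() if now is None else now
    try:
        payload = json.loads(lock_path.read_text())
        blocked_until = float(payload["blockedUntil"])  # assumed: epoch seconds
    except (OSError, ValueError, KeyError, TypeError):
        return 0  # missing or malformed lock file is treated as unlocked
    return max(0, int(blocked_until - now))


def locked_suffix(remaining_seconds):
    """Build the dim suffix for the Extra line, e.g. ' (locked 10m)'."""
    if remaining_seconds <= 0:
        return ""
    minutes = -(-remaining_seconds // 60)  # ceiling division (assumed rounding)
    return f" (locked {minutes}m)"
```

A caller would skip the background refresh whenever `lock_remaining_seconds()` is greater than zero and append `locked_suffix(...)` to the Extra line.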
* chore(statusline): remove block-timer widget
The block-timer widget was the most visually dominant item on the
first line and duplicated context already implied by the session
reset countdown. Drop it along with the adjacent separator to
avoid consecutive "|" artifacts on the first line.
---------
Co-authored-by: kcenon <4158198+kcenon@users.noreply.github.com>
* feat(skills): lower default batch limit to 5 (#300)
Reduce default batch size from 20 to 5 and cap maximum at 10 across
issue-work and pr-work skills. Larger batches now require explicit
--force-large opt-in.
Rule drift becomes empirically visible around items 15-25 in long
batches; the conservative default keeps batches inside the safe zone
while still allowing power users to bypass it with explicit
acknowledgment of the risk.
Closes #288
* feat(skills): add chunked confirmation gate every 5 items (#301)
Insert a mandatory confirmation gate after every 5 items in batch mode
for both issue-work and pr-work. The gate halts execution and uses
AskUserQuestion to ask whether to continue, pause-with-resume, or
cancel. A new --no-confirm flag bypasses the gate for CI-driven or
unattended batches.
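As a rough illustration of the gate cadence, the sketch below lists the item positions after which a gate would fire. It assumes no gate is needed after the final item, which the commit message does not state; `gate_positions` and its parameters are hypothetical names.

```python
def gate_positions(total_items, interval=5, no_confirm=False):
    """Return the 1-based item indices after which the confirmation gate fires."""
    if no_confirm:
        return []  # --no-confirm bypasses every gate
    # Assumption: a gate after the last item would be pointless, so stop before it.
    return list(range(interval, total_items, interval))
```

For a 12-item batch this yields gates after items 5 and 10.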
Beyond the obvious user-control benefit, each AskUserQuestion produces
a fresh user message that acts as an attention anchor, restoring
salience of CLAUDE.md rules and skill instructions that have drifted
out of recent context. This is one of the strongest available drift
mitigations for long-running batches.
Pausing writes .claude/resume.md per the existing session-resume
workflow so the next session can pick up at item N+1.
Closes #289
* feat(skills): inline critical rules in batch mode (#302)
Add a per-item rule reminder (B-4.0) to both issue-work and pr-work
batch modes. Before each item's execution, the loop emits a 5-line
invariant block as a fresh tool result so the language, commit format,
attribution, and CI gate rules sit in the recent attention window
instead of being buried by accumulating context.
This complements the per-5-item user-facing gate from #289: the gate
refreshes attention via user messages every five items, while the
inline reminder refreshes it via tool results every single item.
Together they form a multi-layer drift mitigation that does not
depend on model self-discipline.
The 5-line cap keeps cumulative cost linear and tiny (~25 tokens per
item). Reference doc loads inside the loop are explicitly forbidden
so the inline reminder remains the most recent context anchor.
Closes #290
* feat(hooks): add pr-language-guard pretooluse hook (#303)
* feat(hooks): add validate-language shared library
* feat(hooks): add pr-language-guard pretooluse hook
* ci(hooks): register pr-language-guard in settings
* docs: document pr-language-guard hook
* feat(hooks): add merge-gate-guard pretooluse hook (#304)
* feat(hooks): add merge-gate-guard pretooluse hook
* ci(hooks): register merge-gate-guard in settings
* docs: document merge-gate-guard hook
* feat(hooks): extend attribution guard to gh pr and issue commands (#305)
* refactor(hooks): expose validate-no-attribution helper
* feat(hooks): add attribution-guard pretooluse hook
* ci(hooks): register attribution-guard in settings
* docs: document attribution-guard hook
* feat(skills): delegate batch items to subagents (#306)
Default batch mode in issue-work and pr-work now dispatches each item
to a fresh general-purpose Agent. The parent context retains only a
per-item queue record (item id, status, pr url, ci conclusion), while
expensive tool output lives inside the subagent and is discarded on
completion. This keeps rule compliance at item 30 equivalent to item 1.
Add an opt-in --inline flag that preserves the legacy single-context
loop for tiny batches or cases where inter-item context is genuinely
useful (e.g. related regressions sharing a root cause). Document the
trade-offs in a comparison table inside each reference/batch-mode.md.
Closes #294
* feat(skills): add --auto-restart to batch gate (#307)
Replace the interactive chunked gate with a forced session restart
when --auto-restart is set. Every CONFIRM_INTERVAL items the batch
writes .claude/resume.md via the Batch Workflow Resume Format and
exits, so a fresh claude session starts each chunk with CLAUDE.md
and skill files reloaded at position zero.
This is a stronger context reset than the interactive gate because
it ends the OS process entirely: accumulated tool results, gh
outputs, CI log fetches and diff reads are all discarded. Intended
for long unattended batches paired with a wrapper that re-invokes
claude on exit.
--no-restart overrides --auto-restart and falls back to the
interactive gate, so defensive scripts can guarantee no session
exit even if --auto-restart is set by an alias or wrapper.
Default behavior (neither flag) is unchanged: the interactive
AskUserQuestion gate still fires every CONFIRM_INTERVAL items.
Touches both issue-work and pr-work skill pairs to keep the two
batch-mode references in sync, plus a one-line note in
global/CLAUDE.md and a cross-reference in session-resume.md.
Closes #296
* docs: add distributed-batch-dispatch research doc (#308)
Research-only deliverable for #298: evaluate RemoteTrigger and the
/schedule skill as a distributed dispatch layer for issue-work /
pr-work batch mode.
The doc answers all six research questions by separating what is
knowable from the RemoteTrigger API surface from what needs
empirical testing. Each test item is flagged explicitly so a future
revisit has a concrete test plan instead of hand-waving.
Decision: DEFER. The rule-drift problem #298 was meant to address
is already solved by subagent delegation (#294 / PR #306) and
--auto-restart (#296 / PR #307), both of which reuse existing
infrastructure without new operational surface area. The unique
value proposition of remote triggers -- parallel execution across
independent accounts or machines -- is not a current batch-mode
requirement, since batch limits are bounded by rule drift rather
than by throughput.
Includes a cost comparison table across subagent delegation,
--auto-restart, external script (#297), and remote trigger per
item; an empirical test plan that can be executed at low cost if
the decision is revisited; and explicit linkage to the other
tier-3 issues so the strategy space stays coherent.
Closes #298
* feat(scripts): add external batch orchestrators (#309)
Add four wrapper scripts that spawn one fresh claude CLI process per
batch item, pushing isolation to the OS process boundary:
scripts/batch-issue-work.sh (bash)
scripts/batch-issue-work.ps1 (PowerShell 7)
scripts/batch-pr-work.sh (bash)
scripts/batch-pr-work.ps1 (PowerShell 7)
Each script enumerates candidate items via gh, loops over them, and
invokes `claude --print /issue-work <org/repo> <n> --solo` (or the
pr-work equivalent) per item. Per-item logs are written to
~/.claude/batch-logs/<timestamp>/issue-<n>.log. On any item failure
the batch pauses and exits non-zero so the operator can inspect the
log before continuing; successful items are not rolled back.
pr-work orchestrators select PRs whose statusCheckRollup contains at
least one FAILURE/TIMED_OUT/CANCELLED/ACTION_REQUIRED/STARTUP_FAILURE
conclusion, so passing and in-progress PRs are skipped.
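The selection rule above can be sketched in Python as follows. The input shape is an assumption modeled on `gh pr list --json number,statusCheckRollup` output, and the function names are hypothetical; the orchestrators themselves are bash and PowerShell.

```python
# Conclusions listed in the commit message as marking a PR for rework.
FAILING_CONCLUSIONS = {
    "FAILURE",
    "TIMED_OUT",
    "CANCELLED",
    "ACTION_REQUIRED",
    "STARTUP_FAILURE",
}


def has_failing_check(pr):
    """True when at least one rollup entry carries a failing conclusion."""
    rollup = pr.get("statusCheckRollup") or []
    return any(check.get("conclusion") in FAILING_CONCLUSIONS for check in rollup)


def select_failing_prs(prs):
    """Return the numbers of PRs that need work; passing and pending PRs are skipped."""
    return [pr["number"] for pr in prs if has_failing_check(pr)]
```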
README adds Use Case D documenting when to pick external orchestration
over in-session batch mode, and the Common Tasks table gains rows for
the new scripts on both platforms.
This complements subagent delegation (#294 / PR #306) and --auto-restart
(#296 / PR #307) by offering the strongest available form of per-item
isolation: a fresh claude process boots for every item, so neither
conversation-level nor host-process-level state can leak.
Closes #297
* test(batch): add drift signal extractor library (#316)
Pure bash library of five extractor functions used to measure rule
compliance in batch-mode runs of /issue-work and /pr-work:
- extract_language_violations: count CJK characters
- extract_attribution_leaks: count AI attribution markers
- extract_ci_gate_violations: detect merge-with-failing-check
- extract_missing_closes: detect missing Closes/Fixes/Resolves keywords
- extract_commit_format_violations: count Conventional Commits violations
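To make two of these signals concrete, here is a hedged Python analogue of the language and closing-keyword extractors. The real library is pure bash; the Unicode ranges (Hiragana, Katakana, CJK Unified Ideographs, Hangul syllables) and the exact keyword pattern are assumptions, as are the function names.

```python
import re

# Assumed character ranges; the bash library may count a different set.
CJK_PATTERN = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7a3]")

# Assumed pattern for "Closes/Fixes/Resolves #N".
CLOSING_KEYWORD = re.compile(r"\b(?:closes|fixes|resolves)\s+#\d+", re.IGNORECASE)


def count_cjk_characters(text):
    """Count characters that fall in the assumed CJK ranges."""
    return len(CJK_PATTERN.findall(text))


def is_missing_closing_keyword(pr_body):
    """True when the PR body has no Closes/Fixes/Resolves reference."""
    return CLOSING_KEYWORD.search(pr_body) is None
```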
Sources hooks/lib/validate-commit-message.sh so CMV_ATTRIBUTION_REGEX
and validate_commit_message stay single-source. Self-tested with 34
cases covering empty input, mixed text, JSON variants, and SSOT
loading. Wired into validate-hooks.yml so future PRs to main run the
suite and shellcheck the new scripts.
Foundational layer for the Tier 2 benchmark orchestrator (#314) and
the regression test (#311).
Part of #310
Part of #287
Closes #312
* test(batch): add scratch repo seeding script (#317)
Idempotent bash script that bootstraps kcenon/batch-drift-scratch with
30 trivial typo-fix issues for the Tier 2 benchmark corpus:
- Creates the repo if absent (public, with README)
- Upserts docs/file-01.md .. docs/file-30.md with identical one-line typo
content; skips the PUT when the existing content, compared by SHA, already matches
- Enumerates open issues prefixed "fix typo in docs/file-" and creates
only the missing ones (title/body/acceptance criteria 5W1H-formatted)
Supports --dry-run (network-free preview) and --help. Tests cover 23
cases: flag parsing, dry-run output shape and determinism, file
numbering bounds (01..30), and that dry-run never issues HTTP.
Fixes the GNU-vs-BSD grep divergence in the test harness by passing
`--` before literal patterns that begin with `-`.
Wired into validate-hooks.yml as a sibling step to the extractor tests.
Part of #310
Part of #287
Closes #313
* test(batch): add benchmark orchestrator and aggregator (#318)
Adds the operator-facing benchmark entry point and its offline-testable
aggregation half:
- run-benchmark.sh: orchestrator that invokes /issue-work under one of
three Tier 2 strategies (subagent, auto-restart, orchestrator),
captures per-item PR data, and delegates aggregation. Dry-run mode
prints the planned invocation without calling claude, gh, or the
seeder. Validates args, preconditions, and dependency files (extractor
lib, seeder, external orchestrator from #297) before running.
- aggregate-results.sh: pure function that reads raw per-item PR JSONs
from a directory, applies the five drift extractors, and emits the
strategy results JSON per the schema in #314. No network, no gh.
- 11 aggregator fixtures: 5 clean + 6 with drift concentrated at item 6
(hangul body, AI-assisted keyword, merged-with-FAILURE, no Closes
keyword, no-type-prefix commit). Exercises bucketing into
items_1_to_5 and items_6_to_30.
- 71 new test cases (40 aggregator + 31 orchestrator) covering flag
parsing, precondition errors, dry-run shape, per-strategy invocation
lines, JSON schema shape, bucketing, and determinism.
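The bucketing mentioned above might look like the following sketch. Only the two bucket names come from the commit message; summing violation counts per bucket and the input shape are assumptions, since the schema in #314 is not reproduced here.

```python
def bucket_drift(items):
    """Sum per-item violation counts into the two buckets named in the commit.

    `items` is a hypothetical list of (item_index, violation_count) pairs,
    with item_index starting at 1.
    """
    buckets = {"items_1_to_5": 0, "items_6_to_30": 0}
    for index, violations in items:
        key = "items_1_to_5" if index <= 5 else "items_6_to_30"
        buckets[key] += violations
    return buckets
```

With the fixtures described above (five clean items plus drift at item 6), all violations would land in `items_6_to_30`.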
Per-item capture is designed to survive single-item failures: a gh pr
view error writes a null-signal raw file with capture_error=true
instead of aborting the batch.
Live execution is deliberately out of scope (belongs to #315); this PR
ships the infrastructure that #315 will drive.
Part of #310
Part of #287
Closes #314
* feat(agents): add memory, maxTurns, and effort frontmatter fields (#323)
Add advanced frontmatter fields to all 6 agent definitions:
- memory: project scope (local for structure-explorer)
- maxTurns: 15-30 based on agent complexity
- effort: high (medium for structure-explorer)
- initialPrompt: context-loading prompt for 5 agents
- Enhanced description fields for better agent selection
- Added Bash tool to code-reviewer for git/test access
Changes applied to both project/.claude/agents/ and plugin/agents/.
structure-explorer kept on haiku model as its task (file structure
mapping) is well-suited to a fast, lightweight model.
Closes #320
* feat(agents): standardize output format, guardrails, and language-specific rules (#324)
Add four new sections to agent definitions:
1. Core Behavioral Guardrails (all 6 agents):
Self-check questions from anti-patterns.md to prevent
assumption-making, over-engineering, and scope creep.
2. Standardized Output Format (5 agents, qa-reviewer unchanged):
- code-reviewer: severity table + APPROVE/REQUEST_CHANGES verdict
- codebase-analyzer: confidence scores in findings table
- documentation-writer: documentation checklist with completeness %
- refactor-assistant: before/after diff + test verification report
- structure-explorer: no change needed (already structured)
3. Language-Specific Rules (3 agents):
- code-reviewer: language-aware review checks
- codebase-analyzer: language-aware analysis points
- refactor-assistant: language-aware refactoring considerations
4. Safety Verification Protocol (refactor-assistant only):
Replaces the 4-line safety principles with concrete before/during/after
verification steps and hard-stop conditions.
Changes applied to both project/.claude/agents/ and plugin/agents/.
Closes #321
* feat(agents): add team communication protocol and new specialized agents (#325)
Part A: Team Communication Protocol
- Added ## Team Communication Protocol to all 6 existing agents
- Each protocol defines receives-from, sends-to, handoff triggers,
and task management behavior for team collaboration
- No circular delegation loops in the protocol graph:
structure-explorer → codebase-analyzer → {documentation-writer, code-reviewer}
code-reviewer ↔ qa-reviewer (bidirectional boundary verification)
code-reviewer → refactor-assistant (one-way delegation)
Part B: New Specialized Agents
- dependency-auditor: CVE scanning, license compliance, freshness
analysis, unused dependency detection
- test-strategist: coverage gap identification, test quality
assessment, strategy recommendation, skeleton generation
- migration-planner: deferred — scope too speculative for a
configuration repo without production databases or APIs
Both new agents include all sections from #320 (frontmatter) and
#321 (guardrails, output format, team protocol).
Updated project/CLAUDE.md agent list (6 → 8 agents).
Synchronized all changes to plugin/agents/.
Closes #322
* test(batch): add drift regression harness with thresholds and docs (#326)
Add automated behavioral regression test that verifies 30-item batch
workflows retain rule compliance within configurable thresholds.
Components:
- run-regression.sh: orchestrates seed, benchmark, threshold assertion
- thresholds.json: default max-allowed drift counts per signal
- test-run-regression.sh: 35 offline unit tests (arg parsing, dry-run)
- docs/batch-drift-regression.md: methodology, triage guide, cost notes
- .gitignore: exclude benchmark runtime outputs
Reuses benchmark infrastructure from tests/batch_drift_benchmark/ and
SSOT extractors from hooks/lib/validate-commit-message.sh.
Note: CI workflow (.github/workflows/batch-drift-regression.yml) must be
pushed separately with a token that has the `workflow` scope.
Closes #311
* test(batch): execute Tier 2 benchmarks and publish results (#327)
* fix(scripts): use gh api user for auth check instead of gh auth status
gh auth status returns non-zero when any configured token is invalid,
even if the active GH_TOKEN works. Replace with gh api user which
tests actual API connectivity.
* fix(scripts): remove redundant jq pipe and add empty fallback
The gh -q flag already outputs valid JSON. The extra pipe to jq -c
caused failures under pipefail when gh emitted auth warnings to
stderr. Add empty-array fallback for robustness.
* test(batch): add Tier 2 benchmark results and comparison document
Benchmark executed against kcenon/batch-drift-scratch with 30 trivial
typo-fix issues. Results: zero drift across all 5 signals for both
5-item baseline and 30-item single-session batch.
Key finding: Tier 0-1 mitigations (hooks, inline rules) are sufficient
for uniform XS workloads. Orchestrator (#297) recommended as default
Tier 2 strategy for production mixed-complexity batches.
Limitation: the subagent and auto-restart strategies could not be
benchmarked because the batch-mode AskUserQuestion gate blocks in --print mode.
Closes #315
* fix(structure): consolidate triple-duplicated workflow reference files (#337)
* feat(scripts): add SSOT sync and check tooling for workflow refs
Adds scripts/sync_references.{sh,ps1} and scripts/check_references.{sh,ps1}
to keep workflow reference files consistent across three locations:
- canonical: project/.claude/rules/workflow/
- mirror 1: project/.claude/skills/project-workflow/reference/
- mirror 2: plugin/skills/project-workflow/reference/
scripts/sync.sh gains a --references-only fast path that delegates to
sync_references.sh. The validate-skills CI workflow runs check_references
on every PR and fails on drift (exit 2).
Part of #328
* fix(refs): restore canonical content in project and plugin mirrors
The skills/reference/ copies contained one-line relative-path strings
left over from a symlink experiment that never worked. Imports via
@./reference/<file> received a literal path string instead of the
intended content.
The plugin/reference/ copies carried an older, verbose version of each
document that had drifted substantially from the current concise rules/
copy.
Runs scripts/sync_references.sh to bring both mirrors byte-identical to
project/.claude/rules/workflow/. Drift is enforced by the new CI check.
Part of #328
Closes #329
* chore(versions): unify version numbering via VERSION_MAP.yml (#338)
* chore(versions): add VERSION_MAP.yml with independent SemVer tracks
Introduces VERSION_MAP.yml as single source of truth for four independent
SemVer tracks: suite, plugin, plugin-lite, and settings-schema. Each field
moves independently to reflect its own release cadence.
Adds scripts/check_versions.{sh,ps1} to verify each declared field matches
its consumer files (plugin.json, settings.json, README badge URL) and
scripts/sync_versions.{sh,ps1} to propagate map values to consumers.
Wires check_versions.sh into the validate-skills CI workflow so that any
drift between VERSION_MAP.yml and its consumers fails the release PR.
Adds scripts/sync.sh --versions-only fast path for the common case of
regenerating version references after editing the map.
* docs(release): integrate VERSION_MAP into release skill and docs
Updates the release skill to accept --target <field> and bump only the
targeted SemVer track via VERSION_MAP.yml and sync_versions.sh. For
non-suite targets the tag format becomes <target>-v<version> to keep
tracks separate in the git tag history.
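A small sketch of the tag naming rule. The `<target>-v<version>` form for non-suite targets comes from the text above; the plain `v<version>` form for the suite track is an assumption, and `release_tag` is a hypothetical helper.

```python
def release_tag(target, version):
    """Build the git tag name for a SemVer track."""
    if target == "suite":
        return f"v{version}"  # assumption: suite keeps a plain v<version> tag
    return f"{target}-v{version}"  # documented format for non-suite targets
```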
Documents the version layout in docs/CUSTOM_EXTENSIONS.md under a new
VERSION_MAP SSOT section, explaining why plugin, plugin-lite, and
settings-schema each follow their own SemVer track rather than being
locked to a single suite version.
The skill falls back to its legacy single-version behavior when no
VERSION_MAP.yml is present, so it remains usable in projects that
inherit the configuration without adopting the map.
* fix(hooks): repair 4 bugs in markdown-anchor-validator (#340)
* fix(hooks): four bugs in markdown-anchor-validator
Bug A (sh only): `/^#+[[:space:]]/` accepted lines with 7+ hashes as
headings, silently registering anchors that GitHub does not create.
Replaced with `/^#{1,6}[[:space:]]/` in both the match and the
subsequent sub().
Bug B (sh + ps1): intra-file and inter-file reference extraction
scanned the raw line, so `[a](#missing)` inside inline backticks was
treated as a live reference and blocked commits on documentation
files that include syntax examples. Now the line is copied into
a `work`/`scanLine` variable with inline-code spans stripped before
the match loops run.
Bug C (sh only): JSON escaping covered `"` but not `\`, producing
invalid JSON whenever an anchor or filename contained a backslash.
Replaced manual escaping with `jq -Rs .`, which handles `\`, `"`,
newlines, and control characters in one step. The error message is
now built with real newlines (`$'\n'`) rather than literal "\n"
strings, matching how jq expects input.
Bug D (sh only): `set -euo pipefail` combined with a `jq` pipeline
could abort the script silently on systems without jq, leaving
Claude Code without a decision response. Added an explicit
`command -v jq` check at script entry that fails open with a stderr
warning, plus a `|| CMD=""` guard on the command-extraction pipe.
The PowerShell variant shared only bug B: its heading regex was already
limited to `#{1,6}`, its JSON output goes through `ConvertTo-Json`,
and it does not depend on jq.
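The ideas behind the bug B and bug C fixes can be illustrated in Python. This is a sketch, not the hook's code: the regexes are simplified assumptions, the helper names are hypothetical, and `json.dumps` stands in for the `jq -Rs .` call used by the shell script.

```python
import json
import re

INLINE_CODE = re.compile(r"`[^`]*`")        # simplified inline-code span
ANCHOR_REF = re.compile(r"\]\(#([^)\s]+)\)")  # [text](#anchor)


def extract_anchor_refs(line):
    """Bug B idea: strip inline-code spans before scanning for references."""
    scan_line = INLINE_CODE.sub("", line)
    return ANCHOR_REF.findall(scan_line)


def to_json_string(text):
    """Bug C idea: let a JSON encoder escape backslashes, quotes, and newlines."""
    return json.dumps(text)
```

A reference that appears only inside backticks is ignored, while a live reference on the same line is still reported.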
* test(hooks): add regression suite for markdown-anchor-validator bugs
Adds fixture markdown files and a test runner that exercises each of
the four bugs addressed in the companion fix commit:
- bug-a-excessive-hashes.md: 7 hashes should not register an anchor
- bug-b-inline-code.md: inline-code example syntax should not block commit
- bug-c-backslash.md: anchor with backslash must produce valid JSON
- baseline-valid.md: well-formed markdown must not trigger errors
The runner skips cleanly when jq is absent (demonstrating the bug D
fail-open path) and matches the "N passed, N failed" summary format
that tests/hooks/test-runner.sh parses, so it is picked up by the
validate-hooks CI workflow automatically.
* chore(plugin): remove redundant path fields from manifests (#341)
The official Claude Code plugin spec auto-discovers agents/, skills/,
hooks/hooks.json, .mcp.json, and .lsp.json at the plugin root. The
explicit path fields in plugin.json were overrides that duplicated the
default layout and risked masking future spec changes.
- plugin/.claude-plugin/plugin.json: drop agents, skills, hooks,
lspServers
- plugin-lite/.claude-plugin/plugin.json: drop skills
- tests/plugin/smoke-test.{sh,ps1}: switch from manifest-declared paths
to default-location discovery; add hooks.json validity check
- plugin/README.md: update manifest compatibility note to describe
auto-discovery behavior instead of the removed path fields
Closes #331
* chore(ci): seal nightly batch drift regression workflow (#343)
Add the GitHub Actions workflow that schedules the drift regression
harness produced under epic #287, and exclude the local scratch repo
from version control so it is not accidentally committed by future
seeding runs.
Closes #342
* feat(skills): modernize SKILL.md frontmatter with disable-model-invocation, allowed-tools, and paths (#344)
* feat(skills): add disable-model-invocation to global workflow skills
* feat(skills): add allowed-tools to global workflow skills
* feat(skills): add paths to plugin and project knowledge skills
* feat(skills): add when_to_use to plugin and project knowledge skills
* feat(skills): extend workflow frontmatter to doc-review git-status doc-update
* docs(skills): update frontmatter documentation
* feat(skills): add paths to project code-quality knowledge skill
* feat(skills): add paths to project code-quality skill
* feat(hooks): adopt InstructionsLoaded, PostCompact, and TaskCreated hook events (#345)
* feat(hooks): add instructions-loaded-reinforcer for InstructionsLoaded event
* feat(hooks): add post-compact-restore for PostCompact event
* feat(hooks): add task-created-validator for TaskCreated event
* feat(settings): subscribe to InstructionsLoaded, PostCompact, TaskCreated events
* test(hooks): add tests for InstructionsLoaded, PostCompact, TaskCreated hooks
* fix(hooks): ensure bash and powershell hook outputs are byte-identical
* docs(hooks): document InstructionsLoaded, PostCompact, TaskCreated hooks
* feat(scripts): add official-spec linter and integrate into validate-skills.yml (#346)
* feat(scripts): add official-spec linter for skill, plugin, settings
Validates SKILL.md frontmatter, plugin.json, and settings.json against
canonical Claude Code 2026 schemas (PyYAML + jsonschema). Wires into
sync.sh --lint sub-flag and validate_skills.sh as a soft-fail check.
* test(scripts): add spec linter test suite
12 fixture-based test cases covering SKILL.md/plugin.json/settings.json
validation, did-you-mean suggestions for unknown fields, --warn-only and
--strict mode flags, and full-repo discovery via the wrapper. Adds
--strict flag and unknown-field annotation to spec_lint.py.
* ci(skills): wire spec linter into validate-skills workflow
Adds jsonschema dependency, expands path triggers to cover spec linter
sources and schemas, runs spec_lint.sh --strict and the linter test
suite. Adds --strict / -Strict flag to bash and PowerShell wrappers
so CI can distinguish strict-mode failures from regular violations.
* chore(scripts): enable Set-StrictMode in spec_lint.ps1
Hardens the PowerShell wrapper against silent reference-to-uninitialized
variables, matching the bash twin's set -euo pipefail rigor.
* docs(scripts): document spec linter and its CI integration
Add a Spec Linter section to docs/CUSTOM_EXTENSIONS.md describing the
canonical Claude Code 2026 schemas under scripts/schemas/, the bash and
PowerShell wrappers, the three exit codes, the warn-only/strict modes,
the schema-update procedure, and the validate-skills.yml CI gate.
Add an Unreleased entry to global/VERSION_HISTORY.md cross-referencing
issue #334 and parent epic #328.
* fix(sync): abort interactive sync when canonical files violate schemas
Adds a pre-flight spec_lint guard to sync.sh and sync.ps1 default flow
so the interactive sync refuses to deploy schema-violating files to
~/.claude/. Bypass with --skip-lint for emergency syncs (e.g., reverting
a bad change). Also adds the py launcher to bash Python discovery for
Git Bash on Windows, refreshes the spec_lint.py docstring exit-code
table, fixes sync.ps1 flag forwarding to use a splatted hashtable so
PowerShell ValidateSet does not misbind switches as Mode values, and
extends both test suites with three integration cases (--lint fast path,
pre-flight abort, --skip-lint bypass).
* feat(skills): adopt context:fork for security-audit, performance-review, and doc-review (#347)
* feat(skills): adopt context:fork for audit and review skills
Add agent: Explore to security-audit and performance-review (both
already had context: fork) so audits run in a forked, read-only
subagent context. Add allowed-tools to performance-review to declare
its read-only audit posture explicitly. Add context: fork plus
agent: general-purpose to doc-review so its larger analysis output
runs in isolation; general-purpose is required for --fix write access.
Each modified SKILL.md gains a structured Output section that reminds
the forked subagent it has no access to the calling conversation's
history and must operate from the supplied arguments only.
Project mirror SKILL.md files under project/.claude/skills/ are
intentionally preserved with their existing development-style tool
declarations and are out of scope for this issue.
Closes #335
* docs(skills): document context:fork adoption for audit skills
Add Skill Context Isolation subsection under Detailed Breakdown in
docs/CUSTOM_EXTENSIONS.md, listing the three skills now using
context: fork with their agent choice and rationale. Add Unreleased
entry to global/VERSION_HISTORY.md cross-referencing #335 and the
parent epic #328.
Closes #335
* docs: migrate to canonical documentation URLs and inventory settings fields (#348)
Closes #336
Replaces legacy documentation URLs with the new canonical equivalents and
adds a Settings Field Inventory section to COMPATIBILITY.md classifying
every non-schema field as Stable/Experimental/Undocumented/Misplaced
against the official reference.
Key findings recorded:
- showTurnDuration and teammateMode belong in the global config file, not settings.json
- env.CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS is officially experimental
- env.MAX_TEAMS, ENABLE_TOOL_SEARCH, MAX_MCP_OUTPUT_TOKENS are undocumented
- effortLevel only accepts low/medium/high/xhigh (not max)
Files touched: HOOKS.md, docs/CUSTOM_EXTENSIONS.md, COMPATIBILITY.md (URL
fixes + new inventory section), README.md (migration note), VERSION_HISTORY.md
(Unreleased entry).
* Replace local git-commit-format with symlink (#349)
Replace
project/.claude/skills/project-workflow/reference/git-commit-format.md
with a symlink to ../../../rules/workflow/git-commit-format.md to
centralize commit message guidelines and remove the duplicated inline
guidance.
* Point workflow refs to centralized rules (#350)
Replace local workflow reference docs with symlinks to the canonical
rule files to remove duplication and centralize maintenance. In
project/.claude/skills/project-workflow/reference/, the files
github-issue-5w1h.md, github-pr-5w1h.md, and performance-analysis.md
now point to ../../../rules/workflow/*. This keeps a single source of
truth for workflow guidance and simplifies future updates.
* fix(ci): unblock release by fixing 3 blocker categories (#352)
* fix(ci): add bash 4+ install on macOS and clean remaining SC2034 (#353)
* fix(ci): remove remaining dead HAS_FIELD assignment (#354)
* test(hooks): use jq as primary JSON validator with python fallback (#355)
The assert_valid_json helpers in test-instructions-loaded-reinforcer.sh
and test-post-compact-restore.sh required python or python3 to be on
PATH. On minimal runners (e.g. the WSL/Docker image used by CI contributors
that only installs jq via the validate-hooks workflow), both python
invocations fail with exit 127 and the tests incorrectly report the
hook output as invalid JSON.
Prefer jq — already a required dependency of every hook these tests
cover, so it is guaranteed present whenever the hooks themselves run —
and retain python3 and python as fallbacks for environments where jq
is unavailable but python is.
* Make markdown anchor validator mawk-compatible (#356)
Replace the ERE quantifier /^#{1,6}/ with match()+RLENGTH-based logic to
detect 1–6 leading hashes. mawk does not support the {1,6} quantifier,
so the script now calls match($0, /^#+[[:space:]]/), computes
h_count = RLENGTH - 1, and proceeds only if h_count is between 1 and 6.
The rest of the heading extraction and printing logic is unchanged, so
behavior and output stay the same while the validator now runs under mawk.
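For readers unfamiliar with the match()+RLENGTH idiom, the following Python analogue shows the same counting logic. It is an illustration only; `heading_level` is a hypothetical name and `\s` stands in for awk's `[[:space:]]`.

```python
import re

# One or more hashes followed by a single whitespace character.
HEADING_PREFIX = re.compile(r"^#+\s")


def heading_level(line):
    """Return the heading level (1-6), or None if the line is not a heading."""
    m = HEADING_PREFIX.match(line)
    if m is None:
        return None
    h_count = len(m.group(0)) - 1  # analogous to RLENGTH - 1 in awk
    return h_count if 1 <= h_count <= 6 else None
```

Lines with seven or more hashes match the prefix pattern but are rejected by the range check, which is what the `{1,6}` quantifier did before.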
* fix(config): wire SSL_CERT_FILE to resolve sandbox TLS for git/curl (#368)
* fix(config): wire SSL_CERT_FILE for sandbox TLS
Adds SSL_CERT_FILE and SSL_CERT_DIR env vars so git, curl, npm,
pip and other CA-bundle-aware tools complete TLS handshakes
inside the Claude Code sandbox without dangerouslyDisableSandbox.
Adds scripts/verify-tls.sh to probe the fix. gh on macOS still
fails because Darwin Go ignores SSL_CERT_FILE; remediated
separately via a Bash allowlist. See docs/SANDBOX_TLS.md.
Refs #367
* docs(sandbox): document SSL_CERT_FILE fix and gh caveat
Adds docs/SANDBOX_TLS.md with the root cause, coverage matrix,
platform fallback ladder, and the gh-on-macOS caveat plus its
Bash-allowlist remediation. Updates global/CLAUDE.md with a new
Environment Workarounds section that references the doc.
Refs #367
* feat(hooks): add PostToolUse agent-checkpoint hook (#369)
Closes #360
Add post-task-checkpoint.sh/.ps1 under global/hooks/ and register it in
global/settings.json and global/settings.windows.json under PostToolUse
matcher Task|Agent. After each Task or Agent tool call completes, the
hook snapshots any working-tree changes into a wip(agent): checkpoint
commit so a later sub-agent cannot silently clobber a prior agent's
output in multi-agent workflows.
Design:
- Fail-open: exit 0 on any parse, git, or jq error — never block workflow
- No-op on clean tree (avoids empty-commit spam)
- No-op outside git worktree
- --no-verify bypasses commit-msg validator (wip() not in accepted types)
- --allow-empty satisfies the empty-tree acceptance criterion (AC) defensively
- Agent name sanitized to [A-Za-z0-9_-], clipped to 64 chars
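The sanitization rule in the last design bullet can be sketched as follows. This is an illustration only, written in Python rather than the hook's shell or PowerShell; `sanitize_agent_name` is a hypothetical name, and the sketch assumes disallowed characters are dropped rather than replaced, which the commit message does not specify.

```python
import re

MAX_LEN = 64  # clip length stated in the design notes


def sanitize_agent_name(raw: str) -> str:
    """Keep only [A-Za-z0-9_-] and clip the result to MAX_LEN characters."""
    # Assumption: characters outside the allowed set are removed, not substituted.
    cleaned = re.sub(r"[^A-Za-z0-9_-]", "", raw)
    return cleaned[:MAX_LEN]
```

Restricting the name to this character set keeps it safe to embed in a commit subject line without further quoting.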
Documentation: HOOKS.md gains section #19 with purpose, limitations,
opt-out path, and the async/timeout tradeoff rationale.
Tests: tests/hooks/test-post-task-checkpoint.sh covers 14 cases
including dirty/clean/non-repo paths, agent-name sanitization,
malformed-JSON fail-open, and a two-agent overwrite scenario that
proves agent-A's output is recoverable from HEAD~1 after agent-B
overwrites shared.txt.
Scope clarification vs the issue body:
- Test fixture lives at tests/hooks/test-post-task-checkpoint.sh
(project convention, discoverable by test-runner.sh) instead of
tests/post-task-checkpoint/.
- hooks/install-hooks.sh is not modified: that script installs
git-level hooks under .git/hooks/, while Claude Code hooks in
global/hooks/ are deployed via scripts/sync.sh, which already
syncs the entire directory unchanged.
* feat(skills): add Atomic Multi-Phase Execution rule to _policy.md (#370)
Closes #361
Add a top-level Atomic Multi-Phase Execution section to the global
command policy so multi-phase requests ("Phase 1/2/3", "up to Phase N")
are treated as a single atomic unit without mid-plan confirmation
prompts. Cross-reference the rule from issue-work, pr-work, and
release SKILL.md — the three skills where this friction was observed.
The original common-rules list is preserved verbatim under a new
Common Rules heading so existing references remain valid.
* docs: consolidate Environment Workarounds in global/CLAUDE.md (#371)
Closes #362
The canonical Environment Workarounds section in global/CLAUDE.md (lines
17-21) already documents the SSL_CERT_FILE fix (PR #368), the gh macOS
caveat, Read-before-Edit contract, and the dangerouslyDisableSandbox
fallback policy. This change removes two drifted duplicates that still
advertised the outdated dangerouslyDisableSandbox-first approach:
- project/.claude/rules/workflow/ci-resilience.md § TLS / Sandbox Errors
- global/skills/pr-work/SKILL.md § TLS/Sandbox Error Handling
Both now defer to the canonical section and docs/SANDBOX_TLS.md, keeping
only the diagnostic note that distinguishes TLS errors from auth errors.
Out of scope:
- docs/SANDBOX_TLS.md — deep-dive reference, already cited from CLAUDE.md
- project/.claude/skills/ci-debugging/reference/common-failures.md —
diagnostic content (symptoms/causes/solutions), not a rule; kept as-is
- plugin/skills/ci-debugging/reference/common-failures.md — same
Acceptance:
- global/CLAUDE.md Environment Workarounds: already canonical (no change)
- pr-work and ci-resilience now cross-reference it
- No duplicate rules remain (verified by grep of
dangerouslyDisableSandbox|SSL_CERT_FILE|OSStatus -26276 across *.md)
* feat(skills): add ci-fix skill codifying recurring CI failure patterns (#372)
Creates global/skills/ci-fix/ with a classifier-plus-known-fixes pipeline for
three recurring CI failure patterns documented in the 2026-04-18 /insights report:
- msvc-c4996: deprecated API under /WX warnings-as-errors
- cmake-fetchcontent: GIT_SHALLOW ON with commit hash
- cpp-lib-format: __cpp_lib_format probe vs link-time availability
The skill fetches the failing run log, matches against a deterministic
classifier table, and applies the codified remediation. Budget: 20 minutes
wall-clock with one retry slot, then escalates.
pr-work SKILL.md now points to ci-fix as a known-pattern shortcut before
hand-authoring a fix.
Validated via scripts/spec_lint.sh (no violations) and
scripts/validate_skills.sh (232/232 passed).
Closes #363
* feat(hooks): add pre-edit-read-guard PreToolUse hook (#373)
Enforces the Read-before-Edit/Write tool contract. A single script is
registered twice in global/settings.json:
- PreToolUse Edit|Write: guard mode (deny on tracker miss)
- PostToolUse Read: track mode (append file_path to tracker)
Tracker lives at $TMPDIR/claude-read-set-<session-id>, one absolute path
per line, deduplicated. Fail-open when the tracker file is absent so
fresh sessions are not blocked before any Read has fired.
Converts silent Edit retries on unread files into an actionable deny
message naming the exact Read target.
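The two modes can be sketched roughly as below. This is a minimal Python illustration of the tracker behaviour described above, not the hook's actual shell code; the function names are hypothetical, and falling back to the system temp directory when TMPDIR is unset is an assumption.

```python
import os
import tempfile
from pathlib import Path


def tracker_path(session_id: str) -> Path:
    # Assumption: use the system temp dir when TMPDIR is not set.
    base = os.environ.get("TMPDIR") or tempfile.gettempdir()
    return Path(base) / f"claude-read-set-{session_id}"


def track(session_id: str, file_path: str) -> None:
    """Track mode: record the absolute path once (one path per line)."""
    tracker = tracker_path(session_id)
    absolute = os.path.abspath(file_path)
    seen = tracker.read_text().splitlines() if tracker.exists() else []
    if absolute not in seen:
        with tracker.open("a") as fh:
            fh.write(absolute + "\n")


def guard(session_id: str, file_path: str) -> bool:
    """Guard mode: return True to allow the Edit/Write, False to deny it."""
    tracker = tracker_path(session_id)
    if not tracker.exists():
        return True  # fail-open: no Read has fired yet in this session
    return os.path.abspath(file_path) in tracker.read_text().splitlines()
```

A denied call would then surface a message naming the path to Read first, as described above.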
Changes:
- global/hooks/pre-edit-read-guard.sh (new)
- global/hooks/pre-edit-read-guard.ps1 (new)
- global/settings.json (+ PreToolUse Edit|Write, + PostToolUse Read)
- global/settings.windows.json (same pair)
- HOOKS.md: new § 20
- tests/hooks/test-pre-edit-read-guard.sh (15 cases)
Validated:
- tests/hooks/test-runner.sh: 253 passed, 0 failed
- scripts/spec_lint.sh: settings/files=2 violations=0
- scripts/validate_skills.sh: 232/232 passed
Closes #364
* feat(skills): add preflight skill for local CI simulation before push (#374)
Creates global/skills/preflight/ — a check orchestrator that reproduces
the CI contract locally so failures surface on the developer machine,
not on GitHub. Pairs with ci-fix: same pattern catalogue, opposite
direction (preflight prevents, ci-fix reacts).
Checks in invocation order (cheap first):
- deprecated-api: grep patterns shared with ci-fix/reference
- cmake-configure: cmake -S . -B <dir> -Werror=dev; skips without cmake
- act-linux: nektos/act --list; skips without act or docker
- msvc-docker: docker run of a Windows image; skips off-Windows
Each check emits one JSON line on stdout (status: pass/fail/skip) and
writes evidence to $TMPDIR/preflight-<check>-<pid>.log on failure.
run-all.sh aggregates and prints a summary to stderr; exit code is
non-zero iff any check reports fail.
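The aggregation contract above (one JSON line per check, non-zero exit only on a fail) can be sketched like this. It is a Python illustration under stated assumptions, not the contents of run-all.sh: only the `status` field and its three values come from the description, while the `check` field used in the usage note and the `aggregate` name are hypothetical.

```python
import json


def aggregate(lines):
    """Tally check statuses from JSON-lines output; return (counts, exit_code)."""
    counts = {"pass": 0, "fail": 0, "skip": 0}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        status = json.loads(line)["status"]
        counts[status] = counts.get(status, 0) + 1
    # Non-zero exit code if and only if at least one check reported "fail".
    exit_code = 1 if counts["fail"] else 0
    return counts, exit_code
```

For example, feeding it `{"check": "deprecated-api", "status": "pass"}` plus three `skip` records yields an exit code of 0, since skips do not count as failures.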
Integrates with hooks/pre-push via opt-in CLAUDE_PREFLIGHT=1. Default
behaviour (protected-branch block) is preserved when the flag is unset.
Changes:
- global/skills/preflight/SKILL.md
- global/skills/preflight/scripts/{run-all,run-deprecated-api,run-cmake-configure,run-act,run-msvc-docker}.sh
- hooks/pre-push (+ CLAUDE_PREFLIGHT=1 branch)
- hooks/pre-push.ps1 (same branch via bash shim)
Validated:
- scripts/spec_lint.sh: skill violations=0
- scripts/validate_skills.sh: 240/240 passed
- Manual run in this repo: 1 pass (deprecated-api), 3 skips
Closes #365
* feat(skills): add fleet-orchestrator skill for parallel multi-repo operations (#375)
* feat(skills): add fleet-orchestrator skill for parallel multi-repo runs
Introduces a supervisor-plus-workers harness that fans a single directive
across N repositories. Each repo is handled by a fresh general-purpose
Agent launched in background; all workers coordinate through a single
flock-guarded manifest (fleet-status.json) that records per-repo status,
PR URL, CI conclusion, and merge outcome.
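A lock-guarded manifest update of this kind might look like the sketch below. It is a Python illustration for Unix-like systems using `fcntl.flock`; the sidecar `.lock` file, the `update_manifest` name, and the repo-keyed JSON layout are all assumptions, since the real layout is defined in reference/manifest-schema.json and is not reproduced here.

```python
import fcntl
import json
from pathlib import Path


def update_manifest(manifest: Path, repo: str, fields: dict) -> None:
    """Merge `fields` into the entry for `repo` while holding an exclusive lock."""
    # Hypothetical sidecar lock file next to the manifest.
    lock_path = manifest.with_name(manifest.name + ".lock")
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # blocks until other workers release
        try:
            data = json.loads(manifest.read_text()) if manifest.exists() else {}
            # Assumed layout: one object per repository, keyed by repo name.
            data.setdefault(repo, {}).update(fields)
            manifest.write_text(json.dumps(data, indent=2))
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

Holding the lock across the whole read-modify-write cycle is what keeps one worker from overwriting another worker's entry.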
Deliverables:
- SKILL.md: entry point with workflow phases, dispatch protocol, supervisor
polling loop, and aggregation report format
- reference/manifest-schema.json: JSON Schema Draft 2020-12 for the shared
fleet-status.json file, including worker lifecycle phases and error classes
- reference/worker-template.md: per-repo worker prompt with substitution
tokens, manifest update protocol, and failure-isolation contract
Refs: #366
* docs: add fleet-orchestrator user guide alongside harness docs
Adds a user-facing overview of the fleet-orchestrator skill including its
place in the tier progression from the 2026-04-18 /insights report, the
fan-out/supervisor architecture diagram, the relationship to harness /
issue-work / pr-work, example invocations, and failure-isolation classes.
Refs: #366
* chore(release): bump suite to 1.10.0 (#376)
---------
Co-authored-by: kcenon <4158198+kcenon@users.noreply.github.com>
1 parent c6724b8 commit 946fcbd
36 files changed
Lines changed: 3370 additions & 31 deletions
File tree
- docs
- global
  - hooks
  - skills
    - ci-fix
      - reference
    - fleet-orchestrator
      - reference
    - issue-work
    - pr-work
    - preflight
      - scripts
    - release
- hooks
- project/.claude/rules/workflow
- scripts
- tests/hooks