Commit 946fcbd

release(2026-04-20): merge develop to main (suite 1.10.0) (#379)
* feat(skill): add /research skill for structured topic investigation (#285) Add new global skill that conducts structured research on any topic using web search, codebase analysis, and document synthesis. Key features: - Phase-based workflow (context → discovery → analysis → synthesis) - Context-adaptive output matching existing doc conventions - Language auto-detection with manual override - Document index system integration - Three depth levels (shallow/standard/deep) Closes #284 * fix(statusline): handle lock and remove block-timer widget (#286) * fix(statusline): honor rate-limit lock When ccstatusline's OAuth usage API returns 429, it writes {"blockedUntil":<epoch>} into ~/.cache/ccstatusline/usage.lock and turns subsequent spawns into no-ops until the timestamp expires. The previous script kept spawning a background refresh every 30s regardless, wasting processes, and the Extra line showed stale values with no visible cause. Read usage.lock, skip the background refresh while the lock is active, and append a dim "(locked Nm)" suffix to the Extra line so the stale display is self-explanatory. Mirrored in the PowerShell counterpart. * chore(statusline): remove block-timer widget The block-timer widget was the most visually dominant item on the first line and duplicated context already implied by the session reset countdown. Drop it along with the adjacent separator to avoid consecutive "|" artifacts on the first line. --------- Co-authored-by: kcenon <4158198+kcenon@users.noreply.github.com> * feat(skills): lower default batch limit to 5 (#300) Reduce default batch size from 20 to 5 and cap maximum at 10 across issue-work and pr-work skills. Larger batches now require explicit --force-large opt-in. Rule drift becomes empirically visible around items 15-25 in long batches; the conservative default keeps batches inside the safe zone by default, while still allowing power users to bypass with explicit acknowledgment of the risk. 
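The batch-size policy from #300 can be sketched in Python as follows. This is an illustration only: the helper name `resolve_batch_limit`, the constant names, and the choice to reject (rather than clamp) an over-cap request are assumptions, since the skills themselves are instruction files and not code.

```python
DEFAULT_BATCH_LIMIT = 5   # default from #300 (previously 20)
MAX_BATCH_LIMIT = 10      # cap from #300 unless --force-large is given


def resolve_batch_limit(requested=None, force_large=False):
    """Return how many items a batch may process.

    Hypothetical helper: raising on an over-cap request is an
    assumption, not documented behavior of issue-work or pr-work.
    """
    if requested is None:
        return DEFAULT_BATCH_LIMIT
    if requested < 1:
        raise ValueError("batch limit must be at least 1")
    if requested > MAX_BATCH_LIMIT and not force_large:
        raise ValueError(
            f"batch limit {requested} exceeds {MAX_BATCH_LIMIT}; "
            "pass --force-large to acknowledge the drift risk"
        )
    return requested
```

With these assumptions, an unqualified run processes 5 items, a request for 10 is accepted, and anything larger requires the explicit opt-in.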
Closes #288 * feat(skills): add chunked confirmation gate every 5 items (#301) Insert a mandatory confirmation gate after every 5 items in batch mode for both issue-work and pr-work. The gate halts execution and uses AskUserQuestion to ask whether to continue, pause-with-resume, or cancel. A new --no-confirm flag bypasses the gate for CI-driven or unattended batches. Beyond the obvious user-control benefit, each AskUserQuestion produces a fresh user message that acts as an attention anchor, restoring salience of CLAUDE.md rules and skill instructions that have drifted out of recent context. This is one of the strongest available drift mitigations for long-running batches. Pausing writes .claude/resume.md per the existing session-resume workflow so the next session can pick up at item N+1. Closes #289 * feat(skills): inline critical rules in batch mode (#302) Add a per-item rule reminder (B-4.0) to both issue-work and pr-work batch modes. Before each item's execution, the loop emits a 5-line invariant block as a fresh tool result so the language, commit format, attribution, and CI gate rules sit in the recent attention window instead of being buried by accumulating context. This complements the per-5-item user-facing gate from #289: the gate refreshes attention via user messages every five items, while the inline reminder refreshes it via tool results every single item. Together they form a multi-layer drift mitigation that does not depend on model self-discipline. The 5-line cap keeps cumulative cost linear and tiny (~25 tokens per item). Reference doc loads inside the loop are explicitly forbidden so the inline reminder remains the most recent context anchor. 
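How the per-item reminder (#302) and the per-5-item gate (#301) interleave can be modeled with a short Python sketch. The function `plan_batch_events` and the event labels are hypothetical, and skipping the gate after the final item is an assumption; the sketch only shows the ordering described above, not how the skills are implemented.

```python
CONFIRM_INTERVAL = 5  # gate frequency described in #301


def plan_batch_events(item_ids, no_confirm=False):
    """Return the ordered events a batch run would emit.

    Hypothetical model: "reminder" stands for the B-4.0 rule block,
    "work" for the item itself, "gate" for the AskUserQuestion prompt.
    """
    events = []
    total = len(item_ids)
    for index, item_id in enumerate(item_ids, start=1):
        # The reminder precedes every item so it is the freshest context.
        events.append(("reminder", item_id))
        events.append(("work", item_id))
        # Assumption: no gate is needed once the last item is done.
        at_interval = index % CONFIRM_INTERVAL == 0
        if at_interval and index < total and not no_confirm:
            events.append(("gate", item_id))
    return events
```

For a 12-item batch this yields 12 reminders and gates after items 5 and 10; with `no_confirm=True` the gates disappear while the reminders stay.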
Closes #290 * feat(hooks): add pr-language-guard pretooluse hook (#303) * feat(hooks): add validate-language shared library * feat(hooks): add pr-language-guard pretooluse hook * ci(hooks): register pr-language-guard in settings * docs: document pr-language-guard hook * feat(hooks): add merge-gate-guard pretooluse hook (#304) * feat(hooks): add merge-gate-guard pretooluse hook * ci(hooks): register merge-gate-guard in settings * docs: document merge-gate-guard hook * feat(hooks): extend attribution guard to gh pr and issue commands (#305) * refactor(hooks): expose validate-no-attribution helper * feat(hooks): add attribution-guard pretooluse hook * ci(hooks): register attribution-guard in settings * docs: document attribution-guard hook * feat(skills): delegate batch items to subagents (#306) Default batch mode in issue-work and pr-work now dispatches each item to a fresh general-purpose Agent. The parent context retains only a per-item queue record (item id, status, pr url, ci conclusion), while expensive tool output lives inside the subagent and is discarded on completion. This keeps rule compliance at item 30 equivalent to item 1. Add an opt-in --inline flag that preserves the legacy single-context loop for tiny batches or cases where inter-item context is genuinely useful (e.g. related regressions sharing a root cause). Document the trade-offs in a comparison table inside each reference/batch-mode.md. Closes #294 * feat(skills): add --auto-restart to batch gate (#307) Replace the interactive chunked gate with a forced session restart when --auto-restart is set. Every CONFIRM_INTERVAL items the batch writes .claude/resume.md via the Batch Workflow Resume Format and exits, so a fresh claude session starts each chunk with CLAUDE.md and skill files reloaded at position zero. This is a stronger context reset than the interactive gate because it ends the OS process entirely: accumulated tool results, gh outputs, CI log fetches and diff reads are all discarded. 
Intended for long unattended batches paired with a wrapper that re-invokes claude on exit. --no-restart overrides --auto-restart and falls back to the interactive gate, so defensive scripts can guarantee no session exit even if --auto-restart is set by an alias or wrapper. Default behavior (neither flag) is unchanged: the interactive AskUserQuestion gate still fires every CONFIRM_INTERVAL items. Touches both issue-work and pr-work skill pairs to keep the two batch-mode references in sync, plus a one-line note in global/CLAUDE.md and a cross-reference in session-resume.md. Closes #296 * docs: add distributed-batch-dispatch research doc (#308) Research-only deliverable for #298: evaluate RemoteTrigger and the /schedule skill as a distributed dispatch layer for issue-work / pr-work batch mode. The doc answers all six research questions by separating what is knowable from the RemoteTrigger API surface from what needs empirical testing. Each test item is flagged explicitly so a future revisit has a concrete test plan instead of hand-waving. Decision: DEFER. The rule-drift problem #298 was meant to address is already solved by subagent delegation (#294 / PR #306) and --auto-restart (#296 / PR #307), both of which reuse existing infrastructure without new operational surface area. The unique value proposition of remote triggers -- parallel execution across independent accounts or machines -- is not a current batch-mode requirement, since batch limits are bounded by rule drift rather than by throughput. Includes a cost comparison table across subagent delegation, --auto-restart, external script (#297), and remote trigger per item; an empirical test plan that can be executed at low cost if the decision is revisited; and explicit linkage to the other tier-3 issues so the strategy space stays coherent. 
Closes #298 * feat(scripts): add external batch orchestrators (#309) Add four wrapper scripts that spawn one fresh claude CLI process per batch item, pushing isolation to the OS process boundary: scripts/batch-issue-work.sh (bash) scripts/batch-issue-work.ps1 (PowerShell 7) scripts/batch-pr-work.sh (bash) scripts/batch-pr-work.ps1 (PowerShell 7) Each script enumerates candidate items via gh, loops over them, and invokes `claude --print /issue-work <org/repo> <n> --solo` (or the pr-work equivalent) per item. Per-item logs are written to ~/.claude/batch-logs/<timestamp>/issue-<n>.log. On any item failure the batch pauses and exits non-zero so the operator can inspect the log before continuing; successful items are not rolled back. pr-work orchestrators select PRs whose statusCheckRollup contains at least one FAILURE/TIMED_OUT/CANCELLED/ACTION_REQUIRED/STARTUP_FAILURE conclusion, so passing and in-progress PRs are skipped. README adds Use Case D documenting when to pick external orchestration over in-session batch mode, and the Common Tasks table gains rows for the new scripts on both platforms. This complements subagent delegation (#294 / PR #306) and --auto-restart (#296 / PR #307) by offering the strongest available form of per-item isolation: a fresh claude process boots for every item, so neither conversation-level nor host-process-level state can leak. 
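The PR selection rule used by the pr-work orchestrators can be sketched in Python. The real scripts are bash and PowerShell; the function names here are hypothetical, and the input is assumed to be shaped like the JSON that `gh pr list --json number,statusCheckRollup` returns.

```python
# Conclusions that mark a PR as needing work, as listed in #309.
FAILING_CONCLUSIONS = {
    "FAILURE",
    "TIMED_OUT",
    "CANCELLED",
    "ACTION_REQUIRED",
    "STARTUP_FAILURE",
}


def needs_pr_work(pr):
    """True when at least one check in the rollup has a failing conclusion."""
    checks = pr.get("statusCheckRollup") or []
    return any(check.get("conclusion") in FAILING_CONCLUSIONS for check in checks)


def select_failing_prs(prs):
    """Return PR numbers to process; passing and in-progress PRs are skipped."""
    return [pr["number"] for pr in prs if needs_pr_work(pr)]
```

A PR whose checks are all `SUCCESS`, or still running with an empty conclusion, falls through the filter, which matches the skip behavior described above.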
Closes #297 * test(batch): add drift signal extractor library (#316) Pure bash library of five extractor functions used to measure rule compliance in batch-mode runs of /issue-work and /pr-work: - extract_language_violations: count CJK characters - extract_attribution_leaks: count AI attribution markers - extract_ci_gate_violations: detect merge-with-failing-check - extract_missing_closes: detect missing Closes/Fixes/Resolves keywords - extract_commit_format_violations: count Conventional Commits violations Sources hooks/lib/validate-commit-message.sh so CMV_ATTRIBUTION_REGEX and validate_commit_message stay single-source. Self-tested with 34 cases covering empty input, mixed text, JSON variants, and SSOT loading. Wired into validate-hooks.yml so future PRs to main run the suite and shellcheck the new scripts. Foundational layer for the Tier 2 benchmark orchestrator (#314) and the regression test (#311). Part of #310 Part of #287 Closes #312 * test(batch): add scratch repo seeding script (#317) Idempotent bash script that bootstraps kcenon/batch-drift-scratch with 30 trivial typo-fix issues for the Tier 2 benchmark corpus: - Creates the repo if absent (public, with README) - Upserts docs/file-01.md .. docs/file-30.md with identical one-line typo content; skips PUT when existing SHA-matched content already matches - Enumerates open issues prefixed "fix typo in docs/file-" and creates only the missing ones (title/body/acceptance criteria 5W1H-formatted) Supports --dry-run (network-free preview) and --help. Tests cover 23 cases: flag parsing, dry-run output shape and determinism, file numbering bounds (01..30), and that dry-run never issues HTTP. Fixes the GNU-vs-BSD grep divergence in the test harness by passing `--` before literal patterns that begin with `-`. Wired into validate-hooks.yml as a sibling step to the extractor tests. 
Part of #310 Part of #287 Closes #313 * test(batch): add benchmark orchestrator and aggregator (#318) Adds the operator-facing benchmark entry point and its offline-testable aggregation half: - run-benchmark.sh: orchestrator that invokes /issue-work under one of three Tier 2 strategies (subagent, auto-restart, orchestrator), captures per-item PR data, and delegates aggregation. Dry-run mode prints the planned invocation without calling claude, gh, or the seeder. Validates args, preconditions, and dependency files (extractor lib, seeder, external orchestrator from #297) before running. - aggregate-results.sh: pure function that reads raw per-item PR JSONs from a directory, applies the five drift extractors, and emits the strategy results JSON per the schema in #314. No network, no gh. - 11 aggregator fixtures: 5 clean + 6 with drift concentrated at item 6 (hangul body, AI-assisted keyword, merged-with-FAILURE, no Closes keyword, no-type-prefix commit). Exercises bucketing into items_1_to_5 and items_6_to_30. - 71 new test cases (40 aggregator + 31 orchestrator) covering flag parsing, precondition errors, dry-run shape, per-strategy invocation lines, JSON schema shape, bucketing, and determinism. Per-item capture is designed to survive single-item failures: a gh pr view error writes a null-signal raw file with capture_error=true instead of aborting the batch. Live execution is deliberately out of scope (belongs to #315); this PR ships the infrastructure that #315 will drive. 
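The bucketing step of the aggregator can be sketched in Python under stated assumptions. The actual aggregate-results.sh is bash and follows the schema in #314, which is not reproduced here; the signal keys, the `index` and `signals` fields, and the output layout below are hypothetical stand-ins.

```python
# Hypothetical short names for the five drift signals.
SIGNALS = ("language", "attribution", "ci_gate", "missing_closes", "commit_format")
BUCKETS = ("items_1_to_5", "items_6_to_30")


def aggregate(items):
    """Sum drift signals per bucket, skipping items whose capture failed."""
    buckets = {name: dict.fromkeys(SIGNALS, 0) for name in BUCKETS}
    capture_errors = 0
    for item in items:
        if item.get("capture_error"):
            # A failed capture carries null signals, so it is only counted.
            capture_errors += 1
            continue
        bucket = "items_1_to_5" if item["index"] <= 5 else "items_6_to_30"
        signals = item.get("signals") or {}
        for signal in SIGNALS:
            buckets[bucket][signal] += signals.get(signal, 0)
    return {"buckets": buckets, "capture_errors": capture_errors}
```

Keeping capture failures out of the sums means a single `gh pr view` error cannot masquerade as either a clean item or a drifted one.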
Part of #310 Part of #287 Closes #314 * feat(agents): add memory, maxTurns, and effort frontmatter fields (#323) Add advanced frontmatter fields to all 6 agent definitions: - memory: project scope (local for structure-explorer) - maxTurns: 15-30 based on agent complexity - effort: high (medium for structure-explorer) - initialPrompt: context-loading prompt for 5 agents - Enhanced description fields for better agent selection - Added Bash tool to code-reviewer for git/test access Changes applied to both project/.claude/agents/ and plugin/agents/. structure-explorer kept on haiku model as its task (file structure mapping) is well-suited to a fast, lightweight model. Closes #320 * feat(agents): standardize output format, guardrails, and language-specific rules (#324) Add four new sections to agent definitions: 1. Core Behavioral Guardrails (all 6 agents): Self-check questions from anti-patterns.md to prevent assumption-making, over-engineering, and scope creep. 2. Standardized Output Format (5 agents, qa-reviewer unchanged): - code-reviewer: severity table + APPROVE/REQUEST_CHANGES verdict - codebase-analyzer: confidence scores in findings table - documentation-writer: documentation checklist with completeness % - refactor-assistant: before/after diff + test verification report - structure-explorer: no change needed (already structured) 3. Language-Specific Rules (3 agents): - code-reviewer: language-aware review checks - codebase-analyzer: language-aware analysis points - refactor-assistant: language-aware refactoring considerations 4. Safety Verification Protocol (refactor-assistant only): Replaces the 4-line safety principles with concrete before/during/after verification steps and hard-stop conditions. Changes applied to both project/.claude/agents/ and plugin/agents/. 
Closes #321 * feat(agents): add team communication protocol and new specialized agents (#325) Part A: Team Communication Protocol - Added ## Team Communication Protocol to all 6 existing agents - Each protocol defines receives-from, sends-to, handoff triggers, and task management behavior for team collaboration - No circular delegation loops in the protocol graph: structure-explorer → codebase-analyzer → {documentation-writer, code-reviewer} code-reviewer ↔ qa-reviewer (bidirectional boundary verification) code-reviewer → refactor-assistant (one-way delegation) Part B: New Specialized Agents - dependency-auditor: CVE scanning, license compliance, freshness analysis, unused dependency detection - test-strategist: coverage gap identification, test quality assessment, strategy recommendation, skeleton generation - migration-planner: deferred — scope too speculative for a configuration repo without production databases or APIs Both new agents include all sections from #320 (frontmatter) and #321 (guardrails, output format, team protocol). Updated project/CLAUDE.md agent list (6 → 8 agents). Synchronized all changes to plugin/agents/. Closes #322 * test(batch): add drift regression harness with thresholds and docs (#326) Add automated behavioral regression test that verifies 30-item batch workflows retain rule compliance within configurable thresholds. Components: - run-regression.sh: orchestrates seed, benchmark, threshold assertion - thresholds.json: default max-allowed drift counts per signal - test-run-regression.sh: 35 offline unit tests (arg parsing, dry-run) - docs/batch-drift-regression.md: methodology, triage guide, cost notes - .gitignore: exclude benchmark runtime outputs Reuses benchmark infrastructure from tests/batch_drift_benchmark/ and SSOT extractors from hooks/lib/validate-commit-message.sh. Note: CI workflow (.github/workflows/batch-drift-regression.yml) must be pushed separately with a token that has the `workflow` scope. 
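The threshold assertion at the heart of the regression harness can be sketched in Python. The layout of thresholds.json is not shown in this message, so a flat mapping of signal name to maximum allowed count is assumed, and `find_threshold_breaches` is a hypothetical name.

```python
def find_threshold_breaches(drift_counts, thresholds):
    """Return {signal: (observed, max_allowed)} for every exceeded threshold.

    Assumes both arguments are flat dicts keyed by signal name; a signal
    missing from drift_counts is treated as zero observed violations.
    """
    breaches = {}
    for signal, max_allowed in thresholds.items():
        observed = drift_counts.get(signal, 0)
        if observed > max_allowed:
            breaches[signal] = (observed, max_allowed)
    return breaches
```

An empty result means the run stayed within bounds; a harness built this way would exit non-zero whenever the returned dict is non-empty.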
Closes #311 * test(batch): execute Tier 2 benchmarks and publish results (#327) * fix(scripts): use gh api user for auth check instead of gh auth status gh auth status returns non-zero when any configured token is invalid, even if the active GH_TOKEN works. Replace with gh api user which tests actual API connectivity. * fix(scripts): remove redundant jq pipe and add empty fallback The gh -q flag already outputs valid JSON. The extra pipe to jq -c caused failures under pipefail when gh emitted auth warnings to stderr. Add empty-array fallback for robustness. * test(batch): add Tier 2 benchmark results and comparison document Benchmark executed against kcenon/batch-drift-scratch with 30 trivial typo-fix issues. Results: zero drift across all 5 signals for both 5-item baseline and 30-item single-session batch. Key finding: Tier 0-1 mitigations (hooks, inline rules) are sufficient for uniform XS workloads. Orchestrator (#297) recommended as default Tier 2 strategy for production mixed-complexity batches. Limitation: subagent and auto-restart strategies could not be benchmarked due to batch mode AskUserQuestion blocking in --print mode. Closes #315 * fix(structure): consolidate triple-duplicated workflow reference files (#337) * feat(scripts): add SSOT sync and check tooling for workflow refs Adds scripts/sync_references.{sh,ps1} and scripts/check_references.{sh,ps1} to keep workflow reference files consistent across three locations: - canonical: project/.claude/rules/workflow/ - mirror 1: project/.claude/skills/project-workflow/reference/ - mirror 2: plugin/skills/project-workflow/reference/ scripts/sync.sh gains a --references-only fast path that delegates to sync_references.sh. The validate-skills CI workflow runs check_references on every PR and fails on drift (exit 2). 
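The drift check between the canonical directory and its mirrors can be sketched in Python. The shipped check_references scripts are bash and PowerShell; the function names here are hypothetical, comparing SHA-256 digests is one assumed way to test byte identity, and only the exit code 2 on drift comes from the description above.

```python
import hashlib
from pathlib import Path


def file_digests(root):
    """Map each file's path relative to root to the SHA-256 of its bytes."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def check_references(canonical, mirrors):
    """Return (exit_code, drifted_mirrors); exit code 2 signals drift."""
    expected = file_digests(canonical)
    drifted = [str(mirror) for mirror in mirrors if file_digests(mirror) != expected]
    return (2 if drifted else 0), drifted
```

Because the comparison covers both file names and contents, a mirror with a missing, extra, or edited file is reported as drifted.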
Part of #328 * fix(refs): restore canonical content in project and plugin mirrors The skills/reference/ copies contained one-line relative-path strings left over from a symlink experiment that never worked. Imports via @./reference/<file> received a literal path string instead of the intended content. The plugin/reference/ copies carried an older, verbose version of each document that had drifted substantially from the current concise rules/ copy. Runs scripts/sync_references.sh to bring both mirrors byte-identical to project/.claude/rules/workflow/. Drift is enforced by the new CI check. Part of #328 Closes #329 * chore(versions): unify version numbering via VERSION_MAP.yml (#338) * chore(versions): add VERSION_MAP.yml with independent SemVer tracks Introduces VERSION_MAP.yml as single source of truth for four independent SemVer tracks: suite, plugin, plugin-lite, and settings-schema. Each field moves independently to reflect their distinct release cadences. Adds scripts/check_versions.{sh,ps1} to verify each declared field matches its consumer files (plugin.json, settings.json, README badge URL) and scripts/sync_versions.{sh,ps1} to propagate map values to consumers. Wires check_versions.sh into the validate-skills CI workflow so that any drift between VERSION_MAP.yml and its consumers fails the release PR. Adds scripts/sync.sh --versions-only fast path for the common case of regenerating version references after editing the map. * docs(release): integrate VERSION_MAP into release skill and docs Updates the release skill to accept --target <field> and bump only the targeted SemVer track via VERSION_MAP.yml and sync_versions.sh. For non-suite targets the tag format becomes <target>-v<version> to keep tracks separate in the git tag history. 
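The tag naming rule for independent SemVer tracks can be sketched in Python. Only the `<target>-v<version>` form for non-suite targets is stated above; the plain `v<version>` form for the suite track and the helper name `release_tag` are assumptions.

```python
def release_tag(target, version):
    """Build the git tag for one VERSION_MAP track.

    Assumption: the suite track keeps a bare "v<version>" tag; other
    tracks are prefixed so their tag histories stay separate.
    """
    if target == "suite":
        return f"v{version}"
    return f"{target}-v{version}"
```

Under these assumptions, `release_tag("plugin-lite", "0.3.0")` gives `plugin-lite-v0.3.0`, which cannot collide with a suite tag.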
Documents the version layout in docs/CUSTOM_EXTENSIONS.md under a new VERSION_MAP SSOT section, explaining why plugin, plugin-lite, and settings-schema each follow their own SemVer track rather than being locked to a single suite version. The skill falls back to its legacy single-version behavior when no VERSION_MAP.yml is present, so it remains usable in projects that inherit the configuration without adopting the map. * fix(hooks): repair 4 bugs in markdown-anchor-validator (#340) * fix(hooks): four bugs in markdown-anchor-validator Bug A (sh only): `/^#+[[:space:]]/` accepted lines with 7+ hashes as headings, silently registering anchors that GitHub does not create. Replaced with `/^#{1,6}[[:space:]]/` in both the match and the subsequent sub(). Bug B (sh + ps1): intra-file and inter-file reference extraction scanned the raw line, so `[a](#missing)` inside inline backticks was treated as a live reference and blocked commits on documentation files that include syntax examples. Now the line is copied into a `work`/`scanLine` variable with inline-code spans stripped before the match loops run. Bug C (sh only): JSON escaping covered `"` but not `\`, producing invalid JSON whenever an anchor or filename contained a backslash. Replaced manual escaping with `jq -Rs .`, which handles `\`, `"`, newlines, and control characters in one step. The error message is now built with real newlines (`$'\n'`) rather than literal "\n" strings, matching how jq expects input. Bug D (sh only): `set -euo pipefail` combined with a `jq` pipeline could abort the script silently on systems without jq, leaving Claude Code without a decision response. Added an explicit `command -v jq` check at script entry that fails open with a stderr warning, plus a `|| CMD=""` guard on the command-extraction pipe. The PowerShell variant only shared bug B; its heading regex already limited to `#{1,6}`, its JSON output goes through `ConvertTo-Json`, and it does not depend on jq. 
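Bugs A through C can be illustrated with a Python sketch of the same logic. The actual hook is written in shell, awk, and PowerShell; the function names, the simplified link pattern, and the `reason` key in the payload are hypothetical, and `json.dumps` stands in for `jq -Rs .`.

```python
import json
import re

HEADING_RE = re.compile(r"^#{1,6}[ \t]")      # Bug A: at most six hashes
INLINE_CODE_RE = re.compile(r"`[^`]*`")       # Bug B: spans to strip first
ANCHOR_REF_RE = re.compile(r"\]\(#([^)]+)\)")  # simplified [text](#anchor)


def is_heading(line):
    """True only for ATX headings with one to six leading hashes."""
    return HEADING_RE.match(line) is not None


def anchor_references(line):
    """Extract intra-file anchor targets, ignoring inline-code examples."""
    work = INLINE_CODE_RE.sub("", line)
    return ANCHOR_REF_RE.findall(work)


def deny_payload(problems):
    """Bug C: let a JSON encoder escape backslashes, quotes, and newlines."""
    message = "\n".join(problems)
    return json.dumps({"reason": message})
```

A line of seven hashes fails `is_heading` because the character after the sixth hash is another hash and not whitespace, which is the behavior the `{1,6}` bound was introduced to guarantee.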
* test(hooks): add regression suite for markdown-anchor-validator bugs Adds fixture markdown files and a test runner that exercises each of the four bugs addressed in the companion fix commit: - bug-a-excessive-hashes.md: 7 hashes should not register an anchor - bug-b-inline-code.md: inline-code example syntax should not block commit - bug-c-backslash.md: anchor with backslash must produce valid JSON - baseline-valid.md: well-formed markdown must not trigger errors The runner skips cleanly when jq is absent (demonstrating the bug D fail-open path) and matches the "N passed, N failed" summary format that tests/hooks/test-runner.sh parses, so it is picked up by the validate-hooks CI workflow automatically. * chore(plugin): remove redundant path fields from manifests (#341) The official Claude Code plugin spec auto-discovers agents/, skills/, hooks/hooks.json, .mcp.json, and .lsp.json at the plugin root. The explicit path fields in plugin.json were overrides that duplicated the default layout and risked masking future spec changes. - plugin/.claude-plugin/plugin.json: drop agents, skills, hooks, lspServers - plugin-lite/.claude-plugin/plugin.json: drop skills - tests/plugin/smoke-test.{sh,ps1}: switch from manifest-declared paths to default-location discovery; add hooks.json validity check - plugin/README.md: update manifest compatibility note to describe auto-discovery behavior instead of the removed path fields Closes #331 * chore(ci): seal nightly batch drift regression workflow (#343) Add the GitHub Actions workflow that schedules the drift regression harness produced under epic #287, and exclude the local scratch repo from version control so it is not accidentally committed by future seeding runs. 
Closes #342 * feat(skills): modernize SKILL.md frontmatter with disable-model-invocation, allowed-tools, and paths (#344) * feat(skills): add disable-model-invocation to global workflow skills * feat(skills): add allowed-tools to global workflow skills * feat(skills): add paths to plugin and project knowledge skills * feat(skills): add when_to_use to plugin and project knowledge skills * feat(skills): extend workflow frontmatter to doc-review git-status doc-update * docs(skills): update frontmatter documentation * feat(skills): add paths to project code-quality knowledge skill * feat(skills): add paths to project code-quality skill * feat(hooks): adopt InstructionsLoaded, PostCompact, and TaskCreated hook events (#345) * feat(hooks): add instructions-loaded-reinforcer for InstructionsLoaded event * feat(hooks): add post-compact-restore for PostCompact event * feat(hooks): add task-created-validator for TaskCreated event * feat(settings): subscribe to InstructionsLoaded, PostCompact, TaskCreated events * test(hooks): add tests for InstructionsLoaded, PostCompact, TaskCreated hooks * fix(hooks): ensure bash and powershell hook outputs are byte-identical * docs(hooks): document InstructionsLoaded, PostCompact, TaskCreated hooks * feat(scripts): add official-spec linter and integrate into validate-skills.yml (#346) * feat(scripts): add official-spec linter for skill, plugin, settings Validates SKILL.md frontmatter, plugin.json, and settings.json against canonical Claude Code 2026 schemas (PyYAML + jsonschema). Wires into sync.sh --lint sub-flag and validate_skills.sh as a soft-fail check. * test(scripts): add spec linter test suite 12 fixture-based test cases covering SKILL.md/plugin.json/settings.json validation, did-you-mean suggestions for unknown fields, --warn-only and --strict mode flags, and full-repo discovery via the wrapper. Adds --strict flag and unknown-field annotation to spec_lint.py. 
* ci(skills): wire spec linter into validate-skills workflow Adds jsonschema dependency, expands path triggers to cover spec linter sources and schemas, runs spec_lint.sh --strict and the linter test suite. Adds --strict / -Strict flag to bash and PowerShell wrappers so CI can distinguish strict-mode failures from regular violations. * chore(scripts): enable Set-StrictMode in spec_lint.ps1 Hardens the PowerShell wrapper against silent reference-to-uninitialized variables, matching the bash twin's set -euo pipefail rigor. * docs(scripts): document spec linter and its CI integration Add a Spec Linter section to docs/CUSTOM_EXTENSIONS.md describing the canonical Claude Code 2026 schemas under scripts/schemas/, the bash and PowerShell wrappers, the three exit codes, the warn-only/strict modes, the schema-update procedure, and the validate-skills.yml CI gate. Add an Unreleased entry to global/VERSION_HISTORY.md cross-referencing issue #334 and parent epic #328. * fix(sync): abort interactive sync when canonical files violate schemas Adds a pre-flight spec_lint guard to sync.sh and sync.ps1 default flow so the interactive sync refuses to deploy schema-violating files to ~/.claude/. Bypass with --skip-lint for emergency syncs (e.g., reverting a bad change). Also adds the py launcher to bash Python discovery for Git Bash on Windows, refreshes the spec_lint.py docstring exit-code table, fixes sync.ps1 flag forwarding to use a splatted hashtable so PowerShell ValidateSet does not misbind switches as Mode values, and extends both test suites with three integration cases (--lint fast path, pre-flight abort, --skip-lint bypass). * feat(skills): adopt context:fork for security-audit, performance-review, and doc-review (#347) * feat(skills): adopt context:fork for audit and review skills Add agent: Explore to security-audit and performance-review (both already had context: fork) so audits run in a forked, read-only subagent context. 
Add allowed-tools to performance-review to declare its read-only audit posture explicitly. Add context: fork plus agent: general-purpose to doc-review so its larger analysis output runs in isolation; general-purpose is required for --fix write access. Each modified SKILL.md gains a structured Output section that reminds the forked subagent it has no access to the calling conversation's history and must operate from the supplied arguments only. Project mirror SKILL.md files under project/.claude/skills/ are intentionally preserved with their existing development-style tool declarations and are out of scope for this issue. Closes #335 * docs(skills): document context:fork adoption for audit skills Add Skill Context Isolation subsection under Detailed Breakdown in docs/CUSTOM_EXTENSIONS.md, listing the three skills now using context: fork with their agent choice and rationale. Add Unreleased entry to global/VERSION_HISTORY.md cross-referencing #335 and the parent epic #328. Closes #335 * docs: migrate to canonical documentation URLs and inventory settings fields (#348) Closes #336 Replaces legacy documentation URLs with the new canonical equivalents and adds a Settings Field Inventory section to COMPATIBILITY.md classifying every non-schema field as Stable/Experimental/Undocumented/Misplaced against the official reference. Key findings recorded: - showTurnDuration and teammateMode belong in the global config file, not settings.json - env.CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS is officially experimental - env.MAX_TEAMS, ENABLE_TOOL_SEARCH, MAX_MCP_OUTPUT_TOKENS are undocumented - effortLevel only accepts low/medium/high/xhigh (not max) Files touched: HOOKS.md, docs/CUSTOM_EXTENSIONS.md, COMPATIBILITY.md (URL fixes + new inventory section), README.md (migration note), VERSION_HISTORY.md (Unreleased entry). 
* Replace local git-commit-format with symlink (#349) Replace project/.claude/skills/project-workflow/reference/git-commit-format.md with a symlink to ../../../rules/workflow/git-commit-format.md to centralize commit message guidelines and remove the duplicated inline guidance. * Point workflow refs to centralized rules (#350) Replace local workflow reference docs with symlinks to the canonical rule files to remove duplication and centralize maintenance. Updated files: project/.claude/skills/project-workflow/reference/github-issue-5w1h.md, github-pr-5w1h.md, and performance-analysis.md now point to ../../../rules/workflow/*. This keeps a single source of truth for workflow guidance and simplifies future updates. * fix(ci): unblock release by fixing 3 blocker categories (#352) * fix(ci): add bash 4+ install on macOS and clean remaining SC2034 (#353) * fix(ci): remove remaining dead HAS_FIELD assignment (#354) * test(hooks): use jq as primary JSON validator with python fallback (#355) The assert_valid_json helpers in test-instructions-loaded-reinforcer.sh and test-post-compact-restore.sh required python or python3 to be on PATH. On minimal runners (e.g. the WSL/Docker image used by CI contributors that only installs jq via the validate-hooks workflow), both python invocations fail with exit 127 and the tests incorrectly report the hook output as invalid JSON. Prefer jq — already a required dependency of every hook these tests cover, so it is guaranteed present whenever the hooks themselves run — and retain python3 and python as fallbacks for environments where jq is unavailable but python is. * Make markdown anchor validator mawk-compatible (#356) Replace the ERE quantifier /^#{1,6}/ with match()+RLENGTH-based logic to detect 1–6 leading hashes. mawk does not support the {1,6} quantifier, so this change uses match($0, /^#+[[:space:]]/) and computes h_count = RLENGTH - 1, then proceeds only if h_count is between 1 and 6. 
The rest of the heading extraction/printing logic is preserved, so behavior and output remain the same while ensuring compatibility with mawk. * fix(config): wire SSL_CERT_FILE to resolve sandbox TLS for git/curl (#368) * fix(config): wire SSL_CERT_FILE for sandbox TLS Adds SSL_CERT_FILE and SSL_CERT_DIR env vars so git, curl, npm, pip and other CA-bundle-aware tools complete TLS handshakes inside the Claude Code sandbox without dangerouslyDisableSandbox. Adds scripts/verify-tls.sh to probe the fix. gh on macOS still fails because Darwin Go ignores SSL_CERT_FILE; remediated separately via a Bash allowlist. See docs/SANDBOX_TLS.md. Refs #367 * docs(sandbox): document SSL_CERT_FILE fix and gh caveat Adds docs/SANDBOX_TLS.md with the root cause, coverage matrix, platform fallback ladder, and the gh-on-macOS caveat plus its Bash-allowlist remediation. Updates global/CLAUDE.md with a new Environment Workarounds section that references the doc. Refs #367 * feat(hooks): add PostToolUse agent-checkpoint hook (#369) Closes #360 Add post-task-checkpoint.sh/.ps1 under global/hooks/ and register it in global/settings.json and global/settings.windows.json under PostToolUse matcher Task|Agent. After each Task or Agent tool call completes, the hook snapshots any working-tree changes into a wip(agent): checkpoint commit so a later sub-agent cannot silently clobber a prior agent's output in multi-agent workflows. Design: - Fail-open: exit 0 on any parse, git, or jq error — never block workflow - No-op on clean tree (avoids empty-commit spam) - No-op outside git worktree - --no-verify bypasses commit-msg validator (wip() not in accepted types) - --allow-empty satisfies empty-tree AC defensively - Agent name sanitized to [A-Za-z0-9_-], clipped to 64 chars Documentation: HOOKS.md gains section #19 with purpose, limitations, opt-out path, and the async/timeout tradeoff rationale. 
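The agent-name sanitization from the design list can be sketched in Python. The hook itself is a shell and PowerShell script; the helper name is hypothetical, and dropping disallowed characters (rather than replacing them) before clipping is an assumption.

```python
import re


def sanitize_agent_name(name, max_len=64):
    """Keep only [A-Za-z0-9_-] and clip the result to max_len characters.

    Assumption: disallowed characters are removed, not substituted.
    """
    cleaned = re.sub(r"[^A-Za-z0-9_-]", "", name)
    return cleaned[:max_len]
```

Restricting the name this way keeps shell metacharacters and newlines out of the checkpoint commit subject.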
Tests: tests/hooks/test-post-task-checkpoint.sh covers 14 cases including dirty/clean/non-repo paths, agent-name sanitization, malformed-JSON fail-open, and a two-agent overwrite scenario that proves agent-A's output is recoverable from HEAD~1 after agent-B overwrites shared.txt.

Scope clarification vs the issue body:
- Test fixture lives at tests/hooks/test-post-task-checkpoint.sh (project convention, discoverable by test-runner.sh) instead of tests/post-task-checkpoint/.
- hooks/install-hooks.sh is not modified: that script installs git-level hooks under .git/hooks/, while Claude Code hooks in global/hooks/ are deployed via scripts/sync.sh, which already syncs the entire directory unchanged.

* feat(skills): add Atomic Multi-Phase Execution rule to _policy.md (#370)

Closes #361

Add a top-level Atomic Multi-Phase Execution section to the global command policy so multi-phase requests ("Phase 1/2/3", "up to Phase N") are treated as a single atomic unit without mid-plan confirmation prompts. Cross-reference the rule from issue-work, pr-work, and release SKILL.md, the three skills where this friction was observed. The original common-rules list is preserved verbatim under a new Common Rules heading so existing references remain valid.

* docs: consolidate Environment Workarounds in global/CLAUDE.md (#371)

Closes #362

The canonical Environment Workarounds section in global/CLAUDE.md (lines 17-21) already documents the SSL_CERT_FILE fix (PR #368), the gh macOS caveat, Read-before-Edit contract, and the dangerouslyDisableSandbox fallback policy. This change removes two drifted duplicates that still advertised the outdated dangerouslyDisableSandbox-first approach:
- project/.claude/rules/workflow/ci-resilience.md § TLS / Sandbox Errors
- global/skills/pr-work/SKILL.md § TLS/Sandbox Error Handling

Both now defer to the canonical section and docs/SANDBOX_TLS.md, keeping only the diagnostic note that distinguishes TLS errors from auth errors.
Out of scope:
- docs/SANDBOX_TLS.md: deep-dive reference, already cited from CLAUDE.md
- project/.claude/skills/ci-debugging/reference/common-failures.md: diagnostic content (symptoms/causes/solutions), not a rule; kept as-is
- plugin/skills/ci-debugging/reference/common-failures.md: same

Acceptance:
- global/CLAUDE.md Environment Workarounds: already canonical (no change)
- pr-work and ci-resilience now cross-reference it
- No duplicate rules remain (verified by grep of dangerouslyDisableSandbox|SSL_CERT_FILE|OSStatus -26276 across *.md)

* feat(skills): add ci-fix skill codifying recurring CI failure patterns (#372)

Creates global/skills/ci-fix/ with a classifier-plus-known-fixes pipeline for three recurring CI failure patterns documented in the 2026-04-18 /insights report:
- msvc-c4996: deprecated API under /WX warnings-as-errors
- cmake-fetchcontent: GIT_SHALLOW ON with commit hash
- cpp-lib-format: __cpp_lib_format probe vs link-time availability

The skill fetches the failing run log, matches against a deterministic classifier table, and applies the codified remediation. Budget: 20 minutes wall-clock with one retry slot, then escalates.

pr-work SKILL.md now points to ci-fix as a known-pattern shortcut before hand-authoring a fix.

Validated via scripts/spec_lint.sh (no violations) and scripts/validate_skills.sh (232/232 passed).

Closes #363

* feat(hooks): add pre-edit-read-guard PreToolUse hook (#373)

Enforces the Read-before-Edit/Write tool contract. A single script is registered twice in global/settings.json:
- PreToolUse Edit|Write: guard mode (deny on tracker miss)
- PostToolUse Read: track mode (append file_path to tracker)

Tracker lives at $TMPDIR/claude-read-set-<session-id>, one absolute path per line, deduplicated. Fail-open when the tracker file is absent so fresh sessions are not blocked before any Read has fired.

Converts silent Edit retries on unread files into an actionable deny message naming the exact Read target.
Changes:
- global/hooks/pre-edit-read-guard.sh (new)
- global/hooks/pre-edit-read-guard.ps1 (new)
- global/settings.json (+ PreToolUse Edit|Write, + PostToolUse Read)
- global/settings.windows.json (same pair)
- HOOKS.md: new § 20
- tests/hooks/test-pre-edit-read-guard.sh (15 cases)

Validated:
- tests/hooks/test-runner.sh: 253 passed, 0 failed
- scripts/spec_lint.sh: settings/files=2 violations=0
- scripts/validate_skills.sh: 232/232 passed

Closes #364

* feat(skills): add preflight skill for local CI simulation before push (#374)

Creates global/skills/preflight/, a check orchestrator that reproduces the CI contract locally so failures surface on the developer machine, not on GitHub. Pairs with ci-fix: same pattern catalogue, opposite direction (preflight prevents, ci-fix reacts).

Checks in invocation order (cheap first):
- deprecated-api: grep patterns shared with ci-fix/reference
- cmake-configure: cmake -S . -B <dir> -Werror=dev; skips without cmake
- act-linux: nektos/act --list; skips without act or docker
- msvc-docker: docker run of a Windows image; skips off-Windows

Each check emits one JSON line on stdout (status: pass/fail/skip) and writes evidence to $TMPDIR/preflight-<check>-<pid>.log on failure. run-all.sh aggregates and prints a summary to stderr; exit code is non-zero iff any check reports fail.

Integrates with hooks/pre-push via opt-in CLAUDE_PREFLIGHT=1. Default behaviour (protected-branch block) is preserved when the flag is unset.
Changes:
- global/skills/preflight/SKILL.md
- global/skills/preflight/scripts/{run-all,run-deprecated-api,run-cmake-configure,run-act,run-msvc-docker}.sh
- hooks/pre-push (+ CLAUDE_PREFLIGHT=1 branch)
- hooks/pre-push.ps1 (same branch via bash shim)

Validated:
- scripts/spec_lint.sh: skill violations=0
- scripts/validate_skills.sh: 240/240 passed
- Manual run in this repo: 1 pass (deprecated-api), 3 skips

Closes #365

* feat(skills): add fleet-orchestrator skill for parallel multi-repo operations (#375)

* feat(skills): add fleet-orchestrator skill for parallel multi-repo runs

Introduces a supervisor-plus-workers harness that fans a single directive across N repositories. Each repo is handled by a fresh general-purpose Agent launched in background; all workers coordinate through a single flock-guarded manifest (fleet-status.json) that records per-repo status, PR URL, CI conclusion, and merge outcome.

Deliverables:
- SKILL.md: entry point with workflow phases, dispatch protocol, supervisor polling loop, and aggregation report format
- reference/manifest-schema.json: JSON Schema Draft 2020-12 for the shared fleet-status.json file, including worker lifecycle phases and error classes
- reference/worker-template.md: per-repo worker prompt with substitution tokens, manifest update protocol, and failure-isolation contract

Refs: #366

* docs: add fleet-orchestrator user guide alongside harness docs

Adds a user-facing overview of the fleet-orchestrator skill including its place in the tier progression from the 2026-04-18 /insights report, the fan-out/supervisor architecture diagram, the relationship to harness / issue-work / pr-work, example invocations, and failure-isolation classes.

Refs: #366

* chore(release): bump suite to 1.10.0 (#376)

---------

Co-authored-by: kcenon <4158198+kcenon@users.noreply.github.com>
1 parent c6724b8 commit 946fcbd

36 files changed

Lines changed: 3370 additions & 31 deletions

HOOKS.md

Lines changed: 108 additions & 0 deletions
@@ -26,6 +26,8 @@ Hooks are user-defined commands that automatically execute during specific Claud
 | Re-inject critical policy after instruction load | [Instructions Loaded Reinforcer](#16-instructions-loaded-reinforcer-instructionsloaded) |
 | Restore core principles after context compaction | [Post-Compact Restore](#17-post-compact-restore-postcompact) |
 | Validate task descriptions at creation time | [Task Created Validator](#18-task-created-validator-taskcreated) |
+| Auto-commit working tree after Task/Agent runs | [Post Task/Agent Checkpoint](#19-post-taskagent-checkpoint-posttooluse) |
+| Block Edit/Write on files that weren't Read first | [Pre-edit Read Guard](#20-pre-edit-read-guard-pretooluseposttooluse) |
 | Block direct pushes to protected branches | [Pre-push Protected Branch Guard](#git-hooks-pre-push-protected-branch-guard) |
 | Add my own custom hook | [Adding New Hooks](#adding-new-hooks) |
 | Set up hooks on Windows | [Windows Support](#windows-support-powershell) |
@@ -603,6 +605,112 @@ TaskCreated rejected: description must be at least 20 characters (got 12). Add s
```
TaskCreated rejected: description must contain at least one '- [ ]' checkbox marker for acceptance criteria.
```

### 19. Post Task/Agent Checkpoint (PostToolUse)

*Snapshots the working tree into a WIP commit after every `Task` or `Agent` call, so a later sub-agent cannot silently overwrite a prior agent's output in multi-agent workflows.*

**Purpose**: Close the write-race window in multi-agent skills (issue-work team mode, harness fan-out, fleet-orchestrator) where a second agent can clobber uncommitted output from a first agent. The hook checkpoints after each `Task`/`Agent` completes so the previous agent's changes survive in git history even if the working tree is overwritten.

**Trigger**: `PostToolUse` event, matcher `Task|Agent`. Non-matching tools pass through silently.

**Files**: `global/hooks/post-task-checkpoint.sh`, `global/hooks/post-task-checkpoint.ps1`

**Behavior**:
1. Read JSON from stdin (`tool_name`, `tool_input`). Fail-open on malformed input.
2. Skip silently if `tool_name` is not `Task` or `Agent`.
3. No-op if not inside a git worktree (prevents errors in non-repo directories).
4. No-op if the working tree is clean (keeps history free of empty-commit spam).
5. Otherwise: `git add -A && git commit -m "wip(agent): $AGENT_NAME checkpoint $TS" --no-verify --allow-empty`.

**Commit message format**: `wip(agent): <sanitized-agent-name> checkpoint YYYY-MM-DD HH:MM:SS`. The agent name is extracted from `tool_input.subagent_type` (preferred) or `tool_input.name` (fallback); only `[A-Za-z0-9_-]` characters survive sanitization, and the result is clipped to 64 characters.

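The sanitization and message format can be sketched in a few lines of shell. This is a minimal illustration under stated assumptions, not the contents of `post-task-checkpoint.sh`: the helper names `sanitize_agent_name` and `checkpoint_message` are hypothetical, and the sketch takes the agent name as an argument rather than parsing it from the hook's JSON input.

```bash
# Hypothetical helpers; the real hook may be structured differently.

# Keep only [A-Za-z0-9_-] and clip the result to 64 characters.
sanitize_agent_name() {
  printf '%s' "$1" | LC_ALL=C tr -cd 'A-Za-z0-9_-' | cut -c1-64
}

# Build the subject line: wip(agent): <name> checkpoint YYYY-MM-DD HH:MM:SS
# $1 = raw agent name, $2 = optional timestamp (defaults to the current time).
checkpoint_message() {
  name=$(sanitize_agent_name "$1")
  ts=${2:-$(date '+%Y-%m-%d %H:%M:%S')}
  printf 'wip(agent): %s checkpoint %s\n' "$name" "$ts"
}

checkpoint_message 'code reviewer/v2!' '2026-04-20 12:00:00'
```

Deleting disallowed characters (rather than replacing them) keeps the name free of shell and path metacharacters before it is interpolated into the commit subject.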

**Why `--no-verify`**: `wip(agent):` is not in the Conventional Commits type list that `commit-msg` accepts, so the validator would reject it. Checkpoint commits are throwaway and expected to be squashed at PR merge.

**Why `--allow-empty`**: Defensive. It satisfies the acceptance criterion that "hook succeeds on empty tree" even if the no-op check is skipped for some reason (e.g., staged/unstaged boundary edge cases).

**Decision control**: Always exits 0, so the hook never blocks a workflow. Any git, jq, or JSON failure is swallowed silently. The failure mode is "checkpoint didn't happen," not "workflow stopped."

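The no-op checks and the fail-open commit can be sketched as a single function. This is an assumption-laden illustration rather than the shipped script: the function name `checkpoint` and its two arguments are hypothetical, and the stdin JSON parsing that the real hook performs is omitted.

```bash
# Hypothetical sketch of the checkpoint flow.
# $1 = directory to checkpoint, $2 = already-sanitized agent name.
# Usage: checkpoint /path/to/repo my-agent
checkpoint() {
  dir=$1
  name=$2

  # No-op outside a git worktree (also covers "git is not installed").
  git -C "$dir" rev-parse --is-inside-work-tree >/dev/null 2>&1 || return 0

  # No-op on a clean tree: no staged, unstaged, or untracked changes.
  [ -n "$(git -C "$dir" status --porcelain 2>/dev/null)" ] || return 0

  ts=$(date '+%Y-%m-%d %H:%M:%S')
  git -C "$dir" add -A >/dev/null 2>&1 &&
    git -C "$dir" commit -m "wip(agent): $name checkpoint $ts" \
      --no-verify --allow-empty >/dev/null 2>&1 || true

  # Fail-open: a failed checkpoint must never fail the caller.
  return 0
}
```

Every exit path returns 0, which mirrors the "checkpoint didn't happen" failure mode: a missing `git`, a non-repo directory, or a rejected commit all leave the caller unaffected.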

**Configuration**:
```json
{
  "type": "command",
  "command": "~/.claude/hooks/post-task-checkpoint.sh",
  "timeout": 15,
  "async": true
}
```

**Limitations**:
- WIP checkpoints pollute history before squash merge. This is an acceptable tradeoff for recoverability; release-time squash cleans them up.
- Async: the hook does not block the model's next turn. A rapid-fire agent could start before its predecessor's checkpoint lands, though the wall-clock gap in practice is < 100 ms.
- Does not run in non-git directories (e.g., ad-hoc `/tmp` work), since there is nothing to checkpoint there.

**Opt-out**: Remove the `PostToolUse` `Task|Agent` matcher block from `global/settings.json` (and from `global/settings.windows.json` on Windows) and re-run `scripts/sync.sh`. Individual sessions can skip by running outside a git worktree.

**Test fixture**: `tests/hooks/test-post-task-checkpoint.sh` exercises dirty/clean/non-repo paths, agent-name sanitization, malformed-JSON fail-open, and the two-agent overwrite scenario.

### 20. Pre-edit Read Guard (PreToolUse/PostToolUse)

*Converts "file was not read" tool-contract violations into an actionable Read-first deny message on the first attempt, with no more silent Edit retries.*

**Purpose**: Enforce the Claude Code contract that `Edit` and `Write` on an existing file require a prior `Read` in the same session. Without this guard the tool simply errors out and the model retries in the dark. With it, the hook denies the edit up-front with a reason that tells Claude exactly which file to Read.

**Trigger**: A single script is registered under two hook entries:
- `PreToolUse` matcher `Edit|Write`: the guard (returns `allow`/`deny` JSON).
- `PostToolUse` matcher `Read`: the tracker (records the Read path, no JSON).

**Files**: `global/hooks/pre-edit-read-guard.sh`, `global/hooks/pre-edit-read-guard.ps1`

**Tracker**: `$TMPDIR/claude-read-set-<session-id>` on Unix, `%TEMP%\claude-read-set-<session-id>` on Windows. One absolute path per line, deduplicated. Cleared naturally when the temp directory rotates between sessions.

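A deduplicating append to a one-path-per-line tracker can be sketched as below. The function name `track_read` and the demo file name are hypothetical; the shipped script may implement the same idea differently.

```bash
# Hypothetical sketch of the tracker's "track" mode.
# $1 = tracker file, $2 = absolute path that was just Read.
track_read() {
  tracker=$1
  file=$2
  # -F fixed string, -x whole-line match, -q quiet: append only when absent.
  grep -Fxq -- "$file" "$tracker" 2>/dev/null ||
    printf '%s\n' "$file" >>"$tracker" 2>/dev/null ||
    true   # best-effort: swallow write failures
}

tracker="${TMPDIR:-/tmp}/claude-read-set-demo-$$"
track_read "$tracker" /abs/path/to/file
track_read "$tracker" /abs/path/to/file   # duplicate, not appended again
cat "$tracker"
rm -f "$tracker"
```

Matching with `-F -x` compares whole lines literally, so a path that is merely a prefix of a tracked path does not count as already recorded.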

**Behavior**:
1. Read JSON from stdin (`tool_name`, `tool_input.file_path`, `session_id`).
2. **If `tool_name == "Read"`**: resolve `file_path` to absolute, append to tracker if absent. Emit no JSON. Best-effort: any failure is swallowed.
3. **If `tool_name in {"Edit","Write"}`**: resolve `file_path` to absolute, then:
   - Tracker file missing → `allow` (first-run safety for fresh sessions).
   - `Write` on a non-existent file → `allow` (new files cannot have been Read).
   - Tracker contains the path → `allow`.
   - Otherwise → `deny` with a message naming the exact file to Read first.
4. **Any other `tool_name`**: `allow` (prevents interference with other matchers).

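The guard's decision table can be sketched as a pure function that prints `allow` or `deny`. The name `guard_decision` and its argument order are hypothetical, and the sketch prints a bare word where the real hook emits decision JSON.

```bash
# Hypothetical sketch of the guard's decision table.
# $1 = tool name, $2 = absolute file path, $3 = tracker file.
# Prints "allow" or "deny"; always returns 0.
guard_decision() {
  tool=$1
  file=$2
  tracker=$3

  case $tool in
    Edit|Write) ;;
    *) echo allow; return 0 ;;          # other tools are not guarded
  esac

  if [ ! -f "$tracker" ]; then
    echo allow                          # fresh session: no Read has fired yet
  elif [ "$tool" = Write ] && [ ! -e "$file" ]; then
    echo allow                          # new file: nothing to Read
  elif grep -Fxq -- "$file" "$tracker"; then
    echo allow                          # file was Read in this session
  else
    echo deny
  fi
  return 0
}

guard_decision Bash /any/path /no/such/tracker
```

The checks run in the same order as the list above, so the tracker-missing case wins over everything else and a fresh session is never blocked.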

**Deny reason format**:
```
Cannot Edit '/abs/path/to/file' without reading it first in this session. Call Read on '/abs/path/to/file' and retry. (Session <id>, tracker <path>.)
```

**Decision control**: Always exits 0. Fail-open on empty stdin or missing `jq`: the hook never blocks the workflow when it can't do its job correctly.

**Configuration** (`global/settings.json`):
```json
{
  "PreToolUse": [
    {
      "matcher": "Edit|Write",
      "hooks": [
        { "type": "command", "command": "~/.claude/hooks/pre-edit-read-guard.sh", "timeout": 5 }
      ]
    }
  ],
  "PostToolUse": [
    {
      "matcher": "Read",
      "hooks": [
        { "type": "command", "command": "~/.claude/hooks/pre-edit-read-guard.sh", "timeout": 5, "async": true }
      ]
    }
  ]
}
```

**Limitations**:
- Session-scoped. Across restarts the tracker rotates with `$TMPDIR` and Claude must Read again.
- Path matching is case-sensitive, so on case-insensitive filesystems (the macOS default for HFS+/APFS) an Edit can occasionally be denied even though the same file was Read under different casing. Normalize via `realpath` where available.
- `Edit` on a file that the user manually modified outside the session still passes once it is in the tracker. The hook verifies Read, not freshness.

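Resolving paths to one canonical absolute form before they are written to or looked up in the tracker might look like the sketch below. The helper name `resolve_abs` is hypothetical. The sketch resolves symlinks and `..` segments; whether letter case is also canonicalized depends on the platform's `realpath`, and the fallback branch does not fold case at all.

```bash
# Hypothetical helper: resolve a path to an absolute, symlink-free form so the
# tracker and the guard compare the same string.
resolve_abs() {
  if command -v realpath >/dev/null 2>&1; then
    realpath -- "$1" 2>/dev/null && return 0
  fi
  # Fallback: canonicalize the parent directory, then re-attach the basename.
  dir=$(cd -- "$(dirname -- "$1")" 2>/dev/null && pwd -P) || return 1
  printf '%s/%s\n' "${dir%/}" "$(basename -- "$1")"
}

resolve_abs .
```

Using the same helper in both track mode and guard mode matters more than which normalization is chosen, since the guard only compares strings.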

**Opt-out**: Remove the two matcher blocks from `global/settings.json` and `global/settings.windows.json`, then re-run `scripts/sync.sh`.

**Test fixture**: `tests/hooks/test-pre-edit-read-guard.sh` exercises the deny/allow paths, first-run tracker-missing safety, Write on non-existent files, and Read-then-Edit unlocking.

### Hook Response Format

All PreToolUse hooks must output JSON to stdout and exit with code 0:

README.ko.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 # Claude Configuration Backup & Deployment System

 <p align="center">
-<a href="https://github.com/kcenon/claude-config/releases"><img src="https://img.shields.io/badge/version-1.9.0-blue.svg" alt="Version"></a>
+<a href="https://github.com/kcenon/claude-config/releases"><img src="https://img.shields.io/badge/version-1.10.0-blue.svg" alt="Version"></a>
 <a href="LICENSE"><img src="https://img.shields.io/badge/license-BSD--3--Clause-green.svg" alt="License"></a>
 <a href="https://github.com/kcenon/claude-config/actions/workflows/validate-skills.yml"><img src="https://github.com/kcenon/claude-config/actions/workflows/validate-skills.yml/badge.svg" alt="CI"></a>
 </p>

README.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 # Claude Configuration Backup & Deployment System

 <p align="center">
-<a href="https://github.com/kcenon/claude-config/releases"><img src="https://img.shields.io/badge/version-1.9.0-blue.svg" alt="Version"></a>
+<a href="https://github.com/kcenon/claude-config/releases"><img src="https://img.shields.io/badge/version-1.10.0-blue.svg" alt="Version"></a>
 <a href="LICENSE"><img src="https://img.shields.io/badge/license-BSD--3--Clause-green.svg" alt="License"></a>
 <a href="https://github.com/kcenon/claude-config/actions/workflows/validate-skills.yml"><img src="https://github.com/kcenon/claude-config/actions/workflows/validate-skills.yml/badge.svg" alt="CI"></a>
 </p>

VERSION_MAP.yml

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@
 # To bump a version: edit the field here, then run scripts/sync_versions.sh
 # (or /release <field> <new-version>) to propagate to consumers.

-suite: 1.9.0
+suite: 1.10.0
 plugin: 2.3.0
 plugin-lite: 1.1.0
 settings-schema: 1.12.0

docs/SANDBOX_TLS.md

Lines changed: 135 additions & 0 deletions
@@ -0,0 +1,135 @@
# Sandbox TLS: SSL_CERT_FILE Fix

Root-cause fix for the recurring TLS handshake failure that forced Claude Code
sessions to use `dangerouslyDisableSandbox: true` for `git`, `curl`, and most
HTTPS-based tooling. Tracked in issue #367.

## Symptom

Inside the Claude Code sandbox, Go-based TLS stacks and some system binaries
have failed with:

```
Post "https://api.github.com/graphql": tls: failed to verify certificate:
x509: OSStatus -26276
```

The workaround prior to this fix was to set `dangerouslyDisableSandbox: true`
on every affected Bash call, which in turn triggered an individual user
confirmation prompt per call.

## Root Cause

The Claude Code sandbox on macOS restricts access to the Keychain Services
API. Tools that read trust anchors from a local CA bundle file (LibreSSL,
OpenSSL, libcurl) succeed once the bundle path is discoverable via
`SSL_CERT_FILE`. Tools that mandate Keychain access return the Apple
`OSStatus -26276` error.

## Fix

Set two environment variables in `global/settings.json` so CA-bundle-aware
tools consult an on-disk file instead of the Keychain:

```json
{
  "env": {
    "SSL_CERT_FILE": "/etc/ssl/cert.pem",
    "SSL_CERT_DIR": "/etc/ssl/certs"
  }
}
```

After `scripts/sync.sh` propagates the change to `~/.claude/settings.json`,
new Claude Code sessions inherit the variables.

## Coverage Matrix

| Tool | TLS Stack | SSL_CERT_FILE Respected? | Works Inside Sandbox After Fix? |
|------|-----------|--------------------------|---------------------------------|
| `curl` | LibreSSL | Yes | Yes |
| `git` (HTTPS) | libcurl | Yes | Yes |
| `wget` | OpenSSL | Yes | Yes |
| `npm`, `pnpm`, `yarn` | Node OpenSSL | Yes | Yes |
| `pip` | urllib3 + OpenSSL | Yes | Yes |
| `go build`, `go mod` | Go crypto/tls | Yes (uses fallback roots) | Yes |
| `cargo` | rustls or OpenSSL | Yes | Yes |
| `gh` (GitHub CLI) | Go crypto/x509 on Darwin | **No** | **No** |
| Any Darwin-Go binary compiled against `Security.framework` | Go crypto/x509 on Darwin | **No** | **No** |

### gh Caveat

`gh` on macOS links against `crypto/x509/root_darwin.go`, which always calls
`Security.framework` for trust evaluation and ignores `SSL_CERT_FILE`. The
sandbox blocks Keychain access, so `gh` inherits the failure regardless of
env-var configuration. Neither `GODEBUG=x509roots=fallback` nor
`GODEBUG=x509usefallbackroots=1` changes this behavior on recent Go versions
because Darwin's `systemRoots` always succeeds-or-errors before the fallback
is consulted.

Two options for `gh` commands specifically:

1. **Bash allowlist (preferred for day-to-day use)**: add patterns to
   `global/settings.json` `permissions.allow` so the sandbox-bypass
   confirmation prompt does not re-appear for safe `gh` verbs:

   ```json
   "permissions": {
     "allow": [
       "Bash(gh issue *)",
       "Bash(gh pr view*)",
       "Bash(gh pr list*)",
       "Bash(gh pr checks*)",
       "Bash(gh run list*)",
       "Bash(gh repo view*)"
     ]
   }
   ```

2. **Rebuild gh without CGO**: `CGO_ENABLED=0 go install github.com/cli/cli/v2/cmd/gh@latest`
   produces a pure-Go binary that respects `SSL_CERT_FILE`. Not recommended
   for most users since it forfeits the Homebrew update pipeline.

## Platform Fallback Ladder

`scripts/verify-tls.sh` picks the first readable path when `SSL_CERT_FILE`
is not already set:

| Platform | Path | Notes |
|----------|------|-------|
| macOS (default) | `/etc/ssl/cert.pem` | Present on every supported macOS release |
| macOS (Homebrew) | `$(brew --prefix)/etc/openssl@3/cert.pem` | Fallback when the system path is unreadable |
| Debian / Ubuntu | `/etc/ssl/certs/ca-certificates.crt` | Installed by `ca-certificates` package |
| RHEL / Fedora | `/etc/pki/tls/certs/ca-bundle.crt` | Installed by `ca-certificates` package |
| Windows | N/A | `gh` on Windows uses Schannel and reads the Windows certificate store directly |

`SSL_CERT_DIR` points at a directory of hashed-link CA files. macOS keeps the
directory empty but some Linux distributions populate it; setting both is the
safest default.

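A "first readable path wins" ladder like the one in the table can be sketched as follows. This is not the contents of `scripts/verify-tls.sh`; the function name `pick_ca_bundle` is hypothetical, and the Homebrew entry is left out because it needs `brew --prefix` to be resolved first.

```bash
# Hypothetical sketch of the fallback ladder.
# Prints the chosen bundle path; returns 1 when no candidate is readable.
pick_ca_bundle() {
  # An explicit SSL_CERT_FILE always wins.
  if [ -n "${SSL_CERT_FILE:-}" ]; then
    printf '%s\n' "$SSL_CERT_FILE"
    return 0
  fi
  for candidate in "$@"; do
    if [ -r "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Candidates taken from the table above, in the same order.
pick_ca_bundle \
  /etc/ssl/cert.pem \
  /etc/ssl/certs/ca-certificates.crt \
  /etc/pki/tls/certs/ca-bundle.crt ||
  echo "no readable CA bundle found" >&2
```

Passing the candidates as arguments keeps the ladder itself in one place, so a platform-specific entry can be added without touching the selection logic.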

## Verification

Run the included verification probe inside a sandboxed session:

```bash
./scripts/verify-tls.sh
```

Expected output (macOS, inside sandbox, after the fix):

```
SSL_CERT_FILE=/etc/ssl/cert.pem
SSL_CERT_DIR=/etc/ssl/certs

[FAIL] gh api user
(expected — see "gh Caveat" above)
[OK] curl https://api.github.com
[OK] git ls-remote origin

2 / 4 probes passed. git and curl work without sandbox bypass.
For gh, use the Bash allowlist remediation documented above.
```

The script reports `gh` as a FAIL on macOS even after this fix. That is the
documented behavior, not a regression. `git` and `curl` passing is the goal
of this change.
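
A probe harness that produces `[OK]`/`[FAIL]` lines and a pass count could be structured like the sketch below. It is an illustration only: the names `run_probe` and `run_default_probes` are hypothetical, the `curl` flags are an assumption, and the real `scripts/verify-tls.sh` may be organized differently.

```bash
# Hypothetical probe harness.
passed=0
total=0

# run_probe <label> <command...>: run the command quietly and record the result.
run_probe() {
  label=$1
  shift
  total=$((total + 1))
  if "$@" >/dev/null 2>&1; then
    passed=$((passed + 1))
    printf '[OK] %s\n' "$label"
  else
    printf '[FAIL] %s\n' "$label"
  fi
}

# Network probes are wrapped in a function so that sourcing this sketch has
# no side effects; call run_default_probes explicitly to execute them.
run_default_probes() {
  run_probe 'curl https://api.github.com' curl -fsS --max-time 10 https://api.github.com
  run_probe 'git ls-remote origin' git ls-remote origin
  printf '%s / %s probes passed.\n' "$passed" "$total"
}
```

A failing probe only changes the counters and the printed line; the harness itself keeps going, which is what lets an expected `gh` failure coexist with passing `git` and `curl` probes.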
