Commit be81446
authored
feat(agents+installer): narrowing-role subagents (Phase 5) + spellbook-cco fork wiring (Phase 7) (#284)
* docs: regenerate stale skill and command pages
Output of scripts/generate_docs.py against the current source-of-truth
in skills/ and commands/. No source changes; just catches up the docs
mirror with drift accumulated since the last regen.
* feat(installer): add install_agents component for symlink discovery
New installer/components/agents.py mirrors create_skill_symlinks /
create_command_symlinks with two divergences: skip+warn on user-authored
target files (don't clobber), and verify_source narrowing on uninstall.
Tests cover install (idempotence, broken-symlink replace, user-file
preservation, dry-run, mixed states, non-md filter, source resolution)
and uninstall (spellbook-only narrowing, user-file preservation,
missing-dir, idempotence). 17 cases total.
Wiring into ClaudeCodeInstaller and the cleanup list lands in a
separate commit (Task A2).
* refactor(installer): align install_agents action labels with ui.py vocabulary
Code review and fact-check on commit 3d88d0f2 surfaced three findings:
- "replaced" action label was not recognized by installer/ui.py
(which knows installed/upgraded/created/skipped/failed/unchanged) and
fell through to a generic icon. Renamed to "upgraded".
- Comment on the broken-symlink uninstall branch over-broadened the
check to "raw target lies inside agents dir" when the code actually
checks the raw target's parent dir; corrected.
- Test name "test_non_md_files_in_source_ignored" only covered one of
three behaviors the body verified; renamed (or split) for clarity.
* feat(installer): wire install_agents into ClaudeCodeInstaller
Adds the agents step to ClaudeCodeInstaller.install() (after scripts,
before CLAUDE.md) and the symmetric removal to .uninstall(). Adds a
narrowed install-time pre-cleanup pass that purges stale agent symlinks
(renamed/removed source files, including those pointing into a prior
worktree's agents/) without clobbering currently-valid entries, so
re-install remains idempotent.
Integration tests verify: install creates resolvable symlinks at the
target, dry-run is side-effect-free, the install step's InstallResult
reports counts, uninstall preserves user-authored files, re-install is
idempotent, and stale symlinks from a prior worktree are purged.
* refactor(installer): tighten agents cleanup heuristic and aggregation message
Code review and fact-check on commit 65253279 surfaced four findings:
- The install-time cleanup pre-pass marked an entry stale when its
basename was not in the current spellbook source set, which would
unlink user-authored alias symlinks pointing to valid agent files
under non-canonical names. Switched to a resolved-target-path set
so aliases pointing to valid current sources are preserved.
- The broken-symlink fallback heuristic relied on a "spellbook"
substring match, which produced false positives for broken symlinks
pointing at a *different* spellbook installation. Tightened to
compare the broken symlink's parent dir against the current
installation's agents/ dir via exact resolved-path equality.
- The aggregation message dropped the failed count, leaving the
user with success=False and no breakdown when a partial failure
occurred. Now includes "{N} failed" when nonzero.
- Reworded the uninstall-side comment block to accurately describe
what cleanup_spellbook_symlinks does (it also removes any broken
symlink unconditionally, not just spellbook-substring matches).
* feat(agents): add canonical implementer.md and schema validation test
Adds the canonical narrowing-role agent (implementer.md) as the seed
for the remaining 8 new agents. tools: frontmatter narrows to
Edit, Write, Read, Grep, Glob, Bash — Bash is gated by the L4 PreToolUse
hook for tier-based authorization. Body uses the 5-section schema
(Purpose, Tools, Output Schema, Guardrails, Constraints).
Schema test parametrizes over the 9 expected new agents (validating
whichever currently exist) and the 7 existing agents (byte-identical
snapshot). The snapshot file at tests/test_security/agent_snapshots.json
guards against accidental modification of the existing 7 during WI-5
work. code-reviewer.md and justice-resolver.md are exempt from the
tools-presence check pending a separate cleanup task.
* refactor(agents): describe current bash gate, harden schema test
Fact-check on commit 508665c2 found that implementer.md described the
WI-6 L4 tier classifier as the bash gate, but that infrastructure
doesn't exist on main yet (lives on security-architecture-phase-3).
Rewrote the description, body, guardrails, and constraints to describe
the spellbook PreToolUse bash gate that exists today — pattern-based
denial of dangerous commands. WI-6 can later expand the language when
tier-based narrowing lands on main.
Code review additionally suggested:
- Body section ordering test now matches headings as full lines,
immunizing against future agent files that include literal '## X'
inside fenced code blocks.
- Replaced opaque "F3" reference in TOOLS_EXEMPT_EXISTING comment with
inline rationale.
- Clarified the test_tools_exempt docstring to enumerate the three
steps required when removing the exemption (regenerate snapshot,
delete entry, delete test).
* feat(agents): add 8 narrowing-role subagents
Author the 8 remaining narrowing-role subagent files in agents/, each
conforming to the canonical 5-section schema established by
agents/implementer.md (Task B). Each file's `tools:` frontmatter is a
narrowing list — the agent has access to those tools and only those
tools, never more — and matches the canonical mapping in
tests/test_security/test_agent_frontmatter.py.
- web-researcher: WebFetch, WebSearch, Read. Quarantined web research
returning structured findings. Flagged in Constraints as requiring
WI-8 (devcontainer) before being safe to dispatch in production.
- git-committer: Bash, Read. Local git only — read, status, diff, log,
add, commit, branch, fetch, worktree. Never pushes.
- git-pusher: Bash, Read. Only `git push`. Operator confirmation
required for every push.
- pr-creator: Bash, Read. Only `gh pr create`, `gh pr edit`,
`gh pr view`, `gh pr diff`, `gh pr list`. No merge or ready-mark.
- pr-merger: Bash, Read. Only `gh pr merge`, `gh pr ready`. Operator
confirmation required for every merge and ready-mark.
- jira-reader: Read (frontmatter); Atlassian MCP read tools at
runtime. Read-only Jira inspection, no mutations.
- jira-mutator: Read (frontmatter); Atlassian MCP read+write tools at
runtime. Operator confirmation required for state transitions.
- test-runner: Bash, Read, Grep. Test commands only (pytest, npm test,
etc.); no source edits, no git side effects.
The schema test parametrizes over `_existing_new_agents()` so all 9
narrowing-role agents (the canonical implementer plus these 8) now
validate against the canonical contract: 55 passed, 1 skipped (regen
mode). Full security suite remains green: 389 passed.
* refactor(agents): align bash-gate framing with current gate scope
Two groups of fixes from code review + fact-check on the 8 narrowing-role
agent files added in the prior commit.
Group A — gate-attribution accuracy. Five files (git-committer, git-pusher,
pr-creator, pr-merger, test-runner) over-claimed that the spellbook
PreToolUse bash gate blocks specific git/gh subcommand patterns
(e.g. `git push`, `git reset --hard`, `gh pr merge`, `gh pr ready`,
`--force` to protected branches, `gh pr merge --admin`, "any git side
effects"). The actual gate blocks generic escalation/destructive shell
patterns, not per-agent subcommand allow-lists. Softened the gate-claim
language to match the canonical seed in agents/implementer.md
("destructive shell idioms, exfiltration shapes") and reframed the
guardrails so operator confirmation and agent role separation are
the primary enforcement, with the bash gate as defense-in-depth. The
role-contract MUST NOT bullets stay — only the false gate attribution
is removed.
Group B — reviewer wording fixes:
* jira-mutator description: "expected" → "REQUIRED" for state
transitions, aligning with sibling git-pusher / pr-merger phrasing
and the existing body Guardrail.
* git-pusher Output Schema: renamed `remote` → `remote_refspec` and
rewrote the description to clarify the `<remote>/<ref>` format with
examples that include nested refs. Updated the `required` array.
* git-pusher upstream-verify guardrail: rewrote to handle the
first-push case (no upstream yet) without ambiguity.
* test-runner Tools: dropped the `cat` example; `Read` handles file
content and `ls`/`find` cover read-only Bash verbs. Added a
sentence explaining why `Glob` is omitted (broader than the
test-runner's scoping discipline; `find` invocations inherit the
bash-gate's scoping constraints).
* jira-reader / jira-mutator descriptions: appended the
narrowing-list + MCP runtime-discovery semantics so the
declarable `tools: Read` is no longer surprising.
Verifications:
- tests/test_security/test_agent_frontmatter.py: 55 passed, 1 skipped
- tests/test_security/ full suite: 389 passed, 1 skipped
- tests/installer/ full suite: 61 passed
- Forbidden term scan (L4 / tier classifier / tier-based / capability
escalation / WI-4): zero matches.
- Forbidden claim scan (gate also blocks / gate blocks gh pr / gate
blocks git push|reset|checkout|stash): zero matches.
agents/implementer.md is the canonical seed and was not modified.
* test(hooks): isolate Stop-hook tests from real cache and worker-LLM gate
`TestStopHookHarvest::test_stop_hook_harvests_single_candidate` (and
sibling tests in the same class) was order-dependent in two ways:
1. `STOP_HARVEST_CACHE_PATH` resolves at import time to
`~/.local/spellbook/cache/last-stop-harvest.json`. The class's
methods that exercise the success path call `_record_stop_harvest`
on the real on-disk cache. Subsequent runs whose `tmp_path/
session.jsonl` collided with a cached entry got short-circuited at
the idempotency check, leaving `mock_http.calls == []` and tripping
the assertion. Reproduced deterministically by running with
`--basetemp=<previously-cached-pytest-N-dir>`.
2. `_handle_stop` calls `feature_enabled("transcript_harvest")` from
`spellbook.worker_llm.config`, which reads the user's real
`spellbook.json`. If any test (or env override) flipped that flag
on, the hook took the worker-LLM branch and never POSTed to
`/api/memory/unconsolidated`.
Add an `autouse` fixture on `TestStopHookHarvest` that:
- rebinds `STOP_HARVEST_CACHE_PATH` to a per-test path under
`tmp_path` (no leakage to or from `~/.local`);
- forces `feature_enabled` to return False so the regex path is
taken regardless of user config or leaked monkeypatches.
Tests that already monkeypatch `STOP_HARVEST_CACHE_PATH` themselves
(idempotent / records-sha / retries-on-failure / network-failure)
still override the autouse stub via `monkeypatch.setattr` ordering;
no change to their behavior.
No production code changes. Full suite: 4564 passed, 132 skipped.
* test(agents): close green-mirage gaps in WI-5 schema and uninstall coverage
Address findings F1-F4 from the green-mirage audit on the WI-5 surface.
F1: Add `test_all_expected_new_agents_exist` — set-equality assertion that
every key in EXPECTED_NEW_AGENTS has a matching file in agents/. Without
this, `_existing_new_agents()` would silently shrink its parametrize input
if an expected agent file were deleted, and the per-agent schema tests
would still pass on the smaller set.
F4: Remove `test_canonical_implementer_exists`. F1's set-equality check
strictly subsumes it (implementer.md is in EXPECTED_NEW_AGENTS).
F2: Strip fenced code blocks before scanning for required headings in
`test_new_agent_has_required_body_sections_in_order`. The prior
"\n{heading}\n" substring scan would still false-match a heading literal-
embedded in a fenced block; the docstring overclaimed immunity. New
helper `_strip_fenced_blocks` (DOTALL ``` ... ``` removal) plus updated
comment make the assertion match its stated intent.
F3: Add two tests in `TestUninstallAgents` covering the previously
unreached broken-symlink branch in `uninstall_agents`:
- `test_uninstall_removes_broken_symlink_into_spellbook`: positive case,
parent-dir resolves into $SPELLBOOK/agents/ -> link is removed.
- `test_uninstall_preserves_broken_symlink_pointing_elsewhere`: negative
case, parent-dir is OUTSIDE the spellbook -> link is preserved and
its readlink target is unchanged.
Verification:
- tests/test_security/test_agent_frontmatter.py + tests/installer/test_agents_symlink.py: 82 passed, 1 skipped (snapshot regen).
- Full suite: 4566 passed, 132 skipped, 531 deselected.
* docs(changelog): add Phase 5 narrowing-role subagents entry
Document the user-facing surface introduced by WI-5: nine narrowing-role
agents in agents/ and the per-config-dir symlink installer that lands
them under $CLAUDE_CONFIG_DIR/agents/. Covers the narrowing-vs-granting
semantics so users understand the tools: frontmatter contract.
* docs(claude): record PR review bot info
* test(installer): register posix_only/windows_only pytest marks
Add two new pytest marks for platform-conditional skipping:
- posix_only: auto-skipped on Windows (sys.platform.startswith('win'))
- windows_only: auto-skipped on POSIX
Wire them into tests/conftest.py's pytest_collection_modifyitems hook
alongside the existing requires_memory_tools and docker handling.
Foundational scaffolding for WI-7 (security architecture phase 7);
subsequent platform-dispatched installer tests rely on these marks.
* test(installer): polish posix_only/windows_only mark scaffolding
Address review nits: prose marker descriptions, module docstring
explaining placement under tests/installer/, clearer comment on
_FakeConfig pytest plugin shape, and simplified _patch_platform
fixture docstring.
* feat(sandbox): pin cco at SHA 9744b9f for L5 Linux sandbox
Pin nikvdp/cco at commit 9744b9f (2026-04-30) inside scripts/spellbook-sandbox
per Sec 9.3 audit (2026-05-06). Linux bwrap backend audited PASS:
kernel-managed /proc in unshared PID namespace, --cap-drop ALL,
--die-with-parent, seccomp TIOCSTI/TIOCLINUX filter.
macOS L5 documented as intentionally absent; macOS sessions rely on L4
PreToolUse hooks (shipped) and L6 devcontainer (WI-8 planned) instead of
cco-based isolation, since cco's sandbox-exec profile is write-protect-only
((allow default) baseline, no process-fork/exec deny, no DYLD scrubbing).
Pinning mechanism: a CCO_PINNED_SHA constant near the top of the script
plus a runtime gate that compares the short SHA reported by 'cco --version'
(cco resolves --version from 'git rev-parse --short HEAD' of its install
dir at ~/.local/share/cco). The gate aborts on mismatch unless the
operator sets SPELLBOOK_SANDBOX_SKIP_CCO_PIN=1 (audited downgrade).
Tests in tests/installer/test_aliases.py (posix_only):
- pin SHA parsed from a structured header line == 9744b9f
- Sec 9.3 audit doc filename present
- macOS L5 rationale anchor phrase present in script or sidecar
- executable bit preserved
- --help smoke test (cco-not-found stderr is exact-matched)
Part of WI-7 security architecture phase 7.
* test(sandbox): tighten cco SHA pin tests + improve error guidance
Address review findings from Task 2 of WI-7:
- H1: integration test now branches on three exit conditions
(success, cco-absent, SHA-mismatch); robust on developer machines
with cco installed at a non-pinned SHA
- M1: new posix_only test simulates fake cco shim with bad/empty/
mismatched --version output, asserts fail-closed exit
- M2: SHA-mismatch error message now includes cco install URL
- L1: documented legitimate use case for SPELLBOOK_SANDBOX_SKIP_CCO_PIN
- L2: moved posix_only mark from module-level to per-test, allowing
static-content checks to run on all platforms
* docs(sandbox): correct WI attribution for L4 components
bashlex AST is WI-6a's deliverable; secret-path denylist is WI-6d.
Parenthetical now lists both, matching impl plan attribution.
* feat(installer): add install_aliases_windows() stub for Q-O deferral
Add a documented stub for Windows alias install that returns a noop
dict matching install_aliases()'s return shape. Windows alias install
and sandbox path are deferred to a later WI (Q-O); cco lacks a
Windows backend and spellbook-sandbox is POSIX-only.
The stub satisfies the API contract for the upcoming claude_code.py
platform dispatch (Task 4 of WI-7) without committing to Windows
behavior that hasn't been audited.
Tests at tests/installer/test_aliases_windows.py marked windows_only;
skipped on POSIX dev machines; exercised when a Windows CI is added.
* fix(installer): polish Windows alias stub per review
Address Phase 4.5 review findings on Task 3/5:
- M1: tests now monkeypatch Path.home and snapshot it too, catching
any regression that writes to the user's actual rc files
- M2: skipped_reason switched to human-readable prose, matching the
existing skip-path string style
- L1 (pre-existing): install_aliases() docstring now correctly types
rc_path as str | None (matches actual return on skip paths)
- L2: stub docstring now sketches the production implementation
scaffolding (PowerShell $PROFILE, marker reuse, return shape)
- N1: stub log message surfaces dry_run for operator visibility
* feat(installer): platform-aware Claude Code alias dispatch
Add a dispatch helper in installer/platforms/claude_code.py that routes
alias install based on get_platform():
- LINUX -> install_aliases() (wraps `claude` -> spellbook-sandbox)
- MACOS -> documented noop; cco sandbox-exec is insufficient per Sec 9.3
- WINDOWS -> install_aliases_windows() (Q-O stub from Task 3)
Unknown platforms raise NotImplementedError to surface future enum
additions explicitly. The existing install.py:1148 interactive offer
remains as a separate opt-in path; this is the platform-install-time
path that runs unconditionally during Claude Code platform install.
* fix(installer): harden Claude Code alias dispatch per review
Address review findings on commit 24ba6861 (Phase 4.5 review of WI-7
Task 4/5):
- F1: dispatch call in ClaudeCodeInstaller.install() is now wrapped in
try/except, so a bad rc-file does not abort the security-critical
hooks install step that follows
- F2: LINUX branch now checks shutil.which("cco") and noops with a
clear skipped_reason if cco is missing, matching the legacy
install.py:1126 gating pattern
- F3: new behavioral test verifies the dispatch is actually called
from install() and produces an "aliases" InstallResult in the
returned list
- F4: new test pins the F1 contract -- _install_claude_code_aliases
raising does NOT abort install(); subsequent components still run
(CLAUDE.md, hooks)
- F5/F6: comments clarify the success=True+action=skipped convention
and document the unknown-platform test's scope (control-flow vs
contract assertion)
The existing LINUX dispatch test now stubs shutil.which so the cco
gate is deterministic on developer machines. A new
test_dispatch_linux_skips_install_aliases_when_cco_missing covers the
F2 negative path with caplog assertion on the operator-facing message.
* docs(readme): document Windows alias + sandbox deferral (WI-7 Q-O)
Add a Windows subsection under Sandboxed Usage that records the
Phase 7 deferral of Windows alias install and sandbox path (open
question Q-O). Documents the current installer noop behavior, the
cmd.exe doskey limitation, and points Windows users at WSL2 + the
Linux install path for full sandboxing.
* fix(docs): tighten Windows TBD note per review
- Direct WSL2 readers to install cco inside the WSL distro before running the installer (L1)
- Tighten the direct-invoke bullet now that the WSL2 bullet covers cco placement (N1)
- Replace "documented noop" with "intentional noop" for clarity (N2)
* test: mark pre-existing test_aliases.py POSIX-only
These tests assume POSIX rc-file paths and "\n" line endings; mark them
POSIX-only so Windows runners skip them via the conftest auto-skip hook.
* chore: add ruff to dev deps + minimal config
Adds `ruff>=0.6.0` to the dev dependency group and a minimal
`[tool.ruff]` configuration block (line-length=100, target-version=py311,
default lint rules, `__init__.py` F401 carve-out, and `versions/` Alembic
revisions excluded).
Also fixes the `bigfoot[http]` / `python-tripwire[http]` extras: those
extras were dropped in python-tripwire 0.21+, leaving the test suite
unable to import `tripwire.plugins.http` (the `bigfoot.http.mock_response`
shim used in tests/test_worker_llm/conftest.py). Drops the `[http]`
markers and adds explicit `requests>=2.31` since the http plugin needs
it. Without this, `uv run pytest` fails on first worker_llm test.
target-version is py311 (not py310 to match `requires-python`) because
the codebase uses 3.11 syntax (`except*` in spellbook/admin/routes/ws.py).
That mismatch is pre-existing and out of scope for this commit; ruff's
target-version follows the actual code.
* style: ruff auto-fix + manual cleanup (pre-existing violations)
Resolves all 506 ruff lint violations on the branch so that
`uv run ruff check .` exits 0.
Approach (per plan):
1. `ruff check --fix` (safe-fix only) — 394 violations auto-fixed
(mostly F401 unused imports, F541 empty f-strings).
2. Manual fixes for the 111 remaining (F841 with unsafe-fix, plus E402,
E731, E702, E741, E712, F821).
Notable manual carve-outs:
- installer/compat.py: restored re-exported names that ruff removed as
F401 (`UnsupportedPlatformError`, `ServiceManager`, `get_data_dir`).
These are part of the documented re-export contract; added explicit
`# noqa: F401` markers.
- tests/test_spellbook_mcp/test_server_no_skill_tools.py: restored
`from spellbook import skill_ops` inside `with pytest.raises(ImportError):`
— the import IS the test.
- spellbook/sessions/watcher.py: restored `import asyncio` (F821 — the
auto-fix removed it but lines 275/278 use `asyncio.get_running_loop()`
/ `asyncio.run()`).
- E402 sites with `logger = logging.getLogger(__name__)` between imports
carry a `# noqa: E402` with rationale.
Test status:
- `uv run ruff check .` → All checks passed!
- `uv run pytest -x` → 4508 passed, 135 skipped, 531 deselected.
* fix(installer): broaden F1 test assertion + fix audit doc path reference
- F1 contract test now also asserts `mcp` component runs after alias dispatch raises (per Phase 4.6.4 fact-check finding on test_install_does_not_abort_when_dispatch_raises)
- Drop non-existent `docs/security/audits/sec_9_3_result.md` path from _install_claude_code_aliases docstring; the script header carries the canonical path
* test(installer): update test_aliases.py for spellbook-cco fork (Task 7)
Flips the audit-pinned SHA constant to the elijahr/cco fork tip
(d7044ef) and adds EXPECTED_VANILLA_CCO_SHA (9744b9f) as the
fallback constant for SPELLBOOK_USE_VANILLA_CCO=1 rollback coverage.
Updates MACOS_RATIONALE_PHRASE to match the revised header phrasing
landed by Task 6 in scripts/spellbook-sandbox.
Renames the fail-closed-gate shim from cco to spellbook-cco so
the three SHA-mismatch parametrize cases drive the gating binary
the script now actually checks. Updates --help-runs-cleanly to look
for the new error shape.
Adds a new truth-table-row test
(test_sandbox_both_skip_vars_set_to_one_warn_and_skip) covering the
SKIP_PIN=1 + SKIP_CCO_PIN=1 case, which the existing class did
not exercise: new var wins skip precedence (gate skipped), legacy
var still triggers DEPRECATION because operator is using the
deprecated name.
Consolidates the Task-6-local EXPECTED_FORK_SHA / EXPECTED_VANILLA_SHA
constants to alias the module-level constants so the SHA values live
in exactly one place.
* feat(installer): add spellbook_cco component for fork-pinned cco wrapper
Introduce installer/components/spellbook_cco.py: clones the elijahr/cco fork at audit-pinned SHA d7044ef into ~/.local/spellbook/cco, writes ~/.local/bin/spellbook-cco with a managed-tag header, and verifies the pin via two-step check (git rev-parse + cco --version awk-parse).
The wrapper exposes the hardened fork's macOS SBPL profile (DYLD scrub + file-read denies + scoped process-exec deny + mach-priv-task-port deny) to the rest of spellbook. emit_rollback_warning() is exported as the single source of truth for the SPELLBOOK_USE_VANILLA_CCO=1 rollback warning. SPELLBOOK_INSTALLER_SKIP_FORK_PIN=1 bypasses pin verification with a stderr WARNING.
* test(installer): add tests for spellbook_cco component
Adds 22 tests covering Tier-0 constants, Tier-1 mocked-shape paths, and Tier-2 paths exercised against a file:// bare-repo fixture whose stub emits the fixture's actual head_sha. Tests cover dry-run shape, real clone-and-verify, rollback on each pin-verify step in isolation, idempotency, namespaced wrapper tag, untagged-wrapper warning, dirty/wrong-remote aborts, PATH warnings, SKIP_FORK_PIN bypass, vanilla-rollback env var, Windows shape-noop, uninstall happy path, untagged-preserve, idempotency, dry-run, and a darwin smoke.
* feat(installer): wire spellbook_cco install/uninstall into Claude Code platform
Adds a once-globally install_spellbook_cco hook to ClaudeCodeInstaller.install() that runs before the per-config-dir loop, and a mirroring uninstall_spellbook_cco call in uninstall(). Merges the Linux and macOS branches in _install_claude_code_aliases into a shared POSIX path that gates on shutil.which(sandbox_binary), where sandbox_binary defaults to 'spellbook-cco' (or 'cco' under SPELLBOOK_USE_VANILLA_CCO=1). Updates the L5 macOS rationale comment block to cite the elijahr/cco fork's hardened SBPL profile, and rewrites all 'cco not on PATH' diagnostics to point at re-running install.py. The rollback WARNING is emitted via the imported emit_rollback_warning() so installer-side warnings stay byte-equal across call sites.
* refactor(installer): tui detects spellbook-cco binary on PATH
Replaces shutil.which('cco') in the TUI install offer with shutil.which(sandbox_binary), where sandbox_binary defaults to 'spellbook-cco' (or 'cco' under SPELLBOOK_USE_VANILLA_CCO=1). Updates the rationale string to reference the audit-pinned elijahr/cco fork and removes the inline nikvdp/cco install URL in favor of pointing operators at install.py. Emits the canonical rollback WARNING via emit_rollback_warning() under the env-var override.
* refactor(installer): install.py offers spellbook-cco aliases
Replaces shutil.which('cco') in the post-install alias offer with shutil.which(sandbox_binary), and extracts the offer block into _offer_sandbox_aliases() so the TUI and CLI flows share a single implementation. Under SPELLBOOK_USE_VANILLA_CCO=1 the offer falls back to gating on 'cco' and emits the canonical rollback WARNING via emit_rollback_warning().
Adds tests/installer/test_cco_entry_points.py with 7 tests covering both tui.py and install.py wiring: default-path detection, env-override routing, WARNING emission on env override, and WARNING-absence regression guards on the default path.
* refactor(scripts): spellbook-sandbox gates on spellbook-cco fork pin
Rewrites scripts/spellbook-sandbox to gate on the elijahr/cco fork's spellbook-cco wrapper and verify the d7044ef pin via 'spellbook-cco --version'. SPELLBOOK_USE_VANILLA_CCO=1 falls back to vanilla 'cco' at the prior 9744b9f pin and emits a script-prefixed WARNING that the SBPL profile is bypassed.
Adds the SPELLBOOK_SANDBOX_SKIP_PIN env var; SPELLBOOK_SANDBOX_SKIP_CCO_PIN remains accepted for one release with a DEPRECATION warning per a documented truth table. The truth-table predicates use POSIX ${VAR+x} to distinguish unset-vs-empty so set-but-empty does not fall through to the legacy var.
* docs(readme): describe spellbook-cco onboarding instead of vanilla cco
install.py now writes ~/.local/bin/spellbook-cco automatically (audit-pinned elijahr/cco fork at d7044ef); the wrapper is the only spellbook-supported entry point. Vanilla nikvdp/cco is retained solely for the post-deploy rollback path via SPELLBOOK_USE_VANILLA_CCO=1. References docs/security.md for the full threat model and hardening details.
* test(installer): assert spellbook-cco which() interaction in alias-dispatch tests
WI-7's `_install_claude_code_aliases` calls `shutil.which("spellbook-cco")`
to gate the per-dir alias install on the wrapper's presence. Tripwire's
SubprocessPlugin always intercepts `shutil.which`; without an explicit
`assert_which` after the sandbox closes, the strict verifier raises
`UnassertedInteractionsError` at teardown when the binary is absent from
PATH.
This is the case in CI (ubuntu-latest and the non-dev macOS runner shape)
but NOT on the macOS dev environment, where `spellbook-cco` IS on PATH
during local runs -- masking the failure during pre-merge verification.
The dispatcher only runs on POSIX (LINUX/MACOS); Windows takes the
`install_aliases_windows` branch which does not call `shutil.which` for
the cco wrapper. Hence the new helper / inline assertion is gated on
`sys.platform != "win32"`, mirroring the existing
`_assert_powershell_which_if_windows` pattern.
Tests fixed (all in alias-dispatch sandboxes):
- tests/installer/test_claude_code_wires_default_mode_and_permissions.py:
- test_install_emits_default_mode_result
- test_install_emits_permissions_result
- test_install_writes_acceptedits_default_mode
- test_uninstall_emits_default_mode_and_permissions_results
- test_uninstall_clears_managed_default_mode_from_settings
- tests/installer/test_l2_derivation.py:
- test_claude_code_installer_uses_derived_deny
The assertion is exact-equality on both `name` and `returns`: any change
to the dispatcher's gating call (different binary name, dropped call,
duplicate call, real-path return) would fail the assertion.
* fix(tests): pin LF line endings for agent snapshot test
The WI-5 byte-identity snapshot test
(tests/test_security/test_agent_frontmatter.py::
test_existing_agent_byte_identical_to_snapshot) computes SHA-256 over
the raw bytes of each existing agent file and compares against a
committed snapshot generated on macOS (LF line endings).
On Windows CI, git's default checkout converts LF to CRLF, so
Path(...).read_bytes() returns CRLF-encoded bytes whose SHA does not
match the committed LF-based snapshot. All 7 existing-agent snapshot
parameter cases fail on Windows for this reason. Verified locally:
the same file yields different SHAs under LF vs CRLF.
Pin LF on the agent markdown files and on the snapshot file itself via
.gitattributes. This makes byte-identity platform-agnostic by
construction without changing the snapshot's semantics: "byte-identical
to the committed snapshot" continues to mean exactly that, regardless
of host OS or git autocrlf configuration.
Cannot directly verify Windows behavior from macOS; the fix is
mechanical (git's documented gitattributes contract) and the local
test suite continues to pass (55 passed, 1 skipped).
* docs(security): rename cco section to spellbook-cco and document rollback
Rename '## Sandboxing with cco (macOS)' to '## Sandboxing with spellbook-cco (Linux + macOS)' so the auto-anchor matches the README link target ('#sandboxing-with-spellbook-cco-linux--macos'). Replace 'cco' with 'spellbook-cco' across the section body and CLI examples where the reference is to spellbook's own invocation; preserve 'cco' in the upstream nikvdp/cco attribution and in vanilla-fallback prose. Replace the now-stale 'curl | bash' install snippet with a one-liner noting that the spellbook installer handles spellbook-cco.
Add a new '### Rolling back to vanilla cco' subsection covering the SPELLBOOK_USE_VANILLA_CCO=1 env override: when to use it (catastrophic post-deploy breakage), the SBPL/DYLD hardening that the override bypasses, and how to remove the override afterward. The runtime WARNING wording remains owned by installer/components/spellbook_cco.py:_WARNING_USE_VANILLA_CCO; this doc paraphrases the *why* rather than duplicating the warning bytes.
* docs(changelog): add spellbook-cco fork integration entry
Document the user-facing surface introduced by Task 13 of the spellbook-cco integration: new installer/components/spellbook_cco.py component, hardened SBPL profile (DYLD scrub + dylib-path denies + scoped process-exec + mach-priv-task-port deny), pin-bypass and rename of SPELLBOOK_SANDBOX_SKIP_PIN env var, macOS feature parity via install_spellbook_cco chained before install_aliases, and SPELLBOOK_USE_VANILLA_CCO=1 rollback path. Placed under the [0.57.0] banner adjacent to the prior Phase 5 narrowing-role subagents entry to keep all security architecture Phase 5 work grouped on this branch.
* test(installer): mark spellbook_cco install tests posix_only
The install_spellbook_cco() function short-circuits to
{"action": "skipped", "skipped_reason": "spellbook-cco unavailable on
Windows"} when os.name == "nt". Eleven tests in
tests/installer/test_spellbook_cco.py drive the install path expecting
the POSIX result shape (action="installed", a clone on disk, a wrapper
file with shebang, etc.) and so fail on Windows CI under the
short-circuit.
Mark those eleven tests with @pytest.mark.posix_only so the existing
conftest.py skip rule excludes them on Windows. The Windows shape is
already covered indirectly by the constant tests at the top of the
file (which import the module and read its public attributes), and
the early-return is a one-line constant that does not need its own
test.
Tests deliberately left unmarked: the Tier-0 constant assertions
(test_pinned_sha_constant et al.), the dry-run short-circuits
(test_install_dry_run_does_not_clone, test_uninstall_dry_run_does_not_remove)
which return before the Windows check, and all uninstall tests since
uninstall_spellbook_cco has no Windows short-circuit.
* test(installer): tighten green-mirage assertions on canonical strings
Phase 4.6.3 green-mirage audit surfaced 6 IMPORTANT findings on the
spellbook-cco fork integration test suite — all substring-on-known-
canonical patterns where production code is correct but the assertions
were too lax to defend against silent prose drift.
- Import _WARNING_USE_VANILLA_CCO / _WARNING_PATH_NOT_SET /
_WARNING_SKIP_FORK_PIN as the chokepoint and assert full stderr
equality (or k * canonical for double-emit codepaths) instead of
fragmented "WARNING:" + "VAR=1" substring pairs.
- Replace partial InstallResult field-by-field assertions in chain-
wiring tests with full-record equality on the dataclass.
- Add a shared events list to test_install_chains_install_spellbook_cco
to verify that install_spellbook_cco runs strictly before the per-dir
alias dispatcher, matching the docstring contract.
- Tighten lax message substrings ("cco" / "spellbook-cco") to canonical
skipped_reason strings produced by _install_claude_code_aliases.
No production code changes. 63 tests pass, ruff clean.
* fix: remove dead expressions left by ruff F841 cleanup
Three call-statements whose return values were dropped during the
branch-wide `ruff --fix` pass (commit 983fe62b) but whose expressions
remained as orphan statements with no side effect:
- spellbook/admin/routes/fractal.py:73 — `max(1, math.ceil(total / per_page))`
was originally `total_pages = ...`. The variable is unused (pagination
is computed inside build_list_response), so the line is pure dead. Also
drop the now-unused `import math`.
- spellbook/cli/commands/session.py:119 — `getattr(args, "json", False)`
was originally `json_mode = ...`. The variable is unused.
- spellbook/mcp/routes.py:208 — `bool(body.get("is_primary", False))`
was originally a temp assignment. The value is unused.
Flagged by gemini-code-assist review on PR 284. Not the asyncio.to_thread
call in routes/config.py:688 — that one is the actual write side effect,
not dead.
* test(installer): convert monkeypatch.setattr to tripwire in alias + cco entry-point tests
Address gemini-code-assist HIGH findings on PR #284: replace
monkeypatch.setattr / monkeypatch.setitem call sites in three
tests/installer files with the canonical tripwire patterns:
- tripwire.mock module-attribute mocks (get_platform, install_aliases,
install_aliases_windows, install_spellbook_cco, uninstall_spellbook_cco,
_install_claude_code_aliases, install_aliases module attrs).
- tripwire.subprocess.mock_which / assert_which for shutil.which gates.
- tripwire.mock for pathlib Path.home patching.
Tests covered:
- tests/installer/test_aliases.py (50 monkeypatch sites, 32 tests).
- tests/installer/test_aliases_windows.py (2 sites, Path.home mocking).
- tests/installer/test_cco_entry_points.py (9 sites).
All tests pass locally. monkeypatch.setenv and monkeypatch.delenv are
preserved per the AGENTS.md allowed-monkeypatch list (env vars only).
Per AGENTS.md Testing with Tripwire canonical section: tripwire is the
only acceptable mocking framework; monkeypatch.setattr on module
attributes, class methods, or functions is forbidden.
* fix(tests): missing WebSocketDisconnect import + align marker-description assertions
CI failures from prior commit f5785298:
- tests/admin/test_ws.py used WebSocketDisconnect without importing it
- tests/installer/test_marks.py asserted old marker descriptions; pyproject.toml
carries clearer text via a prior commit on main
* test(diagram): convert test_diagram_update.py from unittest.mock to tripwire
Addresses gemini HIGH finding at tests/unit/test_diagram_update.py:270 plus
the rest of the file's unittest.mock.patch and mock.AsyncMock sites that
gemini's single anchor finding stood in for.
SUT refactor:
- scripts/generate_diagrams.py: extract _get_repo_root() callable wrapping
the bare REPO_ROOT module constant. Tripwire cannot intercept module-
level non-callable attribute reads, so the indirection layer is what makes
get_source_diff and the PATCH metadata path testable through the strict
verifier.
- main_async() now accepts an optional argv parameter so tests can drive
argparse without monkeypatching sys.argv (also forbidden under the
tripwire-only mocking policy).
Test rewrite:
- subprocess.run mocked via tripwire.subprocess.mock_run / assert_run
(requires @pytest.mark.allow("subprocess")).
- Module-level async/sync functions mocked via tripwire.mock("module:attr")
with .returns()/.calls(coro).
- builtins.input mocked via tripwire.mock("builtins:input").calls(...).
- Asserting a function is NOT called is done by leaving it unmocked;
tripwire's strict verifier raises UnmockedInteractionError if invoked.
This is stricter than the old .assert_not_called() pattern.
- Multi-mock assertion blocks wrapped in tripwire.in_any_order() since
the discover/classify/generate fan-out has non-deterministic ordering.
* test(memory): convert test_memory_auto_store.py from monkeypatch to tripwire
Addresses gemini HIGH findings at tests/test_hooks/test_memory_auto_store.py:338
(STOP_HARVEST_CACHE_PATH constant) and :347 (function monkeypatches), plus all
other monkeypatch.setattr sites in the file.
SUT refactor (hooks/spellbook_hook.py):
- Extract _get_stop_harvest_cache_path() callable wrapping the bare
STOP_HARVEST_CACHE_PATH module constant. Tripwire cannot intercept
module-level non-callable attribute reads, so the indirection layer
makes the Stop-hook idempotency cache testable through the strict
verifier.
- Convert datetime.now(timezone.utc) in _emit_hook_error to use the
existing _utcnow() helper. Test for oversized prompt can now freeze
time via tripwire.mock('spellbook_hook:_utcnow') instead of replacing
the module-level datetime class.
Test rewrite:
- _CountingMock + _make_counting helper: wraps tripwire.mock(...) with
a fire counter and pre-stacks a FIFO budget of repeatable .calls(fn)
configs (50 entries with required(False)). Required because the SUT
invokes the same mock multiple times per test, and tripwire's FIFO
semantics consume one entry per call.
- mock_http controller exposes .track(counting_mock) and .drain().
drain() satisfies tripwire's strict 'every interaction must be
asserted' contract by issuing one wildcard assert_call per recorded
call, wrapped in in_any_order(). Content assertions on
controller.calls remain in each test — strength is unchanged.
- Autouse stop_hook_cache_path fixture (renamed from
_isolate_stop_hook_state) registers both the cache-path and
feature_enabled mocks via _make_counting, auto-tracks them on
mock_http if present, and returns the per-test cache Path so tests
that need to inspect the cache file have a stable handle.
- Tests that needed monkeypatch.setattr(Path, 'home', ...) now use
the stop_hook_cache_path fixture directly. Tests that previously
monkeypatched _http_post to a flaky callable now use
mock_http.set_handler() — a per-call hook on the controller that
composes cleanly with the existing tripwire mock without re-
registering the import-site mock mid-test.
* test(cco): convert test_spellbook_cco.py from monkeypatch to tripwire
The 32 monkeypatch.setattr sites in tests/installer/test_spellbook_cco.py
replaced module-level Path/str constants and the _verify_pin helper,
which tripwire cannot intercept directly. Two-part fix:
1. SUT: install_spellbook_cco / uninstall_spellbook_cco / _write_wrapper
/ _clone_or_fetch / _verify_pin gain keyword-only override args
(repo_url, pinned_sha, wrapper_dir, wrapper_path, verify_pin_fn).
Production callers leave them None and the SUT falls back to the
existing module-level getters (_get_repo_url / _get_pinned_sha /
_get_wrapper_dir / _get_wrapper_path); tests pass explicit values.
2. Tests: drop all monkeypatch.setattr on module constants. Tier 1
tests (subprocess plumbing mocked) use tripwire.subprocess.mock_run
inside a sandbox; Tier 2 tests (real file:// fixture repo) run
outside the sandbox with real subprocess and pass redirected paths
directly to the SUT via the new keyword args.
22 tests pass. Production behaviour is byte-for-byte unchanged when
callers do not pass the new keyword args.
* refactor(agents): extract stale-symlink cleanup; handle relative targets
Two MEDIUM gemini findings addressed in one commit:
1. installer/components/agents.py:235 -- the uninstall_agents broken-
symlink path previously only honoured ABSOLUTE raw targets via
raw_target.is_absolute(); relative symlink targets were silently
skipped. Fix: resolve the raw target against entry.parent for the
relative case so cleanup is robust to both ln-s styles.
2. installer/platforms/claude_code.py:412 -- the ~50-line install-time
stale-symlink cleanup duplicated logic from uninstall_agents.
Extracted to a new reusable helper cleanup_stale_agent_symlinks in
installer.components.agents. The helper is the single source of
truth for the staleness heuristic (resolved-target set + broken-link
parent-dir match with strict=True/False fallback). claude_code's
install path now calls the helper instead of inlining the loop.
The helper deliberately differs from uninstall_agents in semantics:
uninstall removes ALL spellbook-resolving symlinks; cleanup_stale only
removes symlinks that point at spellbook agents that no longer exist
(renamed / moved source files, worktree-switch stale).
* fix(tests): correct tripwire required(False) ordering in Windows aliases tests
`tripwire.mock(...).returns(value).required(False)` does NOT mark the
already-registered entry as optional; `.required(False)` is a STICKY
flag that applies to SUBSEQUENT .returns() / .calls() / .raises()
calls on the same proxy. The two Windows-aliases tests had the flag
AFTER the value-stacking call, so the lone FIFO entry inherited the
default required=True. When ``install_aliases_windows`` did not invoke
Path.home (the SUT's dry-run path short-circuits before the call),
tripwire raised UnusedMocksError at teardown -- only visible on the
Windows CI runner where these tests are not skipped.
Fix: ``home_mock.required(False).returns(fake_home)``. The sticky flag
is set before the FIFO push so the entry inherits required=False and
no longer trips UnusedMocksError when the SUT short-circuits.
* fix: address gemini MEDIUM findings (logging, type import, nested sandbox)
Three findings from gemini's 2026-05-18 review:
1. hooks/spellbook_hook.py:1549 -- silent `except OSError: pass` in
`_record_stop_harvest` swallowed disk-full / permission errors.
Logger.warning now surfaces them; the operation remains best-effort
(lossy cache is recoverable on the next harvest because dedup
catches the re-emitted candidates).
2. installer/components/spellbook_cco.py -- `Callable[[Path, str],
tuple[bool, str]] | None` annotation referenced an unimported name.
Although `from __future__ import annotations` defers evaluation,
static analysis still flags the undefined name. Added
`from collections.abc import Callable` and dropped the string-literal
quoting on the annotation.
3. tests/test_hooks/test_memory_auto_store.py:884 -- redundant nested
`with tripwire:` inside an outer `with tripwire:` block. The outer
sandbox is sufficient; the inner one was a leftover from the
monkeypatch -> tripwire conversion.
* fix(tests): access required() via __call__ MethodProxy on Windows tests
The previous fix mis-targeted `.required(False)`: invoking it as
`home_mock.required(False)` routes through `_BaseMock.__getattr__`,
which constructs a fresh MethodProxy for a (non-existent) "required"
method. Calling that proxy runs the standard MethodProxy.__call__
path, which asserts sandbox active and raises SandboxNotActiveError
when the test sets up mocks BEFORE entering `with tripwire:`.
The correct API (per other test_workflow_state_tools.py and
test_spellbook_mcp/test_spawn_session.py usages) is to access the
__call__ MethodProxy explicitly:
home_mock.__call__.required(False).returns(fake_home)
This goes directly to MethodProxy.required (line 89 of
_mock_plugin.py), which sets the sticky flag without invoking
__call__ -- and so doesn't need an active sandbox.
* fix: gemini cycle-3 findings (stub class, silent errors, mid-file import)
Four findings from gemini's 2026-05-18T03:41 review:
1. tests/unit/test_diagram_update.py:79 (HIGH) -- the _StubAgentClient
class and its _stub_client_returns / _stub_client_raises factories
were hand-rolled stand-ins for the SDK AgentClient, forbidden per
AGENTS.md. Replaced with `types.SimpleNamespace(run=async_callable)`
helpers (_agent_client_returning / _raising / _capturing). The
SimpleNamespace is a stdlib data carrier, not a class that exists
only to stand in for a real dependency. The two inline
_CapturingClient classes in TestPatchDiagram and the classify-
prompt test are removed in favour of the same shared helper.
2. installer/components/spellbook_cco.py:530 (MEDIUM) -- the
`shutil.rmtree(..., ignore_errors=True)` rollback silently swallowed
permission/IO failures. Switched to `shutil.rmtree(..., onerror=
_onerror)` with a logger.warning callback so disk-full or chmod
issues are observable. The rollback remains best-effort: the SUT
still returns the canonical pin-verification skipped-reason
regardless of removal success.
3. spellbook/sessions/skill_analyzer.py:418 (MEDIUM) -- the broad
`except Exception: continue` block in skill-usage aggregation
silently dropped malformed sessions. Added module-level logger and
`logger.warning("Skipping malformed session at %s: %s", path, exc)`
so corrupted transcripts surface in operator logs rather than
vanishing from the aggregate.
4. tests/unit/test_stint_tools.py:124 (MEDIUM) -- the mid-file import
of spellbook.coordination.stint at line 116 was deferred behind a
schema-test block from a much earlier code structure. With the
autouse `isolated_db` fixture handling init_db, the deferral is no
longer necessary. Moved the import to the file header alongside
spellbook.core.db.
* fix: gemini cycle-4 findings (hardcoded audit path, memory test mocks)
Two real findings from gemini's cycle-4 review:
1. installer/components/spellbook_cco.py:185 -- hardcoded
`Users-eek-Development-spellbook` in the wrapper template's
Audit comment baked the developer's project path into every
installation. Replaced with runtime-computed project-encoded
path derived from the installer module's repo root
(`Path(__file__).resolve().parent.parent.parent`).
New helpers (with tripwire-callable indirection):
- _get_spellbook_repo_root()
- _encode_project_path()
_wrapper_template, _write_wrapper, install_spellbook_cco gain
a keyword-only `spellbook_repo_root` override so tests pin the
expected encoding deterministically. Production callers leave
it None and fall back to _get_spellbook_repo_root().
The duplicated encoding helper (instead of importing from
spellbook.memory.claude_memory._encode_project) is intentional:
the installer component runs in the bootstrap path before the
spellbook venv is guaranteed importable.
2. tests/test_cli/test_memory_cmd.py:69, :91 -- two unflagged
`monkeypatch.setattr` sites converted to tripwire.mock per
AGENTS.md "Testing with Tripwire" policy.
Three other cycle-4 findings rejected as hallucinations with
inline rationale on the gemini threads:
- _utcnow "missing" (defined at hooks/spellbook_hook.py:73)
- Platform/get_platform "wrong module" (canonical site is
spellbook.core.compat)
- tripwire.in_any_order "doesn't exist" (it does; tripwire.unordered
does not)
Test coverage: 230 passed, 4 skipped on installer/ + memory_cmd.
* fix: gemini cycle-5 findings (wrapper quoting, silent errors, tripwire assert)
Four real findings from gemini's cycle-5 review (CHANGELOG nag deferred
per release-manager policy):
1. installer/components/spellbook_cco.py:224 -- the `exec` line in the
wrapper template left `install_root` unquoted, so an install_root
path containing spaces (or other shell-metasignificant chars) would
fail at runtime. Quoted the `${install_root}/cco` argument and
updated EXPECTED_WRAPPER_TEMPLATE in tests/installer/test_spellbook_cco.py
to match.
2. installer/components/agents.py:240 -- broken-symlink readlink()
OSError silently swallowed; replaced with logger.warning() noting
the failing path + exception, per style guide.
3. installer/components/agents.py:354 -- stale-symlink unlink() OSError
silently swallowed; same pattern, replaced with logger.warning().
Added module-level `logger = logging.getLogger(__name__)`.
4. tests/test_cli/test_memory_cmd.py -- tripwire.mock for get_db_path
was registered but never asserted and the SUT call site was not
wrapped in `with tripwire:`. Both tests now:
- wrap args.func(args) in `with tripwire:`
- drain the timeline post-sandbox via `tripwire.in_any_order()`
- search-path additionally drains the qmd/serena shutil.which()
probes that do_memory_recall triggers (export-path takes the
no-DB short-circuit before any subprocess calls).
* fix: gemini cycle-6 findings (logging, dup nav, phase label, rmtree)
Five real findings from gemini's cycle-6 review (CHANGELOG version-bump
nag deferred per release-manager policy):
1. CHANGELOG.md -- the spellbook_cco entry was labeled
"security architecture Phase 5" but matches the "Phase 7" wiring
from the PR title. Relabeled to Phase 7.
2. mkdocs.yml:222 -- duplicate `commands/canvas.md` entry in the nav
(already declared on line 220). Removed.
3. installer/platforms/claude_code.py:421 -- the wide `except Exception`
that catches install_spellbook_cco failures recorded the message in
the InstallResult but did not log. Added logger.exception() so the
traceback is observable in debug logs alongside the stringified
result.
4. installer/platforms/claude_code.py:461 -- same pattern for the
alias-install dispatch wrapper. Added logger.exception().
5. installer/components/spellbook_cco.py:684 -- uninstall path used
shutil.rmtree(ignore_errors=True), silently swallowing all
permission/IO failures. Replaced with the same onerror logger
callback used elsewhere in this module (the install-path pin-
verification rollback). Per-path failures now surface as
logger.warning(); the uninstall path remains best-effort.
Test updates:
- tests/installer/test_aliases.py::test_install_does_not_abort_when_dispatch_raises
drains the new claude_code.py logger.exception ERROR entry via
tripwire.log.assert_log.
- tests/installer/test_claude_code_wires_default_mode_and_permissions.py
-- _assert_install_mocks and _assert_install_and_uninstall_mocks call
a new _drain_cco_install_exception_log helper that drains the
optional ERROR log emitted when the unmocked install_spellbook_cco
subprocess call fails. required=False with a fallback for tripwire
builds without the kwarg.
Test coverage: 222 passed, 4 skipped on installer/ + 8 on memory_cmd.
* docs(changelog): restore lost space in `SB-BASH-* rules`
Gemini cycle-7 caught a stray missing space between `SB-BASH-*` and
`rules` in the "Bash policy unified across Claude and Gemini paths"
entry. Cosmetic regression; restored.
* test(memory): replace _CountingMock helper with strict tripwire registrations
Removes the _CountingMock / _make_counting hand-rolled wrapper that pre-stacked
50 .required(False).calls(_wrapped) entries per fixture. That pattern was an
optional-mock cargo-cult: it satisfied tripwire's "every intercepted call must
be asserted" contract but destroyed the strict-verification signal by suppressing
both UnusedMocksError (extra registrations) and UnmockedInteractionError
(over-call beyond the budget).
The replacement uses a small _MockBuilder that:
- Wraps tripwire.mock(path) bound to a default side-effect callable.
- Exposes .expect(n) which registers exactly n strict .calls(fn) entries.
- Tracks itself in a per-test registry; an autouse finalizer wildcard-asserts
exactly n recorded interactions after the sandbox exits to satisfy tripwire's
intercepted-call contract.
Every test in this file now declares the SUT's exact mock call count via
.expect(n). Tripwire's FIFO enforces the count strictly:
- SUT calls greater than registered: UnmockedInteractionError on the extra call.
- SUT calls less than registered: UnusedMocksError at teardown.
The real test signal (POST URL/payload shape) is still asserted via the
existing mock_http.calls equality checks; the strict count adds quantitative
discipline back on top.
mock_http.drain() is gone; the controller exposes the same .calls / .responses /
.set_response / .set_handler surface plus .expect(n) and a duck-typed
.verify_all() that the autouse finalizer invokes.
Verified by intentionally over-counting (UnusedMocksError fires) and under-
counting (UnmockedInteractionError fires); both raise as expected.
Per-test call accounting is documented inline at the top of each TestClass.
* fix: gemini cycle-8 findings (changelog typo, Path/utf-8/PATH/silent except)
- CHANGELOG: restore missing space in "SB-BASH-* rules" on line 356.
The prior cycle-7 commit afa9dd87 that claimed this fix landed as an
empty commit, so the typo persisted. This commit applies it for real.
- installer/components/spellbook_cco.py: drop redundant str() wrapping
around Path objects in subprocess.run argv lists. subprocess accepts
path-like objects since 3.6. Updates the tripwire mock_run command=
registrations in tests/installer/test_spellbook_cco.py to match.
- installer/components/spellbook_cco.py: add encoding="utf-8" to all
Path.write_text / Path.read_text calls (the managed-clone marker and
the spellbook-cco wrapper).
- installer/components/spellbook_cco.py: resolve each PATH entry
(expanduser + resolve) before comparing against the wrapper dir so
the on-PATH check handles trailing slashes, "~", and relative entries
the same way as the resolved wrapper dir.
- installer/components/agents.py: replace two silent "except: pass"
blocks in _prune_stale_agent_symlinks with logger.debug(..., exc_info=
True) calls. Debug-level so normal worktree teardowns don't spam WARN,
but the resolve failures are still auditable.
* docs(changelog): backtick `SB-BASH-*` so the missing-space fix sticks
The cycle-7 commit afa9dd87 lost the space ("SB-BASH-*rules" -> intended
"SB-BASH-* rules") and the cycle-8 attempt landed as an empty commit
because the legacy pre-commit markdownlint --fix hook quietly reformats
the bare `SB-BASH-* ` token (the `*` becomes ambiguous emphasis to
markdownlint MD037 and the space is stripped).
Wrap the identifier in backticks so it is a code span, which:
- makes the missing-space fix durable against markdownlint --fix
- typographically marks `SB-BASH-*` as a rule-id family, matching how
BASH-* / EXF-* are referenced two lines below.
* docs(changelog): backtick `BASH-*` / `EXF-*` so markdownlint stops eating the space
Side-effect cleanup from the previous commit: markdownlint --fix
collapsed `BASH-* / EXF-*` to `BASH-*/ EXF-*` (the same MD037 emphasis
detection that ate the SB-BASH-* space). Wrap each identifier in
backticks so the slash separator and surrounding whitespace are stable.
---------
Co-authored-by: elijahr <153711+elijahr@users.noreply.github.com>1 parent 94d9da8 commit be81446
242 files changed
Lines changed: 9810 additions & 1058 deletions
File tree
- agents
- docs
- agents
- hooks
- installer
- components
- platforms
- scripts
- skills/fact-checking/scripts
- spellbook_mcp
- spellbook
- admin/routes
- cli/commands
- code_review
- coordination
- core
- daemon
- db
- forged
- fractal
- health
- mcp
- tools
- memory
- pr_distill
- sdk
- sessions
- updates
- tests
- admin
- docker
- installer
- integration
- test_cli
- test_core
- test_forged
- test_fractal
- test_hooks
- test_orm_migration
- test_reorg
- test_security
- test_spellbook_mcp
- test_code_review
- test_pr_distill
- test_worker_llm
- unit
Some content is hidden
Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
361 | 361 | | |
362 | 362 | | |
363 | 363 | | |
364 | | - | |
| 364 | + | |
365 | 365 | | |
366 | 366 | | |
367 | | - | |
| 367 | + | |
368 | 368 | | |
369 | 369 | | |
370 | 370 | | |
| |||
576 | 576 | | |
577 | 577 | | |
578 | 578 | | |
| 579 | + | |
| 580 | + | |
| 581 | + | |
| 582 | + | |
| 583 | + | |
| 584 | + | |
| 585 | + | |
| 586 | + | |
| 587 | + | |
| 588 | + | |
| 589 | + | |
| 590 | + | |
| 591 | + | |
| 592 | + | |
| 593 | + | |
| 594 | + | |
| 595 | + | |
| 596 | + | |
| 597 | + | |
| 598 | + | |
| 599 | + | |
| 600 | + | |
| 601 | + | |
| 602 | + | |
| 603 | + | |
| 604 | + | |
| 605 | + | |
| 606 | + | |
| 607 | + | |
| 608 | + | |
| 609 | + | |
| 610 | + | |
| 611 | + | |
| 612 | + | |
| 613 | + | |
| 614 | + | |
| 615 | + | |
| 616 | + | |
| 617 | + | |
| 618 | + | |
| 619 | + | |
| 620 | + | |
| 621 | + | |
| 622 | + | |
| 623 | + | |
| 624 | + | |
| 625 | + | |
| 626 | + | |
| 627 | + | |
| 628 | + | |
| 629 | + | |
| 630 | + | |
| 631 | + | |
| 632 | + | |
| 633 | + | |
| 634 | + | |
| 635 | + | |
| 636 | + | |
| 637 | + | |
| 638 | + | |
| 639 | + | |
| 640 | + | |
| 641 | + | |
| 642 | + | |
| 643 | + | |
| 644 | + | |
| 645 | + | |
| 646 | + | |
579 | 647 | | |
580 | 648 | | |
581 | 649 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
36 | 36 | | |
37 | 37 | | |
38 | 38 | | |
| 39 | + | |
| 40 | + | |
39 | 41 | | |
40 | 42 | | |
41 | 43 | | |
| |||
50 | 52 | | |
51 | 53 | | |
52 | 54 | | |
53 | | - | |
| 55 | + | |
54 | 56 | | |
55 | 57 | | |
56 | 58 | | |
| |||
108 | 110 | | |
109 | 111 | | |
110 | 112 | | |
111 | | - | |
| 113 | + | |
112 | 114 | | |
113 | | - | |
114 | | - | |
| 115 | + | |
115 | 116 | | |
116 | | - | |
| 117 | + | |
| 118 | + | |
117 | 119 | | |
118 | 120 | | |
119 | 121 | | |
120 | 122 | | |
121 | 123 | | |
122 | 124 | | |
123 | | - | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
124 | 145 | | |
125 | 146 | | |
126 | 147 | | |
| |||
270 | 291 | | |
271 | 292 | | |
272 | 293 | | |
| 294 | + | |
| 295 | + | |
| 296 | + | |
273 | 297 | | |
274 | 298 | | |
275 | 299 | | |
| |||
495 | 519 | | |
496 | 520 | | |
497 | 521 | | |
| 522 | + | |
| 523 | + | |
| 524 | + | |
498 | 525 | | |
499 | | - | |
| 526 | + | |
500 | 527 | | |
501 | 528 | | |
502 | 529 | | |
| |||
507 | 534 | | |
508 | 535 | | |
509 | 536 | | |
| 537 | + | |
| 538 | + | |
| 539 | + | |
| 540 | + | |
| 541 | + | |
| 542 | + | |
| 543 | + | |
| 544 | + | |
| 545 | + | |
510 | 546 | | |
511 | 547 | | |
512 | 548 | | |
| |||
517 | 553 | | |
518 | 554 | | |
519 | 555 | | |
| 556 | + | |
| 557 | + | |
| 558 | + | |
| 559 | + | |
| 560 | + | |
| 561 | + | |
| 562 | + | |
| 563 | + | |
| 564 | + | |
520 | 565 | | |
521 | 566 | | |
522 | 567 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
0 commit comments