Post-deploy fixes: pin search_vec, add /office/policy to sidebar, doc push hazard#9
Merged
Conversation
…ar; doc push hazard
Three corrections after the post-deploy state of the moderator slice:
1. submissions.search_vec was dropped from prod when drizzle-kit push
was used to apply migrations 0018-0021 (push hangs on migrate over
the HTTP-only Neon driver). The column is `tsvector GENERATED ALWAYS
AS (...) STORED` per migration 0003_fts.sql; Drizzle's schema DSL
can't fully express it, so push saw a phantom column and dropped
it. Now declared via customType<string>("tsvector") so push sees
a match and leaves it alone. The actual column DDL still lives in
migration 0003 — schema declaration is just the "preserve this"
signal for future pushes.
Re-creating search_vec in the live DB needs to be done out of
band (Neon dashboard SQL editor or psql) since drizzle-kit push
cannot author the GENERATED clause. The required SQL is in
migration 0003_fts.sql verbatim, idempotent (IF NOT EXISTS).
2. /office/policy now uses the OfficeSidebar layout instead of a
standalone breadcrumb. The page was orphaned — visible to readers
navigating directly but invisible from /office's left rail. Added
"policy" to OfficeSidebarPage union, wired a new entry with
the Shield icon, and rewrapped the page in the proto-page-aside
shell so navigation between Office sections works consistently.
3. New rule: .claude/rules/db-migrations.md — never use `drizzle-kit
push` against prod. Documents the search_vec incident, what other
DB features push will want to drop (triggers, generated columns,
non-Drizzle constraint names), and the safer path (psql or Neon
dashboard with the explicit migration files in
web/src/db/migrations/*.sql).
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
xiaolai
added a commit
that referenced
this pull request
May 8, 2026
HIGH fixes: - register_from_browser: rollback private blob + cleanup temp dir on store.insert() failure (#4) - Orphan detection: roundtrip check prevents lossy unsanitize_path from causing false-positive deletions (#1, #10) - --from-token reads from stdin when value is "-" to avoid shell history exposure (#9) MEDIUM fixes: - swap switch(): rollback CC credentials on set_active_cli failure (#11) - register_account_from_profile: delete orphaned private blob on insert failure (#17) - remove_account: reorder to clear pointers + remove DB row before irreversible file deletions (#18) - desktop switch(): propagate DB update failures instead of ignoring (#22) - account.rs: propagate DB permission-setting failures (#20) - project.rs: post-move failures become warnings instead of errors when directory already moved (#16) - account remove --json: emit structured JSON response (#28) - doctor summary: count expired accounts as warnings (#26) LOW fixes: - doctor: surface store.list() failures via db_error field (#29) - keychain status: distinguish "no credential" from "access error" (#30)
xiaolai
added a commit
that referenced
this pull request
May 8, 2026
Independent Codex audit (5-dim mini, threadId 019de662) flagged 2 High + 8 Medium + 1 Low across the templates feature surface. Independent verification (threadId 019de792) confirmed 10 fixes landed; #11 is deferred with rationale. ### High #1 prerun probe ignored route auth — `probe_sync` in `automations/prerun.rs` now attaches `Authorization: Bearer <key>` (or `Basic <key>` when `auth_scheme` is set) from the route's `GatewayConfig.api_key` when non-empty. LiteLLM / Cloudflare AI Gateway endpoints behind auth no longer false- fail with 401. #2 sidecar swap in `templates_apply_pending` — `commands_templates.rs` now rejects early when `pending.automation_id != automation_id`. Without this, an attacker (or a buggy caller) could pair automation A's pending- changes.json with automation B's broader apply policy. ### Medium #3 Ollama probe detection too broad — was `cfg.base_url.contains ("/api/")` which false-positives on `/api/v1` (OpenRouter, LiteLLM). Now narrowed to `:11434` port OR explicit `/api/tags` path. #4 hardcoded `~/.claudepot` — both `templates_pending_changes` and `templates_read_report` now use `claudepot_data_dir()` from claudepot-core, honoring `CLAUDEPOT_DATA_DIR` overrides correctly. #5 `supported_platforms` was only enforced in `templates_list` — `templates_install` now ALSO rejects when `!bp.supports(HostPlatform::current())`. The contract is symmetric: a direct IPC bypass of the gallery filter no longer lets a macOS-only template instantiate on Linux/Windows. #6 Windows ACL trusted `USERNAME` — `tighten_consent_acl` now resolves the current user SID via `whoami /user /fo csv /nh`, grants `*<SID>:F` in icacls (well-known SID notation), grants SYSTEM via the literal `S-1-5-18` SID, and surfaces every failure path through `tracing::warn` instead of swallowing silently. #7 cron test wrote `CLAUDE_OAUTH_TOKEN` to disk via `automation.extra_env` (which `install_shim` renders into `run.sh`). Token now lives in a 0600 sibling file (`<tmp>/.oauth-token`); a 0700 wrapper script (`<tmp>/claude-with-token.sh`) reads the file at fire time and exec's the real claude. The shim sees only the wrapper path, never the token. #8 cron test scheduled `now + 1 minute`, which races the case where `install_shim + scheduler.register` span past the next minute boundary — launchd would defer to the same time tomorrow and the test would hang. Now schedules `now + 2 minutes`; deadline extended to 180s. #9 cron test passed on a `"result":` substring match, which is too permissive — error rows mention `result` too. Now parses `stdout.log` as a JSON event array, finds the terminal `result` event, asserts `is_error == false`. Verified locally: the cron run produced `is_error=false result_first_80= CRON_TEMPLATE_OK`. #10 `CronCleanupGuard` did not restore the previous `CLAUDEPOT_DATA_DIR` — risk of polluting subsequent tests in the same process. Now captures the prior value as `Option<OsString>` at construction and restores it on Drop. Also explicitly zero-fills + unlinks the token file so a tempdir-removal race doesn't leave the token on disk. ### Low (deferred) #11 validator rebuilds globset matcher per call. Apply is interactive (~5-50 items × 1-3 globs each, user reviews + clicks per run); not a hot loop. Caching would require an `ApplyConfig`-level matcher cache and is not worth the complexity. Documented in the audit-fix verification report. ### Verification Independent Codex re-audit on a fresh thread: | Status | Count | |---|---:| | FIXED | 10 | | NOT FIXED | 0 | | PARTIAL | 0 | | REGRESSED | 0 | | DEFERRED | 1 | Local validation: - `cargo test --workspace` — all green - `cargo test -p claudepot-core --test templates_real_llm cron_schedule_fires -- --ignored` — passes with new is_error=false assertion (110s wall, ~$0.30) Audit threadId: 019de662-808c-7b62-80b3-e85d373f92b5 Verify threadId: 019de792-3515-7533-aac0-0203a96388be
xiaolai
added a commit
that referenced
this pull request
May 8, 2026
Post-deploy fixes: pin search_vec, add /office/policy to sidebar, doc push hazard
xiaolai
added a commit
that referenced
this pull request
May 8, 2026
Independent Codex audit (5-dim mini, threadId 019de662) flagged 2 High + 8 Medium + 1 Low across the templates feature surface. Independent verification (threadId 019de792) confirmed 10 fixes landed; #11 is deferred with rationale. ### High #1 prerun probe ignored route auth — `probe_sync` in `automations/prerun.rs` now attaches `Authorization: Bearer <key>` (or `Basic <key>` when `auth_scheme` is set) from the route's `GatewayConfig.api_key` when non-empty. LiteLLM / Cloudflare AI Gateway endpoints behind auth no longer false- fail with 401. #2 sidecar swap in `templates_apply_pending` — `commands_templates.rs` now rejects early when `pending.automation_id != automation_id`. Without this, an attacker (or a buggy caller) could pair automation A's pending- changes.json with automation B's broader apply policy. ### Medium #3 Ollama probe detection too broad — was `cfg.base_url.contains ("/api/")` which false-positives on `/api/v1` (OpenRouter, LiteLLM). Now narrowed to `:11434` port OR explicit `/api/tags` path. #4 hardcoded `~/.claudepot` — both `templates_pending_changes` and `templates_read_report` now use `claudepot_data_dir()` from claudepot-core, honoring `CLAUDEPOT_DATA_DIR` overrides correctly. #5 `supported_platforms` was only enforced in `templates_list` — `templates_install` now ALSO rejects when `!bp.supports(HostPlatform::current())`. The contract is symmetric: a direct IPC bypass of the gallery filter no longer lets a macOS-only template instantiate on Linux/Windows. #6 Windows ACL trusted `USERNAME` — `tighten_consent_acl` now resolves the current user SID via `whoami /user /fo csv /nh`, grants `*<SID>:F` in icacls (well-known SID notation), grants SYSTEM via the literal `S-1-5-18` SID, and surfaces every failure path through `tracing::warn` instead of swallowing silently. #7 cron test wrote `CLAUDE_OAUTH_TOKEN` to disk via `automation.extra_env` (which `install_shim` renders into `run.sh`). Token now lives in a 0600 sibling file (`<tmp>/.oauth-token`); a 0700 wrapper script (`<tmp>/claude-with-token.sh`) reads the file at fire time and exec's the real claude. The shim sees only the wrapper path, never the token. #8 cron test scheduled `now + 1 minute`, which races the case where `install_shim + scheduler.register` span past the next minute boundary — launchd would defer to the same time tomorrow and the test would hang. Now schedules `now + 2 minutes`; deadline extended to 180s. #9 cron test passed on a `"result":` substring match, which is too permissive — error rows mention `result` too. Now parses `stdout.log` as a JSON event array, finds the terminal `result` event, asserts `is_error == false`. Verified locally: the cron run produced `is_error=false result_first_80= CRON_TEMPLATE_OK`. #10 `CronCleanupGuard` did not restore the previous `CLAUDEPOT_DATA_DIR` — risk of polluting subsequent tests in the same process. Now captures the prior value as `Option<OsString>` at construction and restores it on Drop. Also explicitly zero-fills + unlinks the token file so a tempdir-removal race doesn't leave the token on disk. ### Low (deferred) #11 validator rebuilds globset matcher per call. Apply is interactive (~5-50 items × 1-3 globs each, user reviews + clicks per run); not a hot loop. Caching would require an `ApplyConfig`-level matcher cache and is not worth the complexity. Documented in the audit-fix verification report. ### Verification Independent Codex re-audit on a fresh thread: | Status | Count | |---|---:| | FIXED | 10 | | NOT FIXED | 0 | | PARTIAL | 0 | | REGRESSED | 0 | | DEFERRED | 1 | Local validation: - `cargo test --workspace` — all green - `cargo test -p claudepot-core --test templates_real_llm cron_schedule_fires -- --ignored` — passes with new is_error=false assertion (110s wall, ~$0.30) Audit threadId: 019de662-808c-7b62-80b3-e85d373f92b5 Verify threadId: 019de792-3515-7533-aac0-0203a96388be
xiaolai
added a commit
that referenced
this pull request
May 8, 2026
Post-deploy fixes: pin search_vec, add /office/policy to sidebar, doc push hazard
xiaolai
added a commit
that referenced
this pull request
May 8, 2026
HIGH fixes: - register_from_browser: rollback private blob + cleanup temp dir on store.insert() failure (#4) - Orphan detection: roundtrip check prevents lossy unsanitize_path from causing false-positive deletions (#1, #10) - --from-token reads from stdin when value is "-" to avoid shell history exposure (#9) MEDIUM fixes: - swap switch(): rollback CC credentials on set_active_cli failure (#11) - register_account_from_profile: delete orphaned private blob on insert failure (#17) - remove_account: reorder to clear pointers + remove DB row before irreversible file deletions (#18) - desktop switch(): propagate DB update failures instead of ignoring (#22) - account.rs: propagate DB permission-setting failures (#20) - project.rs: post-move failures become warnings instead of errors when directory already moved (#16) - account remove --json: emit structured JSON response (#28) - doctor summary: count expired accounts as warnings (#26) LOW fixes: - doctor: surface store.list() failures via db_error field (#29) - keychain status: distinguish "no credential" from "access error" (#30)
xiaolai
added a commit
that referenced
this pull request
May 8, 2026
Independent Codex audit (5-dim mini, threadId 019de662) flagged 2 High + 8 Medium + 1 Low across the templates feature surface. Independent verification (threadId 019de792) confirmed 10 fixes landed; #11 is deferred with rationale. ### High #1 prerun probe ignored route auth — `probe_sync` in `automations/prerun.rs` now attaches `Authorization: Bearer <key>` (or `Basic <key>` when `auth_scheme` is set) from the route's `GatewayConfig.api_key` when non-empty. LiteLLM / Cloudflare AI Gateway endpoints behind auth no longer false- fail with 401. #2 sidecar swap in `templates_apply_pending` — `commands_templates.rs` now rejects early when `pending.automation_id != automation_id`. Without this, an attacker (or a buggy caller) could pair automation A's pending- changes.json with automation B's broader apply policy. ### Medium #3 Ollama probe detection too broad — was `cfg.base_url.contains ("/api/")` which false-positives on `/api/v1` (OpenRouter, LiteLLM). Now narrowed to `:11434` port OR explicit `/api/tags` path. #4 hardcoded `~/.claudepot` — both `templates_pending_changes` and `templates_read_report` now use `claudepot_data_dir()` from claudepot-core, honoring `CLAUDEPOT_DATA_DIR` overrides correctly. #5 `supported_platforms` was only enforced in `templates_list` — `templates_install` now ALSO rejects when `!bp.supports(HostPlatform::current())`. The contract is symmetric: a direct IPC bypass of the gallery filter no longer lets a macOS-only template instantiate on Linux/Windows. #6 Windows ACL trusted `USERNAME` — `tighten_consent_acl` now resolves the current user SID via `whoami /user /fo csv /nh`, grants `*<SID>:F` in icacls (well-known SID notation), grants SYSTEM via the literal `S-1-5-18` SID, and surfaces every failure path through `tracing::warn` instead of swallowing silently. #7 cron test wrote `CLAUDE_OAUTH_TOKEN` to disk via `automation.extra_env` (which `install_shim` renders into `run.sh`). Token now lives in a 0600 sibling file (`<tmp>/.oauth-token`); a 0700 wrapper script (`<tmp>/claude-with-token.sh`) reads the file at fire time and exec's the real claude. The shim sees only the wrapper path, never the token. #8 cron test scheduled `now + 1 minute`, which races the case where `install_shim + scheduler.register` span past the next minute boundary — launchd would defer to the same time tomorrow and the test would hang. Now schedules `now + 2 minutes`; deadline extended to 180s. #9 cron test passed on a `"result":` substring match, which is too permissive — error rows mention `result` too. Now parses `stdout.log` as a JSON event array, finds the terminal `result` event, asserts `is_error == false`. Verified locally: the cron run produced `is_error=false result_first_80= CRON_TEMPLATE_OK`. #10 `CronCleanupGuard` did not restore the previous `CLAUDEPOT_DATA_DIR` — risk of polluting subsequent tests in the same process. Now captures the prior value as `Option<OsString>` at construction and restores it on Drop. Also explicitly zero-fills + unlinks the token file so a tempdir-removal race doesn't leave the token on disk. ### Low (deferred) #11 validator rebuilds globset matcher per call. Apply is interactive (~5-50 items × 1-3 globs each, user reviews + clicks per run); not a hot loop. Caching would require an `ApplyConfig`-level matcher cache and is not worth the complexity. Documented in the audit-fix verification report. ### Verification Independent Codex re-audit on a fresh thread: | Status | Count | |---|---:| | FIXED | 10 | | NOT FIXED | 0 | | PARTIAL | 0 | | REGRESSED | 0 | | DEFERRED | 1 | Local validation: - `cargo test --workspace` — all green - `cargo test -p claudepot-core --test templates_real_llm cron_schedule_fires -- --ignored` — passes with new is_error=false assertion (110s wall, ~$0.30) Audit threadId: 019de662-808c-7b62-80b3-e85d373f92b5 Verify threadId: 019de792-3515-7533-aac0-0203a96388be
xiaolai
added a commit
that referenced
this pull request
May 8, 2026
Post-deploy fixes: pin search_vec, add /office/policy to sidebar, doc push hazard
xiaolai
added a commit
that referenced
this pull request
May 18, 2026
Eight findings collapsed into one refactor cycle plus two small cleanups. The grill report (in conversation; not on disk) flagged 0 critical / 0 high / 3 medium / 8 low — this addresses every actionable item. Core reshape (#1 + #6 + #7): - detect_pr returns DetectOutcome { branch, pr } so the orchestrator keys the cache directly. Halves the per-tick git invocation count (no more standalone current_branch call). - PrCache keyed by repo_root only; insert() overwrites unconditionally so a branch flip is absorbed by the next tick without explicit detection. Drops get_any_for (was O(n)), drops the dead Entry.branch field, drops refresh_one's early-return guard. Orchestrator parallelism + gh latching (#2 + #9): - tick_all fans out via tokio Semaphore + JoinSet with MAX_PARALLEL=4 cap on simultaneous gh invocations. 30-project tick now completes in ~8 batches instead of 30 sequential calls. - First MissingCli("gh") flips an AtomicBool that short-circuits every subsequent tick. gh-less users pay one detect attempt per process lifetime, then nothing. Test coverage (#8): - New PrDetector trait + RealDetector wrapper so the orchestrator is testable against a FakeDetector. Four new tokio tests cover cache round-trip, negative caching, gh-absent latch, and the bounded-parallel fan-out (12 roots vs MAX_PARALLEL=4). - DTO tests: PrInfoDto serializes PrState lowercase; ProjectInfoDto omits the pr field when None. - PrCache test for explicit overwrite semantics. UI hygiene (#3, #4, #5, #11): - Extract liveDotTitle to src/components/primitives so ProjectDetail and ProjectsTable share one implementation. - LiveStatusDot.title is now required in the prop type — the "mandatory by comment" contract was decay-prone. - BrandGithubMark license comment now correctly says the silhouette is a GitHub Inc. trademark used under the design.md brand-mark exception (was claiming CC-BY 4.0 on GitHub's mark, which doesn't apply). - cwdMatchesProject now normalizes both inputs to forward slash before prefix-checking, so a mixed-separator Windows project path matches its live cwd. Three new regression tests cover the cases. Tests: 81 + 613 green. Clippy --all-targets clean. Fmt clean.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Three corrections after the moderator slice deploy.
1.
submissions.search_vecgot dropped from proddrizzle-kit pushwas used to apply migrations 0018–0021 (becausemigratehangs over the HTTP-only Neon driver). Push diffs againstweb/src/db/schema/*.ts, sees nosearch_vecdeclaration there, and drops the column. The column wastsvector GENERATED ALWAYS AS (...) STOREDfrom migration0003_fts.sql— Drizzle's schema DSL can't fully express generated columns.Restore SQL (run via Neon dashboard SQL editor against prod):
This PR adds the schema-side declaration so a future push sees the column as expected and won't drop it again. The GENERATED clause stays owned by migration 0003.
2.
/office/policywas orphanedThe page rendered, but used a standalone breadcrumb instead of
OfficeSidebar— readers navigating from/officecouldn't find it from the left rail. Added"policy"toOfficeSidebarPage, wired a Shield-icon entry, and rewrapped the page in theproto-page-asideshell.3. New project rule: never
drizzle-kit pushagainst prod.claude/rules/db-migrations.mddocuments the search_vec incident, lists what else push will silently drop (triggers, generated columns, custom constraint names), and prescribes the safer path: explicit SQL migrations inweb/src/db/migrations/<NNNN>_<slug>.sqlapplied via psql or the Neon dashboard.Test plan
curl https://claudepot.com/search?q=testreturns 200 (was likely 500)/officeleft rail shows the new "Policy moderation" link → click navigates to/office/policy→ sidebar persists with active highlight on the policy entry