Post-deploy fixes: pin search_vec, add /office/policy to sidebar, doc push hazard by xiaolai · Pull Request #9 · xiaolai/claudepot-app

xiaolai · 2026-05-06T12:39:40Z

Summary

Three corrections after the moderator slice deploy.

1. `submissions.search_vec` got dropped from prod

drizzle-kit push was used to apply migrations 0018–0021 (because migrate hangs over the HTTP-only Neon driver). Push diffs against web/src/db/schema/*.ts, sees no search_vec declaration there, and drops the column. The column was tsvector GENERATED ALWAYS AS (...) STORED from migration 0003_fts.sql — Drizzle's schema DSL can't fully express generated columns.

Restore SQL (run via Neon dashboard SQL editor against prod):

ALTER TABLE submissions
  ADD COLUMN IF NOT EXISTS search_vec tsvector
  GENERATED ALWAYS AS (
    to_tsvector('english',
      coalesce(title, '') || ' ' || coalesce(text, '') || ' ' || coalesce(url, ''))
  ) STORED;
CREATE INDEX IF NOT EXISTS idx_submissions_search_vec ON submissions USING GIN (search_vec);

This PR adds the schema-side declaration so a future push sees the column as expected and won't drop it again. The GENERATED clause stays owned by migration 0003.

2. `/office/policy` was orphaned

The page rendered, but used a standalone breadcrumb instead of OfficeSidebar — readers navigating from /office couldn't find it from the left rail. Added "policy" to OfficeSidebarPage, wired a Shield-icon entry, and rewrapped the page in the proto-page-aside shell.

3. New project rule: never `drizzle-kit push` against prod

.claude/rules/db-migrations.md documents the search_vec incident, lists what else push will silently drop (triggers, generated columns, custom constraint names), and prescribes the safer path: explicit SQL migrations in web/src/db/migrations/<NNNN>_<slug>.sql applied via psql or the Neon dashboard.

Test plan

tsc clean
Apply the restore SQL above to prod via Neon dashboard → curl https://claudepot.com/search?q=test returns 200 (was likely 500)
After merge, /office left rail shows the new "Policy moderation" link → click navigates to /office/policy → sidebar persists with active highlight on the policy entry

…ar; doc push hazard Three corrections after the post-deploy state of the moderator slice: 1. submissions.search_vec was dropped from prod when drizzle-kit push was used to apply migrations 0018-0021 (push hangs on migrate over the HTTP-only Neon driver). The column is `tsvector GENERATED ALWAYS AS (...) STORED` per migration 0003_fts.sql; Drizzle's schema DSL can't fully express it, so push saw a phantom column and dropped it. Now declared via customType<string>("tsvector") so push sees a match and leaves it alone. The actual column DDL still lives in migration 0003 — schema declaration is just the "preserve this" signal for future pushes. Re-creating search_vec in the live DB needs to be done out of band (Neon dashboard SQL editor or psql) since drizzle-kit push cannot author the GENERATED clause. The required SQL is in migration 0003_fts.sql verbatim, idempotent (IF NOT EXISTS). 2. /office/policy now uses the OfficeSidebar layout instead of a standalone breadcrumb. The page was orphaned — visible to readers navigating directly but invisible from /office's left rail. Added "policy" to OfficeSidebarPage union, wired a new entry with the Shield icon, and rewrapped the page in the proto-page-aside shell so navigation between Office sections works consistently. 3. New rule: .claude/rules/db-migrations.md — never use `drizzle-kit push` against prod. Documents the search_vec incident, what other DB features push will want to drop (triggers, generated columns, non-Drizzle constraint names), and the safer path (psql or Neon dashboard with the explicit migration files in web/src/db/migrations/*.sql).

vercel · 2026-05-06T12:39:45Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
claudepot-com	Error		May 6, 2026 0:40am

HIGH fixes: - register_from_browser: rollback private blob + cleanup temp dir on store.insert() failure (#4) - Orphan detection: roundtrip check prevents lossy unsanitize_path from causing false-positive deletions (#1, #10) - --from-token reads from stdin when value is "-" to avoid shell history exposure (#9) MEDIUM fixes: - swap switch(): rollback CC credentials on set_active_cli failure (#11) - register_account_from_profile: delete orphaned private blob on insert failure (#17) - remove_account: reorder to clear pointers + remove DB row before irreversible file deletions (#18) - desktop switch(): propagate DB update failures instead of ignoring (#22) - account.rs: propagate DB permission-setting failures (#20) - project.rs: post-move failures become warnings instead of errors when directory already moved (#16) - account remove --json: emit structured JSON response (#28) - doctor summary: count expired accounts as warnings (#26) LOW fixes: - doctor: surface store.list() failures via db_error field (#29) - keychain status: distinguish "no credential" from "access error" (#30)

Independent Codex audit (5-dim mini, threadId 019de662) flagged 2 High + 8 Medium + 1 Low across the templates feature surface. Independent verification (threadId 019de792) confirmed 10 fixes landed; #11 is deferred with rationale. ### High #1 prerun probe ignored route auth — `probe_sync` in `automations/prerun.rs` now attaches `Authorization: Bearer <key>` (or `Basic <key>` when `auth_scheme` is set) from the route's `GatewayConfig.api_key` when non-empty. LiteLLM / Cloudflare AI Gateway endpoints behind auth no longer false- fail with 401. #2 sidecar swap in `templates_apply_pending` — `commands_templates.rs` now rejects early when `pending.automation_id != automation_id`. Without this, an attacker (or a buggy caller) could pair automation A's pending- changes.json with automation B's broader apply policy. ### Medium #3 Ollama probe detection too broad — was `cfg.base_url.contains ("/api/")` which false-positives on `/api/v1` (OpenRouter, LiteLLM). Now narrowed to `:11434` port OR explicit `/api/tags` path. #4 hardcoded `~/.claudepot` — both `templates_pending_changes` and `templates_read_report` now use `claudepot_data_dir()` from claudepot-core, honoring `CLAUDEPOT_DATA_DIR` overrides correctly. #5 `supported_platforms` was only enforced in `templates_list` — `templates_install` now ALSO rejects when `!bp.supports(HostPlatform::current())`. The contract is symmetric: a direct IPC bypass of the gallery filter no longer lets a macOS-only template instantiate on Linux/Windows. #6 Windows ACL trusted `USERNAME` — `tighten_consent_acl` now resolves the current user SID via `whoami /user /fo csv /nh`, grants `*<SID>:F` in icacls (well-known SID notation), grants SYSTEM via the literal `S-1-5-18` SID, and surfaces every failure path through `tracing::warn` instead of swallowing silently. #7 cron test wrote `CLAUDE_OAUTH_TOKEN` to disk via `automation.extra_env` (which `install_shim` renders into `run.sh`). Token now lives in a 0600 sibling file (`<tmp>/.oauth-token`); a 0700 wrapper script (`<tmp>/claude-with-token.sh`) reads the file at fire time and exec's the real claude. The shim sees only the wrapper path, never the token. #8 cron test scheduled `now + 1 minute`, which races the case where `install_shim + scheduler.register` span past the next minute boundary — launchd would defer to the same time tomorrow and the test would hang. Now schedules `now + 2 minutes`; deadline extended to 180s. #9 cron test passed on a `"result":` substring match, which is too permissive — error rows mention `result` too. Now parses `stdout.log` as a JSON event array, finds the terminal `result` event, asserts `is_error == false`. Verified locally: the cron run produced `is_error=false result_first_80= CRON_TEMPLATE_OK`. #10 `CronCleanupGuard` did not restore the previous `CLAUDEPOT_DATA_DIR` — risk of polluting subsequent tests in the same process. Now captures the prior value as `Option<OsString>` at construction and restores it on Drop. Also explicitly zero-fills + unlinks the token file so a tempdir-removal race doesn't leave the token on disk. ### Low (deferred) #11 validator rebuilds globset matcher per call. Apply is interactive (~5-50 items × 1-3 globs each, user reviews + clicks per run); not a hot loop. Caching would require an `ApplyConfig`-level matcher cache and is not worth the complexity. Documented in the audit-fix verification report. ### Verification Independent Codex re-audit on a fresh thread: | Status | Count | |---|---:| | FIXED | 10 | | NOT FIXED | 0 | | PARTIAL | 0 | | REGRESSED | 0 | | DEFERRED | 1 | Local validation: - `cargo test --workspace` — all green - `cargo test -p claudepot-core --test templates_real_llm cron_schedule_fires -- --ignored` — passes with new is_error=false assertion (110s wall, ~$0.30) Audit threadId: 019de662-808c-7b62-80b3-e85d373f92b5 Verify threadId: 019de792-3515-7533-aac0-0203a96388be

Post-deploy fixes: pin search_vec, add /office/policy to sidebar, doc push hazard

Independent Codex audit (5-dim mini, threadId 019de662) flagged 2 High + 8 Medium + 1 Low across the templates feature surface. Independent verification (threadId 019de792) confirmed 10 fixes landed; #11 is deferred with rationale. ### High #1 prerun probe ignored route auth — `probe_sync` in `automations/prerun.rs` now attaches `Authorization: Bearer <key>` (or `Basic <key>` when `auth_scheme` is set) from the route's `GatewayConfig.api_key` when non-empty. LiteLLM / Cloudflare AI Gateway endpoints behind auth no longer false- fail with 401. #2 sidecar swap in `templates_apply_pending` — `commands_templates.rs` now rejects early when `pending.automation_id != automation_id`. Without this, an attacker (or a buggy caller) could pair automation A's pending- changes.json with automation B's broader apply policy. ### Medium #3 Ollama probe detection too broad — was `cfg.base_url.contains ("/api/")` which false-positives on `/api/v1` (OpenRouter, LiteLLM). Now narrowed to `:11434` port OR explicit `/api/tags` path. #4 hardcoded `~/.claudepot` — both `templates_pending_changes` and `templates_read_report` now use `claudepot_data_dir()` from claudepot-core, honoring `CLAUDEPOT_DATA_DIR` overrides correctly. #5 `supported_platforms` was only enforced in `templates_list` — `templates_install` now ALSO rejects when `!bp.supports(HostPlatform::current())`. The contract is symmetric: a direct IPC bypass of the gallery filter no longer lets a macOS-only template instantiate on Linux/Windows. #6 Windows ACL trusted `USERNAME` — `tighten_consent_acl` now resolves the current user SID via `whoami /user /fo csv /nh`, grants `*<SID>:F` in icacls (well-known SID notation), grants SYSTEM via the literal `S-1-5-18` SID, and surfaces every failure path through `tracing::warn` instead of swallowing silently. #7 cron test wrote `CLAUDE_OAUTH_TOKEN` to disk via `automation.extra_env` (which `install_shim` renders into `run.sh`). Token now lives in a 0600 sibling file (`<tmp>/.oauth-token`); a 0700 wrapper script (`<tmp>/claude-with-token.sh`) reads the file at fire time and exec's the real claude. The shim sees only the wrapper path, never the token. #8 cron test scheduled `now + 1 minute`, which races the case where `install_shim + scheduler.register` span past the next minute boundary — launchd would defer to the same time tomorrow and the test would hang. Now schedules `now + 2 minutes`; deadline extended to 180s. #9 cron test passed on a `"result":` substring match, which is too permissive — error rows mention `result` too. Now parses `stdout.log` as a JSON event array, finds the terminal `result` event, asserts `is_error == false`. Verified locally: the cron run produced `is_error=false result_first_80= CRON_TEMPLATE_OK`. #10 `CronCleanupGuard` did not restore the previous `CLAUDEPOT_DATA_DIR` — risk of polluting subsequent tests in the same process. Now captures the prior value as `Option<OsString>` at construction and restores it on Drop. Also explicitly zero-fills + unlinks the token file so a tempdir-removal race doesn't leave the token on disk. ### Low (deferred) #11 validator rebuilds globset matcher per call. Apply is interactive (~5-50 items × 1-3 globs each, user reviews + clicks per run); not a hot loop. Caching would require an `ApplyConfig`-level matcher cache and is not worth the complexity. Documented in the audit-fix verification report. ### Verification Independent Codex re-audit on a fresh thread: | Status | Count | |---|---:| | FIXED | 10 | | NOT FIXED | 0 | | PARTIAL | 0 | | REGRESSED | 0 | | DEFERRED | 1 | Local validation: - `cargo test --workspace` — all green - `cargo test -p claudepot-core --test templates_real_llm cron_schedule_fires -- --ignored` — passes with new is_error=false assertion (110s wall, ~$0.30) Audit threadId: 019de662-808c-7b62-80b3-e85d373f92b5 Verify threadId: 019de792-3515-7533-aac0-0203a96388be

Post-deploy fixes: pin search_vec, add /office/policy to sidebar, doc push hazard

HIGH fixes: - register_from_browser: rollback private blob + cleanup temp dir on store.insert() failure (#4) - Orphan detection: roundtrip check prevents lossy unsanitize_path from causing false-positive deletions (#1, #10) - --from-token reads from stdin when value is "-" to avoid shell history exposure (#9) MEDIUM fixes: - swap switch(): rollback CC credentials on set_active_cli failure (#11) - register_account_from_profile: delete orphaned private blob on insert failure (#17) - remove_account: reorder to clear pointers + remove DB row before irreversible file deletions (#18) - desktop switch(): propagate DB update failures instead of ignoring (#22) - account.rs: propagate DB permission-setting failures (#20) - project.rs: post-move failures become warnings instead of errors when directory already moved (#16) - account remove --json: emit structured JSON response (#28) - doctor summary: count expired accounts as warnings (#26) LOW fixes: - doctor: surface store.list() failures via db_error field (#29) - keychain status: distinguish "no credential" from "access error" (#30)

Independent Codex audit (5-dim mini, threadId 019de662) flagged 2 High + 8 Medium + 1 Low across the templates feature surface. Independent verification (threadId 019de792) confirmed 10 fixes landed; #11 is deferred with rationale. ### High #1 prerun probe ignored route auth — `probe_sync` in `automations/prerun.rs` now attaches `Authorization: Bearer <key>` (or `Basic <key>` when `auth_scheme` is set) from the route's `GatewayConfig.api_key` when non-empty. LiteLLM / Cloudflare AI Gateway endpoints behind auth no longer false- fail with 401. #2 sidecar swap in `templates_apply_pending` — `commands_templates.rs` now rejects early when `pending.automation_id != automation_id`. Without this, an attacker (or a buggy caller) could pair automation A's pending- changes.json with automation B's broader apply policy. ### Medium #3 Ollama probe detection too broad — was `cfg.base_url.contains ("/api/")` which false-positives on `/api/v1` (OpenRouter, LiteLLM). Now narrowed to `:11434` port OR explicit `/api/tags` path. #4 hardcoded `~/.claudepot` — both `templates_pending_changes` and `templates_read_report` now use `claudepot_data_dir()` from claudepot-core, honoring `CLAUDEPOT_DATA_DIR` overrides correctly. #5 `supported_platforms` was only enforced in `templates_list` — `templates_install` now ALSO rejects when `!bp.supports(HostPlatform::current())`. The contract is symmetric: a direct IPC bypass of the gallery filter no longer lets a macOS-only template instantiate on Linux/Windows. #6 Windows ACL trusted `USERNAME` — `tighten_consent_acl` now resolves the current user SID via `whoami /user /fo csv /nh`, grants `*<SID>:F` in icacls (well-known SID notation), grants SYSTEM via the literal `S-1-5-18` SID, and surfaces every failure path through `tracing::warn` instead of swallowing silently. #7 cron test wrote `CLAUDE_OAUTH_TOKEN` to disk via `automation.extra_env` (which `install_shim` renders into `run.sh`). Token now lives in a 0600 sibling file (`<tmp>/.oauth-token`); a 0700 wrapper script (`<tmp>/claude-with-token.sh`) reads the file at fire time and exec's the real claude. The shim sees only the wrapper path, never the token. #8 cron test scheduled `now + 1 minute`, which races the case where `install_shim + scheduler.register` span past the next minute boundary — launchd would defer to the same time tomorrow and the test would hang. Now schedules `now + 2 minutes`; deadline extended to 180s. #9 cron test passed on a `"result":` substring match, which is too permissive — error rows mention `result` too. Now parses `stdout.log` as a JSON event array, finds the terminal `result` event, asserts `is_error == false`. Verified locally: the cron run produced `is_error=false result_first_80= CRON_TEMPLATE_OK`. #10 `CronCleanupGuard` did not restore the previous `CLAUDEPOT_DATA_DIR` — risk of polluting subsequent tests in the same process. Now captures the prior value as `Option<OsString>` at construction and restores it on Drop. Also explicitly zero-fills + unlinks the token file so a tempdir-removal race doesn't leave the token on disk. ### Low (deferred) #11 validator rebuilds globset matcher per call. Apply is interactive (~5-50 items × 1-3 globs each, user reviews + clicks per run); not a hot loop. Caching would require an `ApplyConfig`-level matcher cache and is not worth the complexity. Documented in the audit-fix verification report. ### Verification Independent Codex re-audit on a fresh thread: | Status | Count | |---|---:| | FIXED | 10 | | NOT FIXED | 0 | | PARTIAL | 0 | | REGRESSED | 0 | | DEFERRED | 1 | Local validation: - `cargo test --workspace` — all green - `cargo test -p claudepot-core --test templates_real_llm cron_schedule_fires -- --ignored` — passes with new is_error=false assertion (110s wall, ~$0.30) Audit threadId: 019de662-808c-7b62-80b3-e85d373f92b5 Verify threadId: 019de792-3515-7533-aac0-0203a96388be

Post-deploy fixes: pin search_vec, add /office/policy to sidebar, doc push hazard

Eight findings collapsed into one refactor cycle plus two small cleanups. The grill report (in conversation; not on disk) flagged 0 critical / 0 high / 3 medium / 8 low — this addresses every actionable item. Core reshape (#1 + #6 + #7): - detect_pr returns DetectOutcome { branch, pr } so the orchestrator keys the cache directly. Halves the per-tick git invocation count (no more standalone current_branch call). - PrCache keyed by repo_root only; insert() overwrites unconditionally so a branch flip is absorbed by the next tick without explicit detection. Drops get_any_for (was O(n)), drops the dead Entry.branch field, drops refresh_one's early-return guard. Orchestrator parallelism + gh latching (#2 + #9): - tick_all fans out via tokio Semaphore + JoinSet with MAX_PARALLEL=4 cap on simultaneous gh invocations. 30-project tick now completes in ~8 batches instead of 30 sequential calls. - First MissingCli("gh") flips an AtomicBool that short-circuits every subsequent tick. gh-less users pay one detect attempt per process lifetime, then nothing. Test coverage (#8): - New PrDetector trait + RealDetector wrapper so the orchestrator is testable against a FakeDetector. Four new tokio tests cover cache round-trip, negative caching, gh-absent latch, and the bounded-parallel fan-out (12 roots vs MAX_PARALLEL=4). - DTO tests: PrInfoDto serializes PrState lowercase; ProjectInfoDto omits the pr field when None. - PrCache test for explicit overwrite semantics. UI hygiene (#3, #4, #5, #11): - Extract liveDotTitle to src/components/primitives so ProjectDetail and ProjectsTable share one implementation. - LiveStatusDot.title is now required in the prop type — the "mandatory by comment" contract was decay-prone. - BrandGithubMark license comment now correctly says the silhouette is a GitHub Inc. trademark used under the design.md brand-mark exception (was claiming CC-BY 4.0 on GitHub's mark, which doesn't apply). - cwdMatchesProject now normalizes both inputs to forward slash before prefix-checking, so a mixed-separator Windows project path matches its live cwd. Three new regression tests cover the cases. Tests: 81 + 613 green. Clippy --all-targets clean. Fmt clean.

vercel Bot had a problem deploying to Preview May 6, 2026 12:40 Failure

xiaolai merged commit 4870a33 into main May 6, 2026
7 of 8 checks passed

xiaolai deleted the post-deploy-fixes branch May 6, 2026 12:43

xiaolai added a commit that referenced this pull request May 8, 2026

Merge pull request #9 from xiaolai/post-deploy-fixes

fa98404

Post-deploy fixes: pin search_vec, add /office/policy to sidebar, doc push hazard

xiaolai added a commit that referenced this pull request May 8, 2026

Merge pull request #9 from xiaolai/post-deploy-fixes

16b256e

Post-deploy fixes: pin search_vec, add /office/policy to sidebar, doc push hazard

xiaolai added a commit that referenced this pull request May 8, 2026

Merge pull request #9 from xiaolai/post-deploy-fixes

bc88ac2

Post-deploy fixes: pin search_vec, add /office/policy to sidebar, doc push hazard

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Post-deploy fixes: pin search_vec, add /office/policy to sidebar, doc push hazard#9

Post-deploy fixes: pin search_vec, add /office/policy to sidebar, doc push hazard#9
xiaolai merged 1 commit into
mainfrom
post-deploy-fixes

xiaolai commented May 6, 2026

Uh oh!

vercel Bot commented May 6, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

xiaolai commented May 6, 2026

Summary

1. submissions.search_vec got dropped from prod

2. /office/policy was orphaned

3. New project rule: never drizzle-kit push against prod

Test plan

Uh oh!

vercel Bot commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

1. `submissions.search_vec` got dropped from prod

2. `/office/policy` was orphaned

3. New project rule: never `drizzle-kit push` against prod

vercel Bot commented May 6, 2026 •

edited

Loading