Summary
Deleting all agents from the VSIX did not remove all messages — message rows/UI entries survived after every agent was gone. Either (A) the DB is not actually cascade-deleting messages on the live database, or (B) the UI is not reflecting the deletion (stale view / silent refresh failure). Both are plausible given the evidence below. Investigate and confirm which before fixing. Per request: do NOT fix yet — this issue is to nail the root cause.
Hypothesis A — FK cascade isn't actually running on the live DB
The schema declares the right cascades, and a fresh DB passes E2E — but an existing data.db may not, because of how SQLite stores FK actions.
- Schema is correct (packages/too-many-cooks/prisma/schema.prisma):
  - Message.from (from_agent) and Message.to (to_agent) both declare onDelete: Cascade (lines ~47-48).
  - Lock.identity onDelete: Cascade (line ~34), Plan.identity onDelete: Cascade (line ~73).
- Pragma is set per-connection: packages/too-many-cooks/src/db-sqlite.ts:153 calls db.pragma("foreign_keys = ON").
- Delete relies purely on cascade: at db-sqlite.ts:917-945, adminDeleteAgent runs a single DELETE FROM identity WHERE agent_name = ? and trusts the DB to cascade.
- Fresh-DB E2E passes: too_many_cooks_vscode_extension/test/suite/deleteAllAgents.test.ts proves cascade works on a newly created DB.
Why an existing DB can still leak messages
The messages.to_agent cascade was added only recently — migration packages/too-many-cooks/prisma/migrations/20260525000000_add_to_agent_fk_cascade/migration.sql. SQLite bakes FK actions into the table DDL at CREATE time; PRAGMA foreign_keys = ON cannot retroactively add a cascade. That migration is a full table rebuild (RedefineTables: create new_messages with both FKs → copy → drop → rename). So whether a given data.db cascades inbound messages depends entirely on that migration having actually run against it.
Drift risk between two schema-apply paths:
- Boot path uses prisma migrate deploy: db-sqlite.ts:148 (applyMigrations), error string "Prisma migrate deploy failed", log "Schema applied via prisma migrate deploy".
- But packages/too-many-cooks/src/migrate.ts uses prisma db push --accept-data-loss.
A DB ever created via db push has no _prisma_migrations history; a later migrate deploy can then fail/skip applying 20260525..., leaving the old messages table without the to_agent cascade. Result: deleting an agent removes the agent but orphans every message addressed to it — exactly the reported symptom (messages survive agent deletion). CLAUDE.md states there is no legacy DB migration support ("delete the stale DB and recreate"), which makes a pre-cascade data.db a live hazard rather than a handled case.
Investigation steps (read-only)
On an affected data.db:
SELECT sql FROM sqlite_master WHERE name = 'messages'; -- does to_agent FK say ON DELETE CASCADE?
PRAGMA foreign_key_list('messages'); -- both from_agent AND to_agent present with cascade?
SELECT * FROM _prisma_migrations WHERE migration_name LIKE '20260525%'; -- was the cascade migration applied?
PRAGMA foreign_keys; -- is it ON for this connection?
If the messages DDL lacks ON DELETE CASCADE on to_agent (or the migration row is missing) → Hypothesis A confirmed: the live schema, not the code, is the bug, and the migrate deploy vs db push drift is the cause.
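The check on the PRAGMA foreign_key_list output can be sketched in TypeScript as below. This is a minimal sketch, not code from the repo: columnsMissingCascade is a hypothetical helper, the row shape follows the columns SQLite documents for that pragma, and the sample rows are illustrative (including the assumption that the FK targets identity.agent_name), not data read from a real data.db.

```typescript
// One row of SQLite's `PRAGMA foreign_key_list(<table>)` output.
interface ForeignKeyRow {
  id: number;
  seq: number;
  table: string; // parent table the key points at
  from: string; // child column holding the key
  to: string | null; // parent column
  on_update: string;
  on_delete: string; // "CASCADE", "NO ACTION", "SET NULL", ...
  match: string;
}

// Hypothetical helper: returns every required column that has no FK row
// with ON DELETE CASCADE. An absent FK and a non-cascading FK are both flagged.
function columnsMissingCascade(
  rows: readonly ForeignKeyRow[],
  requiredColumns: readonly string[],
): string[] {
  return requiredColumns.filter(
    (column) =>
      !rows.some(
        (row) => row.from === column && row.on_delete.toUpperCase() === "CASCADE",
      ),
  );
}

// Illustrative rows for a messages table that predates the to_agent migration.
// ASSUMPTION: the FK references identity.agent_name; verify against the real DDL.
const preCascadeMessages: ForeignKeyRow[] = [
  {
    id: 0,
    seq: 0,
    table: "identity",
    from: "from_agent",
    to: "agent_name",
    on_update: "NO ACTION",
    on_delete: "CASCADE",
    match: "NONE",
  },
];

// to_agent is reported because no cascading FK row exists for it.
console.log(columnsMissingCascade(preCascadeMessages, ["from_agent", "to_agent"]));
```

A non-empty result for the messages table would point at Hypothesis A; an empty result means the cascade is declared and attention should shift to the pragma state and Hypothesis B.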
Hypothesis B — UI not reactive / silent refresh failure
The delete path doesn't mutate messages locally; it refetches server truth — but that refetch can silently no-op, leaving stale messages on screen.
Delete path (too_many_cooks_vscode_extension/src/services/storeManager.ts:238-257): deleteAgent / deleteAllAgents POST /admin/delete-agent (once per agent), then call refreshStatus().
refreshStatus() swallows failures silently (storeManager.ts:196-227):
- Lines 204 and 210: a refreshSeq race guard early-returns if a newer refresh started. If requests overlap, an in-flight refresh can bail without ever dispatching SetMessages.
- Lines 205-208: a non-ok HTTP response is logged and swallowed by a bare return, with no error surfaced and no state update. The UI keeps showing the pre-delete messages and the user sees no indication anything failed.
Only on the happy path does it dispatch({ messages, type: 'SetMessages' }) (line 225) with server truth.
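For discussion only (no fix is proposed here), the two early-return paths could be made observable by having each refresh attempt report an explicit outcome. The sketch below is an assumption about one possible shape, not the storeManager.ts code: RefreshGuard, RefreshOutcome, and the response object are hypothetical names introduced for illustration.

```typescript
// Every refresh attempt ends in exactly one of these outcomes; none is silent.
type RefreshOutcome<T> =
  | { kind: "applied"; data: T }
  | { kind: "superseded" }
  | { kind: "failed"; reason: string };

// Hypothetical sequence guard, playing the role the issue attributes to refreshSeq.
class RefreshGuard {
  private latest = 0;

  // Call when a refresh starts; returns the ticket for that attempt.
  begin(): number {
    this.latest += 1;
    return this.latest;
  }

  // Call when the response arrives; classifies the attempt instead of returning void.
  settle<T>(
    ticket: number,
    response: { ok: boolean; status: number; body?: T },
  ): RefreshOutcome<T> {
    if (ticket !== this.latest) {
      return { kind: "superseded" }; // a newer refresh owns the state update
    }
    if (!response.ok || response.body === undefined) {
      return { kind: "failed", reason: `HTTP ${response.status}` };
    }
    return { kind: "applied", data: response.body };
  }
}

// Two overlapping refreshes: the first is superseded, the second fails.
const guard = new RefreshGuard();
const firstTicket = guard.begin();
const secondTicket = guard.begin();
const firstOutcome = guard.settle<string[]>(firstTicket, { ok: true, status: 200, body: [] });
const secondOutcome = guard.settle<string[]>(secondTicket, { ok: false, status: 500 });
console.log(firstOutcome.kind, secondOutcome.kind);
```

In this scenario neither attempt updates the store, which is the stale-UI case described above; with explicit outcomes the caller can at least log or display the "failed" result rather than leaving the pre-delete messages on screen unexplained.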
Latent reducer landmine (too_many_cooks_vscode_extension/src/state/store.ts:13-25): the RemoveAgent reducer filters agents, locks, and plans for the removed agent but not messages. It's currently never dispatched (dead branch: grep finds no dispatch({ type: 'RemoveAgent' })), so it isn't the active cause, but if anyone later wires optimistic single-agent removal to it, it will leave orphaned messages in the store. Should be fixed for consistency.
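A reducer that mirrors the DB cascade would drop messages sent by or addressed to the removed agent. The following is a minimal sketch under assumed types: AppState, its field names (agentName, fromAgent, toAgent), and the removeAgent function are hypothetical stand-ins, since the real shapes in store.ts are not shown in this issue.

```typescript
// Hypothetical state shapes; the real ones in store.ts may differ.
interface Agent { agentName: string }
interface Lock { agentName: string; filePath: string }
interface Plan { agentName: string; goal: string }
interface Message { id: string; fromAgent: string; toAgent: string }

interface AppState {
  readonly agents: readonly Agent[];
  readonly locks: readonly Lock[];
  readonly plans: readonly Plan[];
  readonly messages: readonly Message[];
}

// Removes the agent and everything the FK cascades would delete server-side,
// including messages in either direction. Returns a new state object.
function removeAgent(state: AppState, agentName: string): AppState {
  return {
    ...state,
    agents: state.agents.filter((agent) => agent.agentName !== agentName),
    locks: state.locks.filter((lock) => lock.agentName !== agentName),
    plans: state.plans.filter((plan) => plan.agentName !== agentName),
    messages: state.messages.filter(
      (message) => message.fromAgent !== agentName && message.toAgent !== agentName,
    ),
  };
}

// Illustrative state: A and B exchange messages, C also writes to B.
const before: AppState = {
  agents: [{ agentName: "A" }, { agentName: "B" }, { agentName: "C" }],
  locks: [{ agentName: "A", filePath: "src/a.ts" }],
  plans: [{ agentName: "B", goal: "review" }],
  messages: [
    { id: "m1", fromAgent: "A", toAgent: "B" },
    { id: "m2", fromAgent: "B", toAgent: "A" },
    { id: "m3", fromAgent: "C", toAgent: "B" },
  ],
};

const afterRemovingA = removeAgent(before, "A");
console.log(afterRemovingA.messages.map((message) => message.id));
```

Filtering on both sender and recipient matches the schema described under Hypothesis A, where both from_agent and to_agent cascade on delete.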
State-architecture audit (re: "is everything on screen using signals / centralized state?")
- State IS centralized in a single immutable store: src/state/store.ts (Store class, getState/dispatch/subscribe, immutable spread updates). Single source of truth. ✅
- It is NOT signal-based: it's a hand-rolled Redux-style EventEmitter. Tree views subscribe and re-render on every dispatch: e.g. MessagesTreeProvider fires onDidChangeTreeData on any store change (src/ui/tree/messagesTreeProvider.ts:27-30) and re-reads selectMessages(state) in getChildren. So reactivity wiring is present and centralized. ✅
- The gap is not scattered/global mutable UI state; it's (1) the silent refreshStatus failure path and (2) the incomplete RemoveAgent reducer. So if the symptom is UI-side, the root cause is a silent refresh no-op, not a missing-signal problem.
Repro plan (do NOT fix yet)
- Reproduce against an existing data.db that predates 20260525... (or one created via db push). Send messages between agents A→B and B→A, delete all agents, then query the DB directly (Hypothesis A queries above) and observe the VSIX message tree. Compare DB rows vs. UI.
- If DB still has message rows → A (cascade not applied on this DB / migrate-deploy-vs-db-push drift).
- If DB rows are gone but UI still shows them → B (refresh silently failed or didn't fire). Check the extension log for refreshStatus: response not ok / a swallowed return.
Acceptance criteria (for the eventual fix)
- Deleting all agents leaves zero message rows in the DB and zero messages in the UI, verified on a DB that predates the to_agent cascade migration (not just a fresh DB).
- A single source-of-truth for schema application (no migrate deploy vs db push divergence) OR an explicit guard that detects a pre-cascade messages table and rebuilds it.
- refreshStatus surfaces failures instead of swallowing them (no silent stale UI).
- RemoveAgent reducer also filters messages (consistency, even though currently unused).
- Regression tests covering both a fresh DB and a simulated pre-cascade DB.
Do not fix in this issue — confirm the root cause first. Marked critical: deleting agents leaving live message rows is a data-integrity / privacy concern.