fix: clean up ClickHouse analytics rows on user/team/website/link/pixel deletion #4246
anvme wants to merge 6 commits into
Conversation
deleteUser and deleteTeam left link/pixel/board rows (and their share rows) in the database after the owner was removed. /q/<slug> and /p/<slug> also kept serving deleted entries because the routes did not filter deletedAt and Redis cached lookups for 24h.

- deleteUser: clean up link/pixel/board + shares for the deleted user. Cloud mode: soft-delete link/pixel, hard-delete board, only userId-owned. Non-cloud: hard-delete everything matching userId or owned teamIds.
- deleteTeam: same cleanup, scoped to teamId.
- /q and /p route handlers: filter deletedAt: null at the call sites (not in findLink/findPixel helpers, which would null-deref the permission checks at src/permissions/link.ts and pixel.ts).
- Post-transaction Redis invalidation mirrors deleteWebsite.
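For illustration, the call-site filter described in the third bullet might look roughly like this. It is a hedged sketch, not the PR's actual route code: the handler and function names here are hypothetical, and only the deletedAt: null filter shape comes from the commit message.

```typescript
import { PrismaClient } from '@prisma/client';

const prisma = new PrismaClient();

// Hypothetical lookup used by a /q/<slug>-style handler: the deletedAt filter
// lives at the call site, so a shared findLink helper can keep returning
// soft-deleted rows to the permission checks in src/permissions/link.ts.
async function findLiveLinkBySlug(slug: string) {
  return prisma.link.findFirst({
    where: { slug, deletedAt: null },
  });
}
```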
…eted slugs

Address Greptile review feedback on umami-software#4243.

- Cloud-mode link.updateMany / pixel.updateMany now filter where: { ..., deletedAt: null }, so a previously soft-deleted row keeps its original deletion timestamp instead of being restamped with the current time.
- Pre-transaction findMany now selects deletedAt; the Redis invalidation list filters to only live slugs, avoiding harmless but wasted DEL calls for already-soft-deleted entries.

Note: the share.deleteMany cleanup still uses the broad entityId list (not filtered by deletedAt) so that orphan share rows of already-soft-deleted links/pixels are still cleaned up. Filtering the prefetch itself, as Greptile's exact suggestion proposed, would skip those shares while link.deleteMany still hard-deletes the rows, leaving orphan share rows behind.

Verified empirically with a 3-scenario reproduction.
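A minimal sketch of the restamping guard described above, assuming the model and field names used in the commit message; the function wrapper is illustrative, not the actual deleteUser code.

```typescript
import { PrismaClient } from '@prisma/client';

const prisma = new PrismaClient();

// Cloud-mode soft delete: only rows that are still live get stamped, so a
// previously soft-deleted link keeps its original deletedAt timestamp.
async function softDeleteUserLinks(userId: string) {
  return prisma.link.updateMany({
    where: { userId, deletedAt: null },
    data: { deletedAt: new Date() },
  });
}
```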
…st queries

Stacks on top of umami-software#4243. Five adjacent bugs from the same family:

- deleteTeam left team-owned websites (and all dependent rows) orphaned. Added inline cleanup mirroring deleteWebsite.
- Non-cloud deleteUser, when hard-deleting the user's owned teams, also left team-owned websites orphaned. Extended the existing ownedFilter pattern (cloud-gated OR) to cover websites.
- getTeamLinks/getUserPixels/getTeamPixels did not filter deletedAt: null, leaking soft-deleted entries into list views.
- Cloud deleteUser restamped already-soft-deleted websites' deleted_at; added a deletedAt: null guard (same shape as the link/pixel restamping fix).
- Surfaced pre-existing gaps in non-cloud deleteUser: missing sessionReplaySaved/sessionReplay/revenue/segment cleanups, and entityIds excluded website ids, so website shares were orphaned.
Address Greptile review feedback on umami-software#4245.

- deleteLink and deletePixel now redis.client.del('link:slug' / 'pixel:slug') using the slug returned by Prisma's delete(). Previously the row was hard-deleted but the Redis cache (24h TTL) kept serving the slug, so /q/<slug> and /p/<slug> kept firing for up to a day after deletion.
- updateLink and updatePixel now invalidate the cache for the current slug, and additionally for the previous slug if the slug was changed. Previously, changing a link's destination URL or slug left the public cache stale.
- Cloud-mode link.updateMany and pixel.updateMany in deleteUser now spread ownedFilter (which is { userId } in cloud mode) instead of hardcoding { userId }, so the cleanup intent stays consistent if ownedFilter ever evolves.

Verified empirically against a Docker Postgres + Redis setup: a deleted link's /q/<slug> returns 404 immediately (was: still redirected to the old URL for 24h); a slug rename invalidates both old and new cache keys.
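Roughly, the update-path invalidation described above could be wired like this. This is a sketch under assumptions: the redis import path and the updateLinkAndInvalidate wrapper are illustrative; the commit message only confirms redis.client.del and the link:<slug> / pixel:<slug> key shape.

```typescript
import { PrismaClient } from '@prisma/client';
import redis from '@umami/redis-client'; // assumed import path

const prisma = new PrismaClient();

async function updateLinkAndInvalidate(linkId: string, data: { slug?: string; url?: string }) {
  const previous = await prisma.link.findUniqueOrThrow({ where: { id: linkId } });
  const updated = await prisma.link.update({ where: { id: linkId }, data });

  // Always drop the cache key for the current slug; if the slug changed, drop
  // the old key too so /q/<old-slug> stops resolving from the 24h cache.
  const keys = [`link:${updated.slug}`];
  if (previous.slug !== updated.slug) {
    keys.push(`link:${previous.slug}`);
  }
  await Promise.all(keys.map(key => redis.client.del(key)));
}
```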
…el deletion

No delete path touched ClickHouse before this, leaving analytics events orphaned in CH after entities were removed in PostgreSQL. With CH enabled (typical high-volume setup), every delete left potentially millions of rows behind.

- src/lib/clickhouse.ts: new deleteByWebsiteIds helper that issues ALTER TABLE DELETE WHERE website_id IN (...) against all 7 CH tables. Materialized View target tables (website_revenue, website_event_stats_hourly, event_data_pivot) need explicit DELETE; ClickHouse MVs don't cascade.
- All 5 delete paths call the helper after the PG transaction, non-cloud only. Cloud preserves CH for billing/audit, matching the existing soft-delete retention philosophy.
- For user/team, cleanup IDs are websites + links + pixels. Boards have no events. /api/send writes link.id and pixel.id into the website_id column for /q/<slug> and /p/<slug> events.
- Failure handling matches the existing Redis invalidation pattern.
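A minimal sketch of what such a helper can look like with @clickhouse/client. The table list comes from the commit message; the client wiring, the parameter binding, and the Array(UUID) type are assumptions, not the PR's exact code.

```typescript
import { createClient } from '@clickhouse/client';

const client = createClient({ url: process.env.CLICKHOUSE_URL });

// All 7 tables named above, including the 3 MV target tables that need
// explicit DELETEs because ClickHouse materialized views do not cascade.
const TABLES = [
  'website_event',
  'event_data',
  'session_data',
  'session_replay',
  'website_revenue',
  'website_event_stats_hourly',
  'event_data_pivot',
];

export async function deleteByWebsiteIds(websiteIds: string[]) {
  if (websiteIds.length === 0) return;

  // ALTER TABLE ... DELETE is an asynchronous mutation: ClickHouse queues it
  // and applies it in the background (progress is visible in system.mutations).
  await Promise.all(
    TABLES.map(table =>
      client.command({
        query: `ALTER TABLE ${table} DELETE WHERE website_id IN ({websiteIds:Array(UUID)})`,
        query_params: { websiteIds },
      }),
    ),
  );
}
```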
@anvme is attempting to deploy a commit to the Umami Software Team on Vercel. A member of the Team first needs to authorize it.
Greptile Summary

This PR adds ClickHouse analytics cleanup to every deletion path (website, link, pixel, user, team) and closes a long-standing gap where hard-deleting any entity left orphan rows in all 7 CH tables forever. It also fixes soft-delete filters on list and collect routes and wires Redis slug-cache invalidation for link/pixel updates and deletes.
Confidence Score: 4/5

Safe to merge with one targeted fix in deleteWebsite. The deleteWebsite non-cloud else branch calls CH cleanup but omits redis.client.del for the website key. Because load.ts warms that Redis key with a 24h TTL on every analytics load regardless of cloud mode, a non-cloud + Redis deployment will serve the deleted website from cache for up to a day after deletion. The analogous handlers in user.ts and team.ts correctly call invalidateRedis() before the CH cleanup; this is an isolated gap in website.ts only. src/queries/prisma/website.ts: the non-cloud .then() branch needs redis.client.del alongside the ClickHouse call.
Sequence Diagram

```mermaid
sequenceDiagram
participant API
participant PG as Postgres (Prisma tx)
participant Redis
participant CH as ClickHouse
API->>PG: collect entity IDs (links, pixels, websites)
API->>PG: transaction — hard-delete / soft-delete rows
PG-->>API: commit
API->>Redis: del link/pixel/website cache keys
alt cloudMode
note over Redis: website key already cleared above
else not cloudMode and CH enabled
API->>CH: ALTER TABLE website_event DELETE WHERE website_id IN (...)
API->>CH: ALTER TABLE event_data DELETE ...
API->>CH: ALTER TABLE session_data DELETE ...
API->>CH: ALTER TABLE session_replay DELETE ...
API->>CH: ALTER TABLE website_revenue DELETE ...
API->>CH: ALTER TABLE website_event_stats_hourly DELETE ...
API->>CH: ALTER TABLE event_data_pivot DELETE ...
CH-->>API: mutations queued (async in CH)
end
API-->>API: return result
```
Reviews (1): Last reviewed commit: "fix: clean up ClickHouse analytics rows ..."
Pre-existing bug surfaced by this PR's restructure: deleteWebsite only
called redis.client.del('website:${id}') in cloud mode, but
fetchWebsite (src/lib/load.ts) caches the same key whenever Redis is
enabled regardless of cloud mode. So non-cloud + Redis deployments served
deleted websites from cache for up to 24h.
Split the .then() into two independent guards:
- Redis del runs whenever redis.enabled (matches user.ts/team.ts pattern)
- CH cleanup runs only when !cloudMode (preserves cloud retention policy)
Verified empirically: cache key EXISTS=1 before delete, EXISTS=0 after.
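For reference, the two guards can be sketched like this. It is simplified and schematic: redis, clickhouse, and cloudMode stand in for the project's own modules and flags, and the function wrapper is illustrative rather than the actual deleteWebsite code.

```typescript
import redis from '@umami/redis-client';         // assumed import path
import * as clickhouse from '@/lib/clickhouse';  // assumed alias for src/lib/clickhouse.ts

const cloudMode = !!process.env.CLOUD_MODE;      // stand-in for the project's cloud-mode flag

async function afterWebsiteDelete(websiteId: string) {
  if (redis.enabled) {
    // fetchWebsite warms this key whenever Redis is enabled, regardless of
    // cloud mode, so it must be cleared in both modes.
    await redis.client.del(`website:${websiteId}`);
  }

  if (!cloudMode) {
    // Cloud mode keeps ClickHouse rows for billing/audit; non-cloud wipes them.
    await clickhouse.deleteByWebsiteIds([websiteId]);
  }
}
```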
Hey @mikecao I hope you merge this
Third in the deletion-cleanup series after #4243 (link/pixel/board orphans on user/team delete) and #4245 (cascade websites + soft-delete leaks). Stacks on top of #4245; until those merge, the GitHub diff includes their commits too. For incremental review now, focus on commit 871aa3b.

Problem
No delete path in the codebase touched ClickHouse before this. With CLICKHOUSE_URL set (the typical high-volume setup), every delete left analytics events orphaned forever: website_event, event_data, session_data, session_replay, website_revenue, website_event_stats_hourly, event_data_pivot. /api/send writes link.id and pixel.id into the website_id column for /q/<slug> and /p/<slug> events (src/app/api/send/route.ts:98), so link/pixel events orphan under their entity id, not a website id.

Solution
- src/lib/clickhouse.ts: deleteByWebsiteIds(ids) helper. Issues ALTER TABLE ... DELETE WHERE website_id IN (...) against all 7 CH tables in parallel via client.command(). ClickHouse Materialized Views are insert-time triggers and do not cascade mutations, so the 3 MV target tables (website_revenue, website_event_stats_hourly, event_data_pivot) need explicit DELETEs.
- deleteWebsite / deleteUser / deleteTeam / deleteLink / deletePixel all call the helper after the PG transaction, non-cloud only. For deleteUser/deleteTeam, the cleanup ID set is websites + links + pixels (boards generate no analytics events).

Why ALTER TABLE DELETE instead of lightweight DELETE FROM?

Lightweight DELETE FROM is generally recommended by CH docs, BUT it throws by default on tables with projections (website_event has 2 projections, session_data has 1). ALTER TABLE DELETE works uniformly across all 7 tables without per-query setting tweaks. Trade-off: heavier mutation, async by default; acceptable for the rare delete-on-erasure use case.

Failure handling

await and propagate (matches the existing Redis invalidation pattern). If CH is briefly unreachable, the function rejects after PG has already committed, the same dual-write race as Redis. An outbox/retry pattern is flagged as a separate PR for closing that race across both Redis and CH consistently.
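As an illustration of the await-and-propagate point, the post-transaction wiring for a team delete could look roughly like this. It is schematic: the ID collection, field names, import paths, and function shape are assumptions, not the PR's code; only the websites + links + pixels ID set and the awaited CH call come from the description above.

```typescript
import { PrismaClient } from '@prisma/client';
import * as clickhouse from '@/lib/clickhouse'; // assumed alias for src/lib/clickhouse.ts

const prisma = new PrismaClient();

// Cleanup ID set for a team delete: websites + links + pixels, because
// /api/send stores link.id and pixel.id in the website_id column.
async function cleanupAnalyticsForTeam(teamId: string) {
  const [websites, links, pixels] = await Promise.all([
    prisma.website.findMany({ where: { teamId }, select: { id: true } }),
    prisma.link.findMany({ where: { teamId }, select: { id: true } }),
    prisma.pixel.findMany({ where: { teamId }, select: { id: true } }),
  ]);

  const ids = [...websites, ...links, ...pixels].map(row => row.id);

  // Awaited and allowed to propagate: if ClickHouse is unreachable, this
  // rejects after Postgres has already committed (the same dual-write race
  // as the existing Redis invalidation).
  await clickhouse.deleteByWebsiteIds(ids);
}
```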
Test plan

- pnpm build-app clean.
- /api/send, deleted website. CH state: website_event 3 -> 0, website_event_stats_hourly 3 -> 0, all 7 mutations is_done=1 in system.mutations. Both projection tables (website_event, session_data) processed cleanly.
- /q/<slug> to fire events, deleted link. website_event for link.id: 2 -> 0.

Out of scope (separate PRs / discussion)
resetWebsite: also wipes PG analytics without touching CH today; same fix pattern would apply.