Skip to content

hasql-interpolate + record-dot migration: derived ENUM/CITEXT decoders#409

Merged
tonyalaribe merged 17 commits into
masterfrom
telemetry-derive-row-instances
Jun 2, 2026
Merged

hasql-interpolate + record-dot migration: derived ENUM/CITEXT decoders#409
tonyalaribe merged 17 commits into
masterfrom
telemetry-derive-row-instances

Conversation

@tonyalaribe

@tonyalaribe tonyalaribe commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Summary

Lands the in-progress hasql-interpolate migration on top of master. Replaces ad-hoc per-query ::text casts with a single derived path: WrappedEnumSC now carries the backing Postgres type, so DecodeRow/EncodeRow resolve real CREATE TYPE … AS ENUM and citext/email columns through hasql's OID lookup automatically — without changing the deriving site twice or sprinkling casts into the SQL.

WrappedEnumSC shape change

  • Now: WrappedEnumSC (qualType :: Maybe Symbol) (prefix :: Symbol) a.
    • 'Nothing → unchanged behaviour, uses D.text/E.text (TEXT-backed enums).
    • ('Just \"schema.type_name\") → uses D.enum/E.enum, so the column's domain-specific OID is matched at session init.
  • HI.DecodeValue (CI Text) now uses D.citext (instead of D.text).
  • Helper refineText deduplicates the three near-identical WrappedEnum* DecodeValue instances.

Sites switched to qualified DB enums

Type Schema.name
Permissions projects.project_permissions
IssueType apis.issue_type
AnomalyTypes apis.anomaly_type
AnomalyActions apis.anomaly_action
QueryLibType projects.query_library_kind
KeyKind apis.schema_key_kind

Every other WrappedEnumSC deriving site got 'Nothing prefixed (mechanical).

Migrations

  • 0097_hasql_decoder_alignment.sql — drop the email domain (column becomes plain citext + format CHECK constraint), widen apis.log_patterns.baseline_samples and projects.replay_sessions.event_file_count to BIGINT.
  • 0098_widen_int_to_bigint.sql — batch int4 → int8 for every INT column whose Haskell field is Int (decoded via D.int8): apis.issues.{affected_requests,affected_clients,llm_enhancement_version,seq_num}, apis.issues.error_rateDOUBLE PRECISION, all apis.error_patterns.occurrences_* / quiet_minutes / resolution_threshold_minutes / baseline_samples / notify_every_minutes / regression_count, all monitors.query_monitors.{check_interval_mins,threshold_sustained_for_mins,renotify_interval_mins,stop_after_count,time_window_mins,notification_count}, apis.notification_rate_limit.count.

Code fixes surfaced by the strict decoders

  • Issue gains cooldownUntil / lastNotifiedAt (rows already existed since migrations 0075 / 0085); the IssueL list SQL selects them.
  • BackgroundJobs SafetyNet reprocess query now reuses Telemetry.otelSpanColsSql (was missing errors + message_size_bytes).
  • consumeNotificationToken returns count directly (was explicitly casting to int4).
  • make_interval call casts hours to int (only signature PG provides).
  • getAnomaliesVM format_examples placeholder is '{}'::TEXT[] (was jsonb[]).
  • LogQueries.bis/cnts and totalSessions widened to Int64.
  • Pkg.Parser alert select emits count(*)::float8 (was ::integer).
  • Projects.downgradeToFree / upgradeToPaid cast bigint params to text for the TEXT order_id comparison.
  • Models.Apis.Issues.isInCooldown SELECTs 1::bigint (so the Int64 decoder matches).

Test plan

  • make test-unit — 207 examples, 0 failures.
  • make test-integration — 530 examples, 0 failures, 25 pending.
  • Review migration ordering against any other open branches before merge (0097/0098 currently land on top of 0096).

tonyalaribe and others added 8 commits June 2, 2026 16:46
Reorder OtelLogsAndSpans fields to match otelSpanColsSql so both row
instances drop to one-liner anyclass derives. Add the previously-absent
errors and message_size_bytes columns to the SELECT path (and errors to
the INSERT path) so they round-trip instead of being silently defaulted
to Nothing / 0 on read. errors switches to Maybe (AesonText Value) for
consistency with body/events. Consolidates spanRecordByName onto the
shared otelSpanColsSql snippet so it can't drift again. Net: ~110 lines
of hand-written instance plumbing gone.
QueryCache.hs (hit_count) and Issues.hs (affected_requests,
affected_clients) decode into Haskell Int (= int8) but the underlying
columns are INTEGER (int4), so every dashboard fetch and every cache
lookup was throwing UnexpectedColumnTypeStatementError under hasql.
Cast on the SELECT rather than narrowing the record field — safer
because the same field types are used as Int across the codebase.
apis.issues.seq_num is INT; cast to bigint in the issue-list SELECT so
the generic DecodeRow doesn't blow up on int4-vs-int8.

projects.query_library.query_type is an enum; cast to text in the three
UNION branches of queryLibHistoryForUser so the generic decoder reads it
as Text instead of a custom OID.

Add migration 0096 to convert the four remaining VARCHAR(n) columns
(users.first_name/last_name, projects.teams.name/handle) to TEXT.
VARCHAR's only practical difference is the catalog OID, and that OID
mismatch was crashing every team-list and member-list lookup under the
new hasql decoders.
cabal.project imported a 3300-line in-repo stackage snapshot just to strip
hs-opentelemetry-* pins so the source-repository-package override at 1.0.0.0
could win. cabal.project.freeze already covers all pinned versions
including source-repo packages, so the vendored snapshot is redundant.

Also fixes CI: the Dockerfile COPY glob did not include stackage-pins.config,
so docker builds were failing with 'stackage-pins.config: does not exist'.
The sidenav active-class hyperscript matched only the link href prefix, so
navigating to /p/<pid>/endpoints (rendered by the API Catalog handler) lost
the highlight after htmx pushState. Add a data-match attribute carrying
extra path prefixes; the API Catalog item now declares /endpoints as a
secondary match.
These were accidentally committed in 3e6f8a2; they are local work-in-progress
notes, not source. Keep working copies on disk.
@claude

claude Bot commented Jun 2, 2026

Copy link
Copy Markdown

Code Review — Telemetry derive row instances

Overview

Solid cleanup PR. The headline change — deriving FromRow/HI.DecodeRow for OtelLogsAndSpans generically and deleting ~111 lines of hand-written boilerplate — is exactly right. The intbigint sweep, removal of intermediate let pidText = ... bindings in favour of #{pid.toText}, and the ProjectPatch co-location are all good. Repo-level noise (stackage-pins.config deletion, cabal.project.freeze refresh) is clean.


Issues

1. OtelLogsAndSpans field order is a silent footgun

Deriving generic DecodeRow ties the record's field declaration order to otelSpanColsSql's column order permanently. Reordering either side for any reason will silently decode wrong values with no compile error. The comment warns about it but there's no mechanical guard. Consider a compile-time assertion (type-level length check) or at minimum a doc-test that round-trips a known row, so a future mismatch is caught at CI rather than in production.

2. queryLibHistoryForUser UNION has inconsistent query_type casts

Branches 1 and 3 cast query_type::text; branch 2 does not. PostgreSQL resolves UNION column types from the first SELECT, so this works silently — but the inconsistency is confusing and will produce a type error if the branch order ever changes. Make all three uniform.

-- branch 2 should be:
SELECT id, project_id, created_at, updated_at, user_id, query_type::text, ...

3. WrappedEnum* decode instances — repeated pattern, could use a helper

All three DecodeValue instances (WrappedEnum, WrappedEnumSC, WrappedEnumShow) now have the same 5-line D.refine structure. A small helper would halve this and make future additions trivial:

refineWith :: (Text -> Maybe a) -> Text -> D.Value a
refineWith f ctx = D.refine (\t -> maybe (Left (ctx <> ": cannot parse " <> t)) Right (f t)) D.text

Then each instance becomes a one-liner again, matching the previous density.

4. hourlyRows type annotation vs SQL type mismatch in LogQueries.hs

hourlyRows :: [(Text, UTCTime, Int)] <- Hasql.interp [HI.sql|... event_count::BIGINT ...|]

The annotation says Int; the SQL now returns bigint. On 64-bit GHC Int is 64 bits so hasql's decoder accepts this, but the annotation is misleading. Change to Int64 or Int with a note, whichever the rest of the call-site expects.

5. fst3 import for a trivial projection

Data.Tuple.Extra is imported solely for fst3. This is lighter as a lambda: map (\(a,_,_) -> a) rMsgs. Avoids the import and is immediately readable.


Minor nits

  • getField u = UUID.toText u.getUserId / getField l = UUID.toText l.lemonSubId — these HasField instances now name the lambda argument but don't need to; eta-reduction (getField = UUID.toText . (.getUserId)) is shorter and consistent with point-free style elsewhere.
  • issueMutate signature change V.Vector IssueId → [IssueId] is good — removes a needless conversion — but the setAckState/setArchiveState change deserves a note that the SQL uses = ANY(#{iids}) and hasql-interpolate encodes [a] the same as V.Vector a.
  • The cabal.project fork comment for hasql-interpolate is well-written; include a link to the upstream PR so it's easy to drop once merged.

Summary

The generic derivation and SQL type fixes are correct and the net line count reduction is substantial. Address the UNION cast inconsistency (bug-risk), add a safety net for the field-order coupling (correctness risk), and clean up the three minor points above before merging.

@blacksmith-sh

This comment has been minimized.

Replace string-concat + rawSql glue with single multiline [HI.sql|...|]
quasiquotes using ^{sql} splices for fragments and #{val} for values.
Drop NeatInterpolation usage. Collapse countEndpointInbox default to
correct enp alias.
@claude

claude Bot commented Jun 2, 2026

Copy link
Copy Markdown

Code Review

Overview

This PR upgrades hasql-interpolate to a fork commit, derives FromRow/DecodeRow for OtelLogsAndSpans generically (eliminating ~120 lines of hand-rolled boilerplate), migrates ::int::bigint SQL casts throughout, removes V.fromList wrappers now that the library accepts [a] directly, introduces a ProjectPatch record to replace 7-arg patchProjectSettings, and refactors summary-generation helpers.

The net delta is a solid -3700 lines — a big win.


Potential Bugs

1. Int vs Int64 mismatch after ::bigint migration (several sites)

In hasql, int8/bigint decodes as Int64, not Int. Multiple call sites still annotate return types as Int while the SQL now casts to ::bigint:

-- BackgroundJobs.hs
jobStats :: [(Text, Int)] <- Hasql.interp [HI.sql|SELECT status, COUNT(*)::bigint ...|]
stuckJobs :: [Int]        <- Hasql.interp [HI.sql|SELECT COUNT(*)::bigint ...|]

-- Telemetry.hs  getUsageTotals
getUsageTotals :: ... -> Eff es (Int, Int64, Int, Int64)
(eC, eB) <- Hasql.interpOne [HI.sql| SELECT count(*)::bigint, ...|]

-- Telemetry.hs  getProjectStatsForReport / getProjectStatsBySpanType
... -> Eff es [(Text, Int, Int)]
[HI.sql| ... COUNT(*)::bigint AS total_error_events, COUNT(*)::bigint ...|]

In contrast, LogQueries.hs correctly updated the annotation:

hourlyRows :: [(Text, UTCTime, Int64)] <- Hasql.interp [HI.sql|SELECT ..., event_count::BIGINT ...|]

If hasql-interpolate uses OID-strict decoders (as the email::text comment suggests), these will fail at runtime with a type-mismatch decode error. Please audit all Int-annotated decode sites touched by this PR and either restore ::int casts where Int is needed, or update the annotations to Int64.


2. downgradeToFreeInt passed where Text column is likely expected

-- Before
let oid = show orderId' :: Text
EHasql.interpExecute [HI.sql|... WHERE order_id = #{oid}|]

-- After
EHasql.interpExecute [HI.sql|... WHERE order_id = #{orderId}|]

orderId is now an Int. If order_id is a text column (which the pre-existing show strongly implies), PostgreSQL won't find a text = int4 operator and the query will fail at runtime. Similarly for upgradeToPaid where subId and subItemId have explicit ::text casts added — but orderId in the OR branch does not:

OR (sub_id IS NULL AND order_id = #{orderId})

Please verify the schema type of order_id / sub_id / first_sub_item_id, or add explicit ::text casts uniformly.


3. insertProjectMembers — return value semantics changed

-- Before: returned count of inserted/updated member rows
pure n

-- After: returns count of @everyone teams updated
pure teamsUpdated

The function signature -> Eff es Int64 is unchanged. If callers treat the result as "how many members were inserted" (a reasonable assumption), they'll now get the number of teams updated instead, which could be 0 even when members were inserted (e.g. if @everyone already had those emails). Worth double-checking callers' expectations, or adding a note in the signature's Haddock.


Minor Issues

4. generateLogSummary fallback logic change

-- Before: if null elements (which was `normalLogElements`), use `rawDataLogElements`
V.fromList $ if null elements then rawDataLogElements else elements

-- After
V.fromList $ if isRawDataLog || null normalLogElements then rawDataLogElements else normalLogElements

The old code used the runtime-selected elements variable. The new code is more explicit, but the guard condition is slightly different. isRawDataLog may already imply null normalLogElements, but this should be double-checked to ensure no regression in log display.


5. spanRecordByName now returns errors and message_size_bytes

-- Before: hand-rolled SELECT without errors/message_size_bytes (set to Nothing/0)
-- After: uses otelSpanColsSql which includes both columns

This is correct and fixes a latent incompleteness. Just verify no caller was relying on errors = Nothing or message_size_bytes = 0 for this specific query path.


Improvements Worth Noting

  • Derived FromRow/DecodeRow for OtelLogsAndSpans with a doctest pinning length otelColumns == 25 is a great guard against sync drift.
  • foldMap for Maybe SQL fragments instead of maybe "" is more idiomatic.
  • ProjectPatch record replacing 7 positional Maybe args is a clear readability win.
  • patchProjectSettings callers with positional Maybes were easy to accidentally reorder; the record eliminates that class of bug.
  • submissionOutcome as a pure projection keeps UsageSubmission plain-data and the business logic separate.
  • Removing NeatInterpolation in favour of HI.sql throughout is more consistent.

@claude

claude Bot commented Jun 2, 2026

Copy link
Copy Markdown

Code Review: Telemetry derive row instances

Great direction overall — eliminating ~4400 lines of hand-written boilerplate (row decoders, let (a,b,c) = (r.a, r.b, r.c) bindings, V.fromList wrappers, string-concatenated SQL) in exchange for 673 additions is an impressive ratio. The SQL reformatting with the quasi-quoter's ^{} fragment splicing is much more readable. A few things to check before merge:


Potential bugs

[BackgroundJobs.hs] type annotation / SQL type mismatch

jobStats   :: [(Text, Int)]  <- Hasql.interp  [HI.sql|… COUNT(*)::bigint …|]
stuckJobs  :: [Int]          <- Hasql.interp  [HI.sql|… COUNT(*)::bigint …|]

COUNT(*)::bigint returns int8; hasql-interpolate's DecodeValue Int instance decodes from int4. On a 64-bit system Haskell's Int is 64-bit but the wire OID still mismatches. This will produce a runtime UnexpectedResult decode error. Fix: annotate both as Int64 (or cast to ::int in SQL, but you've intentionally moved everything to bigint). The same pattern appears in Settings.hs (recentCount :: Int / COUNT(*)::BIGINT) and a handful of other call sites.


generateSpanSummary: db.system appears twice in normalElements

normalElements =
  catMaybes
    [ 
    , tag "db.system" "neutral" <$> dbSys          -- plain text badge
    , 
    , dbBadge <$> dbSys                              -- vendor-coloured right badge
    , 
    ]

The old code produced only the vendor-coloured right badge (e.g. db.system;right-badge-postgres⇒postgres) and used kindElt to show kind;neutral⇒database. The new code emits three database-related elements for the same span. If both badges are intentional, fine; otherwise drop tag "db.system" "neutral" <$> dbSys.


resourceFallbackElements: flat key vs. nested traversal
Old code traversed resource → process (Object) → executable (Object) → name / pid. New code:

tag "process" "neutral" <$>
  (nonEmptyT (atMapText "process.executable.name" resM) <|>
   ("PID " <>) . show <$> atMapInt "process.pid" resM)

If atMapText does a flat-key lookup (looks for the literal key "process.executable.name" in the resource map) and the resource is stored as a nested JSON object ({"process": {"executable": {"name": "..."}}}}), this silently produces Nothing for all spans that previously showed a process name. Verify atMapText handles dotted paths, or restore the nested traversal.


Minor

downgradeToFree — type change needs DB column verification
Old: let oid = show orderId' :: Text then WHERE order_id = #{oid}.
New: passes orderId :: Int directly without explicit cast. If order_id is a text column, hasql-interpolate will refuse to encode an Int there (OID mismatch). Confirm the column type; if it's text, keep the show + ::text cast.

LogQueries.hsfromIntegral round-trip

hourlyRows :: [(Text, UTCTime, Int64)] <- Hasql.interp [HI.sql|… event_count::BIGINT …|]
let volumeMap = HM.fromListWith (++) [(h, [(t, fromIntegral c :: Int)]) | (h, t, c) <- hourlyRows]

Fetching bigint into Int64 then immediately converting back to Int with fromIntegral is awkward. If buildHourlyBuckets can accept Int64, drop the conversion entirely; otherwise keep the old ::int SQL cast and Int type.


Things done well

  • Deriving HI.DecodeRow / FromRow for OtelLogsAndSpans generically deletes ~90 lines of field-by-field boilerplate that was perpetually out of sync.
  • The doctest pin -- >>> length otelColumns -- 25 to catch column-count drift between the record, the SELECT list, and the INSERT list at CI time is excellent.
  • foldMap on Maybe instead of maybe mempty f for optional SQL fragments is idiomatic.
  • patchProjectSettings gaining a ProjectPatch struct improves all call sites.
  • Eliminating V.fromList / intermediate let (a,b,c) = (r.a, r.b, r.c) bindings is exactly the right use of OverloadedRecordDot + the updated hasql-interpolate list encoder.
  • insertProjectMembers CTE: grouping emails per project before the UPDATE is more correct and avoids N UPDATE calls.
  • isTransientUsageError delegating to IsError.isTransient is the right approach for library-version resilience.

tonyalaribe and others added 2 commits June 2, 2026 21:04
WrappedEnumSC now takes a leading `Maybe Symbol` pinning the backing PG
type. `'Nothing` keeps the old text-backed behaviour; `('Just "schema.name")`
swaps in `D.enum`/`E.enum` so hasql-interpolate's strict OID lookup resolves
real `CREATE TYPE … AS ENUM` columns through generic DecodeRow without
per-query `::text` casts. `HI.DecodeValue (CI Text)` switches to `D.citext`
to cover the citext extension type.

Updated qualified sites: Permissions, IssueType, AnomalyTypes,
AnomalyActions, QueryLibType, KeyKind. Everything else gets `'Nothing`.

Migrations:
- 0097 drops the `email` domain (now plain citext + format CHECK) and
  widens `apis.log_patterns.baseline_samples` /
  `projects.replay_sessions.event_file_count` to BIGINT.
- 0098 batches the rest of the int4 → int8 widening (issues affected_*,
  llm_enhancement_version, seq_num; error_patterns occurrences/etc.;
  query_monitors check_interval_mins/threshold/notification_count/…;
  notification_rate_limit.count) plus `apis.issues.error_rate` → float8.

Code fixes surfaced by the strict decoders:
- `Issue` gains `cooldownUntil` / `lastNotifiedAt` (rows existed since
  0075/0085) and the IssueL list SQL selects them.
- BackgroundJobs SafetyNet uses `Telemetry.otelSpanColsSql` (was missing
  `errors` + `message_size_bytes`).
- consumeNotificationToken returns `count` directly (no int4 cast).
- make_interval call casts hours to `int` (only signature PG provides).
- Anomalies VM format_examples now `TEXT[]` (matches Haskell Vector Text).
- LogQueries bis/cnts and totalSessions widened to Int64.
- Parser alert select emits `count(*)::float8` for Double consumers.
- Projects.downgradeToFree / upgradeToPaid cast bigint params to text for
  the TEXT `order_id` comparison.
- isInCooldown SELECTs `1::bigint` so the Int64 decoder matches.
@tonyalaribe tonyalaribe changed the title Telemetry derive row instances hasql-interpolate + record-dot migration: derived ENUM/CITEXT decoders Jun 2, 2026
@claude

claude Bot commented Jun 2, 2026

Copy link
Copy Markdown

Code Review

Overview

This PR migrates to hasql 1.10's new API (schema-qualified E.enum, isTransient, toPreparableStatement), adds a qualType parameter to WrappedEnumSC to distinguish real PG ENUM-backed columns from TEXT ones, and eliminates ~60 lines of manual HI.DecodeRow OtelLogsAndSpans by reordering the record fields to match otelSpanColsSql and deriving it via anyclass. Also adds three schema migrations to fix OID mismatches exposed by the strict generic decoder.


Bugs / Risks

1. OtelLogsAndSpans field–column order is only count-pinned, not order-pinned

The doctest -- >>> length otelColumns pins the field count at 25. But generic DecodeRow requires that the Haskell field order exactly matches the SQL column order in otelSpanColsSql. A future author who adds a field in the middle of the record (or reorders otelSpanColsSql) will get silently misdecoded values — no compile error, no runtime error, just wrong data. The doctest won't catch that.

Consider adding a second doctest or a unit test that round-trips a known OtelLogsAndSpans value through the decoder to validate order, not just count.

2. errors field type change — no matching DB migration

-- before
errors :: Maybe AE.Value

-- after
errors :: Maybe (AesonText AE.Value)

AesonText decodes by reading the raw bytes as text and then parsing JSON, while direct AE.Value decodes from JSONB natively. If the errors column is jsonb, the new AesonText decoder will receive a JSONB OID, not TEXT, and will either fail or need a different path. None of the three new migrations touch the errors column. This needs either a migration or confirmation that AesonText handles JSONB OIDs correctly.

3. IssueHI.EncodeRow derived with new fields not in insertIssue

cooldownUntil :: Maybe ZonedTime
lastNotifiedAt :: Maybe ZonedTime
-- ...
deriving anyclass (FromRow, HI.DecodeRow, HI.EncodeRow, NFData, ToRow)

insertIssue doesn't insert cooldown_until / last_notified_at, which is correct. But HI.EncodeRow is now derived generically and will include these fields. If any code path does a full-row encode (e.g. UPDATE … SET (col1, col2, …) = ROW(issue) style), it will encode the wrong number of columns. Is EncodeRow actually used anywhere for Issue? If not, consider removing it from the derive list to avoid the trap.


Minor Issues

4. Redundant ::apis.issue_type cast in insertIssue

#{i.issueType}::apis.issue_type

IssueType is now WrappedEnumSC ('Just "apis.issue_type") "" IssueType, whose EncodeValue already calls E.enum (Just "apis") "issue_type" id. The explicit SQL cast is now redundant. It's harmless, but confusing — a reader might think the OID hint isn't being set at the Haskell level.

5. splitQualType called at value level in every encoder/decoder

instance ... => HI.EncodeValue (WrappedEnumSC qualType prefix a) where
  encodeValue = case maybeSymbolVal (Proxy @qualType) of
    Just qn ->
      let (sch, typ) = splitQualType qn  -- called once per dictionary construction
       in contramap ... (E.enum sch typ id)

This runs splitQualType each time the instance dictionary is built (once per statement). GHC will almost certainly CSE/specialise this away, but making it explicit with {-# SPECIALIZE #-} or moving it to a where with a NOINLINE guard would make the intent clear. As-is, not a real problem — just something to be aware of.

6. WrappedEnum / WrappedEnumShow silently switched from E.enum to E.text

The old E.enum in hasql 1.x was a text-wire encoder (it sent the value as a text string that Postgres would cast). The new E.enum sch typ fn is an OID-aware encoder. Switching to contramap ... E.text is semantically correct only if these columns are TEXT-typed in Postgres. The migrations don't touch them, so presumably they've always been TEXT — but this deserves a note since it's a silent wire-format implication.


What's Good

  • Removing the 60-line manual HI.DecodeRow OtelLogsAndSpans is a huge win. The record reorder + anyclass approach is exactly the right move.
  • KnownMaybeSymbol typeclass is clean and idiomatic type-level plumbing.
  • refineText helper is a good abstraction — replaces ad-hoc D.enum-as-text-decoder with an explicit, well-named utility.
  • Inlining pid.toText directly into interpolation sites (dropping let pidTxt = …) is a nice cleanup.
  • hasql error message upgrade via toDetailedText and delegating isTransient is strictly better than the hand-rolled pattern match.
  • Migration docs are excellent — each file explains the OID-mismatch root cause clearly.
  • spanRecordByName using otelSpanColsSql instead of a bespoke column list closes a drift vector.

PG rewrites the heap+TOAST+indexes once per ALTER TABLE statement
regardless of how many ALTER COLUMN clauses it contains, so collapsing
23 statements into 4 (one per table) cuts the migration's I/O by ~6x.
Notable on apis.error_patterns (8.9 GB TOAST) which goes from 9
rewrites to 1; reduces lock-hold window during prod deploy.

Also drop the Int64->Int round-trip in fetchLogPatterns: hasql's
DecodeValue Int instance already reads int8, so decoding directly into
Int (and dropping the fromIntegral) matches buildHourlyBuckets's
signature without the cast.
@claude

claude Bot commented Jun 2, 2026

Copy link
Copy Markdown

Code Review

Overview

Excellent PR overall. Migrates string-interpolated SQL (NeatInterpolation, raw concatenation) to proper parameterised hasql-interpolate quasi-quotes throughout, aligns Haskell int types with Postgres bigint columns, and lands the WrappedEnumSC shape change for OID-matched enum decoding. The reduction in boilerplate (pidText, V.fromList, intermediate let bindings, split deriving lines) is substantial and good.

Bug: V.Vector Int32 vs BIGINT[] in fetchSessionSummary

In LogQueries.hs the type annotation is unchanged:

rows :: [(Int64, ..., V.Vector Int32, V.Vector Int32, V.Vector Int32)]

This PR changed the SQL to produce BIGINT[] for the three array columns (bi, c, e in the bkt CTE, changed from ::INT to ::BIGINT). The Haskell decoder for V.Vector Int32 uses D.arrayOf D.int4 and expects int4[]. The OID will not match int8[] and will throw UnexpectedColumnTypeStatementError at runtime on any non-empty result set. RawSessionRow was correctly updated (bis/cnts :: V.Vector Int64) but this separate query was missed.

Fix: change the three trailing V.Vector Int32 to V.Vector Int64 in that tuple.

Style: two separate deriving lines still on FieldTypes / FieldCategoryEnum

In Catalog.hs these two types still have two separate deriving lines for WrappedEnumSC 'Nothing "FT". Every other type in the PR was collapsed to one deriving line. Combine them.

Minor: runtime case on a type-level Maybe Symbol

The new encodeValue/decodeValue instances for WrappedEnumSC dispatch via a runtime case on maybeSymbolVal. Since qualType is a type-level Symbol the branch is determined at compile time, but GHC only eliminates the case if it specialises the instance. A fully-static alternative is two overlapping instances (one for Nothing, one for KnownSymbol s => Just s) which eliminates the KnownMaybeSymbol class entirely. Low-urgency given -O2 but worth the small change for conciseness.

Positive notes

  • foldMap replacing maybe "" is idiomatic and correct.
  • Dropping V.fromList wrappers before interpolated params is clean; hasql-interpolate handles lists natively.
  • Inlining pid.toText at the call site cuts noise across 10+ sites.
  • The selectIssues rewrite is a correctness win: parameterised timestamptz bindings replace formatUTC raw-string interpolation.
  • insertProjectMembers CTE simplification using array_agg then unnest is cleaner and more efficient than the old correlated subquery.
  • Replacing manual DecodeRow instances for UsageSubmission and OtelLogsAndSpans with anyclass derivation is a good reduction; the field-order dependency is documented in the PR description.
  • D.citext for CI Text is a genuine correctness fix; the old D.text would fail on citext/email columns with strict OID checking.

…lTupleReponse

getNormalTupleReponse returns only the first line as text and the
second line as Maybe — so 'lines response' always yielded one element
and every classifyIssueCriticality call short-circuited to 'Invalid
response format from LLM'. Parse the raw LLM response with a prefix
pattern and strip each line.
@claude

claude Bot commented Jun 2, 2026

Copy link
Copy Markdown

Code Review — PR #409: hasql-interpolate + WrappedEnumSC derived ENUM/CITEXT decoders

Overview

Substantial, well-motivated PR. The -4655 / +924 ratio is the headline: replacing hand-rolled string SQL, per-query ::text casts, and manual FromRow/DecodeRow instances with derived generics + proper OID-aware enum encoding. The OtelLogsAndSpans boilerplate deletion (~100 lines) and the generateSpanSummary / generateLogSummary rewrites are particularly clean.


Potential Bugs / Type Mismatches

1. fetchSessionSummary array element type still Int32

The SQL now emits BIGINT[] but the explicit type annotation still decodes as Vector Int32:

-- src/Models/Apis/LogQueries.hs
rows :: [(Int64, Int64, Int64, Int64, Int64, Int64, Int64, Int64, V.Vector Int32, V.Vector Int32, V.Vector Int32)] <-

Meanwhile the query now returns '{}'::BIGINT[] fallbacks and ARRAY_AGG(bi) where bi :: bigint. hasql-interpolate performs OID-based dispatch; Vector Int32 maps to int4[], not int8[]. This will throw at runtime on any non-empty result.

2. hourlyRows :: [(Text, UTCTime, Int)] vs ::BIGINT

-- src/Models/Apis/LogQueries.hs
hourlyRows :: [(Text, UTCTime, Int)] <- Hasql.interp [HI.sql|... event_count::BIGINT ...|]

Int in hasql-interpolate resolves to D.int4 (OID 23), not D.int8 (OID 20). Will fail with UnexpectedColumnTypeStatementError when the statement is prepared. Use Int64 or keep the column as int4.

3. stuckJobs :: [Int] vs ::bigint

Same issue in BackgroundJobs.hs:

stuckJobs :: [Int] <- Hasql.interp [HI.sql|SELECT COUNT(*)::bigint ...|]

Succinctness

4. Two deriving lines remain for FieldTypes and FieldCategoryEnum

-- src/Pkg/SchemaLearning/Catalog.hs
deriving (AE.FromJSON, AE.ToJSON, FromField, ToField) via WrappedEnumSC 'Nothing "FT" FieldTypes
deriving (HI.DecodeValue, HI.EncodeValue) via WrappedEnumSC 'Nothing "FT" FieldTypes

Every other site in this PR merged them into one line. Should be:

deriving (AE.FromJSON, AE.ToJSON, FromField, HI.DecodeValue, HI.EncodeValue, ToField) via WrappedEnumSC 'Nothing "FT" FieldTypes

Same for FieldCategoryEnum.

5. Unnecessary let bindings in archiveHosts / unarchiveHosts

let dir = directionClauseSql outgoingM
 in Hasql.interpExecute [HI.sql| ... AND archived_at IS NULL ^{dir} |]

^{} accepts any expression — inline it:

Hasql.interpExecute [HI.sql| ... AND archived_at IS NULL ^{directionClauseSql outgoingM} |]

6. uncurry for severityBadge

(\(style, txt) -> tag "severity_text" style txt) . severityBadge

is just:

uncurry (tag "severity_text") . severityBadge

Design Notes (non-blocking)

7. UsageSubmission field explosion

The old manual DecodeRow folded three DB columns into one SubmissionOutcome. The new approach derives HI.DecodeRow generically but exposes status :: Text, submittedAt, lastError as raw fields and adds a submissionOutcome helper. Acceptable trade, but callers must call submissionOutcome rather than pattern-matching a typed field directly.

8. WrappedEnum / WrappedEnumShow decode changed from D.enum to D.text

refineText wraps D.text, so any column backed by these newtypes that is a real PostgreSQL ENUM type (not TEXT) will fail with UnexpectedColumnTypeStatementError. The PR assumes all WrappedEnum/WrappedEnumShow sites are TEXT-backed. Worth a quick grep to confirm no WrappedEnum-backed column is a real PG enum.

9. toPreparableStatement replaces dynamicallyParameterized ... True

The True flag on the old call enabled client-side prepared statement caching. Verify toPreparableStatement preserves the same caching semantics for the chunked OTEL insert path (performance-sensitive).


Minor positives worth calling out

  • Eliminating pidText = pid.toText locals throughout is correct and idiomatic with record-dot.
  • V.fromList removal everywhere; hasql-interpolate's list instances avoid the intermediate vector allocation.
  • otelSpanColsSql now includes errors + message_size_bytes (fixing the silent Nothing/0 defaults in the old manual instances).
  • D.citext for CI Text is the right fix — the old D.text would have broken once the email domain is dropped.
  • BS.filter (/= 0) msg in processMessages is more efficient than the old decode-scrub-reencode round-trip.

Summary

Items 1-3 are type mismatches that cause runtime failures when OID-based dispatch fires; likely only surface under load when arrays/counts exceed int4 range. Item 4 is a consistency nit against the rest of the PR. Items 5-6 are small succinctness misses. Everything else is non-blocking.

@claude

claude Bot commented Jun 2, 2026

Copy link
Copy Markdown

Code Review

Great migration — the WrappedEnumSC unification and refineText deduplication are clean wins. A few issues found, ranging from likely runtime crashes to minor style points.


🔴 Critical: Int vs Int64 type mismatches (runtime UnexpectedColumnTypeStatementError)

The SQL was widened to ::bigint (OID 20) in many places, but the Haskell type annotations were not updated. hasql-interpolate's strict OID checker will fail at decode time:

BackgroundJobs.hs:

-- These all say Int but bind against ::bigint SQL:
issueCounts  :: [(Projects.ProjectId, Text, Int)]
alertCounts  :: [(Projects.ProjectId, Text, Int)]
growthData   :: [(Projects.ProjectId, Int, Int)]
jobStats     :: [(Text, Int)]
stuckJobs    :: [Int]

All five need Int64.

LogQueries.hs:

hourlyRows  :: [(Text, UTCTime, Int)]          -- event_count::BIGINT
precomputed :: [(..., Int, Int, ...)]          -- merged_count/total_patterns ::BIGINT
rawResults  :: [(Text, Int, Int, ...)]         -- cnt::BIGINT
rows        :: [..., V.Vector Int32, V.Vector Int32, V.Vector Int32]  -- ::BIGINT[]

Telemetry.hs:

getTotalEventsToReport :: ... -> Eff es Int   -- count(*)::bigint
getTotalMetricsCount   :: ... -> Eff es Int   -- count(*)::bigint
getUsageTotals returns (Int, Int64, Int, Int64) -- first/third should be Int64

Models/Apis/Issues.hs: IssueL.eventCount :: Int and activityBuckets :: V.Vector Int are both decoded from ::bigint/::bigint[] SQL after migration 0098.

Issue record: affectedRequests, affectedClients, llmEnhancementVersion, seqNum fields are still Int/Maybe Int but migration 0098 widens the DB columns to BIGINT. insertIssue binds these without explicit casts so they'll be encoded as int4 against int8 columns.


🔴 Bug: HI.EncodeRow on Issue now has 2 extra fields

Issue now derives HI.EncodeRow and gains cooldownUntil / lastNotifiedAt. The named-column insertIssue is fine, but any code path that triggers a full-row positional encode (e.g. a generic insertRow) will see 2 more columns than the schema expects.


🟡 Migration safety: DROP DOMAIN IF EXISTS email is global

0097_hasql_decoder_alignment.sql drops the email domain unconditionally. If any other table beyond users.users uses it, those columns lose the domain type silently. Worth querying pg_attribute first or at minimum documenting that the domain is confirmed single-use.


🟡 isTransient delegation may change retry behaviour

-- src/Data/Effectful/Hasql.hs
isTransientUsageError = IsError.isTransient

The old code explicitly classified ResultError (server-side decode / OID errors) as non-transient (programmer bugs, no retry). The new delegation to hasql's IsError.isTransient class may mark some server errors (e.g. serialization failures, 40001) as transient. Verify that OID mismatch and decode errors remain non-transient to avoid spurious retries on programming mistakes.


🟡 alertSelect cast changed from ::integer to ::float8 — unexplained semantic shift

-- Pkg/Parser.hs
alertSelect = [fmt| count(*)::float8|] :: Text

count(*) is an integer; casting to float8 for alert threshold comparisons is unusual and may introduce floating-point equality issues if thresholds are compared with =. Needs a comment explaining why float8 is required here.


🟢 Style / conciseness

Consolidate deriving via lines (2 types still split into two separate lines):

-- Before (FieldTypes, FieldCategoryEnum):
deriving (AE.FromJSON, AE.ToJSON, FromField, ToField) via WrappedEnumSC 'Nothing "FT" FieldTypes
deriving (HI.DecodeValue, HI.EncodeValue) via WrappedEnumSC 'Nothing "FT" FieldTypes

-- After:
deriving (AE.FromJSON, AE.ToJSON, FromField, HI.DecodeValue, HI.EncodeValue, ToField) via WrappedEnumSC 'Nothing "FT" FieldTypes

Every other site in the PR was consolidated; these two were missed.

insertProjectMembers — triple traversal of members:

-- Current: three separate map passes
unnest(#{pids}::uuid[], #{map (.userId) members}::uuid[], #{map (.permission) members}::...)
-- where pids = map (.projectId) members

-- Prefer a single unzip3:
let (pids, uids, perms) = unzip3 $ map (\m -> (m.projectId, m.userId, m.permission)) members

splitQualType silent fallback: The helper returns (Nothing, qn) when there is no dot, but the surrounding comment says "schema is required". If a bare type name reaches D.enum/E.enum, the behaviour (implicit public schema, or error) should be explicit — either panic or document the fallback.

refineText error context: Since symbolVal (Proxy @prefix) is already in scope at the call site, the context string passed to refineText could include the actual prefix (e.g. "WrappedEnumSC \"FT\"") instead of the hardcoded "WrappedEnumSC", making decode errors more actionable at zero extra cost.


Summary: The IntInt64 mismatches are the highest-priority fix — they will crash at the first query execution in affected code paths. Everything else is lower priority.

- archiveHosts/unarchiveHosts: drop let-bound dir, inline as ^{} splice
- normalLogElements: replace lambda-on-tuple with uncurry
Both types are written via named-column INSERTs (insertIssue, the
reports insert); the positional encoders were never exercised. Dropping
them removes a latent foot-gun now that Issue carries cooldown_until
and last_notified_at columns whose positions don't match the named
insertIssue column list.
@tonyalaribe tonyalaribe merged commit 030b103 into master Jun 2, 2026
8 of 9 checks passed
@tonyalaribe tonyalaribe deleted the telemetry-derive-row-instances branch June 2, 2026 20:45
@claude

claude Bot commented Jun 2, 2026

Copy link
Copy Markdown

Code Review

This is a well-structured migration PR. The core idea — encoding the backing PG type at the type level via WrappedEnumSC (qualType :: Maybe Symbol) — is clean and eliminates the ::text cast spray across every query site. The test counts back this up. Some targeted observations:


DeriveUtils.hs — EncodeValue duplication

In the new HI.EncodeValue (WrappedEnumSC qualType prefix a) instance, the render lambda appears twice:

encodeValue = case maybeSymbolVal (Proxy @qualType) of
  Nothing -> contramap (\(WrappedEnumSC a) -> toText (encodeEnumSC @prefix a)) E.text
  Just qn ->
    let (sch, typ) = splitQualType qn
     in contramap (\(WrappedEnumSC a) -> toText (encodeEnumSC @prefix a)) (E.enum sch typ id)

Extract it to avoid the duplication:

encodeValue =
  let render (WrappedEnumSC a) = toText (encodeEnumSC @prefix a)
  in case maybeSymbolVal (Proxy @qualType) of
    Nothing -> contramap render E.text
    Just qn -> let (sch, typ) = splitQualType qn
                in contramap render (E.enum sch typ id)

Same duplication exists in the DecodeValue instance — splitQualType qn is called identically in both branches. Worth extracting at a higher scope since encodeValue/decodeValue share the same dispatch logic.


WrappedEnum / WrappedEnumShow — encoder semantic fix is correct

The change from E.enum (...) to contramap (...) E.text for WrappedEnum and WrappedEnumShow is a real correctness fix — those types back TEXT columns, not PG native enums, so the old OID-matching E.enum encoder would silently fail at statement prepare time if hasql ever validated the OID. Good catch.

Similarly, the 'Nothing branch switching from D.enum to refineText (...) D.text is correct. D.enum OID-matches against the PG enum catalog; using it for TEXT-backed columns would throw UnexpectedColumnTypeStatementError as soon as the prepared statement is inspected.


refineText helper

refineText :: Text -> (Text -> Maybe a) -> D.Value a
refineText ctx f = D.refine (\t -> maybeToRight (ctx <> ": cannot parse " <> t) (f t)) D.text

Good deduplication. One minor note: the error message ctx <> ": cannot parse " <> t prints the raw DB value in the error, which is useful for debugging. No issue.


splitQualType — could use T.breakOn

splitQualType qn = case T.break (== '.') qn of
  (sch, dotTyp) | Just typ <- T.stripPrefix "." dotTyp -> (Just sch, typ)
  _ -> (Nothing, qn)

T.breakOn "." qn is the conventional spelling here and avoids the lambda (== '.'). Minor.


IssueEnhancement.hs — relaxed pattern match

The pattern changed from:

case lines response of
  [criticalStr, breakingStr, incrementalStr] -> ...

to:

case T.strip <$> lines (T.strip res) of
  (criticalStr : breakingStr : incrementalStr : _) -> ...

The _ tail means a 4+ line response now succeeds silently instead of falling through to the error branch. Probably fine for handling trailing newlines from the LLM, but intentionality is worth confirming.


CI Text decoder — fix is correct

-  decodeValue = CI.mk <$> D.text
+  decodeValue = CI.mk <$> D.citext

Necessary fix. citext/email domain columns report a different OID than text; D.text rejects them. The accompanying migration (dropping the email domain → plain citext + CHECK constraint) is the right approach.


Int widening

The IntInt64 / Int32Int64 changes across LogQueries.bis, cnts, totalSessions, and all the monitor/issue/error-pattern fields are mechanical and consistent with the 0098 migration. No issues.


Nit: Typeable qualType constraint on FromField

instance (KnownSymbol prefix, Read a, Typeable a, Typeable qualType) => FromField (WrappedEnumSC qualType prefix a)

The Typeable qualType constraint was added, but FromField uses Typeable only to derive a type name for error messages. Since qualType :: Maybe Symbol, GHC can derive Typeable for it for free — the constraint is technically redundant (GHC 8.2+ gives Typeable to all types automatically). No impact on correctness, just a minor noise increase.


Overall

Solid PR. The type-level Maybe Symbol dispatch is the right abstraction — it keeps the deriving sites simple and concentrates the OID-routing logic in one place. The two things worth addressing before merge:

  1. Extract the repeated render lambda in EncodeValue / shared splitQualType call in DecodeValue.
  2. Confirm the (: _) relaxation in IssueEnhancement is intentional.

Migration sequencing (0097/0098 on top of 0096) should be verified against open branches as noted in the PR description.

@claude

claude Bot commented Jun 2, 2026

Copy link
Copy Markdown

Code Review

Overall: A well-motivated and comprehensive migration. The core design — surfacing the PG backing type through WrappedEnumSC's kind parameter — is the right approach, and the INT→BIGINT sweep is thorough. A few focused items below.


splitQualType — leverage text more directly

src/Pkg/DeriveUtils.hs

-- current
splitQualType qn = case T.break (== '.') qn of
  (sch, dotTyp) | Just typ <- T.stripPrefix "." dotTyp -> (Just sch, typ)
  _ -> (Nothing, qn)

The pattern guard + stripPrefix is doing the work of splitting on the first dot. This can be expressed without the two-step approach:

splitQualType qn =
  let (sch, rest) = T.span (/= '.') qn
   in case T.uncons rest of
        Just ('.', typ) -> (Just sch, typ)
        _ -> (Nothing, qn)

Or even more directly using T.breakOn:

splitQualType qn = case T.breakOn "." qn of
  (sch, typ) | Just typ' <- T.stripPrefix "." typ -> (Just sch, typ')
  _ -> (Nothing, qn)

Both are equivalent — just surfacing that T.breakOn/T.span are slightly more idiomatic text APIs for this pattern than T.break.


WrappedEnum.encodeValue — good semantic fix worth noting

The old code used E.enum for WrappedEnum (no SC suffix), which is for OID-typed PG CREATE TYPE … AS ENUM columns. Since WrappedEnum is always backed by plain text, the change to contramap ... E.text is a genuine correctness fix, not just a style change:

-- was (wrong — E.enum expects a real PG enum OID):
encodeValue = E.enum (\(WrappedEnum a) -> ...)

-- now (correct):
encodeValue = (\(WrappedEnum a) -> ...) `contramap` E.text

Same pattern applies to WrappedEnumShow. Good catch — this would have caused UnexpectedColumnTypeStatementError on any column that wasn't a CREATE TYPE … AS ENUM.


isTransientUsageError — lost exhaustive-pattern safety net

src/Data/Effectful/Hasql.hs

The deleted code had an explicit comment:

"Exhaustive pattern (no wildcard) so a future hasql-pool upgrade triggers a compile error rather than silently misclassifying new error variants as non-retriable."

Delegating to isTransient is simpler, but it trades that compile-time guarantee for runtime behaviour defined inside hasql. If hasql adds a new error constructor that isTransient returns False for (where the old code would have returned True), callers will start seeing spurious non-retried errors with no compile-time signal. Low probability but worth keeping in mind if the hasql pin ever moves.


insertIssue — redundant ::apis.issue_type cast

src/Models/Apis/Issues.hs

Now that IssueType derives via WrappedEnumSC ('Just "apis.issue_type"), the EncodeValue instance sends via E.enum with the correct OID. The explicit cast in the VALUES clause:

#{i.issueType}::apis.issue_type

is now redundant (PG is being asked to cast an apis.issue_type value to apis.issue_type). Harmless, but the same SQL in isInCooldown also has #{ty}::apis.issue_type — if the goal of the new qualType parameter was to remove these per-site casts, a follow-up cleanup would make the intent consistent.


Issue generic DecodeRow field-order fragility

The comment on the new fields is honest:

"Order matches attnum (cooldown_until from 0075, last_notified_at from 0085) so generic DecodeRow against SELECT * resolves them in the right slots."

This will silently break if any future migration does a DROP COLUMN / ADD COLUMN cycle (which resets attnum) or if pg_dump/restore produces a different column ordering. Explicitly selecting columns in the IssueL / selectIssues queries (as is already done in many other SELECT sites in this file) would be safer and removes the dependency on physical column order.


Minor positives

  • refineText is a clean deduplication — using D.refine directly rather than re-implementing the error path in three places.
  • D.citext for CI Text is the right fix; D.text would fail on citext/email domain columns with an OID mismatch at session init.
  • Removing the tuple-unpacking workarounds in insertIssue / queryMonitorUpsert is a nice cleanup enabled by the TH quasi-quoter now handling record-dot syntax cleanly.
  • V.fromList idsids in all the monitorXByIds functions removes a pointless intermediate conversion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant