Skip to content

fix(insights): cast toFloatOrDefault default value to float#58714

Draft
sampennington wants to merge 1 commit into
masterfrom
sam/fix-tofloatordefault-cast-type
Draft

fix(insights): cast toFloatOrDefault default value to float#58714
sampennington wants to merge 1 commit into
masterfrom
sam/fix-tofloatordefault-cast-type

Conversation

@sampennington
Copy link
Copy Markdown
Contributor

Problem

Insight queries (e.g. Trends with a math_hogql series) that call toFloatOrDefault with an integer default literal fail deterministically with Default value type should be same as cast type. Expected Float64. Actual UInt8 (code 36).

ClickHouse's toFloat64OrDefault requires the default value to already have the cast target type (Float64). HogQL passed the default through unchanged, so an integer literal like 0 reached ClickHouse as a UInt8.

Part of an effort to reduce deterministic product-analytics query-builder failures — tracking dashboard: https://metabase.prod-us.posthog.dev/dashboard/207-product-analytics-insight-query-failures

Changes

Map toFloatOrDefault to a positional template that wraps the default in accurateCast(..., 'Float64'), so any numeric or string default literal works. The function now requires both arguments — a single-argument toFloatOrDefault was degenerate (equivalent to toFloatOrZero). Updated the Postgres-translation printer test parameter to pass a default accordingly.

How did you test this code?

I'm an agent. Added an executing HogQL test (test_to_float_or_default_with_integer_default) that calls toFloatOrDefault with integer defaults and runs against ClickHouse. Verified it fails on origin/master with the exact code 36 error and passes with the fix. Ran the full test_mapping.py suite (including the existing test_function_mapping which exercises float defaults) and the printer test suite — all pass. No snapshot files reference the function.

Publish to changelog?

no

🤖 Agent context

Agent-authored (Claude Code, Opus 4.7). Requires human review.

Root cause traced from a production failing Trends query whose math_hogql series contained coalesce(toFloatOrDefault(properties.pause_duration_minutes, 0), 1). Considered keeping the 1-argument form, but it duplicates toFloatOrZero and a positional template cannot be variadic; requiring two arguments keeps the fix minimal and the template simple.

ClickHouse's toFloat64OrDefault requires the default value to have the
same type as the cast target (Float64). When a HogQL query passes an
integer literal as the default (e.g. toFloatOrDefault(prop, 0)), the
default is a UInt8 and ClickHouse fails with "Default value type should
be same as cast type. Expected Float64. Actual UInt8" (code 36).

Map toFloatOrDefault to a positional template that wraps the default in
accurateCast(..., 'Float64'), so any numeric or string default literal
works. The function now requires both arguments — a single-argument
form is equivalent to toFloatOrZero.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

🎭 Playwright didn't run on this PR — your changes touch code that could affect E2E behavior, but Playwright is opt-in via label now to keep CI cost down.

Add the run-playwright label if you want an E2E sweep before merging — CI will pick it up automatically.

Most PRs don't need this. Real regressions still get caught on master and fix-forward.

@sampennington
Copy link
Copy Markdown
Contributor Author

Automated code review

Tightly scoped fix for a real, deterministic ClickHouse failure (code 36 — Default value type should be same as cast type). The root cause is correct: toFloat64OrDefault requires the default to already be Float64, and HogQL passed integer literals through as UInt8. Wrapping the default in accurateCast(..., 'Float64') is the right call. The reproducing test (test_mapping.py:307) is genuinely useful and executes against ClickHouse.

Needs discussion — the 2-arg requirement is a breaking change

conversions.py:51 changes min_args from 1 to 2. Any existing saved insight, dashboard tile, or math_hogql series calling single-arg toFloatOrDefault(x) will now fail to parse with Function 'toFloatOrDefault' expects 2 arguments, found 1 (from core.py:36) — i.e. this trades one deterministic failure for a different one for that subset of queries.

The PR body argues single-arg is "degenerate" (equivalent to toFloatOrZero). That's true semantically, but:

  • A query that parses and runs today will stop parsing after this lands. That is a regression in availability for those users, not a no-op.
  • The dashboard linked in the PR tracks reducing query failures — worth checking it for existing single-arg toFloatOrDefault callers before merging, so we know the blast radius.

Two lower-risk alternatives if any single-arg usage exists:

  • Keep min_args=1 and special-case the missing default. A positional template can't be variadic, but you can keep the old plain-name mapping for the 1-arg case via a small custom handler, or default-fill the second arg to accurateCast(0, 'Float64') during resolution.
  • If single-arg usage is confirmed zero in production, ship as-is but call that out explicitly in the PR description.

This is the one item I'd block on until the usage question is answered.

Functional gap — Postgres backend silently ignores the default

postgres_functions.py:292 maps toFloatOrDefault via _make_cast_handler("DOUBLE PRECISION"), and _make_cast_handler (postgres_functions.py:26-28) only uses args[0]. So under the Postgres-translation path, toFloatOrDefault('bla', 0) becomes CAST('bla' AS DOUBLE PRECISION) — which errors or yields NULL instead of returning 0. The default arg is dropped entirely.

This is pre-existing (not introduced by this PR), and the printer test at test_printer.py:5675 even bakes the dropped-arg behavior in as expected output. Not a blocker for this PR, but worth a follow-up issue: the ClickHouse and Postgres backends now disagree on toFloatOrDefault semantics. At minimum a code comment near postgres_functions.py:292 noting the default is intentionally unsupported there would prevent a future "fix" that breaks the printer test.

Minor

  • test_printer.py:5675: the parameter only verifies the cast wrapper, not that the default is honored — fine given the Postgres limitation above, just noting the test name (toFloatOrDefault) slightly oversells what it checks.
  • The new test could be parameterized per repo convention (parameterized library) since it's two variations of the same logic, though with only two cases inline tuples are acceptable.
  • Good: the inline comment in conversions.py:49-51 explains why the cast is needed — exactly the kind of load-bearing comment that earns its place.

Positive

  • Correct, minimal diff; no snapshot churn.
  • Test reproduces the exact production error and is pinned to ClickHouse execution.
  • accurateCast (vs CAST/reinterpret) is the safe choice — consistent with toFloat's accurateCastOrNull mapping right above it.

Verdict: Needs discussion — confirm there are no existing single-arg toFloatOrDefault callers before requiring 2 args; the ClickHouse fix itself is sound.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant