Skip to content

fix(insights): coerce histogram breakdown property to a number#58701

Open
sampennington wants to merge 1 commit into
masterfrom
sam/fix-histogram-breakdown-string-property
Open

fix(insights): coerce histogram breakdown property to a number#58701
sampennington wants to merge 1 commit into
masterfrom
sam/fix-histogram-breakdown-string-property

Conversation

@sampennington
Copy link
Copy Markdown
Contributor

Problem

A trends insight with a histogram breakdown crashed the whole query with ClickHouse Illegal types String and String of arguments of function minus when the breakdown property wasn't numeric-typed. Surfaced from system.query_log as part of the effort to reduce deterministic query-builder bugs (dashboard).

The histogram bin math computes bin widths as max - min over the breakdown values. The histogram breakdown column was the raw property field, and a property without a Numeric PropertyDefinition isn't coerced by the property-type swapper — so the bin math ran minus() on strings.

Changes

  • Wrap the histogram breakdown column (_get_breakdown_col_expr) in toFloat().
  • Coerce the breakdown property the same way in the bin-bounds filter (_get_breakdown_expr) so the actors drill-in >= / < comparison stays numeric.

How did you test this code?

I'm an agent. Automated tests run locally:

  • New regression test test_trends_histogram_breakdown_on_string_typed_property (numeric values stored as strings → String-typed property), reproducing the crash and asserting correct buckets.
  • test_trends_query_runner.py histogram + breakdown tests — 89 passed.

Publish to changelog?

no

🤖 Agent context

Authored by Claude Code (Opus 4.7). Found via system.query_log analysis (exception_code = 43, minus on String).

The existing histogram tests passed only because the test helper auto-registers integer-valued properties as Numeric (so the property-type swapper coerced them) — the bug was masked. The new test stores numeric values as strings to exercise the uncoerced path. toFloat() of an already-numeric column is a harmless no-op, so numeric-typed breakdowns are unaffected.

Agent-authored; requires human review.

A trends histogram breakdown computes bin widths with max - min over the
breakdown values. When the breakdown property is not numeric-typed (so the
property-type swapper leaves it as a string), this ran minus() on strings
and ClickHouse failed the whole query with ILLEGAL_TYPE_OF_ARGUMENT.

- Wrap the histogram breakdown column in toFloat().
- Coerce the breakdown property the same way in the bin-bounds filter so
  the actors drill-in comparison stays numeric.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

🎭 Playwright didn't run on this PR — your changes touch code that could affect E2E behavior, but Playwright is opt-in via label now to keep CI cost down.

Add the run-playwright label if you want an E2E sweep before merging — CI will pick it up automatically.

Most PRs don't need this. Real regressions still get caught on master and fix-forward.

@sampennington sampennington marked this pull request as ready for review May 17, 2026 13:09
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 17, 2026

Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
posthog/hogql_queries/insights/trends/test/test_trends_query_runner.py:2053-2085
**Actors drill-in path not covered by regression test**

The PR adds a `toFloat()` coercion to `_get_actors_query_where_expr` (the actors drill-in filter), but the new test only exercises the main trends query. Without a matching test that calls `to_actors_query_options()` or resolves the actors query with a `str_amount`-bucketed breakdown, the actors filter change (`toFloat(left) >= gte AND toFloat(left) < lt`) is untested. Before this PR the actors path would have crashed in the same way as the main query; a test covering that path would confirm the fix holds end-to-end (see `test_to_actors_query_options_breakdowns_histogram` for the numeric-typed counterpart at line ~3250).

### Issue 2 of 2
posthog/hogql_queries/insights/trends/test/test_trends_query_runner.py:2053-2085
**Single-case test where parameterisation would add value**

The PR description explicitly states that `toFloat()` on an already-numeric column is "a harmless no-op." That claim is not exercised by any test on this path — the existing histogram tests all use integer-valued properties (registered as `Numeric`) and live in separate, unrelated test methods. Parameterising this test to also run the already-numeric case (e.g., `breakdown_prop=10` as an integer) would assert the no-op claim directly and align with the team's preference for parameterised tests.

Reviews (1): Last reviewed commit: "fix(insights): coerce histogram breakdow..." | Re-trigger Greptile

Comment on lines +2053 to +2085
def test_trends_histogram_breakdown_on_string_typed_property(self):
# Numeric-looking values stored as strings register the property as String,
# so the property-type swapper does not coerce it. The histogram bin math
# (max - min) must still work rather than raising ILLEGAL_TYPE_OF_ARGUMENT.
self._create_events(
[
SeriesTestData(
distinct_id="p1",
events=[Series(event="$pageview", timestamps=["2020-01-11T12:00:00Z"])],
properties={"str_amount": "10"},
),
SeriesTestData(
distinct_id="p2",
events=[Series(event="$pageview", timestamps=["2020-01-12T12:00:00Z"])],
properties={"str_amount": "40"},
),
]
)

response = self._run_trends_query(
"2020-01-11",
"2020-01-13",
IntervalType.DAY,
[EventsNode(event="$pageview")],
None,
BreakdownFilter(
breakdown_type=BreakdownType.EVENT,
breakdown="str_amount",
breakdown_histogram_bin_count=2,
),
)

assert {r["breakdown_value"] for r in response.results} == {"[10,25]", "[25,40.01]"}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Actors drill-in path not covered by regression test

The PR adds a toFloat() coercion to _get_actors_query_where_expr (the actors drill-in filter), but the new test only exercises the main trends query. Without a matching test that calls to_actors_query_options() or resolves the actors query with a str_amount-bucketed breakdown, the actors filter change (toFloat(left) >= gte AND toFloat(left) < lt) is untested. Before this PR the actors path would have crashed in the same way as the main query; a test covering that path would confirm the fix holds end-to-end (see test_to_actors_query_options_breakdowns_histogram for the numeric-typed counterpart at line ~3250).

Prompt To Fix With AI
This is a comment left during a code review.
Path: posthog/hogql_queries/insights/trends/test/test_trends_query_runner.py
Line: 2053-2085

Comment:
**Actors drill-in path not covered by regression test**

The PR adds a `toFloat()` coercion to `_get_actors_query_where_expr` (the actors drill-in filter), but the new test only exercises the main trends query. Without a matching test that calls `to_actors_query_options()` or resolves the actors query with a `str_amount`-bucketed breakdown, the actors filter change (`toFloat(left) >= gte AND toFloat(left) < lt`) is untested. Before this PR the actors path would have crashed in the same way as the main query; a test covering that path would confirm the fix holds end-to-end (see `test_to_actors_query_options_breakdowns_histogram` for the numeric-typed counterpart at line ~3250).

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +2053 to +2085
def test_trends_histogram_breakdown_on_string_typed_property(self):
# Numeric-looking values stored as strings register the property as String,
# so the property-type swapper does not coerce it. The histogram bin math
# (max - min) must still work rather than raising ILLEGAL_TYPE_OF_ARGUMENT.
self._create_events(
[
SeriesTestData(
distinct_id="p1",
events=[Series(event="$pageview", timestamps=["2020-01-11T12:00:00Z"])],
properties={"str_amount": "10"},
),
SeriesTestData(
distinct_id="p2",
events=[Series(event="$pageview", timestamps=["2020-01-12T12:00:00Z"])],
properties={"str_amount": "40"},
),
]
)

response = self._run_trends_query(
"2020-01-11",
"2020-01-13",
IntervalType.DAY,
[EventsNode(event="$pageview")],
None,
BreakdownFilter(
breakdown_type=BreakdownType.EVENT,
breakdown="str_amount",
breakdown_histogram_bin_count=2,
),
)

assert {r["breakdown_value"] for r in response.results} == {"[10,25]", "[25,40.01]"}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Single-case test where parameterisation would add value

The PR description explicitly states that toFloat() on an already-numeric column is "a harmless no-op." That claim is not exercised by any test on this path — the existing histogram tests all use integer-valued properties (registered as Numeric) and live in separate, unrelated test methods. Parameterising this test to also run the already-numeric case (e.g., breakdown_prop=10 as an integer) would assert the no-op claim directly and align with the team's preference for parameterised tests.

Context Used: Do not attempt to comment on incorrect alphabetica... (source)

Prompt To Fix With AI
This is a comment left during a code review.
Path: posthog/hogql_queries/insights/trends/test/test_trends_query_runner.py
Line: 2053-2085

Comment:
**Single-case test where parameterisation would add value**

The PR description explicitly states that `toFloat()` on an already-numeric column is "a harmless no-op." That claim is not exercised by any test on this path — the existing histogram tests all use integer-valued properties (registered as `Numeric`) and live in separate, unrelated test methods. Parameterising this test to also run the already-numeric case (e.g., `breakdown_prop=10` as an integer) would assert the no-op claim directly and align with the team's preference for parameterised tests.

**Context Used:** Do not attempt to comment on incorrect alphabetica... ([source](https://app.greptile.com/review/custom-context?memory=instruction-0))

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@sampennington
Copy link
Copy Markdown
Contributor Author

Automated code review

Scope: posthog/hogql_queries/insights/trends/breakdown.py (+9/-1) and one new regression test. The fix and its rationale are sound — a histogram breakdown on a String-typed property ran minus() on strings and crashed with ILLEGAL_TYPE_OF_ARGUMENT. Coercing both the breakdown column and the actors-query bounds filter to a number with toFloat() is the right shape, and keeping the two sites symmetric (_get_breakdown_col_expr_get_actors_query_where_expr) is good.

Needs discussion — non-numeric strings still crash

breakdown.py:433 and :323 use toFloat(), which compiles to ClickHouse toFloat64(). That function throws (CANNOT_PARSE_TEXT / CANNOT_PARSE_NUMBER, exception code 6/72) on a value it can't parse — empty string, null-ish text, or genuinely non-numeric values like "abc". So this PR fixes the numeric-looking string case but converts a different population of inputs from one hard error into another.

Histogram breakdown is an opt-in numeric feature, so a fully non-numeric property is arguably user error — but a single stray non-numeric event value among thousands of numeric ones would now fail the entire query, which is a worse failure mode than today for that mixed-data case. Worth confirming the intended behavior:

  • If non-numeric rows should be dropped/bucketed-as-null, use toFloatOrNull() instead. The bin math (max - min, width_bucket-style) tolerates NULL, and toFloatOrNull of an already-numeric column is still a harmless no-op. The bounds filter at :323 would also behave correctly (NULL >= gte → false, row excluded).
  • If a hard error is genuinely desired for non-numeric data, that's defensible — but then it should be a validated, user-facing error (like the breakdown_histogram_bin_count math-type check already in this file) rather than a raw ClickHouse exception surfaced from system.query_log, which is exactly the class of bug this effort is trying to reduce.

My recommendation: switch both call sites to toFloatOrNull(). It strictly widens the set of inputs that succeed and removes the new crash surface, at no cost to the numeric path.

Functional gaps in test coverage

test_trends_histogram_breakdown_on_string_typed_property only covers cleanly-parseable numeric strings. Given the concern above, add cases for:

  • A property value that is non-numeric (e.g. "abc") or empty string — asserts the chosen behavior (graceful bucket vs. explicit error), and pins it so a future change can't silently regress it.
  • A mixed dataset (some numeric strings, some non-numeric) — the realistic system.query_log scenario.
  • A property value that is missing entirely on some events (NULL column) — confirms null handling through the histogram path.
    Parameterize these per the repo's test conventions.

Minor

  • breakdown.py:307 is_numeric_breakdown is derived only when histogram_bin_count is an int, while _get_breakdown_col_expr:427 keys off histogram_bin_count is not None. Same effective condition today, but the two sites that must stay in lock-step now express the predicate differently — a one-line shared helper or matching the check would make the symmetry the PR depends on explicit.
  • The new comments are clear and appropriately explain why; no narration. Good.

Positive

  • Correct root-cause diagnosis (uncoerced property-type swapper path) and a genuine regression test that fails without the fix — the PR body's note that existing tests masked the bug via auto-registered Numeric PropertyDefinitions is a useful catch.
  • Both the aggregation column and the actors drill-in filter are fixed together, so the breakdown and its person-level drill-in stay consistent.

Verdict: Needs changes — prefer toFloatOrNull() at both sites (or an explicit validated error) plus non-numeric/mixed/null test cases before merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant