fix(insights): compare numeric property filter values as strings #58682
sampennington wants to merge 2 commits into
A numeric property filter value compared against a string column (group keys, the event name, distinct_id, JSON/materialized properties) raised ClickHouse NO_COMMON_TYPE and failed the whole insight query. Equality comparisons (exact, is_not, and the multi-value in/not in) now compare both sides as strings via toString() when the value is numeric. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
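The rule described in this commit can be sketched in plain Python as follows. This is a minimal illustration, not the code in posthog/hogql/property.py: the real change operates on HogQL AST nodes, whereas this sketch renders SQL strings. The function names here (`is_numeric_scalar`, `render_literal`, `equality_compare`, `in_compare`) are hypothetical stand-ins, and excluding `bool` from "numeric" is an assumption of the sketch.

```python
def is_numeric_scalar(value: object) -> bool:
    """True for int/float values. Excluding bool is an assumption of this sketch."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def render_literal(value: object) -> str:
    """Render a Python value as a SQL literal (illustration only, not injection-safe)."""
    if is_numeric_scalar(value):
        return repr(value)
    return "'" + str(value).replace("'", "\\'") + "'"


def equality_compare(column: str, value: object, negated: bool = False) -> str:
    """exact / is_not: wrap both sides in toString() when the value is numeric."""
    op = "!=" if negated else "="
    if is_numeric_scalar(value):
        return f"toString({column}) {op} toString({render_literal(value)})"
    return f"{column} {op} {render_literal(value)}"


def in_compare(column: str, values: list, negated: bool = False) -> str:
    """in / not in: stringify every element as soon as any element is numeric."""
    op = "NOT IN" if negated else "IN"
    if any(is_numeric_scalar(v) for v in values):
        items = ", ".join(f"toString({render_literal(v)})" for v in values)
        return f"toString({column}) {op} ({items})"
    items = ", ".join(render_literal(v) for v in values)
    return f"{column} {op} ({items})"


print(equality_compare("distinct_id", 13))
print(equality_compare("event", "pageview", negated=True))
print(in_compare("event", [1, "a"]))
```

With a numeric value, both operands end up as strings, so ClickHouse never has to find a common type between a String column and a number. String-only filters are left exactly as they were.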
🎭 Playwright didn't run on this PR. Your changes touch code that could affect E2E behavior, but Playwright is now opt-in via label to keep CI cost down. Most PRs don't need this. Real regressions still get caught on master and fixed forward.
Automated review
Critical Issues
None. The fix is sound, minimal, and well-scoped: no data-loss, security, or breaking-API concerns.
Should Fix
Performance Notes
Suggestions
Nits
None.
Positive Observations
Verdict
Ship it, after adding the three test cases above (all test-only, no production code change).
Next steps:
Adds review-requested coverage for the numeric-value comparison fix: mixed numeric/string tuples, single-element numeric lists, and an end-to-end class that executes the generated SQL against ClickHouse (including is_not against a NULL property). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Review follow-up: Should Fix items addressed
All three test-only follow-ups from the automated review are done in
Also added the suggestions:
Full
Automated code review
Reviewed the diff and surrounding context in
Strengths
Observations / non-blocking
Tests
Coverage is good. Two optional additions worth considering:
Verdict
Ship it: solid fix with real end-to-end coverage; just confirm
Problem
A property filter with a numeric value compared against a string column crashed the entire insight query with ClickHouse NO_COMMON_TYPE. This was the most common deterministic insight-query failure surfaced from system.query_log. It hit group keys ($group_0–$group_4), the event name, distinct_id, and JSON/materialized string properties whenever the filter value arrived as a number (e.g. a group key that looks numeric).
This is part of an effort to reduce deterministic query-builder bugs in product analytics insights, tracked on the "Product analytics — insight query failures" dashboard. NO_COMMON_TYPE (exception_code 386) was the largest deterministic builder failure in that triage.
Changes
- _expr_to_compare_op in posthog/hogql/property.py: exact and is_not comparisons now compare both sides as strings (toString(expr) = toString(value)) when the filter value is numeric, instead of emitting a raw String vs Float64 comparison.
- The multi-value in / not in path gets the same treatment when any tuple value is numeric.
- Adds the _is_numeric_scalar / _equality_compare helpers.
- exact filters on virtual numeric properties / session duration now produce the toString form (equivalent for equality).
How did you test this code?
I'm an agent. Automated tests run locally:
- test_property_to_expr_numeric_value_against_string_column (AST shape, incl. mixed numeric/string tuples and single-element lists).
- End-to-end class TestNumericPropertyComparisonWithData that executes the generated SQL against ClickHouse (exact, is_not, all-numeric in, mixed in).
- posthog/hogql/test/test_property.py suite: 131 passed.
Snapshot (.ambr) files in the trends/funnels/retention suites may need updating since the change touches all numeric-valued exact/is_not/in filters; CI will surface the exact set.
Publish to changelog?
no
🤖 Agent context
Authored by Claude Code (Opus 4.7). Found via a system.query_log analysis of failing product-analytics insight queries: exception_code = 386 (NO_COMMON_TYPE) was the largest deterministic query-builder failure. The investigation also produced the dashboard linked above.
Decisions:
- The failure also hits the event column, distinct_id, and materialized columns: any string column compared with a numeric value. So the fix targets equality comparisons generally.
- Uses toString on both operands rather than stringifying the value in Python, so ClickHouse computes a consistent canonical string for both sides (str(13.0) would give "13.0", but toString(13.0) gives "13").
- Left lt/gt/lte/gte untouched: the query_log data showed only equals/in failing, and ordering comparisons have genuine numeric-vs-string semantic differences.
Agent-authored; requires human review.
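The str(13.0) point in the decisions above can be checked on the Python side. This is a small sketch that only demonstrates Python's behaviour; the stored key "13" is a made-up example value, and the ClickHouse result quoted in the comment is taken from the PR description rather than executed here.

```python
# A string column holding a numeric-looking key (hypothetical example value).
stored_key = "13"

# The filter value arrives as a float, e.g. after JSON decoding.
filter_value = 13.0

# Stringifying in Python keeps the trailing ".0", so a string comparison misses.
python_side = str(filter_value)
print(python_side)                # 13.0
print(python_side == stored_key)  # False

# An int filter value would have matched, which makes the failure input-dependent.
print(str(13) == stored_key)      # True

# Per the PR description, ClickHouse's toString(13.0) yields "13", so letting
# ClickHouse stringify both operands avoids this mismatch.
```

Stringifying in the query rather than in Python means both sides go through the same formatting routine, so the comparison does not depend on whether the value was decoded as an int or a float.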