fix(insights): reject invalid regex filters with a clear error #58706
sampennington wants to merge 1 commit into
🎭 Playwright didn't run on this PR. Your changes touch code that could affect E2E behavior, but Playwright is now opt-in via label to keep CI cost down. Most PRs don't need this. Real regressions still get caught on master and fixed forward.
A property filter with an invalid regex (e.g. a trailing backslash, or a PCRE lookahead that RE2 doesn't support) failed the whole insight query with ClickHouse CANNOT_COMPILE_REGEXP, surfacing as a 500. regex / not_regex property filters now validate the pattern with RE2 (the same engine ClickHouse uses) and, if it is invalid, raise a user-facing QueryError naming the bad pattern instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
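A minimal sketch of the validate-then-raise step described in the commit message, not the PR's actual code. The function name `validate_regex_pattern`, the `engine` parameter, and the local `QueryError` class are hypothetical stand-ins; the only library calls assumed are `compile` and `error`, which both `re` and the google-re2 `re2` module expose.

```python
import re

try:
    import re2  # google-re2, which the PR says is already a dependency
except ImportError:
    re2 = None  # lets the sketch run without google-re2 installed


class QueryError(Exception):
    """Hypothetical stand-in for the user-facing QueryError named in the PR."""


def validate_regex_pattern(pattern: str, engine=None) -> None:
    """Raise QueryError if `pattern` does not compile in the given engine.

    `engine` is any module exposing `compile` and `error` (re2 or re).
    """
    if engine is None:
        # Prefer RE2. Falling back to `re` is for illustration only, since
        # `re` accepts constructs (such as lookahead) that RE2 rejects.
        engine = re2 if re2 is not None else re
    try:
        engine.compile(pattern)
    except engine.error as exc:
        # Name the offending pattern so the user can see what to fix.
        raise QueryError(f"Invalid regex pattern {pattern!r}: {exc}") from exc
```

Calling this before the query is built means a bad pattern is reported as a user-input error rather than reaching ClickHouse.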
6106dd7 to 0ff9e49
Automated code review
Strengths
Suggestions (non-blocking)
Functional note
Verdict: Ship it. Minor message truncation and an oversized-pattern test would strengthen it, but nothing blocks merge.
Problem
A property filter with an invalid regex crashed the whole insight query with ClickHouse CANNOT_COMPILE_REGEXP, surfacing as a 500. This was found in system.query_log (exception_code = 427) as part of the effort to reduce deterministic query-builder bugs (dashboard).

Examples seen: a trailing backslash, and PCRE lookaheads that ClickHouse's RE2 engine doesn't support. These are user-input errors but were reaching ClickHouse and failing as 500s.
Changes
- regex/not_regex property filters now validate the pattern with RE2 (the same engine ClickHouse uses) before building the query.
- An invalid pattern raises a user-facing QueryError naming the bad pattern, instead of a ClickHouse 500.

How did you test this code?
I'm an agent. Automated tests run locally:
- test_property_to_expr_invalid_regex_raises_query_error, covering a trailing backslash, an unsupported lookahead, and an unbalanced paren, for both regex and not_regex.
- posthog/hogql/test/test_property.py suite: 129 passed.

Publish to changelog?
no
🤖 Agent context
Authored by Claude Code (Opus 4.7). Found via system.query_log analysis (exception_code = 427, CANNOT_COMPILE_REGEXP).

Validation uses the google-re2 package (already a dependency) so it matches ClickHouse's RE2 behavior exactly. Python's re would wrongly accept PCRE-only constructs like lookahead. This converts the failure from a 500 (incident noise) to a 4xx user-input error; the query still can't run with an invalid pattern, but the classification is now correct.

Agent-authored; requires human review.
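The point about Python's re can be checked with the standard library alone. This is an illustrative sketch with a hypothetical helper name (`compiles_in_python_re`) and example patterns chosen to match the three cases the tests describe. It shows that re rejects the trailing backslash and the unbalanced paren but accepts the lookahead, which is why re alone would not catch patterns that RE2 refuses.

```python
import re


def compiles_in_python_re(pattern: str) -> bool:
    """Return True if Python's stdlib `re` accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Pattern classes named in the PR; the literal strings are illustrative.
print(compiles_in_python_re("abc\\"))        # False: trailing backslash
print(compiles_in_python_re("(abc"))         # False: unbalanced paren
print(compiles_in_python_re(r"foo(?=bar)"))  # True: lookahead is valid in `re`
```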