feat: optimize HogQL generated OR expressions #39590
Open
+46
−6
Follow up to: #39568
Problem
Some queries we run that contain multiple JOINs fail to use indices on the events table. As a result, they try to read multiple terabytes of data, clogging the cluster.
Internal context:
https://posthog.slack.com/archives/C019RAX2XBN/p1760353830494289
Failing query example:
Solution
Skip redundant parts of boolean OR expressions.
Optimizations:
or(expr, 0) <=> expr
or(expr, 0, ...) <=> or(expr, ...)
or(expr, 1, ...) <=> 1
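The three rewrite rules can be sketched in a few lines of Python. This is a minimal illustration, not the code in this PR: the `Constant`, `Field` and `Or` classes and the `simplify_or` helper are hypothetical stand-ins for the HogQL AST, and returning `0` for an OR made up entirely of `0` constants is an assumption that the rules above do not spell out.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for AST nodes; not the real HogQL classes.
@dataclass(frozen=True)
class Constant:
    value: object


@dataclass(frozen=True)
class Field:
    name: str


@dataclass(frozen=True)
class Or:
    exprs: tuple


def is_const(expr, value: int) -> bool:
    """True if expr is a boolean/integer constant equal to value."""
    return (
        isinstance(expr, Constant)
        and isinstance(expr.value, (bool, int))
        and expr.value == value
    )


def simplify_or(exprs):
    """Apply the OR rules to a list of already-built sub-expressions."""
    kept = []
    for expr in exprs:
        if is_const(expr, 1):
            # or(expr, 1, ...) <=> 1: a constant true decides the whole OR.
            return Constant(1)
        if is_const(expr, 0):
            # or(expr, 0, ...) <=> or(expr, ...): a constant false is dropped.
            continue
        kept.append(expr)
    if not kept:
        # Assumption: an OR of nothing but 0 constants folds to 0.
        return Constant(0)
    if len(kept) == 1:
        # or(expr, 0) <=> expr: a single remaining operand needs no wrapper.
        return kept[0]
    return Or(tuple(kept))
```

Calling `simplify_or([Field("a"), Constant(0)])` returns the bare `Field("a")`, so the printer never emits an `or(...)` call around it.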
Example
Input HogQL:
SELECT event FROM events WHERE 1 OR 2
Before:
SELECT event FROM events WHERE and(equals(events.team_id, 2), or(1, 1)) limit 100
After:
SELECT event FROM events WHERE equals(events.team_id, 2) LIMIT 100
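The Before/After pair suggests two steps: `or(1, 1)` folds to `1`, and the enclosing `and(...)` then drops that constant. The sketch below replays this on nested tuples. It is only an illustration under assumptions: the tuple encoding and the `simplify` function are hypothetical, and the matching AND rules (`and(expr, 1) <=> expr`, `and(expr, 0, ...) <=> 0`) are inferred from the output shown here rather than taken from this PR.

```python
def is_literal(node, value: int) -> bool:
    """True if node is an integer literal equal to value."""
    return isinstance(node, int) and node == value


def simplify(node):
    """Recursively fold constant operands of and/or calls.

    Calls are encoded as tuples: (function_name, arg1, arg2, ...).
    Anything that is not a tuple is treated as a leaf and returned as-is.
    """
    if not isinstance(node, tuple):
        return node
    op, *args = node
    args = [simplify(arg) for arg in args]
    if op not in ("and", "or"):
        return (op, *args)
    # For OR, 1 decides the result and 0 is neutral; for AND it is reversed.
    absorbing = 1 if op == "or" else 0
    neutral = 0 if op == "or" else 1
    if any(is_literal(arg, absorbing) for arg in args):
        return absorbing
    kept = [arg for arg in args if not is_literal(arg, neutral)]
    if not kept:
        return neutral
    if len(kept) == 1:
        return kept[0]
    return (op, *kept)


# The WHERE clause from the "Before" query, in the tuple encoding.
before = ("and", ("equals", "events.team_id", 2), ("or", 1, 1))
after = simplify(before)
```

Here `after` is `("equals", "events.team_id", 2)`, which corresponds to the WHERE clause of the After query.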