fix: Keep Computes with Aggregate when Filter follows (#5130) #5639
Summary
Fixes #5130: Filtering on combined aggregates was generating invalid SQL with CTEs missing GROUP BY.
The issue occurred when aggregate expressions contained operations like (sum x) + (sum y). These expressions were being extracted as Compute transforms, but when the pipeline was split for a following Filter, the Computes ended up in a separate CTE from their Aggregate, losing the GROUP BY clause.

Fix: In is_split_required, don't split between Compute and Filter when there's an Aggregate later in the pipeline. This keeps aggregate-related Computes together with their GROUP BY.

Trade-off: Some intermediate CTEs may include unused columns (documented in test comment). These don't leak to final output.
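For reviewers who want the rule in isolation, here is a minimal Python sketch of the decision described above. It is not the compiler's actual code: the function name split_between, the string-tagged transform kinds, and the flat list standing in for the pipeline are all assumptions made for illustration. It also assumes that, without the fix, a Compute directly followed by a Filter always triggers a split.

```python
from typing import List


def split_between(pipeline: List[str], i: int) -> bool:
    """Decide whether to start a new CTE between pipeline[i] and pipeline[i + 1].

    Hypothetical model: each transform is just a kind name such as
    "Compute", "Filter", or "Aggregate". Only the Compute/Filter rule
    from this PR is modelled; every other split rule is omitted.
    """
    current, following = pipeline[i], pipeline[i + 1]
    if current == "Compute" and following == "Filter":
        # Assumed baseline: a Filter after a Compute forces a split.
        # The fix: suppress that split when an Aggregate appears later,
        # so the Compute stays in the same CTE as its GROUP BY.
        aggregate_later = "Aggregate" in pipeline[i + 1:]
        return not aggregate_later
    return False  # other split conditions are out of scope for this sketch


# With an Aggregate later in the pipeline, no split is requested.
print(split_between(["Compute", "Filter", "Aggregate"], 0))
# Without one, the assumed baseline split still applies.
print(split_between(["Compute", "Filter"], 0))
```

The sketch reads "later in the pipeline" literally, as a position after the Compute in the list. The real traversal order inside is_split_required may differ.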
Test plan
(sum x) + (sum y) and (sum x) * 2 with filter

🤖 Generated with Claude Code