Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support WITHIN GROUP syntax to standardize certain existing aggregate functions #13511
base: main
Are you sure you want to change the base?
Support WITHIN GROUP syntax to standardize certain existing aggregate functions #13511
Changes from 40 commits
a9b901a
0918000
070a96b
3fd92fd
4082a78
c3be3c6
9fd05a3
8518a59
79669d9
597f4d7
d3b483c
a827c9d
23bdf70
97d96ca
1b61b5b
d0fdde3
3c8bce3
be99a35
f9aa1fc
d5f0b62
d7f2f59
7ef2139
91565b3
d482bff
005a27c
1179bc4
e5fc1a4
cf4faad
fc7d2bc
5469e39
d96b667
293d33e
ecdb21b
d65420e
b6d426a
4b0c52f
36a732d
8d6db85
db0355a
37b783e
124d8c5
3259c95
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something we need? From what I know, there aren't any aggregate functions that have options for null handling. At the moment, the 2 overrides you have of this both return
Some(false)
, which is what I would consider the default value anyways.Speaking of which, if we do need this, do we need to return an
Optional<bool>
or could we just returnbool
directly?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some aggregate functions using null handling in current datafusion.
(cf. If this is something we need to discuss/fix, then I can make another git issue. Or, I can refactor it too in this PR. I left this comment because I am not 100% sure about the SQL standard.)
And I refactored the function to just return
bool
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was smelling odd, so I dug a bit deeper. I think you've inadvertantly stumbled into something even weirder than you anticipated
The example you've linked is
which I don't think is a valid query because
first_value
should not be an aggregate function, or at the very least the above query is not valid in most SQL dialects.first_value
is actually a window function in other engines (eg. Trino, Postgres, MySQL).If you try running something like
against Postgres you get an error like
dbfiddle
The
RESPECT NULLS | IGNORE NULLS
options is only a property of certain window functions, hence we shouldn't need to track it for aggregate functions.I'm going to file a ticket for the above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #15006
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used floating point subtraction instead of actual sorting in reverse order, for conciseness.
If any slight floating point difference is not permitted (even if this branch passed the tests), please let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems reasonable to me, but I don't have that much experience on the execution side of things.