-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-51580][SQL] Throw proper user facing error message when lambda function is out of place in HigherOrderFunction
#50345
Conversation
dff68ee
to
1da8eac
Compare
sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala
Show resolved
Hide resolved
sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala
Show resolved
Hide resolved
fe9a97b
to
355566d
Compare
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
Outdated
Show resolved
Hide resolved
355566d
to
ea4ed3d
Compare
@@ -345,6 +345,16 @@ trait CheckAnalysis extends LookupCatalog with QueryErrorsBase with PlanToString | |||
|
|||
case operator: LogicalPlan => | |||
operator transformExpressionsDown { | |||
case hof: HigherOrderFunction if hof.arguments.exists { | |||
case LambdaFunction(_, _, _) => true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need a bit more context. So ideally the analyzer should rewrite LambdaFunction
if it's in the right place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is in the following query pattern: select transform(x -> x + 1, array(1,2,3))
. By reversing the parameter order the user sees the error: call to dataType on unresolved object
, because the lambda function will not be bound properly. This is because HigherOrderFunction
expects a lambda in a certain place but we never enforce this check anywhere. We can't really place it in checkInputDataTypes
because we are not going to hit that codepath before erroring out, so we do it in CheckAnalysis
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So after binding, LambdaFunction
becomes something different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we are never going to bind LambdaFunction
because in order to bind it, we need to have arguments resolved, but because lambda is an argument we get into a recursive cycle of it never being resolved: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/higherOrderFunctions.scala#L116
c96fa20
to
f6cdd9a
Compare
f6cdd9a
to
b82d5fa
Compare
The spark connect test failure is unrelated, thanks, merging to master/4.0! |
… function is out of place in `HigherOrderFunction` ### What changes were proposed in this pull request? Throw proper user facing error message when lambda function is out of place in `HigherOrderFunction`. ### Why are the changes needed? Currently, the analysis fails with `invalid call to dataType`. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Added new test cases with expected error message. ### Was this patch authored or co-authored using generative AI tooling? No Closes #50345 from mihailotim-db/mihailotim-db/lambda_error_message. Authored-by: Mihailo Timotic <[email protected]> Signed-off-by: Wenchen Fan <[email protected]> (cherry picked from commit 90050f5) Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
Throw proper user facing error message when lambda function is out of place in
HigherOrderFunction
.Why are the changes needed?
Currently, the analysis fails with
invalid call to dataType
.Does this PR introduce any user-facing change?
No.
How was this patch tested?
Added new test cases with expected error message.
Was this patch authored or co-authored using generative AI tooling?
No