-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-35278][SQL] Invoke should find the method with correct number of parameters #32404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Nice catch!
Thank you, @viirya .
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
| case ObjectType(cls) => | ||
| val m = cls.getMethods.find(_.getName == encodedFunctionName) | ||
| val m = cls.getMethods.find { m => | ||
| m.getName == encodedFunctionName && m.getParameterCount == arguments.length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we check parameter types too? also curious why we don't use cls.getMethod for the purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to use getMethod
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually Invoke uses getMethod for target object with data types other than ObjectType. I'm not sure why we need to do a separate getMethod during evaluation for the cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And note method uses encodedFunctionName to find method, but in eval the original functionName is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be safer to keep original pattern. But switched to use getMethod and see what CI tells.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like a very ancient bug from Spark 1.x.
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
Show resolved
Hide resolved
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, @viirya . According to the Affected Version of SPARK-35278, are we going to cut another RC (Apache Spark 2.4.8 RC4) for this?
|
Test build #138103 has finished for PR 32404 at commit
|
Switched to |
Ideally, yes. Though seems no one complains this, and this is an ancient bug... |
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Let me know if you prefer to skip 2.4 backport. |
|
Oh, the problem of using |
|
Test build #138097 has finished for PR 32404 at commit
|
| @transient lazy val method = targetObject.dataType match { | ||
| case ObjectType(cls) => | ||
| val m = cls.getMethods.find(_.getName == encodedFunctionName) | ||
| val m = cls.getMethods.find { m => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we use filter instead of find, and fail if more than one match is found?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about:
getMethodwith param types- If not found, turn to find one method with parameter number
- If more than one is found, or no one can be found, fails it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for that improved one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. Now I'm wondering whether StaticInvoke has the same issue since it uses getDeclaredMethod.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StaticInvoke calls getDeclaredMethod with argClasses. I think it may also suffer the issue if the arg class is Object.
Let me see if I can come out a test case and fix it (might be in other PR).
As said above, I think it is safer to try to combine two method looking up in master only.
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Test build #138107 has finished for PR 32404 at commit
|
For that, it's totally up to your decision, @viirya , because you are the release manager of Apache Spark 2.4.8. I support your decision. |
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Test build #138121 has finished for PR 32404 at commit
|
| sys.error(s"Couldn't find $encodedFunctionName on $cls") | ||
| } else if (m.length > 1) { | ||
| // More than one matched method signature. Exclude synthetic one, e.g. generic one. | ||
| val realMethods = m.filter(!_.isSynthetic) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We cannot filter out synthetic ones in L336?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if it is possible that there is only one method and it is also a synthetic one. If we filter synthetic ones at L336, we may miss it?
|
Looks fine otherwise. |
|
Test build #138122 has finished for PR 32404 at commit
|
|
Thanks @cloud-fan @dongjoon-hyun @maropu. Merging to master/3.1/3.0. There is conflicts in 2.4. Will manually backport. |
…of parameters ### What changes were proposed in this pull request? This patch fixes `Invoke` expression when the target object has more than one method with the given method name. ### Why are the changes needed? `Invoke` will find out the method on the target object with given method name. If there are more than one method with the name, currently it is undeterministic which method will be used. We should add the condition of parameter number when finding the method. ### Does this PR introduce _any_ user-facing change? Yes, fixed a bug when using `Invoke` on a object where more than one method with the given method name. ### How was this patch tested? Unit test. Closes #32404 from viirya/verify-invoke-param-len. Authored-by: Liang-Chi Hsieh <[email protected]> Signed-off-by: Liang-Chi Hsieh <[email protected]> (cherry picked from commit 6ce1b16) Signed-off-by: Liang-Chi Hsieh <[email protected]>
…of parameters ### What changes were proposed in this pull request? This patch fixes `Invoke` expression when the target object has more than one method with the given method name. ### Why are the changes needed? `Invoke` will find out the method on the target object with given method name. If there are more than one method with the name, currently it is undeterministic which method will be used. We should add the condition of parameter number when finding the method. ### Does this PR introduce _any_ user-facing change? Yes, fixed a bug when using `Invoke` on a object where more than one method with the given method name. ### How was this patch tested? Unit test. Closes #32404 from viirya/verify-invoke-param-len. Authored-by: Liang-Chi Hsieh <[email protected]> Signed-off-by: Liang-Chi Hsieh <[email protected]> (cherry picked from commit 6ce1b16) Signed-off-by: Liang-Chi Hsieh <[email protected]>
…mber of parameters ### What changes were proposed in this pull request? This patch fixes `Invoke` expression when the target object has more than one method with the given method name. This is 2.4 backport of #32404. ### Why are the changes needed? `Invoke` will find out the method on the target object with given method name. If there are more than one method with the name, currently it is undeterministic which method will be used. We should add the condition of parameter number when finding the method. ### Does this PR introduce _any_ user-facing change? Yes, fixed a bug when using `Invoke` on a object where more than one method with the given method name. ### How was this patch tested? Unit test. Closes #32412 from viirya/SPARK-35278-2.4. Authored-by: Liang-Chi Hsieh <[email protected]> Signed-off-by: Liang-Chi Hsieh <[email protected]>
|
+1, late LGTM with the last commit. |
|
@viirya can we look into which callers do not provide the concrete parameter types when building |
|
@cloud-fan Okay. Let me find them out and fix them. |
|
@cloud-fan Just to confirm. So you mean we only allow For example, calling a method |
|
@cloud-fan Seems to me we still need to support it. One example is For example, |
This should be allowed, because we do provide the concrete parameter type ( I think I get your point, we should probably use a util function to look up the method, instead of writing the code by ourselves, to handle cases like search the inheritance hierarchy, argument type casting, etc. |
It looks promising. Not aware there is an util function we can use. So my question would be should we use it in backports? Or only replacing with our code with the util in master only? |
|
I think this PR itself is pretty conservative and is good to backport. The new code is clearly better than the old code without regression. I'd suggest to only change to the util method in the master branch, for both invoke and static invoke. |
|
BTW, AFAIK invoke and static invoke are only used by internal places such as the encoder and UDT. It's not really a bug for <= 3.1, but a bug for 3.2 as we use it to execute the new UDF. |
I see. Then I don't touch invoke in old branches again and keep this backport as it is. Let's use the util method in master only. |
Rarely but maybe possibly developers will use invoke/static invoke for custom encoder? As this is already backport to 2.4/3.0/3.1, I think it should be fine to keep as it is? For static invoke #32413, it is also conservative, so I guess it may be also good for backport? |
Yea I think so |
…of parameters This patch fixes `Invoke` expression when the target object has more than one method with the given method name. `Invoke` will find out the method on the target object with given method name. If there are more than one method with the name, currently it is undeterministic which method will be used. We should add the condition of parameter number when finding the method. Yes, fixed a bug when using `Invoke` on a object where more than one method with the given method name. Unit test. Closes apache#32404 from viirya/verify-invoke-param-len. Authored-by: Liang-Chi Hsieh <[email protected]> Signed-off-by: Liang-Chi Hsieh <[email protected]> (cherry picked from commit 6ce1b16) Signed-off-by: Liang-Chi Hsieh <[email protected]>
…of parameters ### What changes were proposed in this pull request? This patch fixes `Invoke` expression when the target object has more than one method with the given method name. ### Why are the changes needed? `Invoke` will find out the method on the target object with given method name. If there are more than one method with the name, currently it is undeterministic which method will be used. We should add the condition of parameter number when finding the method. ### Does this PR introduce _any_ user-facing change? Yes, fixed a bug when using `Invoke` on a object where more than one method with the given method name. ### How was this patch tested? Unit test. Closes apache#32404 from viirya/verify-invoke-param-len. Authored-by: Liang-Chi Hsieh <[email protected]> Signed-off-by: Liang-Chi Hsieh <[email protected]> (cherry picked from commit 6ce1b16) Signed-off-by: Liang-Chi Hsieh <[email protected]>
What changes were proposed in this pull request?
This patch fixes
Invokeexpression when the target object has more than one method with the given method name.Why are the changes needed?
Invokewill find out the method on the target object with given method name. If there are more than one method with the name, currently it is undeterministic which method will be used. We should add the condition of parameter number when finding the method.Does this PR introduce any user-facing change?
Yes, fixed a bug when using
Invokeon a object where more than one method with the given method name.How was this patch tested?
Unit test.