Skip to content

Conversation

@wjxiz1992
Copy link
Collaborator

@wjxiz1992 wjxiz1992 commented Apr 25, 2021

This PR is to close #2060. This gives initial support for element_at.

Note apache/spark#30297 introduce different behaviors for element_at when invalid index or key is given to lhs with ansi mode enabled and disabled.
A follow-up issue is created to track the difference.

For select element_at(Array(1,2,3), 4):

Spark 3.0.2 + spark.sql.ansi.enabled=true:

+-----------------------------+
|element_at(array(1, 2, 3), 4)|
+-----------------------------+
|                         null|
+-----------------------------+

Spark 3.1.1 + spark.sql.ansi.enabled=true:

java.lang.ArrayIndexOutOfBoundsException: Invalid index: 4, numElements: 3

For select element_at(map(1, 'a', 2, 'b'), 3):

Spark 3.0.2 + spark.sql.ansi.enabled=true:

+------------------------------+
|element_at(map(1, a, 2, b), 3)|
+------------------------------+
|                          null|
+------------------------------+

Spark 3.1.1 + spark.sql.ansi.enabled=true:

java.util.NoSuchElementException: Key 3 does not exist.

Signed-off-by: Allen Xu [email protected]

@wjxiz1992 wjxiz1992 self-assigned this Apr 25, 2021
@sameerz sameerz added the feature request New feature or request label Apr 25, 2021
@sameerz sameerz added this to the Apr 26 - May 7 milestone Apr 25, 2021
@wjxiz1992 wjxiz1992 linked an issue Apr 26, 2021 that may be closed by this pull request
@firestarman
Copy link
Collaborator

build

firestarman
firestarman previously approved these changes Apr 30, 2021
Copy link
Collaborator

@sperlingxx sperlingxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that ElementAt is located at collectionOperations.scala instead of complexTypeExtractors.scala in Spark. Shall we move GpuElementAt into collectionOperations.scala to follow the structure of Spark SQL ?

@revans2
Copy link
Collaborator

revans2 commented May 5, 2021

I found that ElementAt is located at collectionOperations.scala instead of complexTypeExtractors.scala in Spark. Shall we move GpuElementAt into collectionOperations.scala to follow the structure of Spark SQL ?

For most expressions I think it is good to try and mirror where Spark has them so yes.

@@ -0,0 +1,119 @@
/*
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is moved from rapids to sql, because:

  1. to match the path for this file in Spark
  2. to be able to call private [sql] class like AbstractDataType, ArrayType.simpleString

@firestarman
Copy link
Collaborator

LGTM

@firestarman
Copy link
Collaborator

build

@wjxiz1992
Copy link
Collaborator Author

wjxiz1992 commented May 14, 2021

CI failing due to a window related PR, this PR has been reverted.
Need to wait until the new cuDF jar is ready, then build again.

@revans2
Copy link
Collaborator

revans2 commented May 14, 2021

build


object GetArrayItemUtil {
def evalColumnar(array: GpuColumnVector, ordinal: Scalar, dataType: DataType,
zeroIndexed: Boolean): ColumnVector = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked for a common implementation because I thought it could be combined cleanly. This is not combined cleanly. At a minimum we need some documentation to explain exactly what is happening because it is not obvious from just the API. However, I am inclined to admin that my request was a bad one because the corner cases are handled differently enough that it is not simple to combine them, and perhaps we should leave these separate.

Signed-off-by: Allen Xu <[email protected]>
@wjxiz1992
Copy link
Collaborator Author

build

@wjxiz1992 wjxiz1992 requested a review from revans2 May 14, 2021 13:48
Signed-off-by: Allen Xu <[email protected]>
revans2
revans2 previously approved these changes May 14, 2021
@wjxiz1992
Copy link
Collaborator Author

build

firestarman
firestarman previously approved these changes May 15, 2021
@wjxiz1992
Copy link
Collaborator Author

build

@wjxiz1992
Copy link
Collaborator Author

build

Signed-off-by: Allen Xu <[email protected]>
@firestarman
Copy link
Collaborator

build

@wjxiz1992 wjxiz1992 merged commit 247b758 into NVIDIA:branch-0.6 May 15, 2021
@wjxiz1992 wjxiz1992 deleted the elementat branch May 15, 2021 15:20
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
Support ElementAt.

Signed-off-by: Allen Xu <[email protected]>
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
Support ElementAt.

Signed-off-by: Allen Xu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature request New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEA] Support element_at function on GPU

5 participants