[SPARK-20494] Implement UDF array_unique in Spark with codegen #17778

janewangfb · 2017-04-27T00:57:36Z

What changes were proposed in this pull request?

Add a UDF, array_unique, which returns a new array with all duplicate elements of the original array removed.
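As a rough illustration of the intended semantics only (this is not the Catalyst expression or the generated code from the patch), the behavior can be sketched in plain Python. Keeping the first occurrence of each element and returning null for a null input are assumptions; the description above does not specify ordering or null handling.

```python
from typing import Hashable, List, Optional, Sequence


def array_unique(values: Optional[Sequence[Hashable]]) -> Optional[List[Hashable]]:
    """Return a new list with duplicate elements removed.

    Assumptions (not stated in the PR description):
    - the first occurrence of each element is kept, in original order;
    - a null (None) input array yields None.
    """
    if values is None:
        return None
    seen = set()
    result = []
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


deduplicated = array_unique([1, 2, 2, 3, 1])
```

The input sequence is left untouched; a new list is always returned, matching the "new array" wording in the description.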

How was this patch tested?

Added unit tests in collectionExpressionsSuite.scala. Also, in spark-shell, created tables with array-typed columns, inserted values containing duplicate array elements, ran queries with the UDF, and verified the results.

AmplabJenkins · 2017-04-27T01:02:14Z

Can one of the admins verify this patch?

HyukjinKwon · 2017-04-27T04:18:37Z

Hi @janewangfb, it looks like we need a JIRA, a better PR title, and a better PR description. Please check out http://spark.apache.org/contributing.html.

srowen · 2017-04-27T08:12:58Z

Why does this need to be added? If it isn't a standard function somewhere it probably doesn't need to be in Spark

janewangfb · 2017-04-27T17:11:55Z

@srowen We have an array_contains UDF. I think it would be nice to have one that removes all the duplicate elements.

srowen · 2017-04-27T17:19:34Z

@janewangfb that's because array_contains is a Hive function

srowen · 2017-05-06T09:33:10Z

We should close this

Add array_unique UDF

0a37b96

janewangfb changed the title ~~Add array_unique UDF~~ [SPARK-20494] Implement UDF array_unique in Spark with codegen Apr 27, 2017

srowen mentioned this pull request May 17, 2017

[INFRA] Close stale PRs #18017

Closed

asfgit closed this in 5d2750a May 18, 2017
