@rxin
Contributor

@rxin rxin commented May 3, 2017

What changes were proposed in this pull request?

We allow users to specify hints (currently only "broadcast" is supported) in SQL and DataFrame. However, while SQL has a standard hint format (/*+ ... */), DataFrame doesn't have one, and users are sometimes confused because they can't find how to apply a broadcast hint. This patch adds a generic hint function on DataFrame so that the same hints can be used on DataFrames as well as in SQL.

As an example, after this patch, the following will apply a broadcast hint on a DataFrame using the new hint function:

df1.join(df2.hint("broadcast"))
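
For readers who want to try this locally, here is a minimal sketch that puts the new hint function next to the existing broadcast function and the SQL hint comment. The sample data, column names, view names (t1, t2), and the object name are assumptions made up for illustration; only df1, df2, and hint("broadcast") come from the description above. Automatic broadcasting is disabled here so that any broadcast join in the printed plans should come from the hint rather than from the size threshold.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

// Hypothetical example object; not part of the patch.
object BroadcastHintExample {
  def main(args: Array[String]): Unit = {
    // Local session for illustration. Setting the threshold to -1 turns off
    // size-based automatic broadcasting, so the hints are what matter.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("broadcast-hint-example")
      .config("spark.sql.autoBroadcastJoinThreshold", "-1")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data; df1 and df2 mirror the names in the description.
    val df1 = Seq((1, "a"), (2, "b")).toDF("id", "left_value")
    val df2 = Seq((1, "x"), (2, "y")).toDF("id", "right_value")

    // Generic hint function added by this patch.
    val viaHint = df1.join(df2.hint("broadcast"), "id")

    // Function-based form that existed before this patch.
    val viaFunction = df1.join(broadcast(df2), "id")

    // SQL form, using the /*+ ... */ hint comment on a temp view.
    df1.createOrReplaceTempView("t1")
    df2.createOrReplaceTempView("t2")
    val viaSql = spark.sql(
      "SELECT /*+ BROADCAST(t2) */ * FROM t1 JOIN t2 ON t1.id = t2.id")

    // Print the physical plans so the chosen join strategy can be inspected.
    viaHint.explain()
    viaFunction.explain()
    viaSql.explain()

    spark.stop()
  }
}
```

If the hint is honored, each printed plan would be expected to show a broadcast join rather than a sort-merge join, though the exact plan text varies between Spark versions.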

How was this patch tested?

Added a test case in DataFrameJoinSuite.

@SparkQA
SparkQA commented May 3, 2017

Test build #76410 has started for PR 17839 at commit b84badc.

@gatorsmile
Member

LGTM pending Jenkins

@rxin
Contributor Author

rxin commented May 3, 2017

Actually somebody should add the Python / R wrapper.

cc @felixcheung and @zero323

@cloud-fan
Contributor

LGTM

@zero323
Member

zero323 commented May 3, 2017

> Actually somebody should add the Python / R wrapper.

I can add both, once it is merged.

@SparkQA
SparkQA commented May 3, 2017

Test build #3683 has finished for PR 17839 at commit b84badc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Member

just a thought - hint sounds fairly generic, especially in R as hint(df, ...)

@rxin
Contributor Author

rxin commented May 3, 2017

Merging in master/branch-2.2.

@rxin
Contributor Author

rxin commented May 3, 2017

@felixcheung do you worry about conflicts?

asfgit pushed a commit that referenced this pull request May 3, 2017
## What changes were proposed in this pull request?
We allow users to specify hints (currently only "broadcast" is supported) in SQL and DataFrame. However, while SQL has a standard hint format (/*+ ... */), DataFrame doesn't have one, and users are sometimes confused because they can't find how to apply a broadcast hint. This patch adds a generic hint function on DataFrame so that the same hints can be used on DataFrames as well as in SQL.

As an example, after this patch, the following will apply a broadcast hint on a DataFrame using the new hint function:

```
df1.join(df2.hint("broadcast"))
```

## How was this patch tested?
Added a test case in DataFrameJoinSuite.

Author: Reynold Xin

Closes #17839 from rxin/SPARK-20576.

(cherry picked from commit 527fc5d)
Signed-off-by: Reynold Xin

@rxin
Contributor Author

rxin commented May 3, 2017

BTW I filed follow-up tickets for Python/R at https://issues.apache.org/jira/browse/SPARK-20576

@asfgit asfgit closed this in 527fc5d May 3, 2017