Member

@viirya viirya commented Sep 4, 2015

JIRA: https://issues.apache.org/jira/browse/SPARK-10446

Currently, the method `join(right: DataFrame, usingColumns: Seq[String])` supports only inner joins. It would be more convenient if it also supported other join types.

SparkQA commented Sep 4, 2015

Test build #41997 has finished for PR 8600 at commit 8ff97ed.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class BlockFetchException(messages: String, throwable: Throwable)

SparkQA commented Sep 4, 2015

Test build #42001 has finished for PR 8600 at commit efe069a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

To maintain compatibility with Java, we cannot use default parameter values. You can add an extra overloaded method instead.

SparkQA commented Sep 5, 2015

Test build #42058 has finished for PR 8600 at commit e298dad.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member Author

viirya commented Sep 8, 2015

ping @rxin

Contributor

rxin commented Sep 22, 2015

Thanks - I've merged this.

@asfgit asfgit closed this in 1fcefef Sep 22, 2015
ghost pushed a commit to dbtsai/spark that referenced this pull request Dec 28, 2015
…i-Join

After reading the JIRA https://issues.apache.org/jira/browse/SPARK-12520, I double-checked the code.

For example, users can do an Equi-Join like this:
  ```df.join(df2, 'name', 'outer').select('name', 'height').collect()```
- There is a bug in 1.4 and 1.5: the code ignores the third parameter (the join type) that users pass. The join type actually used is `Inner`, even when the user specifies a different type (e.g., `Outer`).
- After PR apache#8600, 1.6 no longer has this issue, but the description has not been updated.

I plan to submit another PR to fix 1.5 and issue an error message if users specify a non-inner join type when using an Equi-Join.

Author: gatorsmile

Closes apache#10477 from gatorsmile/pyOuterJoin.
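
The 1.4/1.5 behaviour described in the commit message above, and the guard it plans for 1.5, can be sketched in plain Python. This is a toy model only, not Spark's code: `inner_join_using`, `join_ignoring_type`, `join_with_guard`, and the list-of-dicts row format are hypothetical names chosen for illustration.

```python
def inner_join_using(left, right, column):
    """Inner equi-join of two lists of dict rows on one shared column."""
    joined = []
    for l_row in left:
        for r_row in right:
            if l_row[column] == r_row[column]:
                merged = dict(r_row)
                merged.update(l_row)  # the shared column appears only once
                joined.append(merged)
    return joined


def join_ignoring_type(left, right, column, how="inner"):
    """Models the reported bug: `how` is accepted but never used."""
    return inner_join_using(left, right, column)


def join_with_guard(left, right, column, how="inner"):
    """Models the planned guard: reject non-inner join types with an error."""
    if how != "inner":
        raise ValueError(
            "join type %r is not supported when joining on column names" % how
        )
    return inner_join_using(left, right, column)


# Hypothetical sample rows.
people = [{"name": "Alice", "age": 2}, {"name": "Bob", "age": 5}]
heights = [{"name": "Bob", "height": 85}, {"name": "Tom", "height": 80}]

# The caller asks for an outer join but silently gets inner-join rows.
silent = join_ignoring_type(people, heights, "name", how="outer")
print(silent)

try:
    join_with_guard(people, heights, "name", how="outer")
except ValueError as err:
    print("rejected:", err)
```

The point of the guard is that a silently wrong result is harder to notice than an explicit error; in the buggy variant only the matching `Bob` row comes back, even though an outer join was requested.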
asfgit pushed a commit that referenced this pull request Dec 28, 2015
@gatorsmile
Member

How can we combine two columns with different values?

@gatorsmile
Member

Never mind. A USING join can support outer join types, but we are unable to treat them as actual outer joins.
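
One reading of the exchange above is that it concerns the value the shared column should take when a row exists on only one side. In standard SQL, a full outer join with USING exposes a single join column equal to `COALESCE(left.col, right.col)`. The sketch below models that in plain Python; `full_outer_join_using` and the row format are hypothetical, and it is not a description of how Spark implemented the join at the time.

```python
def full_outer_join_using(left, right, column):
    """Full outer equi-join on one shared column, with the key coalesced."""
    left_fields = {k for row in left for k in row if k != column}
    right_fields = {k for row in right for k in row if k != column}
    result = []
    matched_right = set()

    for l_row in left:
        hit = False
        key = l_row[column]
        for i, r_row in enumerate(right):
            # As in SQL, a null key never matches anything.
            if key is not None and key == r_row[column]:
                hit = True
                matched_right.add(i)
                result.append({**r_row, **l_row})
        if not hit:
            # Left-only row: pad the right-side fields with None.
            padded = {f: None for f in right_fields}
            padded.update(l_row)
            result.append(padded)

    for i, r_row in enumerate(right):
        if i not in matched_right:
            # Right-only row: the key is taken from the right side,
            # which is what COALESCE(left.col, right.col) yields here.
            padded = {f: None for f in left_fields}
            padded.update(r_row)
            result.append(padded)
    return result


# Hypothetical sample rows.
people = [{"name": "Alice", "age": 2}, {"name": "Bob", "age": 5}]
heights = [{"name": "Bob", "height": 85}, {"name": "Tom", "height": 80}]

rows = full_outer_join_using(people, heights, "name")
for row in rows:
    print(row)
```

If the output column were taken from the left side only, the right-only `Tom` row would carry a null `name`, so the result would not behave like a true outer join on that column.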

@viirya viirya deleted the usingcolumns_df branch December 27, 2023 18:32