
Contributor

@davies davies commented Sep 16, 2014

Py4J cannot handle large strings efficiently, so we should automatically use a broadcast for large closures. (Broadcast passes the data through the local filesystem.)
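As a rough illustration of the idea (not the code in this patch), the sketch below pickles a closure and, when the result exceeds a cutoff, writes it to a local file and ships only the file path. The names `LARGE_CLOSURE_THRESHOLD`, `FileBackedPayload`, `prepare_closure`, and `load_closure` are hypothetical, and the 1 MiB cutoff is an assumption; in PySpark itself the file-backed object would be a broadcast variable created through `SparkContext.broadcast`.

```python
import os
import pickle
import tempfile

# Hypothetical cutoff: the threshold used by the patch is not stated in this thread.
LARGE_CLOSURE_THRESHOLD = 1 << 20  # 1 MiB


class FileBackedPayload:
    """Stand-in for a broadcast variable: the data lives in a local file."""

    def __init__(self, data: bytes):
        fd, self.path = tempfile.mkstemp(suffix=".closure")
        with os.fdopen(fd, "wb") as f:
            f.write(data)

    def load(self) -> bytes:
        with open(self.path, "rb") as f:
            return f.read()

    def unpersist(self) -> None:
        # Remove the backing file once the closure is no longer needed.
        os.remove(self.path)


def prepare_closure(obj, threshold=LARGE_CLOSURE_THRESHOLD):
    """Return (payload, handle).

    Small closures are sent inline and handle is None. Large closures are
    written to a file, and the payload is only the pickled file path.
    """
    pickled = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    if len(pickled) <= threshold:
        return pickled, None
    handle = FileBackedPayload(pickled)
    return pickle.dumps(handle.path), handle


def load_closure(payload: bytes, via_file: bool):
    """Receiving side: unpickle inline data, or follow the path to the file."""
    if not via_file:
        return pickle.loads(payload)
    with open(pickle.loads(payload), "rb") as f:
        return pickle.loads(f.read())
```

With this shape, the string that crosses the bridge stays small regardless of closure size, and the caller is responsible for calling `unpersist()` on the handle once the job is done.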

@davies davies changed the title [SPARK-3554] use broadcast automatically for large closure [SPARK-3554] [PySpark] use broadcast automatically for large closure Sep 16, 2014

SparkQA commented Sep 16, 2014

QA tests have started for PR 2417 at commit aefd508.

  • This patch merges cleanly.


SparkQA commented Sep 16, 2014

QA tests have finished for PR 2417 at commit aefd508.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Sep 16, 2014

QA tests have started for PR 2417 at commit fbf4e97.

  • This patch merges cleanly.


SparkQA commented Sep 16, 2014

QA tests have finished for PR 2417 at commit fbf4e97.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor

LGTM. Surprising that the broadcast variable removal code was never triggered in the test suite before; thanks for fixing that!

@asfgit asfgit closed this in e77fa81 Sep 19, 2014