@andrewor14
Contributor

In #30, both @tgraves and I ran into a problem when building the assembly jar on a Red Hat system: the resulting jar does not properly load the Python files that are needed to run PySpark on YARN.

In the medium term, we should figure out the root cause, since Red Hat is quite commonly used. For now, we should at the very least document the issue so people don't run into the same headaches that plagued us for days.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@tgravescs
Contributor

Changes look good to me. Do you know if it's documented somewhere that you have to build with Maven to get PySpark on YARN?

@andrewor14
Contributor Author

I took a quick pass through the latest docs, and it looks like for 1.0+ we only mention Maven when we talk about building. I wonder whether we should still document the Maven requirement for PySpark on YARN, though, since people can still build with sbt even though that isn't documented.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14782/

Contributor

It would be good to create a JIRA issue for this and then reference its number in the documentation.

@andrewor14 changed the title from "[Docs] Warn about PySpark on YARN on Red Hat" to "[SPARK-1753] Warn about PySpark on YARN on Red Hat" on May 8, 2014

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14805/

@andrewor14
Contributor Author

This PR is subsumed by #701. Closing.

@andrewor14 closed this on May 13, 2014
@andrewor14 deleted the pyspark-on-yarn-docs branch on May 13, 2014 01:24
helenyugithub pushed a commit to helenyugithub/spark that referenced this pull request Jul 13, 2020
agirish pushed a commit to HPEEzmeral/apache-spark that referenced this pull request May 5, 2022
udaynpusa pushed a commit to mapr/spark that referenced this pull request Jan 30, 2024
mapr-devops pushed a commit to mapr/spark that referenced this pull request May 8, 2025