Commit 32c567b

Merge pull request apache#95 from concretevitamin/master
Update README.md with YARN instructions.
2 parents 8e8a029 + 4cd2d5e commit 32c567b

1 file changed: +21 -1 lines changed

README.md

Lines changed: 21 additions & 1 deletion
@@ -65,7 +65,7 @@ environment variable. For example to use 1g, you can run
 
 SPARK_MEM=1g ./sparkR
 
-In a cluster settting to set the amount of memory used by the executors you can
+In a cluster setting to set the amount of memory used by the executors you can
 pass the variable `spark.executor.memory` to the SparkContext constructor.
 
 library(SparkR)
@@ -89,6 +89,26 @@ You can also run the unit-tests for SparkR by running
 Instructions for running SparkR on EC2 can be found in the
 [SparkR wiki](https://github.com/amplab-extras/SparkR-pkg/wiki/SparkR-on-EC2).
 
+## Running on YARN
+Currently, SparkR supports running on YARN with the `yarn-client` mode. These steps show how to build SparkR with YARN support and run SparkR programs on a YARN cluster:
+
+```
+# assumes Java, R, rJava, yarn, spark etc. are installed on the whole cluster.
+cd SparkR-pkg/
+USE_YARN=1 SPARK_YARN_VERSION=2.4.0 SPARK_HADOOP_VERSION=2.4.0 ./install-dev.sh
+```
+
+Before launching an application, make sure each worker node has a local copy of `lib/SparkR/sparkr-assembly-0.1.jar`. With a cluster launched with the `spark-ec2` script, do:
+```
+~/spark-ec2/copy-dir ~/SparkR-pkg
+```
+
+Finally, when launching an application, the environment variable `YARN_CONF_DIR` needs to be set to the directory which contains the client-side configuration files for the Hadoop cluster (with a cluster launched with `spark-ec2`, this defaults to `/root/ephemeral-hdfs/conf/`):
+```
+YARN_CONF_DIR=/root/ephemeral-hdfs/conf/ MASTER=yarn-client ./sparkR
+YARN_CONF_DIR=/root/ephemeral-hdfs/conf/ ./sparkR examples/pi.R yarn-client
+```
+
 ## Report Issues/Feedback
 
 For better tracking and collaboration, issues and TODO items are reported to a dedicated [SparkR JIRA](https://sparkr.atlassian.net/browse/SPARKR/).
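
Note on the YARN steps added in this change: they rely on two preconditions, a local copy of `lib/SparkR/sparkr-assembly-0.1.jar` and a `YARN_CONF_DIR` that points at the Hadoop client-side configuration. The sketch below is one way to check both on a node before launching. The function names `check_yarn_conf` and `check_assembly_jar` are hypothetical helpers, not part of SparkR, and the sketch assumes the configuration directory contains a `yarn-site.xml` file, which is typical for Hadoop client configurations but is not stated in the README.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; not part of SparkR or spark-ec2.

# check_yarn_conf DIR
# Succeeds only if DIR is non-empty, is a directory, and contains
# yarn-site.xml (assumed marker of a Hadoop client-side config directory).
check_yarn_conf() {
    conf_dir="$1"
    if [ -z "$conf_dir" ]; then
        echo "YARN_CONF_DIR is not set" >&2
        return 1
    fi
    if [ ! -d "$conf_dir" ]; then
        echo "YARN_CONF_DIR is not a directory: $conf_dir" >&2
        return 1
    fi
    if [ ! -f "$conf_dir/yarn-site.xml" ]; then
        echo "no yarn-site.xml found in $conf_dir" >&2
        return 1
    fi
    return 0
}

# check_assembly_jar PKG_DIR
# Succeeds only if the SparkR assembly jar named in the README exists
# under PKG_DIR on this node.
check_assembly_jar() {
    jar="$1/lib/SparkR/sparkr-assembly-0.1.jar"
    if [ ! -f "$jar" ]; then
        echo "missing assembly jar: $jar" >&2
        return 1
    fi
    return 0
}

# Example use before launching (paths taken from the README):
#   check_yarn_conf /root/ephemeral-hdfs/conf/ \
#     && check_assembly_jar ~/SparkR-pkg \
#     && YARN_CONF_DIR=/root/ephemeral-hdfs/conf/ MASTER=yarn-client ./sparkR
```

The checks only inspect the local filesystem, so they would need to be run on each worker node (for example after `copy-dir`) to confirm the jar was distributed.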

0 commit comments
