14 changes: 7 additions & 7 deletions developer-tools.md
@@ -389,19 +389,19 @@ Here are instructions on profiling Spark applications using YourKit Java Profiler
<a href="https://www.yourkit.com/download/index.jsp">YourKit downloads page</a>.
This file is pretty big (~100 MB) and YourKit downloads site is somewhat slow, so you may
consider mirroring this file or including it on a custom AMI.
-- Untar this file somewhere (in `/root` in our case): `tar xvjf yjp-12.0.5-linux.tar.bz2`
-- Copy the expanded YourKit files to each node using copy-dir: `~/spark-ec2/copy-dir /root/yjp-12.0.5`
+- Unzip this file somewhere (in `/root` in our case): `unzip YourKit-JavaProfiler-2017.02-b66.zip`
+- Copy the expanded YourKit files to each node using copy-dir: `~/spark-ec2/copy-dir /root/YourKit-JavaProfiler-2017.02`
Member Author: Could we still use spark-ec2/copy-dir? (I'm not sure spark-ec2 is being kept updated.)

Member: Hi @shivaram, if I understood correctly, amplab/spark-ec2 is maintained by you. Would you mind if I ask whether we should keep this section? The change itself looks fine unless I've missed something, but I'm less sure whether we should keep this part of the docs at all.

Contributor: I think it's probably good to add some wording on top, saying: "These instructions apply when Spark is run using the spark-ec2 [link to amplab/spark-ec2] scripts."

Member Author: Since YourKit usage is not limited to spark-ec2, I think that instruction might mislead readers. WDYT? @HyukjinKwon

Member (@HyukjinKwon, Sep 12, 2017): Actually, that's what I was thinking too. But I would rather avoid fixing it here now, since I guess it needs some more opinions/discussion (e.g., other profiling tools, a description of copying files to other nodes, checking for duplicated documentation, etc.). I am fine with it if any committer strongly prefers it (I am +0).

Otherwise, let's push what we are sure of first, since the current changes are obviously something to fix. I don't want to complicate this PR for now, and the original intention appears to be describing YourKit with EC2 anyway.

- Configure the Spark JVMs to use the YourKit profiling agent by editing `~/spark/conf/spark-env.sh`
and adding the lines
```
-SPARK_DAEMON_JAVA_OPTS+=" -agentpath:/root/yjp-12.0.5/bin/linux-x86-64/libyjpagent.so=sampling"
+SPARK_DAEMON_JAVA_OPTS+=" -agentpath:/root/YourKit-JavaProfiler-2017.02/bin/linux-x86-64/libyjpagent.so=sampling"
export SPARK_DAEMON_JAVA_OPTS
-SPARK_JAVA_OPTS+=" -agentpath:/root/yjp-12.0.5/bin/linux-x86-64/libyjpagent.so=sampling"
-export SPARK_JAVA_OPTS
Member: Wait .. is it okay to remove this without providing an alternative (e.g., for cluster mode)?

Member Author: I see; I hadn't checked. Would it be better to use spark.executor.extraJavaOptions and spark.driver.extraJavaOptions for this example? Those options seem to work even in cluster mode.

Member Author (@maropu, Sep 11, 2017): I checked the previous PR (apache/spark#17212) that removed SPARK_JAVA_OPTS, and the removed warning message suggested the following alternatives:

          |SPARK_JAVA_OPTS was detected (set to '$value').
          |This is deprecated in Spark 1.0+.
          |
          |Please instead use:
          | - ./spark-submit with conf/spark-defaults.conf to set defaults for an application
          | - ./spark-submit with --driver-java-options to set -X options for a driver
          | - spark.executor.extraJavaOptions to set -X options for executors
          | - SPARK_DAEMON_JAVA_OPTS to set java options for standalone daemons (master or worker)

From this message, SPARK_DAEMON_JAVA_OPTS seems to work in standalone mode only... I need to check further.
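For reference, the alternatives listed in that warning could be sketched as spark-submit flags roughly like the following. This is a hedged sketch only: the agent path is the one from the instructions above, `your-app.jar` is a placeholder, and the command is echoed rather than executed.

```shell
#!/usr/bin/env bash
# Sketch of the spark-submit flags that replace the removed SPARK_JAVA_OPTS.
# The agent path follows the instructions above; adjust it for your install.
AGENT="-agentpath:/root/YourKit-JavaProfiler-2017.02/bin/linux-x86-64/libyjpagent.so=sampling"
ARGS=(
  --driver-java-options "$AGENT"                   # profile the driver JVM
  --conf "spark.executor.extraJavaOptions=$AGENT"  # profile executor JVMs
)
# Echoed only; on a real cluster, drop the `echo`.
echo spark-submit "${ARGS[@]}" your-app.jar
```

On a real cluster one would drop the `echo` and pass the actual application jar.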

Member: Yes, I think this isn't my area .. (I am not sure whether it would work on YARN, etc.). Probably the safe choice is to set both ..

Member Author: Yeah, OK. Let's just wait for comments from other qualified developers :)

+SPARK_EXECUTOR_OPTS+=" -agentpath:/root/YourKit-JavaProfiler-2017.02/bin/linux-x86-64/libyjpagent.so=sampling"
+export SPARK_EXECUTOR_OPTS
```
- Copy the updated configuration to each node: `~/spark-ec2/copy-dir ~/spark/conf/spark-env.sh`
- Restart your Spark cluster: `~/spark/bin/stop-all.sh` and `~/spark/bin/start-all.sh`
-- By default, the YourKit profiler agents use ports 10001-10010. To connect the YourKit desktop
+- By default, the YourKit profiler agents use ports `10001-10010`. To connect the YourKit desktop
application to the remote profiler agents, you'll have to open these ports in the cluster's EC2
security groups. To do this, sign into the AWS Management Console. Go to the EC2 section and
select `Security Groups` from the `Network & Security` section on the left side of the page.
@@ -417,7 +417,7 @@ cluster with the same name, your security group settings will be re-used.
- YourKit should now be connected to the remote profiling agent. It may take a few moments for profiling information to appear.
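For those who prefer scripting the AWS Management Console steps above, the port opening could be sketched with the AWS CLI. Everything here is hypothetical: the security-group name is made up, and the command is echoed rather than executed so nothing touches a real account.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: opening the YourKit agent ports (10001-10010) with the
# AWS CLI instead of the web console. GROUP is a made-up name; `echo` keeps
# the command from being run against a real AWS account.
GROUP="my-spark-ec2-cluster"
echo aws ec2 authorize-security-group-ingress \
  --group-name "$GROUP" \
  --protocol tcp \
  --port 10001-10010 \
  --cidr 203.0.113.0/24  # restrict to your own network rather than 0.0.0.0/0
```

Dropping the `echo` would issue the request for real; you would run it once per security group the cluster uses.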

Please see the full YourKit documentation for the full list of profiler agent
-<a href="http://www.yourkit.com/docs/80/help/startup_options.jsp">startup options</a>.
+<a href="https://www.yourkit.com/docs/java/help/startup_options.jsp">startup options</a>.

<h4>In Spark unit tests</h4>

14 changes: 7 additions & 7 deletions site/developer-tools.html
@@ -568,19 +568,19 @@ <h4>On Spark EC2 images</h4>
<a href="https://www.yourkit.com/download/index.jsp">YourKit downloads page</a>.
This file is pretty big (~100 MB) and YourKit downloads site is somewhat slow, so you may
consider mirroring this file or including it on a custom AMI.</li>
-<li>Untar this file somewhere (in <code>/root</code> in our case): <code>tar xvjf yjp-12.0.5-linux.tar.bz2</code></li>
-<li>Copy the expanded YourKit files to each node using copy-dir: <code>~/spark-ec2/copy-dir /root/yjp-12.0.5</code></li>
+<li>Unzip this file somewhere (in <code>/root</code> in our case): <code>unzip YourKit-JavaProfiler-2017.02-b66.zip</code></li>
+<li>Copy the expanded YourKit files to each node using copy-dir: <code>~/spark-ec2/copy-dir /root/YourKit-JavaProfiler-2017.02</code></li>
<li>Configure the Spark JVMs to use the YourKit profiling agent by editing <code>~/spark/conf/spark-env.sh</code>
and adding the lines
-<pre><code>SPARK_DAEMON_JAVA_OPTS+=" -agentpath:/root/yjp-12.0.5/bin/linux-x86-64/libyjpagent.so=sampling"
Member: @maropu, it looks like we should keep `<pre>`. I just double-checked that the formatting is broken:

[screenshot of the broken formatting, 2017-09-11]

Member Author: OK, will recheck.

+<pre><code>SPARK_DAEMON_JAVA_OPTS+=" -agentpath:/root/YourKit-JavaProfiler-2017.02/bin/linux-x86-64/libyjpagent.so=sampling"
export SPARK_DAEMON_JAVA_OPTS
-SPARK_JAVA_OPTS+=" -agentpath:/root/yjp-12.0.5/bin/linux-x86-64/libyjpagent.so=sampling"
-export SPARK_JAVA_OPTS
+SPARK_EXECUTOR_OPTS+=" -agentpath:/root/YourKit-JavaProfiler-2017.02/bin/linux-x86-64/libyjpagent.so=sampling"
+export SPARK_EXECUTOR_OPTS
</code></pre>
</li>
<li>Copy the updated configuration to each node: <code>~/spark-ec2/copy-dir ~/spark/conf/spark-env.sh</code></li>
<li>Restart your Spark cluster: <code>~/spark/bin/stop-all.sh</code> and <code>~/spark/bin/start-all.sh</code></li>
-<li>By default, the YourKit profiler agents use ports 10001-10010. To connect the YourKit desktop
+<li>By default, the YourKit profiler agents use ports <code>10001-10010</code>. To connect the YourKit desktop
application to the remote profiler agents, you&#8217;ll have to open these ports in the cluster&#8217;s EC2
security groups. To do this, sign into the AWS Management Console. Go to the EC2 section and
select <code>Security Groups</code> from the <code>Network &amp; Security</code> section on the left side of the page.
@@ -597,7 +597,7 @@ <h4>On Spark EC2 images</h4>
</ul>

<p>Please see the full YourKit documentation for the full list of profiler agent
-<a href="http://www.yourkit.com/docs/80/help/startup_options.jsp">startup options</a>.</p>
+<a href="https://www.yourkit.com/docs/java/help/startup_options.jsp">startup options</a>.</p>

<h4>In Spark unit tests</h4>
