[SPARK-34828][YARN] Make shuffle service name configurable on client side and allow for classpath-based config override on server side #31936
Changes from all commits: ca6b3ce, e5d7a2d, 90e743a, ef88052
@@ -773,8 +773,28 @@ The following extra configuration options are available when the shuffle service

    NodeManagers where the Spark Shuffle Service is not running.
  </td>
</tr>
<tr>
  <td><code>spark.yarn.shuffle.service.metrics.namespace</code></td>
  <td><code>sparkShuffleService</code></td>
  <td>
    The namespace to use when emitting shuffle service metrics into the Hadoop metrics2 system of
    the NodeManager.
  </td>
</tr>
</table>

Please note that the instructions above assume that the default shuffle service name,
`spark_shuffle`, has been used. It is possible to use any name here, but the values used in the
YARN NodeManager configurations must match the value of `spark.shuffle.service.name` in the
Spark application.
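
For illustration, here is a minimal sketch of matching settings on both sides. The service name
`my_spark_shuffle` is arbitrary, and the aux-service keys follow the standard
`yarn.nodemanager.aux-services.*` convention used elsewhere on this page:

```properties
# NodeManager side (yarn-site.xml): register the shuffle service under a custom name
yarn.nodemanager.aux-services = my_spark_shuffle
yarn.nodemanager.aux-services.my_spark_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService

# Application side (spark-defaults.conf): the name must match the aux-service name above
spark.shuffle.service.enabled = true
spark.shuffle.service.name = my_spark_shuffle
```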

The shuffle service will, by default, take all of its configurations from the Hadoop Configuration
used by the NodeManager (e.g. `yarn-site.xml`). However, it is also possible to configure the
shuffle service independently using a file named `spark-shuffle-site.xml`, which should be placed
on the classpath of the shuffle service (which is, by default, shared with the classpath of the
NodeManager). The shuffle service will treat this as a standard Hadoop Configuration resource and
overlay it on top of the NodeManager's configuration.
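
As a minimal sketch of such an overlay (the port value here is arbitrary; any shuffle service
configuration could be placed in this file), a `spark-shuffle-site.xml` on the shuffle service
classpath might look like:

```xml
<configuration>
  <!-- Takes precedence over the corresponding value in the NodeManager's configuration, if any -->
  <property>
    <name>spark.shuffle.service.port</name>
    <value>7447</value>
  </property>
</configuration>
```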

# Launching your application with Apache Oozie

Apache Oozie can launch Spark applications as part of a workflow.

@@ -823,3 +843,54 @@ do the following:

  to the list of filters in the <code>spark.ui.filters</code> configuration.

Be aware that the history server information may not be up-to-date with the application's state.

# Running multiple versions of the Spark Shuffle Service

Please note that this section only applies when running on YARN versions >= 2.9.0.

In some cases it may be desirable to run multiple instances of the Spark Shuffle Service that use
different versions of Spark. This can be helpful, for example, when running a YARN cluster with a
mixed workload of applications running multiple Spark versions, since a given version of the
shuffle service is not always compatible with other versions of Spark. YARN versions since 2.9.0
support the ability to run shuffle services within an isolated classloader
(see [YARN-4577](https://issues.apache.org/jira/browse/YARN-4577)), meaning multiple Spark versions
can coexist within a single NodeManager. The
`yarn.nodemanager.aux-services.<service-name>.classpath` and, starting from YARN 2.10.2/3.1.1/3.2.0,
`yarn.nodemanager.aux-services.<service-name>.remote-classpath` options can be used to configure
this. In addition to setting up separate classpaths, it is necessary to ensure the two versions
advertise to different ports. This can be achieved using the `spark-shuffle-site.xml` file described
above. For example, you may have a configuration like:

```properties
yarn.nodemanager.aux-services = spark_shuffle_x,spark_shuffle_y
yarn.nodemanager.aux-services.spark_shuffle_x.classpath = /path/to/spark-x-yarn-shuffle.jar,/path/to/spark-x-config
yarn.nodemanager.aux-services.spark_shuffle_y.classpath = /path/to/spark-y-yarn-shuffle.jar,/path/to/spark-y-config
```

The two `spark-*-config` directories each contain one file, `spark-shuffle-site.xml`. These are XML
files in the [Hadoop Configuration format](https://hadoop.apache.org/docs/r3.2.2/api/org/apache/hadoop/conf/Configuration.html),
each containing a few configurations to adjust the port number and the metrics name prefix used:

```xml
<configuration>
  <property>
    <name>spark.shuffle.service.port</name>
    <value>7001</value>
  </property>
  <property>
    <name>spark.yarn.shuffle.service.metrics.namespace</name>
    <value>sparkShuffleServiceX</value>
  </property>
</configuration>
```

Both values should differ between the two services.

Then, in the configuration of the Spark applications, one should be configured with:

```properties
spark.shuffle.service.name = spark_shuffle_x
spark.shuffle.service.port = 7001
```

and one should be configured with:

```properties
spark.shuffle.service.name = spark_shuffle_y
spark.shuffle.service.port = <other value>
```
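
If the shuffle service jars are not installed locally on every node but are instead distributed via
HDFS, the `remote-classpath` option mentioned above can be used in place of `classpath` on the
supported YARN versions. Below is a hypothetical sketch, assuming it accepts remote paths analogously
to the local example; the HDFS paths are illustrative, and the exact value format should be checked
against the YARN documentation for your version. In that case the `spark-shuffle-site.xml` overlay
must also end up on the same isolated classpath, for example by packaging it inside the shuffle jar
itself, as the integration test below does:

```properties
yarn.nodemanager.aux-services.spark_shuffle_x.remote-classpath = hdfs:///deps/spark-x-yarn-shuffle.jar
yarn.nodemanager.aux-services.spark_shuffle_y.remote-classpath = hdfs:///deps/spark-y-yarn-shuffle.jar
```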

@@ -0,0 +1,79 @@

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.deploy.yarn

import java.net.URLClassLoader

import org.apache.hadoop.yarn.conf.YarnConfiguration

import org.apache.spark._
import org.apache.spark.internal.config._
import org.apache.spark.network.yarn.{YarnShuffleService, YarnTestAccessor}
import org.apache.spark.tags.ExtendedYarnTest

/**
 * SPARK-34828: Integration test for the external shuffle service with an alternate name and
 * configs (by using a configuration overlay)
 */
@ExtendedYarnTest
class YarnShuffleAlternateNameConfigSuite extends YarnShuffleIntegrationSuite {

  private[this] val shuffleServiceName = "custom_shuffle_service_name"

  override def newYarnConfig(): YarnConfiguration = {
    val yarnConfig = super.newYarnConfig()
    yarnConfig.set(YarnConfiguration.NM_AUX_SERVICES, shuffleServiceName)
    yarnConfig.set(YarnConfiguration.NM_AUX_SERVICE_FMT.format(shuffleServiceName),
      classOf[YarnShuffleService].getCanonicalName)
    val overlayConf = new YarnConfiguration()
    // Enable authentication in the base NodeManager conf but not in the client. This would break
    // shuffle, unless the shuffle service conf overlay overrides to turn off authentication.
    overlayConf.setBoolean(NETWORK_AUTH_ENABLED.key, true)
    // Add the authentication conf to a separate config object used as an overlay rather than
    // setting it directly. This is necessary because a config overlay will override previous
    // config overlays, but not configs which were set directly on the config object.
    yarnConfig.addResource(overlayConf)
    yarnConfig
  }

  override protected def extraSparkConf(): Map[String, String] =
    super.extraSparkConf() ++ Map(SHUFFLE_SERVICE_NAME.key -> shuffleServiceName)

  override def beforeAll(): Unit = {
    val configFileContent =
      s"""<?xml version="1.0" encoding="UTF-8"?>
         |<configuration>
         |  <property>
         |    <name>${NETWORK_AUTH_ENABLED.key}</name>
         |    <value>false</value>
         |  </property>
         |</configuration>
         |""".stripMargin
    val jarFile = TestUtils.createJarWithFiles(Map(
      YarnTestAccessor.getShuffleServiceConfOverlayResourceName -> configFileContent
    ))
    // Configure a custom classloader which includes the conf overlay as a resource
    val oldClassLoader = Thread.currentThread().getContextClassLoader
    Thread.currentThread().setContextClassLoader(new URLClassLoader(Array(jarFile)))
    try {
      super.beforeAll()
    } finally {
      Thread.currentThread().setContextClassLoader(oldClassLoader)
    }
  }
}