[pull] master from apache:master #60
Merged
…String

### What changes were proposed in this pull request?
This patch adds documentation describing how clients should implement connecting to the Spark Connect endpoint. gRPC as a protocol is well documented and has many options; however, that does not make it easy for users to reason about how to configure gRPC correctly for their use cases. To overcome this, the document defines a client connection string that is parsed by the different language clients and can be used to properly configure the gRPC client.

### Why are the changes needed?
Documentation and design specification for clients implementing the Spark Connect protocol.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Doc only.

Closes #38470 from grundprinzip/client-connection.

Lead-authored-by: Martin Grund <[email protected]>
Co-authored-by: Martin Grund <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
…overage in Python client

### What changes were proposed in this pull request?
This PR tests `session.sql` in the Python client, both in the `toProto` path and the data collection path.

### Why are the changes needed?
Improve testing coverage.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
UT

Closes #38472 from amaliujia/test_sql_in_python_client.

Authored-by: Rui Wang <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
### What changes were proposed in this pull request?
Fix the typo in the doc filename: `coient` -> `client`.

### Why are the changes needed?
Fix typo.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
UT

Closes #38487 from amaliujia/follow_up_docs.

Authored-by: Rui Wang <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
…lient

### What changes were proposed in this pull request?
1. Improve testing coverage for `Union` and `UnionAll` (they are actually both `UnionAll`).
2. Add the API which does `UnionByName`.

### Why are the changes needed?
Improve API coverage.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
UT

Closes #38453 from amaliujia/python_union.

Authored-by: Rui Wang <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
…o `INVALID_IDENTIFIER`

### What changes were proposed in this pull request?
In the PR, I propose to assign the proper name `INVALID_IDENTIFIER` to the legacy error class `_LEGACY_ERROR_TEMP_0040`, and to modify the test suite to use `checkError()`, which checks the error class name, context, etc.

### Why are the changes needed?
A proper name improves the user experience with Spark SQL.

### Does this PR introduce _any_ user-facing change?
Yes, the PR changes a user-facing error message.

### How was this patch tested?
By running the modified test suites:
```
$ build/sbt "core/testOnly *SparkThrowableSuite"
$ build/sbt "test:testOnly *ErrorParserSuite"
```

Closes #38484 from MaxGekk/invalid-identifier-error-class.

Authored-by: Max Gekk <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
### What changes were proposed in this pull request?
Fix the filename in the doc: `client_connection_string.md` -> `client-connection-string.md`.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
GitHub CI

Closes #38493 from dengziming/minor-docs.

Authored-by: dengziming <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
### What changes were proposed in this pull request?
Change the plain cast in `convertToPartIdent` to `PartitioningUtils.castPartitionSpec`, so that the behavior follows `STORE_ASSIGNMENT_POLICY`.

### Why are the changes needed?
Make the v2 ALTER PARTITION code path follow `STORE_ASSIGNMENT_POLICY` in ANSI mode.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Removed the test `SPARK-40798: Alter partition should verify partition value - legacy`. This change is already covered by `SPARK-40798: Alter partition should verify partition value`.

Closes #38449 from ulysses-you/SPARK-40798-follow.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
…tElementNames in Mima for Scala 2.13

### What changes were proposed in this pull request?
This PR is a follow-up of #35594 that recovers the Mima compatibility test for Scala 2.13.

### Why are the changes needed?
To fix the broken Mima build (https://github.com/apache/spark/actions/runs/3380379538/jobs/5613108397):

```
[error] spark-core: Failed binary compatibility check against org.apache.spark:spark-core_2.13:3.3.0! Found 2 potential problems (filtered 945)
[error]  * method productElementName(Int)java.lang.String in object org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages#Shutdown does not have a correspondent in current version
[error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages#Shutdown.productElementName")
[error]  * method productElementNames()scala.collection.Iterator in object org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages#Shutdown does not have a correspondent in current version
[error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages#Shutdown.productElementNames")
```

### Does this PR introduce _any_ user-facing change?
No, dev-only.

### How was this patch tested?
CI in this PR should test it out. After that, scheduled jobs for Scala 2.13 will test this out.

Closes #38492 from HyukjinKwon/SPARK-38270-followup.

Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
… code for MergeScalarSubqueries

### What changes were proposed in this pull request?
Recently, I read `MergeScalarSubqueries` because it is a feature used to improve performance. I found the parameters of `ScalarSubqueryReference` hard to understand, so I want to add some comments on them. Additionally, the private method `supportedAggregateMerge` of `MergeScalarSubqueries` looks redundant, so this PR also simplifies the code.

### Why are the changes needed?
Improve the readability and simplify the code of `MergeScalarSubqueries`.

### Does this PR introduce _any_ user-facing change?
No. This just improves the readability and simplifies the code of `MergeScalarSubqueries`.

### How was this patch tested?
Existing tests.

Closes #38461 from beliefer/SPARK-34079_followup.

Authored-by: Jiaan Geng <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
…L status in UI
### What changes were proposed in this pull request?
Use the event `SparkListenerSQLExecutionEnd` to track whether the SQL/DataFrame execution is complete, instead of using the job status.
### Why are the changes needed?
The SQL may succeed even with some failed jobs. For example, for an inner join with one empty side and one large side, the plan can finish while jobs on the large side are still running.
### Does this PR introduce _any_ user-facing change?
Yes, it corrects the SQL status in the UI.
### How was this patch tested?
Added a test for backward compatibility and tested manually:
```sql
CREATE TABLE t1 (c1 int) USING PARQUET;
CREATE TABLE t2 USING PARQUET AS SELECT 1 AS c2;
```
```bash
./bin/spark-sql -e "SELECT /*+ merge(tmp) */ * FROM t1 JOIN (SELECT c2, java_method('java.lang.Thread', 'sleep', 10000L) FROM t2) tmp ON c1 = c2;"
```
before:
<img width="1712" alt="image" src="https://user-images.githubusercontent.com/12025282/196576790-7e4eeb29-024f-4ac3-bdec-f4e894448b57.png">
after:
<img width="1709" alt="image" src="https://user-images.githubusercontent.com/12025282/196576674-15d80366-bd42-417b-80bf-eeec0b1ef046.png">
Closes #38302 from ulysses-you/sql-end.
Authored-by: ulysses-you <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
…lve `dev/sbt-checkstyle` run failed with sbt 1.7.3

### What changes were proposed in this pull request?
This PR aims to upgrade `sbt-checkstyle-plugin` to 4.0.0 to resolve the `dev/sbt-checkstyle` failure with sbt 1.7.3. The new version also checks the generated source code, so some new suppression rules have been added to `dev/checkstyle-suppressions.xml`.

### Why are the changes needed?
#38476 reverted the sbt 1.7.3 upgrade because running `dev/sbt-checkstyle` failed:

```
[error] org.xml.sax.SAXParseException; lineNumber: 18; columnNumber: 10; DOCTYPE is disallowed when the feature "http://apache.org/xml/features/disallow-doctype-decl" set to true.
[error] at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
[error] at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
[error] at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
[error] at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
[error] at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1473)
[error] at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:914)
[error] at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
[error] at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
[error] at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:842)
[error] at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:771)
[error] at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
[error] at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
[error] at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
[error] at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:327)
[error] at scala.xml.factory.XMLLoader.parse(XMLLoader.scala:73)
[error] at scala.xml.factory.XMLLoader.loadXML(XMLLoader.scala:54)
[error] at scala.xml.factory.XMLLoader.loadXML$(XMLLoader.scala:53)
[error] at scala.xml.XML$.loadXML(XML.scala:62)
[error] at scala.xml.factory.XMLLoader.loadString(XMLLoader.scala:92)
[error] at scala.xml.factory.XMLLoader.loadString$(XMLLoader.scala:92)
[error] at scala.xml.XML$.loadString(XML.scala:62)
[error] at com.etsy.sbt.checkstyle.Checkstyle$.checkstyle(Checkstyle.scala:35)
[error] at com.etsy.sbt.checkstyle.CheckstylePlugin$autoImport$.$anonfun$checkstyleTask$1(CheckstylePlugin.scala:36)
[error] at com.etsy.sbt.checkstyle.CheckstylePlugin$autoImport$.$anonfun$checkstyleTask$1$adapted(CheckstylePlugin.scala:34)
[error] at scala.Function1.$anonfun$compose$1(Function1.scala:49)
[error] at sbt.internal.util.$tilde$greater.$anonfun$$u2219$1(TypeFunctions.scala:62)
[error] at sbt.std.Transform$$anon$4.work(Transform.scala:68)
[error] at sbt.Execute.$anonfun$submit$2(Execute.scala:282)
[error] at sbt.internal.util.ErrorHandling$.wideConvert(ErrorHandling.scala:23)
[error] at sbt.Execute.work(Execute.scala:291)
[error] at sbt.Execute.$anonfun$submit$1(Execute.scala:282)
[error] at sbt.ConcurrentRestrictions$$anon$4.$anonfun$submitValid$1(ConcurrentRestrictions.scala:265)
[error] at sbt.CompletionService$$anon$2.call(CompletionService.scala:64)
[error] at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[error] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[error] at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[error] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[error] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[error] at java.lang.Thread.run(Thread.java:748)
```

This PR upgrades `sbt-checkstyle-plugin` to 4.0.0 to resolve the above issue. https://github.com/stringbean/sbt-checkstyle-plugin was forked from etsy/sbt-checkstyle-plugin in 2022 after the latter became unmaintained; the release notes are as follows:

- https://github.com/stringbean/sbt-checkstyle-plugin/releases/tag/3.2.0
- https://github.com/stringbean/sbt-checkstyle-plugin/releases/tag/3.3.0
- https://github.com/stringbean/sbt-checkstyle-plugin/releases/tag/v4.0.0

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions
- Manual test of `dev/sbt-checkstyle` with sbt 1.7.3 and this PR: `Checkstyle checks passed.`

Closes #38481 from LuciferYang/173-pass-checkstyle.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
### What changes were proposed in this pull request?
Strip the leading `-` from the resource name prefix.

### Why are the changes needed?
A leading `-` is not allowed in a resource name prefix (in particular `spark.kubernetes.executor.podNamePrefix`).

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Unit test

Closes #38331 from tobiasstadler/fix-SPARK-40869.

Lead-authored-by: Tobias Stadler <[email protected]>
Co-authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
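A minimal Python sketch of the stripping logic (the helper name is hypothetical; the actual fix lives in Spark's Kubernetes resource-naming code, in Scala):

```python
import re

def strip_leading_dashes(prefix: str) -> str:
    # Kubernetes resource names must start with an alphanumeric character,
    # so any leading '-' characters in a generated prefix are invalid.
    return re.sub(r"^-+", "", prefix)

assert strip_leading_dashes("-spark-exec") == "spark-exec"
assert strip_leading_dashes("spark-exec") == "spark-exec"
```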
### What changes were proposed in this pull request?
This PR aims to re-upgrade sbt to 1.7.3, now that SPARK-40996 has solved the `dev/sbt-checkstyle` execution failure.

### Why are the changes needed?
The release notes are as follows; this version just updates sbt's underlying Coursier from 2.1.0-M2 to 2.1.0-M7-18-g67daad6a9:

- https://github.com/sbt/sbt/releases/tag/v1.7.3

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions
- Manual test: run `dev/sbt-checkstyle` with this PR:
```
Checkstyle checks passed.
```

Closes #38502 from LuciferYang/SPARK-40976-2.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
…tests for JDK 9+

### What changes were proposed in this pull request?
This PR is a follow-up for #38277. The change is required due to test failures on JDK 11 and JDK 17. The patch disables the unit test for JDK 9+ as a temporary measure while I am debugging and working on a fix for higher versions of JDK.

### Why are the changes needed?
Fixes the test failure on JDK 11.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
N/A.

Closes #38504 from sadikovi/fix-symlink-test.

Authored-by: Ivan Sadikov <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
…se-sensitive

### What changes were proposed in this pull request?
Add one sentence to the documentation stating that connection string parameters are case-sensitive.

### Why are the changes needed?
Dev experience.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Doc only

Closes #38501 from grundprinzip/SPARK-41001-v21.

Authored-by: Martin Grund <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
…CY_ERROR_TEMP_2201-2225

### What changes were proposed in this pull request?
This PR proposes to migrate 25 execution errors onto temporary error classes with the prefix `_LEGACY_ERROR_TEMP_2201` to `_LEGACY_ERROR_TEMP_2225`. The `_LEGACY_ERROR_TEMP_` prefix indicates dev-facing error messages that won't be exposed to end users.

### Why are the changes needed?
To speed up the error class migration. Migrating onto temporary error classes allows us to analyze the errors, so we can detect the most popular error classes.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
```
$ build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"
$ build/sbt "test:testOnly *SQLQuerySuite"
$ build/sbt -Phive-thriftserver "hive-thriftserver/testOnly org.apache.spark.sql.hive.thriftserver.ThriftServerQueryTestSuite"
```

Closes #38170 from itholic/SPARK-40540-2201-2225.

Lead-authored-by: itholic <[email protected]>
Co-authored-by: Haejoon Lee <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
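For context, entries in `core/src/main/resources/error/error-classes.json` take roughly the following shape; the class name matches this PR's range, but the message text below is illustrative, not an actual entry from the PR:

```json
{
  "_LEGACY_ERROR_TEMP_2201": {
    "message" : [
      "Illustrative dev-facing message text for a migrated execution error."
    ]
  }
}
```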
…lasses

This is a follow-up PR to #37972 and #38212.

### What changes were proposed in this pull request?
1. Move the spark-protobuf error classes to the Spark error-classes framework (`core/src/main/resources/error/error-classes.json`).
2. Support protobuf imports.
3. Validate protobuf timestamp and duration types.

### Why are the changes needed?
N/A

### Does this PR introduce _any_ user-facing change?
None

### How was this patch tested?
Existing tests should cover the validation of this PR.

CC: rangadi mposdev21 gengliangwang

Closes #38344 from SandishKumarHN/SPARK-40777-ProtoErrorCls.

Authored-by: SandishKumarHN <[email protected]>
Signed-off-by: Jungtaek Lim <[email protected]>
…trySuite

### What changes were proposed in this pull request?
This PR aims to replace `intercept` with error class checks in `InterceptorRegistrySuite`, covering:
1. CONNECT.INTERCEPTOR_CTOR_MISSING
2. CONNECT.INTERCEPTOR_RUNTIME_ERROR

### Why are the changes needed?
The changes improve the error framework.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
By running the modified test suite:
```
$ ./build/sbt "connect/testOnly *InterceptorRegistrySuite"
```

Closes #38494 from panbingkun/CONNECT_TEST.

Authored-by: panbingkun <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
…or classes

### What changes were proposed in this pull request?
This PR aims to:
A. Check error classes in `GeneratorFunctionSuite` by using `checkError()`.
B. Replace `TypeCheckFailure` by `DataTypeMismatch` in the type checks of the generator expressions, including:
1. Stack (3): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L163-L170
2. ExplodeBase (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L299
3. Inline (1): https://github.com/apache/spark/blob/1431975723d8df30a25b2333eddcfd0bb6c57677/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L441

### Why are the changes needed?
Migration onto error classes unifies Spark SQL error messages.

### Does this PR introduce _any_ user-facing change?
Yes. The PR changes user-facing error messages.

### How was this patch tested?
1. Add new UT
2. Update existing UT
3. Pass GA

Closes #38482 from panbingkun/SPARK-40749.

Authored-by: panbingkun <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
…OUT_OF_RANGE`

### What changes were proposed in this pull request?
This PR proposes to rename `_LEGACY_ERROR_TEMP_1022` to `ORDER_BY_POS_OUT_OF_RANGE`.

### Why are the changes needed?
An error class name should clearly and briefly explain the error message.

### Does this PR introduce _any_ user-facing change?
The user-facing error class name is changed.

### How was this patch tested?
```
./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite -- -z order-by-ordinal.sql"
```

Closes #38508 from itholic/SPARK-41012.

Authored-by: itholic <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
…hon Client

### What changes were proposed in this pull request?
This PR implements the connection string for Spark Connect clients according to the documentation added in #38470. With this patch it becomes possible to connect to a Spark Connect endpoint using:

```
spark = SparkRemoteSession(user_id="martin", connection_string="sc://hostname/;use_ssl=true;token=abcd")
spark.read.table("test").limit(10).toPandas()
```

The connection string is properly parsed and filtered. This allows SSL and bearer token authentication to be configured dynamically. All remaining parameters are converted into gRPC metadata pairs and submitted as part of the request.

### Why are the changes needed?
User experience.

### Does this PR introduce _any_ user-facing change?
No, experimental API.

### How was this patch tested?
UT

Closes #38485 from grundprinzip/SPARK-41001.

Lead-authored-by: Martin Grund <[email protected]>
Co-authored-by: Martin Grund <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
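As a rough illustration, parsing such a connection string can be sketched in plain Python as follows (the function name, the returned dictionary shape, and the default port are assumptions for this sketch, not the client's actual API):

```python
from urllib.parse import urlparse

def parse_connection_string(conn: str) -> dict:
    # Expected shape: sc://host[:port]/;param1=value1;param2=value2
    url = urlparse(conn)
    if url.scheme != "sc":
        raise ValueError("connection string must start with sc://")
    params = {}
    # Everything after the first ';' is a parameter; keys are case-sensitive.
    for part in conn.split(";")[1:]:
        if part:
            key, _, value = part.partition("=")
            params[key] = value
    # 15002 is assumed as the default port for this sketch.
    return {"host": url.hostname, "port": url.port or 15002, "params": params}

cfg = parse_connection_string("sc://hostname/;use_ssl=true;token=abcd")
# cfg["params"] carries use_ssl/token; leftover keys would become gRPC metadata.
```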
…r Python in Spark Connect

### What changes were proposed in this pull request?
This PR implements the client-side serialization of most Python literals into Spark Connect literals.

### Why are the changes needed?
Expanding the Python client support.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
UT

Closes #38462 from grundprinzip/SPARK-40533.

Lead-authored-by: Martin Grund <[email protected]>
Co-authored-by: Martin Grund <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
…ressions onto error classes

### What changes were proposed in this pull request?
This PR aims to replace `TypeCheckFailure` by `DataTypeMismatch` in the type checks of the aggregate expressions:
- Count
- CollectSet
- CountMinSketchAgg
- HistogramNumeric

### Why are the changes needed?
Migration onto error classes unifies Spark SQL error messages.

### Does this PR introduce _any_ user-facing change?
Yes. The PR changes user-facing error messages.

### How was this patch tested?
Pass GitHub Actions

Closes #38498 from LuciferYang/SPARK-40769.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
…plied if necessary

### What changes were proposed in this pull request?
This is a follow-up of #38151 to fix a perf issue. When a struct/array/map doesn't contain a char type field, we should not recreate the struct/array/map for nothing.

### Why are the changes needed?
Fix a perf issue.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
New test

Closes #38479 from cloud-fan/char.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Gengliang Wang <[email protected]>
…on scala 2.13

### What changes were proposed in this pull request?
In `FetchBlockRequest` we currently store a `Seq[FetchBlockInfo]`; as part of the function `isRemoteAddressMaxedOut` (probably other places as well, but this is the function that showed up in my profiling) we use the length of this Seq. In Scala 2.12, `Seq` is an alias for `scala.collection.Seq`, but in 2.13 it is an alias for `scala.collection.immutable.Seq`. This means that when, for example, we call `toSeq` on an `ArrayBuffer`, in 2.12 we do nothing and the `blocks` in the `FetchRequest` are backed by something with a cheap `length`, but in 2.13 we end up copying the data to a `List` with an O(n) length function.

This PR solves this by changing the `Seq` to an `IndexedSeq`, thereby making the expectation of a cheap length function explicit. This means that in some places we will do an extra copy in Scala 2.13 compared to 2.12 (this was also the case before this PR). If we wanted to avoid this copy, we should instead change it to use `scala.collection.IndexedSeq`, so that we would have the same types in both 2.13 and 2.12.

### Why are the changes needed?
The performance of `ShuffleBlockFetcherIterator` is much worse on Scala 2.13 than on 2.12. We have seen cases where the overhead of repeatedly calculating the length is as much as 20% of CPU time (and it could probably be even worse for larger shuffles).

### Does this PR introduce _any_ user-facing change?
No. I think the interface changes are only on private classes.

### How was this patch tested?
Existing specs.

Closes #38427 from eejbyfeldt/SPARK-40950.

Authored-by: Emil Ejbyfeldt <[email protected]>
Signed-off-by: Mridul <mridul<at>gmail.com>
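The underlying cost difference can be sketched in Python, with a cons-cell list standing in for Scala 2.13's `scala.collection.immutable.List` (O(n) `length`) and a plain Python list standing in for an array-backed `IndexedSeq` (O(1) `len`); the class and function names here are invented for the illustration:

```python
class Cons:
    # Minimal cons-cell linked list, analogous to Scala's immutable List.
    def __init__(self, head, tail=None):
        self.head, self.tail = head, tail

def linked_length(node) -> int:
    # O(n): every length query walks the whole chain, which is what
    # repeated length checks pay on Scala 2.13's List.
    n = 0
    while node is not None:
        n, node = n + 1, node.tail
    return n

xs = None
for i in range(1000):
    xs = Cons(i, xs)
indexed = list(range(1000))  # len(indexed) is O(1)

assert linked_length(xs) == len(indexed) == 1000
```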
pull bot pushed a commit that referenced this pull request on Nov 22, 2024:
…ead pool

### What changes were proposed in this pull request?
This PR aims to use a meaningful class name prefix for the REST Submission API thread pool instead of the default value of Jetty's QueuedThreadPool, `"qtp" + super.hashCode()`.

https://github.com/dekellum/jetty/blob/3dc0120d573816de7d6a83e2d6a97035288bdd4a/jetty-util/src/main/java/org/eclipse/jetty/util/thread/QueuedThreadPool.java#L64

### Why are the changes needed?
This is helpful during JVM investigation.

**BEFORE (4.0.0-preview2)**
```
$ SPARK_MASTER_OPTS='-Dspark.master.rest.enabled=true' sbin/start-master.sh
$ jstack 28217 | grep qtp
"qtp1925630411-52" #52 daemon prio=5 os_prio=31 cpu=0.07ms elapsed=19.06s tid=0x0000000134906c10 nid=0xde03 runnable [0x0000000314592000]
"qtp1925630411-53" #53 daemon prio=5 os_prio=31 cpu=0.05ms elapsed=19.06s tid=0x0000000134ac6810 nid=0xc603 runnable [0x000000031479e000]
"qtp1925630411-54" #54 daemon prio=5 os_prio=31 cpu=0.06ms elapsed=19.06s tid=0x000000013491ae10 nid=0xdc03 runnable [0x00000003149aa000]
"qtp1925630411-55" #55 daemon prio=5 os_prio=31 cpu=0.08ms elapsed=19.06s tid=0x0000000134ac9810 nid=0xc803 runnable [0x0000000314bb6000]
"qtp1925630411-56" #56 daemon prio=5 os_prio=31 cpu=0.04ms elapsed=19.06s tid=0x0000000134ac9e10 nid=0xda03 runnable [0x0000000314dc2000]
"qtp1925630411-57" #57 daemon prio=5 os_prio=31 cpu=0.05ms elapsed=19.06s tid=0x0000000134aca410 nid=0xca03 runnable [0x0000000314fce000]
"qtp1925630411-58" #58 daemon prio=5 os_prio=31 cpu=0.04ms elapsed=19.06s tid=0x0000000134acaa10 nid=0xcb03 runnable [0x00000003151da000]
"qtp1925630411-59" #59 daemon prio=5 os_prio=31 cpu=0.06ms elapsed=19.06s tid=0x0000000134acb010 nid=0xcc03 runnable [0x00000003153e6000]
"qtp1925630411-60-acceptor-0108e9815-ServerConnector1e497474{HTTP/1.1, (http/1.1)}{M3-Max.local:6066}" #60 daemon prio=3 os_prio=31 cpu=0.11ms elapsed=19.06s tid=0x00000001317ffa10 nid=0xcd03 runnable [0x00000003155f2000]
"qtp1925630411-61-acceptor-11d90f2aa-ServerConnector1e497474{HTTP/1.1, (http/1.1)}{M3-Max.local:6066}" #61 daemon prio=3 os_prio=31 cpu=0.10ms elapsed=19.06s tid=0x00000001314ed610 nid=0xcf03 waiting on condition [0x00000003157fe000]
```

**AFTER**
```
$ SPARK_MASTER_OPTS='-Dspark.master.rest.enabled=true' sbin/start-master.sh
$ jstack 28317 | grep StandaloneRestServer
"StandaloneRestServer-52" #52 daemon prio=5 os_prio=31 cpu=0.09ms elapsed=60.06s tid=0x00000001284a8e10 nid=0xdb03 runnable [0x000000032cfce000]
"StandaloneRestServer-53" #53 daemon prio=5 os_prio=31 cpu=0.06ms elapsed=60.06s tid=0x00000001284acc10 nid=0xda03 runnable [0x000000032d1da000]
"StandaloneRestServer-54" #54 daemon prio=5 os_prio=31 cpu=0.05ms elapsed=60.06s tid=0x00000001284ae610 nid=0xd803 runnable [0x000000032d3e6000]
"StandaloneRestServer-55" #55 daemon prio=5 os_prio=31 cpu=0.09ms elapsed=60.06s tid=0x00000001284aec10 nid=0xd703 runnable [0x000000032d5f2000]
"StandaloneRestServer-56" #56 daemon prio=5 os_prio=31 cpu=0.06ms elapsed=60.06s tid=0x00000001284af210 nid=0xc803 runnable [0x000000032d7fe000]
"StandaloneRestServer-57" #57 daemon prio=5 os_prio=31 cpu=0.05ms elapsed=60.06s tid=0x00000001284af810 nid=0xc903 runnable [0x000000032da0a000]
"StandaloneRestServer-58" #58 daemon prio=5 os_prio=31 cpu=0.06ms elapsed=60.06s tid=0x00000001284afe10 nid=0xcb03 runnable [0x000000032dc16000]
"StandaloneRestServer-59" #59 daemon prio=5 os_prio=31 cpu=0.05ms elapsed=60.06s tid=0x00000001284b0410 nid=0xcc03 runnable [0x000000032de22000]
"StandaloneRestServer-60-acceptor-04aefbaa8-ServerConnector44284d85{HTTP/1.1, (http/1.1)}{M3-Max.local:6066}" #60 daemon prio=3 os_prio=31 cpu=0.13ms elapsed=60.05s tid=0x000000015cda1a10 nid=0xcd03 runnable [0x000000032e02e000]
"StandaloneRestServer-61-acceptor-148976251-ServerConnector44284d85{HTTP/1.1, (http/1.1)}{M3-Max.local:6066}" #61 daemon prio=3 os_prio=31 cpu=0.12ms elapsed=60.05s tid=0x000000015cd1c810 nid=0xce03 waiting on condition [0x000000032e23a000]
```

### Does this PR introduce _any_ user-facing change?
No, the thread names are only accessed during debugging.

### How was this patch tested?
Manual review.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#48924 from dongjoon-hyun/SPARK-50385.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: panbingkun <[email protected]>
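The same idea exists across thread-pool APIs; as a hedged Python analogy (not Spark code), `concurrent.futures.ThreadPoolExecutor` exposes a `thread_name_prefix` that plays the role the Jetty pool name plays in the `jstack` output above:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Named pool threads show up with a recognizable prefix in thread dumps,
# just like "StandaloneRestServer-NN" instead of the default "qtp<hash>-NN".
pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="StandaloneRestServer")
worker_name = pool.submit(lambda: threading.current_thread().name).result()
pool.shutdown()
```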