[HOTFIX] Wait for EOF only for the PySpark shell #2170
Conversation
Otherwise the application simply won't exit unless we manually send an EOF.
QA tests have started for PR 2170 at commit
Tested the shell and SparkPi in both Scala and Python locally on OS X and Windows. All subprocesses are confirmed to exit as expected.
QA tests have finished for PR 2170 at commit
Test failures are not related. Jenkins, retest this please.
The SparkSubmit test cases are flaky...
QA tests have started for PR 2170 at commit
Yup, I'm looking into it...
QA tests have finished for PR 2170 at commit
Okay thanks - I'll merge this.
In `SparkSubmitDriverBootstrapper`, we wait for the parent process to send us an `EOF` before finishing the application. This is applicable to the PySpark shell because we terminate the application the same way. However, if we run a Python application, for instance, the JVM never exits unless it receives a manual EOF from the user. This is causing a few tests to time out.

We only need to do this for the PySpark shell because spark-submit runs as a Python subprocess only in this case. Thus, the normal Spark shell doesn't need to go through this case even though it is also a REPL.

Thanks @davies for reporting this.

Author: Andrew Or <[email protected]>

Closes #2170 from andrewor14/bootstrap-hotfix and squashes the following commits:

42963f5 [Andrew Or] Do not wait for EOF unless this is the pyspark shell

(cherry picked from commit dafe343)
Signed-off-by: Patrick Wendell <[email protected]>
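The stdin/EOF handshake the commit message describes can be illustrated with a small standalone sketch (this is not Spark's actual code; the child program and names here are purely illustrative). The child plays the role of the bootstrapper, blocking on stdin until the parent closes it, which mirrors why a non-shell Python application would hang forever if the bootstrapper always waited for EOF:

```python
# Illustrative sketch of the EOF handshake between a parent shell and a
# subprocess that waits for stdin EOF before exiting (not actual Spark code).
import subprocess
import sys

# Child program: drain stdin until EOF, then exit -- like a bootstrapper
# that only finishes once the parent shell closes the pipe.
child_code = "import sys; sys.stdin.read(); print('got EOF, exiting')"

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

# Without this close(), the child would block on stdin indefinitely --
# the hang this hotfix avoids for non-shell Python applications.
proc.stdin.close()

out = proc.stdout.read()
proc.wait()
print(out.strip())  # got EOF, exiting
```

The PySpark shell can rely on this handshake because it launches spark-submit as its own subprocess and closes the pipe on exit; a plain Python application never closes the pipe, which is why the wait must be conditional on the shell case.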