Conversation

@holdenk (Contributor) commented Aug 24, 2017

What changes were proposed in this pull request?

Design document: https://docs.google.com/document/d/1bC2sxHoF3XbAvUHQebpylAktH6B3PSTVAGIOCYj0Mbg/edit?usp=sharing

Keep track of nodes which are going to be shut down, to prevent scheduling new tasks on them. The PR is designed with spot instances in mind, where there is some notice (depending on the cloud vendor) that the node will be shut down.
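
As an illustration of the idea, here is a minimal sketch (not this PR's actual classes; `ToyScheduler`, `hostDecommissioning`, and `selectHostsForNewTasks` are made-up names) of a scheduler that remembers decommissioning hosts and avoids placing new tasks on them:

```scala
import scala.collection.mutable

// Minimal sketch of the scheduling-side idea: remember which hosts are
// decommissioning and avoid placing new tasks on them. All names here are
// illustrative, not the classes introduced by this PR.
class ToyScheduler {
  private val decommissioningHosts = mutable.Set.empty[String]

  // Called when the cluster manager / cloud provider signals that a host
  // will be shut down soon.
  def hostDecommissioning(host: String): Unit = synchronized {
    decommissioningHosts += host
  }

  // New tasks are only offered to hosts that are not about to go away;
  // tasks already running on a decommissioning host are left alone for now.
  def selectHostsForNewTasks(candidates: Seq[String]): Seq[String] = synchronized {
    candidates.filterNot(decommissioningHosts.contains)
  }
}
```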

Since Kubernetes has a first-class notion of pod shutdown and grace periods, decommissioning support is available on Kubernetes. For other deployments, it is left to the instance to notify the worker(s) of decommissioning with SIGPWR.
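
For the non-Kubernetes path, a minimal sketch of how a worker process might register a SIGPWR handler (Linux-specific, and using `sun.misc.Signal`; the `decommissionSelf` callback is a placeholder, not this PR's actual method):

```scala
import sun.misc.{Signal, SignalHandler}

object SigPwrExample {
  // Register a handler so an external agent (e.g. a spot-instance
  // termination notifier) can tell this process to start decommissioning.
  // Linux-specific: SIGPWR is not available on all platforms.
  def install(decommissionSelf: () => Unit): Unit = {
    Signal.handle(new Signal("PWR"), new SignalHandler {
      override def handle(sig: Signal): Unit = {
        // Keep the handler cheap: just flip state; real work happens elsewhere.
        decommissionSelf()
      }
    })
  }

  def main(args: Array[String]): Unit = {
    @volatile var decommissioned = false
    install(() => decommissioned = true)
    while (!decommissioned) Thread.sleep(1000) // stand-in for the worker's main loop
    println("Received SIGPWR; no longer accepting new work.")
  }
}
```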

SPARK-20628 is a sub-task of SPARK-20624, with follow-up tasks to perform migration of data and re-launching of tasks. SPARK-20628 is distinct from other mechanisms where Spark itself controls executor decommissioning; however, the later follow-up tasks in SPARK-20624 should be usable across both voluntary and involuntary termination (e.g. #19041 could provide a good mechanism for copying data during involuntary termination).

How was this patch tested?

Extension of AppClientSuite to cover decommissioning, and addition of an explicit worker decommissioning suite.

Areas of future work:

@SparkQA commented Aug 25, 2017

Test build #81103 has finished for PR 19045 at commit 65a29c1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987 (Contributor) commented:

Are you still working on this? @holdenk

@holdenk (Contributor, Author) commented Nov 8, 2017

I'll bump it if anyone has a chance to review, but I think we'll see how #19267 (comment) plays out first.

@holdenk (Contributor, Author) commented Dec 1, 2017

So it seems like the YARN changes are only going to happen in Hadoop 3+, so this might make sense regardless of what happens in #19267 (since folks like K8s, or whoever, can send the message as desired).

@holdenk (Contributor, Author) commented Aug 13, 2018

Chatted with some K8s folks and I'll revive this PR with that in mind.

@SparkQA commented Aug 27, 2018

Test build #95306 has finished for PR 19045 at commit c40fac5.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@holdenk (Contributor, Author) commented Sep 8, 2018

cc @ifilonenko: it's super WIP, but since you joined me on the stream where I was working on reviving this, I thought it would be good to get your early comments (especially if you have any suggestions around making effective integration tests for this).

@SparkQA commented Sep 8, 2018

Test build #95837 has finished for PR 19045 at commit 0ba0ca5.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Sep 8, 2018

Test build #95838 has finished for PR 19045 at commit 5877c16.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Mar 14, 2019

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/8886/

@shaneknapp (Contributor) commented:

FYI... the k8s integration test failure was caused by this:
https://issues.apache.org/jira/browse/SPARK-27178

I have a fix ready to go, but am still wondering why this suddenly popped up. :(

@holdenk (Contributor, Author) commented May 7, 2019

Thanks @shaneknapp :)

@SparkQA commented May 7, 2019

Test build #105181 has finished for PR 19045 at commit 09a01cf.

  • This patch fails RAT tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented May 7, 2019

Test build #105184 has finished for PR 19045 at commit 55fa260.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented May 7, 2019

Test build #105183 has finished for PR 19045 at commit 9a5000d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Sep 16, 2019

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/15793/

@SparkQA commented Sep 16, 2019

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/15793/

@holdenk (Contributor, Author) commented Nov 8, 2019

I've created a new pull request and design doc with some feedback and general updates, since Spark has shifted a lot. We can continue the discussion in #26440.

@holdenk closed this Nov 8, 2019
asfgit pushed a commit that referenced this pull request Feb 14, 2020
…emption support

This PR is based on an existing/previous PR: #19045.

### What changes were proposed in this pull request?

This change adds a decommissioning state that we can enter when the cloud provider/scheduler lets us know we aren't going to be removed immediately, but instead will be removed soon. This concept fits nicely in K8s and also with spot instances on AWS and preemptible instances, all of which can give us notice that our host is going away. For now we simply stop scheduling jobs; in the future we could perform some kind of data migration during scale-down, or at least stop accepting new blocks to cache.

There is a design document at https://docs.google.com/document/d/1xVO1b6KAwdUhjEJBolVPl9C6sLj7oOveErwDSYdT-pE/edit?usp=sharing

### Why are the changes needed?

With the broader move to preemptible multi-tenancy, serverless environments, and spot instances, better handling of node scale-down is required.

### Does this PR introduce any user-facing change?

There is no API change; however, an additional configuration flag is added to enable/disable this behaviour.
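
A possible usage sketch, assuming the flag ends up with a name like `spark.worker.decommission.enabled` (that key is an assumption here, not confirmed by this commit message; check the merged documentation for the final name):

```scala
import org.apache.spark.SparkConf

object DecommissionConfigExample {
  // Hypothetical configuration key, shown for illustration only; in a
  // standalone deployment this would more naturally be set in
  // spark-defaults.conf (or the worker's environment) than in app code.
  val conf: SparkConf = new SparkConf()
    .setAppName("decommission-demo")
    .set("spark.worker.decommission.enabled", "true")

  def main(args: Array[String]): Unit =
    println(conf.get("spark.worker.decommission.enabled"))
}
```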

### How was this patch tested?

New integration tests in the Spark K8s integration testing. Extension of the AppClientSuite to test decommissioning separately from K8s.

Closes #26440 from holdenk/SPARK-20628-keep-track-of-nodes-which-are-going-to-be-shutdown-r4.

Lead-authored-by: Holden Karau <[email protected]>
Co-authored-by: Holden Karau <[email protected]>
Signed-off-by: Holden Karau <[email protected]>
sjincho pushed a commit to sjincho/spark that referenced this pull request Apr 15, 2020
…emption support

holdenk added a commit to holdenk/spark that referenced this pull request Jun 25, 2020
…emption support

holdenk added a commit to holdenk/spark that referenced this pull request Oct 27, 2020
…emption support
