
Commit ffe256c

mikekap authored and Felix Cheung committed
[SPARK-25730][K8S] Delete executor pods from kubernetes after figuring out why they died
## What changes were proposed in this pull request?

`removeExecutorFromSpark` tries to fetch the reason the executor exited from Kubernetes, which may be useful if the pod was OOMKilled. However, the code previously deleted the pod from Kubernetes first, which made retrieving this status impossible. This patch fixes the ordering.

On a separate but related note, it would be nice to wait some time before removing the pod, to let the operator examine logs and such.

## How was this patch tested?

Running on my local cluster.

Author: Mike Kaplinskiy <[email protected]>

Closes #22720 from mikekap/patch-1.
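The ordering issue is easy to see in isolation. Below is a minimal, self-contained Scala sketch, not Spark's actual code: the helper names and the in-memory map are illustrative stand-ins for the two methods the commit reorders.

```scala
// Illustrative stand-ins for the two helpers the commit reorders;
// the "cluster" here is just a map from pod name to termination reason.
object ExitReasonOrdering {
  private var clusterState = Map("exec-1" -> "OOMKilled")

  // Stand-in for removeExecutorFromK8s: deletes the pod record.
  private def deletePod(name: String): Unit = clusterState -= name

  // Stand-in for the status lookup done inside removeExecutorFromSpark.
  private def lookUpExitReason(name: String): String =
    clusterState.getOrElse(name, "unknown (pod already deleted)")

  def main(args: Array[String]): Unit = {
    // Old order: delete first, then ask why the executor died.
    //   deletePod("exec-1")
    //   lookUpExitReason("exec-1")  // "unknown (pod already deleted)"

    // Fixed order: capture the reason while the pod still exists, then delete.
    println(lookUpExitReason("exec-1")) // prints "OOMKilled"
    deletePod("exec-1")
  }
}
```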
1 parent c77aa42 · commit ffe256c

1 file changed: +1 −1

resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala

Lines changed: 1 addition & 1 deletion

```diff
@@ -112,8 +112,8 @@ private[spark] class ExecutorPodsLifecycleManager(
       execId: Long,
       schedulerBackend: KubernetesClusterSchedulerBackend,
       execIdsRemovedInRound: mutable.Set[Long]): Unit = {
-    removeExecutorFromK8s(podState.pod)
     removeExecutorFromSpark(schedulerBackend, podState, execId)
+    removeExecutorFromK8s(podState.pod)
     execIdsRemovedInRound += execId
   }
```
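For context on what the status lookup can recover once the ordering is correct: with the fabric8 Kubernetes client used by Spark's K8s backend, a terminated container's exit reason (e.g. `OOMKilled`) lives on the pod's container statuses. The sketch below is a hedged illustration, not Spark's actual lookup; the function name `describeExitReason` is hypothetical.

```scala
import io.fabric8.kubernetes.api.model.Pod
import scala.collection.JavaConverters._

// Hypothetical helper: summarizes why a pod's container terminated.
// Once the pod is deleted from the API server this information is gone,
// which is exactly why the commit deletes the pod last.
def describeExitReason(pod: Pod): String = {
  Option(pod.getStatus)
    .map(_.getContainerStatuses.asScala.toSeq)
    .getOrElse(Seq.empty)
    .flatMap(cs => Option(cs.getState).flatMap(s => Option(s.getTerminated)))
    .headOption
    .map(t => s"terminated: reason=${t.getReason}, exitCode=${t.getExitCode}")
    .getOrElse("no termination status available")
}
```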