Submission V2 - Use dependency server and init-container to start Spark applications #227
Conversation
Work in progress - there are still some tests that should be added here.

Integration tests are failing here because Minikube is not downloaded. I'm confused as to why Minikube is not being downloaded before the tests are run. I checked the logs, and it looks like the integration tests are being run by the scalatest:test Maven plugin. Also, I'm not sure why the tests are only failing now when they were successful beforehand. I've changed the structure of the tests here, though, so that might have something to do with it. I'll take a look into this. @kimoonkim any thoughts on what could be going on here?
Fixed - the problem was the test wasn't named properly. |
I noticed the pattern of some abstract traits having a single implementation, e.g. MountedDependencyManagerProvider or SubmissionKubernetesClientProvider. It looks to be similar to a pattern in the broader Spark code. Is it anticipating that these might have multiple alternate implementations in the future, or (not mutually exclusive) conforming to a design idiom?
It's primarily for unit testing, so that we can inject custom implementations and thus test each class in isolation. We could use Mockito to mock classes, but I believe the correct idiom is to inject interfaces/traits.
At least that's how we're using it here - not sure if that's the intended purpose of the idiom's usage in the broader Spark code.
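For illustration, a minimal sketch of that injection pattern (the trait name is borrowed from the PR, but the method signatures and the consuming class here are guesses, not the actual interfaces in this change):

```scala
import io.fabric8.kubernetes.client.{DefaultKubernetesClient, KubernetesClient}

// A trait with a single production implementation...
trait SubmissionKubernetesClientProvider {
  def get: KubernetesClient
}

class DefaultSubmissionKubernetesClientProvider(master: String)
    extends SubmissionKubernetesClientProvider {
  override def get: KubernetesClient = new DefaultKubernetesClient(master)
}

// ...lets the class under test receive a hand-written fake in unit tests,
// instead of constructing a real client inside the class being tested.
class Client(kubernetesClientProvider: SubmissionKubernetesClientProvider) {
  def run(): Unit = {
    val kubernetesClient = kubernetesClientProvider.get
    // build and create the driver pod with kubernetesClient ...
  }
}
```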
Traits are definitely the better Scala idiom.
I like that jars are decoupled from data 👍
Is there a known performance impact of spinning up & running init containers?
I'm not sure what an alternative would look like, except maybe logic on the driver container itself to pull from the server.
One thing that's been concerning in my local testing is that the init-container can take a while to run. I logged into the init container and saw that the jars had already been downloaded to disk, but the init container was still running and not giving way to the main application. I haven't been able to diagnose why this is happening.
But I prefer this to be an init-container, provided that we can understand and mitigate the performance implications, as it decouples the execution of the driver from the downloading of the dependencies. Having this be an init-container makes the driver's Dockerfile simply an execution of the user's main class, as opposed to needing a Spark-specific adapter main class.
Agreed, I like init containers, in the sense of being very idiomatic to kube, and also not requiring that logic to live on the user container images. The flip side is that they seem a bit heavy-weight from the standpoint of having to instantiate the image, run it, etc. Maybe @foxish has insights on this.
I took a closer look and the JVM was still running the init-container even after the files were downloaded. However, I made a change to make retrofit clients use daemon threads. The JVM cleaned up far more quickly as a result of this change, making me think that there was some sort of dangling non-daemon thread somewhere - perhaps a thread in a cached thread pool that was only cleaned up after a period of inactivity.
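For reference, a rough sketch of one way to put the Retrofit/OkHttp clients on daemon threads (the object name and wiring here are illustrative, not necessarily what this PR does):

```scala
import java.util.concurrent.{Executors, ThreadFactory}

import okhttp3.{Dispatcher, OkHttpClient}
import retrofit2.Retrofit

object DaemonThreadRetrofitClients {
  // All OkHttp dispatcher threads are daemons, so an idle pooled thread
  // cannot keep the init-container JVM alive after main() has returned.
  private val daemonThreadFactory = new ThreadFactory {
    override def newThread(runnable: Runnable): Thread = {
      val thread = new Thread(runnable, "okhttp-dispatcher")
      thread.setDaemon(true)
      thread
    }
  }

  def buildRetrofit(baseUrl: String): Retrofit = {
    val okHttpClient = new OkHttpClient.Builder()
      .dispatcher(new Dispatcher(Executors.newCachedThreadPool(daemonThreadFactory)))
      .build()
    new Retrofit.Builder()
      .baseUrl(baseUrl)
      .client(okHttpClient)
      .build()
  }
}
```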
There's still some more we can test here, particularly to check the driver container is picking up the resolved jars in the environment variables.
Tests that are still missing:
- Edit: I thought this was broken, but we can still test with
Nice work! Had a bunch of minor stuff, but I like the direction and think this is going to have a really nice effect on resiliency when we're eventually able to restart the driver pod on failure. Also the streaming requests and responses should resolve the OOMs we were seeing earlier with the rest service approach.
Let me know when you've got the remaining tests in and are ready for re-review.
some javadoc on here to explain what this is?
My understanding is this is the submission client that launches an application inside a kubernetes cluster by uploading any local resources to a resource staging server
localUriStringsToPaths -> Files
javadoc for what this return parameter means (the secret required to retrieve the files) and for uploadJars too
spark-driver-init? Thinking about the case where there are somehow other init containers on the pod.
here we add the Spark init container after the existing ones specified by the user. Why after instead of before? Does it need to be configurable?
We can wait until the use case comes up to make it configurable.
is this supposed to be lowercased init-container? does it matter?
It's only a comment written to the properties string, so we don't have to be too concerned about the case.
this doesn't support multiple files with the same filename; I think we handle this at upload time already though
In the upload we rename the files with deduplication counters - we should probably do the same thing here.
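A small sketch of the kind of deduplication being discussed (purely illustrative; the actual counter scheme used on upload may differ):

```scala
import java.io.File

import scala.collection.mutable

object FileNameDeduplication {
  // Appends a numeric suffix to colliding file names, e.g. app.jar, app-1.jar, app-2.jar,
  // so that two entries with the same name cannot overwrite each other on disk.
  def deduplicateFileNames(files: Seq[File]): Map[File, String] = {
    val seenCounts = mutable.Map.empty[String, Int].withDefaultValue(0)
    files.map { file =>
      val count = seenCounts(file.getName)
      seenCounts(file.getName) = count + 1
      val dotIndex = file.getName.lastIndexOf('.')
      val resolvedName = if (count == 0) {
        file.getName
      } else if (dotIndex > 0) {
        val (base, extension) = file.getName.splitAt(dotIndex)
        s"$base-$count$extension"
      } else {
        s"${file.getName}-$count"
      }
      file -> resolvedName
    }.toMap
  }
}
```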
we're adding two more copies of this method in this PR, and I think we've already got it once. Do we need a library jar to be reused between places, or is copying just this one method OK?
put some logging in here. Checkpoints I'd love to see:
- "Running" (first log line in main method showing when our code starts running)
- download jar/files each started
- download jar/files each finished
- "Finished" (when we're done and return from the main method)
Done
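For reference, a rough sketch of the requested checkpoints in the init-container entry point (the package, object name, and helper methods here are hypothetical):

```scala
package org.apache.spark.deploy.rest.kubernetes

import org.apache.spark.internal.Logging

// Hypothetical entry point; the log lines mirror the checkpoints asked for above.
object KubernetesSparkDependencyDownloader extends Logging {
  def main(args: Array[String]): Unit = {
    logInfo("Running: init-container starting dependency download.")

    logInfo("Starting download of application jars.")
    downloadJars()
    logInfo("Finished download of application jars.")

    logInfo("Starting download of application files.")
    downloadFiles()
    logInfo("Finished download of application files.")

    logInfo("Finished: init-container exiting.")
  }

  private def downloadJars(): Unit = { /* fetch jars from the resource staging server */ }
  private def downloadFiles(): Unit = { /* fetch files from the resource staging server */ }
}
```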
this seems familiar -- is it also a copied method for use in the init container?
No - this is only used in the driver pod. It has different conventions on the configuration parameters it looks for and whether or not to use a service account token.
downloading files and jars can probably be done in parallel in a followup to this PR
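A possible shape for that follow-up, sketched with Scala futures (purely illustrative; the names and timeout are made up):

```scala
import java.util.concurrent.Executors

import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ParallelDependencyDownload {
  // Run the jar and file downloads concurrently and block until both finish.
  def downloadAllDependencies(downloadJars: () => Unit, downloadFiles: () => Unit): Unit = {
    val executor = Executors.newFixedThreadPool(2)
    implicit val downloadContext: ExecutionContext = ExecutionContext.fromExecutorService(executor)
    try {
      val downloads = Future.sequence(Seq(Future(downloadJars()), Future(downloadFiles())))
      Await.result(downloads, 10.minutes)
    } finally {
      executor.shutdown()
    }
  }
}
```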
Javadoc here for what this does -- I think the salient points are that it starts the ResourceStagingServer as a kubernetes pod and puts a NodePort service around the pod
Couple minor changes then I think I'm good. Haven't fully digested the testing situation yet, but they seem to be running and passing so that's good.
does this hide the driver java options from the Environment tab of the Spark UI? I think we want that tab to not be missing user-provided conf info, even if we have to leave otherwise ignored config in the SparkConf
The problem is that since we pass these in verbatim to the driver JVM, we would have to escape these in order for the Java invocation to treat this properly as a single string.
There might be a way to do that but would prefer to address in a follow-up.
kind of odd to delimit this with the path separator / but it makes sense -- it's the one character prohibited from being in a filename so is safe to delimit a list of files with
Actually this is for making the mounted classpath be able to be passed in directly as the classpath argument. The driver dockerfile just runs java... -cp $CLASSPATH where $CLASSPATH is a concatenation of a bunch of environment variables; each environment variable must itself be a valid path-separated classpath string.
Also, the path separator I believe is :, not /.
oh you're right, mixed up separator and delimiter
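To illustrate the classpath point above, a sketch of how the mounted jar paths might be joined so that the environment variable is itself a valid classpath fragment (the paths and variable names are illustrative):

```scala
import java.io.File

object MountedClasspathExample {
  // Jar paths under the shared volume; the actual mount point in the PR may differ.
  val mountedJars = Seq(
    "/var/spark-data/spark-jars/app.jar",
    "/var/spark-data/spark-jars/dependency.jar")

  // Joined with the platform path separator (":" on Linux), so the value can be used
  // directly as one piece of `java -cp "$SPARK_MOUNTED_CLASSPATH:$OTHER_CLASSPATH" ...`
  // in the driver Dockerfile.
  val mountedClasspath: String = mountedJars.mkString(File.pathSeparator)
  // => /var/spark-data/spark-jars/app.jar:/var/spark-data/spark-jars/dependency.jar
}
```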
javadoc on this -- it's an implementation of MountedDependencyManager that is backed by a ResourceStagingService
can this take an InputStream instead of a File? In that case we may not have to create the tarball on disk before submitting -- we could create the tarball and stream out to the server all in one step!
Unfortunately not - RequestBody.create only has method signatures for passing a byte array, a string, and a file. I think streaming is going to be difficult here because RequestBody implementations must know the length of the content a priori.
Oh, given the length requirement we would indeed have problems. Let's make any changes to this flow in a follow-up.
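For context, roughly what the file-based upload looks like with OkHttp's RequestBody (a sketch only; the tarball path and media type are illustrative, and the actual upload interface in this PR may differ):

```scala
import java.io.File

import okhttp3.{MediaType, RequestBody}

object UploadRequestBodyExample {
  // RequestBody.create accepts a File (or a byte array / String), which is why the jars
  // are first packed into a tarball on disk before being sent to the staging server.
  val jarsTarball = new File("/tmp/uploaded-jars.tar.gz")
  val jarsRequestBody: RequestBody =
    RequestBody.create(MediaType.parse("application/octet-stream"), jarsTarball)
}
```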
are you planning to do the executor V2 change in a separate PR? This one's already plenty big enough, I lose attention halfway through the 2k+ lines
Yep, separate change.
@mccheah all three pre-reqs of this are now in -- re-aim away!
(Branch force-pushed from 1f7cfc2 to 09f3763.)
Assuming no changes between the version I reviewed last night and the squashed change you just pushed, I'm ready to merge.
@erikerlandson @foxish @tnachen any thoughts on this submission V2 server now that this shows how it's used in practice?
Taking a quick look now
private[spark] val MEMORY_OVERHEAD_MIN = 384L

// V2 submission init container
private[spark] val INIT_CONTAINER_ANNOTATION = "pod.beta.kubernetes.io/init-containers"
The init-container annotation should not be necessary anymore as of k8s 1.6: they're a first-class field (spec.InitContainers). The annotation will be deprecated in future releases.
The fabric8 library doesn't support this field at the moment. Should we file an issue with the fabric8 project to support it?
Something we haven't discussed much yet is the compatibility range the spark-k8s project has across different versions of k8s. Maybe we try to make this a rolling window and support both the latest and previous releases, which currently would be 1.6 and 1.5.
In that case we should use the beta annotations until k8s 1.7 is released.
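A rough sketch of attaching the init-container via the beta annotation with the fabric8 client (the container contents and JSON serialization here are illustrative, not necessarily the exact code in this PR):

```scala
import com.fasterxml.jackson.databind.ObjectMapper
import io.fabric8.kubernetes.api.model.{Container, ContainerBuilder, Pod, PodBuilder}

object InitContainerAnnotation {
  val INIT_CONTAINER_ANNOTATION = "pod.beta.kubernetes.io/init-containers"

  // Build the init-container spec and serialize it as the JSON array the beta annotation
  // expects, since fabric8 does not yet expose spec.initContainers directly.
  def withSparkInitContainer(pod: Pod): Pod = {
    val initContainer: Container = new ContainerBuilder()
      .withName("spark-init")
      .withImage("spark-init:latest")
      .build()
    val initContainersJson = new ObjectMapper().writeValueAsString(Array(initContainer))
    new PodBuilder(pod)
      .editOrNewMetadata()
        .addToAnnotations(INIT_CONTAINER_ANNOTATION, initContainersJson)
      .endMetadata()
      .build()
  }
}
```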
I'll open up an issue against fabric8. That's a good point on compatibility. Supporting the current version and the previous one seems like a good start (we should formalize and capture this somewhere in the documentation).
The beta annotation LGTM. I do want to take a closer look at the other k8s specific parts of this PR.
Can we hold off on merging till Monday morning?
Sure thing -- looking forward to your detailed review!
@mccheah, how does a user launch this file submission server on their cluster? Is there a YAML spec?

There is no YAML spec at the moment. However, the resource staging server should be agnostic to how it is launched. So one could use a YAML spec to launch a pod into a Kubernetes cluster, or they could run the Java process somewhere natively as well. There's an example of how this can be launched and configured in Kubernetes here: https://github.com/apache-spark-on-k8s/spark/pull/227/files#diff-2b35cca3c8575728153853415f9a67d5
kubernetesCredentials: RequestBody): Call[StagedResourceIdentifier]

@Streaming
@retrofit2.http.GET("/api/v0/resources/{resourceId}")
Not sure I understand this. It's v2 but the API calls it v0?
I think we should change the v0 to v2 for consistency's sake. WDYT?
This is V0 of this particular API but version 2 of submitting to Kubernetes in general.
Based on our conversation offline, v1 will eventually go away and we'll just have this v2 submission.
In a future PR, we will have YAML files to help quickly turn up a file submission server.
Does this also mean that we need to rebuild our docker images, because the submission client now expects /v1/submissions/create as opposed to /submissions/create by default?

Merging, LGTM.
Implements version 2 of submitting applications, using the file staging server if necessary to fetch application dependencies.