
Conversation

@mccheah commented Apr 6, 2017

Allow providing either a keyStore or key and certificate PEM files so that the staging server can listen over TLS.

Note that we're also preemptively allowing the server's configuration to be provided in a properties file. This makes it easier to supply properties through a ConfigMap if this file staging server were to be deployed in Kubernetes. The alternative would be to specify all of these properties as system properties or command-line arguments, but I think our previous work on similar constructs has shown that parsing out command-line arguments and expecting them to be set correctly in the Dockerfile can be tedious.

If the user prefers not to have the key and keystore passwords provided through the ConfigMap - e.g. they would rather mount Kubernetes secrets - the key and keystore password values can be provided in files as well.
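
For illustration, a sketch of what such a properties file might look like when mounted from a ConfigMap; all key names and paths below are hypothetical, not the actual configuration keys introduced by this change:

# Hypothetical staging server properties file, e.g. mounted from a ConfigMap.
# Key names and paths are illustrative only.
spark.ssl.kubernetes.resourceStagingServer.enabled=true
spark.ssl.kubernetes.resourceStagingServer.keyStore=/mnt/secrets/keyStore.jks
# Instead of inline password values, point at files mounted from Kubernetes secrets:
spark.ssl.kubernetes.resourceStagingServer.keyStorePasswordFile=/mnt/secrets/keystore-password
spark.ssl.kubernetes.resourceStagingServer.keyPasswordFile=/mnt/secrets/key-password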

@mccheah (Author) commented Apr 6, 2017

Note that this is merging into #220, which in turn is merging into #212. We should:

  1. Merge Staging server for receiving application dependencies. #212 first
  2. Retarget Reorganize packages between v1 work and v2 work #220 to branch-2.1-kubernetes
  3. Merge Reorganize packages between v1 work and v2 work #220
  4. Retarget this PR to branch-2.1-kubernetes
  5. Merge this

.stringConf
.createOptional

private[spark] val DEPENDENCY_SERVER_CERT_PEM =

Review comment:

Put all these new config options into the docs file. The docs also need to say which of the file://, local://, and hdfs:// protocols are supported here.

@mccheah (Author) replied:

I was planning to save all documentation for a final sweep where we eliminate all of the v1 submission logic and replace it with v2. At that point the docs should be updated to remove the old options and replace them with the new ones.
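
For readers without the surrounding diff: the truncated snippet above is one of Spark's internal ConfigBuilder entries. A sketch of its general shape, with an illustrative key name and doc string rather than the actual values from the patch:

private[spark] val DEPENDENCY_SERVER_CERT_PEM =
  ConfigBuilder("spark.ssl.kubernetes.resourceStagingServer.serverCertPem") // key name illustrative
    .doc("Location of a PEM file holding the certificate for the staging server's TLS key.")
    .stringConf
    .createOptional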

@mccheah (Author) commented Apr 14, 2017

rerun integration test please

@ash211 left a comment:

Very nicely tested! Some minor suggestions. The biggest thing is probably logging and the overall logging story -- do you want to tackle that here or in a separate PR?

When running some manual tests, I think we're going to find logging invaluable.

connector.setPort(port)
server.addConnector(connector)
server.setHandler(contextHandler)
server.start()

Review comment:

Does this log some kind of "server started on hostname:port" message? We want a clear "I'm ready" message to look for in the server logs.

@mccheah (Author) replied:

I don't think we have the hostname here, but Spark tends to log "Started server listening on port X", and we should do something similar here.
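
A minimal sketch of that log line, assuming the server class mixes in Spark's Logging trait and that port is the same variable bound in the snippet above:

server.start()
// The clear "I'm ready" marker to grep for in the server logs.
logInfo(s"Resource staging server started on port $port.")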

}

private[spark] class ResourceStagingServerSslOptionsProviderImpl(sparkConf: SparkConf)
extends ResourceStagingServerSslOptionsProvider {

Review comment:

I find that a server logging all its non-sensitive config on startup helps with debugging later -- maybe DEBUG-log everything in SparkConf and make sure DEBUG is enabled for this class?
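
A sketch of that debug dump, assuming the provider mixes in Spark's Logging trait; the password-based redaction heuristic is illustrative:

// Log each SparkConf entry at DEBUG, redacting values whose keys look
// sensitive so that passwords never end up in the logs.
sparkConf.getAll.foreach { case (key, value) =>
  val shown = if (key.toLowerCase.contains("password")) "<redacted>" else value
  logDebug(s"Resource staging server config: $key=$shown")
}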

"Shouldn't provide both the keyStore password value and the keyStore password file.")
requireNandDefined(baseSslOptions.keyPassword, maybeKeyPasswordFile,
"Shouldn't provide both the keyStore key password value and the keyStore key password file.")
maybeKeyPem.foreach { _ =>

Review comment:

Refactor this into a requireSecondIfFirst method.
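
A sketch of what such a helper might look like; the name comes from this comment, and the signature is an assumption:

// Require that whenever the first option is defined, the second is defined too.
private def requireSecondIfFirst(
    first: Option[_],
    second: Option[_],
    errMessage: String): Unit = {
  require(first.isEmpty || second.isDefined, errMessage)
}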

class ResourceStagingServerSslOptionsProviderSuite extends SparkFunSuite with BeforeAndAfter {

private val SSL_TEMP_DIR = Utils.createTempDir(namePrefix = "resource-staging-server-ssl-test")
private val KEYSTORE_FILE = new File(SSL_TEMP_DIR, "keyStore.jks")

Review comment:

Convention is all-lowercase keystore.jks, I think.

@mccheah (Author) replied:

The convention is to capitalize the "S" on keyStore.

sparkConf = new SparkConf(true)
sslOptionsProvider = new ResourceStagingServerSslOptionsProviderImpl(sparkConf)
}

Review comment:

Clear the SSL_TEMP_DIR between tests to reduce the risk of polluting other test runs.
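
A sketch of that cleanup using the BeforeAndAfter hook the suite already mixes in:

after {
  // Delete everything under SSL_TEMP_DIR so keystore files from one test
  // cannot leak into the next.
  Option(SSL_TEMP_DIR.listFiles()).foreach(_.foreach(_.delete()))
}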

server.stop()
test("Enable SSL on the server") {
val (keyStore, trustStore) = SSLUtils.generateKeyStoreTrustStorePair(
"127.0.0.1", "changeit", "changeit", "changeit")

Review comment:

Make these three values different so it's clear which is which, and make sure the values a couple of lines down match.
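
For example, with three distinct values (the parameter order follows the call above; which password is which is an assumption here):

// Distinct values make it obvious if the wrong password gets wired somewhere.
val (keyStore, trustStore) = SSLUtils.generateKeyStoreTrustStorePair(
  "127.0.0.1", "keyStorePassword", "keyPassword", "trustStorePassword")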

@ash211 left a comment:

LGTM, will merge later today after #220

@ash211 changed the base branch from split-v1-and-v2-submission-packages to branch-2.1-kubernetes on April 21, 2017 at 07:35
@ash211 commented Apr 21, 2017

rerun unit tests please

@ash211 merged commit 4940eae into branch-2.1-kubernetes on Apr 21, 2017
foxish pushed a commit that referenced this pull request Jul 24, 2017
* Staging server for receiving application dependencies.

* Move packages around to split between v1 work and v2 work

* Add unit test for file writing

* Remove unnecessary main

* Allow the file staging server to be secured with TLS.

* Add back license header

* Minor fixes

* Fix integration test with renamed package for client. Fix scalastyle.

* Remove unused import

* Force json serialization to consider the different package.

* Revert extraneous log

* Fix scalastyle

* Remove getting credentials from the API

We still want to post them because in the future we can use these
credentials to monitor the API server and handle cleaning up the data
accordingly.

* Fix build

* Randomize name and namespace in test to prevent collisions

* Generalize to resource staging server outside of Spark

* Update code documentation

* Val instead of var

* Fix unit tests.

* Fix build

* Fix naming, remove unused import

* Move suites from integration test package to core

* Fix unit test

* Use TrieMap instead of locks

* Address comments

* Fix imports

* Address comments

* Change main object name

* Change config variable names

* Change paths, use POST instead of PUT

* Use a resource identifier as well as a resource secret
ifilonenko pushed a commit to ifilonenko/spark that referenced this pull request Feb 26, 2019
puneetloya pushed a commit to puneetloya/spark that referenced this pull request Mar 11, 2019