
Conversation

MrBago (Contributor) commented Apr 11, 2019

What changes were proposed in this pull request?

The RBackend and RBackendHandler create new conf objects that don't pick up conf values from the existing SparkSession and therefore always use the default conf values instead of values specified by the user.

In this fix, we check whether a SparkEnv already exists and, if so, take the conf from it, falling back to creating a new conf otherwise. This follows the pattern used elsewhere in the codebase, for example:

val conf = Option(SparkEnv.get).map(_.conf).getOrElse(new SparkConf())
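
For concreteness, a minimal self-contained sketch of the same pattern (illustrative only, not the actual RBackend code; the config key and default value below are just examples):

// Reuse the conf of an already-running SparkEnv if one exists,
// otherwise fall back to a fresh SparkConf.
import org.apache.spark.{SparkConf, SparkEnv}

object ConfLookupSketch {
  def resolveConf(): SparkConf =
    Option(SparkEnv.get).map(_.conf).getOrElse(new SparkConf())

  def main(args: Array[String]): Unit = {
    val conf = resolveConf()
    // With the fix, a value set on the running session's conf is visible here;
    // the key and default shown are illustrative.
    println(conf.getInt("spark.r.backendConnectionTimeout", 6000))
  }
}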

How was this patch tested?




  def init(): (Int, RAuthHelper) = {
-   val conf = new SparkConf()
+   val conf = Option(SparkEnv.get).map(_.conf).getOrElse(new SparkConf())
A Member commented on the diff

Hi, @MrBago.
Could you add a unit test case, please? That will prevent a future regression. Maybe in RBackendSuite.scala?

MrBago (Contributor, Author) replied

I was trying to think of a way to test this. One idea is a "visibleForTesting" method that returns the connection timeout, so we can verify that passing the conf sets the correct value. But that seems somewhat indirect and requires an API just for testing.

What do you think?
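
For illustration, a rough sketch of that idea; the Backend class and the connectionTimeoutForTesting accessor below are hypothetical stand-ins, not the real RBackend API:

import org.apache.spark.SparkConf

// Hypothetical stand-in for RBackend, exposing the resolved timeout for tests.
class Backend(conf: SparkConf) {
  def connectionTimeoutForTesting: Int =
    conf.getInt("spark.r.backendConnectionTimeout", 6000)
}

object BackendConfCheck {
  def main(args: Array[String]): Unit = {
    // Verify that a value passed via the conf is the one the backend resolves.
    val conf = new SparkConf().set("spark.r.backendConnectionTimeout", "120")
    assert(new Backend(conf).connectionTimeoutForTesting == 120)
    println("connection timeout picked up from the conf")
  }
}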

dongjoon-hyun changed the title from "[Spark-27446] Use existing spark conf if available." to "[SPARK-27446][R] Use existing spark conf if available." on Apr 11, 2019
SparkQA commented Apr 12, 2019

Test build #104529 has finished for PR 24353 at commit cfb013d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

felixcheung (Member) left a comment

I'm not sure about this. What are the exact steps to reproduce the problem?

IIRC this is done this way because the config is needed before the session is initialized (and therefore before it can be set programmatically by the user). The user can still set/change it via the spark conf default properties file.
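
For reference, the "spark conf default properties file" is conf/spark-defaults.conf; when an application is launched through spark-submit, entries there are visible even to a freshly constructed SparkConf. The key and value below are illustrative:

# conf/spark-defaults.conf (illustrative entry)
spark.r.backendConnectionTimeout   6000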

srowen (Member) left a comment

The change itself seems reasonable

MrBago (Contributor, Author) commented Apr 12, 2019

@felixcheung that's a good point. Our Spark deployment is unusual in that we have SparkR connect to an existing driver process instead of having it start its own. These changes should be safe for existing code paths because we always fall back to creating a new SparkConf.

HyukjinKwon (Member) left a comment

Looks good to me.
For adding tests, I don't feel strongly since it looks tricky, but ideally it would be good to add one.

felixcheung (Member) commented

OK, that's fine.

HyukjinKwon (Member) commented

Merged to master.
