Skip to content

Conversation

@BryanCutler
Copy link
Member

What changes were proposed in this pull request?

Some ML Models were using defaultCopy which expects a default constructor, and others were not setting the parent estimator. This change fixes these by creating a new instance of the model and explicitly setting values and parent.

How was this patch tested?

Added MLTestingUtils.checkCopy to the offending models to tests to verify the copy is made and parent is set.

@SparkQA
Copy link

SparkQA commented Mar 16, 2017

Test build #74695 has finished for PR 17326 at commit 89bac8a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@BryanCutler
Copy link
Member Author

ping @jkbradley @MLnick

@BryanCutler
Copy link
Member Author

BryanCutler commented Mar 17, 2017

ping @holdenk, I could do a backport too for 2.1.1 if needed

.setMaxIter(100)
.setSolver("l-bfgs")
val model = trainer.fit(dataset)
MLTestingUtils.checkCopy(model)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just something for consideration. Not sure if this is the best way to add unit test for this. I don't think we need to add check only for the classes included in this PR. And even if we do, maybe it's possible to do it in a more uniform way, like in testEstimatorAndModelReadWrite or add a pipelineCompatibilityTest as necessary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hhbyyh , I agree that this is maybe not the best general way to test it, but this is how it is done in every other Model test. Maybe this is ok for now and we could follow up with a discussion on the best way to check "basic" ML functionality?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do it this way for now - would be good to log a JIRA to investigate a better and more generic way of testing so that it doesn't have to be done manually for each newly added Estimator/Model (or when a Model changes such that defaultCopy is no longer sufficient, etc)

@Since("1.5.0")
override def copy(extra: ParamMap): MultilayerPerceptronClassificationModel = {
copyValues(new MultilayerPerceptronClassificationModel(uid, layers, weights), extra)
val copied = new MultilayerPerceptronClassificationModel(uid, layers, weights)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem that we have a "standard" way of doing this, i.e. some places do

val copied = new X().setParent(parent)

and some places do

val copied = ...
copyValues(...).setParent(parent)

But at least for this PR let's be consistent, so maybe change this one to be the first version like you have in the rest of the PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, definitely. I noticed they were inconsistent too, so tried to change these to be like most of the others, but I must have forgot to fix this one.

@BryanCutler
Copy link
Member Author

Thanks for the review @MLnick!

@SparkQA
Copy link

SparkQA commented Mar 22, 2017

Test build #75011 has finished for PR 17326 at commit 8c03c28.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@MLnick
Copy link
Contributor

MLnick commented Mar 31, 2017

Jenkins retest this please

@MLnick
Copy link
Contributor

MLnick commented Mar 31, 2017

LGTM

@SparkQA
Copy link

SparkQA commented Mar 31, 2017

Test build #75433 has finished for PR 17326 at commit 8c03c28.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@MLnick
Copy link
Contributor

MLnick commented Apr 3, 2017

Merged to master. Thanks!

@asfgit asfgit closed this in 2a903a1 Apr 3, 2017
@BryanCutler BryanCutler deleted the ml-model-copy-error-SPARK-19985 branch March 6, 2018 23:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants