@yinxusen
Contributor

@yinxusen yinxusen commented Jun 21, 2016

What changes were proposed in this pull request?

  1. Add a PythonStageWrapper in Scala for PySpark pipeline stages implemented in pure Python.
  2. Add PythonTransformer, PythonEstimator, and PythonModel as proxies for PySpark's pure-Python transformers, estimators, and models.
  3. Replace the pure-Python Pipeline implementation in PySpark with the Java one.
  4. Implement save/load in Pipeline for pure-Python pipeline stages.
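
To make the proxy idea in the list above concrete, here is a minimal standalone sketch. It is not the code from this patch and does not use Spark: `Scale`, `StageProxy`, `STAGE_CLASSES`, and the JSON layout are all hypothetical names and choices made for illustration. It only shows the general shape of a wrapper that forwards `transform` to a pure-Python stage and can save and reload that stage.

```python
import json
import os
import tempfile


class Scale:
    """Toy pure-Python stage (hypothetical): multiplies each value by a factor."""

    def __init__(self, factor=1.0):
        self.factor = factor

    def transform(self, values):
        return [v * self.factor for v in values]

    def params(self):
        # Everything needed to rebuild this stage later.
        return {"factor": self.factor}


# Registry used to rebuild a stage from its saved class name (an assumption of
# this sketch; a real system would need a more robust lookup).
STAGE_CLASSES = {"Scale": Scale}


class StageProxy:
    """Hypothetical proxy that forwards calls to a wrapped pure-Python stage."""

    def __init__(self, stage):
        self.stage = stage

    def transform(self, values):
        # Delegate the actual work to the wrapped stage.
        return self.stage.transform(values)

    def save(self, path):
        # Persist the stage's class name and parameters as JSON.
        payload = {"class": type(self.stage).__name__, "params": self.stage.params()}
        with open(path, "w") as f:
            json.dump(payload, f)

    @classmethod
    def load(cls, path):
        # Rebuild the wrapped stage from the saved class name and parameters.
        with open(path) as f:
            payload = json.load(f)
        stage_cls = STAGE_CLASSES[payload["class"]]
        return cls(stage_cls(**payload["params"]))


def round_trip_demo():
    """Save a proxy to a temporary file, reload it, and apply it."""
    proxy = StageProxy(Scale(2.0))
    path = os.path.join(tempfile.mkdtemp(), "stage.json")
    proxy.save(path)
    return StageProxy.load(path).transform([1, 2, 3])
```

Calling `round_trip_demo()` saves the wrapped stage, reloads it, and applies it, which is the save/load behavior item 4 describes, in miniature.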

How was this patch tested?

Tested with Python unit tests and doctests.

@SparkQA

SparkQA commented Jun 23, 2016

Test build #61127 has finished for PR 13794 at commit f4ec73d.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yinxusen
Contributor Author

test it please

@SparkQA

SparkQA commented Jun 23, 2016

Test build #61129 has finished for PR 13794 at commit 8d23bac.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ElementwiseProduct @Since("1.4.0") (@Since("1.4.0") override val uid: String)
    • class Normalizer @Since("1.4.0") (@Since("1.4.0") override val uid: String)
    • class PolynomialExpansion @Since("1.4.0") (@Since("1.4.0") override val uid: String)
    • public class JavaPackage

@yinxusen
Contributor Author

test it please

@SparkQA

SparkQA commented Jun 24, 2016

Test build #61133 has finished for PR 13794 at commit b8ddcdb.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yinxusen
Contributor Author

retest this please

@yinxusen yinxusen changed the title from "[SPARK-15574][WIP][ML][PySpark] Python transformer wrapper and Pipeline" to "[SPARK-15574][ML][PySpark] Python meta-algorithms in Scala" on Jun 24, 2016
@SparkQA

SparkQA commented Jun 24, 2016

Test build #61197 has finished for PR 13794 at commit b8ddcdb.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yinxusen
Contributor Author

@jkbradley Update: I've now added the PythonEstimator and PythonModel. For PythonEvaluator, it's better to commit it along with the CrossValidator changes. This is ready for review.

@holdenk
Contributor

holdenk commented Oct 7, 2016

@yinxusen - is this something you are still interested in? If so, updating it to master would be good, as well as making sure the unit tests pass in Jenkins (a lot of reviewers just skip PRs that are failing tests). This is also pretty big, though, so it might make sense to check with @jkbradley that this is something he is still interested in as well before you spend your time on it.

@yinxusen
Contributor Author

yinxusen commented Oct 7, 2016

Thanks, @holdenk. Yes, I am still interested in this. @jkbradley Do we still need this PR to support meta-algorithms in PySpark?

@ueshin
Member

ueshin commented Jun 20, 2017

@yinxusen Hi, are you still working on this?

@jkbradley
Member

@yinxusen Thanks for this PR! I still think this seems like a very cool feature, but I've become less convinced that it's worth the engineering and maintenance effort. The alternative to this feature is to have meta-algorithms all implemented in Python as well as Scala. Since there are not many such meta-algorithms (4 currently), I think that sounds easier than implementing something like this.

That's my current opinion, at least, especially since I have not seen a lot of demand for more meta-algorithms in MLlib.
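
For readers unfamiliar with the term, a meta-algorithm here is one that takes other stages as input, such as a pipeline that fits and chains them. The following is a rough standalone sketch of what "implemented in Python" means for a Pipeline-style meta-algorithm. It is not PySpark's implementation: all class names are hypothetical toys, and plain lists stand in for DataFrames.

```python
class Doubler:
    """Toy transformer (hypothetical): doubles every value."""

    def transform(self, data):
        return [2 * x for x in data]


class MeanCenterer:
    """Toy estimator (hypothetical): learns the mean and returns a model."""

    def fit(self, data):
        return MeanCenterModel(sum(data) / len(data))


class MeanCenterModel:
    """Model produced by MeanCenterer: subtracts the learned mean."""

    def __init__(self, mean):
        self.mean = mean

    def transform(self, data):
        return [x - self.mean for x in data]


class Pipeline:
    """Toy meta-algorithm: fits each stage in order on the running output."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, data):
        fitted = []
        for stage in self.stages:
            # Estimators are fitted into models; transformers are used as-is.
            model = stage.fit(data) if hasattr(stage, "fit") else stage
            # Later stages are fitted on the output of earlier ones.
            data = model.transform(data)
            fitted.append(model)
        return PipelineModel(fitted)


class PipelineModel:
    """Fitted pipeline: applies every fitted stage in sequence."""

    def __init__(self, stages):
        self.stages = stages

    def transform(self, data):
        for stage in self.stages:
            data = stage.transform(data)
        return data
```

The point of the sketch is that the chaining logic itself is small, which is the trade-off being weighed against a general Python-to-Scala wrapper.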

@WeichenXu123
Contributor

+1 @jkbradley For now it is better to keep the current implementation of the 4 meta-algorithms in PySpark.
@yinxusen Would you mind closing this PR? I still appreciate your contribution, though!

@WeichenXu123
Contributor

cc @srowen Can you help close this? We won't need this feature for now.
