[SPARK-7535.0] [MLLIB] Audit the pipeline APIs for 1.4 #6322

mengxr · 2015-05-21T17:21:47Z

Some changes to the pipeline APIs:

1. Estimator/Transformer don't need to extend Params since PipelineStage already does.
2. Move Evaluator to ml.evaluation.
3. Mention that larger metric values are better.
4. PipelineModel doc: "compiled" -> "fitted".
5. Hide object PolynomialExpansion.
6. Hide object VectorAssembler.
7. Word2Vec.minCount (and others) -> @group param
8. ParamValidators -> @DeveloperApi
9. Hide MetadataUtils/SchemaUtils.
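
The first item rests on ordinary inheritance: if PipelineStage already extends Params, anything extending PipelineStage is a Params too, so listing Params again is redundant. A minimal Python sketch of that idea follows. The classes are toy stand-ins, not Spark's actual classes, and the `set`/`get` helpers, `MeanEstimator`, and `MeanModel` are hypothetical names used only for illustration.

```python
class Params:
    """Toy stand-in for a parameter container (not Spark's Params)."""

    def __init__(self):
        self._values = {}

    def set(self, name, value):
        self._values[name] = value
        return self  # return self so calls can be chained

    def get(self, name, default=None):
        return self._values.get(name, default)


class PipelineStage(Params):
    """A pipeline stage; it is already a Params."""


class Estimator(PipelineStage):  # no need to list Params again
    def fit(self, dataset):
        raise NotImplementedError


class Transformer(PipelineStage):  # same here
    def transform(self, dataset):
        raise NotImplementedError


class MeanModel(Transformer):
    """Hypothetical fitted model: subtracts a stored mean."""

    def __init__(self, mean):
        super().__init__()
        self.mean = mean

    def transform(self, dataset):
        return [x - self.mean for x in dataset]


class MeanEstimator(Estimator):
    """Hypothetical estimator: learns the mean of a list of numbers."""

    def fit(self, dataset):
        return MeanModel(sum(dataset) / len(dataset))


est = MeanEstimator().set("inputCol", "x")
model = est.fit([1.0, 2.0, 3.0])
print(isinstance(est, Params))            # True
print(model.transform([1.0, 2.0, 3.0]))   # [-1.0, 0.0, 1.0]
```

The estimator still gets the parameter helpers through PipelineStage, which is why the direct `extends Params` could be dropped without changing behavior.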

@jkbradley

SparkQA · 2015-05-21T18:32:13Z

Test build #33255 has finished for PR 6322 at commit e179480.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

mengxr · 2015-05-21T19:13:49Z

test this please

SparkQA · 2015-05-21T20:11:28Z

Test build #33268 has finished for PR 6322 at commit e179480.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

jkbradley · 2015-05-22T00:12:46Z

python/pyspark/ml/__init__.py

jkbradley: Why not include Evaluator in this list?

mengxr: It was moved to ml.evaluation.

jkbradley: But why is this only importing stuff from pipeline.py? Isn't Evaluator just as important a concept as the other items here?

mengxr: On the Scala side, we put Classifier under ml.classification and Regressor under ml.regression, so I moved Evaluator to ml.evaluation to match them. Evaluator is now under pyspark.ml.evaluation and hence is not imported here. Under pyspark.ml, we have Transformer, Estimator, Model, Pipeline, and PipelineModel.

jkbradley: Ohh, I see. Thanks for clarifying.
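
The layout described in this thread can be checked with a short script. This is only a sketch: the module and class names come from the discussion above, while the `missing_names` helper and the `EXPECTED` table are hypothetical. If PySpark is not installed, the modules are simply reported as missing.

```python
import importlib

# Where each name is expected to live, per the discussion above.
EXPECTED = {
    "pyspark.ml": ["Transformer", "Estimator", "Model", "Pipeline", "PipelineModel"],
    "pyspark.ml.evaluation": ["Evaluator"],
}


def missing_names(expected):
    """Return (module, name) pairs that cannot be resolved.

    A module that fails to import is reported once as (module, None).
    """
    missing = []
    for module_name, names in expected.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            missing.append((module_name, None))
            continue
        for name in names:
            if not hasattr(module, name):
                missing.append((module_name, name))
    return missing


print(missing_names(EXPECTED))
```

An empty list means every name resolved at the expected location.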

jkbradley · 2015-05-22T00:13:10Z

Looks good other than those 2 items

SparkQA · 2015-05-22T00:24:35Z

Test build #849 has finished for PR 6322 at commit e179480.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

jkbradley · 2015-05-22T00:58:56Z

The update looks fine to me.

jkbradley · 2015-05-22T01:06:57Z

LGTM

SparkQA · 2015-05-22T02:16:12Z

Test build #33304 has finished for PR 6322 at commit 9e9c7da.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

Some changes to the pipeline APIs:

1. Estimator/Transformer don't need to extend Params since PipelineStage already does.
2. Move Evaluator to ml.evaluation.
3. Mention that larger metric values are better.
4. PipelineModel doc: "compiled" -> "fitted".
5. Hide object PolynomialExpansion.
6. Hide object VectorAssembler.
7. Word2Vec.minCount (and others) -> group param
8. ParamValidators -> DeveloperApi
9. Hide MetadataUtils/SchemaUtils.

jkbradley

Author: Xiangrui Meng <[email protected]>

Closes #6322 from mengxr/SPARK-7535.0 and squashes the following commits:

9e9c7da [Xiangrui Meng] move JavaEvaluator to ml.evaluation as well
e179480 [Xiangrui Meng] move Evaluation to ml.evaluation in PySpark
08ef61f [Xiangrui Meng] update pipieline APIs

(cherry picked from commit 8f11c61)
Signed-off-by: Xiangrui Meng <[email protected]>

mengxr · 2015-05-22T05:58:11Z

Merged into master and branch-1.4.

Some changes to the pipeline APIs:

1. Estimator/Transformer don't need to extend Params since PipelineStage already does.
2. Move Evaluator to ml.evaluation.
3. Mention that larger metric values are better.
4. PipelineModel doc: "compiled" -> "fitted".
5. Hide object PolynomialExpansion.
6. Hide object VectorAssembler.
7. Word2Vec.minCount (and others) -> group param
8. ParamValidators -> DeveloperApi
9. Hide MetadataUtils/SchemaUtils.

jkbradley

Author: Xiangrui Meng <[email protected]>

Closes apache#6322 from mengxr/SPARK-7535.0 and squashes the following commits:

9e9c7da [Xiangrui Meng] move JavaEvaluator to ml.evaluation as well
e179480 [Xiangrui Meng] move Evaluation to ml.evaluation in PySpark
08ef61f [Xiangrui Meng] update pipieline APIs

mengxr added 2 commits May 21, 2015 10:13

update pipieline APIs

08ef61f

move Evaluation to ml.evaluation in PySpark

e179480

jkbradley reviewed May 22, 2015

jkbradley mentioned this pull request May 22, 2015

[SPARK-7574][ml][doc] User guide for OneVsRest #6296

Closed

move JavaEvaluator to ml.evaluation as well

9e9c7da

asfgit closed this in 8f11c61 May 22, 2015
