@mengxr
Contributor

mengxr commented May 21, 2015

Some changes to the pipeline APIs:

  1. Estimator/Transformer don’t need to extend Params since PipelineStage already does.
  2. Move Evaluator to ml.evaluation.
  3. Mention that larger metric values are better.
  4. PipelineModel doc: “compiled” -> “fitted”.
  5. Hide object PolynomialExpansion.
  6. Hide object VectorAssembler.
  7. Word2Vec.minCount (and others) -> @group param
  8. ParamValidators -> @DeveloperApi
  9. Hide MetadataUtils/SchemaUtils.
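
For readers unfamiliar with item 3, here is a minimal, self-contained sketch of what a "larger metric values are better" convention implies for model selection. It does not use Spark: the `Evaluator` base class, `NegRmseEvaluator`, `select_best`, and the sample data are all hypothetical names made up for illustration, and negating RMSE is just one assumed way to fit a lower-is-better metric into that convention.

```python
import math


class Evaluator:
    """Hypothetical stand-in: evaluate() returns a score where larger is better."""

    def evaluate(self, predictions):
        raise NotImplementedError


class NegRmseEvaluator(Evaluator):
    """Reports negative RMSE so that a larger value means a better fit."""

    def evaluate(self, predictions):
        # predictions is a list of (prediction, label) pairs.
        mse = sum((p - y) ** 2 for p, y in predictions) / len(predictions)
        return -math.sqrt(mse)


def select_best(candidates, evaluator):
    """Return the name of the candidate whose metric is largest."""
    return max(candidates, key=lambda name: evaluator.evaluate(candidates[name]))


# Illustrative data only: model_a has smaller errors than model_b.
candidates = {
    "model_a": [(1.0, 1.0), (2.0, 3.0)],
    "model_b": [(1.0, 3.0), (2.0, 5.0)],
}
best = select_best(candidates, NegRmseEvaluator())
```

With a single documented direction, the selection code can always take the maximum and never needs to know which metric it is comparing.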

@jkbradley

@SparkQA

SparkQA commented May 21, 2015

Test build #33255 has finished for PR 6322 at commit e179480.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr
Contributor Author

mengxr commented May 21, 2015

test this please

@SparkQA

SparkQA commented May 21, 2015

Test build #33268 has finished for PR 6322 at commit e179480.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

Why not include Evaluator in this list?

Contributor Author

It is moved to ml.evaluation.

Member

But why is this only importing stuff from pipeline.py? Isn't Evaluator just as important a concept as the other items here?

Contributor Author

On the Scala side, we put Classifier under ml.classification and Regressor under ml.regression. So I moved Evaluator to ml.evaluation to match them. Evaluator is now under pyspark.ml.evaluation and hence it is not imported here. Under pyspark.ml, we have Transformer, Estimator, Model, Pipeline, and PipelineModel.
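
As a rough illustration of the layout described in this comment, the sketch below records which module each name is expected to come from and, only when PySpark happens to be installed, checks that the names resolve. The `EXPECTED` table is taken from the comment itself and is treated as an assumption; `module_for` and `unresolved` are hypothetical helper names, not part of PySpark.

```python
import importlib
import importlib.util

# Layout as described in the comment above; treated here as an assumption.
EXPECTED = {
    "pyspark.ml": ["Transformer", "Estimator", "Model", "Pipeline", "PipelineModel"],
    "pyspark.ml.evaluation": ["Evaluator"],
}


def module_for(name):
    """Return the module expected to export `name`, or None if it is not listed."""
    for module, names in EXPECTED.items():
        if name in names:
            return module
    return None


def unresolved():
    """List (module, name) pairs that fail to resolve; None if PySpark is absent."""
    if importlib.util.find_spec("pyspark") is None:
        return None
    missing = []
    for module, names in EXPECTED.items():
        try:
            mod = importlib.import_module(module)
        except ImportError:
            # The whole module could not be imported, so none of its names resolve.
            missing.extend((module, n) for n in names)
            continue
        missing.extend((module, n) for n in names if not hasattr(mod, n))
    return missing
```

In user code this corresponds to importing `Evaluator` from `pyspark.ml.evaluation` while the pipeline abstractions come from `pyspark.ml`.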

Member

Ohh, I see. Thanks for clarifying.

@jkbradley
Member

Looks good other than those 2 items

@SparkQA

SparkQA commented May 22, 2015

Test build #849 has finished for PR 6322 at commit e179480.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Member

The update looks fine to me.

@jkbradley
Member

LGTM

@SparkQA

SparkQA commented May 22, 2015

Test build #33304 has finished for PR 6322 at commit 9e9c7da.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request May 22, 2015
Some changes to the pipeline APIs:

1. Estimator/Transformer don’t need to extend Params since PipelineStage already does.
2. Move Evaluator to ml.evaluation.
3. Mention larger metric values are better.
4. PipelineModel doc. “compiled” -> “fitted”
5. Hide object PolynomialExpansion.
6. Hide object VectorAssembler.
7. Word2Vec.minCount (and other) -> group param
8. ParamValidators -> DeveloperApi
9. Hide MetadataUtils/SchemaUtils.

jkbradley

Author: Xiangrui Meng <[email protected]>

Closes #6322 from mengxr/SPARK-7535.0 and squashes the following commits:

9e9c7da [Xiangrui Meng] move JavaEvaluator to ml.evaluation as well
e179480 [Xiangrui Meng] move Evaluation to ml.evaluation in PySpark
08ef61f [Xiangrui Meng] update pipieline APIs

(cherry picked from commit 8f11c61)
Signed-off-by: Xiangrui Meng <[email protected]>
@mengxr
Contributor Author

mengxr commented May 22, 2015

Merged into master and branch-1.4.

@asfgit asfgit closed this in 8f11c61 May 22, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015