Conversation

@zhengruifeng
Contributor

What changes were proposed in this pull request?

1. Add an evaluation metric to the summary.
2. Add an evaluate() method which returns a summary.
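
A minimal usage sketch of what the proposed API could look like; the field and method names below are illustrative assumptions, not taken from the PR diff:

```scala
import org.apache.spark.ml.clustering.KMeans

// Assuming DataFrames `training` and `test` with a "features" column.
val model = new KMeans().setK(3).setFeaturesCol("features").fit(training)

// (1) the training summary carries an evaluation metric
val trainingCost = model.summary.trainingCost   // hypothetical summary field

// (2) evaluate() builds the same kind of summary for another dataset
val testSummary = model.evaluate(test)          // hypothetical method
```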

How was this patch tested?

added tests

@SparkQA

SparkQA commented Jan 20, 2017

Test build #71698 has finished for PR 16654 at commit bb01219.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 20, 2017

Test build #71707 has started for PR 16654 at commit 29bda3f.

@zhengruifeng
Contributor Author

Jenkins, retest this please

@SparkQA

SparkQA commented Jan 20, 2017

Test build #71710 has finished for PR 16654 at commit 29bda3f.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Jan 20, 2017

General question: isn't this what Evaluators are for?
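
For context, the existing spark.ml Evaluator pattern referenced here: an Evaluator consumes a DataFrame of predictions and returns a single metric, independent of which model produced the predictions. A minimal sketch, assuming a `predictions` DataFrame with "label" and "prediction" columns:

```scala
import org.apache.spark.ml.evaluation.RegressionEvaluator

val evaluator = new RegressionEvaluator()
  .setLabelCol("label")
  .setPredictionCol("prediction")
  .setMetricName("rmse")

// One number out, regardless of which regressor produced `predictions`.
val rmse = evaluator.evaluate(predictions)
```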

@SparkQA

SparkQA commented Jan 20, 2017

Test build #71717 has finished for PR 16654 at commit 1d89914.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@imatiach-msft
Contributor

+1 with @srowen; this should be limited to the evaluator/metrics classes. If we have an evaluator for clustering, will we be able to use it with the hyperparameter tuner (cross-validation)?

@zhengruifeng
Contributor Author

I think clustering metrics are currently not as general as classification/regression metrics:
WSSSE applies only to KMeans and BisectingKMeans
Log-likelihood applies only to GMM

I had opened a JIRA about a ClusteringEvaluator (https://issues.apache.org/jira/browse/SPARK-14516), which may add the metrics included in scikit-learn: http://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics.cluster
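
For reference, the algorithm-specific form these metrics took in spark.ml at the time; a sketch assuming a `dataset` with a "features" column:

```scala
import org.apache.spark.ml.clustering.{BisectingKMeans, KMeans}

// WSSSE is exposed per-model via computeCost, not via a shared evaluator.
val kmCost  = new KMeans().setK(3).fit(dataset).computeCost(dataset)
val bkmCost = new BisectingKMeans().setK(3).fit(dataset).computeCost(dataset)

// GaussianMixture has no comparable method here: its log-likelihood depends
// on the fitted Gaussians themselves, which is the point being made above.
```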

@yanboliang @jkbradley What's your opinion?

@zhengruifeng force-pushed the clustering_model_evaluate branch from 1d89914 to 5937ce7 on January 22, 2017 06:22
@SparkQA

SparkQA commented Jan 22, 2017

Test build #71799 has started for PR 16654 at commit 5937ce7.

@zhengruifeng
Contributor Author

Jenkins, retest this please

@srowen
Member

srowen commented Jan 22, 2017

Yes, I think this is at best a duplicate of SPARK-14516. You don't want to add ad-hoc methods for this.

@zhengruifeng
Contributor Author

@srowen I think I had not clarified my thoughts. WSSSE and log-likelihood are algorithm-specific metrics.
For example:
WSSSE doesn't make sense for clustering algorithms like DBSCAN.
GMM's log-likelihood even differs from that of mixture models over other distributions: given an RDD[Int] representing cluster IDs or an RDD[Array[Double]] representing cluster probability distributions, we cannot design a general method to compute the log-likelihood.

Some general clustering metrics are listed at http://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics.cluster, but WSSSE and log-likelihood are not among them.
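
For concreteness, the standard definitions of the two metrics under discussion (not taken from the PR): both require model internals (the centers, or the mixture weights and Gaussian parameters) that cannot be recovered from a bare RDD of cluster IDs or membership probabilities.

```latex
\mathrm{WSSSE} = \sum_{k=1}^{K} \sum_{x \in C_k} \lVert x - \mu_k \rVert^2
\qquad
\log L = \sum_{i=1}^{n} \log \sum_{k=1}^{K}
         \pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)
```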

@srowen
Member

srowen commented Jan 22, 2017

I agree that clustering metrics are different from classification metrics, but that doesn't mean they can't have some common abstraction -- they're applied to a model and data set and produce a number. It's true that not every evaluation metric makes sense for every model, but that's not a problem per se.

Why wouldn't WSSSE make sense for DBSCAN?

@SparkQA

SparkQA commented Jan 22, 2017

Test build #71805 has finished for PR 16654 at commit 5937ce7.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zhengruifeng
Contributor Author

@srowen The concept of a center doesn't exist in DBSCAN.

@srowen
Member

srowen commented Jan 23, 2017

Metrics evaluate the clustering though; the details of the algorithm are irrelevant. This still clusters points in a continuous space so you can measure WSSSE.
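
A sketch of that argument in code: WSSSE can be computed from nothing more than the feature vectors and the predicted cluster IDs, by deriving each cluster's center as the mean of its members, regardless of which algorithm produced the assignment. The column names ("features", "prediction") are assumptions here:

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.DataFrame

def wssse(predictions: DataFrame): Double = {
  // (clusterId, featureVector) pairs, independent of any model state.
  val assigned = predictions.select("prediction", "features").rdd
    .map(row => (row.getInt(0), row.getAs[Vector](1)))

  // Derive each cluster's center as the mean of its assigned points.
  val centers = assigned
    .mapValues(v => (v.toArray, 1L))
    .reduceByKey { case ((a, na), (b, nb)) =>
      (a.zip(b).map { case (x, y) => x + y }, na + nb)
    }
    .mapValues { case (sum, n) => Vectors.dense(sum.map(_ / n)) }
    .collectAsMap()

  // Sum of squared distances from each point to its derived center.
  assigned.map { case (k, v) => Vectors.sqdist(v, centers(k)) }.sum()
}
```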

@imatiach-msft
Contributor

Wouldn't we eventually want to add a lot more clustering metrics like Dunn, Davies-Bouldin, Simplified Silhouette, etc.? There are a lot of clustering metrics, and it seems like a good idea to have separate metrics and evaluator classes for them so that they are easy to use and extend. It would also be nice to be able to sweep over the hyperparameters of the clustering algorithms and use the evaluators as part of cross-validation (or am I misunderstanding, and is that already possible with the changes in this review?).
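
A sketch of how that sweep could look, plugging a ClusteringEvaluator into the existing CrossValidator API. ClusteringEvaluator is hypothetical relative to this discussion (it did not exist in spark.ml at the time; SPARK-14516 later led to one), and `dataset` is assumed to have a "features" column:

```scala
import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.evaluation.ClusteringEvaluator  // added to spark.ml after this thread
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

val kmeans = new KMeans().setFeaturesCol("features")

// Sweep over the number of clusters.
val paramGrid = new ParamGridBuilder()
  .addGrid(kmeans.k, Array(2, 4, 8))
  .build()

val cv = new CrossValidator()
  .setEstimator(kmeans)
  .setEvaluator(new ClusteringEvaluator())
  .setEstimatorParamMaps(paramGrid)
  .setNumFolds(3)

val bestModel = cv.fit(dataset).bestModel
```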

@imatiach-msft
Contributor

imatiach-msft commented Jan 23, 2017

Also, if some metrics are only applicable to some models, as @srowen noted, we can either make separate evaluator classes or put all metrics in one and throw if the model does not support a given metric. Either solution would work and would be much better than putting the metric calculation on the clustering model itself.

@zhengruifeng
Contributor Author

The existing metrics (WSSSE, log-likelihood) are tied to the details of the algorithm. The WSSSE computation for KMeans/BisectingKMeans uses the mean vectors as the centers, but for KMedoids the medoids, rather than the means, should be used. If we reused the KMeans logic to compute WSSSE for KMedoids, I think it would be a mistake.
I also found that some supervised algorithms already support an evaluate method on their models: LinearRegression, LogisticRegression, and GeneralizedLinearRegression.
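
For reference, that evaluate pattern as it exists for supervised models in spark.ml; a sketch assuming `training` and `test` DataFrames with "label" and "features" columns:

```scala
import org.apache.spark.ml.regression.LinearRegression

val lrModel = new LinearRegression().fit(training)

// evaluate() returns a summary computed on the supplied dataset.
val testSummary = lrModel.evaluate(test)
println(testSummary.rootMeanSquaredError)
println(testSummary.r2)
```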

@srowen
Member

srowen commented Jan 24, 2017

Sure, and classification metrics like AUC only make sense for classifiers that output more than just a label -- they have to output a probability or score of some kind. Not every metric necessarily makes sense for every model, and we can use class hierarchy or just argument checking to avoid applying metrics where nonsensical. WSSSE can't be used for k-medoids, yes. k-medoids is also not in Spark, AFAIK. It's still not an argument to not abstract this at all.

@zhengruifeng
Contributor Author

@srowen I agree that a metric should be independent of the details of the algorithm. AUC is independent of the algorithm; it depends only on the dataset: in spark-ml, scikit-learn, or any other package, the input dataset contains labels and decision values (or probabilities), and AUC can be computed if and only if two labels exist in the dataset, no matter which classifier is used.

I also agree that some general metrics should be abstracted into an Evaluator.

I just disagree with treating WSSSE as a general metric:
There have been some attempts to add K-Medoids to Spark; although their PRs were not accepted, there are still third-party sources implementing K-Medoids on Spark.
More realistically, Spark is used together with other ML packages in many cases: suppose another package is used to generate the model locally and the result is then evaluated in Spark.
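
For comparison, the algorithm-agnostic AUC computation described above already exists for binary classifiers; a sketch assuming a `predictions` DataFrame with "label" and "rawPrediction" columns:

```scala
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator

val auc = new BinaryClassificationEvaluator()
  .setLabelCol("label")
  .setRawPredictionCol("rawPrediction")
  .setMetricName("areaUnderROC")
  .evaluate(predictions)   // works for any binary classifier's output
```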

@imatiach-msft
Contributor

imatiach-msft commented Jan 26, 2017

@zhengruifeng don't most ML libraries have separate clustering evaluators? For example, WEKA has a ClusterEvaluation class. Scikit-learn just has a metrics module with functions you can call, but I don't really like that option, and in any case it is separate from the estimator/model. H2O has a MetricBuilder that all ML learners (supervised/unsupervised) generate. I think creating a separate evaluation class that fits in with the other evaluators in Spark would be ideal: it would conform to the structure of the current codebase and possibly limit any confusion users might have.

@HyukjinKwon
Member

gentle ping @zhengruifeng

@zhengruifeng deleted the clustering_model_evaluate branch on May 18, 2017 03:08