[SPARK-19303][ML][WIP] Add evaluate method in clustering models #16654
Conversation
Test build #71698 has finished for PR 16654 at commit

Test build #71707 has started for PR 16654 at commit

Jenkins, retest this please

Test build #71710 has finished for PR 16654 at commit
General question: isn't this what Evaluators are for?

Test build #71717 has finished for PR 16654 at commit
+1 with @srowen, this should be limited to the evaluator/metrics classes. If we have an evaluator for clustering, will we be able to use it with the hyperparameter tuner (CrossValidator)?
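
For reference, a minimal sketch of what that could look like, assuming a clustering evaluator existed (for example the `ClusteringEvaluator` proposed in SPARK-14516); the evaluator name and the `dataset` DataFrame are assumptions, not code from this PR:

```scala
import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

// `ClusteringEvaluator` is assumed here (it is what SPARK-14516 proposes);
// `dataset` is an assumed DataFrame with a "features" vector column.
val kmeans = new KMeans().setFeaturesCol("features")

val grid = new ParamGridBuilder()
  .addGrid(kmeans.k, Array(2, 4, 8))
  .build()

val cv = new CrossValidator()
  .setEstimator(kmeans)
  .setEvaluator(new ClusteringEvaluator())  // hypothetical evaluator
  .setEstimatorParamMaps(grid)
  .setNumFolds(3)

val bestModel = cv.fit(dataset).bestModel   // sweeps k using the clustering metric
```

CrossValidator itself only requires an Estimator, an Evaluator and a param grid, so nothing in the tuning machinery is specific to supervised learning.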

I think clustering metrics are currently not as general as classification/regression metrics. I had opened a JIRA about a ClusteringEvaluator (https://issues.apache.org/jira/browse/SPARK-14516), which may add the metrics included in scikit-learn (http://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics.cluster). @yanboliang @jkbradley What's your opinion?

Force-pushed from 1d89914 to 5937ce7

Test build #71799 has started for PR 16654 at commit

Jenkins, retest this please

Yes, I think this is at best a duplicate of SPARK-14516. You don't want to add ad-hoc methods for this.

@srowen I think I did not clarify my thoughts. WSSSE and log-likelihood are algorithm-specific metrics. Some general clustering metrics are listed in http://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics.cluster, but WSSSE and log-likelihood are not among them.

I agree that clustering metrics are different from classification metrics, but that doesn't mean they can't have some common abstraction -- they're applied to a model and data set and produce a number. It's true that not every evaluation metric makes sense for every model, but that's not a problem per se. Why wouldn't WSSSE make sense for DBSCAN?
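
To make that abstraction concrete, here is a rough sketch (not code from this PR) of a clustering metric expressed through the existing `Evaluator` contract; `WSSSEEvaluator` and `computeWSSSE` are illustrative names only:

```scala
import org.apache.spark.ml.evaluation.Evaluator
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.util.Identifiable
import org.apache.spark.sql.Dataset

// Illustrative only: a clustering metric expressed through the existing
// Evaluator contract. It consumes a dataset holding the model's output
// (e.g. "features" and "prediction" columns) and returns a single number.
class WSSSEEvaluator(override val uid: String) extends Evaluator {

  def this() = this(Identifiable.randomUID("wssseEvaluator"))

  override def evaluate(dataset: Dataset[_]): Double = computeWSSSE(dataset)

  // smaller WSSSE means tighter clusters
  override def isLargerBetter: Boolean = false

  override def copy(extra: ParamMap): WSSSEEvaluator = defaultCopy(extra)

  // placeholder for the actual metric computation (see the sketch further down)
  private def computeWSSSE(dataset: Dataset[_]): Double = ???
}
```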

Test build #71805 has finished for PR 16654 at commit

@srowen The concept of a cluster center does not exist in algorithms like DBSCAN, so a metric like WSSSE is tied to the details of the algorithm.

Metrics evaluate the clustering though; the details of the algorithm are irrelevant. This still clusters points in a continuous space, so you can measure WSSSE.

Wouldn't we eventually want to add a lot more clustering metrics like Dunn, Davies-Bouldin, Simplified Silhouette, etc.? There are a lot of clustering metrics, and it seems like a good idea to have separate metrics and evaluator classes for them so that it would be easy to use and extend them. It would also be nice to be able to sweep over the hyperparameters of the clustering algorithms and use the evaluators as part of CrossValidator (or am I misunderstanding, and it is already possible with the changes in this review?).

Also, if some metrics are only applicable to some models, as @srowen noted, we can either make separate evaluator classes or put all the metrics on one but throw if the model does not support that metric. Either solution would work and would be much better than putting the metric calculation on the clustering model itself.

Existing metrics (WSSSE, log-likelihood) are tied to the details of the algorithm. The computation of WSSSE for KMeans/BisectingKMeans uses the average vectors as the centers, but for KMedoids the medoids, rather than the averages, should be used. If we used the same logic as KMeans to compute WSSSE for KMedoids, I think it would be a mistake.
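
A small sketch (not from this PR) of the point being debated: the WSSSE sum itself only needs the points, their assignments, and one representative point per cluster; what is algorithm-specific is how the representatives are chosen, means for KMeans/BisectingKMeans versus medoids for a k-medoids algorithm:

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}

// Illustrative helper, not Spark API: the sum is the same either way;
// only the choice of `representatives` differs between algorithms.
def wssse(
    points: Seq[Vector],
    assignments: Seq[Int],
    representatives: Map[Int, Vector]): Double = {
  points.zip(assignments).map { case (point, cluster) =>
    Vectors.sqdist(point, representatives(cluster))  // squared Euclidean distance
  }.sum
}

// k-means would pass the per-cluster means as `representatives`;
// k-medoids would pass the medoids instead.
```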

Sure, and classification metrics like AUC only make sense for classifiers that output more than just a label -- they have to output a probability or score of some kind. Not every metric necessarily makes sense for every model, and we can use class hierarchy or just argument checking to avoid applying metrics where nonsensical. WSSSE can't be used for k-medoids, yes. k-medoids is also not in Spark, AFAIK. It's still not an argument to not abstract this at all.

@srowen I agree that a metric should be irrelevant to the details of the algorithms. AUC is irrelevant to the algorithm; it is just relevant to the dataset: in spark.ml, scikit-learn, or any other package, the input dataset contains the labels and predicted scores needed to compute it. I also agree that some general metrics should be abstracted in an Evaluator. I just disagree with treating WSSSE as a general metric.

@zhengruifeng Don't most ML libraries have separate clustering evaluators? For example, WEKA has a ClusterEvaluation class. Scikit-learn just has a metrics module with functions you can call, but I don't really like that option, and in any case it is separate from the estimator/model. H2O has a MetricBuilder that all ML learners (supervised/unsupervised) generate. I think creating a separate evaluation class which would fit in with the other evaluators in Spark would be ideal: it would conform to the structure of the current codebase and possibly limit any confusion users might have.

gentle ping @zhengruifeng |
What changes were proposed in this pull request?
1. Add the evaluation metric to the summary.
2. Add an evaluate() method which returns a summary (see the sketch below).
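
A rough sketch of the API shape this description suggests; beyond `summary` and `evaluate()`, names and fields are assumptions and may differ from the actual patch:

```scala
import org.apache.spark.ml.clustering.KMeans

// assumes `training` and `test` are DataFrames with a "features" vector column
val model = new KMeans().setK(3).setFeaturesCol("features").fit(training)

// 1) the existing training summary would also carry the evaluation metric
val trainingSummary = model.summary

// 2) the new evaluate() would compute the same kind of summary on other data
val testSummary = model.evaluate(test)
// e.g. for k-means the summary would expose WSSSE (exact field name may differ)
```
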
How was this patch tested?
Added tests.