[SPARK-20208][R][DOCS] Document R fpGrowth support #17557

zero323 · 2017-04-07T04:52:48Z

What changes were proposed in this pull request?

Document fpGrowth in:

- vignettes
- programming guide
- code example

How was this patch tested?

Manual tests.

SparkQA · 2017-04-07T05:27:30Z

Test build #75588 has finished for PR 17557 at commit 27e94fd.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-04-07T06:57:36Z

Test build #75593 has started for PR 17557 at commit 30949a1.

SparkQA · 2017-04-07T13:25:34Z

Test build #75602 has finished for PR 17557 at commit 2f4c70e.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-04-07T16:10:01Z

Test build #75605 has finished for PR 17557 at commit eb4939d.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

felixcheung · 2017-04-07T18:14:17Z

R/pkg/vignettes/sparkr-vignettes.Rmd

Thanks! I'd prefer an example with real data...

What do you mean by "real"? Something human readable (e.g. milk, bread, butter) or some standard pattern mining dataset? If the former, it is not a problem. If the latter, I am not aware of any dataset which would be safe enough on the license side.

something that is not coded in 3 lines ;)
reading from a file, if we could. If there isn't any dataset that we are licensed to use, can we anonymize an existing one?

> something that is not coded in 3 lines ;)

That's for sure :) For now I am mostly trying to figure out how to present this to make it useful. For the ML guide we can safely reuse data/mllib, but I don't think we can do the same with the vignette unless we bring in sample_fpgrowth.txt as package data.
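As a rough illustration of the human-readable option raised in this thread (milk, bread, butter), the plain-Python sketch below splits comma-separated baskets into item lists and computes itemset support and rule confidence by brute force. It is not the FP-growth algorithm and it is not the SparkR API. The basket data and the helper names `itemset_support` and `confidence` are invented for this illustration and do not come from the PR.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, invented for illustration; not data from this PR.
raw = ["milk,bread,butter", "milk,bread", "bread,butter", "milk,bread,butter"]

# Split each comma-separated string into a sorted list of distinct items,
# so that every itemset has one canonical tuple representation.
transactions = [sorted(set(line.split(","))) for line in raw]


def itemset_support(transactions, max_size=2):
    """Return {itemset: fraction of transactions that contain it}."""
    counts = Counter()
    for items in transactions:
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    total = len(transactions)
    return {itemset: count / total for itemset, count in counts.items()}


def confidence(support, antecedent, consequent):
    """confidence(A -> B) = support(A and B together) / support(A)."""
    both = tuple(sorted(set(antecedent) | set(consequent)))
    return support[both] / support[tuple(sorted(antecedent))]


support = itemset_support(transactions)
print(support[("bread",)])                          # 1.0
print(confidence(support, ("milk",), ("bread",)))   # 1.0
```

Brute-force enumeration is only workable for tiny inputs like this one. It is shown here solely to make the shape of the data (one list of items per transaction) and the two measures concrete.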

SparkQA · 2017-04-09T13:14:24Z

Test build #75633 has finished for PR 17557 at commit 03afd08.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

zero323 · 2017-04-11T08:56:34Z

@felixcheung For the vignette I used a slightly larger synthetic dataset, which should show all the features implemented by fpm. For the examples I used the same data as #17130.

felixcheung · 2017-04-16T19:36:26Z

R/pkg/vignettes/sparkr-vignettes.Rmd

Perhaps it's slightly less clear: there are 3 references to "items" (or really, just the SparkDataFrame and its column name), so which "items" is L923 referring to?

I like the approach you have there
https://github.com/apache/spark/pull/17557/files#diff-1d0d34d8ea18a9340f0a02c6befe6269R30

@felixcheung Updated.

BTW There is a JIRA tracking SQL functions parity, isn't there?

do you mean making sure we have all the SQL functions in R?
we don't, actually, since it's an evolving task - there are constantly new functions being added.

I think you are referring to split - yes we should probably add that in R too

nit: could you please rename the dataframe to df like the other example you have too?

I was pretty sure I've seen one :) split and array. There are of course name conflicts involved (spark.array?) but it would be really useful to have these.

that's possible, but I'm fairly certain there are still quite a few functions we have missed over the years that are not in JIRA.

feel free to add them - would appreciate your help.

agree array is a bit tricky - I'd rather not have to diverge, for consistency with the array_contains function and so on, but I can see spark.array might be an approach. Or perhaps array_col?

Or maybe create_array (like PySpark create_map)?

SparkQA · 2017-04-17T12:10:22Z

Test build #75854 has finished for PR 17557 at commit 4c02933.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-04-17T16:56:29Z

Test build #75859 has finished for PR 17557 at commit ab251f1.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

felixcheung

LGTM except one issue

felixcheung · 2017-04-18T07:03:04Z

R/pkg/vignettes/sparkr-vignettes.Rmd

oops. missed this - this should be {r}

SparkQA · 2017-04-18T11:02:01Z

Test build #75893 has finished for PR 17557 at commit 3445815.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

## What changes were proposed in this pull request?

Document fpGrowth in:

- vignettes
- programming guide
- code example

## How was this patch tested?

Manual tests.

Author: zero323 <[email protected]>

Closes #17557 from zero323/SPARK-20208.

(cherry picked from commit 702d85a)
Signed-off-by: Felix Cheung <[email protected]>

felixcheung · 2017-04-19T03:00:14Z

merged to master/2.2

zero323 · 2017-04-19T11:12:31Z

Thanks @felixcheung!

## What changes were proposed in this pull request?

Document fpGrowth in:

- vignettes
- programming guide
- code example

## How was this patch tested?

Manual tests.

Author: zero323 <[email protected]>

Closes apache#17557 from zero323/SPARK-20208.

zero323 force-pushed the SPARK-20208 branch from 30949a1 to 2f4c70e Compare April 7, 2017 12:47

felixcheung reviewed Apr 7, 2017

zero323 force-pushed the SPARK-20208 branch from eb4939d to 03afd08 Compare April 9, 2017 12:39

zero323 changed the title ~~[SPARK-20208][WIP][R][DOCS] Document R fpGrowth support~~ [SPARK-20208][R][DOCS] Document R fpGrowth support Apr 9, 2017

felixcheung requested changes Apr 16, 2017

felixcheung requested changes Apr 18, 2017

Document R fpGrowth support

3445815

- Vignettes.
- Programming guide.
- Code example.

zero323 force-pushed the SPARK-20208 branch from ab251f1 to 3445815 Compare April 18, 2017 10:22

asfgit closed this in 702d85a Apr 19, 2017

zero323 deleted the SPARK-20208 branch April 20, 2017 20:43
