Skip to content

Conversation

@shivaram
Copy link
Contributor

What changes were proposed in this pull request?

This change skips tests that use the Hadoop libraries while running
on CRAN check with Windows as the operating system. This is to handle
cases where the Hadoop winutils binaries are missing on the target
system. The skipped tests consist of

  1. Tests that save, load a model in MLlib
  2. Tests that save, load CSV, JSON and Parquet files in SQL
  3. Hive tests

How was this patch tested?

Tested by running on a local windows VM with HADOOP_HOME unset. Also testing with https://win-builder.r-project.org

This change skips tests that use the Hadoop libraries while running
on CRAN check with Windows as the operating system. This is to handle
cases where the Hadoop winutils binaries are missing on the target
system. The skipped tests consist of
1. Tests that save, load a model in MLlib
2. Tests that save, load CSV, JSON and Parquet files in SQL
3. Hive tests
@shivaram
Copy link
Contributor Author

cc @felixcheung

FWIW it might easier to view the diff by adding w=1 to the URL. i.e. https://github.com/apache/spark/pull/17966/files?w=1

@vanzin
Copy link
Contributor

vanzin commented May 12, 2017

This doesn't seem to really have anything to do with the errors in SPARK-20666. As I pointed out, they do not only happen on Windows.

@shivaram
Copy link
Contributor Author

Sorry @vanzin I got the wrong JIRA number. Fixing it now

@shivaram shivaram changed the title [SPARK-20666] Skip tests that use Hadoop utils on CRAN Windows [SPARK-20727] Skip tests that use Hadoop utils on CRAN Windows May 12, 2017
@shivaram
Copy link
Contributor Author

This is SPARK-20727 - I just happened to have the other JIRA also open and pasted it incorrectly

@SparkQA
Copy link

SparkQA commented May 12, 2017

Test build #76876 has finished for PR 17966 at commit 40273ab.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Copy link
Contributor Author

Actually thinking more about this, I think we should be checking for availability of hadoop library / binaries rather than is_cran. For example I just found that win-builder only runs R CMD check (no --as-cran) and so this PR doesn't actually help on that machine.

@shivaram
Copy link
Contributor Author

@HyukjinKwon Do we know why things sometime queue for a long time on AppVeyor ? Like this PR has been queued for around 5 hours right now.

@HyukjinKwon
Copy link
Member

Thank you for cc'ing me. I think primarily it is because single AppVeyor account is shared across several Apache projects but the number of concurrent jobs is single up to my knowledge. So, it is a kind of a global queue.

I am monitoring ...

https://ci.appveyor.com/project/ApacheSoftwareFoundation/spark
https://ci.appveyor.com/project/ApacheSoftwareFoundation/reef
https://ci.appveyor.com/project/ApacheSoftwareFoundation/nifi
https://ci.appveyor.com/project/ApacheSoftwareFoundation/thrift
https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow
https://ci.appveyor.com/project/ApacheSoftwareFoundation/syncope

@HyukjinKwon
Copy link
Member

Just FYI, closing and opening a PR is a workaround to re-trigger the build in AppVeyor as (I assume) we all don't currently have the permission via AppVeyor Web UI.

@felixcheung
Copy link
Member

felixcheung commented May 13, 2017 via email

@felixcheung
Copy link
Member

felixcheung commented May 13, 2017

So I'd propose this

is_cran <- function() { !identical(Sys.getenv("NOT_CRAN"), "true") }
is_windows <- function() { .Platform$OS.type == "windows" }
hadoop_home_set <- function() { !identical(Sys.getenv("HADOOP_HOME"), "") }

not_cran_and_windows_with_hadoop <- function() { !is_cran() && (!is_windows() || hadoop_home_set()) }

@felixcheung
Copy link
Member

@shivaram have you got a chance to work on this again?

@shivaram
Copy link
Contributor Author

Sorry I've been out traveling -- I'll try to update this by tonight

@shivaram
Copy link
Contributor Author

@felixcheung I made the change - I'm right now going to test this in my Windows VM. Will update this PR with the results

@SparkQA
Copy link

SparkQA commented May 20, 2017

Test build #77114 has finished for PR 17966 at commit 7555501.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Copy link
Contributor Author

@felixcheung Unfortunately I'm out traveling and haven't been able to do the windows tests yet -- Would you have a chance to do that ? Also what are your thoughts on merging this while we test given the upcoming RC cut ?

@felixcheung
Copy link
Member

testing now - could you submit to https://win-builder.r-project.org?

Copy link
Member

@felixcheung felixcheung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
tested package with and without HADOOP_HOME, on Windows, both passed

asfgit pushed a commit that referenced this pull request May 23, 2017
## What changes were proposed in this pull request?

This change skips tests that use the Hadoop libraries while running
on CRAN check with Windows as the operating system. This is to handle
cases where the Hadoop winutils binaries are missing on the target
system. The skipped tests consist of
1. Tests that save, load a model in MLlib
2. Tests that save, load CSV, JSON and Parquet files in SQL
3. Hive tests

## How was this patch tested?

Tested by running on a local windows VM with HADOOP_HOME unset. Also testing with https://win-builder.r-project.org

Author: Shivaram Venkataraman <[email protected]>

Closes #17966 from shivaram/sparkr-windows-cran.

(cherry picked from commit d06610f)
Signed-off-by: Felix Cheung <[email protected]>
@felixcheung
Copy link
Member

merged to master/2.2

I think we should still check win-builder. Also it's a bit hard to tell if the skipped tests are skipped - might want to follow up with a trace

@asfgit asfgit closed this in d06610f May 23, 2017
@shivaram
Copy link
Contributor Author

Thanks I'll try to kick off the winbuilder build soon (i'm out of town till tomorrow). One more thing we might need to fix is that winbuilder has a 10 or 20 minute time limit for tests (not sure if the CRAN check has a similar limit ?)

liyichao pushed a commit to liyichao/spark that referenced this pull request May 24, 2017
## What changes were proposed in this pull request?

This change skips tests that use the Hadoop libraries while running
on CRAN check with Windows as the operating system. This is to handle
cases where the Hadoop winutils binaries are missing on the target
system. The skipped tests consist of
1. Tests that save, load a model in MLlib
2. Tests that save, load CSV, JSON and Parquet files in SQL
3. Hive tests

## How was this patch tested?

Tested by running on a local windows VM with HADOOP_HOME unset. Also testing with https://win-builder.r-project.org

Author: Shivaram Venkataraman <[email protected]>

Closes apache#17966 from shivaram/sparkr-windows-cran.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants