[SPARK-12146] [SparkR] SparkR jsonFile should support multiple input files #10145
Test build #47193 has finished for PR 10145 at commit
R/pkg/R/SQLContext.R
Could you move this change to the planned new JIRA issue about parquetFile? Let's focus this PR on jsonFile.
Test build #47264 has finished for PR 10145 at commit
@yanboliang We moved the test file locations in #10030, so you'll need to rebase onto the master branch.
R/pkg/R/SQLContext.R
I thought @sun-rui noted we should take a list or vector? In that case we should change this code to
paths <- as.list(suppressWarnings(normalizePath(path)))
yes
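The suggestion above relies on `normalizePath` being vectorized, so the same line handles a single path or a character vector of paths. A minimal sketch in base R follows; `normalize_paths` is a hypothetical helper name used only for illustration, not the SparkR implementation, and the file names are placeholders.

```r
# Hypothetical helper (not part of SparkR) wrapping the line suggested in the review.
normalize_paths <- function(path) {
  # normalizePath() accepts a character vector and returns one entry per input.
  # It warns for paths that do not exist, hence suppressWarnings().
  as.list(suppressWarnings(normalizePath(path)))
}

# A single path and a character vector of paths both yield a list,
# with one element per input path.
single <- normalize_paths("people.json")
several <- normalize_paths(c("people1.json", "people2.json"))

length(single)
length(several)
```

Returning a list in both cases means the caller does not need to branch on whether one or several paths were passed.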
I found that the tests fail with an error if we call functions that are marked as deprecated: 2. Error: read.json()/jsonFile() on a local file returns a DataFrame -----------
(converted from warning) 'jsonFile' is deprecated.
Use 'read.json' instead.
See help("Deprecated")
1: withCallingHandlers(eval(code, new_test_environment), error = capture_calls, message = function(c) invokeRestart("muffleMessage"))
2: eval(code, new_test_environment)
3: eval(expr, envir, enclos)
4: jsonFile(sqlContext, c(jsonPath, jsonPath2)) at test_sparkSQL.R:384
5: .Deprecated("read.json")
6: warning(paste(msg, collapse = ""), call. = FALSE, domain = NA)
7: .signalSimpleWarning("'jsonFile' is deprecated.\nUse 'read.json' instead.\nSee help(\"Deprecated\")",
quote(NULL))
8: withRestarts({
.Internal(.signalCondition(simpleWarning(msg, call), msg, call))
.Internal(.dfltWarn(msg, call))
}, muffleWarning = function() NULL)
9: withOneRestart(expr, restarts[[1L]])
10: doWithOneRestart(return(expr), restart)
I vote for adding suppressWarnings, and for adding a comment about this in the test cases.
hmm, I guess deprecation is a warning which is now getting turned into an error.
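The behaviour discussed above can be reproduced in base R without Spark. The sketch below assumes the test run treats warnings as errors, which is what the "(converted from warning)" message suggests; it simulates that with `options(warn = 2)`. `old_read` and `new_read` are hypothetical stand-ins for `jsonFile` and `read.json`.

```r
# Hypothetical deprecated function standing in for jsonFile().
old_read <- function(path) {
  .Deprecated("new_read")  # signals a deprecation warning
  path
}

# Assumption: warnings are promoted to errors, as in the failing test run.
old_opts <- options(warn = 2)

# Without suppressWarnings(), the deprecation warning becomes an error.
unsuppressed <- tryCatch(
  old_read("people.json"),
  error = function(e) "error"
)

# With suppressWarnings(), the warning is muffled before it can be promoted,
# so the call returns normally.
suppressed <- tryCatch(
  suppressWarnings(old_read("people.json")),
  error = function(e) "error"
)

# Restore the previous warning settings.
options(old_opts)
```

This is why wrapping the remaining `jsonFile` test call in `suppressWarnings` keeps the deprecated function covered by tests without failing the suite.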
4decf22 to
47c7ee1
Compare
Test build #47308 has finished for PR 10145 at commit
looks good, thanks for making these changes
Test build #47313 has finished for PR 10145 at commit
LGTM
Test build #47516 has finished for PR 10145 at commit
…etFile SparkR support ```read.parquet``` and deprecate ```parquetFile```. This change is similar with #10145 for ```jsonFile```. Author: Yanbo Liang <[email protected]> Closes #10191 from yanboliang/spark-12198.
…etFile SparkR support ```read.parquet``` and deprecate ```parquetFile```. This change is similar with #10145 for ```jsonFile```. Author: Yanbo Liang <[email protected]> Closes #10191 from yanboliang/spark-12198. (cherry picked from commit eeb5872) Signed-off-by: Shivaram Venkataraman <[email protected]>
@yanboliang Could you bring this PR up to date with master?
06ae53d to
1d74b18
Compare
Test build #47563 has finished for PR 10145 at commit
LGTM. Merging this to master and branch-1.6 |
…iles

* ```jsonFile``` should support multiple input files, such as:
```R
jsonFile(sqlContext, c("path1", "path2")) # character vector as arguments
jsonFile(sqlContext, "path1,path2")
```
* Meanwhile, ```jsonFile``` has been deprecated by Spark SQL and will be removed in Spark 2.0, so we mark ```jsonFile``` as deprecated and use ```read.json``` on the SparkR side.
* Replace all ```jsonFile``` with ```read.json``` in test_sparkSQL.R, but still keep the jsonFile test case.
* If this PR is accepted, we should also make almost the same change for ```parquetFile```.

cc felixcheung sun-rui shivaram

Author: Yanbo Liang <[email protected]>

Closes #10145 from yanboliang/spark-12146.

(cherry picked from commit 0fb9825)
Signed-off-by: Shivaram Venkataraman <[email protected]>