
Commit 792113b

Branden Smith authored and kai-chi committed
[SPARK-26745][SQL][TESTS] JsonSuite test case: empty line -> 0 record count
This PR consists of the `test` components of apache#23665 only, minus the associated patch from that PR.

It adds a new unit test to `JsonSuite` which verifies that the `count()` returned from a `DataFrame` loaded from JSON containing empty lines does not include those empty lines in the record count. The test runs `count` prior to otherwise reading data from the `DataFrame`, so as to catch future cases where a pre-parsing optimization might result in `count` results inconsistent with existing behavior.

This PR is intended to be deployed alongside apache#23667; `master` currently causes the test to fail, as described in [SPARK-26745](https://issues.apache.org/jira/browse/SPARK-26745).

Manual testing, existing `JsonSuite` unit tests.

Closes apache#23674 from sumitsu/json_emptyline_count_test.

Authored-by: Branden Smith <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
(cherry picked from commit 63bced9)
Signed-off-by: Dongjoon Hyun <[email protected]>
1 parent c57cad8 commit 792113b

sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala

Lines changed: 12 additions & 0 deletions
@@ -2515,4 +2515,16 @@ class JsonSuite extends QueryTest with SharedSQLContext with TestJsonData {
     checkCount(2)
     countForMalformedJSON(0, Seq(""))
   }
+
+  test("SPARK-26745: count() for non-multiline input with empty lines") {
+    withTempPath { tempPath =>
+      val path = tempPath.getCanonicalPath
+      Seq("""{ "a" : 1 }""", "", """ { "a" : 2 }""", " \t ")
+        .toDS()
+        .repartition(1)
+        .write
+        .text(path)
+      assert(spark.read.json(path).count() === 2)
+    }
+  }
 }
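
The expectation this test encodes can be modelled without Spark. The sketch below is only an illustration of the intended semantics (blank or whitespace-only lines in line-delimited JSON are not records); `EmptyLineCountSketch` and `expectedRecordCount` are hypothetical names, and this is not how Spark's JSON data source computes `count()`.

```scala
// Hypothetical model, not Spark code: counts the lines the test expects to be
// treated as records, i.e. every line that is not empty or whitespace-only.
object EmptyLineCountSketch {
  def expectedRecordCount(lines: Seq[String]): Int =
    lines.count(_.trim.nonEmpty)

  def main(args: Array[String]): Unit = {
    // Same four lines the test writes to the temporary path.
    val input = Seq("""{ "a" : 1 }""", "", """ { "a" : 2 }""", " \t ")
    println(expectedRecordCount(input)) // prints 2
  }
}
```

The model ignores malformed records and multiline mode; it covers only the empty-line case that the test targets.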
