Commit 1af7072

LuciferYang authored and MaxGekk committed
[SPARK-36970][SQL] Manually disable format `B` of the `date_format` function to make Java 17 compatible with Java 8
### What changes were proposed in this pull request?

The `date_format` function with the `B` format behaves differently on Java 8 and Java 17. The query `select date_format('2018-11-17 13:33:33.333', 'B')` in `datetime-formatting-invalid.sql` demonstrates this.

The result with Java 8 is

```
-- !query
select date_format('2018-11-17 13:33:33.333', 'B')
-- !query schema
struct<>
-- !query output
java.lang.IllegalArgumentException
Unknown pattern letter: B
```

and the result with Java 17 is

```
- datetime-formatting-invalid.sql *** FAILED ***
  datetime-formatting-invalid.sql
  Expected "struct<[]>", but got "struct<[date_format(2018-11-17 13:33:33.333, B):string]>" Schema did not match for query #34
  select date_format('2018-11-17 13:33:33.333', 'B'): -- !query
  select date_format('2018-11-17 13:33:33.333', 'B')
  -- !query schema
  struct<date_format(2018-11-17 13:33:33.333, B):string>
  -- !query output
  in the afternoon (SQLQueryTestSuite.scala:469)
```

We found that this is due to the new support for format `B` in Java 17:

```
'B' is used to represent Pattern letters to output a day period in Java 17

 *   Pattern  Count  Equivalent builder methods
 *   -------  -----  --------------------------
 *     B       1      appendDayPeriodText(TextStyle.SHORT)
 *     BBBB    4      appendDayPeriodText(TextStyle.FULL)
 *     BBBBB   5      appendDayPeriodText(TextStyle.NARROW)
```

From [http://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html](http://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html), we can confirm that format `B` is currently neither documented nor supported for the `date_format` function.

So the main change of this PR is to manually disable format `B` of the `date_format` function in `DateTimeFormatterHelper`, so that Java 17 behaves the same as Java 8.

### Why are the changes needed?

Ensure that Java 17 and Java 8 have the same behavior.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

- Pass the Jenkins or GitHub Action
- Manual test of `SQLQueryTestSuite` with JDK 17

**Before**

```
- datetime-formatting-invalid.sql *** FAILED ***
  datetime-formatting-invalid.sql
  Expected "struct<[]>", but got "struct<[date_format(2018-11-17 13:33:33.333, B):string]>" Schema did not match for query #34
  select date_format('2018-11-17 13:33:33.333', 'B'): -- !query
  select date_format('2018-11-17 13:33:33.333', 'B')
  -- !query schema
  struct<date_format(2018-11-17 13:33:33.333, B):string>
  -- !query output
  in the afternoon (SQLQueryTestSuite.scala:469)
```

**After**

The test `select date_format('2018-11-17 13:33:33.333', 'B')` in `datetime-formatting-invalid.sql` passed.

Closes #34237 from LuciferYang/SPARK-36970.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
1 parent dc1db95 commit 1af7072
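
The JDK difference described in the commit message can be checked outside Spark with a small probe. The following is a minimal sketch, not part of the commit: `DayPeriodProbe` and `probe` are hypothetical names introduced here for illustration, and the code only calls `java.time.format.DateTimeFormatter` directly. According to the commit message, the `B` pattern is rejected on Java 8 with `Unknown pattern letter: B`, while Java 17 accepts it and formats the timestamp as `in the afternoon`.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import java.util.Locale
import scala.util.{Failure, Success, Try}

// Hypothetical helper, not part of Spark or of this commit.
object DayPeriodProbe {
  // The timestamp used by the query in datetime-formatting-invalid.sql.
  private val ts = LocalDateTime.of(2018, 11, 17, 13, 33, 33, 333000000)

  // Returns Right(formatted text) if the running JDK accepts the pattern,
  // or Left(exception message) if DateTimeFormatter rejects it.
  def probe(pattern: String): Either[String, String] =
    Try(DateTimeFormatter.ofPattern(pattern, Locale.US).format(ts)) match {
      case Success(text) => Right(text)
      case Failure(e: IllegalArgumentException) => Left(e.getMessage)
      case Failure(e) => throw e
    }

  def main(args: Array[String]): Unit = {
    println(s"java.version = ${System.getProperty("java.version")}")
    println(s"B -> ${probe("B")}")
  }
}
```

Running `DayPeriodProbe.main` under different JDKs should show a `Left` on a JDK that does not know the letter and a `Right` on one that does, which is the divergence the patch removes at the Spark level.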

3 files changed, with 7 additions and 3 deletions

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala

Lines changed: 5 additions & 1 deletion
@@ -279,7 +279,11 @@ private object DateTimeFormatterHelper {
   // localized, for the default Locale.US, it uses Sunday as the first day of week, while in Spark
   // 2.4, the SimpleDateFormat uses Monday as the first day of week.
   final val weekBasedLetters = Set('Y', 'W', 'w', 'u', 'e', 'c')
-  final val unsupportedLetters = Set('A', 'n', 'N', 'p')
+  // SPARK-36970: `select date_format('2018-11-17 13:33:33.333', 'B')` failed with Java 8,
+  // but use Java 17 will return `in the afternoon` because 'B' is used to represent
+  // `Pattern letters to output a day period` in Java 17. So there manual disabled `B` for
+  // compatibility with Java 8 behavior.
+  final val unsupportedLetters = Set('A', 'B', 'n', 'N', 'p')
   // The quarter fields will also be parsed strangely, e.g. when the pattern contains `yMd` and can
   // be directly resolved then the `q` do check for whether the month is valid, but if the date
   // fields is incomplete, e.g. `yM`, the checking will be bypassed.
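
To illustrate how a set such as `unsupportedLetters` can reject a pattern before it ever reaches the JDK formatter, here is a simplified sketch. It is an assumption about the general shape of the check, not a copy of Spark's implementation: `PatternLetterCheck` and `validate` are hypothetical names, and only the set contents and the message text `Illegal pattern character: B` (which appears in the updated `datetime-formatting-invalid.sql.out`) come from this commit.

```scala
// Hypothetical, simplified validator; Spark's real logic lives in
// DateTimeFormatterHelper and does more than this.
object PatternLetterCheck {
  // Set contents taken from the diff above.
  final val unsupportedLetters = Set('A', 'B', 'n', 'N', 'p')

  // Throws if an unsupported letter appears outside single-quoted literal text.
  def validate(pattern: String): Unit = {
    // Splitting on the quote character leaves unquoted text at even indices
    // and quoted literal text at odd indices.
    pattern.split("'").zipWithIndex.foreach { case (part, index) =>
      if (index % 2 == 0) {
        part.find(unsupportedLetters.contains).foreach { c =>
          throw new IllegalArgumentException(s"Illegal pattern character: $c")
        }
      }
    }
  }
}
```

With a check of this kind in front of the formatter, `validate("B")` fails in the same way on every JDK, while a quoted literal such as `'B'` still passes because it is not a pattern letter.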

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DatetimeFormatterSuite.scala

Lines changed: 1 addition & 1 deletion
@@ -76,7 +76,7 @@ trait DatetimeFormatterSuite extends SparkFunSuite with SQLHelper with Matchers

     Seq(true, false).foreach { isParsing =>
       // not support by the legacy one too
-      val unsupportedBoth = Seq("QQQQQ", "qqqqq", "eeeee", "A", "c", "n", "N", "p", "e")
+      val unsupportedBoth = Seq("QQQQQ", "qqqqq", "eeeee", "A", "B", "c", "n", "N", "p", "e")
       unsupportedBoth.foreach { pattern =>
         intercept[IllegalArgumentException](checkFormatterCreation(pattern, isParsing))
       }

sql/core/src/test/resources/sql-tests/results/datetime-formatting-invalid.sql.out

Lines changed: 1 addition & 1 deletion
@@ -314,7 +314,7 @@ select date_format('2018-11-17 13:33:33.333', 'B')
 struct<>
 -- !query output
 java.lang.IllegalArgumentException
-Unknown pattern letter: B
+Illegal pattern character: B


 -- !query
