Fix discrepancy in Float64 to timestamp(9) casts for constants #16639

findepi · 2025-07-01T09:30:38Z

Before the change, when casting a `Float64` value to `Timestamp(Nanosecond, None)`, the result would depend on whether the source value is a constant-foldable scalar. This is because `ScalarValue.cast_to` had special treatment for that source and destination type pair, producing a different result from the canonical one.

So, this is not really changing the cast(a_double as timestamp) cast. It only changes a specialized and incorrect code path that the cast takes under very specific and narrow circumstances (constant folding, and only for nanosecond precision). This should be a non-controversial change.
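To make the discrepancy concrete, here is a minimal Rust sketch that models the two behaviours reported in this thread. It is not DataFusion's code: both function names are hypothetical, and the arithmetic is an assumption inferred only from the outputs quoted in this conversation (the constant 1.1 came out as 1.1 s, while the same value in a column came out as 1 ns).

```rust
/// Canonical path, as observed in this thread: the float is truncated to an
/// integer and read as a count of nanoseconds since the epoch.
/// (Hypothetical helper; a model of the observed results only.)
fn canonical_cast_nanos(value: f64) -> i64 {
    value.trunc() as i64
}

/// Old constant-folding path, as observed in this thread: the float is read
/// as seconds since the epoch and scaled up to nanoseconds.
/// (Hypothetical helper; a model of the observed results only.)
fn old_scalar_cast_nanos(value: f64) -> i64 {
    (value * 1_000_000_000_f64).trunc() as i64
}

fn main() {
    for v in [1.1_f64, 123456789.123456789] {
        println!(
            "{v}: canonical = {} ns, old constant folding = {} ns",
            canonical_cast_nanos(v),
            old_scalar_cast_nanos(v)
        );
    }
}
```

Under these assumptions the same input lands a factor of 10^9 apart depending on the path taken, which is the inconsistency this PR removes.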


findepi · 2025-07-01T09:30:42Z

cc @chenkovsky @alamb @jatin510

findepi · 2025-07-01T09:36:54Z

cc @Omega359 @spaydar per #16636 (comment)

jatin510 · 2025-07-01T13:56:04Z

@findepi
Here is how DuckDB behaves for decimal/float input:

D  SELECT to_timestamp(1.1) as c1;
┌─────────────────────────────┐
│             c1              │
│  timestamp with time zone   │
├─────────────────────────────┤
│ 1970-01-01 05:30:01.1+05:30 │
└─────────────────────────────┘

What is the expected output?

findepi · 2025-07-01T15:24:20Z

@jatin510 thanks for feedback.
to_timestamp(a_double), to_timestamp_seconds(a_double) and cast(a_double as timestamp) are 3 different things that do not have to behave the same. it's nice if they do

however,
cast(a_double as timestamp) must behave the same as cast(a_double as timestamp)... which clearly is a problem now: #16636
moreover, cast(a_double as timestamp(p)) must behave reasonably consistently for various p values (another clear problem today)
also cast from a double (Float64) must behave consistently with a cast from a float (Float32) (~~will add~~ added test coverage for that)

i am not really changing the cast(a_double as timestamp) cast. I am only changing a specialized, incorrect code path that's taken by that cast under very specific and narrow circumstances. this should be non-controversial

Omega359 · 2025-07-02T13:45:36Z

I just compared the to_timestamp calls that were changed in the slt file against what PostgreSQL generates, and the results are different:

SELECT to_timestamp(1.1) as a, to_timestamp(-1.1) as b, to_timestamp(0.0) as c, to_timestamp(1.23456789) as d, to_timestamp(123456789.123456789) as e;

| a | b | c | d | e |
|---|---|---|---|---|
| 1970-01-01 00:00:01.100000 +00:00 | 1969-12-31 23:59:58.900000 +00:00 | 1970-01-01 00:00:00.000000 +00:00 | 1970-01-01 00:00:01.234568 +00:00 | 1973-11-29 21:33:09.123457 +00:00 |
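The PostgreSQL values above are consistent with reading the argument as seconds since the epoch and keeping microsecond precision. A minimal Rust sketch of that arithmetic follows, assuming round-to-nearest; the helper name `seconds_to_micros` is hypothetical, and this is a model of the printed results, not PostgreSQL's implementation.

```rust
/// Convert fractional seconds since the epoch to whole microseconds.
/// Rounding to the nearest microsecond is an assumption of this sketch.
fn seconds_to_micros(seconds: f64) -> i64 {
    (seconds * 1_000_000.0).round() as i64
}

fn main() {
    // Same inputs as columns a..e in the query above.
    for s in [1.1_f64, -1.1, 0.0, 1.23456789, 123456789.123456789] {
        println!("{s} s -> {} us since epoch", seconds_to_micros(s));
    }
}
```

For example, 1.23456789 s becomes 1234568 µs, matching the `.234568` fraction in column d, and -1.1 s becomes -1100000 µs, i.e. 1.1 s before the epoch as in column b.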

findepi · 2025-07-02T19:05:52Z

@Omega359 thanks. It totally escaped my notice that to_timestamp also changed, not only the casts (that are part of the same test).

the function was not meant to be changed

findepi · 2025-07-02T19:19:50Z

I now realize that's exactly what @jatin510 pointed out earlier, i just didn't understand.
I pushed a commit restoring the to_timestamp(double) behavior to whatever it was before.

@Omega359 @jatin510 thanks again

Omega359 · 2025-07-02T19:46:12Z

cast(123456789.123456789 as timestamp) => 1970-01-01T00:00:00.123456789

That strikes me as wrong.

findepi · 2025-07-02T19:51:48Z

@Omega359 i am not exactly a fan of the cast semantics. If i were to choose, i would maybe choose something different.
Note that it's pre-existing though:

Consider

- cast source type: float 32, float 64
- cast target type: timestamp seconds / millis / micros / nanos
- in scalar (constant folding) or query context

from these 16 combinations, this PR changes only one, to make it consistent with the other 15
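The count of 16 follows from 2 source types × 4 target units × 2 contexts. As a small sketch, the Rust below enumerates them and marks the single combination this PR touches (Float64, nanoseconds, constant folding). The enum and function names are hypothetical, chosen only for this illustration.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Source { Float32, Float64 }

#[derive(Clone, Copy, Debug, PartialEq)]
enum Unit { Second, Millisecond, Microsecond, Nanosecond }

#[derive(Clone, Copy, Debug, PartialEq)]
enum Context { ConstantFolding, Query }

/// Every (source type, target unit, evaluation context) combination.
fn all_combinations() -> Vec<(Source, Unit, Context)> {
    let mut out = Vec::new();
    for s in [Source::Float32, Source::Float64] {
        for u in [Unit::Second, Unit::Millisecond, Unit::Microsecond, Unit::Nanosecond] {
            for c in [Context::ConstantFolding, Context::Query] {
                out.push((s, u, c));
            }
        }
    }
    out
}

/// The one combination whose result changes, per the PR description.
fn changed_by_pr(combo: &(Source, Unit, Context)) -> bool {
    matches!(combo, (Source::Float64, Unit::Nanosecond, Context::ConstantFolding))
}

fn main() {
    for combo in all_combinations() {
        let mark = if changed_by_pr(&combo) { "changed" } else { "unchanged" };
        println!("{combo:?}: {mark}");
    }
}
```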

Omega359 · 2025-07-02T19:56:54Z

Ok. Well, considering I couldn't even figure out how to cast a float to a timestamp in postgres without using to_timestamp, and duckdb outright says it's not implemented (see below), I guess it's consistent (must be nanos?). It's just going to surprise pretty much everyone.

Use ".open FILENAME" to reopen on a persistent database.
D select 1.1::timestamp;
Conversion Error:
Unimplemented type for cast (DECIMAL(2,1) -> TIMESTAMP)

alamb

Thank you for working on this @findepi -- I think we are close

alamb · 2025-07-03T20:23:47Z

datafusion/sqllogictest/test_files/timestamps.slt

 SELECT to_timestamp(1.1) as c1, cast(1.1 as timestamp) as c2, 1.1::timestamp as c3;
 ----
-1970-01-01T00:00:01.100 1970-01-01T00:00:01.100 1970-01-01T00:00:01.100
+1970-01-01T00:00:01.100 1970-01-01T00:00:00.000000001 1970-01-01T00:00:00.000000001


Do I read this difference right: cast(float_col AS timestamp) is now treated as though the value is 1.1 ns, whereas before it was treated as 1.1 sec?

What do we think about simply not supporting explicit conversion from float --> timestamp, to follow the duckdb/postgres model? That feels far more defensible to me than this behavior, which is both different than it was previously AND not consistent with other engines

> both different than it was previously

it's NOT!

it's different ONLY in constant folding context.

BTW to_timestamp(1.1) also works the way it seems to work only in constant folding context. i filed #16678 for this, and this PR fixes that problem as well.

> What do we think about simply not supporting explicit conversion from float --> timestamp

I am supportive of that. Same for decimals and perhaps ints.

But it's definitely more work and more controversial change.
So we should first fix #16636, #16531 and #16678 which are code bugs bringing embarrassment to the project and data corruption to the users.

Yes, thank you, I double checked and reminded myself of what was going on:

DataFusion CLI v48.0.0
> select cast(1.1 as timestamp);
+-------------------------+
| Float64(1.1)            |
+-------------------------+
| 1970-01-01T00:00:01.100 |
+-------------------------+
1 row(s) fetched.
Elapsed 0.005 seconds.

> select cast(column1 as timestamp) from values (1.1);
+-------------------------------+
| column1                       |
+-------------------------------+
| 1970-01-01T00:00:00.000000001 |
+-------------------------------+
1 row(s) fetched.
Elapsed 0.001 seconds.

Indeed. I tried to capture this in the linked issue #16636
Thanks for bringing this example here.

alamb

This PR looks good to me -- thank you @findepi

alamb · 2025-07-17T11:27:42Z

Yes, thank you, I double checked and reminded myself of what was going on:

DataFusion CLI v48.0.0
> select cast(1.1 as timestamp);
+-------------------------+
| Float64(1.1)            |
+-------------------------+
| 1970-01-01T00:00:01.100 |
+-------------------------+
1 row(s) fetched.
Elapsed 0.005 seconds.

> select cast(column1 as timestamp) from values (1.1);
+-------------------------------+
| column1                       |
+-------------------------------+
| 1970-01-01T00:00:00.000000001 |
+-------------------------------+
1 row(s) fetched.
Elapsed 0.001 seconds.

alamb · 2025-07-17T19:09:19Z

Thanks again @findepi

…e#16639)

* Fix discrepancy in Float64 to timestamp(9) casts

  Before the change, when casting `Float64` value to `Timestamp(Nanosecond, None)`, the result would depend on whether the source value is constant-foldable scalar. This is because `ScalarValue.cast_to` had a special treatment for that source & destination type pair, producing a different result from the canonical one.

* Test Float32 cast to timestamp ntz too

* restore to_timestamp(double) behavior

  the function was not meant to be changed

(cherry picked from commit 4e32ab9)

## Which issue does this PR close?

- Closes #16678.

## Rationale for this change

The issue has been fixed in #16639, this PR just adds a testcase for it.

## What changes are included in this PR?

Add a test case for `to_timestamp(double)` with vectorized input. Similar to the one presented in the issue.

## Are these changes tested?

Yes

## Are there any user-facing changes?

No

github-actions bot added the sqllogictest (SQL Logic Tests (.slt)) and common (Related to common crate) labels Jul 1, 2025

Test Float32 cast to timestamp ntz too

838eeda

github-actions bot added the functions (Changes to functions implementation) label Jul 2, 2025

restore to_timestamp(double) behavior

5c4a062

the function was not meant to be changed

findepi force-pushed the findepi/fix-discrepancy-in-float64-to-timestamp-9-casts-d3834c branch from 83c9c5e to 5c4a062 July 2, 2025 19:19

findepi mentioned this pull request Jul 2, 2025

fix: The inconsistency between scalar and array on the cast decimal to timestamp #16539

Merged

alamb reviewed Jul 3, 2025

alamb changed the title ~~Fix discrepancy in Float64 to timestamp(9) casts~~ Fix discrepancy in Float64 to timestamp(9) casts for constants Jul 17, 2025

alamb approved these changes Jul 17, 2025

alamb merged commit 4e32ab9 into apache:main Jul 17, 2025
29 checks passed

findepi deleted the findepi/fix-discrepancy-in-float64-to-timestamp-9-casts-d3834c branch July 17, 2025 19:40

alexanderbianchi mentioned this pull request Jul 31, 2025

[branch-48] Cherry pick to_timestamp fix for float values DataDog/datafusion#35

Merged

dqkqd mentioned this pull request Oct 4, 2025

to_timestamp(double) gives different results depending on scalar/vectorized call context #16678

Closed

dqkqd mentioned this pull request Oct 18, 2025

test: to_timestamp(double) for vectorized input #18147

Merged
