Closed
Labels
bug: Something isn't working
Description
Describe the bug
While reviewing #8270 I found some bugs with the ntile window function.
- Possible incorrect results when the ntile function argument is larger than the number of rows
- Returning an internal error with large function arguments as opposed to a standard error message
- Crashing the datafusion CLI on negative function arguments
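For illustration, the first and third symptoms are consistent with a bucket formula of the form `i * n / num_rows + 1` applied without clamping `n` to the row count. This is an inference from the output in the reproduction and from the "attempt to multiply with overflow" panic message, not a copy of the code in ntile.rs. The function name `unclamped_ntile` is hypothetical, and `checked_mul` is used here only so the overflow shows up as `None` instead of a panic.

```rust
/// Hypothetical reconstruction (an assumption, not DataFusion's source):
/// assigns bucket `i * n / num_rows + 1` to row `i` without clamping `n`.
/// A row whose multiplication would overflow u64 is reported as `None`.
fn unclamped_ntile(n: u64, num_rows: u64) -> Vec<Option<u64>> {
    (0..num_rows)
        // checked_mul returns None on overflow instead of panicking
        .map(|i| i.checked_mul(n).map(|product| product / num_rows + 1))
        .collect()
}
```

Under this assumption, `unclamped_ntile(9223377, 3)` yields the same three values as the first query in the reproduction, and a very large `n` such as `u64::MAX` overflows on the third row.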
To Reproduce
DataFusion CLI v33.0.0
❯ create table t1 (a int);
0 rows in set. Query took 0.005 seconds.
❯ insert into t1 values (1),(2),(3);
+-------+
| count |
+-------+
| 3 |
+-------+
1 row in set. Query took 0.006 seconds.
-- Do these results make sense? All other databases return ntile values 1,2,3.
-- Tested at https://dbfiddle.uk/
❯ select ntile(9223377) OVER(ORDER BY a) from t1;
+--------------------------------------------------------------------------------------------------------+
| NTILE(Int64(9223377)) ORDER BY [t1.a ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW |
+--------------------------------------------------------------------------------------------------------+
| 1 |
| 3074460 |
| 6148919 |
+--------------------------------------------------------------------------------------------------------+
-- This should return a regular error instead of an internal error
❯ select ntile(9223372036854775809) OVER(ORDER BY a) from t1;
Internal error: Cannot convert UInt64(9223372036854775809) to i64.
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
-- This should not panic and crash the datafusion cli
❯ select ntile(-922337203685477580) OVER(ORDER BY a) from t1;
thread 'main' panicked at /home/ms/git/arrow-datafusion/datafusion/physical-expr/src/window/ntile.rs:100:23:
attempt to multiply with overflow
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
/home/ms/git/arrow-datafusion/datafusion-cli (ntile_output_type ✔) ᐅ
Expected behavior
Covered in the SQL script comments
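As a rough sketch of what those comments ask for, the argument could be validated up front and clamped to the number of rows before any buckets are computed. Both function names below (`validate_ntile_arg`, `ntile_buckets`) and the error wording are hypothetical and are not DataFusion's API. The bucket formula `i * n / num_rows + 1` is an assumption inferred from the output above, and the 128-bit intermediate is just one way to rule out overflow.

```rust
/// Hypothetical helper (not DataFusion's API): rejects non-positive or
/// out-of-range NTILE arguments with an ordinary error, never a panic.
fn validate_ntile_arg(n: i128) -> Result<u64, String> {
    if n <= 0 {
        return Err(format!("NTILE requires a positive integer, but got {n}"));
    }
    if n > i64::MAX as i128 {
        return Err(format!("NTILE argument {n} does not fit in a signed 64-bit integer"));
    }
    Ok(n as u64)
}

/// Hypothetical helper: assigns a 1-based bucket to each of `num_rows` rows.
fn ntile_buckets(n: u64, num_rows: u64) -> Vec<u64> {
    // With more buckets than rows, cap n so each row gets its own bucket.
    let n = n.min(num_rows) as u128;
    let rows = num_rows as u128;
    (0..rows)
        // The product is computed in u128, so it cannot overflow for u64 inputs.
        .map(|i| (i * n / rows) as u64 + 1)
        .collect()
}
```

With three rows, `ntile_buckets(9223377, 3)` gives 1, 2, 3, which is what the other databases return, and the arguments from the second and third queries are rejected by `validate_ntile_arg` with a plain error.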
Additional context
No response