approx_percentile_cont_with_weight result type changed from datafusion v39 to v40 #11874

@Michael-J-Ward

Describe the bug

In datafusion:v39.0.0, calling approx_percentile_cont_with_weight on an integer column produced an integer result.

In datafusion:v40.0.0, calling approx_percentile_cont_with_weight on an integer column produces a float result.

To Reproduce

v39.0.0 behavior

Setup

git checkout 39.0.0
git log --oneline -n 1
cargo run
6a4a280e3 (HEAD, tag: 39.0.0-rc1, tag: 39.0.0, upstream/branch-39) remove unused file
DataFusion CLI v39.0.0
create table foo(x int);
insert into foo values (1), (2), (3);
select arrow_typeof(approx_percentile_cont_with_weight("x", 0.6, 0.5)), approx_percentile_cont_with_weight("x", 0.6, 0.5) from foo;

Output

+-----------------------------------------------------------------------------------+---------------------------------------------------------------------+
| arrow_typeof(APPROX_PERCENTILE_CONT_WITH_WEIGHT(foo.x,Float64(0.6),Float64(0.5))) | APPROX_PERCENTILE_CONT_WITH_WEIGHT(foo.x,Float64(0.6),Float64(0.5)) |
+-----------------------------------------------------------------------------------+---------------------------------------------------------------------+
| Int32                                                                             | 3                                                                   |
+-----------------------------------------------------------------------------------+---------------------------------------------------------------------+
1 row(s) fetched. 
Elapsed 0.010 seconds.

v40.0.0 behavior

Setup

git checkout 40.0.0
git log --oneline -n 1
cargo run
4cae81363 (HEAD, tag: 40.0.0-rc1, tag: 40.0.0, upstream/branch-40) manually remove a reverted PR from the breaking change section
DataFusion CLI v40.0.0
create table foo(x int);
insert into foo values (1), (2), (3);
select arrow_typeof(approx_percentile_cont_with_weight("x", 0.6, 0.5)), approx_percentile_cont_with_weight("x", 0.6, 0.5) from foo;

Output

+-----------------------------------------------------------------------------------+---------------------------------------------------------------------+
| arrow_typeof(approx_percentile_cont_with_weight(foo.x,Float64(0.6),Float64(0.5))) | approx_percentile_cont_with_weight(foo.x,Float64(0.6),Float64(0.5)) |
+-----------------------------------------------------------------------------------+---------------------------------------------------------------------+
| Float64                                                                           | 3.0                                                                 |
+-----------------------------------------------------------------------------------+---------------------------------------------------------------------+
1 row(s) fetched. 
Elapsed 0.011 seconds.

Expected behavior

I would have expected the return type to remain consistent across versions, though I am unsure which implementation DataFusion treats as canonical. SQL Server, for one, does look like it should return a float.
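For anyone who depends on the v39 integer result in the meantime, a possible workaround is an explicit cast. This is only a sketch against the `foo` table from the reproduction above; it has not been run on every version, and because the cast discards any fractional part, it may not round the same way v39 did.

```sql
-- Workaround sketch: cast the Float64 result back to the column's integer type.
-- Assumes the table from the reproduction steps: foo(x int).
select
  arrow_typeof(cast(approx_percentile_cont_with_weight("x", 0.6, 0.5) as int)) as result_type,
  cast(approx_percentile_cont_with_weight("x", 0.6, 0.5) as int) as result
from foo;
```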

Additional context

approx_percentile_cont_with_weight was converted to a UDAF between v39 and v40.

#10917

Labels: bug