
Average groups accumulator doesn't coerce to return type in presence of null values #10113

@gruuya


Describe the bug

We observed TPC-DS q26 failing with the following error:
Arrow error: Invalid argument error: column types must match schema types, expected Decimal128(11, 6) but found Decimal128(38, 10) at column index 2

The source files were generated using DuckDB, and the original column data type is Decimal128(7, 2). The aggregate type coercion then widens this to Decimal128(11, 6) as the return type of AVG: https://github.com/apache/arrow-datafusion/blob/4ad4f90d86c57226a4e0fb1f79dfaaf0d404c273/datafusion/expr/src/type_coercion/aggregates.rs#L457-L462
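
For reference, the widening from Decimal128(7, 2) to Decimal128(11, 6) can be reproduced with a small standalone sketch. It assumes the rule at the linked lines is "add 4 to both precision and scale, capped at 38"; the function and constant names below are hypothetical and are not DataFusion's own API.

```rust
// Assumed caps for Decimal128 precision and scale (names are illustrative).
const MAX_DECIMAL128_PRECISION: u8 = 38;
const MAX_DECIMAL128_SCALE: i8 = 38;

/// Hypothetical helper: computes the (precision, scale) that AVG is expected
/// to return for a Decimal128(precision, scale) input, assuming the rule
/// DECIMAL(min(38, precision + 4), min(38, scale + 4)).
fn avg_decimal_return_type(precision: u8, scale: i8) -> (u8, i8) {
    // Saturating adds avoid overflow on out-of-range inputs before capping.
    let new_precision = precision.saturating_add(4).min(MAX_DECIMAL128_PRECISION);
    let new_scale = scale.saturating_add(4).min(MAX_DECIMAL128_SCALE);
    (new_precision, new_scale)
}
```

Under that assumption, `avg_decimal_return_type(7, 2)` yields `(11, 6)`, which is the type the schema expects. The Decimal128(38, 10) reported in the error happens to match the default precision and scale that arrow-rs assigns to a Decimal128 array when none is set explicitly, which would be consistent with the accumulator's output not being cast to the return type.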

To Reproduce

❯ create table t as values ('a', arrow_cast(1, 'Decimal128(7,2)')), ('b', arrow_cast(NULL, 'Decimal128(7,2)'));
0 rows in set. Query took 0.045 seconds.

❯ select column1, avg(column2) from t group by column1;
Arrow error: Invalid argument error: column types must match schema types, expected Decimal128(11, 6) but found Decimal128(38, 10) at column index 1

Expected behavior

❯ create table t as values ('a', arrow_cast(1, 'Decimal128(7,2)')), ('b', arrow_cast(NULL, 'Decimal128(7,2)'));
0 rows in set. Query took 0.045 seconds.

❯ select column1, avg(column2) from t group by column1;
+---------+----------------+
| column1 | AVG(t.column2) |
+---------+----------------+
| a       | 1.000000       |
| b       |                |
+---------+----------------+
2 row(s) fetched. 
Elapsed 0.019 seconds.


Labels

bug (Something isn't working)
