Implement equality = and inequality <> support for BinaryView #10996

@alamb

Is your feature request related to a problem or challenge?

Part of #10918, [StringViewArray](https://docs.rs/arrow/latest/arrow/array/type.StringViewArray.html) support in DataFusion

@Weijun-H added support for <> and != for StringView in #10985. We also need similar support for BinaryView.

Describe the solution you'd like

To improve the performance of these queries, we need to be able to compare BinaryViewArrays to constant values (and likely to each other).

Thus I would like to be able to run:

BinaryView = scalar
BinaryView = StringViewColumn

I basically want to run the following queries (where table foo has BinaryView columns):

> create table foo as values ('Andrew', 'X'), ('Xiangpeng', 'Xiangpeng'), ('Raphael', 'R');
0 row(s) fetched.
Elapsed 0.002 seconds.

> select * from foo where column1 = 'Andrew';
+---------+---------+
| column1 | column2 |
+---------+---------+
| Andrew  | X       |
+---------+---------+
1 row(s) fetched.
Elapsed 0.003 seconds.

> select * from foo where column1 <> 'Andrew';
+-----------+-----------+
| column1   | column2   |
+-----------+-----------+
| Xiangpeng | Xiangpeng |
| Raphael   | R         |
+-----------+-----------+
2 row(s) fetched.
Elapsed 0.001 seconds.

> select * from foo where column1 = column2;
+-----------+-----------+
| column1   | column2   |
+-----------+-----------+
| Xiangpeng | Xiangpeng |
+-----------+-----------+
1 row(s) fetched.
Elapsed 0.002 seconds.

> select * from foo where column1 <> column2;
+---------+---------+
| column1 | column2 |
+---------+---------+
| Andrew  | X       |
| Raphael | R       |
+---------+---------+
2 row(s) fetched.
Elapsed 0.001 seconds.
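
The `create table foo as values ...` statement above presumably yields plain string columns rather than BinaryView ones, so exercising the BinaryView path needs an explicit cast. A minimal sketch of one way to do that, assuming `arrow_cast` accepts `'BinaryView'` as a target type and that the Utf8 to Binary to BinaryView cast chain is available in your DataFusion version; the table name `foo_view` is hypothetical:

```sql
-- Hypothetical table `foo_view`: the same rows as `foo`, with both columns
-- cast to BinaryView. The double cast is an assumption about which casts
-- are supported; adjust it if a direct cast works in your version.
create table foo_view as
select
  arrow_cast(arrow_cast(column1, 'Binary'), 'BinaryView') as column1,
  arrow_cast(arrow_cast(column2, 'Binary'), 'BinaryView') as column2
from foo;

-- Confirm the column types before testing the comparisons.
select arrow_typeof(column1), arrow_typeof(column2) from foo_view limit 1;

-- BinaryView compared to a scalar, then to another BinaryView column.
select * from foo_view where column1 = 'Andrew';
select * from foo_view where column1 <> column2;
```

Until the support requested here lands, the last two queries may fail during planning or execution. Binary columns may also be displayed differently from the string output shown above.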

Describe alternatives you've considered

I suspect we will need to update the coercion logic and maybe also the arrow equality kernels, such as https://docs.rs/arrow/latest/arrow/compute/kernels/cmp/fn.eq.html.

Additional context

No response
