REGEX_LIKE and REGEX_MATCH don't support LargeUtf8 type #12664

@goldmedal

Describe the bug

While working on #12415, I found that regexp_like and regexp_match don't support the LargeUtf8 type.

To Reproduce

It can be reproduced with the following SQL:

> select regexp_like(arrow_cast('abcdef', 'LargeUtf8'), 'bc');
Internal error: could not cast value to arrow_array::array::byte_array::GenericByteArray<arrow_array::types::GenericStringType<i64>>.
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker

> select regexp_match(arrow_cast('abcdef', 'LargeUtf8'), 'bc');
Internal error: could not cast value to arrow_array::array::byte_array::GenericByteArray<arrow_array::types::GenericStringType<i64>>.
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker

Expected behavior

Both functions should work with LargeUtf8 input just as they do with Utf8:

> select regexp_like(arrow_cast('abcdef', 'Utf8'), 'bc');
+-----------------------------------------------------------------+
| regexp_like(arrow_cast(Utf8("abcdef"),Utf8("Utf8")),Utf8("bc")) |
+-----------------------------------------------------------------+
| true                                                            |
+-----------------------------------------------------------------+
1 row(s) fetched. 
Elapsed 0.010 seconds.

> select regexp_match(arrow_cast('abcdef', 'Utf8'), 'bc');
+------------------------------------------------------------------+
| regexp_match(arrow_cast(Utf8("abcdef"),Utf8("Utf8")),Utf8("bc")) |
+------------------------------------------------------------------+
| [bc]                                                             |
+------------------------------------------------------------------+
1 row(s) fetched. 
Elapsed 0.016 seconds.
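
Until LargeUtf8 is handled directly, a possible workaround is to cast the value back to Utf8 before calling the function. This is a sketch only: it assumes arrow_cast can convert LargeUtf8 to Utf8 in the version being used, and it has not been verified against the build that produced the errors above.

```sql
-- Hypothetical workaround: cast the LargeUtf8 value back to Utf8 first,
-- so the call takes the Utf8 path shown to work above.
select regexp_like(arrow_cast(arrow_cast('abcdef', 'LargeUtf8'), 'Utf8'), 'bc');

select regexp_match(arrow_cast(arrow_cast('abcdef', 'LargeUtf8'), 'Utf8'), 'bc');
```

The extra cast copies the string data, so it only sidesteps the error and does not replace native LargeUtf8 support.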

Additional context

No response

Labels

bug (Something isn't working), good first issue (Good for newcomers)
