Delete operation doesn't do type coercion #1921

@echai58

Environment

Delta-rs version: 0.13.0

Binding: python bindings

Environment:

  • Cloud provider: n/a
  • OS: ubuntu 22.04
  • Other: testing locally via jupyter notebook

Bug

What happened:
When a string predicate is passed to DeltaTable.delete, it appears to be parsed into a DataFusion expression without taking the table schema into account. For example, if a column has type pa.int32() and the predicate is price = 100, the call raises ValueError: Invalid comparison operation: Int32 <= Int64. I assume this happens because the literal 100 is parsed as an int64. Passing price = CAST(100 as INT) instead works as expected, which supports that assumption.

What you expected to happen:
The parser should be schema-aware when converting the string predicate to a DataFusion expression, so that a literal compared against a column is given a type compatible with that column.
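To make the expectation concrete, here is a minimal sketch of one possible coercion rule, written in plain Python. It is not delta-rs or DataFusion code: the function name coerce_literal_type, the INT_RANGES table, and the fallback to Int64 are all assumptions made for illustration. The idea is that an integer literal adopts the column's type when its value fits in that type's range.

```python
# Inclusive value ranges for a few signed integer types (Arrow-style names).
INT_RANGES = {
    "Int8": (-2**7, 2**7 - 1),
    "Int16": (-2**15, 2**15 - 1),
    "Int32": (-2**31, 2**31 - 1),
    "Int64": (-2**63, 2**63 - 1),
}


def coerce_literal_type(column_type: str, literal: int) -> str:
    """Return the type a schema-aware parser could assign to an integer literal.

    Hypothetical rule: adopt the column's type when the value fits in it,
    otherwise fall back to Int64 (the type the issue suspects a bare
    integer literal is given today).
    """
    low, high = INT_RANGES[column_type]
    if low <= literal <= high:
        return column_type
    return "Int64"


print(coerce_literal_type("Int32", 100))    # Int32
print(coerce_literal_type("Int32", 2**40))  # Int64
```

Under this rule, price = 100 against an Int32 column would compare Int32 with Int32, and no explicit CAST would be needed.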

How to reproduce it:
This is a minimal reproduction:

import tempfile
import pandas as pd
import pyarrow as pa
from deltalake import DeltaTable
from deltalake.writer import write_deltalake

pandas_data = pd.DataFrame.from_dict(
    {
        "price": [100, 150],
        "qty": [15, 20],
    }
)

schema = pa.schema(
    [
        ("price", pa.int32()),
        ("qty", pa.int32()),
    ]
)

with tempfile.TemporaryDirectory() as path:
    table = pa.Table.from_pandas(pandas_data, schema)
    write_deltalake(
        table_or_uri=path,
        data=table,
        mode="error",
    )

    delta_table = DeltaTable(path)

    # This does not work: it raises the ValueError described above
    delta_table.delete(predicate="price = 100")

    # This works: the literal is cast explicitly to match the column type
    delta_table.delete(predicate="price = CAST(100 as INT)")
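
Until the parser is schema-aware, the explicit cast can be wrapped in a small helper so that callers do not have to hand-write it. This is only a sketch of the workaround shown above: eq_predicate_with_cast is a hypothetical name, not part of deltalake, and the default SQL type INT is taken from the working predicate in the reproduction.

```python
def eq_predicate_with_cast(column: str, value: int, sql_type: str = "INT") -> str:
    """Build an equality predicate whose literal is cast explicitly.

    Hypothetical helper: it only formats a string; it does not validate
    the column name or check the value against the table schema.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(value, int) or isinstance(value, bool):
        raise TypeError("value must be an int")
    return f"{column} = CAST({value} AS {sql_type})"


predicate = eq_predicate_with_cast("price", 100)
print(predicate)  # price = CAST(100 AS INT)

# With the table from the reproduction, the result would be passed as:
# delta_table.delete(predicate=predicate)
```

The helper takes the SQL type as a parameter because the caller still has to know the column's type; it does not remove the need for the schema-aware parsing requested in this issue.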

Labels

  • binding/python (Issues for the Python package)
  • binding/rust (Issues for the Rust crate)
  • enhancement (New feature or request)