non-null sub-field on nullable struct-field has wrong nullity. #8507

@jacksonrnewhouse

Describe the bug

If you have a table schema like

DFSchema {
    fields: [
        DFField {
            qualifier: Some(
                Bare {
                    table: "nexmark",
                },
            ),
            field: Field {
                name: "bid",
                data_type: Struct(
                    [
                        Field {
                            name: "auction",
                            data_type: Int64,
                            nullable: false,
                        },
                nullable: true,
          }

and run a query like SELECT bid.auction FROM nexmark, you'll get an error whenever bid is null. The error looks like:

ArrowError(InvalidArgumentError("Column 'nexmark.bid[datetime]' is declared as non-nullable but contains null values")) 

The problem is that the nullable() method only looks at the sub-field's nullability and ignores that the parent field itself may be null:
https://github.com/apache/arrow-datafusion/blob/main/datafusion/expr/src/expr_schema.rs#L280
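To make the reasoning concrete, here is a minimal, self-contained sketch of the nullability rule in question. It does not use DataFusion or Arrow types: SimpleField and the functions sub_field_only_nullable, propagated_nullable, and nexmark_example are hypothetical names invented for illustration, and the "propagated" rule is only one plausible way to express the expected behavior, not the project's actual fix.

```rust
/// Hypothetical, simplified stand-in for a schema field.
/// This is NOT arrow's `Field`; it only models a name, a nullable flag,
/// and child fields (non-empty only for struct-typed fields).
#[derive(Debug, Clone)]
struct SimpleField {
    name: String,
    nullable: bool,
    children: Vec<SimpleField>,
}

impl SimpleField {
    /// Build a non-struct field.
    fn leaf(name: &str, nullable: bool) -> Self {
        SimpleField { name: name.to_string(), nullable, children: Vec::new() }
    }

    /// Build a struct-typed field with the given children.
    fn structure(name: &str, nullable: bool, children: Vec<SimpleField>) -> Self {
        SimpleField { name: name.to_string(), nullable, children }
    }

    /// Look up a direct child by name.
    fn child(&self, name: &str) -> Option<&SimpleField> {
        self.children.iter().find(|c| c.name == name)
    }
}

/// The behavior described in the issue: only the sub-field's flag is consulted.
fn sub_field_only_nullable(parent: &SimpleField, child: &str) -> Option<bool> {
    parent.child(child).map(|c| c.nullable)
}

/// The expected behavior (assumed rule): accessing a sub-field yields a
/// nullable expression if either the parent or the sub-field is nullable.
fn propagated_nullable(parent: &SimpleField, child: &str) -> Option<bool> {
    parent.child(child).map(|c| parent.nullable || c.nullable)
}

/// Mirrors the schema from the report: nullable struct `bid` with a
/// non-nullable `auction` sub-field. Returns (sub-field-only, propagated).
fn nexmark_example() -> (Option<bool>, Option<bool>) {
    let bid = SimpleField::structure(
        "bid",
        true,
        vec![SimpleField::leaf("auction", false)],
    );
    (
        sub_field_only_nullable(&bid, "auction"),
        propagated_nullable(&bid, "auction"),
    )
}
```

For the schema in this report, the sub-field-only rule says bid.auction is non-nullable, while the propagated rule says it is nullable, which matches the fact that a null bid row has no auction value to return.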

To Reproduce

No response

Expected behavior

The expression should be nullable, because its parent struct may be null.

Additional context

No response

Labels: bug, good first issue