Closed
Labels
bug: Something isn't working
Description
Describe the bug
DataFusion produces Substrait plans whose baseSchema contains only the column names and omits the struct field with the type information:
"baseSchema": {
"names": [
"ps_partkey",
"ps_suppkey",
"ps_availqty",
"ps_supplycost",
"ps_comment"
]
},
To be valid, it should look more like this:
"baseSchema": {
"names": ["PS_PARTKEY", "PS_SUPPKEY", "PS_AVAILQTY", "PS_SUPPLYCOST", "PS_COMMENT"],
"struct": {
"types": [{
"i64": {
"nullability": "NULLABILITY_REQUIRED"
}
}, {
"i64": {
"nullability": "NULLABILITY_REQUIRED"
}
}, {
"i64": {
"nullability": "NULLABILITY_REQUIRED"
}
}, {
"decimal": {
"scale": 2,
"precision": 15,
"nullability": "NULLABILITY_REQUIRED"
}
}, {
"string": {
"nullability": "NULLABILITY_REQUIRED"
}
}],
"nullability": "NULLABILITY_REQUIRED"
}
},
To Reproduce
Generate any Substrait plan that includes a read relation. The plan output will not include the struct field with type information in the baseSchema.
base_schema is a NamedStruct, which carries both the field names and a struct with their types:
https://substrait.io/relations/logical_relations/#__tabbed_1_1
https://substrait.io/types/named_structs/
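As a quick way to spot the problem without the validator, here is a minimal sketch that walks a plan in its JSON form and reports every baseSchema that has no struct.types list. It assumes the plan has already been dumped to JSON with the camelCase keys shown above; the function name find_untyped_base_schemas and the sample plans are hypothetical and not part of DataFusion or Substrait.

```python
import json


def find_untyped_base_schemas(node, path="$"):
    """Return the JSON paths of every baseSchema lacking a non-empty struct.types list."""
    missing = []
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}"
            if key == "baseSchema" and isinstance(value, dict):
                struct = value.get("struct")
                types = struct.get("types") if isinstance(struct, dict) else None
                if not types:
                    missing.append(child_path)
            # Keep descending so nested relations are checked too.
            missing.extend(find_untyped_base_schemas(value, child_path))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            missing.extend(find_untyped_base_schemas(item, f"{path}[{index}]"))
    return missing


# Hypothetical, heavily trimmed plans used only to exercise the check.
bad_plan = json.loads(
    '{"relations": [{"root": {"input": {"read": '
    '{"baseSchema": {"names": ["ps_partkey"]}}}}}]}'
)
good_plan = json.loads(
    '{"relations": [{"root": {"input": {"read": '
    '{"baseSchema": {"names": ["ps_partkey"], "struct": {"types": '
    '[{"i64": {"nullability": "NULLABILITY_REQUIRED"}}], '
    '"nullability": "NULLABILITY_REQUIRED"}}}}}}]}'
)

print(find_untyped_base_schemas(bad_plan))
print(find_untyped_base_schemas(good_plan))
```

This only checks that type information is present. It does not check that the types are correct or that they line up with the names, which is what the substrait-validator is for.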
Expected behavior
No response
Additional context
You can also validate plans by running them through the substrait-validator:
import substrait_validator as sv
import substrait.gen.proto.plan_pb2 as plan_pb2
from datafusion import SessionContext
from datafusion import substrait as ss

# sql_query must be a SQL string that reads from a table registered in ctx
ctx = SessionContext()
substrait_proto = plan_pb2.Plan()
# Serialize the query to a Substrait plan and encode it to bytes
substrait_plan = ss.serde.serialize_to_plan(sql_query, ctx)
substrait_plan_bytes = substrait_plan.encode()
# Run the encoded plan through the validator
config = sv.Config()
sv.check_plan_valid(substrait_plan_bytes, config)