Description
Is your feature request related to a problem or challenge?
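Coercion is the implicit casting of values from one type to another so that the arguments of an operator or function have compatible types (for example, casting an Int32 value to Int64 before comparing it with an Int64 value).
At the moment DataFusion has one set of built-in coercion rules. However, with a single set of coercion rules, we will always end up with conflicting requirements, such as: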
- fix(expr-common): Coerce to Decimal(20, 0) when combining UInt64 with signed integers #14223
- Coercion between floating point types and exact numeric types should result in floating point type as the common super type #14272
- DataFusion Regression (Starting in v43): Type Coercion for UDF Arguments (X --> String) for Specified UDFs #14230 from @shehabgamin
It also makes it hard, as @findepi has pointed out several times, to extend DataFusion with new data types / logical types.
My conclusion is we will never be able to have the behavior that works for everyone, and any change in coercion is likely to make some other tradeoff.
Rather than having to go back/forth in the core, I think an API would give us an escape hatch.
Describe the solution you'd like
In theory, users can already supply their own coercion rules by adding a new AnalyzerRule in place of the current [TypeCoercion] rule:
pub struct TypeCoercion {}
Would it be a crazy idea to implement "user defined coercion rules"? That way DataFusion core can keep whatever coercion rules it has, and when those rules inevitably are not quite what users want, users can override them.
It would also force us to design the type coercion APIs more coherently, which I think would be good for code quality in general.
Describe alternatives you've considered
I was imagining something like:
struct MyCoercer {}

/// Implement custom coercion here
impl TypeCoercion for MyCoercer {
    ...
}

// Register the coercion rules (proposed API) when building the session state
let state = SessionStateBuilder::new()
    .with_type_coercion_rules(Arc::new(MyCoercer {}))
    .build();
let ctx = SessionContext::new_with_state(state);
...

The trait might look like:
/// Defines coercion rules
pub trait TypeCoercion {
    /// Given the types of the arguments to a comparison operation (`=`, `<`, etc.),
    /// return the single type that all arguments should be cast to
    fn coerce_comparison(&self, args: &[DataType]) -> DataType;
    ...
}

Maybe there should be methods for Field instead of DataType 🤔
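To make the idea more concrete, here is a minimal, self-contained sketch of how an overridable coercion trait could behave. Everything in it is an assumption for illustration: `DataType` is a tiny local stand-in for the Arrow type, `CoercionRules`, `DefaultCoercer`, `Session`, and `common_type` are hypothetical names, and the "default" rules only echo the issue titles linked above rather than what DataFusion actually does today. Returning `Option` to signal "no common type" is also just one possible design.

```rust
use std::sync::Arc;

/// Local stand-in for Arrow's `DataType` (illustration only).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    UInt64,
    Float64,
    Decimal128(u8, i8),
    Utf8,
}

/// Hypothetical pluggable coercion rules; `&self` keeps the trait usable as `dyn`.
trait CoercionRules {
    /// Returns the single type all comparison arguments should be cast to,
    /// or `None` if the rules define no common type.
    fn coerce_comparison(&self, args: &[DataType]) -> Option<DataType>;
}

/// Illustrative pairwise rule, not DataFusion's real rule set.
fn common_type(a: &DataType, b: &DataType) -> Option<DataType> {
    use DataType::*;
    match (a, b) {
        _ if a == b => Some(a.clone()),
        // Mixed signed/unsigned 64-bit integers widen to a decimal
        (Int64, UInt64) | (UInt64, Int64) => Some(Decimal128(20, 0)),
        // Float combined with an exact numeric type yields float
        (Float64, Int64 | UInt64 | Decimal128(..))
        | (Int64 | UInt64 | Decimal128(..), Float64) => Some(Float64),
        _ => None,
    }
}

/// Stand-in for the rules shipped in the core.
struct DefaultCoercer;

impl CoercionRules for DefaultCoercer {
    fn coerce_comparison(&self, args: &[DataType]) -> Option<DataType> {
        // Fold the pairwise rule over all arguments; empty input has no type
        let (first, rest) = args.split_first()?;
        rest.iter()
            .try_fold(first.clone(), |acc, t| common_type(&acc, t))
    }
}

/// A user override: mixed Int64/UInt64 compares as Int64, everything
/// else is delegated to the built-in rules.
struct MyCoercer {
    fallback: DefaultCoercer,
}

impl CoercionRules for MyCoercer {
    fn coerce_comparison(&self, args: &[DataType]) -> Option<DataType> {
        use DataType::*;
        let only_ints = args.iter().all(|t| matches!(t, Int64 | UInt64));
        if only_ints && args.contains(&Int64) && args.contains(&UInt64) {
            return Some(Int64);
        }
        self.fallback.coerce_comparison(args)
    }
}

/// Hypothetical session stand-in that holds the registered rules.
struct Session {
    coercion: Arc<dyn CoercionRules>,
}

impl Session {
    fn with_type_coercion_rules(rules: Arc<dyn CoercionRules>) -> Self {
        Self { coercion: rules }
    }

    fn comparison_type(&self, args: &[DataType]) -> Option<DataType> {
        self.coercion.coerce_comparison(args)
    }
}
```

With this shape, `Session::with_type_coercion_rules(Arc::new(MyCoercer { fallback: DefaultCoercer }))` changes only the mixed-integer case, while comparisons such as Int64 vs Float64 still go through the default rules. Delegating to a fallback is one way to let users override a single rule without reimplementing the rest.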
Additional context
No response