Description
Is your feature request related to a problem or challenge?
I think there is room for improvement in type coercion and casting.
Background
comparison_coercion is widely used in DataFusion. It is a lossless conversion.
https://github.com/apache/arrow-datafusion/blob/main/datafusion/expr/src/type_coercion/binary.rs
can_coerce_from is used mainly for function signatures. It is also a lossless conversion.
https://github.com/apache/arrow-datafusion/blob/main/datafusion/expr/src/type_coercion/functions.rs
can_cast_types is from arrow-cast and allows lossy conversions. It is also used in some of comparison_coercion's building blocks. https://github.com/apache/arrow-rs/blob/df69ef57d055453c399fa925ad315d19211d7ab2/arrow-cast/src/cast.rs#L76-L273
I am not sure whether there are other coercion functions that I missed.
Proposal
comparison_coercion and can_coerce_from seem to do similar things, so maybe we can have just one lossless conversion. If a lossless conversion is also useful for arrow-rs, we can introduce a lossless version of can_cast_types there and rely on it in DataFusion.
Lossy conversion vs Lossless
I think a conversion is lossy if the original value cannot be recovered by casting back to the source type. Otherwise it is lossless.
Lossy
- Int32 to Int16 / Int8
Lossless
- Int32 to Int64
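
The round-trip definition above can be sketched with plain Rust integers. This is only an illustration under stated assumptions: it uses Rust's `as` conversion (which truncates on overflow) and the hypothetical helper names `round_trips_via_i16` and `round_trips_via_i64`. It does not call arrow-rs or DataFusion and does not model how Arrow's cast kernels treat overflow.

```rust
/// Hypothetical helper: does `v` survive Int32 -> Int16 -> Int32?
/// `as` truncates out-of-range values, so large inputs do not round-trip.
fn round_trips_via_i16(v: i32) -> bool {
    let narrowed = v as i16; // may drop the high bits
    narrowed as i32 == v
}

/// Hypothetical helper: does `v` survive Int32 -> Int64 -> Int32?
/// Every i32 fits in an i64, so this holds for all inputs.
fn round_trips_via_i64(v: i32) -> bool {
    let widened = v as i64; // never loses information
    widened as i32 == v
}
```

Under this definition, Int32 to Int16 is lossy because at least one value (for example 70000) fails the round trip, while Int32 to Int64 is lossless because no value does.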
Describe the solution you'd like
- Replace can_coerce_from with comparison_coercion's building blocks: numeric coercion, list coercion, string coercion, null coercion, etc.
- Split list_coercion from string_coercion so that each coercion building block is clear about the task it focuses on: list_coercion handles List / FixedSizeList / LargeList coercion, and string_coercion handles Utf8 / LargeUtf8 coercion.
- Introduce these lossless coercions to arrow-rs?
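
To make the proposed split concrete, here is a minimal sketch of separate building blocks composed into one lossless coercion. Everything in it is an assumption for illustration: `Ty` is a toy stand-in for Arrow's `DataType`, and the function bodies are not DataFusion's actual `numeric_coercion`, `string_coercion`, or `list_coercion`. Fixed size lists and null coercion are left out to keep it short.

```rust
// Toy type model (hypothetical, not arrow's DataType).
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Int8,
    Int16,
    Int32,
    Int64,
    Utf8,
    LargeUtf8,
    List(Box<Ty>),
    LargeList(Box<Ty>),
}

// Bit width of a signed integer type, or None for non-integers.
fn int_width(t: &Ty) -> Option<u8> {
    match t {
        Ty::Int8 => Some(8),
        Ty::Int16 => Some(16),
        Ty::Int32 => Some(32),
        Ty::Int64 => Some(64),
        _ => None,
    }
}

// Numeric block: pick the wider of two signed integer types.
fn numeric_coercion(l: &Ty, r: &Ty) -> Option<Ty> {
    let (lw, rw) = (int_width(l)?, int_width(r)?);
    Some(if lw >= rw { l.clone() } else { r.clone() })
}

// String block: only Utf8 / LargeUtf8, nothing about lists.
fn string_coercion(l: &Ty, r: &Ty) -> Option<Ty> {
    match (l, r) {
        (Ty::Utf8, Ty::Utf8) => Some(Ty::Utf8),
        (Ty::Utf8 | Ty::LargeUtf8, Ty::Utf8 | Ty::LargeUtf8) => Some(Ty::LargeUtf8),
        _ => None,
    }
}

// List block: only List / LargeList, recursing into the element type.
fn list_coercion(l: &Ty, r: &Ty) -> Option<Ty> {
    match (l, r) {
        (Ty::List(a), Ty::List(b)) => Some(Ty::List(Box::new(lossless_coercion(a, b)?))),
        (Ty::List(a) | Ty::LargeList(a), Ty::List(b) | Ty::LargeList(b)) => {
            Some(Ty::LargeList(Box::new(lossless_coercion(a, b)?)))
        }
        _ => None,
    }
}

// Single entry point that tries each building block in turn.
fn lossless_coercion(l: &Ty, r: &Ty) -> Option<Ty> {
    numeric_coercion(l, r)
        .or_else(|| string_coercion(l, r))
        .or_else(|| list_coercion(l, r))
}
```

With this shape, a caller such as signature checking would only need the single `lossless_coercion` entry point, and each block could be tested or reused on its own.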
Known issues or questions I have
- Introduce list_coercion, whose logic currently exists inside string_concat_coercion
- There is no list coercion in can_coerce_from
- Decimal128 can be cast to Float64 in can_coerce_from. Why?
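
As a rough illustration of why the Decimal128 to Float64 case looks lossy under the round-trip definition: a Decimal128 value is backed by a 128-bit integer, while an `f64` has a 53-bit significand, so not every integer survives the trip. The sketch below uses plain Rust `i128` and `f64` and a hypothetical helper name. It ignores decimal scale and does not call arrow-rs.

```rust
/// Hypothetical helper: does an i128 (standing in for a Decimal128's
/// unscaled value) survive i128 -> f64 -> i128?
fn round_trips_via_f64(v: i128) -> bool {
    let as_float = v as f64; // rounds to the nearest representable f64
    as_float as i128 == v
}
```

For example, 2^53 round-trips but 2^53 + 1 does not, so this conversion would not qualify as lossless by the definition used in this issue.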
Describe alternatives you've considered
If many customized conversions are needed, then this change might not help at all, and we would need another approach to make type casting and coercion easy to use.
Additional context
No response