- 
                Notifications
    You must be signed in to change notification settings 
- Fork 1.7k
          Implement ScalarUDF in terms of ScalarUDFImpl trait
          #8713
        
          New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ScalarUDF in terms of ScalarUDFImpl trait
      | ); | ||
|  | ||
| // UDF | ||
| #[derive(Debug)] | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an example of the API change -- any new impl of ScalarUDFImpl must also derive Debug -- note that ScalarUDFImpl was introduced in #8578 and not yet released so this is not a breaking change for released versions
| /// | ||
| /// 2. For advanced use cases, use [`ScalarUDFImpl`] and [`advanced_udf.rs`]. | ||
| /// | ||
| /// # API Note | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I went through this PR, ScalarUDF is now basically a pass through wrapper to ScalarUDFImpl -- if we didn't want to maintain backwards compatibility we could probably simply remove the ScalarUDF struct and make it a trait, but I think that would be super disruptive to all exisiting users of DataFusion so I think we should avoid doing so unless absolutely necessary.
3abbd23    to
    1334667      
    Compare
  
    | } | ||
| } | ||
|  | ||
| /// ScalarUDF that adds an alias to the underlying function. It is better to | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is somewhat boilerplate, but it is a pretty straightforward example of using Trait objects to extend functionality
| /// [`advanced_udf.rs`]: https://github.com/apache/arrow-datafusion/blob/main/datafusion-examples/examples/advanced_udf.rs | ||
| #[derive(Clone)] | ||
| #[derive(Debug, Clone)] | ||
| pub struct ScalarUDF { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API for ScalarUDF is not changed at all -- only its internal implementation
| F: ScalarUDFImpl + Send + Sync + 'static, | ||
| F: ScalarUDFImpl + 'static, | ||
| { | ||
| // TODO change the internal implementation to use the trait object | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅
| /// | ||
| /// If you implement [`ScalarUDFImpl`] directly you should return aliases directly. | ||
| pub fn with_aliases(self, aliases: impl IntoIterator<Item = &'static str>) -> Self { | ||
| Self::new_from_impl(AliasedScalarUDFImpl::new(self, aliases)) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is somewhat of a hack (to have a wrapper around the scalar UDF). It may make more sense to simply remove the call to with_aliases -- however, since it was released in datafusion 34.0.0 -- https://docs.rs/datafusion/34.0.0/datafusion/physical_plan/udf/struct.ScalarUDF.html -- that would be a breaking API change.
We could deprecate the API 🤔
        
          
                datafusion/expr/src/udf.rs
              
                Outdated
          
        
      | } | ||
| } | ||
|  | ||
| /// Implementation of [`ScalarUDFImpl`] that wraps the function style pointers of the older API | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is basically the old implementaiton of ScalarUDF moved into its own struct
| Ok(Arc::new(ScalarFunctionExpr::new( | ||
| fun.name(), | ||
| fun.fun().clone(), | ||
| fun.fun(), | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a drive by cleanup I noticed while working on the code
1334667    to
    41ec08a      
    Compare
  
    | Thank you for the review @crepererum | 
Which issue does this PR close?
Closes #8712
Rationale for this change
ScalarUDFis a struct of function pointers for historical reasons. After #8578,ScalarUDFImplis available, we can clean up its internal implementation to be in terms of that trait directly, simplifying the implementation greatly.What changes are included in this PR?
ScalarUDFto have a singleArc<dyn ScalarUDFImplfield+DebugtoScalarUDFImplwhich forces implementations to also derive Debug which I think is much better than the opaqueFUNCthat was there previouslyAre these changes tested?
yes
Are there any user-facing changes?
New requirement to derive
DebugforScalarUDFImplbut I think that is good hygene anyways