-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Description
Is your feature request related to a problem or challenge?
Motivation
There is ongoing work migrating BuitlinScalarFunctions -> ScalarUDF #8045, and we noticed one Expr related issue during prototyping:
We can use Expr API to create builtin scalar functions directly:
let expr1 = abs(lit(-1));
let expr2 = call_fn("abs", vec![lit(-1)]);We want to still use this API after functions are migrated to ScalarUDF based implementation, it's not possible now because those Expr creations are stateless, and ScalarUDFs are registered inside SessionState. It's only possible if we do
let ctx = create_context()?;
ctx.register_udf(abs_impl);
let expr2 = ctx.call_fn("abs", vec![lit(-1)]);Describe the solution you'd like
To continue supporting the original stateless Expr creation API, we can create Expr with only the string function name, and resolve the function during logical optimization.
Then call_fn("abs", vec![lit(-1)]) can be supported (now it only support BuitlinScalarFunction and don't support UDFs)
Another macro-based API abs(lit(-1)) can be supported if we hard code all possible function names within the core (should we do that?)
Potential implementation is:
- After
let expr2 = call_fn("abs", vec![lit(-1)]);, create aScalarUDFexpression with dummy implementation. - Add an
AnalyzerRule(a mandatory logical optimizer rule) to resolve this UDF name using external functions registered inSessionState
Issue:
Now function implementation is inside SessionState but outside SessionConfig, and the analyzer can only access SessionConfig.
We have to move the function registry into SessionConfig first (or is there any better way?)
pub struct SessionState {
// ...
/// Scalar functions that are registered with the context
scalar_functions: HashMap<String, Arc<ScalarUDF>>,
/// Session configuration
config: SessionConfig,// ...
// `self.options()` is `ConfigOption`
let analyzed_plan =
self.analyzer
.execute_and_check(plan, self.options(), |_, _| {})?;cc @alamb I wonder if you have any thoughts on this approach 🤔
Describe alternatives you've considered
No response
Additional context
No response