-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-24361][SQL] Polish code block manipulation API #21405
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…for constructing code block.
…ay not aware of necessary block inputs.
|
Test build #91014 has finished for PR 21405 at commit
|
|
retest this please. |
|
Test build #91021 has finished for PR 21405 at commit
|
|
retest this please. |
|
|
||
| // We want to replace all occurrences of `expr` with the variable `aliasedParam`. | ||
| val aliasedCode = code.transformExprValues { | ||
| case SimpleExprValue("1 + 1", _) => aliasedParam |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I know the current code works correctly. How about replacing _ with CodeGenerator.javaClass(IntegerType)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
|
@viirya sorry do you already have some use-cases, goals in mind for this API? May you please explain me them? Thanks. |
|
@mgaido91 Yes, as @rednaxelafx said in previous RP, #19813 will be the first use case. I'd like to use this API to transform statement-based expressions in code blocks. For example, val collectedStatements = HashMap[ExprValue, ExprValue]
val transformed = code.transformExprValues {
case s @ SimpleExprValue(_, javaType) =>
if (collectedStatements.contains(s)) {
collectedStatements(s)
} else {
val aliasedVariable = JavaCode.variable(ctx.freshName("var"), javaType)
collectedStatements += s -> aliasedVariable
aliasedVariable
}
}
val createVariables = collectedStatements.foldLeft(EmptyBlock) { (block, (statement, variable)) =>
block + code"${statement.javaType.getName} $variable = $statement;"
}
ev.copy(code = createVariables + transformed) |
|
Test build #91031 has finished for PR 21405 at commit
|
|
can we give a concrete use case in the PR description? I think splitting code into methods is a good one, but we need to make it completed, e.g. how the method is generated. |
|
@cloud-fan Thanks. I give a use case of splitting code into method in the PR description. I think it can show the basic idea and it is what I hope to make. |
| * Apply a map function to each java expression codes present in this java code, and return a new | ||
| * java code based on the mapped java expression codes. | ||
| */ | ||
| def transformExprValues(f: PartialFunction[ExprValue, ExprValue]): this.type = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
am I wrong or we are not updating the exprValues here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
transformExprValues will create a new instance Block because a block is immutable. So once there are change, new block should take new exprValues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, can we add some tests for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Will add them in next commit. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added test for it.
|
@cloud-fan I found that the way to collect method parameters in the use case can be simplified. Updated in the PR description. Please take a look when you have time. Thanks. |
|
@cloud-fan @hvanhovell Do you have any comment/suggestion on this change? Thanks. |
|
Test build #91251 has finished for PR 21405 at commit
|
|
ping @cloud-fan @hvanhovell @kiszk |
| * Trait representing an opaque fragments of java code. | ||
| */ | ||
| trait JavaCode { | ||
| trait JavaCode extends TreeNode[JavaCode] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we only make Block extends TreeNode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Currently ExprValue doesn't have to be TreeNode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update in next commit.
| override def hashCode(): Int = value.hashCode() * 31 + javaType.hashCode() | ||
| } | ||
|
|
||
| case class LiteralExpr(override val value: String, override val javaType: Class[_]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we still need this change?
|
|
||
| // The expressions to be evaluated inside this block. | ||
| // All expressions to be evaluated inside this block and underlying blocks. | ||
| def exprValues: Set[ExprValue] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we still need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove it, as discussed in other PR previously.
|
|
||
| test ("transform expr in nested blocks") { | ||
| val expr = JavaCode.expression("1 + 1", IntegerType) | ||
| val isNull = JavaCode.isNullVariable("expr1_isNull") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not related to this PR, shall we remove isNullVariable and always use expression(..., BooleanType)? Or we can add a bunch of intExpression(...), booleanExpression(...) etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is fine and maybe more clear. I was thinking the name of isNullVariable is a bit confusing at the first look actually.
|
|
||
| val block = code"${subBlocks(0)}\n${subBlocks(1)}\n${subBlocks(2)}" | ||
| val transformedBlock = block.transform { | ||
| case b: Block => b.transformExprValues { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm, I'd image transformExprValues will also transform child blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the this should be
block.transformExprValues {
case SimpleExprValue ...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I may call it transformAllExprValues.
|
Test build #92637 has finished for PR 21405 at commit
|
|
LGTM, can you update the demo in your PR description? thanks! |
|
Test build #92640 has finished for PR 21405 at commit
|
|
Test build #92642 has finished for PR 21405 at commit
|
|
@cloud-fan Updated. Do you think I should address the comment #21405 (comment) to let |
|
we can do it later, when we do need |
|
thanks, merging to master! Looking forward to seeing how we can solve the method splitting problem with this infra :) |
What changes were proposed in this pull request?
Current code block manipulation API is immature and hacky. We need a formal API to manipulate code blocks.
The basic idea is making
JavaCodeasTreeNode. So we can use familiartransformAPI to manipulate code blocks and expressions in code blocks.For example, we can replace
SimpleExprValuein a code block like this:code.transformExprValues { case SimpleExprValue("1 + 1", _) => aliasedParam }The example use case is splitting code to methods.
For example, we have an
ExprCodecontaining generated code. But it is too long and we need to split it as method. Because statement-based expressions can't be directly passed into. We need to transform them as variables first:How was this patch tested?
Added unite tests.