Skip to content

Conversation

@hvanhovell
Copy link
Contributor

What changes were proposed in this pull request?

This PR slightly refactors the newly added ExprValue API by quite a bit. The following changes are introduced:

  1. ExprValue now uses the actual class instead of the class name as its type. This should give some more flexibility with generating code in the future.
  2. Renamed StatementValue to SimpleExprValue. The statement concept is broader then an expression (untyped and it cannot be on the right hand side of an assignment), and this was not really what we were using it for. I have added a top level JavaCode trait that can be used in the future to reinstate (no pun intended) a statement a-like code fragment.
  3. Added factory methods to the JavaCode companion object to make it slightly less verbose to create JavaCode/ExprValue objects. This is also what makes the diff quite large.
  4. Added one more factory method to ExprCode to make it easier to create code-less expressions.

How was this patch tested?

Existing tests.

@hvanhovell
Copy link
Contributor Author

cc @viirya @cloud-fan @rednaxelafx

@SparkQA
Copy link

SparkQA commented Apr 10, 2018

Test build #89128 has finished for PR 21026 at commit 821e08a.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

}

if (input.isNull == "false") {
if (input.isNull == FalseLiteral) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes an existing bug :)

# Conflicts:
#	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala
@SparkQA
Copy link

SparkQA commented Apr 10, 2018

Test build #89129 has finished for PR 21026 at commit 0b194ca.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 10, 2018

Test build #89134 has finished for PR 21026 at commit 45d3ed8.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 10, 2018

Test build #89136 has finished for PR 21026 at commit 2488007.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • |class SpecificUnsafeProjection extends $

@SparkQA
Copy link

SparkQA commented Apr 10, 2018

Test build #89143 has finished for PR 21026 at commit b2c1601.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

case _ => s"$value = $initCode;"
}
ExprCode(code, FalseLiteral, GlobalValue(value, javaType(dataType)))
ExprCode.forNonNullValue(JavaCode.global(value, dataType))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops

Copy link
Contributor

@rednaxelafx rednaxelafx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good. I left some comments and let's see if we should change them.

// `rowWriter` is declared as a class field, so we can access it directly in methods.
ExprCode(code, FalseLiteral, StatementValue(s"$rowWriter.getRow()", "UnsafeRow",
canDirectAccess = true))
ExprCode(code, FalseLiteral, SimpleExprValue(s"$rowWriter.getRow()", classOf[UnsafeRow]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a JavaCode.expression(expr: String, javaClass: Class[_]) and use it here?

"""

ExprCode(code, FalseLiteral, VariableValue(output, "MapData"))
ExprCode(code, FalseLiteral, VariableValue(output, classOf[MapData]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use JavaCode.variable(name, javaClass) here?

"""

ExprCode(code, FalseLiteral, VariableValue(output, "ArrayData"))
ExprCode(code, FalseLiteral, VariableValue(output, classOf[ArrayData]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use JavaCode.variable(name, javaClass) here?

""".stripMargin

ExprCode(code, FalseLiteral, VariableValue(output, "InternalRow"))
ExprCode(code, FalseLiteral, VariableValue(output, classOf[InternalRow]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use JavaCode.variable(name, javaClass) here?

*/
trait ExprValue extends JavaCode {
def javaType: Class[_]
def isPrimitive: Boolean = javaType.isPrimitive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR got rid of the canDirectAccess property from @viirya 's base change. Should we add it back?
(and implement that property in concrete classes with an override def instead of an override val)

Copy link
Contributor Author

@hvanhovell hvanhovell Apr 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@viirya where are you planning to use this? I removed it because it did not seem to have any benefit (yet).

Copy link
Contributor

@cloud-fan cloud-fan Apr 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just a few lines in this file, easy to add it back if needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hvanhovell it was introduced for the whole stage codegen when it is needed to know whether a variable can be directly accessed in the scope of a new method or if it should be passed as an argument. I think we should keep it/replace with a def

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it is not being used at the moment right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not yet,as far as I know @viirya was waiting for this method to be available to go on with his work: #19813 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for replying late. Yeah, my plan is to use canDirectAccess to quickly determine whether an ExprValue can be used a method parameter. I will add it back if I need in my work later.

val value = eval.isNull match {
case TrueLiteral => FalseLiteral
case FalseLiteral => TrueLiteral
case v => JavaCode.isNullExpression(s"!($v)")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove the parens here. SimpleExprValue guarantees this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@SparkQA
Copy link

SparkQA commented Apr 11, 2018

Test build #89157 has finished for PR 21026 at commit 8ab0931.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 11, 2018

Test build #89159 has finished for PR 21026 at commit ce39905.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 11, 2018

Test build #89160 has finished for PR 21026 at commit e6abd22.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

LGTM, merging to master!

@asfgit asfgit closed this in c604d65 Apr 11, 2018
/**
* Create a local isNull variable.
*/
def isNullVariable(name: String): VariableValue = variable(name, BooleanType)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually this is a booleanVariable, rather than a isNullVariable, isn't it?

Copy link
Contributor Author

@hvanhovell hvanhovell Apr 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used to create is null variables. That it currently creates a boolean variable is actually not that important. I just want to create a narrow waist for this specific case, so we can swap it out for a more sophisticated implementation later on (something that advertises that it a null check for example).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I see the goal of this, but I can't understand why it is useful. The isNull variable is a boolean as all the other boolean variables... It is the more frequent case, but I can't see any differentiation between them and other booleans. Do you have in mind a case for which this differentiation can be useful?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to change it.

Copy link
Contributor

@cloud-fan cloud-fan Apr 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a big deal but I'm ok if we change isNullVariable(xxx) to variable(xxx, BooleanType).

*/
trait ExprValue extends JavaCode {
def javaType: Class[_]
def isPrimitive: Boolean = javaType.isPrimitive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hvanhovell it was introduced for the whole stage codegen when it is needed to know whether a variable can be directly accessed in the scope of a new method or if it should be passed as an argument. I think we should keep it/replace with a def

/**
* Create a global isNull variable.
*/
def isNullGlobal(name: String): GlobalValue = global(name, BooleanType)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as well, I think this should be a booleanGlobal...

/**
* Create a isNull expression fragment.
*/
def isNullExpression(code: String): SimpleExprValue = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

/**
* Utility functions for creating [[JavaCode]] fragments.
*/
object JavaCode {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

despite I really like the introduction of the JavaCode trait, all these methods seems more related to ExprValue (especially the expression one, which represents a SimpleExprValue is very specific to it, and not to JavaCode which can represent also complex expressions instead): shall we move these methods to the ExprValue object?

Copy link
Contributor Author

@hvanhovell hvanhovell Apr 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to create a narrow waist for the creation JavaCode/ExprValue objects so we can swap out stuff later on. Feel free to move these methods (and remove the JavaCode trait) if you think that is more appropriate.

* -1 for other primitive types.
*/
def defaultLiteral(dataType: DataType): LiteralValue = {
new LiteralValue(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just call literal(CodeGenerator.defaultValue(dataType, typedNull = true), dataType)? In current way, defaultLiteral(BooleanType) won't return a FalseLiteral.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants