Skip to content

Conversation

@kiszk
Copy link
Member

@kiszk kiszk commented Dec 13, 2017

What changes were proposed in this pull request?

This PR adds check whether Java code generated by Catalyst can be compiled by janino correctly or not into TPCDSQuerySuite. Before this PR, this suite only checks whether analysis can be performed correctly or not.

This check will be able to avoid unexpected performance degrade by interpreter execution due to a Java compilation error.

How was this patch tested?

Existing a test case, but updated it.

@mgaido91
Copy link
Contributor

wrong JIRA number, may you update it please? thanks.

@kiszk kiszk changed the title [SPARK-22773][SQL][Test] Add compilation check into TPCDSQuerySuite [SPARK-22774][SQL][Test] Add compilation check into TPCDSQuerySuite Dec 13, 2017
@kiszk
Copy link
Member Author

kiszk commented Dec 13, 2017

Thanks, done

@SparkQA
Copy link

SparkQA commented Dec 13, 2017

Test build #84872 has finished for PR 19971 at commit 19a0d2d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

: )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we essentially need the changes in this file? I feel it's okay to touch TPCDSQuerySuite.scala only like: master...maropu:SPARK-22140-FOLLOWUP

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, we can do. I wanted to reduce duplicated code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably, the debug package is not for test uses like this.

@maropu
Copy link
Member

maropu commented Dec 13, 2017

retest this please

@JoshRosen
Copy link
Contributor

Could we address this more broadly by ensuring that whole stage codegen fallback is disabled in tests?

@JoshRosen
Copy link
Contributor

Wait, nevermind: I see that this isn't actually running the queries, so setting spark.sql.codegen.fallback=false wouldn't be sufficient.

Separate from this PR, we might want to set spark.sql.codegen.fallback=false in more tests, though: it looks like it's on for Hive tests but might be disabled in some SQL suites.

@cloud-fan
Copy link
Contributor

We don't need to as we never fallback in test

// try to compile and fallback if it failed
    val (_, maxCodeSize) = try {
      CodeGenerator.compile(cleanedSource)
    } catch {
      case _: Exception if !Utils.isTesting && sqlContext.conf.codegenFallback =>
        // We should already saw the error message
        logWarning(s"Whole-stage codegen disabled for this plan:\n $treeString")
        return child.execute()
    }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it helpful to show up the subtree which causes the compilation error?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@SparkQA
Copy link

SparkQA commented Dec 14, 2017

Test build #84886 has finished for PR 19971 at commit 19a0d2d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

codegenSubtrees += s
case s => s
}
codegenSubtrees.toSeq.map { subtree =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: map -> foreach

val msg =
s"Subtree:\n$subtree\n" +
s"Generated code:\n${CodeFormatter.format(code)}\n"
logDebug(msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's more useful to include the error message in the exception, e.g. throw new Exception(msg, e)

@SparkQA
Copy link

SparkQA commented Dec 14, 2017

Test build #84894 has finished for PR 19971 at commit 1480ae3.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

CodeGenerator.compile(code)
} catch {
case e: Exception =>
logError(s"failed to compile: $e", e)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need this log? I feel a bit meaningless.

Copy link
Member Author

@kiszk kiszk Dec 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no strong preference. I followed the following code in CodeGenerator.scala

    val maxCodeSize = try {
      evaluator.cook("generated.java", code.body)
      updateAndGetCompilationStats(evaluator)
    } catch {
      case e: InternalCompilerException =>
        val msg = s"failed to compile: $e"
        logError(msg, e)
        val maxLines = SQLConf.get.loggingMaxLinesForCodegen
        logInfo(s"\n${CodeFormatter.format(code, maxLines)}")
        throw new InternalCompilerException(msg, e)
      case e: CompileException =>
        val msg = s"failed to compile: $e"
        logError(msg, e)
        val maxLines = SQLConf.get.loggingMaxLinesForCodegen
        logInfo(s"\n${CodeFormatter.format(code, maxLines)}")
        throw new CompileException(msg, e.getLocation)
    }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ya, I know though, the logging is there because compilation errors lead to the fallbacks and then the exception messages are not passed to users. But, in this test case, we explicitly throw an exception below with the message. So, I said a bit meaningless.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodeGenerator is production code, we usually don't log in tests

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can combine it with the exception error message

val msg =
  s"""
    |failed to compile:
    |Subtree: ...
   """

@SparkQA
Copy link

SparkQA commented Dec 14, 2017

Test build #84906 has finished for PR 19971 at commit 7d2c599.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

* without hitting the max iteration threshold.
*/
class TPCDSQuerySuite extends QueryTest with SharedSQLContext with BeforeAndAfterAll {
class TPCDSQuerySuite extends QueryTest with SharedSQLContext with BeforeAndAfterAll with Logging {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: drop Logging and the import

|$subtree
|Generated code:
|${CodeFormatter.format(code)}
"""
Copy link
Contributor

@cloud-fan cloud-fan Dec 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stripMargin

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this should be generated by IDE automatically...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, while I am using IntelliJ...

@cloud-fan
Copy link
Contributor

LGTM

@SparkQA
Copy link

SparkQA commented Dec 14, 2017

Test build #84913 has finished for PR 19971 at commit dacb790.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 14, 2017

Test build #84917 has finished for PR 19971 at commit 44abd2f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class TPCDSQuerySuite extends QueryTest with SharedSQLContext with BeforeAndAfterAll

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@asfgit asfgit closed this in 606ae49 Dec 14, 2017
@gatorsmile
Copy link
Member

gatorsmile commented Dec 14, 2017

For safety, I also add a TPC-H query suite to do the same checks like TPC-DS, #19982

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants