-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-22774][SQL][Test] Add compilation check into TPCDSQuerySuite #19971
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
wrong JIRA number, may you update it please? thanks. |
|
Thanks, done |
|
Test build #84872 has finished for PR 19971 at commit
|
|
: ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we essentially need the changes in this file? I feel it's okay to touch TPCDSQuerySuite.scala only like: master...maropu:SPARK-22140-FOLLOWUP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, we can do. I wanted to reduce duplicated code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably, the debug package is not for test uses like this.
|
retest this please |
|
Could we address this more broadly by ensuring that whole stage codegen fallback is disabled in tests? |
|
Wait, nevermind: I see that this isn't actually running the queries, so setting Separate from this PR, we might want to set |
|
We don't need to as we never fallback in test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it helpful to show up the subtree which causes the compilation error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
|
Test build #84886 has finished for PR 19971 at commit
|
| codegenSubtrees += s | ||
| case s => s | ||
| } | ||
| codegenSubtrees.toSeq.map { subtree => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: map -> foreach
| val msg = | ||
| s"Subtree:\n$subtree\n" + | ||
| s"Generated code:\n${CodeFormatter.format(code)}\n" | ||
| logDebug(msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's more useful to include the error message in the exception, e.g. throw new Exception(msg, e)
|
Test build #84894 has finished for PR 19971 at commit
|
| CodeGenerator.compile(code) | ||
| } catch { | ||
| case e: Exception => | ||
| logError(s"failed to compile: $e", e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need this log? I feel a bit meaningless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no strong preference. I followed the following code in CodeGenerator.scala
val maxCodeSize = try {
evaluator.cook("generated.java", code.body)
updateAndGetCompilationStats(evaluator)
} catch {
case e: InternalCompilerException =>
val msg = s"failed to compile: $e"
logError(msg, e)
val maxLines = SQLConf.get.loggingMaxLinesForCodegen
logInfo(s"\n${CodeFormatter.format(code, maxLines)}")
throw new InternalCompilerException(msg, e)
case e: CompileException =>
val msg = s"failed to compile: $e"
logError(msg, e)
val maxLines = SQLConf.get.loggingMaxLinesForCodegen
logInfo(s"\n${CodeFormatter.format(code, maxLines)}")
throw new CompileException(msg, e.getLocation)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ya, I know though, the logging is there because compilation errors lead to the fallbacks and then the exception messages are not passed to users. But, in this test case, we explicitly throw an exception below with the message. So, I said a bit meaningless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CodeGenerator is production code, we usually don't log in tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can combine it with the exception error message
val msg =
s"""
|failed to compile:
|Subtree: ...
"""
|
Test build #84906 has finished for PR 19971 at commit
|
| * without hitting the max iteration threshold. | ||
| */ | ||
| class TPCDSQuerySuite extends QueryTest with SharedSQLContext with BeforeAndAfterAll { | ||
| class TPCDSQuerySuite extends QueryTest with SharedSQLContext with BeforeAndAfterAll with Logging { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: drop Logging and the import
| |$subtree | ||
| |Generated code: | ||
| |${CodeFormatter.format(code)} | ||
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stripMargin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually this should be generated by IDE automatically...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, while I am using IntelliJ...
|
LGTM |
|
Test build #84913 has finished for PR 19971 at commit
|
|
Test build #84917 has finished for PR 19971 at commit
|
|
thanks, merging to master! |
|
For safety, I also add a TPC-H query suite to do the same checks like TPC-DS, #19982 |
What changes were proposed in this pull request?
This PR adds check whether Java code generated by Catalyst can be compiled by
janinocorrectly or not intoTPCDSQuerySuite. Before this PR, this suite only checks whether analysis can be performed correctly or not.This check will be able to avoid unexpected performance degrade by interpreter execution due to a Java compilation error.
How was this patch tested?
Existing a test case, but updated it.