address review comments #8
Conversation
make a method for it, so that we can reuse it in SparkSqlAstBuilder
visitReplaceTableHeader simply returns false for all 3 of these properties, so it's simpler to use an assert here.
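For illustration, a minimal Scala sketch of the pattern being described; the names and the header shape are simplified, not Spark's actual signatures:

```scala
object ReplaceHeaderSketch extends App {
  // Simplified stand-in for the parser's table header result.
  case class TableHeader(
      name: Seq[String],
      temporary: Boolean,
      ifNotExists: Boolean,
      external: Boolean)

  // The REPLACE TABLE grammar cannot set any of the three flags, so this
  // always returns false for all of them.
  def visitReplaceTableHeader(name: Seq[String]): TableHeader =
    TableHeader(name, temporary = false, ifNotExists = false, external = false)

  val header = visitReplaceTableHeader(Seq("db", "tbl"))
  // Hence the caller can assert instead of branching on each flag:
  assert(!header.temporary && !header.ifNotExists && !header.external)
}
```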
I think it is bad practice for a method to assume it will get certain results from other methods. While using an assert handles correctness, if there is a change that violates the assertion, a user would get a nearly unusable error.
I think this change should be reverted so that the error messages are meaningful.
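To make the trade-off concrete, a hedged sketch (the error type and message are illustrative) of what each approach surfaces to a user:

```scala
object AssertVsError {
  // An assert failure surfaces as a bare AssertionError ("assertion failed"),
  // which gives a SQL user nothing to act on.
  def checkWithAssert(temporary: Boolean): Unit =
    assert(!temporary)

  // An explicit check keeps the message meaningful.
  def checkWithError(temporary: Boolean): Unit =
    if (temporary) {
      throw new UnsupportedOperationException(
        "Operation not allowed: REPLACE TEMPORARY TABLE")
    }
}
```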
Previously, when the table column list was not specified, we ignored partition columns that carry a data type. That was fine before the syntax merging, as REPLACE TABLE had no partition columns with data types. But now it's better to make it consistent with CREATE TABLE. I also added a test to check it: https://github.com/rdblue/spark/pull/8/files#diff-b9e91f767e5562861565b0ce78759af3bcb7fff405a81e928894641147db2ae4R293
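For context, the statements in question have roughly this shape (illustrative only; see the linked test for the exact cases):

```scala
// A Hive-style partition column carries a data type, and no column list is
// given. After this change, CREATE TABLE and REPLACE TABLE should treat the
// typed partition column consistently.
val createCtas  = "CREATE TABLE t PARTITIONED BY (p STRING) AS SELECT 1 AS a"
val replaceCtas = "REPLACE TABLE t PARTITIONED BY (p STRING) AS SELECT 1 AS a"
```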
it's a bit weird to print a Scala Option directly.
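A quick illustration of why:

```scala
val storedAs: Option[String] = Some("parquet")
println(s"STORED AS $storedAs")        // prints: STORED AS Some(parquet)
println(s"STORED AS ${storedAs.get}")  // prints: STORED AS parquet
```

Printing the Option leaks Scala syntax ("Some(...)") into a user-facing message.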
otherwise the assert below is useless.
I did some refactoring to avoid duplicated code between CREATE TABLE and CTAS, and between buildCatalogTable and buildHiveCatalogTable.
Looks fine to me. I like that it should have fewer changes from master.
This method closely follows the original logic for creating a Hive table in SparkSqlAstBuilder.
The logic here roughly merges 3 serde infos: the one the user specified, the one inferred from STORED AS, and the default one.
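A minimal sketch of that merge order, assuming a simplified field set (Spark's real CatalogStorageFormat carries more fields): user-specified values win, then the format implied by STORED AS, then the session default.

```scala
case class StorageFormat(
    serde: Option[String],
    inputFormat: Option[String],
    outputFormat: Option[String])

def merge(
    userSpecified: StorageFormat,
    fromStoredAs: StorageFormat,
    default: StorageFormat): StorageFormat =
  StorageFormat(
    serde = userSpecified.serde
      .orElse(fromStoredAs.serde)
      .orElse(default.serde),
    inputFormat = userSpecified.inputFormat
      .orElse(fromStoredAs.inputFormat)
      .orElse(default.inputFormat),
    outputFormat = userSpecified.outputFormat
      .orElse(fromStoredAs.outputFormat)
      .orElse(default.outputFormat))
```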
Context:

    private def toStorageFormat(
We can get rid of the duplicated code here once we move CREATE TABLE LIKE and INSERT DIRECTORY to v2 commands.
| operationNotAllowed("INSERT OVERWRITE DIRECTORY must be accompanied by path", ctx) | ||
| } | ||
|
|
||
| val default = HiveSerDe.getDefaultStorage(conf) |
I don't know why INSERT DIRECTORY considers the default serde info but CREATE TABLE LIKE does not. I'm going to fix it later and keep the behavior unchanged here.
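To illustrate the inconsistency (types simplified; the default values shown are the usual Hive text-format defaults, included here only as an assumption):

```scala
case class Storage(serde: Option[String], inputFormat: Option[String])

val default = Storage(
  serde = Some("org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"),
  inputFormat = Some("org.apache.hadoop.mapred.TextInputFormat"))
val userSpecified = Storage(serde = None, inputFormat = None)

// INSERT OVERWRITE DIRECTORY path: unset fields fall back to the default.
val forInsertDir = Storage(
  userSpecified.serde.orElse(default.serde),
  userSpecified.inputFormat.orElse(default.inputFormat))

// CREATE TABLE LIKE path: the default is never consulted.
val forCreateLike = userSpecified
```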
@cloud-fan, it mostly looks good but I did find a few things to change and at least one bug from using _._1 instead of _._2. Let me know when you've had time to update this.
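For readers skimming the thread, this is the class of bug in question (data illustrative): picking the wrong tuple element still compiles when both elements have the same type.

```scala
val props: Seq[(String, String)] = Seq("serde" -> "LazySimpleSerDe")
props.map(_._1)  // Seq("serde")           -- the keys (the bug)
props.map(_._2)  // Seq("LazySimpleSerDe") -- the values (the intent)
```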
Context:

    serde = serdeInfo.serde.orElse(hiveSerde.serde),
    properties = serdeInfo.serdeProperties)
    case _ =>
      operationNotAllowed(s"STORED AS with file format '${serdeInfo.storedAs.get}'", ctx)
This is okay, but my intent was to avoid mixing parsing logic and validation where possible. The parser should return what was parsed to Spark, which should decide whether it is supported.
I don't think this is a blocker because it is in the SparkSqlParser instead of the one in catalyst. We can fix this when moving these commands to v2.
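A sketch of the separation being suggested, with simplified types and a hypothetical format whitelist (the real supported set lives elsewhere in Spark):

```scala
case class SerdeInfo(storedAs: Option[String])

// Parser side: no validation, only capture what was written.
def parseStoredAs(format: String): SerdeInfo = SerdeInfo(Some(format))

// Planner/analyzer side: decide support, with a meaningful error.
def validate(info: SerdeInfo): Unit = info.storedAs.foreach { f =>
  if (!Set("parquet", "orc", "textfile").contains(f.toLowerCase)) {
    throw new IllegalArgumentException(s"Unsupported file format: $f")
  }
}
```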
@rdblue thanks for the suggestions! Pushed a commit to update it, please take another look, thanks!
| "CREATE OR REPLACE TEMPORARY TABLE ..., use CREATE TEMPORARY VIEW instead", | ||
| ctx) | ||
| operationNotAllowed("CREATE OR REPLACE TEMPORARY TABLE is not supported yet. " + | ||
| "Please use CREATE OR REPLACE TEMPORARY VIEW as an alternative.", ctx) |
I think this should use the original error message, "CREATE OR REPLACE TEMPORARY TABLE ..., use CREATE TEMPORARY VIEW instead".
Using operationNotAllowed means that the message is prefixed with "Operation not allowed: ", so adding "is not supported yet" is not helpful and just makes the message harder to read. In addition, "yet" implies that this will be supported, and there is no reason to imply that.
Context:

    .getOrElse(StructType(partCols))
    if (temp) {
      operationNotAllowed("CREATE TEMPORARY TABLE is not supported yet. " +
        "Please use CREATE TEMPORARY VIEW as an alternative.", ctx)
This should use the other error message, "CREATE TEMPORARY TABLE ... AS ..., use CREATE TEMPORARY VIEW instead".
As I noted on the similar case, "is not supported yet" is both redundant and misleading. I don't think Spark intends to implement CREATE TEMPORARY TABLE. Even if it may eventually be implemented, it has not been supported for years, so there is no value in implying that it will be.
Please update to use the simpler and clearer error message.
I think the only things that need to be fixed are the error messages that were changed and are now longer and less clear.
Thanks, @cloud-fan. I've merged this.