[SPARK-19235] [SQL] [TESTS] Enable Test Cases in DDLSuite with Hive Metastore #16592
Conversation
  }.getMessage
  assert(e.contains("Hive support is required to use CREATE Hive TABLE AS SELECT"))
  }
}
The above four cases are copied from the existing ones in DDLSuite. These test cases only make sense for InMemoryCatalog.
test("drop table - data source table") {
  testDropTable(isDatasourceTable = true)
}
The above 10 test cases currently run with InMemoryCatalog only, because HiveExternalCatalog does not allow users to change the table provider from hive to another one. We can fix it in future PRs.

cc @cloud-fan I think we really need to do this ASAP to improve the test case coverage of DDL commands, as I found while working on PR #16587.

Test build #71420 has finished for PR 16592 at commit

can we wait for #16597?

Sure, no problem.

retest this please

Test build #71526 has finished for PR 16592 at commit

retest this please

Test build #71651 has finished for PR 16592 at commit

ping @cloud-fan

private val escapedIdentifier = "`(.+)`".r
def normalizeCatalogTable(table: CatalogTable): CatalogTable = {
I think we only need this in HiveCatalogedDDLSuite?
Yes. Let me move it to HiveCatalogedDDLSuite
intercept[DatabaseAlreadyExistsException] {
  sql(s"CREATE DATABASE $dbName")
if (isUsingHiveMetastore) {
  val e = intercept[AnalysisException] {
We can write `intercept[AnalysisException]` and leave a TODO saying that HiveExternalCatalog should throw `DatabaseAlreadyExistsException`.
Sure.
val dbPath = new Path(spark.sessionState.conf.warehousePath)
s"${dbPath.toUri.getPath.stripSuffix("/")}/$dbName.db"
} else {
s"spark-warehouse/$dbName.db"
why hardcode it?
: ) copied from the original place. Let me clean it. Thanks!
Test build #71740 has finished for PR 16592 at commit

@gatorsmile is this PR going to be merged soon? The ALTER ADD column PR #16626 also depends on this to create test cases for InMemoryCatalog. Thanks!

Let me continue the work now

Test build #73975 has started for PR 16592 at commit
  }
}

Seq("a b", "a:b", "a%b").foreach { specialChars =>
This test case is a duplicate of the one in HiveCatalogedDDLSuite. Removed.
}

Seq("a b", "a:b", "a%b").foreach { specialChars =>
  test(s"location uri contains $specialChars for database") {
This test case is a duplicate of the one in HiveCatalogedDDLSuite. Removed.
Test build #74037 has finished for PR 16592 at commit

cc @cloud-fan

Test build #74116 has finished for PR 16592 at commit
val expectedLocation = makeQualifiedPath(s"spark-warehouse/$dbName.db")
val dbPath = new Path(spark.sessionState.conf.warehousePath)
val expectedDBLocation = s"file:${dbPath.toUri.getPath.stripSuffix("/")}/$dbName.db"
val expectedDBUri = CatalogUtils.stringToURI(expectedDBLocation)
how about
`val expectedDBPath = new Path(spark.sessionState.conf.warehousePath, s"$dbName.db").toUri`
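The suggestion above replaces manual string concatenation with path composition. A minimal sketch of the same idea using only `java.nio.file` (the suite itself uses `org.apache.hadoop.fs.Path`; `warehousePath` and `dbName` below are illustrative stand-ins, not values from the PR):

```scala
import java.nio.file.Paths

// Stand-ins for spark.sessionState.conf.warehousePath and the database name.
val warehousePath = "/tmp/spark-warehouse"
val dbName = "mydb"

// Compose the expected database path instead of concatenating strings
// and stripping trailing slashes by hand.
val expectedDBPath = Paths.get(warehousePath, s"$dbName.db").toUri
// file:///tmp/spark-warehouse/mydb.db
```

Composing paths this way also sidesteps the `stripSuffix("/")` bookkeeping seen in the surrounding diff lines.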
    name: TableIdentifier): CatalogTable = {
  val storage =
    CatalogStorageFormat(
      locationUri = Some(catalog.defaultTablePath(name)),
nit: CatalogStorageFormat.copy(locationUri = xxx)
val dbPath = new Path(spark.sessionState.conf.warehousePath)
val expectedDBLocation =
  s"file:${dbPath.toUri.getPath.stripSuffix("/")}/$dbNameWithoutBackTicks.db"
val expectedDBUri = CatalogUtils.stringToURI(expectedDBLocation)
same here
val dbNameWithoutBackTicks = cleanIdentifier(dbName)
val location = makeQualifiedPath(s"spark-warehouse/$dbNameWithoutBackTicks.db")
val dbPath = new Path(spark.sessionState.conf.warehousePath)
val expectedDBLocation =
same here
  sql(s"DROP DATABASE $dbName")
}.getMessage
assert(message.contains(s"Database '$dbNameWithoutBackTicks' not found"))
if (isUsingHiveMetastore) {
add a TODO to unify the exception?
  sql(s"DROP DATABASE $dbName RESTRICT")
}.getMessage
assert(message.contains(s"Database '$dbName' is not empty. One or more tables exist"))
if (isUsingHiveMetastore) {
add a TODO to unify the exception?
assert(catalog.getPartition(tableIdent, part1).parameters("numFiles") == "1")
assert(catalog.getPartition(tableIdent, part2).parameters("numFiles") == "2")
if (!isUsingHiveMetastore) {
  assert(catalog.getPartition(tableIdent, part1).parameters("numFiles") == "1")
why only test it for InMemoryExternalCatalog?
After ALTER TABLE, the statistics info of the first partition is removed by the Hive metastore. : (
Let me write a comment in the code.
Done.
}

private def testDropTable(isDatasourceTable: Boolean): Unit = {
protected def testDropTable(isDatasourceTable: Boolean): Unit = {
if it's only used in InMemoryCatalogedDDLSuite, we should move this method there
This is not only for InMemoryCatalogedDDLSuite. We have another call here.
"maxFileSize",
"minFileSize",
// EXTERNAL is not non-deterministic, but it is filtered out for external tables.
"EXTERNAL"
this is gone now.
Test build #74160 has finished for PR 16592 at commit

retest this please

Test build #74213 has finished for PR 16592 at commit
val partitionLocation = if (isUsingHiveMetastore) {
  val tableLocation = catalog.getTableMetadata(tableIdent).storage.locationUri
  assert(tableLocation.isDefined)
  makeQualifiedPath(tableLocation.get + "/paris")
we should avoid hardcoding the file separator
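A sketch of that suggestion using only `java.nio.file` (the real code uses Hadoop's `Path`, and `tableLocation` here is a stand-in value): joining path components keeps the platform separator out of the test.

```scala
import java.nio.file.Paths

// Stand-in for the table location returned by the catalog.
val tableLocation = "/tmp/warehouse/tab1"

// Join the partition directory instead of appending "/paris" by hand.
val partitionLocation = Paths.get(tableLocation, "paris").toUri
// file:///tmp/warehouse/tab1/paris
```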
Test build #74223 has finished for PR 16592 at commit

great~ waiting for this to merge so I can fix my conflicts~

ping @cloud-fan
}

private def getDBPath(dbName: String): URI = {
  val warehousePath = s"file:${spark.sessionState.conf.warehousePath.stripPrefix("file:")}"
we can call makeQualifiedPath here. We can fix it in a follow-up.
  fs.makeQualified(hadoopPath).toUri
}

def makeQualifiedPath(path: Path): URI = {
actually we don't need this; we already have `def makeQualifiedPath(path: String): URI`. Let's remove it in a follow-up.
I have fixed the above two issues in #17095

great work! merging to master!
}
// TODO(gatorsmile): fix the bug in alter table set location.
// if (isUsingHiveMetastore) {
//   assert(storageFormat.properties.get("path") === expected)
Do we need to fix this bug to satisfy this test case? When porting these test cases, a bug of `SET LOCATION` was found: `path` is not set when the location is changed.
Yeah. We need to fix it. Thanks for working on it.
What changes were proposed in this pull request?

So far, the test cases in `DDLSuite` only verify the behaviors of `InMemoryCatalog`. That means they do not cover the scenarios using `HiveExternalCatalog`. Thus, we need to improve the existing test suite to run these cases against the Hive metastore as well.

When porting these test cases, a bug of `SET LOCATION` was found: `path` is not set when the location is changed.

After this PR, a few changes are made, as summarized below:
- `DDLSuite` becomes an abstract class. Both `InMemoryCatalogedDDLSuite` and `HiveCatalogedDDLSuite` extend it. `InMemoryCatalogedDDLSuite` uses `InMemoryCatalog`; `HiveCatalogedDDLSuite` uses `HiveExternalCatalog`.
- `InMemoryCatalogedDDLSuite` contains all the existing test cases in `DDLSuite`.
- `HiveCatalogedDDLSuite` contains a subset of `DDLSuite`; the test cases that only make sense for `InMemoryCatalog` are excluded.

TODO: in future PRs, we need to remove `HiveDDLSuite` and move its test cases to either `DDLSuite`, `InMemoryCatalogedDDLSuite`, or `HiveCatalogedDDLSuite`.

How was this patch tested?

N/A
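The suite layout described in the summary can be sketched as follows. This is a simplified illustration, not the actual code: the real suites extend Spark's test base classes, and the shared test bodies live in the abstract parent, branching on a flag like `isUsingHiveMetastore` where catalog behavior differs.

```scala
// Simplified sketch of the class hierarchy described in the PR summary.
abstract class DDLSuite {
  // Lets shared test bodies branch on catalog-specific behavior.
  def isUsingHiveMetastore: Boolean
  // Shared test cases would be defined here.
}

class InMemoryCatalogedDDLSuite extends DDLSuite {
  override def isUsingHiveMetastore: Boolean = false
}

class HiveCatalogedDDLSuite extends DDLSuite {
  override def isUsingHiveMetastore: Boolean = true
}
```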