-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-5753] [SQL] add JDBCRDD support for postgres types: uuid, hstore and arrays #4549
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thanks for doing this and sorry for the delay reviewing! Des H2 support any of these types? Would it be possible to add some tests? |
|
test this please |
|
@liancheng what is the status of the postgres docker tests. It would be great if we could verify these changes. |
|
Test build #28530 has finished for PR 4549 at commit
|
|
not sure how or where to write tests. esp for the postgres-specific stuff. I'd appreciate some help and direction for this if you'd like me to write the tests. h2 does have an array type that could be tested for the ArrayConversion. I'm guessing this goes into JDBCSuite.scala. |
|
We used to have docker tests for running postgres specific things but had to remove them for dependency reasons. @liancheng will have more information about the status there. For the array type, yeah, it would be great to add something in JDBCSuite.scala. Thanks! |
|
@marmbrus @lepfhty The integration tests were removed in #4872 because they cause dependency issue and are ignored from the very beginning anyway. I plan to but haven't migrated the JDBC integration tests to databricks/spark-integration-tests yet. Tracking the migration with databricks/spark-integration-tests#6. |
|
@liancheng, can you provide pointers here on how tests for postgres could be added? I'd like to merge this feature, but I'm worried about doing it without tests. I'd be fine resurrecting the tests in apache (now that the release is over maybe we have more time to fix the dependency issues) as this would let us test patches like this more easily. @JoshRosen any idea if jenkins would support running docker based tests in the near future? |
|
@marmbrus, I think that we should be able to get Docker set up on Jenkins soon, since I think it should be as simple as an |
|
+1 for UUID type support. |
|
Yeah, @lepfhty, very sorry to have let this go stale. If you have time to reopen this, I believe that we can try and get the integration test we have already written working again (now that our jenkin's cluster has docker support) and augment them to test your new code. Basically start by reverting this patch: 76b472f Then we can add tests for these specific python quirks. |
|
Can one of the admins verify this patch? |
|
Really unfortunate that this patch (which seems unlikely to cause regressions) was held "hostage" by the lack of testing capabilities. By this logic pretty much all JDBC-related changes should be disallowed. :( |
|
@lepfhty do you mind closing this PR? |
|
I'll try to rework this into JdbcDialects |
|
@lepfhty Has any progress been made on this? Can you point me to a PR/branch/JIRA issue/... that I can subscribe to? |
|
sorry i haven't gotten around to this. they changed their base classes. it might be easier to do as a spark-package... i could give it another try in a week or two |
|
I've given this a shot in #9137 |
PostgresQuirks:
JDBCRDD needed conversion functions (nonspecific to postgres) for ArrayType and MapType.
ResultSet#getArray(int).getArray()method, so it should generally work.ResultSet#getObject(int)returns ajava.util.Map<String,String>. Maybe this is too strong an assumption?Not sure how to run/write tests. Would appreciate pointers here.
Thanks!