
Conversation

@sureshthalamati
Contributor

sureshthalamati commented Oct 19, 2015

This patch adds DB2-specific data type mappings to the JDBC data source DB2 dialect: decfloat, real, xml, and timestamp with time zone (a DB2 for z/OS-specific type) on read, and byte and short on write. The default mappings do not work for these types when reading from or writing to a DB2 database.

Added a Docker test and a JDBC unit test case.
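In outline, dialect-specific JDBC mappings like these translate database type names to framework types on read, and framework types to database DDL types on write. The sketch below illustrates the idea with hypothetical helper names and the mappings described in this PR; it is not the actual Spark DB2Dialect code.

```java
// Illustrative sketch only: shows the shape of the DB2 type mappings
// described in this PR, not the real Spark DB2Dialect implementation.
public class Db2TypeMappingSketch {

    // Read path: DB2-specific types that the default JDBC mapping mishandles.
    static String toCatalystType(String db2TypeName) {
        switch (db2TypeName) {
            case "DECFLOAT": return "DecimalType";   // decimal floating point
            case "REAL":     return "FloatType";
            case "XML":      return "StringType";
            case "TIMESTAMP WITH TIME ZONE":          // DB2 for z/OS only
                             return "TimestampType";
            default:         return null;             // fall back to the default mapping
        }
    }

    // Write path: framework types whose default DDL DB2 rejects.
    static String toDb2Type(String catalystType) {
        switch (catalystType) {
            case "ByteType":  return "SMALLINT";      // DB2 has no 1-byte integer type
            case "ShortType": return "SMALLINT";
            default:          return null;            // fall back to the default mapping
        }
    }

    public static void main(String[] args) {
        System.out.println(toCatalystType("XML"));    // StringType
        System.out.println(toDb2Type("ByteType"));    // SMALLINT
    }
}
```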

@rxin
Contributor

rxin commented Oct 20, 2015

Jenkins, test this please.

@SparkQA

SparkQA commented Oct 20, 2015

Test build #1930 has finished for PR 9162 at commit ec8e546.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

isn't this mapping it to normal decimal with precision 31?

Contributor

i.e. precision lost here?

Contributor Author

With this mapping, DB2 will throw an error during insert execution if the precision of the value being written is > 31. And if the value's scale is higher than 2, the value will be rounded to a scale of 2.

The other alternative was to let it fail with an error when CREATE TABLE is executed (the current behavior). I was not sure how commonly the decimal type system default (38, 18) is used to create the data frame. I thought it was better to map to DB2's maximum precision instead of failing with an error, considering there is no way to write decimals of precision > 31 to DB2.

If you think it is better to fail on create instead of surprising users during execution, I will update the patch.

I am also working on a pull request (SPARK-10849) to allow users to specify the target database column type. That will let users specify the decimal precision and scale of their choice in these kinds of scenarios.
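The behavior described above can be reproduced with plain java.math.BigDecimal, independent of DB2 itself (a sketch; DECIMAL(31, 2) here stands in for the DB2 target column type): a value whose scale exceeds 2 is rounded rather than rejected, while a value whose rounded precision exceeds 31 cannot fit the column.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalFitSketch {

    // Would this value fit a DECIMAL(31, 2) column without overflow?
    static boolean fitsDecimal31_2(BigDecimal v) {
        BigDecimal rounded = v.setScale(2, RoundingMode.HALF_UP);
        return rounded.precision() <= 31;
    }

    public static void main(String[] args) {
        // Scale > 2 is rounded, not rejected:
        BigDecimal small = new BigDecimal("12345.6789");
        System.out.println(small.setScale(2, RoundingMode.HALF_UP)); // 12345.68

        // 30 integer digits + 2 fraction digits = precision 32 > 31: overflow.
        BigDecimal big = new BigDecimal("123456789012345678901234567890.12");
        System.out.println(fitsDecimal31_2(big)); // false
    }
}
```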

@sureshthalamati
Contributor Author

Thank you for reviewing the patch, Reynold. Please let me know if it needs any more changes.

@rick-ibm
Contributor

rick-ibm commented Nov 4, 2015

I think these are fine defaults. The work on SPARK-10849 gives fine-grained control to power users. LGTM. Thanks.

@sureshthalamati
Contributor Author

@rxin Wondering if any additional work is needed on this patch? Thanks.

@JoshRosen
Contributor

Hey @sureshthalamati, take a look at #9503, which we just merged to re-enable the Docker JDBC integration tests; you should be able to build on that patch to write integration tests that run against a real DB2 instance that's running in Docker.

@sureshthalamati
Contributor Author

Thank you for the information, Josh. Luciano seems to be working on a PR for the DB2 Docker test setup. I will check with him and incorporate the type mapping tests into the Docker test case.


sureshthalamati force-pushed the db2dialect_enhancements-spark-10655 branch from ec8e546 to 729fb85 on November 25, 2015
@lresende
Member

I have a WIP PR #9893, and an open question about the JDBC drivers for the other Docker tests that I think @JoshRosen could help with. Anyway, these two PRs are independent of each other, no?

sureshthalamati force-pushed the db2dialect_enhancements-spark-10655 branch from 27eeba0 to 5d022be on May 21, 2016
@sureshthalamati
Contributor Author

@JoshRosen @rxin

Updated the PR with the DB2 Docker test case for the DB2-specific type mappings added in this PR. TIMESTAMP WITH TIME ZONE is specific to DB2 for z/OS, so I could not add a Docker test case for that data type.

Can you please review the updated PR?

@sureshthalamati
Contributor Author

ping @JoshRosen @rxin

This issue was blocked for a while due to Docker-related issues. If you can review the updated PR, that would be great.

Member

Is the BooleanType line duplicated here? Probably a copy-and-paste issue.

@lresende
Member

Other than the minor comment around BooleanType, LGTM.

@sureshthalamati
Contributor Author

Thanks, Luciano. Addressed your comment.

@sureshthalamati
Contributor Author

@JoshRosen @rxin Any suggestions to improve this fix to get it merged?

@lresende
Member

lresende commented Jun 3, 2016

Jenkins retest this please

Member

How about DECFLOAT(16) and DECFLOAT(34)?

Contributor Author

Thanks for reviewing, @gatorsmile. Added test cases for those two variations of the DECFLOAT type.
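For background on those two variations (my understanding, not stated in the thread): DB2's DECFLOAT(16) and DECFLOAT(34) are decimal floating-point types with 16 and 34 significant digits, matching IEEE 754 decimal64 and decimal128. Java's MathContext constants can mimic the two precisions:

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class DecfloatPrecisionSketch {
    public static void main(String[] args) {
        BigDecimal v = new BigDecimal("1.23456789012345678901234567890123456789");

        // DECFLOAT(16): 16 significant digits, like IEEE 754 decimal64.
        System.out.println(v.round(MathContext.DECIMAL64));
        // 1.234567890123457

        // DECFLOAT(34): 34 significant digits, like IEEE 754 decimal128.
        System.out.println(v.round(MathContext.DECIMAL128));
        // 1.234567890123456789012345678901235
    }
}
```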

@SparkQA

SparkQA commented Sep 26, 2016

Test build #65932 has finished for PR 9162 at commit f85c3d9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • assert(types(8).equals("class java.math.BigDecimal"))
    • assert(types(9).equals("class java.math.BigDecimal"))

@gatorsmile
Member

@sureshthalamati Could you please retest it using DB2?

@gatorsmile
Member

retest this please

@SparkQA

SparkQA commented Jun 17, 2017

Test build #78198 has finished for PR 9162 at commit f85c3d9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • assert(types(8).equals("class java.math.BigDecimal"))
    • assert(types(9).equals("class java.math.BigDecimal"))

@sureshthalamati
Contributor Author

Sure, @gatorsmile. Thanks.

sureshthalamati force-pushed the db2dialect_enhancements-spark-10655 branch from f85c3d9 to 8b8bc9a on June 21, 2017
@sureshthalamati
Contributor Author

@gatorsmile I rebased and ran the DB2 Docker test on my machine; it ran fine.

@SparkQA

SparkQA commented Jun 21, 2017

Test build #78337 has finished for PR 9162 at commit 8b8bc9a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • assert(types(8).equals("class java.math.BigDecimal"))
    • assert(types(9).equals("class java.math.BigDecimal"))

@gatorsmile
Member

The scope of this PR is limited to the DB2 dialect. The risk is pretty small even if the fix might not cover all the scenarios. Thanks! @sureshthalamati

LGTM

@gatorsmile
Member

Thanks! Merging to master.

asfgit closed this in 9ce714d on Jun 21, 2017
@sureshthalamati
Contributor Author

Thank you @gatorsmile

robert3005 pushed a commit to palantir/spark that referenced this pull request Jun 29, 2017
…alect.

This patch adds DB2-specific data type mappings to the JDBC data source DB2 dialect: decfloat, real, xml, and timestamp with time zone (a DB2 for z/OS-specific type) on read, and byte and short on write. The default mappings do not work for these types when reading from or writing to a DB2 database.

Added a Docker test and a JDBC unit test case.

Author: sureshthalamati <[email protected]>

Closes apache#9162 from sureshthalamati/db2dialect_enhancements-spark-10655.

7 participants