@EnricoMi
Contributor

What changes were proposed in this pull request?

This is a follow-up on #16685 and #16692.

Implements upsert mode for SaveMode.Append of the MySql, MsSql, and Postgres JDBC source.

See #41611 for an alternative using the MERGE INTO command (not supported by MySql).
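
As a rough illustration of what an upsert mode needs from each database, the sketch below builds the kind of dialect-specific statement involved: `INSERT ... ON CONFLICT ... DO UPDATE` for Postgres and `INSERT ... ON DUPLICATE KEY UPDATE` for MySql. The object and method names are hypothetical and not taken from this PR, and the SQL the PR actually generates may differ. Identifiers are not quoted or escaped here.

```scala
// Illustrative sketch only: UpsertSqlSketch and its methods are hypothetical
// names, not part of Spark or of this PR.
object UpsertSqlSketch {

  // Columns that are updated when a row with the same key already exists.
  private def nonKeyColumns(columns: Seq[String], keyColumns: Seq[String]): Seq[String] = {
    val nonKeys = columns.filterNot(c => keyColumns.contains(c))
    require(nonKeys.nonEmpty, "at least one non-key column is needed for the update clause")
    nonKeys
  }

  /** PostgreSQL syntax: INSERT ... ON CONFLICT (keys) DO UPDATE SET col = EXCLUDED.col */
  def postgres(table: String, columns: Seq[String], keyColumns: Seq[String]): String = {
    val placeholders = columns.map(_ => "?").mkString(", ")
    val updates = nonKeyColumns(columns, keyColumns)
      .map(c => s"$c = EXCLUDED.$c")
      .mkString(", ")
    s"INSERT INTO $table (${columns.mkString(", ")}) VALUES ($placeholders) " +
      s"ON CONFLICT (${keyColumns.mkString(", ")}) DO UPDATE SET $updates"
  }

  /** MySQL syntax: INSERT ... ON DUPLICATE KEY UPDATE col = VALUES(col).
    * MySQL picks the conflicting primary or unique key itself, so the key
    * columns are only used to decide which columns to update. */
  def mysql(table: String, columns: Seq[String], keyColumns: Seq[String]): String = {
    val placeholders = columns.map(_ => "?").mkString(", ")
    val updates = nonKeyColumns(columns, keyColumns)
      .map(c => s"$c = VALUES($c)")
      .mkString(", ")
    s"INSERT INTO $table (${columns.mkString(", ")}) VALUES ($placeholders) " +
      s"ON DUPLICATE KEY UPDATE $updates"
  }
}
```

Executing such a statement twice with the same key updates the existing row instead of raising a duplicate-key error, which is the property the upsert mode relies on.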

Why are the changes needed?

The JDBC writer only supports either truncating the existing table or inserting. Duplicates, i.e. rows with identical values in the primary or unique index columns, cause an exception, which makes it impossible to update existing rows while inserting new ones.

Re-evaluating a partition after executor loss re-inserts rows that were already inserted in an earlier attempt. The resulting duplicate-key exception fails the entire Spark job.

Does this PR introduce any user-facing change?

This adds upsert and upsertKeyColumns options for SaveMode.Append of the JDBC source.
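
A minimal usage sketch, assuming the options behave as described in this PR: the option names `upsert` and `upsertKeyColumns` come from the description above, while the value formats (`"true"` and a comma-separated column list) are assumptions. The helper object and its method names are hypothetical. These options only exist with this patch applied.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

// Hypothetical helper, for illustration only.
object JdbcUpsertUsage {

  // Option names are from the PR description; the value formats are assumed.
  def upsertOptions(keyColumns: Seq[String]): Map[String, String] = Map(
    "upsert" -> "true",
    "upsertKeyColumns" -> keyColumns.mkString(",")
  )

  // Appends df to the given table, asking the JDBC source to update rows
  // whose key columns already exist instead of failing on duplicates.
  def appendWithUpsert(
      df: DataFrame,
      url: String,
      table: String,
      keyColumns: Seq[String]): Unit = {
    df.write
      .format("jdbc")
      .option("url", url)
      .option("dbtable", table)
      .options(upsertOptions(keyColumns))
      .mode(SaveMode.Append)
      .save()
  }
}
```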

How was this patch tested?

Tests in JdbcSuite and integration suites.

@EnricoMi
Contributor Author

@MaxGekk thanks for the comments, all addressed in a03345c0.

@LuciferYang
Contributor

LuciferYang commented Mar 6, 2025

cc @beliefer and @yaooqinn FYI

  truncateTable(conn, options)
  val tableSchema = JdbcUtils.getSchemaOption(conn, options)
- saveTable(df, tableSchema, isCaseSensitive, options)
+ saveTable(df, tableSchema, isCaseSensitive, upsert, options)

It looks strange to apply upsert when SaveMode is Overwrite here.

  dropTable(conn, options.table, options)
  createTable(conn, options.table, df.schema, isCaseSensitive, options)
- saveTable(df, Some(df.schema), isCaseSensitive, options)
+ saveTable(df, Some(df.schema), isCaseSensitive, upsert, options)

ditto

@beliefer left a comment

I have no idea. It seems we should add a new SaveMode Upsert.

@github-actions

github-actions bot commented Aug 7, 2025

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Aug 7, 2025
@github-actions github-actions bot closed this Aug 8, 2025