Skip to content

Conversation

@abhi-airspace-intelligence
Copy link
Contributor

@abhi-airspace-intelligence abhi-airspace-intelligence commented Aug 28, 2025

What kind of change does this PR introduce?

Adds support for partitioned tables. It does this through a very convoluted sql query that I basically guess-and-checked until it worked. It basically uses a bunch of heuristics to confirm that if a table is a child table of a partitioned table, it's allowed to treat their PKs as its own.

What is the current behavior?

Fixes #296

What is the new behavior?

Partitioned tables are treated as a single table, and replicate as one unit.

Additional context

Note that this is stacked on top of my other MR because I found this bug after fixing that bug. I will also note that I happened to find a race condition in this PR where a replicate worker tries to push events while a sync worker is waiting for a schema. I solved this by simply blocking all workers until all schemas are fixed, but this is a suboptimal solution, to say the least.

@abhi-airspace-intelligence abhi-airspace-intelligence requested a review from a team as a code owner August 28, 2025 20:25
@abhi-airspace-intelligence abhi-airspace-intelligence changed the title Add partitioned table support feat(postgres): add partitioned table support Aug 28, 2025
@abhi-airspace-intelligence abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch 4 times, most recently from e65c749 to 230f109 Compare August 29, 2025 18:20
Copy link
Contributor

@iambriccardo iambriccardo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few nits:

  • We generally use only lowercase queries (this is a standard we adopt at Supabase).
  • Lets first merge the PR related to PG 14 support and then rebase it on this one to have a clean diff.

"select schemaname, tablename from pg_publication_tables where pubname = {};",
"select pt.schemaname, pt.tablename from pg_publication_tables pt
join pg_class c on c.relname = pt.tablename
join pg_namespace n on n.oid = c.relnamespace AND n.nspname = pt.schemaname
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there the need to perform this JOIN? Seems like we don't really need to validate namespaces.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There wasn't, no. I think this was leftover code that I included attempting to fix the race issue I encountered. I cannot replicate it anymore.

@abhi-airspace-intelligence abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch 6 times, most recently from 4670481 to c983ec9 Compare September 2, 2025 14:21
@abhi-airspace-intelligence
Copy link
Contributor Author

This MR has mostly been rewritten, as I discovered there wasn't an easy way to capture new partitions being created. I did discover publish_via_partition_root, which treats a partitioned table as the "root" table in the eyes of logical replication which does exactly the behavior I want (and is likely what consumers want as well).

@abhi-airspace-intelligence
Copy link
Contributor Author

Tests are broken again on the partial branch. Can be solved by this patch: uni-intelligence@c0ec81f

@abhi-airspace-intelligence abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch 4 times, most recently from 96cc8de to 6522d79 Compare September 10, 2025 20:23
@abhi-airspace-intelligence abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch 5 times, most recently from 67cc277 to 4893b04 Compare September 17, 2025 15:36
@abhi-airspace-intelligence abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch from 4893b04 to 6d5132b Compare September 22, 2025 15:04
@abhi-airspace-intelligence abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch 2 times, most recently from 9b70c19 to 82436de Compare September 29, 2025 20:41
@abhi-airspace-intelligence
Copy link
Contributor Author

This is ready for review! Note that I did cherry-pick #361 on top of this since I was running non-bigquery tests open, but happy to revert if desired.

@iambriccardo
Copy link
Contributor

This is ready for review! Note that I did cherry-pick #361 on top of this since I was running non-bigquery tests open, but happy to revert if desired.

Thanks for the PR, once I have time, I will go through it. We are currently busy with other work so it might take some time.

@iambriccardo
Copy link
Contributor

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting

Comment on lines +423 to +433
pub_tables as (
select r.prrelid as oid
from pg_publication_rel r
join pg_publication p on p.oid = r.prpubid
where p.pubname = {pub} and (select has from has_rel)
union all
select c.oid
from pg_publication_tables pt
join pg_class c on c.relname = pt.tablename
where pt.pubname = {pub} and not (select has from has_rel)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] Join publication tables without schema

The new recursive query in PgReplicationClient::get_publication_table_ids now joins pg_publication_tables to pg_class only on relname. If a publication includes two tables with the same name in different schemas (e.g. sales.users and archive.users), this join can resolve to the wrong OID and the client will subscribe to or copy data from the wrong table. The previous implementation joined on both schema and table name and avoided this ambiguity.

Useful? React with 👍 / 👎.

@abhi-airspace-intelligence
Copy link
Contributor Author

This is ready for review! Note that I did cherry-pick #361 on top of this since I was running non-bigquery tests open, but happy to revert if desired.

Thanks for the PR, once I have time, I will go through it. We are currently busy with other work so it might take some time.

No worries! This library has been treating me well, appreciate all the hard work.

Copy link
Contributor

If you find any other issues, keep them coming!

@iambriccardo
Copy link
Contributor

Quickly coming back to this PR since I am doing some research on partitioned tables, I assume this works only when publish_via_partition_root=true is specified on the publication right?

@abhi-airspace-intelligence abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch from c41ed85 to 632d380 Compare October 13, 2025 15:38
@abhi-airspace-intelligence abhi-airspace-intelligence requested a review from a team as a code owner October 13, 2025 15:38
@abhi-airspace-intelligence
Copy link
Contributor Author

Quickly coming back to this PR since I am doing some research on partitioned tables, I assume this works only when publish_via_partition_root=true is specified on the publication right?

Yep, that's right. I originally tried it doing it the other way by replicating each table but it was very hacky. Postgres itself seems to prefer setting this and encourages it in their docs, it just isn't the default for backwards-compatibility

@abhi-airspace-intelligence abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch from f5665e1 to d6d5de4 Compare October 20, 2025 01:43
@abhi-airspace-intelligence abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch from d6d5de4 to 55d2ae5 Compare October 20, 2025 01:50
@iambriccardo
Copy link
Contributor

iambriccardo commented Oct 21, 2025

Hi @abhi-airspace-intelligence, I am now focusing on this task specifically and I am taking over into a PR inspired by this. Will mention your work within that PR, thanks!

@abhi-airspace-intelligence
Copy link
Contributor Author

Thanks! I hope your solution is better than mine, as very much a non-postgres expert!

Copy link
Contributor

Thanks to you!

We are currently exploring and clarifying semantics for the partitioning. Just a question to you, what would you expect to the replication stream if you detach a table from a partition?

  1. The data of the partition is deleted downstream.
  2. The data is kept downstream.
  3. The data is kept downstream and the detached table is added as standalone table (only if FOR ALL TABLES or FOR TABLES in a specific schema is enabled on the publication).

@abhi-airspace-intelligence
Copy link
Contributor Author

We have our production databases set up such that we periodically detach partitions while backing up data via debezium to BigQuery in environments where we have access to BigQuery. In this PR, when a partition is detached, I've noticed that it creates no events whatsoever, it's basically as if the data went "poof!". I heavily rely on this, but it's an open question as to whether it's the correct behavior.

I think of partitioning as a way to prevent unbounded data growth, essentially. Since our company's data is heavily event-stream driven (and mostly append-only), this works well for us. Perhaps it's not correct for other usecases.

@iambriccardo
Copy link
Contributor

We have our production databases set up such that we periodically detach partitions while backing up data via debezium to BigQuery in environments where we have access to BigQuery. In this PR, when a partition is detached, I've noticed that it creates no events whatsoever, it's basically as if the data went "poof!". I heavily rely on this, but it's an open question as to whether it's the correct behavior.

I think of partitioning as a way to prevent unbounded data growth, essentially. Since our company's data is heavily event-stream driven (and mostly append-only), this works well for us. Perhaps it's not correct for other usecases.

That's good to know!

It seems like this is a wanted behavior from the multiple people I talked with and semantically it makes sense. Detaching a partition doesn't really mean deleting the data so it's fair that Postgres behaves in the way it does.

I will close this PR in favor of this one: #410 which takes over

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Partitioned tables do not work directly, due to lack of PK enforcement

2 participants