feat(postgres): add partitioned table support #297

abhi-airspace-intelligence · 2025-08-28T20:25:54Z

What kind of change does this PR introduce?

Adds support for partitioned tables. It does this through a fairly convoluted SQL query that I arrived at by guess-and-check. The query uses a set of heuristics to confirm that a table is a child of a partitioned table, in which case the child is allowed to treat the parent's primary keys as its own.

What is the current behavior?

Fixes #296

What is the new behavior?

Partitioned tables are treated as a single table, and replicate as one unit.

Additional context

Note that this is stacked on top of my other MR because I found this bug after fixing that bug. I will also note that I happened to find a race condition in this PR where a replicate worker tries to push events while a sync worker is waiting for a schema. I solved this by simply blocking all workers until all schemas are fixed, but this is a suboptimal solution, to say the least.
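
For readers unfamiliar with the workaround described above, here is a minimal sketch of "block all workers until all schemas are ready". It is not the code from this PR: `SchemaGate` and `run_workers` are hypothetical names, and plain OS threads stand in for the pipeline's async workers.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// One-shot gate (hypothetical): workers block in `wait` until `open` is called.
struct SchemaGate {
    ready: Mutex<bool>,
    cond: Condvar,
}

impl SchemaGate {
    fn new() -> Self {
        SchemaGate { ready: Mutex::new(false), cond: Condvar::new() }
    }

    /// Marks all schemas as ready and wakes every waiting worker.
    fn open(&self) {
        *self.ready.lock().unwrap() = true;
        self.cond.notify_all();
    }

    /// Blocks the caller until the gate has been opened.
    fn wait(&self) {
        let mut ready = self.ready.lock().unwrap();
        while !*ready {
            ready = self.cond.wait(ready).unwrap();
        }
    }
}

/// Spawns `worker_count` workers that wait on the gate before "pushing events",
/// then opens the gate and returns the recorded order of actions.
fn run_workers(worker_count: usize) -> Vec<String> {
    let gate = Arc::new(SchemaGate::new());
    let log = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = (0..worker_count)
        .map(|id| {
            let gate = Arc::clone(&gate);
            let log = Arc::clone(&log);
            thread::spawn(move || {
                gate.wait(); // no worker proceeds before the schemas are ready
                log.lock().unwrap().push(format!("worker {id} pushed events"));
            })
        })
        .collect();

    // Record readiness before opening the gate, so it is always logged first.
    log.lock().unwrap().push("schemas ready".to_string());
    gate.open();

    for handle in handles {
        handle.join().unwrap();
    }
    let events = log.lock().unwrap().clone();
    events
}
```

The cost of this approach is the one noted above: every worker stalls on the slowest schema, even workers whose own tables are already resolved.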

iambriccardo

A few nits:

- We generally use only lowercase queries (this is a standard we adopt at Supabase).
- Let's first merge the PR related to PG 14 support and then rebase this one on top of it to have a clean diff.

etl/tests/failpoints/pipeline_test.rs

etl-postgres/src/tokio/test_utils.rs

iambriccardo · 2025-09-01T07:27:40Z

etl/src/replication/client.rs

-            "select schemaname, tablename from pg_publication_tables where pubname = {};",
+            "select pt.schemaname, pt.tablename from pg_publication_tables pt
+         join pg_class c on c.relname = pt.tablename
+         join pg_namespace n on n.oid = c.relnamespace AND n.nspname = pt.schemaname


Is there a need to perform this join? It seems like we don't really need to validate namespaces.

abhi-airspace-intelligence · Sep 1, 2025

There wasn't, no. I think this was leftover code from my attempts to fix the race condition I encountered. I cannot reproduce that issue anymore.

abhi-airspace-intelligence · 2025-09-02T14:22:49Z

This MR has mostly been rewritten, as I discovered there wasn't an easy way to capture new partitions being created. I did discover publish_via_partition_root, which makes logical replication publish changes under the partitioned ("root") table rather than under the individual partitions. That is exactly the behavior I want (and is likely what consumers want as well).
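
As an illustration of the option mentioned above, the following sketch builds a `create publication` statement with `publish_via_partition_root = true`. The helper names (`quote_identifier`, `create_partition_root_publication`) and the publication and table names are hypothetical and are not part of this PR or of the etl crate; only the SQL option itself comes from Postgres.

```rust
/// Quotes a Postgres identifier: wraps it in double quotes and doubles any
/// embedded double quotes. (Hypothetical helper, for illustration only.)
fn quote_identifier(ident: &str) -> String {
    format!("\"{}\"", ident.replace('"', "\"\""))
}

/// Builds a lowercase `create publication` statement for a single partitioned
/// table, asking Postgres to publish changes under the root table.
fn create_partition_root_publication(publication: &str, schema: &str, table: &str) -> String {
    format!(
        "create publication {} for table {}.{} with (publish_via_partition_root = true);",
        quote_identifier(publication),
        quote_identifier(schema),
        quote_identifier(table),
    )
}
```

Calling `create_partition_root_publication("events_pub", "public", "events")` yields a statement that publishes the assumed partitioned table `public.events` as a single logical table.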

abhi-airspace-intelligence · 2025-09-02T15:56:39Z

Tests are broken again on the partial branch. Can be solved by this patch: uni-intelligence@c0ec81f

abhi-airspace-intelligence · 2025-09-29T20:56:18Z

This is ready for review! Note that I did cherry-pick #361 on top of this since I was running non-bigquery tests open, but happy to revert if desired.

iambriccardo · 2025-09-30T17:31:33Z

> This is ready for review! Note that I did cherry-pick #361 on top of this since I was running non-bigquery tests open, but happy to revert if desired.

Thanks for the PR, once I have time, I will go through it. We are currently busy with other work so it might take some time.

iambriccardo · 2025-09-30T17:31:46Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

chatgpt-codex-connector · 2025-09-30T17:36:55Z

etl/src/replication/client.rs

+            pub_tables as (
+                select r.prrelid as oid
+                from pg_publication_rel r
+                join pg_publication p on p.oid = r.prpubid
+                where p.pubname = {pub} and (select has from has_rel)
+                union all
+                select c.oid
+                from pg_publication_tables pt
+                join pg_class c on c.relname = pt.tablename
+                where pt.pubname = {pub} and not (select has from has_rel)


[P1] Join publication tables without schema

The new recursive query in PgReplicationClient::get_publication_table_ids now joins pg_publication_tables to pg_class only on relname. If a publication includes two tables with the same name in different schemas (e.g. sales.users and archive.users), this join can resolve to the wrong OID and the client will subscribe to or copy data from the wrong table. The previous implementation joined on both schema and table name and avoided this ambiguity.
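
To make the ambiguity concrete, here is a small in-memory sketch of the two join strategies. It does not query Postgres: the `Relation` struct, the helper functions, and the OIDs are made up for illustration, and the real fix would be the schema-qualified join on `pg_namespace` in the SQL query itself.

```rust
/// Stand-in for a `pg_class` row joined with its namespace (hypothetical).
#[derive(Debug, Clone, PartialEq)]
struct Relation {
    oid: u32,
    schema: &'static str,
    name: &'static str,
}

/// Two tables that share a name but live in different schemas.
fn sample_catalog() -> Vec<Relation> {
    vec![
        Relation { oid: 16384, schema: "sales", name: "users" },
        Relation { oid: 16390, schema: "archive", name: "users" },
        Relation { oid: 16400, schema: "sales", name: "orders" },
    ]
}

/// Mimics joining on `relname` only: every relation with that name matches.
fn resolve_by_name(catalog: &[Relation], name: &str) -> Vec<u32> {
    catalog.iter().filter(|r| r.name == name).map(|r| r.oid).collect()
}

/// Mimics joining on schema and name: at most one relation matches.
fn resolve_by_schema_and_name(catalog: &[Relation], schema: &str, name: &str) -> Option<u32> {
    catalog
        .iter()
        .find(|r| r.schema == schema && r.name == name)
        .map(|r| r.oid)
}
```

With this sample data, looking up `users` by name alone returns both OIDs, while looking up `archive.users` returns exactly one, which is the distinction the review comment is pointing at.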

abhi-airspace-intelligence · 2025-10-01T15:46:03Z

> > This is ready for review! Note that I did cherry-pick #361 on top of this since I was running non-bigquery tests open, but happy to revert if desired.
>
> Thanks for the PR, once I have time, I will go through it. We are currently busy with other work so it might take some time.

No worries! This library has been treating me well, appreciate all the hard work.

iambriccardo · 2025-10-02T16:06:39Z

If you find any other issues, keep them coming!

iambriccardo · 2025-10-08T22:27:47Z

Quickly coming back to this PR since I am doing some research on partitioned tables, I assume this works only when publish_via_partition_root=true is specified on the publication right?

abhi-airspace-intelligence · 2025-10-13T15:40:05Z

> Quickly coming back to this PR since I am doing some research on partitioned tables, I assume this works only when publish_via_partition_root=true is specified on the publication right?

Yep, that's right. I originally tried doing it the other way, by replicating each partition as its own table, but it was very hacky. Postgres itself seems to prefer this setting and encourages it in its docs; it just isn't the default, for backwards compatibility.

Signed-off-by: Abhi Agarwal <[email protected]>

iambriccardo · 2025-10-21T10:24:11Z

Hi @abhi-airspace-intelligence, I am now focusing on this task specifically and I am taking over into a PR inspired by this. Will mention your work within that PR, thanks!

abhi-airspace-intelligence · 2025-10-22T15:47:17Z

Thanks! I hope your solution is better than mine, as I am very much not a Postgres expert!

iambriccardo · 2025-10-22T15:49:31Z

Thanks to you!

We are currently exploring and clarifying the semantics for partitioning. Just a question for you: what would you expect to happen to the replication stream if you detach a partition from a partitioned table?

- The data of the partition is deleted downstream.
- The data is kept downstream.
- The data is kept downstream and the detached table is added as a standalone table (only if FOR ALL TABLES or FOR TABLES in a specific schema is enabled on the publication).

abhi-airspace-intelligence · 2025-10-22T19:40:52Z

We have our production databases set up such that we periodically detach partitions while backing up data via debezium to BigQuery in environments where we have access to BigQuery. In this PR, when a partition is detached, I've noticed that it creates no events whatsoever, it's basically as if the data went "poof!". I heavily rely on this, but it's an open question as to whether it's the correct behavior.

I think of partitioning as a way to prevent unbounded data growth, essentially. Since our company's data is heavily event-stream driven (and mostly append-only), this works well for us. Perhaps it's not correct for other usecases.

iambriccardo · 2025-10-24T07:25:20Z

> We have our production databases set up such that we periodically detach partitions while backing up data via debezium to BigQuery in environments where we have access to BigQuery. In this PR, when a partition is detached, I've noticed that it creates no events whatsoever, it's basically as if the data went "poof!". I heavily rely on this, but it's an open question as to whether it's the correct behavior.
>
> I think of partitioning as a way to prevent unbounded data growth, essentially. Since our company's data is heavily event-stream driven (and mostly append-only), this works well for us. Perhaps it's not correct for other usecases.

That's good to know!

This seems to be the desired behavior according to several people I talked with, and semantically it makes sense. Detaching a partition doesn't really mean deleting the data, so it's fair that Postgres behaves the way it does.

I will close this PR in favor of #410, which takes over.

abhi-airspace-intelligence requested a review from a team as a code owner August 28, 2025 20:25

abhi-airspace-intelligence changed the title ~~Add partitioned table support~~ feat(postgres): add partitioned table support Aug 28, 2025

abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch 4 times, most recently from e65c749 to 230f109 Compare August 29, 2025 18:20

iambriccardo reviewed Sep 1, 2025

abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch 6 times, most recently from 4670481 to c983ec9 Compare September 2, 2025 14:21

abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch 4 times, most recently from 96cc8de to 6522d79 Compare September 10, 2025 20:23

abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch 5 times, most recently from 67cc277 to 4893b04 Compare September 17, 2025 15:36

abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch from 4893b04 to 6d5132b Compare September 22, 2025 15:04

abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch 2 times, most recently from 9b70c19 to 82436de Compare September 29, 2025 20:41

chatgpt-codex-connector bot reviewed Sep 30, 2025

abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch from c41ed85 to 632d380 Compare October 13, 2025 15:38

abhi-airspace-intelligence requested a review from a team as a code owner October 13, 2025 15:38

abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch from f5665e1 to d6d5de4 Compare October 20, 2025 01:43

abhi-airspace-intelligence added 5 commits October 19, 2025 21:49

Add support for partitioned tables

5198588

Signed-off-by: Abhi Agarwal <[email protected]>

Use cargo nextest to speed up runs

675a12f

Signed-off-by: Abhi Agarwal <[email protected]>

add retries to test for nondeterministic tests

6225918

Signed-off-by: Abhi Agarwal <[email protected]>

Add slow timeout

3205d56

Signed-off-by: Abhi Agarwal <[email protected]>

Export "sign" as well

55d2ae5

Signed-off-by: Abhi Agarwal <[email protected]>

abhi-airspace-intelligence force-pushed the abhi/add-partitioned-table-support branch from d6d5de4 to 55d2ae5 Compare October 20, 2025 01:50

iambriccardo closed this Oct 24, 2025
