Re: Orphaned records in pg_replication_origin_status after subscription drop
Michael Paquier <michael@paquier.xyz>
From: Michael Paquier <michael@paquier.xyz>
To: Masahiko Sawada <sawada.mshk@gmail.com>
Cc: "Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>, Amit Kapila <amit.kapila16@gmail.com>, Postgres hackers <pgsql-hackers@lists.postgresql.org>
Date: 2025-12-23T07:20:45Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Update comments to reflect changes in 8e0d32a4a1.
- 98e8fe57c24f 19 (unreleased) landed
- 214c17bd623e 18.2 landed
- a1cdb8120110 17.8 landed
-
Fix orphaned origin in shared memory after DROP SUBSCRIPTION
- 25a04aa52b52 14.21 landed
- 90d1beef62de 15.16 landed
- e22e9ab0cd45 16.12 landed
- e063ccc722e4 17.8 landed
- b07c326192d0 18.2 landed
- 8e0d32a4a1bf 19 (unreleased) landed
On Tue, Dec 23, 2025 at 08:21:39AM +0900, Michael Paquier wrote: > Creating the origin at the end of the same transaction that sets the > state to SUBREL_STATE_DATASYNC seems sensible here. I'll spend a > couple of extra hours playing with all that across all the branches, > see if I can wrap it. This includes some more error injection to > cross-check the state of all these transactions with the states in > the catalogs while we drop the subscription. While looking at the state of the code across the six branches where we need to fix this, there were two points that have been slightly sticky on my mind: 1) check_old_cluster_subscription_state() in pg_upgrade's check.c, where we have a set of comments dealing with the reasons why only the initial and ready states are allowed for the transfers of the relation data. The patch only makes the origin visible in the catalogs for one extra state, DATASYNC now, meaning that we have nothing to care about. I was wondering about the comment of DATASYNC being slightly incorrect now because it only mentions a replication slot. Do you think that we should adjust that as well to mention the case of origins, knowing that their names are also based on subscription OIDs whose value change across upgrades? That would not apply for relation IDs as these are fixed, but this feels a bit inexact now for the branches where this code applies. 2) 88f488319bac and f6c5edb8abca, that have slightly changed the replication origin logic in v16~. Still, after looking at the code for a couple of hours, as well as testing the DROP SUBSCRIPTION stability while enforcing ERRORs in the tablesync worker to play with the shmem state vs the catalog state, I could not find a hole in the v15 and v14 code caused by the fact that we have the catalog state for the origin available one state earlier. At the end I think that we are OK here in light of these two commits. The comment could be improved, but I don't see that as a reason good enough to fix the issue first, so I have applied that down to v14 with the couple of conflicts handled on the way. -- Michael