Re: Improve pg_sync_replication_slots() to wait for primary to advance
shveta malik <shveta.malik@gmail.com>
From: shveta malik <shveta.malik@gmail.com>
To: Ajin Cherian <itsajin@gmail.com>
Cc: Japin Li <japinli@hotmail.com>,
Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>, Ashutosh Sharma <ashu.coek88@gmail.com>,
Amit Kapila <amit.kapila16@gmail.com>, PostgreSQL mailing lists <pgsql-hackers@postgresql.org>,
shveta malik <shveta.malik@gmail.com>
Date: 2025-11-04T06:23:34Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Enhance slot synchronization API to respect promotion signal.
- 4bed04d39566 17.10 landed
- 94efd308bcec 18.4 landed
- 1362bc33e025 19 (unreleased) landed
-
Fix inconsistent elevel in pg_sync_replication_slots() retry logic.
- f1ddaa15357f 19 (unreleased) landed
-
Refactor slot synchronization logic in slotsync.c.
- 788ec96d591d 19 (unreleased) landed
-
Fix intermittent BF failure in 040_standby_failover_slots_sync.
- b47c50e5667b 19 (unreleased) landed
-
Add retry logic to pg_sync_replication_slots().
- 0d2d4a0ec3ec 19 (unreleased) landed
-
Fix LOCK_TIMEOUT handling in slotsync worker.
- 04396eacd3fa 19 (unreleased) cited
-
Add slotsync skip statistics.
- 76b78721ca49 19 (unreleased) cited
>
> I have addressed the above comments in patch v21.
>
Thank You. Please find a few comments:
1)
+ fparams.slot_names = slot_names = NIL;
I think it is not needed to set slot_names to NIL.
2)
- WAIT_EVENT_REPLICATION_SLOTSYNC_MAIN);
+ WAIT_EVENT_REPLICATION_SLOTSYNC_PRIMARY_CATCHUP);
The new name does not seem appropriate. For the slotsync-worker case,
even when the primary is not behind, the worker still waits but it is
not waiting for primary to catch-up. I could not find a better name
except the original one 'WAIT_EVENT_REPLICATION_SLOTSYNC_MAIN'. We can
change the explanation to :
"Waiting in main loop of slot sync worker and slot sync API."
Or
"Waiting in main loop of slot synchronization."
If anyone has any better name suggestions, we can consider changing.
3)
+# Attempt to synchronize slots using API. The API will continue retrying
+# synchronization until the remote slot catches up with the locally reserved
+# position. The API will not return until this happens, to be able to make
+# further calls, call the API in a background process.
Shall we remove 'with the locally reserved position', it’s already
explained in the test header and the comment is good enough even
without it.
4)
+# Confirm log that the slot has been synced after becoming sync-ready.
Shall we just say:
Confirm from the log that the slot is sync-ready now.
5)
# Synchronize the primary server slots to the standby.
$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
@@ -945,6 +1007,7 @@ $subscriber1->safe_psql('postgres',
is( $standby1->safe_psql(
'postgres',
q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN
('lsub1_slot', 'snap_test_slot') AND synced AND NOT temporary;}
+
),
Redundant change.
thanks
Shveta