Re: Improve pg_sync_replication_slots() to wait for primary to advance
From: shveta malik <shveta.malik@gmail.com>
To: Ajin Cherian <itsajin@gmail.com>
Cc: Japin Li <japinli@hotmail.com>,
Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>, Ashutosh Sharma <ashu.coek88@gmail.com>,
Amit Kapila <amit.kapila16@gmail.com>, PostgreSQL mailing lists <pgsql-hackers@postgresql.org>,
shveta malik <shveta.malik@gmail.com>
Date: 2025-10-31T05:51:16Z
Lists: pgsql-hackers
Commits
- Enhance slot synchronization API to respect promotion signal.
  - 4bed04d39566 17.10 landed
  - 94efd308bcec 18.4 landed
  - 1362bc33e025 19 (unreleased) landed
- Fix inconsistent elevel in pg_sync_replication_slots() retry logic.
  - f1ddaa15357f 19 (unreleased) landed
- Refactor slot synchronization logic in slotsync.c.
  - 788ec96d591d 19 (unreleased) landed
- Fix intermittent BF failure in 040_standby_failover_slots_sync.
  - b47c50e5667b 19 (unreleased) landed
- Add retry logic to pg_sync_replication_slots().
  - 0d2d4a0ec3ec 19 (unreleased) landed
- Fix LOCK_TIMEOUT handling in slotsync worker.
  - 04396eacd3fa 19 (unreleased) cited
- Add slotsync skip statistics.
  - 76b78721ca49 19 (unreleased) cited

On Fri, Oct 31, 2025 at 11:04 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Thu, Oct 30, 2025 at 3:48 PM Ajin Cherian <itsajin@gmail.com> wrote:
> >
> >
> > Thanks for your review, Japin. Here's patch v20 addressing the comments.
> >
>
> Thank you for the patch. Please find a few comments on the test:
>
>
> 1)
> +# until the slot becomes sync-ready (when the standby catches up to the
> +# slot's restart_lsn).
>
> I think it should be 'when the primary server catches up' or 'when the
> remote slot catches up with the locally reserved position.'
>
> 2)
> +# Attempt to synchronize slots using API. This will initially fail because
> +# the slot is not yet sync-ready (standby hasn't caught up to slot's restart_lsn),
> +# but the API will wait and retry. Call the API in a background process.
>
> a)
> 'This will initially fail' reads as if the API will raise an error,
> which is not the case.
>
> b) 'standby hasn't caught up to slot's restart_lsn' is not correct.
>
> We can rephrase to:
> # Attempt to synchronize slots using the API. The API will continue
> # retrying synchronization until the remote slot catches up with the
> # locally reserved position.
>
> 3)
> +# Enable the Subscription, so that the slot catches up
>
> slot --> remote slot
>
> 4)
> +# Create xl_running_xacts records on the primary for which the standby is waiting
>
> Shall we rephrase it as below, or to something better if you have a suggestion?
> Create xl_running_xacts on the primary to speed up restart_lsn advancement.
>
> 5)
> +# Confirm that the logical failover slot is created on the standby and is
> +# flagged as 'synced'
>
> Suggestion:
> Verify that the logical failover slot is created on the standby,
> marked as 'synced', and persisted.
>
> (It is important to mention 'persisted' because even a temporary slot is
> marked as synced.)
>
Shall we remove this change, as it does not directly belong to the current
patch? I think it was suggested earlier, but we should remove it.
6)
-# Confirm the synced slot 'lsub1_slot' is retained on the new primary
+# Confirm that the synced slots 'lsub1_slot' and 'snap_test_slot' are retained on the new primary
is( $standby1->safe_psql(
'postgres',
q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN
('lsub1_slot', 'snap_test_slot') AND synced AND NOT temporary;}
+
thanks
Shveta