Re: Improve pg_sync_replication_slots() to wait for primary to advance
Ajin Cherian <itsajin@gmail.com>
From: Ajin Cherian <itsajin@gmail.com>
To: shveta malik <shveta.malik@gmail.com>
Cc: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>,
Amit Kapila <amit.kapila16@gmail.com>, PostgreSQL mailing lists <pgsql-hackers@postgresql.org>
Date: 2025-08-26T04:28:30Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Enhance slot synchronization API to respect promotion signal.
- 4bed04d39566 17.10 landed
- 94efd308bcec 18.4 landed
- 1362bc33e025 19 (unreleased) landed
-
Fix inconsistent elevel in pg_sync_replication_slots() retry logic.
- f1ddaa15357f 19 (unreleased) landed
-
Refactor slot synchronization logic in slotsync.c.
- 788ec96d591d 19 (unreleased) landed
-
Fix intermittent BF failure in 040_standby_failover_slots_sync.
- b47c50e5667b 19 (unreleased) landed
-
Add retry logic to pg_sync_replication_slots().
- 0d2d4a0ec3ec 19 (unreleased) landed
-
Fix LOCK_TIMEOUT handling in slotsync worker.
- 04396eacd3fa 19 (unreleased) cited
-
Add slotsync skip statistics.
- 76b78721ca49 19 (unreleased) cited
Attachments
- v9-0001-Improve-initial-slot-synchronization-in-pg_sync_r.patch (application/octet-stream) patch v9-0001
On Fri, Aug 22, 2025 at 3:44 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, Aug 20, 2025 at 10:53 AM Ajin Cherian <itsajin@gmail.com> wrote:
> >
> >
> > I've removed them.
> > Attaching patch v8 addressing the above comments.
> >
>
> Thanks for the patch. Please find a few comments:
>
> 1)
> When the API is in progress, and meanwhile in another session we turn
> off hot_standby_feedback, the API session terminates abnormally.
>
> postgres=# SELECT pg_sync_replication_slots();
> server closed the connection unexpectedly
>
> It seems slotsync_reread_config() is not adjusted for API. It does
> proc_exit assuming it is a worker process.
>
I've removed the API calling ProcessSlotSyncInterrupts() as I don't
think the API needs to specifically handle shutdown requests, it works
without calling this. And the other config checks, I don't think the
API needs to handle, I think we should leave it to the user.
> 2)
> slotsync_worker_onexit():
>
> SlotSyncCtx->slot_persistence_pending = false;
>
> /*
> * If syncing_slots is true, it indicates that the process errored out
> * without resetting the flag. So, we need to clean up shared memory and
> * reset the flag here.
> */
> if (syncing_slots)
> {
> SlotSyncCtx->syncing = false;
> syncing_slots = false;
> }
>
> Shall we reset slot_persistence_pending inside 'if (syncing_slots)'?
> slot_persistence_pending can not be true without syncing_slots being
> true.
>
> 3)
> reset_syncing_flag():
>
> SpinLockAcquire(&SlotSyncCtx->mutex);
> SlotSyncCtx->syncing = false;
> + SlotSyncCtx->slot_persistence_pending = false;
> SpinLockRelease(&SlotSyncCtx->mutex);
>
> Here we are changing slot_persistence_pending under mutex, while at
> other places, it is not protected by mutex. Is it intentional here?
>
> 4)
> On rethinking, we maintain anything in shared memory if it has to be
> shared between a few processes. 'slot_persistence_pending' OTOH is
> required to be set and accessed by only one process at a time. Shall
> we move it out of SlotSyncCtxStruct and keep it static similar to
> 'syncing_slots'? Rest of the setting, resetting flow remains the same.
> What do you think?
>
Yes, I agree. I have modified it accordingly.
>
> 5)
> /* Build the query based on whether we're fetching all or refreshing
> specific slots */
>
> Perhaps we can shift this comment to where we actually append
> target_slot_list. Better to have it something like:
> 'If target_slot_list is provided, construct the query only to fetch given slots'
>
Changed.
Attaching patch v9 addressing the above comments.
regards,
Ajin Cherian
Fujitsu Australia