Re: Improve pg_sync_replication_slots() to wait for primary to advance

From: Ajin Cherian <itsajin@gmail.com>
To: shveta malik <shveta.malik@gmail.com>
Cc: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>, Amit Kapila <amit.kapila16@gmail.com>, PostgreSQL mailing lists <pgsql-hackers@postgresql.org>
Date: 2025-08-14T01:58:26Z
Lists: pgsql-hackers

Commits

  1. Enhance slot synchronization API to respect promotion signal.

  2. Fix inconsistent elevel in pg_sync_replication_slots() retry logic.

  3. Refactor slot synchronization logic in slotsync.c.

  4. Fix intermittent BF failure in 040_standby_failover_slots_sync.

  5. Add retry logic to pg_sync_replication_slots().

  6. Fix LOCK_TIMEOUT handling in slotsync worker.

  7. Add slotsync skip statistics.

On Wed, Aug 13, 2025 at 2:47 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Mon, Aug 11, 2025 at 1:37 PM Ajin Cherian <itsajin@gmail.com> wrote:
> >
> > On Fri, Aug 8, 2025 at 11:22 PM Ashutosh Bapat
> > <ashutosh.bapat.oss@gmail.com> wrote:
> > >
> > >
> > > There's also a minor merge conflict because func.sgml is not split
> > > into multiple files.
> > >
> >
> > Yes, I fixed this.
> >
>
> Thanks for the patch. Please find a few comments:
>
> 1)
> We can merge refresh_remote_slots and fetch_remote_slots by passing an
> argument of remote_list. If no remote_list passed, fetch all failover
> slots, else extend the query and fetch only the listed ones.
>

Done.
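
To illustrate comment 1, here is a rough sketch of how a single fetch routine could serve both cases. This is not the code in the patch: the helper name `build_remote_slots_query`, its parameters, and the reduced column list are hypothetical, and slot names are assumed to need no quoting (real code would have to escape them).

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): build the query used to fetch
 * remote slots. With nslots == 0 every failover slot is requested;
 * otherwise the query is extended to fetch only the listed slots.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int
build_remote_slots_query(char *buf, size_t buflen,
                         const char *const *slot_names, int nslots)
{
    int     n;
    size_t  used;

    n = snprintf(buf, buflen,
                 "SELECT slot_name FROM pg_catalog.pg_replication_slots"
                 " WHERE failover AND NOT temporary");
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used = (size_t) n;

    /* Append one quoted name per listed slot; assumes names need no escaping. */
    for (int i = 0; i < nslots; i++)
    {
        n = snprintf(buf + used, buflen - used, "%s'%s'",
                     i == 0 ? " AND slot_name IN (" : ", ", slot_names[i]);
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }

    /* Close the IN list only if one was opened. */
    if (nslots > 0)
    {
        n = snprintf(buf + used, buflen - used, ")");
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
    }

    return 0;
}
```

Under these assumptions, a first cycle would call the helper with an empty list, and later cycles would pass the names of the slots they still care about.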

> 2)
> We can get rid of 'sync_iterations' and the logic within, as I think
> there is no need to distinguish between slotsync and API in terms of
> logs.
>

Done.

> 3)
> sync_start_pending is not needed to be passed to
> update_and_persist_local_synced_slot(), as the output of this function
> is good enough to tell whether slot is persisted or not.
>
> 4)
> Also how about having sync-pending in SlotSyncCtxStruct. It can be set
> unconditionally by both slotsync and API, but will be used by API. I
> think it can simplify the code.
>

Done.
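
For comments 3 and 4, the idea can be sketched roughly as below. This is only an illustration, not the patch: `SketchSyncCtx`, `sketch_update_and_persist`, and `sketch_sync_cycle` are hypothetical names, LSNs are reduced to plain integers, and the persistence condition (the remote position has reached the local one) is a simplifying assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the shared context; only the flag matters here. */
typedef struct SketchSyncCtx
{
    bool    sync_pending;   /* true if some slot could not be persisted yet */
} SketchSyncCtx;

/*
 * Hypothetical per-slot step. Its return value alone says whether the slot
 * was persisted, so no extra output argument is needed.
 */
static bool
sketch_update_and_persist(long remote_lsn, long local_lsn)
{
    return remote_lsn >= local_lsn;
}

/*
 * One sync cycle. The flag is set unconditionally by every caller; only a
 * caller that wants to retry (the API, in this sketch) needs to read it.
 */
static void
sketch_sync_cycle(SketchSyncCtx *ctx, const long *remote_lsns,
                  int nslots, long local_lsn)
{
    ctx->sync_pending = false;

    for (int i = 0; i < nslots; i++)
    {
        if (!sketch_update_and_persist(remote_lsns[i], local_lsn))
            ctx->sync_pending = true;
    }
}
```

With this shape, the API path would repeat the cycle while `sync_pending` remains true, and the worker would simply ignore the flag.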

> 5)
> We can get rid of 'pending_sync_start_slots', as it is not being used anywhere.
>

Fixed.

> 6)
> Also we can mention in comments as to why we are using the old
> remote_slots list in refresh_remote_slots() during subsequent cycles
> of API rather than using only the pending-slot list.

Done.

Patch v6 attached.

regards,
Ajin Cherian
Fujitsu Australia