Re: Improve pg_sync_replication_slots() to wait for primary to advance

From: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
To: shveta malik <shveta.malik@gmail.com>
Cc: Ajin Cherian <itsajin@gmail.com>, Amit Kapila <amit.kapila16@gmail.com>, PostgreSQL mailing lists <pgsql-hackers@postgresql.org>
Date: 2025-08-08T13:21:59Z
Lists: pgsql-hackers

Commits

  1. Enhance slot synchronization API to respect promotion signal.
  2. Fix inconsistent elevel in pg_sync_replication_slots() retry logic.
  3. Refactor slot synchronization logic in slotsync.c.
  4. Fix intermittent BF failure in 040_standby_failover_slots_sync.
  5. Add retry logic to pg_sync_replication_slots().
  6. Fix LOCK_TIMEOUT handling in slotsync worker.
  7. Add slotsync skip statistics.

On Wed, Aug 6, 2025 at 8:48 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, Aug 6, 2025 at 7:35 AM Ajin Cherian <itsajin@gmail.com> wrote:
> >
> > On Tue, Aug 5, 2025 at 4:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Aug 5, 2025 at 9:28 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > On Mon, Aug 4, 2025 at 3:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Mon, Aug 4, 2025 at 12:19 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > If we want to avoid continuously syncing newly added slots in later
> > > > cycles and instead focus only on the ones that failed to sync during
> > > > the first attempt, one approach is to maintain a list of failed slots
> > > > from the initial cycle and only retry those in subsequent attempts.
> > > > But this will add complexity to the implementation.
> > > >
> > >
> > > There will be some additional code for this but overall it improves
> > > the code in the lower level functions. We may want to use the existing
> > > remote_slot list for this purpose.
> > >
> > > The current proposed change in low-level functions appears to be
> > > difficult to maintain, especially the change proposed in
> > > update_and_persist_local_synced_slot(). If we can find a better way to
> > > achieve the same then we can consider the current approach as well.
> > >
> >
> > Next patch, I'll work on addressing this comment. I'll need to
> > restructure the code to make this happen.
> >
>
> Okay, thanks Ajin. I will resume review after this comment is
> addressed as I am assuming that the new logic will get rid of most of
> the current wait logic and thus it makes sense to review it after it
> is addressed.

There's also a minor merge conflict because func.sgml is now split
into multiple files.
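
For reference, the "retry only the slots that failed in the first
cycle" idea quoted above could be sketched roughly as below. This is a
standalone illustration and not code from the patch: DemoSlot,
demo_try_sync() and demo_sync_with_retry() are hypothetical names, and
the sync attempt is simulated with a counter rather than talking to a
primary. The array stands in for the remote slot list captured once in
the first cycle, so slots created later are never picked up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the remote slot list. */
typedef struct DemoSlot
{
	const char *name;
	int			ready_after;	/* simulated: attempts needed to succeed */
	int			attempts;		/* attempts made so far */
	bool		synced;			/* true once a sync attempt succeeded */
} DemoSlot;

/* Simulated sync attempt; succeeds once enough attempts were made. */
static bool
demo_try_sync(DemoSlot *slot)
{
	slot->attempts++;
	slot->synced = (slot->attempts >= slot->ready_after);
	return slot->synced;
}

/*
 * Try every slot in the fixed list once, then retry only those that
 * failed, for at most max_cycles cycles in total.  Returns the number
 * of slots that are still not synced.
 */
static size_t
demo_sync_with_retry(DemoSlot *slots, size_t nslots, int max_cycles)
{
	size_t		pending = nslots;

	assert(slots != NULL || nslots == 0);

	for (int cycle = 0; cycle < max_cycles && pending > 0; cycle++)
	{
		pending = 0;
		for (size_t i = 0; i < nslots; i++)
		{
			if (slots[i].synced)
				continue;		/* done in an earlier cycle, skip */
			if (!demo_try_sync(&slots[i]))
				pending++;		/* still failing, retry next cycle */
		}
	}
	return pending;
}
```

A caller would build the list once, call demo_sync_with_retry() with a
retry limit, and treat a non-zero return as "some slots could not be
synced yet". A real implementation would presumably wait between cycles
and re-fetch the remote positions for the pending slots only.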

-- 
Best Wishes,
Ashutosh Bapat