Re: Improve pg_sync_replication_slots() to wait for primary to advance

From: shveta malik <shveta.malik@gmail.com>
To: Amit Kapila <amit.kapila16@gmail.com>
Cc: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>, Ajin Cherian <itsajin@gmail.com>, Ashutosh Sharma <ashu.coek88@gmail.com>, PostgreSQL mailing lists <pgsql-hackers@postgresql.org>, shveta malik <shveta.malik@gmail.com>
Date: 2025-10-09T08:59:58Z
Lists: pgsql-hackers

Commits

  1. Enhance slot synchronization API to respect promotion signal.

  2. Fix inconsistent elevel in pg_sync_replication_slots() retry logic.

  3. Refactor slot synchronization logic in slotsync.c.

  4. Fix intermittent BF failure in 040_standby_failover_slots_sync.

  5. Add retry logic to pg_sync_replication_slots().

  6. Fix LOCK_TIMEOUT handling in slotsync worker.

  7. Add slotsync skip statistics.

On Thu, Oct 9, 2025 at 2:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 7, 2025 at 5:13 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, Oct 7, 2025 at 4:49 PM Ashutosh Bapat
> > <ashutosh.bapat.oss@gmail.com> wrote:
> > >
> > > Shorter nap times mean higher possibility of wasted CPU cycles - that
> > > should be avoided. Doing that for a test's sake seems wrong. Is there
> > > a way that the naptime can be controlled by external factors such as
> > > likelihood of an advanced slot (just firing bullets in the dark) or is
> > > the naptime controllable by user interface like GUC? The test can use
> > > those interfaces.
> > >
> >
> > Yes, we can control the naptime based on whether any slots are
> > being advanced on the primary. This is what the slotsync worker does.
> > It keeps doubling the naptime while there is no activity on the
> > primary, starting from 200ms up to a max of 30 sec. As soon as
> > activity happens, the naptime is reset to 200ms.
> >
>
> Is there a reason why we don't want to use the same naptime strategy
> for API and worker?
>

There was a suggestion at [1] to use a shorter naptime in the case of the API.

[1]: https://www.postgresql.org/message-id/CAExHW5sQLJGhEA%2B9ZFVwZUpqfFFP5KPn9w64t3uiHSuiEH-9mQ%40mail.gmail.com
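For reference, the worker-style backoff described in the quoted text above
(start at 200ms, double while the primary is idle, cap at 30 sec, reset on
activity) can be sketched roughly as below. This is only an illustration and
not the code in slotsync.c; the function name, the macro names, and the
slot_advanced flag are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Bounds taken from the description above; the macro names are hypothetical. */
#define MIN_NAPTIME_MS 200L
#define MAX_NAPTIME_MS 30000L

/*
 * Return the naptime to use for the next sync cycle.
 *
 * cur_ms is the naptime used for the cycle that just finished, and
 * slot_advanced says whether any slot was seen advancing on the primary
 * during that cycle.
 */
static long
next_naptime_ms(long cur_ms, bool slot_advanced)
{
    assert(cur_ms >= MIN_NAPTIME_MS && cur_ms <= MAX_NAPTIME_MS);

    /* Activity on the primary: go back to the shortest nap. */
    if (slot_advanced)
        return MIN_NAPTIME_MS;

    /* No activity: double the nap, but never exceed the cap. */
    if (cur_ms >= MAX_NAPTIME_MS / 2)
        return MAX_NAPTIME_MS;

    return cur_ms * 2;
}
```

With these bounds, an idle primary gives naps of 200, 400, 800, ... 25600 and
then 30000 ms, and a single advanced slot brings the nap back to 200 ms.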

thanks
Shveta