Re: Improve pg_sync_replication_slots() to wait for primary to advance

shveta malik <shveta.malik@gmail.com>

From: shveta malik <shveta.malik@gmail.com>
To: Ajin Cherian <itsajin@gmail.com>
Cc: Ashutosh Sharma <ashu.coek88@gmail.com>, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>, Amit Kapila <amit.kapila16@gmail.com>, PostgreSQL mailing lists <pgsql-hackers@postgresql.org>, shveta malik <shveta.malik@gmail.com>
Date: 2025-10-07T10:16:52Z
Lists: pgsql-hackers

Commits

  1. Enhance slot synchronization API to respect promotion signal.

  2. Fix inconsistent elevel in pg_sync_replication_slots() retry logic.

  3. Refactor slot synchronization logic in slotsync.c.

  4. Fix intermittent BF failure in 040_standby_failover_slots_sync.

  5. Add retry logic to pg_sync_replication_slots().

  6. Fix LOCK_TIMEOUT handling in slotsync worker.

  7. Add slotsync skip statistics.

On Tue, Oct 7, 2025 at 3:24 PM Ajin Cherian <itsajin@gmail.com> wrote:
>
> Hello Hackers,
>
> In an offline discussion, I was considering adding a TAP test for this
> patch. However, testing the pg_sync_replication_slots() API’s wait
> logic requires a delay of at least 2 seconds, since that’s the
> interval the API sleeps before retrying. I’m not sure it’s acceptable
> to add a TAP test that increases runtime by 2 seconds.
> I’m also wondering if 2 seconds is too long for the API to wait?
> Should we reduce it to something like 200 ms instead? I’d appreciate
> your feedback.
>

I feel a shorter nap would be better, since this is an API and should
finish fast. But too short a nap may result in too many pings to the
primary, especially when the primary slots are not advancing. That case
should be rare, though. Shall we have a nap of, say, 500 ms? It is
neither too short nor too long. Thoughts?
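
To put rough numbers on the trade-off, here is a back-of-the-envelope
sketch. It is not code from the patch: polls_during_wait() is a
hypothetical helper, and it assumes one initial attempt followed by one
retry after each completed nap while the primary slots stay un-advanced.

```c
/*
 * Hypothetical helper, not part of PostgreSQL or the patch.
 *
 * Returns how many times the primary would be contacted if the slots
 * stay un-advanced for wait_ms milliseconds and the caller naps nap_ms
 * milliseconds between attempts: one initial attempt plus one retry per
 * completed nap.  Returns -1 for invalid input.
 */
static long
polls_during_wait(long wait_ms, long nap_ms)
{
    if (nap_ms <= 0 || wait_ms < 0)
        return -1;

    return 1 + wait_ms / nap_ms;
}
```

Under those assumptions, a 10 second stall costs 51 pings with a 200 ms
nap, 21 with 500 ms, and 6 with 2 s, while the extra latency after the
primary does advance is bounded by one nap interval in each case.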

thanks
Shveta