Re: Improve pg_sync_replication_slots() to wait for primary to advance

Ajin Cherian <itsajin@gmail.com>

From: Ajin Cherian <itsajin@gmail.com>
To: shveta malik <shveta.malik@gmail.com>
Cc: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>, Amit Kapila <amit.kapila16@gmail.com>, PostgreSQL mailing lists <pgsql-hackers@postgresql.org>
Date: 2025-08-29T06:12:39Z
Lists: pgsql-hackers

Commits

  1. Enhance slot synchronization API to respect promotion signal.

  2. Fix inconsistent elevel in pg_sync_replication_slots() retry logic.

  3. Refactor slot synchronization logic in slotsync.c.

  4. Fix intermittent BF failure in 040_standby_failover_slots_sync.

  5. Add retry logic to pg_sync_replication_slots().

  6. Fix LOCK_TIMEOUT handling in slotsync worker.

  7. Add slotsync skip statistics.

On Fri, Aug 29, 2025 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Aug 26, 2025 at 9:58 AM Ajin Cherian <itsajin@gmail.com> wrote:
> 4)
> In the worker, before each call to synchronize_slots(), we are
> starting a new transaction. It aligns with the previous implementation
> where StartTransaction was inside synchronize_slots(). But in API, we
> are doing StartTransaction once outside of the loop instead of doing
> before each synchronize_slots(), is it intentional? It may keep the
> transaction open for a long duration for the case where slots are not
> getting persisted soon.
>

I’ll address your other comments separately, but I wanted to respond
to this one first. I did try the approach you suggested, but the
problem is that the remote_slots list is used across loop iterations:
each iteration builds its new list from the remote_slots list of the
previous one. If we end the transaction at the end of each iteration,
the list is freed with it and is no longer available for the next
pass, which is why we can’t end the transaction inside the loop.
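
To illustrate the lifetime problem, here is a toy model in plain C. It
is not PostgreSQL code and not taken from the patch: Arena, SlotList,
build_list() and run_sync_loop() are hypothetical names invented for
this sketch. The arena stands in for transaction-lifetime memory, and
arena_reset() stands in for ending the transaction. The loop counts how
many iterations still find the previous list usable.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for transaction-lifetime memory (hypothetical). */
typedef struct Arena
{
    int      storage[256];
    size_t   used;
    unsigned generation;    /* bumped each time the arena is reset */
} Arena;

static int *arena_alloc_ints(Arena *a, size_t n)
{
    assert(a->used + n <= sizeof(a->storage) / sizeof(a->storage[0]));
    int *p = a->storage + a->used;
    a->used += n;
    return p;
}

/* Models ending the transaction: everything allocated so far is gone. */
static void arena_reset(Arena *a)
{
    a->used = 0;
    a->generation++;
}

/* A list that remembers which arena generation it was allocated in. */
typedef struct SlotList
{
    const Arena *owner;
    unsigned     generation;
    size_t       n;
    int         *ids;
} SlotList;

static int list_is_valid(const SlotList *l)
{
    return l->owner != NULL && l->generation == l->owner->generation;
}

static SlotList build_list(Arena *a, size_t n_remote)
{
    SlotList l = { a, a->generation, 0, arena_alloc_ints(a, n_remote) };

    for (size_t i = 0; i < n_remote; i++)
        l.ids[l.n++] = (int) i;
    return l;
}

/*
 * Runs the retry loop and returns the number of iterations in which the
 * list from the previous iteration was still usable.
 */
static int run_sync_loop(int iterations, int commit_each_iteration)
{
    Arena    txn = {0};
    SlotList prev = {0};
    int      reused = 0;

    for (int i = 0; i < iterations; i++)
    {
        if (list_is_valid(&prev))
            reused++;               /* previous list can seed this pass */

        prev = build_list(&txn, 3);

        if (commit_each_iteration)
            arena_reset(&txn);      /* prev now points at freed memory */
    }
    return reused;
}
```

In this model, run_sync_loop(4, 0) (one transaction around the whole
loop) finds the previous list usable in 3 of the 4 iterations, while
run_sync_loop(4, 1) (ending the transaction in every iteration) never
does. The single-transaction variant also keeps accumulating arena
memory for as long as the loop runs, which is the toy-model counterpart
of the long-running transaction concern raised above.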

regards,
Ajin Cherian
Fujitsu Australia