Re: Improve pg_sync_replication_slots() to wait for primary to advance

Ajin Cherian <itsajin@gmail.com>

From: Ajin Cherian <itsajin@gmail.com>
To: shveta malik <shveta.malik@gmail.com>
Cc: Amit Kapila <amit.kapila16@gmail.com>, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>, Japin Li <japinli@hotmail.com>, Ashutosh Sharma <ashu.coek88@gmail.com>, PostgreSQL mailing lists <pgsql-hackers@postgresql.org>
Date: 2025-12-10T02:23:10Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Enhance slot synchronization API to respect promotion signal.

  2. Fix inconsistent elevel in pg_sync_replication_slots() retry logic.

  3. Refactor slot synchronization logic in slotsync.c.

  4. Fix intermittent BF failure in 040_standby_failover_slots_sync.

  5. Add retry logic to pg_sync_replication_slots().

  6. Fix LOCK_TIMEOUT handling in slotsync worker.

  7. Add slotsync skip statistics.

Attachments

On Tue, Dec 9, 2025 at 10:45 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Dec 9, 2025 at 4:04 PM Ajin Cherian <itsajin@gmail.com> wrote:
> >
> > Hi all,
> >
> > Since commit 04396ea [1] has been pushed, which included part of the
> > changes this patch set was addressing, I have updated and rebased the
> > patch set to incorporate those changes.
> >
> > The patch set now contains two patches:
> >
> > 0001 – Modify the pg_sync_replication_slots API to also handle
> > promotion signals and stop synchronization, similar to the slot sync
> > worker.
> > 0002 – Improve pg_sync_replication_slots to wait for and persist slots
> > until they are sync-ready.
> >
> > Please review the updated patch set (v31).
> >
>
> Thanks. Please find a few comments on 001:
>
> 1)
> Commit message:
> "That meant backends
> that perform slot synchronization via the pg_sync_replication_slots()
> SQL function were not signalled at all because their PIDs were not
> recorded in the slot-sync context."
>
> We should phrase it in the singular ('backend'), since multiple
> backends cannot run simultaneously because concurrent sync is not
> allowed.
> Using the term 'their PIDs' gives an impression that there could be
> multiple PIDs, which is not the case.
>

Fixed.

> 2)
>   primary_slotname_changed = strcmp(old_primary_slotname, PrimarySlotName) != 0;
> +
>   pfree(old_primary_conninfo);
>
> This change to put blank line is not needed.
>

Removed.

> 3)
>
> + /* Check for sync_replication_slots change */
> + /* Check for parameter changes common to both API and worker */
>
> IMO, these comments are not needed as it is self-explanatory. Even if
> we plan to put these, both should be same, either both mentioning API
> and worker or neither.
>

Removed.

> 4)
> - * receives a stop request from the startup process, or when there is an
> + * because receives a stop request from the startup process, or when
> there is an
>
> I think, this change is done by mistake.
>

Yes, removed.

> 5)
> + * Signal slotsync worker or backend process running
> pg_sync_replication_slots()
> + * if running. The process will stop upon detecting that the stopSignaled
> + * flag is set to true.
>
> Comment looks slightly odd. Shall we say:
>
> Signal the slotsync worker or the backend process running
> pg_sync_replication_slots(), if either one is active. The process...
>

Changed.

Attaching patch v32 addressing the above comments.

regards,
Ajin Cherian
Fujitsu Australia