Thread

  1. Re: Improve pg_sync_replication_slots() to wait for primary to advance

    Japin Li <japinli@hotmail.com> — 2025-11-07T05:06:26Z

    On Thu, 06 Nov 2025 at 18:53, Ajin Cherian <itsajin@gmail.com> wrote:
    > On Tue, Nov 4, 2025 at 5:23 PM shveta malik <shveta.malik@gmail.com> wrote:
    >>
    >> >
    >> > I have addressed the above comments in patch v21.
    >> >
    >>
    >> Thank You. Please find a few comments:
    >>
    >> 1)
    >> + fparams.slot_names = slot_names = NIL;
    >>
    >> I think it is not needed to set slot_names to NIL.
    >>
    >> 2)
    >> -    WAIT_EVENT_REPLICATION_SLOTSYNC_MAIN);
    >> +    WAIT_EVENT_REPLICATION_SLOTSYNC_PRIMARY_CATCHUP);
    >>
    >> The new name does not seem appropriate. For the slotsync-worker case,
    >> even when the primary is not behind, the worker still waits but it is
    >> not waiting for primary to catch-up. I could not find a better name
    >> except the original one 'WAIT_EVENT_REPLICATION_SLOTSYNC_MAIN'. We can
    >> change the explanation to :
    >>
    >> "Waiting in main loop of slot sync worker and slot sync API."
    >> Or
    >> "Waiting in main loop of slot synchronization."
    >>
    >> If anyone has any better name suggestions, we can consider changing.
    >
    > Changed as suggested above.
    >
    >>
    >> 3)
    >>
    >> +# Attempt to synchronize slots using API. The API will continue retrying
    >> +# synchronization until the remote slot catches up with the locally reserved
    >> +# position. The API will not return until this happens, to be able to make
    >> +# further calls, call the API in a background process.
    >>
    >> Shall we remove 'with the locally reserved position', it’s already
    >> explained in the test header and the comment is good enough even
    >> without it.
    >>
    >
    > Changed.
    >
    >> 4)
    >> +# Confirm log that the slot has been synced after becoming sync-ready.
    >>
    >> Shall we just say:
    >> Confirm from the log that the slot is sync-ready now.
    >>
    >> 5)
    >>  # Synchronize the primary server slots to the standby.
    >>  $standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
    >> @@ -945,6 +1007,7 @@ $subscriber1->safe_psql('postgres',
    >>  is( $standby1->safe_psql(
    >>   'postgres',
    >>   q{SELECT count(*) = 2 FROM pg_replication_slots WHERE slot_name IN
    >> ('lsub1_slot', 'snap_test_slot') AND synced AND NOT temporary;}
    >> +
    >>   ),
    >>
    >> Redundant change.
    >
    > Removed.
    >
    > Attaching patch v22 addressing the above comments.
    
    @@ -62,8 +62,8 @@ LOGICAL_APPLY_MAIN	"Waiting in main loop of logical replication apply process."
     LOGICAL_LAUNCHER_MAIN	"Waiting in main loop of logical replication launcher process."
     LOGICAL_PARALLEL_APPLY_MAIN	"Waiting in main loop of logical replication parallel apply process."
     RECOVERY_WAL_STREAM	"Waiting in main loop of startup process for WAL to arrive, during streaming recovery."
    -REPLICATION_SLOTSYNC_MAIN	"Waiting in main loop of slot sync worker."
     REPLICATION_SLOTSYNC_SHUTDOWN	"Waiting for slot sync worker to shut down."
    +REPLICATION_SLOTSYNC_MAIN	"Waiting in main loop of slot synchronization."
     SYSLOGGER_MAIN	"Waiting in main loop of syslogger process."
     WAL_RECEIVER_MAIN	"Waiting in main loop of WAL receiver process."
     WAL_SENDER_MAIN	"Waiting in main loop of WAL sender process."
    
    I've noticed that all events are sorted alphabetically.  I think we should keep
    the order of REPLICATION_SLOTSYNC_MAIN unchanged.
    
    -- 
    Regards,
    Japin Li
    ChengDu WenWu Information Technology Co., Ltd.