Thread
-
Re: Improve pg_sync_replication_slots() to wait for primary to advance
Japin Li <japinli@hotmail.com> — 2025-11-28T06:03:44Z
On Fri, 28 Nov 2025 at 15:46, Ajin Cherian <itsajin@gmail.com> wrote: > On Wed, Nov 26, 2025 at 7:42 PM shveta malik <shveta.malik@gmail.com> > wrote: >> >> A few comments: >> >> 1) >> +/* >> + * Structure holding parameters that need to be freed on error in >> + * pg_sync_replication_slots() >> + */ >> +typedef struct SlotSyncApiFailureParams >> +{ >> + WalReceiverConn *wrconn; >> + List *slot_names; >> +} SlotSyncApiFailureParams; >> + >> >> We can get rid of it now as we do not use it. >> > > Removed. > >> 2) >> ProcessSlotSyncInterrupts(): >> >> + if (!AmLogicalSlotSyncWorkerProcess()) >> + ereport(ERROR, >> + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), >> + errmsg("cannot continue replication slots synchronization" >> + " as standby promotion is triggered")); >> + else >> + { >> >> Can we please reverse the if-else i.e. first worker and then API. >> Negated if-condition can be avoided in this case. >> > > Changed. > >> 3) >> >> slotsync_reread_config(): >> + /* Worker-specific check for sync_replication_slots change */ >> >> Now since we check for both API and worker, this comment is not >> needed. >> > > Removed. > >> 4) >> - ereport(LOG, >> - errmsg("replication slot synchronization worker will restart >> because >> of a parameter change")); >> >> - /* >> - * Reset the last-start time for this worker so that the postmaster >> - * can restart it without waiting for >> SLOTSYNC_RESTART_INTERVAL_SEC. >> - */ >> - SlotSyncCtx->last_start_time = 0; >> + ereport(AmLogicalSlotSyncWorkerProcess() ? LOG : ERROR, >> + errmsg("replication slot synchronization will stop because of a >> parameter change")); >> >> Here, we should retain the same old message for worker i.e. 'worker >> will restart..'. instead of 'synchronization will stop'. I find the >> old message better in this case. >> >> 5) >> slotsync_reread_config() is slightly difficult to follow. >> I think in the case of API, we can display a common error message >> instead of 2 different messages for 'sync_replication_slot change' >> and >> the rest of the parameters. We can mark if any of the parameters >> changed in both 'if' blocks and if the current process has not >> exited, >> then at the end based on 'parameter-changed', we can deal with API >> by >> giving a common message. Something like: >> >> /* >> * If we have reached here with a parameter change, we must be >> running >> * in SQL function, emit error in such a case. >> */ >> if (parameter_changed (new variable)) >> { >> Assert (!AmLogicalSlotSyncWorkerProcess); >> ereport(ERROR, >> errmsg("replication slot synchronization will stop because of a >> parameter change")); >> } >> > > Fixed as above. > > I've addressed the above comments as well as rebased the patch based > on changes in commit 76b7872 in patch v26 > 1. Initialize slot_persistence_pending to false (to avoid uninitialized values, or initialize to true by mistaken) in update_and_persist_local_synced_slot(). This aligns with the handling of found_consistent_snapshot and remote_slot_precedes in update_local_synced_slot(). diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c index 20eada3393..c55ba11f17 100644 --- a/src/backend/replication/logical/slotsync.c +++ b/src/backend/replication/logical/slotsync.c @@ -617,6 +617,9 @@ update_and_persist_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid, bool found_consistent_snapshot = false; bool remote_slot_precedes = false; + if (slot_persistence_pending) + *slot_persistence_pending = false; + /* Slotsync skip stats are handled in function update_local_synced_slot() */ (void) update_local_synced_slot(remote_slot, remote_dbid, &found_consistent_snapshot, 2. This change seems unnecessary。 static void -slotsync_reread_config(void) +slotsync_reread_config() static void -ProcessSlotSyncInterrupts(void) +ProcessSlotSyncInterrupts() 3. Since we are already caching the result of AmLogicalSlotSyncWorkerProcess() in a local worker variable, how about applying this replacement: s/if (AmLogicalSlotSyncWorkerProcess())/if (worker)/g? + bool worker = AmLogicalSlotSyncWorkerProcess(); + bool parameter_changed = false; - Assert(sync_replication_slots); + if (AmLogicalSlotSyncWorkerProcess()) + Assert(sync_replication_slots); -- Regards, Japin Li ChengDu WenWu Information Technology Co., Ltd.