Re: Improve pg_sync_replication_slots() to wait for primary to advance
shveta malik <shveta.malik@gmail.com>
From: shveta malik <shveta.malik@gmail.com>
To: Ajin Cherian <itsajin@gmail.com>
Cc: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>,
Japin Li <japinli@hotmail.com>, Ashutosh Sharma <ashu.coek88@gmail.com>,
Amit Kapila <amit.kapila16@gmail.com>, PostgreSQL mailing lists <pgsql-hackers@postgresql.org>,
shveta malik <shveta.malik@gmail.com>
Date: 2025-11-26T08:42:29Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Enhance slot synchronization API to respect promotion signal.
- 4bed04d39566 17.10 landed
- 94efd308bcec 18.4 landed
- 1362bc33e025 19 (unreleased) landed
-
Fix inconsistent elevel in pg_sync_replication_slots() retry logic.
- f1ddaa15357f 19 (unreleased) landed
-
Refactor slot synchronization logic in slotsync.c.
- 788ec96d591d 19 (unreleased) landed
-
Fix intermittent BF failure in 040_standby_failover_slots_sync.
- b47c50e5667b 19 (unreleased) landed
-
Add retry logic to pg_sync_replication_slots().
- 0d2d4a0ec3ec 19 (unreleased) landed
-
Fix LOCK_TIMEOUT handling in slotsync worker.
- 04396eacd3fa 19 (unreleased) cited
-
Add slotsync skip statistics.
- 76b78721ca49 19 (unreleased) cited
A few comments:
1)
+/*
+ * Structure holding parameters that need to be freed on error in
+ * pg_sync_replication_slots()
+ */
+typedef struct SlotSyncApiFailureParams
+{
+ WalReceiverConn *wrconn;
+ List *slot_names;
+} SlotSyncApiFailureParams;
+
We can get rid of it now as we do not use it.
2)
ProcessSlotSyncInterrupts():
+ if (!AmLogicalSlotSyncWorkerProcess())
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot continue replication slots synchronization"
+ " as standby promotion is triggered"));
+ else
+ {
Can we please reverse the if-else i.e. first worker and then API.
Negated if-condition can be avoided in this case.
3)
slotsync_reread_config():
+ /* Worker-specific check for sync_replication_slots change */
Now since we check for both API and worker, this comment is not needed.
4)
- ereport(LOG,
- errmsg("replication slot synchronization worker will restart because
of a parameter change"));
- /*
- * Reset the last-start time for this worker so that the postmaster
- * can restart it without waiting for SLOTSYNC_RESTART_INTERVAL_SEC.
- */
- SlotSyncCtx->last_start_time = 0;
+ ereport(AmLogicalSlotSyncWorkerProcess() ? LOG : ERROR,
+ errmsg("replication slot synchronization will stop because of a
parameter change"));
Here, we should retain the same old message for worker i.e. 'worker
will restart..'. instead of 'synchronization will stop'. I find the
old message better in this case.
5)
slotsync_reread_config() is slightly difficult to follow.
I think in the case of API, we can display a common error message
instead of 2 different messages for 'sync_replication_slot change' and
the rest of the parameters. We can mark if any of the parameters
changed in both 'if' blocks and if the current process has not exited,
then at the end based on 'parameter-changed', we can deal with API by
giving a common message. Something like:
/*
* If we have reached here with a parameter change, we must be running
* in SQL function, emit error in such a case.
*/
if (parameter_changed (new variable))
{
Assert (!AmLogicalSlotSyncWorkerProcess);
ereport(ERROR,
errmsg("replication slot synchronization will stop because of a
parameter change"));
}
thanks
Shveta