Filename: v4-0001-Allow-old-WAL-recycling-during-REPACK-CONCURRENTL.patch
Type: application/octet-stream
Part: 0
Message: RE: Adding REPACK [concurrently]
From 74881e1bf03da8a4772b3bc5a24542b6dfb51042 Mon Sep 17 00:00:00 2001
From: Zhijie Hou <houzj.fnst@fujitsu.com>
Date: Fri, 10 Apr 2026 16:24:55 +0800
Subject: [PATCH v4 1/2] Allow old WAL recycling during REPACK CONCURRENTLY

During REPACK CONCURRENTLY, logical decoding can keep the replication slot's
restart_lsn pinned behind the oldest running transaction, which is often the
long-lived REPACK transaction itself. As a result, old WAL segments are
retained longer than necessary.

This commit advances the replication slot each time decoding crosses a WAL
segment boundary, so obsolete WAL files can be recycled while REPACK is still
running.

This change does not advance catalog_xmin. REPACK already holds a snapshot that
prevents removal of dead catalog tuples, so catalog_xmin handling can be
addressed independently.

Additionally, this commit changes LogicalConfirmReceivedLocation to recompute
the oldest required restart LSN whenever slot.restart_lsn is updated.
Previously, the function did so only when catalog_xmin was updated. That was
rarely a problem, because catalog_xmin advances in most replication scenarios,
but it does not advance during REPACK.
---
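Note for reviewers: the segment-boundary check in decode_concurrent_changes
relies on XLByteToSeg(), which is plain integer division of the LSN by
wal_segment_size. The standalone sketch below only illustrates that
arithmetic. The names lsn_t, segno_t, lsn_to_segno and
crossed_segment_boundary are hypothetical stand-ins and are not part of the
patch, and the sketch assumes the caller remembers the last segment number it
saw, as repack_current_segment does in the worker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for XLogRecPtr and XLogSegNo (both 64-bit). */
typedef uint64_t lsn_t;
typedef uint64_t segno_t;

/* Same arithmetic as XLByteToSeg(): segment number = LSN / segment size. */
static segno_t
lsn_to_segno(lsn_t lsn, uint64_t wal_segment_size)
{
	assert(wal_segment_size > 0);
	return lsn / wal_segment_size;
}

/*
 * Hypothetical helper: return true, and remember the new segment number,
 * when end_lsn lies in a different segment than *current_segment.
 */
static bool
crossed_segment_boundary(lsn_t end_lsn, uint64_t wal_segment_size,
						 segno_t *current_segment)
{
	segno_t		segno_new = lsn_to_segno(end_lsn, wal_segment_size);

	if (segno_new == *current_segment)
		return false;

	*current_segment = segno_new;
	return true;
}
```

With a 16MB segment size, an end position one byte short of 16MB still maps
to segment 0, while an end position of exactly 16MB maps to segment 1. In the
patch, that transition is the point at which the slot is advanced.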
 src/backend/commands/repack_worker.c      | 20 +++++++++++++++++++-
 src/backend/replication/logical/logical.c |  8 +++++++-
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/src/backend/commands/repack_worker.c b/src/backend/commands/repack_worker.c
index 4f82eb46bec..56974cbc1f5 100644
--- a/src/backend/commands/repack_worker.c
+++ b/src/backend/commands/repack_worker.c
@@ -397,12 +397,30 @@ decode_concurrent_changes(LogicalDecodingContext *ctx,
 
 			/*
 			 * If WAL segment boundary has been crossed, inform the decoding
-			 * system that the catalog_xmin can advance.
+			 * system that the slot can advance.
+			 *
+			 * Once REPACK begins copying data to the new table, the logical
+			 * decoding machinery prevents the slot from advancing beyond the
+			 * oldest running transaction (which is the REPACK transaction
+			 * itself). As a result, restart_lsn and catalog_xmin can no
+			 * longer advance automatically.
+			 *
+			 * To allow old WAL files to be recycled, we manually advance the
+			 * slot each time a WAL segment boundary is crossed. This is safe
+			 * because the REPACK slot is temporary and will be dropped
+			 * automatically if the REPACK command fails. There is no scenario
+			 * where this slot needs to restart decoding from an earlier
+			 * position while still alive.
+			 *
+			 * We do not advance catalog_xmin here because the REPACK
+			 * transaction already holds a snapshot that prevents removal of
+			 * dead catalog tuples.
 			 */
 			end_lsn = ctx->reader->EndRecPtr;
 			XLByteToSeg(end_lsn, segno_new, wal_segment_size);
 			if (segno_new != repack_current_segment)
 			{
+				LogicalIncreaseRestartDecodingForSlot(end_lsn, end_lsn);
 				LogicalConfirmReceivedLocation(end_lsn);
 				elog(DEBUG1, "REPACK: confirmed receive location %X/%X",
 					 (uint32) (end_lsn >> 32), (uint32) end_lsn);
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index b969caae72e..8b8095bd5d8 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -1910,8 +1910,14 @@ LogicalConfirmReceivedLocation(XLogRecPtr lsn)
 			SpinLockRelease(&MyReplicationSlot->mutex);
 
 			ReplicationSlotsComputeRequiredXmin(false);
-			ReplicationSlotsComputeRequiredLSN();
 		}
+
+		/*
+		 * Now that the new restart_lsn is safely on disk, recompute the global
+		 * WAL retention requirement.
+		 */
+		if (updated_restart)
+			ReplicationSlotsComputeRequiredLSN();
 	}
 	else
 	{
-- 
2.43.0