v01-0001-Avoid-using-deleted-context-with-replication-com.patch

application/octet-stream

Filename: v01-0001-Avoid-using-deleted-context-with-replication-com.patch
Type: application/octet-stream
Part: 0
Message: Memory context can be its own parent and child in replication command
From 1f65a8e7690c1941c9b7f85d0476c1b1f7d51f76 Mon Sep 17 00:00:00 2001
From: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Date: Thu, 6 Mar 2025 16:42:52 +0100
Subject: Avoid using deleted context with replication command

When creating a new replication slot, with snapshot export and using
replication command, a new transaction is started by
SnapBuildExportSnapshot. This transaction will save the memory context
at the start of the transaction, which would be the replication command
context.

This memory context is deleted at the end of exec_replication_command.
During the next replication command call, SnapBuildClearExportedSnapshot
will abort the previous transaction, which will set CurrentMemoryContext
to the memory context stored at the transaction creation.

However, this memory context was deleted, and the next memory context
allocation pass this deleted context as a parent. It is possible that
the context pulled from the freelist and the parent to be the same,
leading to a context whose parent is itself. This will trigger an
infinite loop during when attempting to delete this context when trying
to descend to a leaf with no children.

To fix this issue, CurrentMemoryContext is saved then restored after
AbortCurrentTransaction() call in SnapBuildClearExportedSnapshot.
---
 src/backend/replication/logical/snapbuild.c   | 12 +++++++
 src/backend/utils/mmgr/aset.c                 |  2 ++
 src/test/recovery/meson.build                 |  1 +
 .../recovery/t/045_replication_commands.pl    | 32 +++++++++++++++++++
 4 files changed, 47 insertions(+)
 create mode 100644 src/test/recovery/t/045_replication_commands.pl

diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index bd0680dcbe5..332fb99ad06 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -600,6 +600,7 @@ void
 SnapBuildClearExportedSnapshot(void)
 {
 	ResourceOwner tmpResOwner;
+	MemoryContext oldcontext;
 
 	/* nothing exported, that is the usual case */
 	if (!ExportInProgress)
@@ -614,10 +615,21 @@ SnapBuildClearExportedSnapshot(void)
 	 */
 	tmpResOwner = SavedResourceOwnerDuringExport;
 
+	/*
+	 * AbortCurrentTransaction() will also restore CurrentMemoryContext to the
+	 * memory context used at transaction start. This memory context would be
+	 * the Replication command context used during the call to
+	 * CreateReplicationSlot, and was already deleted. To avoid reusing this
+	 * deleted context, save CurrentMemoryContext and restore it after
+	 * AbortCurrentTransaction.
+	 */
+	oldcontext = CurrentMemoryContext;
+
 	/* make sure nothing could have ever happened */
 	AbortCurrentTransaction();
 
 	CurrentResourceOwner = tmpResOwner;
+	MemoryContextSwitchTo(oldcontext);
 }
 
 /*
diff --git a/src/backend/utils/mmgr/aset.c b/src/backend/utils/mmgr/aset.c
index 666ecd8f78d..e69ff760541 100644
--- a/src/backend/utils/mmgr/aset.c
+++ b/src/backend/utils/mmgr/aset.c
@@ -425,6 +425,8 @@ AllocSetContextCreateInternal(MemoryContext parent,
 			((MemoryContext) set)->mem_allocated =
 				KeeperBlock(set)->endptr - ((char *) set);
 
+			/* Check the returned context is not the same as the parent */
+			Assert((MemoryContext) set != parent);
 			return (MemoryContext) set;
 		}
 	}
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 057bcde1434..7e33dd897b2 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -53,6 +53,7 @@ tests += {
       't/042_low_level_backup.pl',
       't/043_no_contrecord_switch.pl',
       't/044_invalidate_inactive_slots.pl',
+      't/045_replication_commands.pl',
     ],
   },
 }
diff --git a/src/test/recovery/t/045_replication_commands.pl b/src/test/recovery/t/045_replication_commands.pl
new file mode 100644
index 00000000000..385f666aa96
--- /dev/null
+++ b/src/test/recovery/t/045_replication_commands.pl
@@ -0,0 +1,32 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test of replication commands
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 'logical');
+$node_primary->start;
+
+my $primary_host = $node_primary->host;
+my $primary_port = $node_primary->port;
+my $connstr_db = "host=$primary_host port=$primary_port replication=database dbname=postgres";
+
+my ($ret, $stdout, $stderr) = $node_primary->psql(
+	'postgres', qq[
+		CREATE_REPLICATION_SLOT "test_slot" LOGICAL "test_decoding" ( SNAPSHOT "export");
+		DROP_REPLICATION_SLOT "test_slot";
+	],
+	on_error_die => 1,
+	extra_params => [ '-d', $connstr_db ]);
+ok($ret == 0, "Create and drop of replication slot");
+
+# Testcase end
+# =============================================================================
+
+done_testing();
+
-- 
2.39.5 (Apple Git-154)