Filename: v1-0001-Avoid-self-deadlock-on-MultiXactOffsetSLRULock-dur.patch
Type: application/octet-stream
Part: 0
Message: Re: BUG #19490: Streaming standby on 16.14 stops applying WAL on MultiXactOffsetSLRU when primary is 16.8
From b33abeede0847edac3603b87a478a832be1784f8 Mon Sep 17 00:00:00 2001
From: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Date: Thu, 21 May 2026 07:39:28 +0000
Subject: [PATCH REL_16_STABLE v1] Avoid self-deadlock on
 MultiXactOffsetSLRULock during WAL replay

Commit 77dff5d937b added a compatibility check in RecordNewMultiXact()
that can call SimpleLruWriteAll(MultiXactOffsetCtl, false) while already
holding MultiXactOffsetSLRULock.  In REL_16, SimpleLruWriteAll()
acquires the same SLRU control lock, and LWLocks are not reentrant, so
the startup process can block on itself during WAL replay, waiting
forever on LWLock:MultiXactOffsetSLRU.

The flush is not needed for the page tested in this fallback path.  If
RecordNewMultiXact() initializes that offsets page, it writes it
synchronously with SimpleLruWritePage() before updating
last_initialized_offsets_page.  Drop the unsafe flush and keep the
existing missing-page initialization logic.

Reported-by: Radim Marek <radim@boringsql.com>
Reported-by: Marko Tiikkaja <marko@joh.to>
Diagnosed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/19490-9c59c6a583513b99@postgresql.org
---
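Note for reviewers: the self-deadlock described in the commit message can
be illustrated with a small standalone model.  This is only a sketch and
not PostgreSQL code.  ToyLock and every toy_* function are hypothetical
names made up for illustration, and toy_acquire() returns false at the
point where a real LWLockAcquire() would block forever, so that the
example terminates.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a non-reentrant lock such as an LWLock. */
typedef struct ToyLock
{
	bool		held;
} ToyLock;

/*
 * Try to take the lock.  Returns false if it is already held, which is
 * the point where a real non-reentrant lock would wait forever when the
 * holder is the caller itself.
 */
bool
toy_acquire(ToyLock *lock)
{
	if (lock->held)
		return false;
	lock->held = true;
	return true;
}

void
toy_release(ToyLock *lock)
{
	lock->held = false;
}

/* Models a flush routine that takes the control lock on its own. */
bool
toy_write_all(ToyLock *lock)
{
	if (!toy_acquire(lock))
		return false;		/* would self-deadlock here */
	/* ... write out dirty buffers ... */
	toy_release(lock);
	return true;
}

/* Pre-patch shape: the flush is called with the lock already held. */
bool
toy_record_with_flush(ToyLock *lock)
{
	bool		ok;

	if (!toy_acquire(lock))
		return false;
	ok = toy_write_all(lock);	/* inner acquire cannot succeed */
	toy_release(lock);
	return ok;
}

/* Post-patch shape: no nested acquisition, so the caller proceeds. */
bool
toy_record_without_flush(ToyLock *lock)
{
	if (!toy_acquire(lock))
		return false;
	/* ... page-existence check and initialization would go here ... */
	toy_release(lock);
	return true;
}
```

In this model, toy_record_with_flush() fails at the nested acquire while
toy_record_without_flush() succeeds, which mirrors why dropping the
flush removes the hang.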
 src/backend/access/transam/multixact.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index f825579e888..5b6b48eb79c 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -934,16 +934,17 @@ RecordNewMultiXact(MultiXactId multi, MultiXactOffset offset,
 		 * seen any XLOG_MULTIXACT_ZERO_OFF_PAGE records yet, which should
 		 * happen at most once after starting WAL recovery.
 		 *
-		 * As an extra safety measure, if we do resort to
-		 * SimpleLruDoesPhysicalPageExist(), flush the SLRU buffers first so
-		 * that it will return an accurate result.
+		 *
+		 * We cannot call SimpleLruWriteAll() to flush the SLRU buffers
+		 * here, because that would self-deadlock on MultiXactOffsetSLRULock,
+		 * which we already hold.  Fortunately we do not need to: every
+		 * page that this code path initializes is synchronously flushed via
+		 * SimpleLruWritePage() below before this lock is released, so there
+		 * are no relevant dirty pages.
 		 *----------
 		 */
 		if (last_initialized_offsets_page == -1)
-		{
-			SimpleLruWriteAll(MultiXactOffsetCtl, false);
 			init_needed = !SimpleLruDoesPhysicalPageExist(MultiXactOffsetCtl, next_pageno);
-		}
 		else
 			init_needed = (last_initialized_offsets_page == pageno);
 
-- 
2.43.0