Thread

  1. Re: BUG #19490: Streaming standby on 16.14 stops applying WAL on MultiXactOffsetSLRU when primary is 16.8

    Ayush Tiwari <ayushtiwari.slg01@gmail.com> — 2026-05-21T07:45:07Z

    Hi,
    
    On Thu, 21 May 2026 at 12:55, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
    
    >
    >
    > > On 21 May 2026, at 00:12, Marko Tiikkaja <marko@joh.to> wrote:
    > >
    > > #8  0x0000654c8ae2acba in SimpleLruWriteAll (ctl=0x654c8b63e400
    >
    > Thanks!
    >
    > This clearly points to SimpleLruWriteAll() added in 77dff5d937b1.
    > If by chance you will have a backtrace of another deadlocking process -
    > please post it.
    >
    > But it's not strictly necessary for analysis, I think we can figure out
    > what happened from the backtrace you already posted.
    >
    
    I had a look at the code that Marko's backtrace pointed at and I
    believe this is a straightforward self-deadlock introduced by
    77dff5d937b.
    
    In RecordNewMultiXact() on REL_16_STABLE:
    
      LWLockAcquire(MultiXactOffsetSLRULock, LW_EXCLUSIVE);
    
      ...
    
      if (InRecovery && next_pageno != pageno)
      {
          ...
          if (last_initialized_offsets_page == -1)
          {
              SimpleLruWriteAll(MultiXactOffsetCtl, false);  /* <-- here */
              init_needed = !SimpleLruDoesPhysicalPageExist(MultiXactOffsetCtl,
                                                            next_pageno);
          }
          else
              init_needed = (last_initialized_offsets_page == pageno);
          ...
      }
    
    The outer LWLockAcquire takes MultiXactOffsetSLRULock EXCLUSIVE.
    SimpleLruWriteAll() in REL_16_STABLE then does
    
      LWLockAcquire(shared->ControlLock, LW_EXCLUSIVE);
    
    and for the MultiXactOffsetCtl SLRU, shared->ControlLock is
    MultiXactOffsetSLRULock (set up by
    SimpleLruInit(... MultiXactOffsetSLRULock ...)).
    So it tries to take the very lock the same backend already holds.
    LWLockAcquire does not detect that and parks the process on
    LWLock:MultiXactOffsetSLRU forever.
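    
    To make the pattern concrete, here is a toy model of it. This is not
    PostgreSQL code: ToyLock, toy_try_acquire(), toy_write_all() and
    toy_record_new_multixact() are names I made up for illustration, and the
    integer "backend id" is an assumption of the sketch. Unlike LWLockAcquire,
    the toy lock reports a re-acquisition by its current holder instead of
    sleeping, so the example terminates rather than hanging.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an exclusive, non-reentrant lock (NOT PostgreSQL's LWLock). */
typedef struct ToyLock
{
    bool held;
    int  owner;     /* hypothetical backend id of the current holder */
} ToyLock;

typedef enum
{
    TOY_ACQUIRED,       /* lock was free and is now held by the caller */
    TOY_WOULD_BLOCK,    /* held by someone else: a real lock would wait */
    TOY_SELF_DEADLOCK   /* held by the caller itself: a real lock would hang */
} ToyResult;

static ToyResult
toy_try_acquire(ToyLock *lock, int self)
{
    if (!lock->held)
    {
        lock->held = true;
        lock->owner = self;
        return TOY_ACQUIRED;
    }
    return (lock->owner == self) ? TOY_SELF_DEADLOCK : TOY_WOULD_BLOCK;
}

static void
toy_release(ToyLock *lock)
{
    assert(lock->held);
    lock->held = false;
}

/* Models a "write all pages" helper that takes the control lock itself. */
static ToyResult
toy_write_all(ToyLock *control_lock, int self)
{
    ToyResult r = toy_try_acquire(control_lock, self);

    if (r == TOY_ACQUIRED)
        toy_release(control_lock);  /* pages would be written before this */
    return r;
}

/* Models the replay path: take the lock, then optionally call the helper. */
static ToyResult
toy_record_new_multixact(ToyLock *offset_lock, int self, bool flush_first)
{
    ToyResult r = toy_try_acquire(offset_lock, self);

    if (r != TOY_ACQUIRED)
        return r;
    if (flush_first)
        r = toy_write_all(offset_lock, self);   /* same lock, same holder */
    toy_release(offset_lock);
    return r;
}
```

    With flush_first = true the nested acquire sees its own backend as the
    holder, which is the case where the real startup process would sleep
    forever. With flush_first = false the path completes normally.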
    
    That matches every datum in the report:
    
      - wait_event = LWLock:MultiXactOffsetSLRU.
      - pg_stat_slru shows zero MultiXact activity, because the
        SimpleLruWriteAll loop never gets past LWLockAcquire to actually
        write a page.
      - Restart unwedges things briefly.
      - The deadlock only triggers when last_initialized_offsets_page is
        still -1, i.e. before any XLOG_MULTIXACT_ZERO_OFF_PAGE record has
        been replayed in this recovery session, which is at most once per
        startup and consistent with the "recurs after catch-up" behaviour.
    
    Is the "safety flush" that the comment justifies actually needed?
    Every offsets page that this code path initializes is synchronously written
    via SimpleLruWritePage() a few lines below the SimpleLruZeroPage(),
    with an Assert that the page is clean afterwards.  So at the moment
    we call SimpleLruDoesPhysicalPageExist(), there shouldn't be a relevant
    dirty offsets page in the SLRU buffer cache that would lead to a
    false negative.  Dropping the SimpleLruWriteAll() call therefore
    removes the self-deadlock without changing correctness.
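    
    A toy model of that argument, in case it helps to poke holes in it. Again
    this is not PostgreSQL code: ToySlru and the toy_* functions are invented
    for illustration, and the model assumes that the only way an offsets page
    becomes dirty in this path is the zero-then-write sequence described
    above. If that assumption is wrong, the conclusion does not hold.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NPAGES 8

/* Toy stand-in for an SLRU: a buffer cache in front of an on-disk file. */
typedef struct ToySlru
{
    bool on_disk[TOY_NPAGES];   /* page physically exists in the file */
    bool cached[TOY_NPAGES];    /* page is present in the buffer cache */
    bool dirty[TOY_NPAGES];     /* cached copy has not been written yet */
} ToySlru;

/* Create a zeroed page in the cache only; nothing reaches disk yet. */
static void
toy_zero_page(ToySlru *s, int pageno)
{
    s->cached[pageno] = true;
    s->dirty[pageno] = true;
}

/* Write one cached page out, making it physically exist and clean. */
static void
toy_write_page(ToySlru *s, int pageno)
{
    if (s->cached[pageno] && s->dirty[pageno])
    {
        s->on_disk[pageno] = true;
        s->dirty[pageno] = false;
    }
}

/* Looks at the file only, never at the buffer cache. */
static bool
toy_physical_page_exists(const ToySlru *s, int pageno)
{
    return s->on_disk[pageno];
}

/* The pattern described above: zero the page, then write it synchronously. */
static void
toy_init_offsets_page(ToySlru *s, int pageno)
{
    toy_zero_page(s, pageno);
    toy_write_page(s, pageno);
    assert(!s->dirty[pageno]);
}

/* The check made when no page has been initialized in this session yet. */
static bool
toy_init_needed(const ToySlru *s, int next_pageno)
{
    return !toy_physical_page_exists(s, next_pageno);
}
```

    A page that went through toy_init_offsets_page() is already on disk, so
    toy_init_needed() answers correctly without any flush. Only a page that
    was zeroed and left dirty would be misreported as missing, which is the
    case the flush would guard against and the one I believe this path never
    produces.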
    
    Maybe I'm missing something here. Thoughts?
    
    Regards,
    Ayush