Thread

  1. [PATCH] Fix WAIT FOR LSN cleanup on subtransaction abort

    Ayush Tiwari <ayushtiwari.slg01@gmail.com> — 2026-05-06T07:18:31Z

    Hi,
    
    I found a backend crash in WAIT FOR LSN when it is interrupted inside a
    savepoint and the session then waits again.
    
    I searched for an existing report but could not find one, so I am
    posting it here.
    
    While investigating, I noticed that WAIT FOR LSN cleanup is incomplete
    on subtransaction abort. An interrupt such as statement_timeout while
    waiting inside a savepoint leaves stale per-backend wait state,
    causing a later WAIT FOR LSN in the same backend to violate
    the wait-heap invariant and crash an assertion-enabled build.
    
    A small reproducer is:
    
        BEGIN;
        SAVEPOINT s;
        SET statement_timeout = '100ms';
        WAIT FOR LSN '<future-lsn>' WITH (MODE 'primary_flush');
        ROLLBACK TO s;
        SET statement_timeout = 0;
        WAIT FOR LSN '0/0'
            WITH (MODE 'primary_flush', TIMEOUT '10ms', NO_THROW);
        COMMIT;
    
    where <future-lsn> can be generated with:
    
        SELECT pg_current_wal_insert_lsn() + 10000000000;
    
    On an assertion-enabled build, the second WAIT FOR LSN then fails with:
    
        TRAP: failed Assert("!procInfo->inHeap"), File: "xlogwait.c"
    
    The attached patch mirrors the top-level abort cleanup by calling
    WaitLSNCleanup() from AbortSubTransaction(), after LWLockReleaseAll().  It
    also adds a TAP test to verify that WAIT FOR LSN can be reused in the same
    backend after a statement_timeout and ROLLBACK TO SAVEPOINT.
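    
    To illustrate the failure mode, here is a toy model in C. It is only a
    sketch and not the code in xlogwait.c: apart from the inHeap flag named
    in the assertion, every type and function below (ToyWaitProcInfo,
    toy_wait_begin, toy_wait_cleanup, toy_second_wait_ok) is hypothetical,
    and the model returns false where the real code would trip the Assert.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the per-backend wait state (hypothetical type). */
typedef struct ToyWaitProcInfo
{
    bool inHeap;    /* true while this backend is registered as a waiter */
} ToyWaitProcInfo;

/*
 * Start a wait. Returns false when the backend is already marked as
 * waiting, which is the condition Assert(!procInfo->inHeap) guards.
 */
static bool
toy_wait_begin(ToyWaitProcInfo *p)
{
    if (p->inHeap)
        return false;
    p->inHeap = true;
    return true;
}

/* Idempotent cleanup, modeling what an abort path is expected to do. */
static void
toy_wait_cleanup(ToyWaitProcInfo *p)
{
    p->inHeap = false;
}

/*
 * Model the reproducer: a first wait is interrupted inside a savepoint,
 * the subtransaction is rolled back, and a second wait is started.
 */
static bool
toy_second_wait_ok(bool cleanup_on_subabort)
{
    ToyWaitProcInfo p = { false };
    bool first_ok = toy_wait_begin(&p);

    /* statement_timeout fires here, so the normal exit path is skipped. */

    if (cleanup_on_subabort)
        toy_wait_cleanup(&p);   /* ROLLBACK TO s with the fix applied */

    return first_ok && toy_wait_begin(&p);
}
```

    In this model toy_second_wait_ok(false) returns false, because the
    stale flag survives the rollback, while toy_second_wait_ok(true)
    returns true.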
    
    Thoughts?
    
    Regards,
    Ayush