Thread

  1. Segmentation fault on proc exit after dshash_find_or_insert

    Rahila Syed <rahilasyed90@gmail.com> — 2025-11-21T11:45:35Z

    Hi,
    
    If a process encounters a FATAL error while holding a dshash lock and is
    not inside a transaction, the process can crash with a segmentation fault
    during exit.
    
    The FATAL error causes the backend to exit through proc_exit(), which
    in turn runs shmem_exit(). Outside a transaction, LWLockReleaseAll() is
    not called until ProcKill(). ProcKill() is an on_shmem_exit callback,
    and dsm_backend_shutdown() is called before any on_shmem_exit callbacks
    are invoked. Consequently, if a dshash lock was acquired before the
    FATAL error occurred, it is released only after dsm_backend_shutdown()
    has detached the DSM segment containing the lock, which results in a
    segmentation fault.
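    
    The ordering described above can be modelled with a small standalone
    sketch. This is a toy model, not PostgreSQL code: every toy_* name is
    hypothetical, and a boolean flag stands in for the unmapped DSM segment,
    so the sketch records the bad access instead of crashing.
    
    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    
    /* Toy stand-in for a DSM segment that contains one lock word. */
    typedef struct ToySegment
    {
        bool attached;      /* false once the segment has been "detached" */
        int  lock_word;     /* 1 while the lock is held, 0 otherwise */
    } ToySegment;
    
    /* Toy stand-in for the exiting process. */
    typedef struct ToyProcess
    {
        ToySegment *held_lock_segment;  /* segment whose lock we hold, or NULL */
        bool        touched_detached;   /* set if a release hit a detached segment */
    } ToyProcess;
    
    /* Take the lock that lives inside the segment. */
    static void
    toy_lock_acquire(ToyProcess *proc, ToySegment *seg)
    {
        assert(seg->attached);
        seg->lock_word = 1;
        proc->held_lock_segment = seg;
    }
    
    /* Models LWLockReleaseAll(): release whatever this process still holds. */
    static void
    toy_release_all(ToyProcess *proc)
    {
        ToySegment *seg = proc->held_lock_segment;
    
        if (seg == NULL)
            return;
        if (seg->attached)
            seg->lock_word = 0;
        else
            proc->touched_detached = true;  /* real code would write to unmapped memory */
        proc->held_lock_segment = NULL;
    }
    
    /* Models dsm_backend_shutdown(): the segment goes away. */
    static void
    toy_dsm_backend_shutdown(ToySegment *seg)
    {
        seg->attached = false;
    }
    
    /*
     * Models the reported order: segments are detached first, and the
     * ProcKill-style lock release runs afterwards.
     * Returns true if the release touched a detached segment.
     */
    static bool
    toy_shmem_exit(ToyProcess *proc, ToySegment *seg)
    {
        toy_dsm_backend_shutdown(seg);
        toy_release_all(proc);
        return proc->touched_detached;
    }
    
    /* A FATAL error raised while the lock is held, outside a transaction. */
    static bool
    toy_fatal_while_holding_lock(void)
    {
        ToySegment seg = { true, 0 };
        ToyProcess proc = { NULL, false };
    
        toy_lock_acquire(&proc, &seg);
        return toy_shmem_exit(&proc, &seg);
    }
    ```
    
    Calling toy_fatal_while_holding_lock() from a main() returns true in this
    model, which corresponds to the write into detached memory seen in
    frames #0 to #6 of the backtrace.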
    
    Please find a reproducer attached. I have modified the test_dsm_registry
    module to create a background worker that does nothing but throw a FATAL
    error after acquiring the dshash lock. It has to run in a background
    worker so that it runs outside a transaction.
    
    To trigger the segmentation fault, apply the 0001-Reproducer* patch, run
    make install in the test_dsm_registry module, specify test_dsm_registry
    as shared_preload_libraries in postgresql.conf, and start the server.
    
    Please find attached a fix that calls LWLockReleaseAll() early in
    shmem_exit(). This ensures that the dshash lock is released before
    dsm_backend_shutdown() is called. It also ensures that any callbacks
    invoked later in shmem_exit() will not fail to acquire a lock.
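    
    As a rough illustration of why an early release is safe to combine with
    the existing one in ProcKill(), here is a standalone toy model. It is
    not the attached patch and not PostgreSQL code: the toy_* names and the
    fixed-size held-lock array are assumptions made for the sketch. It
    releases all held locks before the detach step and shows that the later
    release has nothing left to do.
    
    ```c
    #include <assert.h>
    #include <stdbool.h>
    
    #define TOY_MAX_HELD 8
    
    /* Toy lock; `mapped` says whether its memory is still attached. */
    typedef struct ToyLock
    {
        bool mapped;
        bool held;
    } ToyLock;
    
    /* Per-process list of held locks (hypothetical, for illustration only). */
    static ToyLock *toy_held[TOY_MAX_HELD];
    static int      toy_num_held = 0;
    static int      toy_bad_releases = 0;   /* releases that hit unmapped memory */
    
    static void
    toy_acquire(ToyLock *lock)
    {
        assert(toy_num_held < TOY_MAX_HELD);
        lock->held = true;
        toy_held[toy_num_held++] = lock;
    }
    
    /* Release every held lock, newest first; a second call finds nothing to do. */
    static void
    toy_release_all(void)
    {
        while (toy_num_held > 0)
        {
            ToyLock *lock = toy_held[--toy_num_held];
    
            if (lock->mapped)
                lock->held = false;
            else
                toy_bad_releases++;
        }
    }
    
    /* Exit sequence with the proposed ordering: release first, then detach. */
    static int
    toy_shmem_exit_fixed(ToyLock *dsm_lock)
    {
        toy_release_all();          /* early release, as the fix proposes */
        dsm_lock->mapped = false;   /* stands in for dsm_backend_shutdown() */
        toy_release_all();          /* stands in for ProcKill(); now a no-op */
        return toy_bad_releases;
    }
    
    /* Acquire one lock in the "segment", then run the fixed exit sequence. */
    static int
    toy_demo_fixed(void)
    {
        ToyLock dsm_lock = { true, false };
    
        toy_num_held = 0;
        toy_bad_releases = 0;
        toy_acquire(&dsm_lock);
        return toy_shmem_exit_fixed(&dsm_lock);
    }
    ```
    
    In this model toy_demo_fixed() returns 0: the lock is released while its
    memory is still mapped, and the second release loop exits immediately
    because the held-lock list is already empty.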
    
    Please see the backtrace below.
    
    ```
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0  0x000055a7515af56c in pg_atomic_fetch_sub_u32_impl (ptr=0x7f92c4b334f4, sub_=262144)
        at ../../../../src/include/port/atomics/generic-gcc.h:218
    218             return __sync_fetch_and_sub(&ptr->value, sub_);
    (gdb) bt
    #0  0x000055a7515af56c in pg_atomic_fetch_sub_u32_impl (ptr=0x7f92c4b334f4, sub_=262144)
        at ../../../../src/include/port/atomics/generic-gcc.h:218
    #1  0x000055a7515af625 in pg_atomic_sub_fetch_u32_impl (ptr=0x7f92c4b334f4, sub_=262144)
        at ../../../../src/include/port/atomics/generic.h:232
    #2  0x000055a7515af709 in pg_atomic_sub_fetch_u32 (ptr=0x7f92c4b334f4, sub_=262144)
        at ../../../../src/include/port/atomics.h:441
    #3  0x000055a7515b1583 in LWLockReleaseInternal (lock=0x7f92c4b334f0, mode=LW_EXCLUSIVE) at lwlock.c:1840
    #4  0x000055a7515b1638 in LWLockRelease (lock=0x7f92c4b334f0) at lwlock.c:1902
    #5  0x000055a7515b16e9 in LWLockReleaseAll () at lwlock.c:1951
    #6  0x000055a7515ba63d in ProcKill (code=1, arg=0) at proc.c:953
    #7  0x000055a7515913af in shmem_exit (code=1) at ipc.c:276
    #8  0x000055a75159119b in proc_exit_prepare (code=1) at ipc.c:198
    #9  0x000055a7515910df in proc_exit (code=1) at ipc.c:111
    #10 0x000055a7517be71d in errfinish (filename=0x7f92ce41d062 "test_dsm_registry.c", lineno=187,
        funcname=0x7f92ce41d160 <__func__.0> "TestDSMRegistryMain") at elog.c:596
    #11 0x00007f92ce41ca62 in TestDSMRegistryMain (main_arg=0) at test_dsm_registry.c:187
    #12 0x000055a7514db00c in BackgroundWorkerMain (startup_data=0x55a752dd8028, startup_data_len=1472)
        at bgworker.c:846
    #13 0x000055a7514de1e8 in postmaster_child_launch (child_type=B_BG_WORKER, child_slot=239,
        startup_data=0x55a752dd8028, startup_data_len=1472, client_sock=0x0) at launch_backend.c:268
    #14 0x000055a7514e530d in StartBackgroundWorker (rw=0x55a752dd8028) at postmaster.c:4168
    #15 0x000055a7514e55a4 in maybe_start_bgworkers () at postmaster.c:4334
    #16 0x000055a7514e4200 in LaunchMissingBackgroundProcesses () at postmaster.c:3408
    #17 0x000055a7514e205b in ServerLoop () at postmaster.c:1728
    #18 0x000055a7514e18b0 in PostmasterMain (argc=3, argv=0x55a752dd0e70) at postmaster.c:1403
    #19 0x000055a75138eead in main (argc=3, argv=0x55a752dd0e70) at main.c:231
    ```
    
    Thank you,
    Rahila Syed