Thread

  1. Re: injection_points: Switch wait/wakeup to use atomics rather than latches

    Andrey Borodin <x4mmm@yandex-team.ru> — 2026-05-30T08:05:25Z

    
    > On 28 May 2026, at 07:43, Michael Paquier <michael@paquier.xyz> wrote:
    > 
    > Andrey in CC, as I'm sure he is interested in that.
    
    Thanks! That's exactly what I need for my tests.
    
    > On 29 May 2026, at 18:31, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
    > 
    >> I'm still struggling to understand. Condition variables and latches
    >> are both designed to allow for nice waits and wakeups.
    > 
    > They only work after you have a PGPROC slot. If you want to inject code to authentication, or into postmaster, you cannot use them.
    
    I have another reason: postmaster death behavior. When we wait on a
    ConditionVariable and the postmaster is killed with kill -9, we release all
    LWLocks, which causes corruption [0], because the checkpointer can flush
    something that's not in WAL.
    
    So I'm trying to build corruption-seeking tests using a tool that can induce
    corruption in tests.
    
    About the patch:
    - inj_state->wait_counts[index]++;
    SpinLockRelease(&inj_state->lock);
    
    - /* And broadcast the change to the waiters */
    - ConditionVariableBroadcast(&inj_state->wait_point);
    + pg_atomic_fetch_add_u32(&inj_state->wait_counts[index], 1);
    
    Can we move pg_atomic_fetch_add_u32() back under the lock?
    We determine the slot index under the lock, then wake up the slot outside the lock.
    In a correctly written test this is not a problem.
    However, technically, the identity of a slot can change between releasing the lock
    and incrementing wait_counts[index].
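
    A minimal standalone sketch of the ordering suggested above. This is not
    the patch's code: DemoState, demo_attach(), demo_detach() and demo_wakeup()
    are hypothetical names, C11 atomics stand in for pg_atomic_uint32, and an
    atomic_flag stands in for the spinlock. It only illustrates that the lookup
    and the increment happen in the same critical section, so the slot cannot
    be reassigned in between.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_WAITS 8
#define DEMO_NAME_LEN  64

/* Hypothetical stand-in for the module's shared state. */
typedef struct DemoState
{
    atomic_flag lock;                               /* stands in for the spinlock */
    char        name[DEMO_MAX_WAITS][DEMO_NAME_LEN]; /* "" means the slot is free */
    atomic_uint wait_counts[DEMO_MAX_WAITS];
} DemoState;

static void demo_lock(DemoState *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;                                           /* spin until acquired */
}

static void demo_unlock(DemoState *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

static void demo_init(DemoState *s)
{
    atomic_flag_clear(&s->lock);
    for (int i = 0; i < DEMO_MAX_WAITS; i++)
    {
        s->name[i][0] = '\0';
        atomic_init(&s->wait_counts[i], 0);
    }
}

/* A waiter claims the first free slot; returns its index, or -1 if full. */
static int demo_attach(DemoState *s, const char *name)
{
    int index = -1;

    demo_lock(s);
    for (int i = 0; i < DEMO_MAX_WAITS; i++)
    {
        if (s->name[i][0] == '\0')
        {
            snprintf(s->name[i], DEMO_NAME_LEN, "%s", name);
            index = i;
            break;
        }
    }
    demo_unlock(s);
    return index;
}

/* A waiter gives its slot back; the counter is deliberately left alone. */
static void demo_detach(DemoState *s, int index)
{
    assert(index >= 0 && index < DEMO_MAX_WAITS);
    demo_lock(s);
    s->name[index][0] = '\0';
    demo_unlock(s);
}

/*
 * Wake up the waiter registered under "name". The slot lookup and the
 * counter increment share one critical section, so the slot's identity
 * cannot change between the two steps. Returns the slot index, or -1.
 */
static int demo_wakeup(DemoState *s, const char *name)
{
    int index = -1;

    demo_lock(s);
    for (int i = 0; i < DEMO_MAX_WAITS; i++)
    {
        if (strcmp(s->name[i], name) == 0)
        {
            index = i;
            atomic_fetch_add(&s->wait_counts[i], 1);
            break;
        }
    }
    demo_unlock(s);
    return index;
}
```

    With the increment outside the critical section, a demo_detach() followed
    by a demo_attach() of a different name could reuse the same index between
    the lookup and the increment, and the wakeup would land on the wrong
    waiter.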
    
    I'll do another pass tomorrow, maybe something else will catch my eye.
    
    
    Best regards, Andrey Borodin.
    
    [0] https://www.postgresql.org/message-id/B3C69B86-7F82-4111-B97F-0005497BB745%40yandex-team.ru