Re: injection_points: Switch wait/wakeup to use atomics rather than latches
Andrey Borodin <x4mmm@yandex-team.ru>
From: Andrey Borodin <x4mmm@yandex-team.ru>
To: Heikki Linnakangas <hlinnaka@iki.fi>
Cc: Robert Haas <robertmhaas@gmail.com>,
Michael Paquier <michael@paquier.xyz>,
Postgres hackers <pgsql-hackers@lists.postgresql.org>
Date: 2026-05-30T08:05:25Z
Lists: pgsql-hackers
> On 28 May 2026, at 07:43, Michael Paquier <michael@paquier.xyz> wrote:
>
> Andrey in CC, as I'm sure he is interested in that.

Thanks! That's exactly what I need for my tests.

> On 29 May 2026, at 18:31, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
>> I'm still struggling to understand. Condition variables and latches
>> are both designed to allow for nice waits and wakeups.
>
> They only work after you have a PGPROC slot. If you want to inject code to authentication, or into postmaster, you cannot use them.

I have another reason: postmaster death behavior. When we wait on a ConditionVariable and the postmaster is killed with kill -9, we release all LWLocks. That causes corruption [0], because the checkpointer can flush something that is not in WAL. So I'm trying to build corruption-seeking tests, using a tool that can induce corruption in tests.

About the patch:

-	inj_state->wait_counts[index]++;
 	SpinLockRelease(&inj_state->lock);
-	/* And broadcast the change to the waiters */
-	ConditionVariableBroadcast(&inj_state->wait_point);
+	pg_atomic_fetch_add_u32(&inj_state->wait_counts[index], 1);

Can we move pg_atomic_fetch_add_u32() back under the lock? We determine the slot index under the lock, then wake up the slot outside the lock. In a correctly written test this is not a problem. However, technically, the identity of a slot can change between releasing the lock and incrementing wait_counts[index].

I'll do another pass tomorrow, maybe something else will catch my eye.

Best regards, Andrey Borodin.

[0] https://www.postgresql.org/message-id/B3C69B86-7F82-4111-B97F-0005497BB745%40yandex-team.ru
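
To make the slot-identity concern about the patch concrete, here is a minimal standalone sketch. It is not the injection_points code: a C11 atomic_flag stands in for the spinlock, atomic_uint for pg_atomic_uint32, and every name in it (InjState, inj_attach, inj_detach, inj_lookup, inj_bump, inj_wakeup_locked) is hypothetical. It only models the difference between incrementing the counter after the lock is released and incrementing it while the lock is still held.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

#define MAX_WAITS 4
#define NAME_LEN  64

/* Hypothetical, simplified stand-in for the shared wait state. */
typedef struct
{
	atomic_flag lock;						/* models the spinlock */
	char		names[MAX_WAITS][NAME_LEN];	/* empty string = free slot */
	atomic_uint wait_counts[MAX_WAITS];
} InjState;

static void spin_acquire(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;	/* busy-wait */
}

static void spin_release(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

static void inj_init(InjState *s)
{
	atomic_flag_clear(&s->lock);
	memset(s->names, 0, sizeof(s->names));
	for (int i = 0; i < MAX_WAITS; i++)
		atomic_init(&s->wait_counts[i], 0);
}

/* Caller must hold the lock. Returns the slot index, or -1. */
static int find_slot(InjState *s, const char *name)
{
	for (int i = 0; i < MAX_WAITS; i++)
		if (strcmp(s->names[i], name) == 0)
			return i;
	return -1;
}

/* Register a waiter in the first free slot. */
static int inj_attach(InjState *s, const char *name)
{
	int			index;

	spin_acquire(&s->lock);
	index = find_slot(s, "");
	if (index >= 0)
	{
		strncpy(s->names[index], name, NAME_LEN - 1);
		s->names[index][NAME_LEN - 1] = '\0';
	}
	spin_release(&s->lock);
	return index;
}

/* Free the slot owned by "name", if any. */
static void inj_detach(InjState *s, const char *name)
{
	spin_acquire(&s->lock);
	int			index = find_slot(s, name);

	if (index >= 0)
		s->names[index][0] = '\0';
	spin_release(&s->lock);
}

/* Step 1 of the patched pattern: resolve the index under the lock. */
static int inj_lookup(InjState *s, const char *name)
{
	spin_acquire(&s->lock);
	int			index = find_slot(s, name);

	spin_release(&s->lock);
	return index;
}

/* Step 2 of the patched pattern: bump the counter with no lock held. */
static void inj_bump(InjState *s, int index)
{
	assert(index >= 0 && index < MAX_WAITS);
	atomic_fetch_add(&s->wait_counts[index], 1);
}

/* Suggested pattern: lookup and increment inside one critical section. */
static int inj_wakeup_locked(InjState *s, const char *name)
{
	spin_acquire(&s->lock);
	int			index = find_slot(s, name);

	if (index >= 0)
		atomic_fetch_add(&s->wait_counts[index], 1);
	spin_release(&s->lock);
	return index;
}
```

If "a" is looked up, then detached, and "b" takes over the same slot before inj_bump() runs, the increment lands on the counter now owned by "b". With inj_wakeup_locked() the same interleaving cannot happen, because the slot cannot change owner between the lookup and the increment.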