Thread
-
injection_points: Switch wait/wakeup to use atomics rather than latches
Michael Paquier <michael@paquier.xyz> — 2026-05-28T02:43:29Z
Hi all,

(Adding Andrey in CC, as I'm sure he is interested in this.)

While looking at the test proposed on the thread about ProcKill(), I was reminded that relying on latches and a condition variable for the wait and the wakeups has its limits:
https://www.postgresql.org/message-id/aheVjCHmcbXBtiy0%40paquier.xyz

In this case, we are trying to synchronize backends once they no longer have a latch assigned, which defeats the purpose of wait/wakeup because the condition variable used in injection_points while waiting expects a latch to be set for the processes we are waiting on. Folks have complained about this limitation a couple of times in the past, and I never got around to doing something about it.

While looking at that, I have ended up with the attached patch, which was surprisingly simpler than what I thought was needed. This replaces the condition variable with a set of atomic counters. The counters are incremented at wakeup, and the wait checks them on a periodic basis. The wait loop uses a delay that increases over time, capped at 100ms, so that we get good responsiveness on fast machines without burning CPU for nothing, through a tight loop of counter checks, in tests that require more wait time.

One thing worth noting is the CHECK_FOR_INTERRUPTS() in the wait loop, which is something we need for the autovacuum test in test_misc that requires some signaling and interrupt processing.

It may make sense to be conservative and limit ourselves to doing this change on HEAD, but I'd like to suggest a backpatch down to v17 so that future tests that rely on such a change can be backpatched. I would need this change for the other test; still, consistency in the facility takes priority for me here.

Note: The CI seems happy with the patch.

Thoughts or comments?
--
Michael
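
For readers following along without the attachment, the counter-based wait described above could be sketched roughly as below. This is a standalone illustration in plain C11, not the code from the patch: the slot array, the function names, the 1ms initial delay, the doubling step and the timeout argument are all assumptions made here so the example compiles and terminates on its own. Only the 100ms cap and the "increment at wakeup, poll in the wait" shape come from the mail. In a backend, the wait loop would presumably also call CHECK_FOR_INTERRUPTS() at the marked spot.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define MAX_WAIT_SLOTS        8        /* hypothetical number of wait slots */
#define WAIT_DELAY_INITIAL_US 1000L    /* assumed starting delay: 1ms */
#define WAIT_DELAY_MAX_US     100000L  /* 100ms cap, as described in the mail */

/* One wakeup counter per slot; static storage starts at zero. */
static _Atomic uint32_t wakeup_counters[MAX_WAIT_SLOTS];

/* Sleep for the given number of microseconds. */
static void
sleep_us(long us)
{
    struct timespec ts;

    ts.tv_sec = us / 1000000L;
    ts.tv_nsec = (us % 1000000L) * 1000L;
    nanosleep(&ts, NULL);
}

/* Grow the delay (doubling is an assumption), never past the cap. */
static long
next_delay_us(long cur)
{
    long next = cur * 2;

    return (next > WAIT_DELAY_MAX_US) ? WAIT_DELAY_MAX_US : next;
}

/* Read the current counter value; a waiter does this before waiting. */
static uint32_t
slot_read(int slot)
{
    assert(slot >= 0 && slot < MAX_WAIT_SLOTS);
    return atomic_load(&wakeup_counters[slot]);
}

/* Wakeup side: bump the counter, no latch or condition variable needed. */
static void
slot_wakeup(int slot)
{
    assert(slot >= 0 && slot < MAX_WAIT_SLOTS);
    atomic_fetch_add(&wakeup_counters[slot], 1);
}

/*
 * Wait side: poll until the counter differs from old_count.  Returns true
 * if a wakeup was seen, false once at least timeout_us has been slept.
 * The timeout exists only so that this sketch always terminates.
 */
static bool
slot_wait(int slot, uint32_t old_count, long timeout_us)
{
    long delay = WAIT_DELAY_INITIAL_US;
    long waited = 0;

    assert(slot >= 0 && slot < MAX_WAIT_SLOTS);
    while (atomic_load(&wakeup_counters[slot]) == old_count)
    {
        if (waited >= timeout_us)
            return false;
        /* A backend would process interrupts here. */
        sleep_us(delay);
        waited += delay;
        delay = next_delay_us(delay);
    }
    return true;
}
```

A waiter would call slot_read() first, then slot_wait() with that value, while another process calls slot_wakeup() on the same slot. Comparing against a previously read value, rather than waiting for a fixed number, means a wakeup that lands between the read and the first poll is not lost.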