Thread

  1. Re: injection_points: Switch wait/wakeup to use atomics rather than latches

    Michael Paquier <michael@paquier.xyz> — 2026-05-28T23:19:42Z

    On Thu, May 28, 2026 at 08:40:39AM -0400, Robert Haas wrote:
    > After reading this email, the linked-to email, and the commit message
    > for the patch, I still don't have a clear understanding of what this
    > is intended to fix. It seems like it's going to make the
    > responsiveness worse. In general, we want to replace escalating wait
    > loops with things that wake up instantly at the right time, and this
    > is going in the opposite direction.
    
    This is a trade-off between responsiveness of the system and
    flexibility.  I have had two complaints in the past that the waits
    and wakeups cannot be used in some contexts because we rely on
    condition variables and latches:
    - Postmaster context (lack of DSM access, for one).  Heikki once
    mentioned to me that this is annoying when hacking on tests there,
    at the protocol level at least.
    - The case shown on the previous thread, which was a tricky
    scenario involving the termination of backends.
    
    Another limitation is wait event visibility: the wait may not be
    visible in pg_stat_activity.  We could simply add a LOG entry in
    injection_wait() once the old count is read, and rely on a server log
    lookup in the TAP tests where we cannot use pg_stat_activity.
    
    Compared to redesigning all the facilities that injection_points
    relies on, this patch struck me as a good balance between
    responsiveness (min 10us, max 100ms) and portability.  The
    minimum threshold does not really matter much in terms of runtime on
    fast machines.
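    
    To make the shape of the trade-off concrete, here is a minimal
    standalone sketch of a bounded escalating wait on an atomic counter.
    It is not the patch's code: it uses C11 atomics rather than
    PostgreSQL's own atomics, the names (wait_slot, slot_wait_from,
    slot_wakeup, next_delay_us) are hypothetical, and doubling the delay
    at each poll is an assumption.  Only the 10us and 100ms bounds come
    from the numbers above.  The max_polls cap exists purely so that the
    sketch terminates when nobody wakes it up.

```c
#define _POSIX_C_SOURCE 200809L  /* for nanosleep() */

#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

#define WAIT_MIN_DELAY_US 10L      /* lower bound quoted above: 10us */
#define WAIT_MAX_DELAY_US 100000L  /* upper bound quoted above: 100ms */

/* Hypothetical stand-in for a wait slot living in shared memory. */
typedef struct
{
    atomic_uint_fast32_t wakeup_count;
} wait_slot;

/* Assumed escalation policy: double the delay, capped at the maximum. */
static long
next_delay_us(long cur)
{
    long next = cur * 2;

    return (next > WAIT_MAX_DELAY_US) ? WAIT_MAX_DELAY_US : next;
}

static void
sleep_us(long us)
{
    struct timespec ts;

    ts.tv_sec = us / 1000000L;
    ts.tv_nsec = (us % 1000000L) * 1000L;
    nanosleep(&ts, NULL);
}

/*
 * Poll until the counter differs from "old_count", the value read by the
 * waiter before it started waiting.  Returns the number of sleeps done,
 * or -1 if max_polls sleeps elapsed without a wakeup (demo-only limit).
 */
static int
slot_wait_from(wait_slot *slot, uint_fast32_t old_count, int max_polls)
{
    long delay = WAIT_MIN_DELAY_US;
    int  polls = 0;

    assert(max_polls >= 0);

    while (atomic_load(&slot->wakeup_count) == old_count)
    {
        if (polls >= max_polls)
            return -1;
        sleep_us(delay);
        delay = next_delay_us(delay);
        polls++;
    }
    return polls;
}

/* Wakeup side: bump the counter; waiters notice on their next poll. */
static void
slot_wakeup(wait_slot *slot)
{
    atomic_fetch_add(&slot->wakeup_count, 1);
}
```

    With this shape, a waiter that has already escalated to the cap
    notices a wakeup at most about 100ms late, which is the
    responsiveness cost being discussed; in exchange nothing here needs
    a latch or a condition variable.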
    
    Does this explanation make sense?
    --
    Michael