Thread

  1. Re: small cleanup for s_lock.h

    Nathan Bossart <nathandbossart@gmail.com> — 2026-05-07T20:41:56Z

    On Tue, May 05, 2026 at 12:57:10PM -0500, Nathan Bossart wrote:
    > On Tue, May 05, 2026 at 12:56:09PM -0400, Tom Lane wrote:
    >> Lastly, I definitely agree now that the file's header comment needs
    >> some work.  Maybe this insight helps you with that?  (One thing
    >> I noticed is that the ending comment about "Equivalent OS-supplied
    >> mutex routines could be used too" feels pretty obsolete.  Maybe
    >> instead, "Equivalent compiler intrinsics are another popular option".)
    > 
    > I think it does help, thanks.  I'll give it a whirl.
    
    Here is a first try.  While at it, I removed the note about using
    "volatile" prior to v9.5, as well as the note about using TAS() in a loop.
    I couldn't really understand why the latter part needs to be said out loud,
    so I did some more digging.  It was added by commit 7f60b81e and refers to
    an unsupported platform.  At the time, it was apparently normal to use
    TAS() outside of s_lock.h, but the "NOT part of the API" note above it was
    added the following year in commit 499abb0c.
    
    Two other things I noticed after staring at s_lock.h for a while:
    
    * If we're willing to define TAS_SPIN to first do an unlocked test
    everywhere, we can simplify the ARM and x86 code.  Specifically, we can
    merge the x86 blocks together, and we can remove all but the SPIN_DELAY
    definition for AArch64.  We've thus far been hesistant to add the unlocked
    test to platforms without evidence of improvements, but I'm a little
    skeptical that folks have performance-critical workloads on architectures
    that don't already have it.
    
    * Furthermore, do we really need architecture-specific implementations of
    anything except for SPIN_DELAY()?  I suspect that
    __sync_lock_test_and_set() and __sync_lock_release() work pretty well most
    of the time, and they've been available in gcc and clang for ~20 years.
    Perhaps we could even start using the __atomic builtins if available.
    We've been slowly trying to move away from spinlocks, anyway, and I'm not
    seeing huge differences in generated code on https://godbolt.org/ for newer
    compiler versions.
    
    Of course, further research and benchmarking would be needed, but I figured
    I'd at least jot down these thoughts.
    
    -- 
    nathan