Thread

  1. Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB barriers

    Greg Burd <greg@burd.me> — 2025-11-21T19:37:08Z

    On Nov 20 2025, at 7:07 pm, Andres Freund <andres@anarazel.de> wrote:
    
    > Hi,
    > 
    > On 2025-11-20 19:03:47 -0500, Andres Freund wrote:
    >> > MSVC's _InterlockedCompareExchange() intrinsic on ARM64 performs the
    >> > atomic operation but does NOT emit the necessary Data Memory Barrier
    >> > (DMB) instructions [4][5].
    >> 
    >> I couldn't reproduce this result when playing around on godbolt. By
    >> specifying /arch:armv9.4 msvc can be convinced to emit the code for the
    >> intrinsics inline (at least for most of them).  And that makes it visible
    >> that _InterlockedCompareExchange() results in a "casal" instruction.
    >> Looking that up shows:
    >>   https://developer.arm.com/documentation/dui0801/l/A64-Data-Transfer-Instructions/CASA--CASAL--CAS--CASL--CASAL--CAS--CASL--A64-
    >> which includes these two statements:
    >> "CASA and CASAL load from memory with acquire semantics."
    >> "CASL and CASAL store to memory with release semantics."
    > 
    > Further evidence for that is that
    > https://learn.microsoft.com/en-us/windows/win32/api/winnt/nf-winnt-interlockedcompareexchange
    > states:
    > "This function generates a full memory barrier (or fence) to ensure
    > that memory operations are completed in order."
    > 
    > (note that we are using the function, not the intrinsic for TAS())
    
    Got it, thanks.
    
    > Greetings,
    > 
    > Andres
    
    best.
    
    -greg
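
For readers following along, the ordering property discussed above (a compare-and-swap that acts as a full barrier, as "casal" and InterlockedCompareExchange() are described as doing) can be sketched portably with C11 atomics. This is only an illustration, not the code in PostgreSQL's tree and not the patch under discussion: the names slock_sketch_t, tas_sketch() and unlock_sketch() are hypothetical, the sequentially consistent compare-exchange is a stand-in for the Windows function, and the "0 means acquired" return convention is an assumption of the sketch.

```c
#include <stdatomic.h>

/* Hypothetical lock type for illustration only. */
typedef atomic_long slock_sketch_t;

/*
 * Test-and-set sketch: try to change the lock word from 0 to 1.
 * Returns 0 if the lock was acquired, 1 if it was already held.
 * memory_order_seq_cst on success requests acquire and release ordering
 * on the operation, the role the thread attributes to "casal" on ARM64.
 */
static int
tas_sketch(slock_sketch_t *lock)
{
    long expected = 0;

    return atomic_compare_exchange_strong_explicit(lock, &expected, 1L,
                                                   memory_order_seq_cst,
                                                   memory_order_seq_cst) ? 0 : 1;
}

/* Release the lock; a release store publishes writes made while holding it. */
static void
unlock_sketch(slock_sketch_t *lock)
{
    atomic_store_explicit(lock, 0L, memory_order_release);
}
```

A caller would spin on tas_sketch() until it returns 0, do its work, then call unlock_sketch(). Whether a given compiler lowers the compare-exchange to "casal" or to a load-exclusive/store-exclusive loop depends on the target architecture flags, as noted for /arch:armv9.4 above.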