Thread

  1. memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Robert Haas <robertmhaas@gmail.com> — 2011-09-14T20:29:31Z

    On Mon, Aug 8, 2011 at 7:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:
    > I've been thinking about this too and actually went so far as to do
    > some research and put together something that I hope covers most of
    > the interesting cases.  The attached patch is pretty much entirely
    > untested, but reflects my present belief about how things ought to
    > work.
    
    And, here's an updated version, with some of the more obviously broken
    things fixed.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
  2. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-09-15T15:57:56Z

    On 14.09.2011 23:29, Robert Haas wrote:
    > On Mon, Aug 8, 2011 at 7:47 AM, Robert Haas<robertmhaas@gmail.com>  wrote:
    >> I've been thinking about this too and actually went so far as to do
    >> some research and put together something that I hope covers most of
    >> the interesting cases.  The attached patch is pretty much entirely
    >> untested, but reflects my present belief about how things ought to
    >> work.
    >
    > And, here's an updated version, with some of the more obviously broken
    > things fixed.
    
    s/atomic/barrier/
    
    > +/*
    > + * A compiler barrier need not (and preferably should not) emit any actual
    > + * machine code, but must act as an optimization fence: the compiler must not
    > + * reorder loads or stores to main memory around the barrier.  However, the
    > + * CPU may still reorder loads or stores at runtime, if the architecture's
    > + * memory model permits this.
    > + *
    > + * A memory barrier must act as a compiler barrier, and in addition must
    > + * guarantee that all loads and stores issued prior to the barrier are
    > + * completed before any loads or stores issued after the barrier.  Unless
    > + * loads and stores are totally ordered (which is not the case on most
    > + * architectures) this requires issuing some sort of memory fencing
    > + * instruction.
    
    This seems like a strange way to explain the problem. I would suggest 
    structuring those paragraphs along the lines of:
    
    "
    A PostgreSQL memory barrier guarantees that any loads/stores before the 
    barrier are completely finished and visible to other CPUs, before the 
    loads/stores after the barrier are performed.
    
    That involves two things: 1. We must stop the compiler from rearranging 
    loads/stores across the barrier. 2. We must stop the CPU from reordering 
    the loads/stores across the barrier.
    "
    
    Do we have any use for compiler barriers that are not also memory 
    barriers? If not, I would suggest not exposing the pg_compiler_barrier() 
    macro, but keep that as an implementation detail in the implementations 
    of pg_memory_barrier().
    
    Some examples on the correct usage of these barriers would be nice, too.
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
    
  3. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Robert Haas <robertmhaas@gmail.com> — 2011-09-21T18:33:12Z

    On Thu, Sep 15, 2011 at 11:57 AM, Heikki Linnakangas
    <heikki.linnakangas@enterprisedb.com> wrote:
    > s/atomic/barrier/
    
    Oops.
    
    >> +/*
    >> + * A compiler barrier need not (and preferably should not) emit any
    >> actual
    >> + * machine code, but must act as an optimization fence: the compiler must
    >> not
    >> + * reorder loads or stores to main memory around the barrier.  However,
    >> the
    >> + * CPU may still reorder loads or stores at runtime, if the
    >> architecture's
    >> + * memory model permits this.
    >> + *
    >> + * A memory barrier must act as a compiler barrier, and in addition must
    >> + * guarantee that all loads and stores issued prior to the barrier are
    >> + * completed before any loads or stores issued after the barrier.  Unless
    >> + * loads and stores are totally ordered (which is not the case on most
    >> + * architectures) this requires issuing some sort of memory fencing
    >> + * instruction.
    >
    > This seems like a strange way to explain the problem. I would suggest
    > structuring those paragraphs along the lines of:
    >
    > "
    > A PostgreSQL memory barrier guarantees that any loads/stores before the
    > barrier are completely finished and visible to other CPUs, before the
    > loads/stores after the barrier are performed.
    >
    > That involves two things: 1. We must stop the compiler from rearranging
    > loads/stores across the barrier. 2. We must stop the CPU from reordering the
    > loads/stores across the barrier.
    > "
    
    That doesn't seem much different than I wrote?
    
    One thing to keep in mind about whatever language we use here is that
    memory barrier instructions need not (and often do not) compel the CPU
    to "completely finish" anything before doing the next thing.
    Execution is heavily pipelined, and on a sequence like STORE/WRITE
    BARRIER/STORE the CPU is perfectly well entitled to begin the second
    store before the first one is done.  It just can't let the second
    STORE get far enough for other CPUs to see it.
    
    > Do we have any use for compiler barriers that are not also memory barriers?
    > If not, I would suggest not exposing the pg_compiler_barrier() macro, but
    > keep that as an implementation detail in the implementations of
    > pg_memory_barrier().
    
    I think there might be some use for a pure compiler barrier, but I'm
    not sure yet.  It's probably not worth having a "fallback"
    implementation involving a spinlock, though, because odds are good
    that any code that is performance-critical enough to be attempting to
    safely use a compiler barrier can't survive having a spinlock
    acquire-and-release shoved in there, so any such code should probably
    be #ifdef'd.
    
    > Some examples on the correct usage of these barriers would be nice, too.
    
    That's a big topic.  A trivial example is that if you write:
    
    a[*x] = i;
    ++*x;
    
    ...where x and i are pointers into shared memory, another backend
    might loop over a from 0 to x-1 and accidentally read off the end of
    the array.
    
    But even a full explanation of that case seems like almost too much
    for the comment of a header file, and there are certainly far more
    complex cases.  I think anyone who is using these primitives is going
    to have to do some independent reading...
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  4. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Kevin Grittner <kevin.grittner@wicourts.gov> — 2011-09-21T18:48:17Z

    Robert Haas <robertmhaas@gmail.com> wrote:
     
    > But even a full explanation of that case seems like almost too
    > much for the comment of a header file, and there are certainly far
    > more complex cases.  I think anyone who is using these primitives
    > is going to have to do some independent reading...
     
    Maybe a URL or two in the header comments, pointing to relevant
    papers for the techniques used?  After all, years from now someone
    might be familiar with other techniques from newer papers and wonder
    what the techniques in the code are based on.
     
    -Kevin
    
    
  5. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Robert Haas <robertmhaas@gmail.com> — 2011-09-21T19:08:29Z

    On Wed, Sep 21, 2011 at 2:48 PM, Kevin Grittner
    <Kevin.Grittner@wicourts.gov> wrote:
    > Robert Haas <robertmhaas@gmail.com> wrote:
    >> But even a full explanation of that case seems like almost too
    >> much for the comment of a header file, and there are certainly far
    >> more complex cases.  I think anyone who is using these primitives
    >> is going to have to do some independent reading...
    >
    > Maybe a URL or two in the header comments, pointing to relevant
    > papers for the techniques used?  After all, years from now someone
    > might be familiar with other techniques from newer papers and wonder
    > what the techniques in the code are based on.
    
    If there are any academic papers on this topic, I haven't found them.
    Mostly, I've found lots of articles written by people who were coding
    for the Linux kernel, and a lot of those articles are extremely
    Linux-specific (you could use the smb_rb() macro here, but it's better
    to instead use this RCU-related macro, because it'll do it for you,
    blah blah).  I'm happy to link to any sources anyone thinks we ought
    to link to, but I've mostly had to piece this together bit by bit from
    blog posts and (sometimes buggy) examples.  It hasn't been a
    particularly thrilling exercise.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  6. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Kevin Grittner <kevin.grittner@wicourts.gov> — 2011-09-21T19:39:48Z

    Robert Haas <robertmhaas@gmail.com> wrote:
    > On Wed, Sep 21, 2011 at 2:48 PM, Kevin Grittner
    > <Kevin.Grittner@wicourts.gov> wrote:
    >> Robert Haas <robertmhaas@gmail.com> wrote:
    >>> But even a full explanation of that case seems like almost too
    >>> much for the comment of a header file, and there are certainly
    >>> far more complex cases.  I think anyone who is using these
    >>> primitives is going to have to do some independent reading...
    >>
    >> Maybe a URL or two in the header comments, pointing to relevant
    >> papers for the techniques used?  After all, years from now
    >> someone might be familiar with other techniques from newer papers
    >> and wonder what the techniques in the code are based on.
    > 
    > If there are any academic papers on this topic, I haven't found
    > them.  Mostly, I've found lots of articles written by people who
    > were coding for the Linux kernel, and a lot of those articles are
    > extremely Linux-specific (you could use the smb_rb() macro here,
    > but it's better to instead use this RCU-related macro, because
    > it'll do it for you, blah blah).  I'm happy to link to any sources
    > anyone thinks we ought to link to, but I've mostly had to piece
    > this together bit by bit from blog posts and (sometimes buggy)
    > examples.  It hasn't been a particularly thrilling exercise.
     
    Well, if it really is that hard to piece together the relevant
    techniques, it seems cruel not to check in the results of your
    efforts to work it out somewhere.  Perhaps a README file?
     
    On the other hand, a search turned up these two papers (which I
    haven't yet read, but will):
     
    http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.97.2397&rep=rep1&type=pdf
     
    http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.153.6657&rep=rep1&type=pdf
     
    On a quick scan, they both look promising in themselves, and
    reference a lot of other promising-sounding papers.
     
    -Kevin
    
    
  7. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Gurjeet Singh <singh.gurjeet@gmail.com> — 2011-09-21T20:19:49Z

    On Wed, Sep 14, 2011 at 4:29 PM, Robert Haas <robertmhaas@gmail.com> wrote:
    
    > On Mon, Aug 8, 2011 at 7:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:
    > > I've been thinking about this too and actually went so far as to do
    > > some research and put together something that I hope covers most of
    > > the interesting cases.  The attached patch is pretty much entirely
    > > untested, but reflects my present belief about how things ought to
    > > work.
    >
    > And, here's an updated version, with some of the more obviously broken
    > things fixed.
    >
    
    You declare dummy_spinlock variable as extren and use it, but it is not
    defined anywhere. Wouldn't that be a linker error?
    
    -- 
    Gurjeet Singh
    EnterpriseDB Corporation
    The Enterprise PostgreSQL Company
    
  8. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Robert Haas <robertmhaas@gmail.com> — 2011-09-21T20:21:59Z

    On Wed, Sep 21, 2011 at 4:19 PM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
    > On Wed, Sep 14, 2011 at 4:29 PM, Robert Haas <robertmhaas@gmail.com> wrote:
    >>
    >> On Mon, Aug 8, 2011 at 7:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:
    >> > I've been thinking about this too and actually went so far as to do
    >> > some research and put together something that I hope covers most of
    >> > the interesting cases.  The attached patch is pretty much entirely
    >> > untested, but reflects my present belief about how things ought to
    >> > work.
    >>
    >> And, here's an updated version, with some of the more obviously broken
    >> things fixed.
    >
    > You declare dummy_spinlock variable as extren and use it, but it is not
    > defined anywhere. Wouldn't that be a linker error?
    
    Yeah, we need to add that somewhere, maybe s_lock.c
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  9. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Robert Haas <robertmhaas@gmail.com> — 2011-09-21T20:40:59Z

    On Wed, Sep 21, 2011 at 3:39 PM, Kevin Grittner
    <Kevin.Grittner@wicourts.gov> wrote:
    > Robert Haas <robertmhaas@gmail.com> wrote:
    >> On Wed, Sep 21, 2011 at 2:48 PM, Kevin Grittner
    >> <Kevin.Grittner@wicourts.gov> wrote:
    >>> Robert Haas <robertmhaas@gmail.com> wrote:
    >>>> But even a full explanation of that case seems like almost too
    >>>> much for the comment of a header file, and there are certainly
    >>>> far more complex cases.  I think anyone who is using these
    >>>> primitives is going to have to do some independent reading...
    >>>
    >>> Maybe a URL or two in the header comments, pointing to relevant
    >>> papers for the techniques used?  After all, years from now
    >>> someone might be familiar with other techniques from newer papers
    >>> and wonder what the techniques in the code are based on.
    >>
    >> If there are any academic papers on this topic, I haven't found
    >> them.  Mostly, I've found lots of articles written by people who
    >> were coding for the Linux kernel, and a lot of those articles are
    >> extremely Linux-specific (you could use the smb_rb() macro here,
    >> but it's better to instead use this RCU-related macro, because
    >> it'll do it for you, blah blah).  I'm happy to link to any sources
    >> anyone thinks we ought to link to, but I've mostly had to piece
    >> this together bit by bit from blog posts and (sometimes buggy)
    >> examples.  It hasn't been a particularly thrilling exercise.
    >
    > Well, if it really is that hard to piece together the relevant
    > techniques, it seems cruel not to check in the results of your
    > efforts to work it out somewhere.  Perhaps a README file?
    
    I don't know if techniques is the right word.  I mean, there are three
    questions that you might want to answer here:
    
    1. I have a new architecture and I want barrier.h to support it.  What
    do I need to do?
    2. What is the behavior of the various constructs provided by barrier.h?
    3. Why would I want that behavior and how can I use it to do cool stuff?
    
    I intended the comment in that file to be enough to answer questions
    #1 and #2.  What you and Heikki are asking about is really #3, and
    that seems to me to be setting the bar awfully high.  I mean, lwlock.c
    explains what a lightweight lock does, but it doesn't explain all of
    the ways that a lightweight lock can be used to solve performance
    problems, nor should it.  That would be recapitulating the
    documentation that is hopefully present everywhere that LWLocks are
    used as well as speculating about future applications.  It just
    doesn't seem sensible to me to try to enumerate all the ways that a
    fundamental primitive can potentially be used down the line.
    
    What I found hard about memory barriers is basically this: I didn't
    understand that the CPU is allowed to perform operations out of order.
     And I couldn't understand what the consequences of that fact were.  I
    sort of understood - but hadn't really internalized - the idea that
    execution is highly pipelined, so the single moment at which an
    execution is performed is not well defined.  Before I really got my
    head around it, I had to read the explanations of what a memory
    barrier actually does over and over again.  I probably read ten
    different explanations saying the same thing in different words about
    twenty times a piece, and slowly the light dawned.  I did my best to
    explain that in the existing comment; I'm happy to expand the comment
    if people have suggestions for what to put in there; but basically I
    think this is a hard concept and if you haven't done this stuff before
    it's going to take a while to get up to speed.
    
    > On the other hand, a search turned up these two papers (which I
    > haven't yet read, but will):
    >
    > http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.97.2397&rep=rep1&type=pdf
    >
    > http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.153.6657&rep=rep1&type=pdf
    >
    > On a quick scan, they both look promising in themselves, and
    > reference a lot of other promising-sounding papers.
    
    I'm not sure these are much help for learning how to program with
    memory barriers, but if somebody really wants them (or something else)
    included, I'm not going to fight too hard.  I don't expect this to be
    perfect the first time through; I am just trying to get a basic
    infrastructure in place.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  10. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Kevin Grittner <kevin.grittner@wicourts.gov> — 2011-09-21T21:07:51Z

    Robert Haas <robertmhaas@gmail.com> wrote:
     
    > there are three questions that you might want to answer here:
    > 
    > 1. I have a new architecture and I want barrier.h to support it. 
    >    What do I need to do?
    > 2. What is the behavior of the various constructs provided by
    >    barrier.h?
    > 3. Why would I want that behavior and how can I use it to do cool
    >    stuff?
    > 
    > I intended the comment in that file to be enough to answer
    > questions #1 and #2.  What you and Heikki are asking about is
    > really #3, and that seems to me to be setting the bar awfully
    > high.
     
    OK, put that way, I mostly agree.  A general overview of the known
    usage patterns in a header or README file still doesn't seem out of
    line to me.  With LW locks, for example, I've only seen two patterns
    used in PostgreSQL:
     
    (1) Get a shared lock for reads and an exclusive lock for writes, or
    (2) get a shared lock to read or write your own data, but an
    exclusive lock to read anyone else's data.
     
    In both cases, there must be a defined order of lock acquisition to
    avoid deadlock, with a strong recommendation that locks be released
    in the reverse order.  Mentioning that much doesn't preclude other
    uses of LW locks, but might help people new to the code.  That's the
    level of summary *I* would like to see included.
     
    > What I found hard about memory barriers is basically this: I
    > didn't understand that the CPU is allowed to perform operations
    > out of order.  And I couldn't understand what the consequences of
    > that fact were.  I sort of understood - but hadn't really
    > internalized - the idea that execution is highly pipelined, so the
    > single moment at which an execution is performed is not well
    > defined.  Before I really got my head around it, I had to read the
    > explanations of what a memory barrier actually does over and over
    > again.  I probably read ten different explanations saying the same
    > thing in different words about twenty times a piece, and slowly
    > the light dawned.  I did my best to explain that in the existing
    > comment; I'm happy to expand the comment if people have
    > suggestions for what to put in there; but basically I think this
    > is a hard concept and if you haven't done this stuff before it's
    > going to take a while to get up to speed.
     
    That's the sort of thing where it would be helpful to provide one or
    two URLs for cogent explanations of this.  Even if it takes repeated
    readings and meditations on the explanations for it to sink in, this
    is worth it.  (For SSI I had to read the paper many times, and then
    go read several referenced papers, before I really had my head
    around it, and I've had others say the same thing.  But having a
    link to the material gives someone a chance to *do* that.)
     
    -Kevin
    
    
  11. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Robert Haas <robertmhaas@gmail.com> — 2011-09-22T13:55:35Z

    On Wed, Sep 21, 2011 at 5:07 PM, Kevin Grittner
    <Kevin.Grittner@wicourts.gov> wrote:
    > That's the sort of thing where it would be helpful to provide one or
    > two URLs for cogent explanations of this.  Even if it takes repeated
    > readings and meditations on the explanations for it to sink in, this
    > is worth it.  (For SSI I had to read the paper many times, and then
    > go read several referenced papers, before I really had my head
    > around it, and I've had others say the same thing.  But having a
    > link to the material gives someone a chance to *do* that.)
    
    Hmm....
    
    <looks around the Internet some more>
    
    These might be a good place to start, although the first one is
    somewhat Linux-kernel specific:
    
    http://www.rdrop.com/users/paulmck/scalability/paper/ordering.2007.09.19a.pdf
    http://www.rdrop.com/users/paulmck/scalability/paper/whymb.2010.06.07c.pdf
    
    There's also a reasonably cogent explanation in the Linux kernel
    itself, in Documentation/memory-barriers.txt
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  12. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Peter Geoghegan <peter@2ndquadrant.com> — 2011-09-22T14:53:22Z

    On 14 September 2011 21:29, Robert Haas <robertmhaas@gmail.com> wrote:
    > On Mon, Aug 8, 2011 at 7:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:
    >> I've been thinking about this too and actually went so far as to do
    >> some research and put together something that I hope covers most of
    >> the interesting cases.  The attached patch is pretty much entirely
    >> untested, but reflects my present belief about how things ought to
    >> work.
    >
    > And, here's an updated version, with some of the more obviously broken
    > things fixed.
    
    As I've already pointed out, the comment "Won't work on Visual Studio
    2003" is not accurate:
    
    http://msdn.microsoft.com/en-us/library/f20w0x5e(v=vs.71).aspx
    
    Besides, if it's not supported, why bother mentioning it?
    
    -- 
    Peter Geoghegan       http://www.2ndQuadrant.com/
    PostgreSQL Development, 24x7 Support, Training and Services
    
    
  13. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Robert Haas <robertmhaas@gmail.com> — 2011-09-22T15:18:47Z

    On Thu, Sep 22, 2011 at 10:53 AM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
    > As I've already pointed out, the comment "Won't work on Visual Studio
    > 2003" is not accurate:
    >
    > http://msdn.microsoft.com/en-us/library/f20w0x5e(v=vs.71).aspx
    >
    > Besides, if it's not supported, why bother mentioning it?
    
    I mentioned it because it took me a long time to figure out whether it
    was supported or not, and I finally came to the conclusion that it
    wasn't.  I stand corrected, though; I've now removed that reference.
    Sorry for not jumping on it sooner; it was still vaguely on my list of
    things to fix at some point, but it hadn't percolated to the top yet.
    
    The attached version (hopefully) fixes various other things people
    have complained about as well, including:
    
    - Heikki's complaint about sometimes writing atomic instead of barrier
    (which was leftovers), and
    - Gurjeet's complaint that I hadn't defined the variable anywhere
    
    I've also added a lengthy README file to the patch that attempts to
    explain how barriers should be used in PostgreSQL coding.  It's
    certainly not a comprehensive treatment of the topic, but hopefully
    it's enough to get people oriented.  I've attempted to tailor it a bit
    to PostgreSQL conventions, like talking about shared memory vs.
    backend-private memory instead of assuming (as a number of other
    discussions of this topic do) a thread model.  It also includes some
    advice about when memory barriers shouldn't be used or won't work, and
    some references for further reading.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
  14. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Thom Brown <thom@linux.com> — 2011-09-22T15:25:33Z

    On 22 September 2011 16:18, Robert Haas <robertmhaas@gmail.com> wrote:
    > On Thu, Sep 22, 2011 at 10:53 AM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
    >> As I've already pointed out, the comment "Won't work on Visual Studio
    >> 2003" is not accurate:
    >>
    >> http://msdn.microsoft.com/en-us/library/f20w0x5e(v=vs.71).aspx
    >>
    >> Besides, if it's not supported, why bother mentioning it?
    >
    > I mentioned it because it took me a long time to figure out whether it
    > was supported or not, and I finally came to the conclusion that it
    > wasn't.  I stand corrected, though; I've now removed that reference.
    > Sorry for not jumping on it sooner; it was still vaguely on my list of
    > things to fix at some point, but it hadn't percolated to the top yet.
    >
    > The attached version (hopefully) fixes various other things people
    > have complained about as well, including:
    >
    > - Heikki's complaint about sometimes writing atomic instead of barrier
    > (which was leftovers), and
    > - Gurjeet's complaint that I hadn't defined the variable anywhere
    >
    > I've also added a lengthy README file to the patch that attempts to
    > explain how barriers should be used in PostgreSQL coding.  It's
    > certainly not a comprehensive treatment of the topic, but hopefully
    > it's enough to get people oriented.  I've attempted to tailor it a bit
    > to PostgreSQL conventions, like talking about shared memory vs.
    > backend-private memory instead of assuming (as a number of other
    > discussions of this topic do) a thread model.  It also includes some
    > advice about when memory barriers shouldn't be used or won't work, and
    > some references for further reading.
    
    s/visca-versa/vice-versa/
    s/laods/loads/
    
    -- 
    Thom Brown
    Twitter: @darkixion
    IRC (freenode): dark_ixion
    Registered Linux user: #516935
    
    EnterpriseDB UK: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  15. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Robert Haas <robertmhaas@gmail.com> — 2011-09-22T15:31:36Z

    On Thu, Sep 22, 2011 at 11:25 AM, Thom Brown <thom@linux.com> wrote:
    > s/visca-versa/vice-versa/
    > s/laods/loads/
    
    Fixed.  v4 attached.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
  16. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Alvaro Herrera <alvherre@commandprompt.com> — 2011-09-22T15:44:58Z

    Excerpts from Robert Haas's message of jue sep 22 12:18:47 -0300 2011:
    
    > I've also added a lengthy README file to the patch that attempts to
    > explain how barriers should be used in PostgreSQL coding.
    
    Very enlightening, thanks.  Note a typo "laods".
    
    -- 
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
    
    
  17. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Kevin Grittner <kevin.grittner@wicourts.gov> — 2011-09-22T17:48:38Z

    Robert Haas <robertmhaas@gmail.com> wrote:
     
    > I've also added a lengthy README file to the patch that attempts
    > to explain how barriers should be used in PostgreSQL coding.  It's
    > certainly not a comprehensive treatment of the topic, but
    > hopefully it's enough to get people oriented.  I've attempted to
    > tailor it a bit to PostgreSQL conventions, like talking about
    > shared memory vs.backend-private memory instead of assuming (as a
    > number of other discussions of this topic do) a thread model.  It
    > also includes some advice about when memory barriers shouldn't be
    > used or won't work, and some references for further reading.
     
    Thanks, that seems like it's at the right level of detail to me.
     
    -Kevin
    
    
  18. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Jeff Davis <pgsql@j-davis.com> — 2011-09-22T21:45:50Z

    On Thu, 2011-09-22 at 11:31 -0400, Robert Haas wrote:
    > On Thu, Sep 22, 2011 at 11:25 AM, Thom Brown <thom@linux.com> wrote:
    > > s/visca-versa/vice-versa/
    > > s/laods/loads/
    > 
    > Fixed.  v4 attached.
    > 
    
    Can you please explain the "more subtly" part below?
    
    +A common pattern where this actually does result in a bug is when
    adding items
    +onto a queue.  The writer does this:
    +
    +    q->items[q->num_items] = new_item;
    +    ++q->num_items;
    +
    +The reader does this:
    +
    +    num_items = q->num_items;
    +    for (i = 0; i < num_items; ++i)
    +        /* do something with q->items[i] */
    +
    +This code turns out to be unsafe, because the writer might increment
    +q->num_items before it finishes storing the new item into the
    appropriate slot.
    +More subtly, the reader might prefetch the contents of the q->items
    array
    +before reading q->num_items.
    
    How would the reader prefetch the contents of the items array, without
    knowing how big it is?
    
    Regards,
    	Jeff Davis
    
    
    
  19. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Robert Haas <robertmhaas@gmail.com> — 2011-09-22T23:12:41Z

    On Thu, Sep 22, 2011 at 5:45 PM, Jeff Davis <pgsql@j-davis.com> wrote:
    > +This code turns out to be unsafe, because the writer might increment
    > +q->num_items before it finishes storing the new item into the
    > appropriate slot.
    > +More subtly, the reader might prefetch the contents of the q->items
    > array
    > +before reading q->num_items.
    >
    > How would the reader prefetch the contents of the items array, without
    > knowing how big it is?
    
    By guessing or (I think) just by having a stale value left over in
    some CPU cache.  It's pretty mind-bending, but it's for real.
    
    I didn't, in either the implementation or the documentation, go much
    into the difference between dependency barriers and general read
    barriers.  We might need to do that at some point, but for a first
    version I don't think it's necessary.  But since you asked... as I
    understand it, unless you're running on Alpha, you actually don't need
    a barrier here at all, because all currently-used CPUs other than
    alpha "respect data dependencies", which means that if q->num_items is
    used to compute an address to be read from memory, the CPU will ensure
    that the read of that address is performed after the read of the value
    used to compute the address.  At least that's my understanding.  But
    Alpha won't.
    
    So we could try to further distinguish between read barriers where a
    data dependency is present and read barriers where no data dependency
    is present, and the latter type could be a no-op on all CPUs other
    than Alpha.  Or we could even jettison support for Alpha altogether if
    we think it's hopelessly obsolete and omit
    read-barriers-with-dependency altogether.  I think that would be a bad
    idea, though.  First, it's not impossible that some future CPU could
    have behavior similar to Alpha, and the likelihood of such a thing is
    substantially more because of the fact that the Linux kernel, which
    seems to be the gold standard in this area, still supports them.  If
    we don't record places where a dependency barrier would be needed and
    then need to go find them later, that will be a lot more work, and a
    lot more error-prone.  Second, there's a natural pairing between read
    barriers and write barriers.  Generally, for every algorithm, each
    write barrier on the write side should be matched by a read barrier on
    the read side.  So putting them all in will make it easier to verify
    code correctness.  Now, if we find down the line that some of those
    read barriers are hurting our performance on, say, Itanium, or
    PowerPC, then we can certainly consider distinguishing further.  But
    for round one I'm voting for not worrying about it.  I think it's
    going to be a lot more important to put our energy into (1) adding
    barrier implementations for any platforms that aren't included in this
    initial patch that we want to support, (2) making sure that all of our
    implementations actually work, and (3) making sure that the algorithms
    that use them are correct.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  20. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Jeff Davis <pgsql@j-davis.com> — 2011-09-22T23:46:17Z

    On Thu, 2011-09-22 at 19:12 -0400, Robert Haas wrote:
    > But since you asked... as I
    > understand it, unless you're running on Alpha, you actually don't need
    > a barrier here at all, because all currently-used CPUs other than
    > alpha "respect data dependencies", which means that if q->num_items is
    > used to compute an address to be read from memory, the CPU will ensure
    > that the read of that address is performed after the read of the value
    > used to compute the address.  At least that's my understanding.  But
    > Alpha won't.
    
    I'm still trying to figure out how it's even possible to read an address
    that's not computed yet. Something sounds strange about that...
    
    I think it might have more to do with branch prediction or something
    else. In your example, the address is not computed from q->num_items
    directly, it's computed using "i". But that branch being followed is
    dependent on a comparison with q->num_items. Maybe that's the dependency
    that's not respected?
    
    Regards,
    	Jeff Davis
    
    
    
  21. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Robert Haas <robertmhaas@gmail.com> — 2011-09-23T01:11:54Z

    On Thu, Sep 22, 2011 at 7:46 PM, Jeff Davis <pgsql@j-davis.com> wrote:
    > On Thu, 2011-09-22 at 19:12 -0400, Robert Haas wrote:
    >> But since you asked... as I
    >> understand it, unless you're running on Alpha, you actually don't need
    >> a barrier here at all, because all currently-used CPUs other than
    >> alpha "respect data dependencies", which means that if q->num_items is
    >> used to compute an address to be read from memory, the CPU will ensure
    >> that the read of that address is performed after the read of the value
    >> used to compute the address.  At least that's my understanding.  But
    >> Alpha won't.
    >
    > I'm still trying to figure out how it's even possible to read an address
    > that's not computed yet. Something sounds strange about that...
    
    That's because it's strange.  You might have a look at
    http://www.linuxjournal.com/article/8212
    
    Basically, it seems like on Alpha, the CPU is allowed to do pretty
    much anything short of entirely fabricating the value that gets
    returned.
    
    > I think it might have more to do with branch prediction or something
    > else. In your example, the address is not computed from q->num_items
    > directly, it's computed using "i". But that branch being followed is
    > dependent on a comparison with q->num_items. Maybe that's the dependency
    > that's not respected?
    
    You might be right.  I can't swear I understand exactly what goes
    wrong there; in fact I'm not 100% sure that you don't need a
    read-barrier on things less crazy than Alpha.  I speculate that the
    problem is something this: q->num_items is in some cache line and all
    the elements of q->items is in some other cache line, and you see that
    you're about to use both of those so you suck the cache lines into
    memory.  But because one cache bank is busier than the other, you get
    q->items first.  And between the time you get the cache line
    containing q->items and the time you get the cache line containing
    q->num_items, someone insert an item into the queue, and now you're
    hosed, because you have the old array contents with the new array
    length.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  22. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Greg Stark <stark@mit.edu> — 2011-09-23T15:22:09Z

    On Thu, Sep 22, 2011 at 10:45 PM, Jeff Davis <pgsql@j-davis.com> wrote:
    > +    for (i = 0; i < num_items; ++i)
    > +        /* do something with q->items[i] */
    > +
    > +This code turns out to be unsafe, because the writer might increment
    > +q->num_items before it finishes storing the new item into the
    > appropriate slot.
    > +More subtly, the reader might prefetch the contents of the q->items
    > array
    > +before reading q->num_items.
    >
    > How would the reader prefetch the contents of the items array, without
    > knowing how big it is?
    
    I don't think this is as mysterious as it sounds. Imagine the compiler
    unrolled the loop so that it does something like
    
    fetch num_items into register
    if register > 0 fetch q[0], process it
    if register > 1 fetch q[1], process it
    ...
    
    Then the cpu could easily do branch prediction on the ifs and fetch
    q[0] while the fetch of num_items was still in progress. If it turns
    out that num_items is 0 then it would invalidate the operations
    initiated after the branch prediction but if it sees 1 (ie after the
    increment in the other process) then it's not so it doesn't.
    
    So you have two memory fetches which I guess I still imagine have to
    be initiated in the right order but they're both in flight at the same
    time. I have no idea how the memory controller works and I could
    easily imagine either one grabbing the value before the other.
    
    I'm not even clear how other processors can reasonably avoid this. It
    must be fantastically expensive to keep track of which branch
    predictions depended on which registers and which memory fetches the
    value of those registers depended on. And then it would have  to
    somehow inform the memory controller of those old memory fetches that
    this new memory fetch is dependent on and have it ensure that the
    fetches are read the right order?
    
    -- 
    greg
    
    
  23. Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Robert Haas <robertmhaas@gmail.com> — 2011-09-23T22:02:31Z

    On Thu, Sep 22, 2011 at 11:31 AM, Robert Haas <robertmhaas@gmail.com> wrote:
    > On Thu, Sep 22, 2011 at 11:25 AM, Thom Brown <thom@linux.com> wrote:
    >> s/visca-versa/vice-versa/
    >> s/laods/loads/
    >
    > Fixed.  v4 attached.
    
    Since it seems like people are fairly happy with this now, I've gone
    ahead and committed this version.  I suspect there are bugs, but I
    don't think we're going to get much further until we actually try to
    use this for something.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  24. Re: Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Martijn van Oosterhout <kleptog@svana.org> — 2011-09-24T13:45:26Z

    On Fri, Sep 23, 2011 at 04:22:09PM +0100, Greg Stark wrote:
    > So you have two memory fetches which I guess I still imagine have to
    > be initiated in the right order but they're both in flight at the same
    > time. I have no idea how the memory controller works and I could
    > easily imagine either one grabbing the value before the other.
    
    That's easy. If one is in cache and the other isn't then the results
    will come back out of order.
    
    > I'm not even clear how other processors can reasonably avoid this. It
    > must be fantastically expensive to keep track of which branch
    > predictions depended on which registers and which memory fetches the
    > value of those registers depended on. And then it would have  to
    > somehow inform the memory controller of those old memory fetches that
    > this new memory fetch is dependent on and have it ensure that the
    > fetches are read the right order?
    
    I think memory accesses are also fantastically expensive, so it's worth
    some effort to optimise that.
    
    I found the Linux kernel document on this topic quite readable. I think
    the main lesson here is that processors track data dependancies (other
    than the Alpha apparently), but not control dependancies.  So in the
    example, the value of i is dependant on num_items, but not via any
    calculation.  IThat control dependancies are not tracked makes some
    sense, since branches depend on flags bit, and just about any
    calculation changes the flag bits, but most of the time these changes
    are not used.
    
    It also not a question of the knowing the address either, since the
    first load, if any, will be *q->items, irrespective of the precise
    value of num_items.  This address may be calculated long in advance.
    
    Have a nice day,
    -- 
    Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
    > He who writes carelessly confesses thereby at the very outset that he does
    > not attach much importance to his own thoughts.
       -- Arthur Schopenhauer
    
  25. Re: Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Robert Haas <robertmhaas@gmail.com> — 2011-09-24T16:46:48Z

    On Sat, Sep 24, 2011 at 9:45 AM, Martijn van Oosterhout
    <kleptog@svana.org> wrote:
    > I think memory accesses are also fantastically expensive, so it's worth
    > some effort to optimise that.
    
    This is definitely true.
    
    > I found the Linux kernel document on this topic quite readable. I think
    > the main lesson here is that processors track data dependancies (other
    > than the Alpha apparently), but not control dependancies.  So in the
    > example, the value of i is dependant on num_items, but not via any
    > calculation.  IThat control dependancies are not tracked makes some
    > sense, since branches depend on flags bit, and just about any
    > calculation changes the flag bits, but most of the time these changes
    > are not used.
    
    Oh, that's interesting.  So that implies that a read-barrier would be
    needed here even on non-Alpha.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  26. Re: Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

    Martijn van Oosterhout <kleptog@svana.org> — 2011-09-24T17:37:52Z

    On Sat, Sep 24, 2011 at 12:46:48PM -0400, Robert Haas wrote:
    > > I found the Linux kernel document on this topic quite readable. I think
    > > the main lesson here is that processors track data dependancies (other
    > > than the Alpha apparently), but not control dependancies.  So in the
    > > example, the value of i is dependant on num_items, but not via any
    > > calculation.  IThat control dependancies are not tracked makes some
    > > sense, since branches depend on flags bit, and just about any
    > > calculation changes the flag bits, but most of the time these changes
    > > are not used.
    > 
    > Oh, that's interesting.  So that implies that a read-barrier would be
    > needed here even on non-Alpha.
    
    That is my understanding. At source code level the address being
    referenced is dependant on i, but at assembly level it's possible i has
    been optimised away altogether.
    
    I think the relevent example is here:
    http://www.mjmwired.net/kernel/Documentation/memory-barriers.txt (line 725)
    
    Where A = q->items[0] and B = q->num_items.
    
    There is no data dependancy here, so inserting such a barrier won't
    help. You need a normal read barrier.
    
    OTOH, if the list already has an entry in it, the problem (probably)
    goes away, although with loop unrolling you can't really be sure.
    
    Have a nice day,
    -- 
    Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
    > He who writes carelessly confesses thereby at the very outset that he does
    > not attach much importance to his own thoughts.
       -- Arthur Schopenhauer