Thread

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Introduce compact WAL record for the common case of commit (non-DDL).

  1. WALInsertLock tuning

    Simon Riggs <simon@2ndquadrant.com> — 2011-06-06T16:15:28Z

    In earlier discussions of how to improve WALInsertLock contention, it
    was observed that we must zero each new page before we advance the WAL
    insertion point.
    http://postgresql.1045698.n5.nabble.com/Reworking-WAL-locking-td1983647.html
    
    IMHO the page zeroing is completely unnecessary, and replication works
    perfectly well without that (as a test of recovery logic). It is
    unnecessary because we already allow non-zeroed parts of WAL files,
    and provide a mechanism to detect stale data.
    
    The following trivial patch removes the page zeroing, which reduces
    the lock duration.
    
    Comments?
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
  2. Re: WALInsertLock tuning

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-06-06T16:47:44Z

    Simon Riggs <simon@2ndQuadrant.com> writes:
    > In earlier discussions of how to improve WALInsertLock contention, it
    > was observed that we must zero each new page before we advance the WAL
    > insertion point.
    > http://postgresql.1045698.n5.nabble.com/Reworking-WAL-locking-td1983647.html
    
    > IMHO the page zeroing is completely unnecessary,
    
    I don't believe it's "completely unnecessary".  It does in fact offer
    additional protection against mistakenly taking stale data as valid.
    You could maybe argue that the degree of safety increase isn't
    sufficient to warrant the cost of zeroing the page, but you've not
    offered any quantification of either the risk or the cost savings.
    
    			regards, tom lane
    
    
  3. Re: WALInsertLock tuning

    Simon Riggs <simon@2ndquadrant.com> — 2011-06-06T18:27:50Z

    On Mon, Jun 6, 2011 at 5:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > Simon Riggs <simon@2ndQuadrant.com> writes:
    >> In earlier discussions of how to improve WALInsertLock contention, it
    >> was observed that we must zero each new page before we advance the WAL
    >> insertion point.
    >> http://postgresql.1045698.n5.nabble.com/Reworking-WAL-locking-td1983647.html
    >
    >> IMHO the page zeroing is completely unnecessary,
    >
    > I don't believe it's "completely unnecessary".  It does in fact offer
    > additional protection against mistakenly taking stale data as valid.
    > You could maybe argue that the degree of safety increase isn't
    > sufficient to warrant the cost of zeroing the page, but you've not
    > offered any quantification of either the risk or the cost savings.
    
    If we did ever reference stale data, it would need to have a value of
    xl_prev that exactly matched what we expected AND would also need a
    CRC that exactly matched also. That would be fairly difficult to
    arrange deliberately and pretty close to zero chance of happening
    otherwise.
    
    But that even assumes we write the unzeroed data at the end of the
    buffer. We don't. We only write data up to the end of the WAL record
    on the current page, unless we do a continuation record, which means
    only replay we would read in another page and check page headers. So
    for this to cause an error, we'd have to have an overall matching CRC,
    a matching xl_prev and at least one matching page header, which
    contains a pageaddress.
    
    But that even assumes we would read data in a different way to which
    it was written.
    
    So the only way we could ever reference the stale data is if the WAL
    reading code didn't match the WAL writing code (1) because of a bug or
    (2) because of a corrupt pg_control record that caused random access
    to an otherwise unreachable part of WAL
    AND
    the CRC matched, and the xl_prev matched and the next page header matched.
    
    Risk == zero. If it wasn't zero I would even mention it because this
    is not a trade-off optimization.
    
    Cost savings are those already identified by yourself in our previous
    discussion on this, link given upthread. It's the biggest single
    removable item from within WALInsertLock. We can measure them if you
    like, but I don't see the value in that. It will clearly be a useful
    saving on what we all agree is a heavily contended lock.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  4. Re: WALInsertLock tuning

    Jeff Janes <jeff.janes@gmail.com> — 2011-06-06T21:09:35Z

    On Mon, Jun 6, 2011 at 11:27 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
    >
    > But that even assumes we write the unzeroed data at the end of the
    > buffer. We don't. We only write data up to the end of the WAL record
    > on the current page, unless we do a continuation record,
    
    I see no codepath in XLogWrite which writes anything other than full
    buffer pages.
    
    Cheers,
    
    Jeff
    
    
  5. Re: WALInsertLock tuning

    Simon Riggs <simon@2ndquadrant.com> — 2011-06-06T22:05:59Z

    On Mon, Jun 6, 2011 at 10:09 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
    > On Mon, Jun 6, 2011 at 11:27 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
    >>
    >> But that even assumes we write the unzeroed data at the end of the
    >> buffer. We don't. We only write data up to the end of the WAL record
    >> on the current page, unless we do a continuation record,
    >
    > I see no codepath in XLogWrite which writes anything other than full
    > buffer pages.
    
    Second line, boolean variable called "ispartialpage".
    
    As I mentioned, even if spare bytes at the end of page were written,
    they aren't ever read except in corner case bugs that would be trapped
    by other logic put there to protect us.
    
    We're safe. If I didn't believe it and hadn't tested it, I wouldn't speak.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  6. Re: WALInsertLock tuning

    Robert Haas <robertmhaas@gmail.com> — 2011-06-06T22:25:08Z

    On Mon, Jun 6, 2011 at 6:05 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
    > On Mon, Jun 6, 2011 at 10:09 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
    >> On Mon, Jun 6, 2011 at 11:27 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
    >>>
    >>> But that even assumes we write the unzeroed data at the end of the
    >>> buffer. We don't. We only write data up to the end of the WAL record
    >>> on the current page, unless we do a continuation record,
    >>
    >> I see no codepath in XLogWrite which writes anything other than full
    >> buffer pages.
    >
    > Second line, boolean variable called "ispartialpage".
    
    That affects our tracking of the write and flush positions, but not
    what we actually write to the file:
    
                            /* OK to write the page(s) */
                            from = XLogCtl->pages + startidx * (Size) XLOG_BLCKSZ;
                            nbytes = npages * (Size) XLOG_BLCKSZ;
                            errno = 0;
                            if (write(openLogFile, from, nbytes) != nbytes)
    
    As you can see, nbytes is always a multiple of XLOG_BLCKSZ.
    
    > As I mentioned, even if spare bytes at the end of page were written,
    > they aren't ever read except in corner case bugs that would be trapped
    > by other logic put there to protect us.
    >
    > We're safe. If I didn't believe it and hadn't tested it, I wouldn't speak.
    
    I think you need to do some analysis of how much this actually
    improves performance.  I don't like the idea of applying performance
    patches without performance testing; sometimes the results are
    counterintuitive.  And even when they're not, it's nice to know how
    much you've gained.
    
    As to the question of whether it's safe, I think I'd agree that the
    chances of this backfiring are pretty remote.  I think that with the
    zeroing they are exactly zero, because (now that we start XLOG
    positions at 0/1) there is now way that xl_prev = {0, 0} can ever be
    valid.  Without the zeroing, well, it's remotely possible that xl_prev
    could happen to appear valid and that xl_crc could happen to match...
    but the chances are presumably quite remote.  Just the chances of the
    CRC matching should be around 2^-32.  The chances of an xl_prev match
    are harder to predict, because the matching values for CRCs should be
    pretty much uniformly distributed, while xl_prev isn't random.  But
    even given that the chance of a match is should be very small, so in
    practice there is likely no harm.  It strikes me, though, that we
    could probably get nearly all of the benefit of this patch by being
    willing to zero the first sizeof(XLogRecord) bytes following a record,
    but not the rest of the buffer.  That would pretty much wipe out any
    chance of an xl_prev match, I think, and would likely still get nearly
    all of the performance benefit.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  7. Re: WALInsertLock tuning

    Simon Riggs <simon@2ndquadrant.com> — 2011-06-07T07:21:32Z

    On Mon, Jun 6, 2011 at 11:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
    
    > As to the question of whether it's safe, I think I'd agree that the
    > chances of this backfiring are pretty remote.  I think that with the
    > zeroing they are exactly zero, because (now that we start XLOG
    > positions at 0/1) there is now way that xl_prev = {0, 0} can ever be
    > valid.  Without the zeroing, well, it's remotely possible that xl_prev
    > could happen to appear valid and that xl_crc could happen to match...
    > but the chances are presumably quite remote.  Just the chances of the
    > CRC matching should be around 2^-32.  The chances of an xl_prev match
    > are harder to predict, because the matching values for CRCs should be
    > pretty much uniformly distributed, while xl_prev isn't random.  But
    > even given that the chance of a match is should be very small, so in
    > practice there is likely no harm.
    
    And if such a thing did actually happen we would also need to have an
    accidentally correct value of all of the rest of the header values.
    And even if we did we would apply at most one junk WAL record. Then we
    are onto the next WAL record where we would need have to the same luck
    all over again.
    
    The probability of these occurrences is well below the acceptable
    threshold for other problems occurring.
    
    > It strikes me, though, that we
    > could probably get nearly all of the benefit of this patch by being
    > willing to zero the first sizeof(XLogRecord) bytes following a record,
    > but not the rest of the buffer.  That would pretty much wipe out any
    > chance of an xl_prev match, I think, and would likely still get nearly
    > all of the performance benefit.
    
    Which adds something onto the path of every XlogInsert(), rather than
    once per page, so I'm a little hesitant to agree.
    
    If we did that, we would only need to write out an additional 12 bytes
    per WAL record, not the full sizeof(XLogRecord).
    
    But even so, I think its wasted effort.
    
    Measuring the benefit of a performance patch is normal, but I'm not
    proposing this as a risk trade-off. It's just a straight removal of
    multiple cycles from a hot code path. The exact benefit will depend
    upon whether the WALInsertLock is the hot lock, which it likely will
    be when other patches are applied.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  8. Re: WALInsertLock tuning

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-06-07T07:27:27Z

    On 07.06.2011 10:21, Simon Riggs wrote:
    > On Mon, Jun 6, 2011 at 11:25 PM, Robert Haas<robertmhaas@gmail.com>  wrote:
    >> It strikes me, though, that we
    >> could probably get nearly all of the benefit of this patch by being
    >> willing to zero the first sizeof(XLogRecord) bytes following a record,
    >> but not the rest of the buffer.  That would pretty much wipe out any
    >> chance of an xl_prev match, I think, and would likely still get nearly
    >> all of the performance benefit.
    >
    > Which adds something onto the path of every XlogInsert(), rather than
    > once per page, so I'm a little hesitant to agree.
    
    You would only need to do it just before you write out the WAL. I guess 
    you'd need to grab WALInsertLock in XLogWrite() to prevent more WAL 
    records from being inserted on the page until you're done zeroing it, 
    though.
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
    
  9. Re: WALInsertLock tuning

    Simon Riggs <simon@2ndquadrant.com> — 2011-06-07T07:55:37Z

    On Tue, Jun 7, 2011 at 8:27 AM, Heikki Linnakangas
    <heikki.linnakangas@enterprisedb.com> wrote:
    > On 07.06.2011 10:21, Simon Riggs wrote:
    >>
    >> On Mon, Jun 6, 2011 at 11:25 PM, Robert Haas<robertmhaas@gmail.com>
    >>  wrote:
    >>>
    >>> It strikes me, though, that we
    >>> could probably get nearly all of the benefit of this patch by being
    >>> willing to zero the first sizeof(XLogRecord) bytes following a record,
    >>> but not the rest of the buffer.  That would pretty much wipe out any
    >>> chance of an xl_prev match, I think, and would likely still get nearly
    >>> all of the performance benefit.
    >>
    >> Which adds something onto the path of every XlogInsert(), rather than
    >> once per page, so I'm a little hesitant to agree.
    >
    > You would only need to do it just before you write out the WAL. I guess
    > you'd need to grab WALInsertLock in XLogWrite() to prevent more WAL records
    > from being inserted on the page until you're done zeroing it, though.
    
    How would that help?
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  10. Re: WALInsertLock tuning

    Robert Haas <robertmhaas@gmail.com> — 2011-06-07T12:24:47Z

    On Tue, Jun 7, 2011 at 3:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
    >> It strikes me, though, that we
    >> could probably get nearly all of the benefit of this patch by being
    >> willing to zero the first sizeof(XLogRecord) bytes following a record,
    >> but not the rest of the buffer.  That would pretty much wipe out any
    >> chance of an xl_prev match, I think, and would likely still get nearly
    >> all of the performance benefit.
    >
    > Which adds something onto the path of every XlogInsert(), rather than
    > once per page, so I'm a little hesitant to agree.
    
    Urk.  Well, we don't want that, for sure.   The previous discussion
    was talking about moving the zeroing around somehow, rather than
    getting rid of it, so maybe there's some way to make it work...
    
    One other thought is that I think that this patch might cause a
    user-visible behavior change.  Right now, when you hit the end of
    recovery, you most typically get a message saying - record with zero
    length.  Not always, but often.  If we adopt this approach, you'll get
    a wider variety of error messages there, depending on exactly how the
    new record fails validation.  I dunno if that's important to be worth
    caring about, or not.
    
    > If we did that, we would only need to write out an additional 12 bytes
    > per WAL record, not the full sizeof(XLogRecord).
    >
    > But even so, I think its wasted effort.
    >
    > Measuring the benefit of a performance patch is normal, but I'm not
    > proposing this as a risk trade-off. It's just a straight removal of
    > multiple cycles from a hot code path. The exact benefit will depend
    > upon whether the WALInsertLock is the hot lock, which it likely will
    > be when other patches are applied.
    
    I don't think it's too hard to construct a test case where it is, even
    now.  pgbench on a medium-sized machine ought to do it, with
    synchronous_commit turned off.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  11. Re: WALInsertLock tuning

    Simon Riggs <simon@2ndquadrant.com> — 2011-06-07T12:54:11Z

    On Tue, Jun 7, 2011 at 1:24 PM, Robert Haas <robertmhaas@gmail.com> wrote:
    
    > One other thought is that I think that this patch might cause a
    > user-visible behavior change.  Right now, when you hit the end of
    > recovery, you most typically get a message saying - record with zero
    > length.  Not always, but often.  If we adopt this approach, you'll get
    > a wider variety of error messages there, depending on exactly how the
    > new record fails validation.  I dunno if that's important to be worth
    > caring about, or not.
    
    Not.
    
    We've never said what the message would be, only that it would fail.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  12. Re: WALInsertLock tuning

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-06-07T15:36:29Z

    On 07.06.2011 10:55, Simon Riggs wrote:
    > On Tue, Jun 7, 2011 at 8:27 AM, Heikki Linnakangas
    > <heikki.linnakangas@enterprisedb.com>  wrote:
    >> You would only need to do it just before you write out the WAL. I guess
    >> you'd need to grab WALInsertLock in XLogWrite() to prevent more WAL records
    >> from being inserted on the page until you're done zeroing it, though.
    >
    > How would that help?
    
    It doesn't matter whether the pages are zeroed while they sit in memory. 
    And if you write a full page of WAL data, any wasted bytes at the end of 
    the page don't matter, because they're ignored at replay anyway. The 
    possibility of mistaking random garbage for valid WAL only occurs when 
    we write a partial WAL page to disk. So, it is enough to zero the 
    remainder of the partial WAL page (or just the next few words) when we 
    write it out.
    
    That's a lot cheaper than fully zeroing every page. (except for the fact 
    that you'd need to hold WALInsertLock while you do it)
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
    
  13. Re: WALInsertLock tuning

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-06-07T15:57:37Z

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
    > On 07.06.2011 10:55, Simon Riggs wrote:
    >> How would that help?
    
    > It doesn't matter whether the pages are zeroed while they sit in memory. 
    > And if you write a full page of WAL data, any wasted bytes at the end of 
    > the page don't matter, because they're ignored at replay anyway. The 
    > possibility of mistaking random garbage for valid WAL only occurs when 
    > we write a partial WAL page to disk. So, it is enough to zero the 
    > remainder of the partial WAL page (or just the next few words) when we 
    > write it out.
    
    > That's a lot cheaper than fully zeroing every page. (except for the fact 
    > that you'd need to hold WALInsertLock while you do it)
    
    I think avoiding the need to hold both locks at once is probably exactly
    why the zeroing was done where it is.
    
    An interesting alternative is to have XLogInsert itself just plop down a
    few more zeroes immediately after the record it's inserted, before it
    releases WALInsertLock.  This will be redundant work once the next
    record gets added, but it's cheap enough to not matter IMO.  As was
    mentioned upthread, zeroing out the bytes that will eventually hold the
    next record's xl_prev field ought to be enough to maintain a guarantee
    that we won't believe the next record is valid.
    
    			regards, tom lane
    
    
  14. Re: WALInsertLock tuning

    Simon Riggs <simon@2ndquadrant.com> — 2011-06-07T16:12:27Z

    On Tue, Jun 7, 2011 at 4:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
    >> On 07.06.2011 10:55, Simon Riggs wrote:
    >>> How would that help?
    >
    >> It doesn't matter whether the pages are zeroed while they sit in memory.
    >> And if you write a full page of WAL data, any wasted bytes at the end of
    >> the page don't matter, because they're ignored at replay anyway. The
    >> possibility of mistaking random garbage for valid WAL only occurs when
    >> we write a partial WAL page to disk. So, it is enough to zero the
    >> remainder of the partial WAL page (or just the next few words) when we
    >> write it out.
    >
    >> That's a lot cheaper than fully zeroing every page. (except for the fact
    >> that you'd need to hold WALInsertLock while you do it)
    >
    > I think avoiding the need to hold both locks at once is probably exactly
    > why the zeroing was done where it is.
    >
    > An interesting alternative is to have XLogInsert itself just plop down a
    > few more zeroes immediately after the record it's inserted, before it
    > releases WALInsertLock.  This will be redundant work once the next
    > record gets added, but it's cheap enough to not matter IMO.  As was
    > mentioned upthread, zeroing out the bytes that will eventually hold the
    > next record's xl_prev field ought to be enough to maintain a guarantee
    > that we won't believe the next record is valid.
    
    Lets see what the overheads are with a continuous stream of short WAL
    records, say xl_heap_delete records.
    
    xl header is 32 bytes, xl_heap_delete is 24 bytes.
    
    So there would be ~145 records per page. 12 byte zeroing overhead per
    record gives 1740 total zero bytes written per page.
    
    The overhead is at worst case less than 25% of current overhead, plus
    its spread out across multiple records.
    
    When we get lots of full pages into WAL just after checkpoint we don't
    get as much overhead - nearly every full page forces a page switch. So
    we're removing overhead from where it hurts the most and amortising
    across other records.
    
    Maths work for me.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  15. Re: WALInsertLock tuning

    Fujii Masao <masao.fujii@gmail.com> — 2011-06-08T02:18:05Z

    On Tue, Jun 7, 2011 at 9:54 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
    > On Tue, Jun 7, 2011 at 1:24 PM, Robert Haas <robertmhaas@gmail.com> wrote:
    >
    >> One other thought is that I think that this patch might cause a
    >> user-visible behavior change.  Right now, when you hit the end of
    >> recovery, you most typically get a message saying - record with zero
    >> length.  Not always, but often.  If we adopt this approach, you'll get
    >> a wider variety of error messages there, depending on exactly how the
    >> new record fails validation.  I dunno if that's important to be worth
    >> caring about, or not.
    >
    > Not.
    >
    > We've never said what the message would be, only that it would fail.
    
    BTW, walreceiver doesn't zero the page before writing the WAL. So,
    if zeroing the page is *really* required for safe recovery, we might need
    to change walreceiver.
    
    Regards,
    
    -- 
    Fujii Masao
    NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    NTT Open Source Software Center
    
    
  16. Re: WALInsertLock tuning

    Bruce Momjian <bruce@momjian.us> — 2011-10-14T01:35:06Z

    I assume this was addressed with this commit:
    
    	commit 465883b0a2b4236ba6b31b648a9eabef3b7cdddb
    	Author: Simon Riggs <simon@2ndQuadrant.com>
    	Date:   Tue Jun 28 22:58:17 2011 +0100
    	
    	    Introduce compact WAL record for the common case of commit (non-DDL).
    	    XLOG_XACT_COMMIT_COMPACT leaves out invalidation messages and relfilenodes,
    	    saving considerable space for the vast majority of transaction commits.
    	    XLOG_XACT_COMMIT keeps same definition as XLOG_PAGE_MAGIC 0xD067 and earlier.
    	
    	    Leonardo Francalanci and Simon Riggs	
    
    ---------------------------------------------------------------------------
    
    Simon Riggs wrote:
    > On Mon, Jun 6, 2011 at 11:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
    > 
    > > As to the question of whether it's safe, I think I'd agree that the
    > > chances of this backfiring are pretty remote. ?I think that with the
    > > zeroing they are exactly zero, because (now that we start XLOG
    > > positions at 0/1) there is now way that xl_prev = {0, 0} can ever be
    > > valid. ?Without the zeroing, well, it's remotely possible that xl_prev
    > > could happen to appear valid and that xl_crc could happen to match...
    > > but the chances are presumably quite remote. ?Just the chances of the
    > > CRC matching should be around 2^-32. ?The chances of an xl_prev match
    > > are harder to predict, because the matching values for CRCs should be
    > > pretty much uniformly distributed, while xl_prev isn't random. ?But
    > > even given that the chance of a match is should be very small, so in
    > > practice there is likely no harm.
    > 
    > And if such a thing did actually happen we would also need to have an
    > accidentally correct value of all of the rest of the header values.
    > And even if we did we would apply at most one junk WAL record. Then we
    > are onto the next WAL record where we would need have to the same luck
    > all over again.
    > 
    > The probability of these occurrences is well below the acceptable
    > threshold for other problems occurring.
    > 
    > > It strikes me, though, that we
    > > could probably get nearly all of the benefit of this patch by being
    > > willing to zero the first sizeof(XLogRecord) bytes following a record,
    > > but not the rest of the buffer. ?That would pretty much wipe out any
    > > chance of an xl_prev match, I think, and would likely still get nearly
    > > all of the performance benefit.
    > 
    > Which adds something onto the path of every XlogInsert(), rather than
    > once per page, so I'm a little hesitant to agree.
    > 
    > If we did that, we would only need to write out an additional 12 bytes
    > per WAL record, not the full sizeof(XLogRecord).
    > 
    > But even so, I think its wasted effort.
    > 
    > Measuring the benefit of a performance patch is normal, but I'm not
    > proposing this as a risk trade-off. It's just a straight removal of
    > multiple cycles from a hot code path. The exact benefit will depend
    > upon whether the WALInsertLock is the hot lock, which it likely will
    > be when other patches are applied.
    > 
    > -- 
    > ?Simon Riggs?????????????????? http://www.2ndQuadrant.com/
    > ?PostgreSQL Development, 24x7 Support, Training & Services
    > 
    > -- 
    > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
    > To make changes to your subscription:
    > http://www.postgresql.org/mailpref/pgsql-hackers
    
    -- 
      Bruce Momjian  <bruce@momjian.us>        http://momjian.us
      EnterpriseDB                             http://enterprisedb.com
    
      + It's impossible for everything to be true. +
    
    
  17. Re: WALInsertLock tuning

    Robert Haas <robertmhaas@gmail.com> — 2011-10-14T12:34:44Z

    On Thu, Oct 13, 2011 at 9:35 PM, Bruce Momjian <bruce@momjian.us> wrote:
    >
    > I assume this was addressed with this commit:
    >
    >        commit 465883b0a2b4236ba6b31b648a9eabef3b7cdddb
    >        Author: Simon Riggs <simon@2ndQuadrant.com>
    >        Date:   Tue Jun 28 22:58:17 2011 +0100
    >
    >            Introduce compact WAL record for the common case of commit (non-DDL).
    >            XLOG_XACT_COMMIT_COMPACT leaves out invalidation messages and relfilenodes,
    >            saving considerable space for the vast majority of transaction commits.
    >            XLOG_XACT_COMMIT keeps same definition as XLOG_PAGE_MAGIC 0xD067 and earlier.
    >
    >            Leonardo Francalanci and Simon Riggs
    
    No, that's completely unrelated.  I think it just never quite made it
    to prime time - it was analyzed theoretically but some of the testing
    discussed on the thread doesn't seem to have been done.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company