Thread

  1. Separating bgwriter and checkpointer

    Simon Riggs <simon@2ndquadrant.com> — 2011-09-15T22:53:57Z

    As discussed previously...
    
    Currently the bgwriter process performs both background writing,
    checkpointing and some other duties. This means that we can't perform
    the final checkpoint fsync without stopping background writing, so
    there is a negative performance effect from doing both things in one
    process.
    
    Additionally, our aim in 9.2 is to replace polling loops with latches
    for power reduction. The complexity of the bgwriter loops is high and
    it seems unlikely to come up with a clean approach using latches.
    
    This patch splits bgwriter into 2 processes: checkpointer and
    bgwriter, seeking to avoid contentious changes. Additional changes are
    expected in this release to build upon these changes for both new
    processes, though this patch stands on its own as both a performance
    vehicle and in some ways a refcatoring to simplify the code.
    
    Checkpointer does the important things, "new bgwriter" just does
    background writing and so is much less important than before.
    
    Current patch has a bug at shutdown I've not located yet, but seems
    likely is a simple error. That is mainly because for personal reasons
    I've not been able to work on the patch recently. I expect to be able
    to fix that later in the CF.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
  2. Re: Separating bgwriter and checkpointer

    Fujii Masao <masao.fujii@gmail.com> — 2011-09-16T01:38:55Z

    On Fri, Sep 16, 2011 at 7:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
    > This patch splits bgwriter into 2 processes: checkpointer and
    > bgwriter, seeking to avoid contentious changes. Additional changes are
    > expected in this release to build upon these changes for both new
    > processes, though this patch stands on its own as both a performance
    > vehicle and in some ways a refcatoring to simplify the code.
    
    I like this idea to simplify the code. How much performance gain can we
    expect by this patch?
    
    > Current patch has a bug at shutdown I've not located yet, but seems
    > likely is a simple error. That is mainly because for personal reasons
    > I've not been able to work on the patch recently. I expect to be able
    > to fix that later in the CF.
    
    You seem to have forgotten to include checkpointor.c and .h in the patch.
    
    Regards,
    
    -- 
    Fujii Masao
    NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    NTT Open Source Software Center
    
    
  3. Re: Separating bgwriter and checkpointer

    Simon Riggs <simon@2ndquadrant.com> — 2011-09-20T07:48:37Z

    On Fri, Sep 16, 2011 at 2:38 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
    > On Fri, Sep 16, 2011 at 7:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
    >> This patch splits bgwriter into 2 processes: checkpointer and
    >> bgwriter, seeking to avoid contentious changes. Additional changes are
    >> expected in this release to build upon these changes for both new
    >> processes, though this patch stands on its own as both a performance
    >> vehicle and in some ways a refcatoring to simplify the code.
    >
    > I like this idea to simplify the code. How much performance gain can we
    > expect by this patch?
    
    On heavily I/O bound systems, this is likely to make a noticeable
    difference, since bgwriter reduces I/O in user processes.
    
    The overhead of sending signals between processes is much less than I
    had previously thought, so I expect no problems there, even on highly
    loaded systems.
    
    
    >> Current patch has a bug at shutdown I've not located yet, but seems
    >> likely is a simple error. That is mainly because for personal reasons
    >> I've not been able to work on the patch recently. I expect to be able
    >> to fix that later in the CF.
    >
    > You seem to have forgotten to include checkpointor.c and .h in the patch.
    
    I confirm this error. I'll repost full patch later in the week when I
    have more time.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  4. Re: Separating bgwriter and checkpointer

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-09-20T08:06:41Z

    On 20.09.2011 10:48, Simon Riggs wrote:
    > On Fri, Sep 16, 2011 at 2:38 AM, Fujii Masao<masao.fujii@gmail.com>  wrote:
    >> On Fri, Sep 16, 2011 at 7:53 AM, Simon Riggs<simon@2ndquadrant.com>  wrote:
    >>> This patch splits bgwriter into 2 processes: checkpointer and
    >>> bgwriter, seeking to avoid contentious changes. Additional changes are
    >>> expected in this release to build upon these changes for both new
    >>> processes, though this patch stands on its own as both a performance
    >>> vehicle and in some ways a refcatoring to simplify the code.
    >>
    >> I like this idea to simplify the code. How much performance gain can we
    >> expect by this patch?
    >
    > On heavily I/O bound systems, this is likely to make a noticeable
    > difference, since bgwriter reduces I/O in user processes.
    
    Hmm. If the system is I/O bound, it doesn't matter which process 
    performs the I/O. It's still the same amount of I/O in total, and in an 
    I/O bound system, that's what determines the overall throughput.
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
    
  5. Re: Separating bgwriter and checkpointer

    Simon Riggs <simon@2ndquadrant.com> — 2011-09-20T08:18:59Z

    On Tue, Sep 20, 2011 at 9:06 AM, Heikki Linnakangas
    <heikki.linnakangas@enterprisedb.com> wrote:
    > On 20.09.2011 10:48, Simon Riggs wrote:
    >>
    >> On Fri, Sep 16, 2011 at 2:38 AM, Fujii Masao<masao.fujii@gmail.com>
    >>  wrote:
    >>>
    >>> On Fri, Sep 16, 2011 at 7:53 AM, Simon Riggs<simon@2ndquadrant.com>
    >>>  wrote:
    >>>>
    >>>> This patch splits bgwriter into 2 processes: checkpointer and
    >>>> bgwriter, seeking to avoid contentious changes. Additional changes are
    >>>> expected in this release to build upon these changes for both new
    >>>> processes, though this patch stands on its own as both a performance
    >>>> vehicle and in some ways a refcatoring to simplify the code.
    >>>
    >>> I like this idea to simplify the code. How much performance gain can we
    >>> expect by this patch?
    >>
    >> On heavily I/O bound systems, this is likely to make a noticeable
    >> difference, since bgwriter reduces I/O in user processes.
    >
    > Hmm. If the system is I/O bound, it doesn't matter which process performs
    > the I/O. It's still the same amount of I/O in total, and in an I/O bound
    > system, that's what determines the overall throughput.
    
    That's true, but not relevant.
    
    The bgwriter avoids I/O, if it is operating correctly. This patch
    ensures it continues to operate even during heavy checkpoints. So it
    helps avoid extra I/O during a period of very high I/O activity.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  6. Re: Separating bgwriter and checkpointer

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-09-20T09:03:51Z

    On 20.09.2011 11:18, Simon Riggs wrote:
    > The bgwriter avoids I/O, if it is operating correctly. This patch
    > ensures it continues to operate even during heavy checkpoints. So it
    > helps avoid extra I/O during a period of very high I/O activity.
    
    I don't see what difference it makes which process does the I/O. If a 
    write() by checkpointer process blocks, any write()s by the separate 
    bgwriter process at that time will block too. If the I/O is not 
    saturated, and the checkpoint write()s don't block, then even without 
    this patch, the bgwriter process can handle its usual bgwriter duties 
    during checkpoint just fine. (And if the I/O is not saturated, it's not 
    an I/O bound system anyway.)
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
    
  7. Re: Separating bgwriter and checkpointer

    Simon Riggs <simon@2ndquadrant.com> — 2011-09-20T10:03:56Z

    On Tue, Sep 20, 2011 at 10:03 AM, Heikki Linnakangas
    <heikki.linnakangas@enterprisedb.com> wrote:
    > On 20.09.2011 11:18, Simon Riggs wrote:
    >>
    >> The bgwriter avoids I/O, if it is operating correctly. This patch
    >> ensures it continues to operate even during heavy checkpoints. So it
    >> helps avoid extra I/O during a period of very high I/O activity.
    >
    > I don't see what difference it makes which process does the I/O. If a
    > write() by checkpointer process blocks, any write()s by the separate
    > bgwriter process at that time will block too. If the I/O is not saturated,
    > and the checkpoint write()s don't block, then even without this patch, the
    > bgwriter process can handle its usual bgwriter duties during checkpoint just
    > fine. (And if the I/O is not saturated, it's not an I/O bound system
    > anyway.)
    
    Whatever value you assign to the bgwriter, then this patch makes sure
    that happens during heavy fsyncs.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  8. Re: Separating bgwriter and checkpointer

    Greg Stark <stark@mit.edu> — 2011-09-20T13:29:35Z

    On Tue, Sep 20, 2011 at 11:03 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
    >> I don't see what difference it makes which process does the I/O. If a
    >> write() by checkpointer process blocks, any write()s by the separate
    >> bgwriter process at that time will block too. If the I/O is not saturated,
    >> and the checkpoint write()s don't block, then even without this patch, the
    >> bgwriter process can handle its usual bgwriter duties during checkpoint just
    >> fine. (And if the I/O is not saturated, it's not an I/O bound system
    >> anyway.)
    >
    > Whatever value you assign to the bgwriter, then this patch makes sure
    > that happens during heavy fsyncs.
    
    I think his point is that it doesn't because if the heavy fsyncs cause
    the system to be i/o bound it then bgwriter will just block issuing
    the writes instead of the fsyncs.
    
    I'm not actually convinced. Writes will only block if the kernel
    decides to block. We don't really know how the kernel makes this
    decision but it's entirely possible that having pending physical i/o
    issued due to an fsync doesn't influence the decision if there is
    still a reasonable number of dirty pages in the buffer cache.  In a
    sense, "I/O bound" means different things for write and fsync. Or to
    put it another way fsync is latency sensitive but write is only
    bandwidth sensitive.
    
    All that said my question is which way is the code more legible and
    easier to follow?
    
    
  9. Re: Separating bgwriter and checkpointer

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-09-20T13:35:59Z

    On 20.09.2011 16:29, Greg Stark wrote:
    > On Tue, Sep 20, 2011 at 11:03 AM, Simon Riggs<simon@2ndquadrant.com>  wrote:
    >>> I don't see what difference it makes which process does the I/O. If a
    >>> write() by checkpointer process blocks, any write()s by the separate
    >>> bgwriter process at that time will block too. If the I/O is not saturated,
    >>> and the checkpoint write()s don't block, then even without this patch, the
    >>> bgwriter process can handle its usual bgwriter duties during checkpoint just
    >>> fine. (And if the I/O is not saturated, it's not an I/O bound system
    >>> anyway.)
    >>
    >> Whatever value you assign to the bgwriter, then this patch makes sure
    >> that happens during heavy fsyncs.
    >
    > I think his point is that it doesn't because if the heavy fsyncs cause
    > the system to be i/o bound it then bgwriter will just block issuing
    > the writes instead of the fsyncs.
    >
    > I'm not actually convinced. Writes will only block if the kernel
    > decides to block. We don't really know how the kernel makes this
    > decision but it's entirely possible that having pending physical i/o
    > issued due to an fsync doesn't influence the decision if there is
    > still a reasonable number of dirty pages in the buffer cache.  In a
    > sense, "I/O bound" means different things for write and fsync. Or to
    > put it another way fsync is latency sensitive but write is only
    > bandwidth sensitive.
    
    Yeah, I was thinking of write()s, not fsyncs. I agree this might have 
    some effect during fsync phase.
    
    > All that said my question is which way is the code more legible and
    > easier to follow?
    
    Hear hear. If we're going to give the bgwriter more responsibilities, 
    this might make sense even if it has no effect on performance.
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
    
  10. Re: Separating bgwriter and checkpointer

    Magnus Hagander <magnus@hagander.net> — 2011-09-20T13:49:11Z

    On Tue, Sep 20, 2011 at 15:35, Heikki Linnakangas
    <heikki.linnakangas@enterprisedb.com> wrote:
    > On 20.09.2011 16:29, Greg Stark wrote:
    >>
    >> On Tue, Sep 20, 2011 at 11:03 AM, Simon Riggs<simon@2ndquadrant.com>
    >>  wrote:
    >>>>
    >>>> I don't see what difference it makes which process does the I/O. If a
    >>>> write() by checkpointer process blocks, any write()s by the separate
    >>>> bgwriter process at that time will block too. If the I/O is not
    >>>> saturated,
    >>>> and the checkpoint write()s don't block, then even without this patch,
    >>>> the
    >>>> bgwriter process can handle its usual bgwriter duties during checkpoint
    >>>> just
    >>>> fine. (And if the I/O is not saturated, it's not an I/O bound system
    >>>> anyway.)
    >>>
    >>> Whatever value you assign to the bgwriter, then this patch makes sure
    >>> that happens during heavy fsyncs.
    >>
    >> I think his point is that it doesn't because if the heavy fsyncs cause
    >> the system to be i/o bound it then bgwriter will just block issuing
    >> the writes instead of the fsyncs.
    >>
    >> I'm not actually convinced. Writes will only block if the kernel
    >> decides to block. We don't really know how the kernel makes this
    >> decision but it's entirely possible that having pending physical i/o
    >> issued due to an fsync doesn't influence the decision if there is
    >> still a reasonable number of dirty pages in the buffer cache.  In a
    >> sense, "I/O bound" means different things for write and fsync. Or to
    >> put it another way fsync is latency sensitive but write is only
    >> bandwidth sensitive.
    >
    > Yeah, I was thinking of write()s, not fsyncs. I agree this might have some
    > effect during fsync phase.
    >
    >> All that said my question is which way is the code more legible and
    >> easier to follow?
    >
    > Hear hear. If we're going to give the bgwriter more responsibilities, this
    > might make sense even if it has no effect on performance.
    
    Isn't there also the advantage of that work put in two different
    processes can use two different CPU cores? Or is that likely to never
    ever come in play here?
    
    -- 
     Magnus Hagander
     Me: http://www.hagander.net/
     Work: http://www.redpill-linpro.com/
    
    
  11. Re: Separating bgwriter and checkpointer

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-09-20T14:13:41Z

    On 20.09.2011 16:49, Magnus Hagander wrote:
    > Isn't there also the advantage of that work put in two different
    > processes can use two different CPU cores? Or is that likely to never
    > ever come in play here?
    
    You would need one helluva I/O system to saturate even a single CPU, 
    just by doing write+fsync.
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
    
  12. Re: Separating bgwriter and checkpointer

    Cédric Villemain <cedric.villemain.debian@gmail.com> — 2011-09-20T14:31:56Z

    2011/9/20 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>:
    > On 20.09.2011 16:49, Magnus Hagander wrote:
    >>
    >> Isn't there also the advantage of that work put in two different
    >> processes can use two different CPU cores? Or is that likely to never
    >> ever come in play here?
    >
    > You would need one helluva I/O system to saturate even a single CPU, just by
    > doing write+fsync.
    
    The point of Magnus is valid. There are possible throttling done by
    linux per node, per process/task.
    Since ..2.6.37 (32 ?) I believe .. there are more temptation to have
    have per cgroup io/sec limits, and there exists some promising work
    done to have a better IO bandwith throttling per process.
    
    IMO, splitting the type of IO workload per process allows the
    administrators to have more control on the IO limits they want to have
    (and it may help the kernels() to have a better strategy ?)
    
    >
    > --
    >  Heikki Linnakangas
    >  EnterpriseDB   http://www.enterprisedb.com
    >
    > --
    > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
    > To make changes to your subscription:
    > http://www.postgresql.org/mailpref/pgsql-hackers
    >
    
    
    
    -- 
    Cédric Villemain +33 (0)6 20 30 22 52
    http://2ndQuadrant.fr/
    PostgreSQL: Support 24x7 - Développement, Expertise et Formation
    
    
  13. Re: Separating bgwriter and checkpointer

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-09-20T14:42:14Z

    On 20.09.2011 17:31, Cédric Villemain wrote:
    > 2011/9/20 Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>:
    >> On 20.09.2011 16:49, Magnus Hagander wrote:
    >>>
    >>> Isn't there also the advantage of that work put in two different
    >>> processes can use two different CPU cores? Or is that likely to never
    >>> ever come in play here?
    >>
    >> You would need one helluva I/O system to saturate even a single CPU, just by
    >> doing write+fsync.
    >
    > The point of Magnus is valid. There are possible throttling done by
    > linux per node, per process/task.
    > Since ..2.6.37 (32 ?) I believe .. there are more temptation to have
    > have per cgroup io/sec limits, and there exists some promising work
    > done to have a better IO bandwith throttling per process.
    >
    > IMO, splitting the type of IO workload per process allows the
    > administrators to have more control on the IO limits they want to have
    > (and it may help the kernels() to have a better strategy ?)
    
    That is a separate issue from being able to use different CPU cores. But 
    cool! I didn't know Linux can do that nowadays. That could be highly 
    useful, if you can put e.g autovacuum on a different cgroup from regular 
    backends.
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
    
  14. Re: Separating bgwriter and checkpointer

    Robert Haas <robertmhaas@gmail.com> — 2011-09-20T14:53:17Z

    On Tue, Sep 20, 2011 at 9:35 AM, Heikki Linnakangas
    <heikki.linnakangas@enterprisedb.com> wrote:
    >> All that said my question is which way is the code more legible and
    >> easier to follow?
    >
    > Hear hear. If we're going to give the bgwriter more responsibilities, this
    > might make sense even if it has no effect on performance.
    
    I agree.  I don't think this change needs to be justified on
    performance grounds; there are enough collateral benefits to make it
    worthwhile.  If the checkpoint process handles all the stuff with
    highly variable latency (i.e. fsyncs), then the background writer work
    will happen more regularly and predictably.  The code will also be
    simpler, which I think will open up opportunities for additional
    optimizations such as (perhaps) making the background writer only wake
    up when there are dirty buffers to write, which ties in to
    longstanding concerns about power consumption.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  15. Re: Separating bgwriter and checkpointer

    Marti Raudsepp <marti@juffo.org> — 2011-09-20T15:01:51Z

    On Fri, Sep 16, 2011 at 01:53, Simon Riggs <simon@2ndquadrant.com> wrote:
    > This patch splits bgwriter into 2 processes: checkpointer and
    > bgwriter, seeking to avoid contentious changes. Additional changes are
    > expected in this release to build upon these changes for both new
    > processes, though this patch stands on its own as both a performance
    > vehicle and in some ways a refcatoring to simplify the code.
    
    While you're already splitting up bgwriter, could there be any benefit
    to spawning a separate bgwriter process for each tablespace?
    
    If your database has one tablespace on a fast I/O system and another
    on a slow one, the slow tablespace would also bog down background
    writing for the fast tablespace. But I don't know whether that's
    really a problem or not.
    
    Regards,
    Marti
    
    
  16. Re: Separating bgwriter and checkpointer

    Robert Haas <robertmhaas@gmail.com> — 2011-09-20T16:20:10Z

    On Tue, Sep 20, 2011 at 11:01 AM, Marti Raudsepp <marti@juffo.org> wrote:
    > On Fri, Sep 16, 2011 at 01:53, Simon Riggs <simon@2ndquadrant.com> wrote:
    >> This patch splits bgwriter into 2 processes: checkpointer and
    >> bgwriter, seeking to avoid contentious changes. Additional changes are
    >> expected in this release to build upon these changes for both new
    >> processes, though this patch stands on its own as both a performance
    >> vehicle and in some ways a refcatoring to simplify the code.
    >
    > While you're already splitting up bgwriter, could there be any benefit
    > to spawning a separate bgwriter process for each tablespace?
    >
    > If your database has one tablespace on a fast I/O system and another
    > on a slow one, the slow tablespace would also bog down background
    > writing for the fast tablespace. But I don't know whether that's
    > really a problem or not.
    
    I doubt it.  Most of the time the writes are going to be absorbed by
    the OS write cache anyway.
    
    I think there's probably more performance to be squeezed out of the
    background writer, but maybe not that exact thing, and in any case it
    seems like material for a separate patch.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  17. Re: Separating bgwriter and checkpointer

    Greg Smith <greg@2ndquadrant.com> — 2011-09-22T01:06:40Z

    On 09/20/2011 09:35 AM, Heikki Linnakangas wrote:
    > Yeah, I was thinking of write()s, not fsyncs. I agree this might have 
    > some effect during fsync phase.
    
    Right; that's where the most serious problems seem to pop up at anyway 
    now.  All the testing I did earlier this year suggested Linux at least 
    is happy to do a granular fsync, and it can also use things like 
    barriers when appropriate to schedule I/O.  The hope here is that the 
    background writer work to clean ahead of the strategy point is helpful 
    to backends, and that should keep going even during the sync 
    phase--which currently doesn't pause for anything else once it's 
    started.  The cleaner writes should all queue up into RAM in a lazy way 
    rather than block the true I/O, which is being driven by sync calls.
    
    There is some risk here that the cleaner writes happen faster than the 
    true rate at which backends really need buffers, since it has a 
    predictive component it can be wrong about.  Those could in theory 
    result in the write cache filling faster than it would in the current 
    environment, such that writes truly block that would have been cached in 
    the current code.  If you're that close to the edge though, backends 
    should really benefit from the cleaner--that same write done by a client 
    would turn into a serious stall.  From that perspective, when things 
    have completely filled the write cache, any writes the cleaner can get 
    out of the way in advance of when a backend needs it should be the 
    biggest win most of the time.
    
    -- 
    Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
    PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
    
    
    
  18. Re: Separating bgwriter and checkpointer

    Simon Riggs <simon@2ndquadrant.com> — 2011-10-02T12:10:27Z

    On Fri, Sep 16, 2011 at 2:38 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
    > On Fri, Sep 16, 2011 at 7:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
    >> This patch splits bgwriter into 2 processes: checkpointer and
    >> bgwriter, seeking to avoid contentious changes. Additional changes are
    >> expected in this release to build upon these changes for both new
    >> processes, though this patch stands on its own as both a performance
    >> vehicle and in some ways a refcatoring to simplify the code.
    >
    > I like this idea to simplify the code. How much performance gain can we
    > expect by this patch?
    >
    >> Current patch has a bug at shutdown I've not located yet, but seems
    >> likely is a simple error. That is mainly because for personal reasons
    >> I've not been able to work on the patch recently. I expect to be able
    >> to fix that later in the CF.
    >
    > You seem to have forgotten to include checkpointor.c and .h in the patch.
    
    Original patch included here.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
  19. Re: Separating bgwriter and checkpointer

    Simon Riggs <simon@2ndquadrant.com> — 2011-10-02T12:10:34Z

    On Thu, Sep 15, 2011 at 11:53 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
    
    > Current patch has a bug at shutdown I've not located yet, but seems
    > likely is a simple error. That is mainly because for personal reasons
    > I've not been able to work on the patch recently. I expect to be able
    > to fix that later in the CF.
    
    Full patch, with bug fixed. (v2)
    
    I'm now free to take review comments and make changes.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
  20. Re: Separating bgwriter and checkpointer

    Dickson S. Guedes <listas@guedesoft.net> — 2011-10-02T22:45:14Z

    2011/10/2 Simon Riggs <simon@2ndquadrant.com>:
    > On Thu, Sep 15, 2011 at 11:53 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
    >
    >> Current patch has a bug at shutdown I've not located yet, but seems
    >> likely is a simple error. That is mainly because for personal reasons
    >> I've not been able to work on the patch recently. I expect to be able
    >> to fix that later in the CF.
    >
    > Full patch, with bug fixed. (v2)
    >
    > I'm now free to take review comments and make changes.
    
    
    Hi Simon,
    
    I'm trying your patch, it was applied cleanly to master and compiled
    ok. But since I started postgres I'm seeing a  99% of CPU usage:
    
    guedes@betelgeuse:/srv/postgres/bgwriter_split$ ps -ef | grep postgres
    guedes   14878     1  0 19:37 pts/0    00:00:00
    /srv/postgres/bgwriter_split/bin/postgres -D data
    guedes   14880 14878  0 19:37 ?        00:00:00 postgres: writer
    process
    guedes   14881 14878 99 19:37 ?        00:03:07 postgres: checkpointer
    process
    guedes   14882 14878  0 19:37 ?        00:00:00 postgres: wal writer
    process
    guedes   14883 14878  0 19:37 ?        00:00:00 postgres: autovacuum
    launcher process
    guedes   14884 14878  0 19:37 ?        00:00:00 postgres: stats
    collector process
    
    Best regards.
    -- 
    Dickson S. Guedes
    mail/xmpp: guedes@guedesoft.net - skype: guediz
    http://guedesoft.net - http://www.postgresql.org.br
    
    
  21. Re: Separating bgwriter and checkpointer

    Simon Riggs <simon@2ndquadrant.com> — 2011-10-03T07:26:16Z

    On Sun, Oct 2, 2011 at 11:45 PM, Dickson S. Guedes <listas@guedesoft.net> wrote:
    
    > I'm trying your patch, it was applied cleanly to master and compiled
    > ok. But since I started postgres I'm seeing a  99% of CPU usage:
    
    Oh, thanks. I see what happened. I was toying with the idea of going
    straight to a WaitLatch implementation for the loop but decided to
    leave it out for a later patch, and then skipped the sleep as well.
    
    New version attached.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
  22. Re: Separating bgwriter and checkpointer

    Dickson S. Guedes <listas@guedesoft.net> — 2011-10-04T01:51:52Z

    2011/10/3 Simon Riggs <simon@2ndquadrant.com>:
    > On Sun, Oct 2, 2011 at 11:45 PM, Dickson S. Guedes <listas@guedesoft.net> wrote:
    >> I'm trying your patch, it was applied cleanly to master and compiled
    >> ok. But since I started postgres I'm seeing a  99% of CPU usage:
    >
    > Oh, thanks. I see what happened. I was toying with the idea of going
    > straight to a WaitLatch implementation for the loop but decided to
    > leave it out for a later patch, and then skipped the sleep as well.
    >
    > New version attached.
    
    Working now but even passing all tests for make check, the
    regress_database's postmaster doesn't shutdown properly.
    
    $ make check
    ...
    ...
    ============== creating temporary installation        ==============
    ============== initializing database system           ==============
    ============== starting postmaster                    ==============
    running on port 57432 with PID 20094
    ============== creating database "regression"         ==============
    ...
    ============== shutting down postmaster               ==============
    pg_ctl: server does not shut down
    pg_regress: could not stop postmaster: exit code was 256
    
    $ uname -a
    Linux betelgeuse 2.6.38-11-generic-pae #50-Ubuntu SMP Mon Sep 12
    22:21:04 UTC 2011 i686 i686 i386 GNU/Linux
    
    $ grep "$ ./configure" config.log
      $ ./configure --enable-debug --enable-cassert
    --prefix=/srv/postgres/bgwriter_split
    
    Best regards,
    -- 
    Dickson S. Guedes
    mail/xmpp: guedes@guedesoft.net - skype: guediz
    http://guedesoft.net - http://www.postgresql.org.br
    
    
  23. Re: Separating bgwriter and checkpointer

    Simon Riggs <simon@2ndquadrant.com> — 2011-10-04T09:05:59Z

    On Tue, Oct 4, 2011 at 2:51 AM, Dickson S. Guedes <listas@guedesoft.net> wrote:
    > 2011/10/3 Simon Riggs <simon@2ndquadrant.com>:
    >> On Sun, Oct 2, 2011 at 11:45 PM, Dickson S. Guedes <listas@guedesoft.net> wrote:
    >>> I'm trying your patch, it was applied cleanly to master and compiled
    >>> ok. But since I started postgres I'm seeing a  99% of CPU usage:
    >>
    >> Oh, thanks. I see what happened. I was toying with the idea of going
    >> straight to a WaitLatch implementation for the loop but decided to
    >> leave it out for a later patch, and then skipped the sleep as well.
    >>
    >> New version attached.
    >
    > Working now but even passing all tests for make check, the
    > regress_database's postmaster doesn't shutdown properly.
    >
    > $ make check
    > ...
    > ...
    > ============== creating temporary installation        ==============
    > ============== initializing database system           ==============
    > ============== starting postmaster                    ==============
    > running on port 57432 with PID 20094
    > ============== creating database "regression"         ==============
    > ...
    > ============== shutting down postmaster               ==============
    > pg_ctl: server does not shut down
    > pg_regress: could not stop postmaster: exit code was 256
    >
    > $ uname -a
    > Linux betelgeuse 2.6.38-11-generic-pae #50-Ubuntu SMP Mon Sep 12
    > 22:21:04 UTC 2011 i686 i686 i386 GNU/Linux
    >
    > $ grep "$ ./configure" config.log
    >  $ ./configure --enable-debug --enable-cassert
    > --prefix=/srv/postgres/bgwriter_split
    
    
    Yes, I see this also. At the same time, pg_ctl start and stop seem to
    work fine in every mode, which is what I tested. Which seems a little
    weird.
    
    I seem to be having problems with HEAD as well.
    
    Investigating further.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  24. Re: Separating bgwriter and checkpointer

    Simon Riggs <simon@2ndquadrant.com> — 2011-10-04T18:08:44Z

    On Tue, Oct 4, 2011 at 10:05 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
    
    >> ============== shutting down postmaster               ==============
    >> pg_ctl: server does not shut down
    >> pg_regress: could not stop postmaster: exit code was 256
    >>
    >
    > Yes, I see this also. At the same time, pg_ctl start and stop seem to
    > work fine in every mode, which is what I tested. Which seems a little
    > weird.
    >
    > I seem to be having problems with HEAD as well.
    >
    > Investigating further.
    
    
    Doh.
    
    The problem is the *same* one I fixed in v2, yet now I see I managed
    to somehow exclude that fix from the earlier patch. Slap. Anyway,
    fixed again now.
    
    Problem observed in head was because of this bug causing later make
    checks to fail on port 57432, so it looked like a problem in head at
    first. Nothing actual bug there at all.
    
    Thanks for your patience.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
  25. Re: Separating bgwriter and checkpointer

    Dickson S. Guedes <listas@guedesoft.net> — 2011-10-05T04:10:09Z

    2011/10/4 Simon Riggs <simon@2ndquadrant.com>:
    > The problem is the *same* one I fixed in v2, yet now I see I managed
    > to somehow exclude that fix from the earlier patch. Slap. Anyway,
    > fixed again now.
    
    Ah ok! I started reviewing the v4 patch version, this is my comments:
    
    Submission review
    ===============
    
    1. The patch applies cleanly to current master (fa56a0c3e) but isn't
    in context diff format;
    
    Feature test
    ==========
    
    1. Since I patched and make installed it, I can see the expected
    processes: writer and checkpointer;
    
    2. I did the following tests with the following results:
    
    2.1 Running a long time pgbench didn't emit any assertion failure or
    crash and the checkpoint works as before patch:
    
      LOG:  checkpoint starting: xlog
      LOG:  checkpoint complete: wrote 300 buffers (9.8%); 0 transaction
    log file(s) added, 0 removed, 0 recycled; write=26.103 s, sync=6.492
    s, total=34.000 s; sync files=13, longest=4.684 s, average=0.499 s
      LOG:  checkpoint starting: time
      LOG:  checkpoint complete: wrote 257 buffers (8.4%); 0 transaction
    log file(s) added, 0 removed, 3 recycled; write=21.819 s, sync=9.610
    s, total=32.076 s; sync files=7, longest=6.452 s, average=1.372 s
    
    2.2 Forcing a checkpoint when filesystem has enough free space works
    fine while pgbench is running:
    
      LOG:  checkpoint starting: immediate force wait
      LOG:  checkpoint complete: wrote 1605 buffers (52.2%); 0 transaction
    log file(s) added, 0 removed, 2 recycled; write=0.344 s, sync=22.750
    s, total=23.700 s; sync files=10, longest=15.586 s, average=2.275 s
    
    2.3 Forcing a checkpoint when filesystem are full, works as expected:
    
      LOG:  checkpoint starting: immediate force wait time
      ERROR:  could not write to file "pg_xlog/xlogtemp.4380": Não há
    espaço disponível no dispositivo
      ERROR:  checkpoint request failed
      HINT:  Consult recent messages in the server log for details.
      STATEMENT:  CHECKPOINT ;
      ...
      ERROR:  could not extend file "base/16384/16405": wrote only 4096 of
    8192 bytes at block 10
      HINT:  Check free disk space.
      STATEMENT:  INSERT INTO pgbench_history (tid, bid, aid, delta,
    mtime) VALUES (69, 3, 609672, -3063,  CURRENT_TIMESTAMP);
      PANIC:  could not write to file "pg_xlog/xlogtemp.4528": Não há
    espaço disponível no dispositivo
      STATEMENT:  END;
      LOG:  server process (PID 4528) was terminated by signal 6: Aborted
    
    Then I freed some space and started it again and the server ran properly:
    
       LOG:  database system was shut down at 2011-10-05 00:46:33 BRT
       LOG:  database system is ready to accept connections
       LOG:  autovacuum launcher started
       ...
       LOG:  checkpoint starting: immediate force wait
       LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 transaction
    log file(s) added, 0 removed, 0 recycled; write=0.000 s, sync=0.000 s,
    total=0.181 s; sync files=0, longest=0.000 s, average=0.000 s
    
    2.2 Running a pgbench and interrupting postmaster a few seconds later,
    seems to work as expected, returning the output:
    
    ... cut ...
    LOG:  statement: SELECT abalance FROM pgbench_accounts WHERE aid = 148253;
    ^CLOG:  statement: UPDATE pgbench_tellers SET tbalance = tbalance +
    934 WHERE tid = 85;
    DEBUG:  postmaster received signal 2
    LOG:  received fast shutdown request
    LOG:  aborting any active transactions
    FATAL:  terminating connection due to administrator command
    FATAL:  terminating connection due to administrator command
    ... cut ...
    LOG:  disconnection: session time: 0:00:14.917 user=guedes
    database=bench host=[local]
    LOG:  disconnection: session time: 0:00:14.773 user=guedes
    database=bench host=[local]
    DEBUG:  server process (PID 1910) exited with exit code 1
    DEBUG:  server process (PID 1941) exited with exit code 1
    LOG:  shutting down
    LOG:  checkpoint starting: shutdown immediate
    DEBUG:  SlruScanDirectory invoking callback on pg_multixact/offsets/0000
    DEBUG:  SlruScanDirectory invoking callback on pg_multixact/members/0000
    DEBUG:  checkpoint sync: number=1 file=base/16384/16398 time=1075.227 msec
    DEBUG:  checkpoint sync: number=2 file=base/16384/16406 time=16.832 msec
    DEBUG:  checkpoint sync: number=3 file=base/16384/16397 time=12306.204 msec
    DEBUG:  checkpoint sync: number=4 file=base/16384/16397.1 time=2122.141 msec
    DEBUG:  checkpoint sync: number=5 file=base/16384/16406_fsm time=32.278 msec
    DEBUG:  checkpoint sync: number=6 file=base/16384/16385_fsm time=11.248 msec
    DEBUG:  checkpoint sync: number=7 file=base/16384/16388 time=11.083 msec
    DEBUG:  checkpoint sync: number=8 file=base/16384/11712 time=11.314 msec
    DEBUG:  checkpoint sync: number=9 file=base/16384/16397_vm time=11.103 msec
    DEBUG:  checkpoint sync: number=10 file=base/16384/16385 time=19.308 msec
    DEBUG:  attempting to remove WAL segments older than log file
    000000010000000000000000
    DEBUG:  SlruScanDirectory invoking callback on pg_subtrans/0000
    LOG:  checkpoint complete: wrote 1659 buffers (54.0%); 0 transaction
    log file(s) added, 0 removed, 0 recycled; write=0.025 s, sync=15.617
    s, total=15.898 s; sync files=10, longest=12.306 s, average=1.561 s
    LOG:  database system is shut down
    
    Then I started the server again and it ran properly.
    
    
    Well, all the tests was running with the default postgresql.conf in my
    laptop but I'll setup a more "real world" environment to test for
    performance regression. Until now I couldn't notice any significant
    difference in TPS before and after patch in a small environment. I'll
    post something soon.
    
    
    Best regards,
    -- 
    Dickson S. Guedes
    mail/xmpp: guedes@guedesoft.net - skype: guediz
    http://guedesoft.net - http://www.postgresql.org.br
    
    
  26. Re: Separating bgwriter and checkpointer

    Simon Riggs <simon@2ndquadrant.com> — 2011-10-05T07:02:15Z

    On Wed, Oct 5, 2011 at 5:10 AM, Dickson S. Guedes <listas@guedesoft.net> wrote:
    
    > Ah ok! I started reviewing the v4 patch version, this is my comments:
    
    ...
    
    > Well, all the tests was running with the default postgresql.conf in my
    > laptop but I'll setup a more "real world" environment to test for
    > performance regression. Until now I couldn't notice any significant
    > difference in TPS before and after patch in a small environment. I'll
    > post something soon.
    
    Great testing, thanks. Likely will have no effect in non-I/O swamped
    environment, but no regression expected either.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  27. Re: Separating bgwriter and checkpointer

    Simon Riggs <simon@2ndquadrant.com> — 2011-10-18T13:18:26Z

    On Wed, Oct 5, 2011 at 8:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
    > On Wed, Oct 5, 2011 at 5:10 AM, Dickson S. Guedes <listas@guedesoft.net> wrote:
    >
    >> Ah ok! I started reviewing the v4 patch version, this is my comments:
    >
    > ...
    >
    >> Well, all the tests was running with the default postgresql.conf in my
    >> laptop but I'll setup a more "real world" environment to test for
    >> performance regression. Until now I couldn't notice any significant
    >> difference in TPS before and after patch in a small environment. I'll
    >> post something soon.
    >
    > Great testing, thanks. Likely will have no effect in non-I/O swamped
    > environment, but no regression expected either.
    
    
    Any reason or objection to committing this patch?
    
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  28. Re: Separating bgwriter and checkpointer

    Robert Haas <robertmhaas@gmail.com> — 2011-10-18T16:39:06Z

    On Tue, Oct 18, 2011 at 9:18 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
    > Any reason or objection to committing this patch?
    
    Not on my end, though I haven't reviewed it in detail.  One minor note
    - I was mildly surprised to see that you moved this to the
    checkpointer rather than leaving it in the bgwriter:
    
    +	/* Do this once before starting the loop, then just at SIGHUP time. */
    +	SyncRepUpdateSyncStandbysDefined();
    
    My preference would probably have been to leave that in the background
    writer, on the theory that the checkpointer's work is likely to be
    more bursty and therefore it might be less responsive.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  29. Re: Separating bgwriter and checkpointer

    Simon Riggs <simon@2ndquadrant.com> — 2011-10-18T16:53:47Z

    On Tue, Oct 18, 2011 at 5:39 PM, Robert Haas <robertmhaas@gmail.com> wrote:
    > On Tue, Oct 18, 2011 at 9:18 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
    >> Any reason or objection to committing this patch?
    >
    > Not on my end, though I haven't reviewed it in detail.  One minor note
    > - I was mildly surprised to see that you moved this to the
    > checkpointer rather than leaving it in the bgwriter:
    >
    > +       /* Do this once before starting the loop, then just at SIGHUP time. */
    > +       SyncRepUpdateSyncStandbysDefined();
    >
    > My preference would probably have been to leave that in the background
    > writer, on the theory that the checkpointer's work is likely to be
    > more bursty and therefore it might be less responsive.
    
    That needs to be in the checkpointer because that is the process that
    shuts down last.
    
    The bgwriter is now more like the walwriter. It shuts down early in
    the shutdown process, while the checkpointer is last out.
    
    So it wasn't preference, it was a requirement of the new role definitions.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  30. Re: Separating bgwriter and checkpointer

    Robert Haas <robertmhaas@gmail.com> — 2011-10-18T16:59:23Z

    On Tue, Oct 18, 2011 at 12:53 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
    > On Tue, Oct 18, 2011 at 5:39 PM, Robert Haas <robertmhaas@gmail.com> wrote:
    >> On Tue, Oct 18, 2011 at 9:18 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
    >>> Any reason or objection to committing this patch?
    >>
    >> Not on my end, though I haven't reviewed it in detail.  One minor note
    >> - I was mildly surprised to see that you moved this to the
    >> checkpointer rather than leaving it in the bgwriter:
    >>
    >> +       /* Do this once before starting the loop, then just at SIGHUP time. */
    >> +       SyncRepUpdateSyncStandbysDefined();
    >>
    >> My preference would probably have been to leave that in the background
    >> writer, on the theory that the checkpointer's work is likely to be
    >> more bursty and therefore it might be less responsive.
    >
    > That needs to be in the checkpointer because that is the process that
    > shuts down last.
    >
    > The bgwriter is now more like the walwriter. It shuts down early in
    > the shutdown process, while the checkpointer is last out.
    >
    > So it wasn't preference, it was a requirement of the new role definitions.
    
    Oh, I see.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  31. Re: Separating bgwriter and checkpointer

    Fujii Masao <masao.fujii@gmail.com> — 2011-10-19T10:36:08Z

    On Tue, Oct 18, 2011 at 10:18 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
    > Any reason or objection to committing this patch?
    
    The checkpointer doesn't call pgstat_send_bgwriter(), but it should.
    Otherwise, some entries in pg_stat_bgwriter will never be updated.
    
    If we adopt the patch, checkpoint is performed by checkpointer. So
    it looks odd that information related to checkpoint exist in
    pg_stat_bgwriter. We should move them to new catalog even if
    it breaks the compatibility?
    
    Regards,
    
    -- 
    Fujii Masao
    NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    NTT Open Source Software Center
    
    
  32. Re: Separating bgwriter and checkpointer

    Dickson S. Guedes <listas@guedesoft.net> — 2011-10-19T11:40:10Z

    2011/10/18 Simon Riggs <simon@2ndquadrant.com>:
    > On Wed, Oct 5, 2011 at 8:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
    >> On Wed, Oct 5, 2011 at 5:10 AM, Dickson S. Guedes <listas@guedesoft.net> wrote:
    >>
    >>> Ah ok! I started reviewing the v4 patch version, this is my comments:
    >>
    >> ...
    >>
    >>> Well, all the tests was running with the default postgresql.conf in my
    >>> laptop but I'll setup a more "real world" environment to test for
    >>> performance regression. Until now I couldn't notice any significant
    >>> difference in TPS before and after patch in a small environment. I'll
    >>> post something soon.
    >>
    >> Great testing, thanks. Likely will have no effect in non-I/O swamped
    >> environment, but no regression expected either.
    >
    >
    > Any reason or objection to committing this patch?
    
    I didn't see any performance regression (as expected) in the
    environments that I tested. About the code, I prefer someone with more
    experience to review it.
    
    Thanks.
    -- 
    Dickson S. Guedes
    mail/xmpp: guedes@guedesoft.net - skype: guediz
    http://guedesoft.net - http://www.postgresql.org.br
    
    
  33. Re: Separating bgwriter and checkpointer

    Dickson S. Guedes <listas@guedesoft.net> — 2011-10-19T12:43:07Z

    2011/10/19 Fujii Masao <masao.fujii@gmail.com>:
    > On Tue, Oct 18, 2011 at 10:18 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
    >> Any reason or objection to committing this patch?
    >
    > The checkpointer doesn't call pgstat_send_bgwriter(), but it should.
    > Otherwise, some entries in pg_stat_bgwriter will never be updated.
    
    Yes, checkpoints_req, checkpoints_timed and buffer_checkpoint are not
    being updated with this patch.
    
    > If we adopt the patch, checkpoint is performed by checkpointer. So
    > it looks odd that information related to checkpoint exist in
    > pg_stat_bgwriter. We should move them to new catalog even if
    > it breaks the compatibility?
    
    Splitting pg_stat_bgwriter into pg_stat_bgwriter and
    pg_stat_checkpointer will break something internal?
    
    With this modification we'll see applications like monitoring tools
    breaking, but they could use a view to put data back together in a
    compatible way, IMHO.
    
    -- 
    Dickson S. Guedes
    mail/xmpp: guedes@guedesoft.net - skype: guediz
    http://guedesoft.net - http://www.postgresql.org.br
    
    
  34. Re: Separating bgwriter and checkpointer

    Robert Haas <robertmhaas@gmail.com> — 2011-10-19T12:45:25Z

    On Wed, Oct 19, 2011 at 8:43 AM, Dickson S. Guedes <listas@guedesoft.net> wrote:
    > 2011/10/19 Fujii Masao <masao.fujii@gmail.com>:
    >> On Tue, Oct 18, 2011 at 10:18 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
    >>> Any reason or objection to committing this patch?
    >>
    >> The checkpointer doesn't call pgstat_send_bgwriter(), but it should.
    >> Otherwise, some entries in pg_stat_bgwriter will never be updated.
    >
    > Yes, checkpoints_req, checkpoints_timed and buffer_checkpoint are not
    > being updated with this patch.
    >
    >> If we adopt the patch, checkpoint is performed by checkpointer. So
    >> it looks odd that information related to checkpoint exist in
    >> pg_stat_bgwriter. We should move them to new catalog even if
    >> it breaks the compatibility?
    >
    > Splitting pg_stat_bgwriter into pg_stat_bgwriter and
    > pg_stat_checkpointer will break something internal?
    >
    > With this modification we'll see applications like monitoring tools
    > breaking, but they could use a view to put data back together in a
    > compatible way, IMHO.
    
    I don't really see any reason to break the monitoring view just
    because we did some internal refactoring.  I'd rather have backward
    compatibility.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  35. Re: Separating bgwriter and checkpointer

    Fujii Masao <masao.fujii@gmail.com> — 2011-10-19T14:29:12Z

    On Wed, Oct 19, 2011 at 9:45 PM, Robert Haas <robertmhaas@gmail.com> wrote:
    > I don't really see any reason to break the monitoring view just
    > because we did some internal refactoring.  I'd rather have backward
    > compatibility.
    
    Fair enough.
    
    The patch doesn't change any document, but at least the description
    of pg_stat_bgwriter seems to need to be changed.
    
    Regards,
    
    -- 
    Fujii Masao
    NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    NTT Open Source Software Center
    
    
  36. Re: Separating bgwriter and checkpointer

    Simon Riggs <simon@2ndquadrant.com> — 2011-10-19T14:58:09Z

    On Wed, Oct 19, 2011 at 3:29 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
    > On Wed, Oct 19, 2011 at 9:45 PM, Robert Haas <robertmhaas@gmail.com> wrote:
    >> I don't really see any reason to break the monitoring view just
    >> because we did some internal refactoring.  I'd rather have backward
    >> compatibility.
    >
    > Fair enough.
    >
    > The patch doesn't change any document, but at least the description
    > of pg_stat_bgwriter seems to need to be changed.
    
    Thanks for the review.
    
    Will follow up on suggestions.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  37. Re: Separating bgwriter and checkpointer

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-10-24T10:40:53Z

    On 19.10.2011 17:58, Simon Riggs wrote:
    > On Wed, Oct 19, 2011 at 3:29 PM, Fujii Masao<masao.fujii@gmail.com>  wrote:
    >> On Wed, Oct 19, 2011 at 9:45 PM, Robert Haas<robertmhaas@gmail.com>  wrote:
    >>> I don't really see any reason to break the monitoring view just
    >>> because we did some internal refactoring.  I'd rather have backward
    >>> compatibility.
    >>
    >> Fair enough.
    >>
    >> The patch doesn't change any document, but at least the description
    >> of pg_stat_bgwriter seems to need to be changed.
    >
    > Thanks for the review.
    >
    > Will follow up on suggestions.
    
    The patch looks sane, it's mostly just moving existing code around, but 
    there's one thing that's been bothering me about this whole idea from 
    the get-go:
    
    If the bgwriter and checkpointer are two different processes, whenever 
    bgwriter writes out a page it needs to send an fsync-request to the 
    checkpointer. We avoided that when both functions were performed by the 
    same process, but now we have to send and absorb a fsync-request message 
    for every single write() that happens in the system, except for those 
    done at checkpoints. Isn't that very expensive? Does it make the 
    fsync-request queue a bottleneck on some workloads?
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
    
  38. Re: Separating bgwriter and checkpointer

    Simon Riggs <simon@2ndquadrant.com> — 2011-10-25T05:23:34Z

    On Mon, Oct 24, 2011 at 11:40 AM, Heikki Linnakangas
    <heikki.linnakangas@enterprisedb.com> wrote:
    
    > The patch looks sane, it's mostly just moving existing code around, but
    > there's one thing that's been bothering me about this whole idea from the
    > get-go:
    >
    > If the bgwriter and checkpointer are two different processes, whenever
    > bgwriter writes out a page it needs to send an fsync-request to the
    > checkpointer. We avoided that when both functions were performed by the same
    > process, but now we have to send and absorb a fsync-request message for
    > every single write() that happens in the system, except for those done at
    > checkpoints. Isn't that very expensive? Does it make the fsync-request queue
    > a bottleneck on some workloads?
    
    That is a reasonable question and one I considered.
    
    I did some benchmarking earlier to see the overhead of that.
    Basically, its very small, much, much smaller than I thought.
    
    The benefit of allowing the bgwriter to continue working during long
    fsyncs easily outweighs the loss of doing more fsync-requests. Both of
    those overheads/problems occur at the same time so there is the
    overhead is always covered.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  39. Re: Separating bgwriter and checkpointer

    Simon Riggs <simon@2ndquadrant.com> — 2011-11-01T16:14:06Z

    On Wed, Oct 19, 2011 at 3:58 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
    > On Wed, Oct 19, 2011 at 3:29 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
    >> On Wed, Oct 19, 2011 at 9:45 PM, Robert Haas <robertmhaas@gmail.com> wrote:
    >>> I don't really see any reason to break the monitoring view just
    >>> because we did some internal refactoring.  I'd rather have backward
    >>> compatibility.
    >>
    >> Fair enough.
    >>
    >> The patch doesn't change any document, but at least the description
    >> of pg_stat_bgwriter seems to need to be changed.
    >
    > Thanks for the review.
    >
    > Will follow up on suggestions.
    
    I'll add this in as a separate item after commit.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services