Thread

  1. Re: Cascade replication

    Fujii Masao <masao.fujii@gmail.com> — 2011-07-11T06:28:12Z

    On Mon, Jul 11, 2011 at 10:26 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
    > On Mon, Jul 11, 2011 at 3:30 AM, Josh Berkus <josh@agliodbs.com> wrote:
    >> Do you think you'll submit a new version of the patch this commitfest?
    >
    > Yes. I'm now updating the patch according to Simon's comments.
    > I will submit it today.
    
    Attached is the updated version which addresses all the issues raised by Simon.
    
    > The risk you describe already exists in current code.
    >
    > I regard it as a non-risk. The unlink() and the rename() are executed
    > consecutively, so the gap between them is small, so the chance of a
    > SIGKILL in that gap at the same time as losing the archive seems low,
    > and we can always get that file from the master again if we are
    > streaming. Any code you add to "fix" this will get executed so rarely
    > it probably won't work when we need it to.
    >
    > In the current scheme we restart archiving from the last restartpoint,
    > which exists only on the archive. This new patch improves upon this by
    > keeping the most recent files locally, so we are less exposed in the
    > case of archive unavailability. So this patch already improves things
    > and we don't need any more than that. No extra code please, IMHO.
    
    Yes, I added no extra code for the risk I raised upthread.
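
    For readers following the thread, here is a minimal sketch of the gap
    discussed above. It is not taken from the patch: the function name, the
    file names, and the crash_in_gap switch are hypothetical, and Python
    stands in for the server's C code purely for illustration. It shows that
    if the process dies between unlink() and rename(), the target file is
    missing but the restored copy is still present, so a retry succeeds.

```python
import os
import tempfile


def install_two_step(restored_path, target_path, crash_in_gap=False):
    """Remove any pre-existing target, then rename the restored file into place.

    crash_in_gap is a hypothetical switch that simulates the process being
    killed between the two calls.
    """
    if os.path.exists(target_path):
        os.unlink(target_path)
    if crash_in_gap:
        # Nothing after the unlink runs; the rename never happens.
        return False
    os.rename(restored_path, target_path)
    return True


def demo_two_step():
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "segment")             # hypothetical name
        restored = os.path.join(d, "segment.restored")  # hypothetical name
        with open(target, "w") as f:
            f.write("old")
        with open(restored, "w") as f:
            f.write("new")

        # Simulated kill in the gap: target is gone, restored copy remains.
        install_two_step(restored, target, crash_in_gap=True)
        after_crash = (os.path.exists(target), os.path.exists(restored))

        # Retrying completes the installation.
        install_two_step(restored, target)
        with open(target) as f:
            final = f.read()
        return after_crash, final
```

    In this sketch the only thing lost in the gap is the old copy of the
    target; the newly restored file survives and can be installed again.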
    
    > In #2, there is another problem; walsender might have the pre-existing file
    > open, so the startup process would need to request walsenders to close the
    > file before removing (or renaming) it, wait for new file to appear and open it
    > again.
    
    I implemented this.
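
    As a rough illustration of the close-and-reopen handshake described in
    the quoted text (again not the patch's code), the sketch below uses
    Python threads and events in place of the startup process and a
    walsender. The class, the event names, and the file names are all
    assumptions made for the example.

```python
import os
import tempfile
import threading


class Sender:
    """Stand-in for a walsender that keeps a file open while sending."""

    def __init__(self, path):
        self.path = path
        self.close_requested = threading.Event()
        self.closed = threading.Event()
        self.replaced = threading.Event()
        self.seen = []

    def run(self):
        f = open(self.path)
        self.seen.append(f.read())
        # Keep the file open until asked to close it.
        self.close_requested.wait()
        f.close()
        self.closed.set()
        # Wait for the new file to appear, then open it again.
        self.replaced.wait()
        with open(self.path) as f2:
            self.seen.append(f2.read())


def replace_file(sender, new_path):
    """Stand-in for the startup process replacing a file in use."""
    sender.close_requested.set()        # ask the sender to close the file
    sender.closed.wait()                # wait until it has done so
    os.replace(new_path, sender.path)   # install the new file
    sender.replaced.set()               # tell the sender it may reopen


def demo_handshake():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "segment")          # hypothetical name
        new_path = os.path.join(d, "segment.new")  # hypothetical name
        with open(path, "w") as f:
            f.write("old")
        with open(new_path, "w") as f:
            f.write("new")

        sender = Sender(path)
        t = threading.Thread(target=sender.run)
        t.start()
        replace_file(sender, new_path)
        t.join()
        return sender.seen
```

    The ordering is the point of the sketch: the file is replaced only
    after the sender confirms it has closed its descriptor, and the sender
    reopens only after the replacement is in place.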
    
    Regards,
    
    -- 
    Fujii Masao
    NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    NTT Open Source Software Center