Thread

  1. bug of recovery?

    Fujii Masao <masao.fujii@gmail.com> — 2011-09-26T09:50:06Z

    Hi,
    
    Currently, if a reference to an invalid page is found during recovery,
    its information
    is saved in hash table "invalid_page_tab". Then, if such a reference
    is resolved,
    its information is removed from the hash table. If there is unresolved
    reference to
    an invalid page in the hash table at the end of recovery, PANIC error occurs.
    
    What I'm worried about is that the hash table is volatile. If a user restarts
    the server before reaching end of recovery, any information in the
    hash table is lost,
    and we wrongly miss the PANIC error case because we cannot find any unresolved
    reference. That is, even if database is corrupted at the end of recovery,
    a user might not be able to notice that. This looks like a serious problem. No?
    
    To prevent the above problem, we should write the contents of the hash table to
    the disk for every restartpoints, I think. Then, when the server
    starts recovery,
    it should reload the hash table from the disk. Thought? Am I missing something?
    
    Regards,
    
    -- 
    Fujii Masao
    NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    NTT Open Source Software Center
    
    
  2. Re: bug of recovery?

    Florian G. Pflug <fgp@phlo.org> — 2011-09-26T17:15:18Z

    On Sep26, 2011, at 11:50 , Fujii Masao wrote:
    > Currently, if a reference to an invalid page is found during recovery,
    > its information
    > is saved in hash table "invalid_page_tab". Then, if such a reference
    > is resolved,
    > its information is removed from the hash table. If there is unresolved
    > reference to
    > an invalid page in the hash table at the end of recovery, PANIC error occurs.
    > 
    > What I'm worried about is that the hash table is volatile. If a user restarts
    > the server before reaching end of recovery, any information in the
    > hash table is lost,
    > and we wrongly miss the PANIC error case because we cannot find any unresolved
    > reference. That is, even if database is corrupted at the end of recovery,
    > a user might not be able to notice that. This looks like a serious problem. No?
    > 
    > To prevent the above problem, we should write the contents of the hash table to
    > the disk for every restartpoints, I think. Then, when the server
    > starts recovery,
    > it should reload the hash table from the disk. Thought? Am I missing something?
    
    Shouldn't references to invalid pages only occur before we reach a consistent
    state? If so, the right fix would be to check whether all invalid page references
    have been resolved after we've reached a consistent state, and to skip creating
    restart points while there're unresolved page references.
    
    best regards,
    Florian Pflug
    
    
    
  3. Re: bug of recovery?

    Simon Riggs <simon@2ndquadrant.com> — 2011-09-26T19:06:41Z

    On Mon, Sep 26, 2011 at 10:50 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
    
    > Currently, if a reference to an invalid page is found during recovery,
    > its information
    > is saved in hash table "invalid_page_tab". Then, if such a reference
    > is resolved,
    > its information is removed from the hash table. If there is unresolved
    > reference to
    > an invalid page in the hash table at the end of recovery, PANIC error occurs.
    >
    > What I'm worried about is that the hash table is volatile. If a user restarts
    > the server before reaching end of recovery, any information in the
    > hash table is lost,
    > and we wrongly miss the PANIC error case because we cannot find any unresolved
    > reference. That is, even if database is corrupted at the end of recovery,
    > a user might not be able to notice that. This looks like a serious problem. No?
    >
    > To prevent the above problem, we should write the contents of the hash table to
    > the disk for every restartpoints, I think. Then, when the server
    > starts recovery,
    > it should reload the hash table from the disk. Thought? Am I missing something?
    
    That doesn't happen because the when we stop the server it will
    restart from a valid restartpoint - one where there is no in-progress
    multi-phase operation.
    
    So when it replays it will always replay both parts of the operation.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  4. Re: bug of recovery?

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-09-26T20:39:16Z

    Simon Riggs <simon@2ndquadrant.com> writes:
    > On Mon, Sep 26, 2011 at 10:50 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
    >> To prevent the above problem, we should write the contents of the hash table to
    >> the disk for every restartpoints, I think. Then, when the server
    >> starts recovery,
    >> it should reload the hash table from the disk. Thought? Am I missing something?
    
    > That doesn't happen because the when we stop the server it will
    > restart from a valid restartpoint - one where there is no in-progress
    > multi-phase operation.
    
    Not clear that that's true.  The larger point though is that the
    invalid-page table is only interesting during crash recovery --- once
    you've reached a consistent state, it should be empty and remain so.
    So I see no particular value in Fujii's proposal of logging the table to
    disk during standby mode.
    
    It might be worthwhile to invoke XLogCheckInvalidPages() as soon as
    we (think we) have reached consistency, rather than leaving it to be
    done only when we exit recovery mode.
    
    			regards, tom lane
    
    
  5. Re: bug of recovery?

    Florian G. Pflug <fgp@phlo.org> — 2011-09-26T22:28:33Z

    On Sep26, 2011, at 22:39 , Tom Lane wrote:
    > It might be worthwhile to invoke XLogCheckInvalidPages() as soon as
    > we (think we) have reached consistency, rather than leaving it to be
    > done only when we exit recovery mode.
    
    I believe we also need to prevent the creation of restart points before
    we've reached consistency. If we're starting from an online backup,
    and a checkpoint occurred between pg_start_backup() and pg_stop_backup(),
    we currently create a restart point upon replaying that checkpoint's
    xlog record. At that point, however, unresolved page references are
    not an error, since a truncation that happened after the checkpoint
    (but before pg_stop_backup()) might or might not be reflected in the
    online backup.
    
    best regards,
    Florian Pflug
    
    
    
  6. Re: bug of recovery?

    Fujii Masao <masao.fujii@gmail.com> — 2011-09-27T00:49:51Z

    On Tue, Sep 27, 2011 at 7:28 AM, Florian Pflug <fgp@phlo.org> wrote:
    > On Sep26, 2011, at 22:39 , Tom Lane wrote:
    >> It might be worthwhile to invoke XLogCheckInvalidPages() as soon as
    >> we (think we) have reached consistency, rather than leaving it to be
    >> done only when we exit recovery mode.
    >
    > I believe we also need to prevent the creation of restart points before
    > we've reached consistency. If we're starting from an online backup,
    > and a checkpoint occurred between pg_start_backup() and pg_stop_backup(),
    > we currently create a restart point upon replaying that checkpoint's
    > xlog record. At that point, however, unresolved page references are
    > not an error, since a truncation that happened after the checkpoint
    > (but before pg_stop_backup()) might or might not be reflected in the
    > online backup.
    
    Preventing the creation of restartpoints before reaching consistent point
    sounds fragile to the case where the backup takes very long time. It might
    also take very long time to reach consistent point when replaying from that
    backup. Which prevents also the removal of WAL files (e.g., streamed from
    the master server) for a long time, and then might cause disk full failure.
    
    ISTM that writing an invalid-page table to the disk for every restartpoints is
    better approach. If an invalid-page table is never updated after we've
    reached consistency point, we probably should make restartpoints write
    that table only after that point. And, if a reference to an invalid
    page is found
    after the consistent point, we should emit error and cancel a recovery.
    
    Regards,
    
    -- 
    Fujii Masao
    NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    NTT Open Source Software Center
    
    
  7. Re: bug of recovery?

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-09-27T04:05:44Z

    Fujii Masao <masao.fujii@gmail.com> writes:
    > ISTM that writing an invalid-page table to the disk for every restartpoints is
    > better approach.
    
    I still say that's uncalled-for overkill.  The invalid-page table is not
    necessary for recovery, it's only a debugging cross-check.  You're more
    likely to introduce bugs than fix any by adding a mechanism like that.
    
    			regards, tom lane
    
    
  8. Re: bug of recovery?

    Fujii Masao <masao.fujii@gmail.com> — 2011-09-27T05:00:14Z

    On Tue, Sep 27, 2011 at 1:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > Fujii Masao <masao.fujii@gmail.com> writes:
    >> ISTM that writing an invalid-page table to the disk for every restartpoints is
    >> better approach.
    >
    > I still say that's uncalled-for overkill.  The invalid-page table is not
    > necessary for recovery, it's only a debugging cross-check.
    
    If so, there is no risk even if the invalid-page table is lost and the check
    is skipped unexpectedly?
    
    Regards,
    
    -- 
    Fujii Masao
    NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    NTT Open Source Software Center
    
    
  9. Re: bug of recovery?

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-09-27T05:54:58Z

    On 26.09.2011 21:06, Simon Riggs wrote:
    > On Mon, Sep 26, 2011 at 10:50 AM, Fujii Masao<masao.fujii@gmail.com>  wrote:
    >
    >> Currently, if a reference to an invalid page is found during recovery,
    >> its information
    >> is saved in hash table "invalid_page_tab". Then, if such a reference
    >> is resolved,
    >> its information is removed from the hash table. If there is unresolved
    >> reference to
    >> an invalid page in the hash table at the end of recovery, PANIC error occurs.
    >>
    >> What I'm worried about is that the hash table is volatile. If a user restarts
    >> the server before reaching end of recovery, any information in the
    >> hash table is lost,
    >> and we wrongly miss the PANIC error case because we cannot find any unresolved
    >> reference. That is, even if database is corrupted at the end of recovery,
    >> a user might not be able to notice that. This looks like a serious problem. No?
    >>
    >> To prevent the above problem, we should write the contents of the hash table to
    >> the disk for every restartpoints, I think. Then, when the server
    >> starts recovery,
    >> it should reload the hash table from the disk. Thought? Am I missing something?
    >
    > That doesn't happen because the when we stop the server it will
    > restart from a valid restartpoint - one where there is no in-progress
    > multi-phase operation.
    >
    > So when it replays it will always replay both parts of the operation.
    
    I think you're mixing this up with the multi-page page split operations 
    in b-tree. This is different from that. What the "invalid_page_tab" is 
    for is the situation where you for example, insert to a page on table X, 
    and later table X is dropped, and then you crash. On WAL replay, you 
    will see the insert record, but the file for the table doesn't exist, 
    because the table was dropped. In that case we skip the insert, note 
    what happened in invalid_page_tab, and move on with recovery. When we 
    see the later record to drop the table, we know it was OK that the file 
    was missing earlier. But if we don't see it before end of recovery, we 
    PANIC, because then the file should've been there.
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
    
  10. Re: bug of recovery?

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-09-27T05:59:44Z

    On 27.09.2011 00:28, Florian Pflug wrote:
    > On Sep26, 2011, at 22:39 , Tom Lane wrote:
    >> It might be worthwhile to invoke XLogCheckInvalidPages() as soon as
    >> we (think we) have reached consistency, rather than leaving it to be
    >> done only when we exit recovery mode.
    >
    > I believe we also need to prevent the creation of restart points before
    > we've reached consistency.
    
    Seems reasonable. We could still allow restartpoints when the hash table 
    is empty, though. And once we've reached consistency, we can throw an 
    error immediately in log_invalid_page(), instead of adding the entry in 
    the hash table.
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
    
  11. Re: bug of recovery?

    Simon Riggs <simon@2ndquadrant.com> — 2011-09-27T07:00:50Z

    On Tue, Sep 27, 2011 at 6:54 AM, Heikki Linnakangas
    <heikki.linnakangas@enterprisedb.com> wrote:
    
    > I think you're mixing this up with the multi-page page split operations in
    > b-tree. This is different from that. What the "invalid_page_tab" is for is
    > the situation where you for example, insert to a page on table X, and later
    > table X is dropped, and then you crash. On WAL replay, you will see the
    > insert record, but the file for the table doesn't exist, because the table
    > was dropped. In that case we skip the insert, note what happened in
    > invalid_page_tab, and move on with recovery. When we see the later record to
    > drop the table, we know it was OK that the file was missing earlier. But if
    > we don't see it before end of recovery, we PANIC, because then the file
    > should've been there.
    
    OK, yes, I see. Thanks.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  12. Re: bug of recovery?

    Florian G. Pflug <fgp@phlo.org> — 2011-09-27T11:06:39Z

    On Sep27, 2011, at 07:59 , Heikki Linnakangas wrote:
    > On 27.09.2011 00:28, Florian Pflug wrote:
    >> On Sep26, 2011, at 22:39 , Tom Lane wrote:
    >>> It might be worthwhile to invoke XLogCheckInvalidPages() as soon as
    >>> we (think we) have reached consistency, rather than leaving it to be
    >>> done only when we exit recovery mode.
    >> 
    >> I believe we also need to prevent the creation of restart points before
    >> we've reached consistency.
    > 
    > Seems reasonable. We could still allow restartpoints when the hash table is empty, though. And once we've reached consistency, we can throw an error immediately in log_invalid_page(), instead of adding the entry in the hash table.
    
    That mimics the way the rm_safe_restartpoint callbacks work, which is good.
    
    Actually, why don't we use that machinery to implement this? There's currently no rm_safe_restartpoint callback for RM_XLOG_ID, so we'd just need to create one that checks whether invalid_page_tab is empty.
    
    best regards,
    Florian Pflug
    
    
    
  13. Re: bug of recovery?

    Fujii Masao <masao.fujii@gmail.com> — 2011-09-29T11:31:11Z

    On Tue, Sep 27, 2011 at 8:06 PM, Florian Pflug <fgp@phlo.org> wrote:
    > On Sep27, 2011, at 07:59 , Heikki Linnakangas wrote:
    >> On 27.09.2011 00:28, Florian Pflug wrote:
    >>> On Sep26, 2011, at 22:39 , Tom Lane wrote:
    >>>> It might be worthwhile to invoke XLogCheckInvalidPages() as soon as
    >>>> we (think we) have reached consistency, rather than leaving it to be
    >>>> done only when we exit recovery mode.
    >>>
    >>> I believe we also need to prevent the creation of restart points before
    >>> we've reached consistency.
    >>
    >> Seems reasonable. We could still allow restartpoints when the hash table is empty, though. And once we've reached consistency, we can throw an error immediately in log_invalid_page(), instead of adding the entry in the hash table.
    >
    > That mimics the way the rm_safe_restartpoint callbacks work, which is good.
    >
    > Actually, why don't we use that machinery to implement this? There's currently no rm_safe_restartpoint callback for RM_XLOG_ID, so we'd just need to create one that checks whether invalid_page_tab is empty.
    
    Okay, the attached patch prevents the creation of restartpoints by using
    rm_safe_restartpoint callback if we've not reached a consistent state yet
    and the invalid-page table is not empty. But the invalid-page table is not
    tied to the specific resource manager, so using rm_safe_restartpoint for
    that seems to slightly odd. Is this OK?
    
    Also, according to other suggestions, the patch changes XLogCheckInvalidPages()
    so that it's called as soon as we've reached a consistent state, and changes
    log_invalid_page() so that it emits PANIC immediately if consistency is already
    reached. These are very good changes, I think. Because they enable us to
    notice serious problem which causes PANIC error immediately. Without these
    changes, you unfortunately might notice that the standby database is corrupted
    when failover happens. Though such a problem might rarely happen, I think it's
    worth doing those changes.
    
    Regards,
    
    -- 
    Fujii Masao
    NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    NTT Open Source Software Center
    
  14. Re: bug of recovery?

    Simon Riggs <simon@2ndquadrant.com> — 2011-09-29T11:49:11Z

    On Thu, Sep 29, 2011 at 12:31 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
    > On Tue, Sep 27, 2011 at 8:06 PM, Florian Pflug <fgp@phlo.org> wrote:
    >> On Sep27, 2011, at 07:59 , Heikki Linnakangas wrote:
    >>> On 27.09.2011 00:28, Florian Pflug wrote:
    >>>> On Sep26, 2011, at 22:39 , Tom Lane wrote:
    >>>>> It might be worthwhile to invoke XLogCheckInvalidPages() as soon as
    >>>>> we (think we) have reached consistency, rather than leaving it to be
    >>>>> done only when we exit recovery mode.
    >>>>
    >>>> I believe we also need to prevent the creation of restart points before
    >>>> we've reached consistency.
    >>>
    >>> Seems reasonable. We could still allow restartpoints when the hash table is empty, though. And once we've reached consistency, we can throw an error immediately in log_invalid_page(), instead of adding the entry in the hash table.
    >>
    >> That mimics the way the rm_safe_restartpoint callbacks work, which is good.
    >>
    >> Actually, why don't we use that machinery to implement this? There's currently no rm_safe_restartpoint callback for RM_XLOG_ID, so we'd just need to create one that checks whether invalid_page_tab is empty.
    >
    > Okay, the attached patch prevents the creation of restartpoints by using
    > rm_safe_restartpoint callback if we've not reached a consistent state yet
    > and the invalid-page table is not empty. But the invalid-page table is not
    > tied to the specific resource manager, so using rm_safe_restartpoint for
    > that seems to slightly odd. Is this OK?
    >
    > Also, according to other suggestions, the patch changes XLogCheckInvalidPages()
    > so that it's called as soon as we've reached a consistent state, and changes
    > log_invalid_page() so that it emits PANIC immediately if consistency is already
    > reached. These are very good changes, I think. Because they enable us to
    > notice serious problem which causes PANIC error immediately. Without these
    > changes, you unfortunately might notice that the standby database is corrupted
    > when failover happens. Though such a problem might rarely happen, I think it's
    > worth doing those changes.
    
    Patch does everything we agreed it should.
    
    Good suggestion from Florian.
    
    This worries me slightly now though because the patch makes us PANIC
    in a place we didn't used to and once we do that we cannot restart the
    server at all. Are we sure we want that? It's certainly a great way to
    shake down errors in other code...
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  15. Re: bug of recovery?

    Florian G. Pflug <fgp@phlo.org> — 2011-09-29T14:12:30Z

    On Sep29, 2011, at 13:49 , Simon Riggs wrote:
    > This worries me slightly now though because the patch makes us PANIC
    > in a place we didn't used to and once we do that we cannot restart the
    > server at all. Are we sure we want that? It's certainly a great way to
    > shake down errors in other code...
    
    The patch only introduces a new PANIC condition during archive recovery,
    though. Crash recovery is unaffected, except that we no longer create
    restart points before we reach consistency.
    
    Also, if we hit an invalid page reference after reaching consistency,
    the cause is probably either a bug in our recovery code, or (quite unlikely)
    a corrupted WAL that passed the CRC check. In both cases, the likelyhood
    of data-corruption seems high, so PANICing seems like the right thing to do.
    
    best regards,
    Florian Pflug
    
    
    
  16. Re: bug of recovery?

    Fujii Masao <masao.fujii@gmail.com> — 2011-09-30T01:09:07Z

    On Thu, Sep 29, 2011 at 11:12 PM, Florian Pflug <fgp@phlo.org> wrote:
    > On Sep29, 2011, at 13:49 , Simon Riggs wrote:
    >> This worries me slightly now though because the patch makes us PANIC
    >> in a place we didn't used to and once we do that we cannot restart the
    >> server at all. Are we sure we want that? It's certainly a great way to
    >> shake down errors in other code...
    >
    > The patch only introduces a new PANIC condition during archive recovery,
    > though. Crash recovery is unaffected, except that we no longer create
    > restart points before we reach consistency.
    >
    > Also, if we hit an invalid page reference after reaching consistency,
    > the cause is probably either a bug in our recovery code, or (quite unlikely)
    > a corrupted WAL that passed the CRC check. In both cases, the likelyhood
    > of data-corruption seems high, so PANICing seems like the right thing to do.
    
    Fair enough.
    
    We might be able to use FATAL or ERROR instead of PANIC because they
    also cause all processes to exit when the startup process emits them.
    For example, we now use FATAL to stop the server in recovery mode
    when recovery is about to end before we've reached a consistent state.
    
    Regards,
    
    -- 
    Fujii Masao
    NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    NTT Open Source Software Center
    
    
  17. Re: bug of recovery?

    Simon Riggs <simon@2ndquadrant.com> — 2011-09-30T06:57:02Z

    On Fri, Sep 30, 2011 at 2:09 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
    > On Thu, Sep 29, 2011 at 11:12 PM, Florian Pflug <fgp@phlo.org> wrote:
    >> On Sep29, 2011, at 13:49 , Simon Riggs wrote:
    >>> This worries me slightly now though because the patch makes us PANIC
    >>> in a place we didn't used to and once we do that we cannot restart the
    >>> server at all. Are we sure we want that? It's certainly a great way to
    >>> shake down errors in other code...
    >>
    >> The patch only introduces a new PANIC condition during archive recovery,
    >> though. Crash recovery is unaffected, except that we no longer create
    >> restart points before we reach consistency.
    >>
    >> Also, if we hit an invalid page reference after reaching consistency,
    >> the cause is probably either a bug in our recovery code, or (quite unlikely)
    >> a corrupted WAL that passed the CRC check. In both cases, the likelyhood
    >> of data-corruption seems high, so PANICing seems like the right thing to do.
    >
    > Fair enough.
    >
    > We might be able to use FATAL or ERROR instead of PANIC because they
    > also cause all processes to exit when the startup process emits them.
    > For example, we now use FATAL to stop the server in recovery mode
    > when recovery is about to end before we've reached a consistent state.
    
    I think we should issue PANIC if the source is a critical rmgr, or
    just WARNING if from a non-critical rmgr, such as indexes.
    
    Ideally, I think we should have a mechanism to allow indexes to be
    marked corrupt. For example, a file that if present shows that the
    index is corrupt and would be marked not valid. We can then create the
    file and send a sinval message to force the index relcache to be
    rebuilt showing valid set to false.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  18. Re: bug of recovery?

    Fujii Masao <masao.fujii@gmail.com> — 2011-10-03T05:23:25Z

    On Fri, Sep 30, 2011 at 3:57 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
    > I think we should issue PANIC if the source is a critical rmgr, or
    > just WARNING if from a non-critical rmgr, such as indexes.
    >
    > Ideally, I think we should have a mechanism to allow indexes to be
    > marked corrupt. For example, a file that if present shows that the
    > index is corrupt and would be marked not valid. We can then create the
    > file and send a sinval message to force the index relcache to be
    > rebuilt showing valid set to false.
    
    This seems not to be specific to the invalid-page table problem.
    All error cases from a non-critical rmgr should be treated not-PANIC.
    So I think that the idea should be implemented separately from
    the patch I've posted.
    
    Anyway what if read-only query accesses the index marked invalid?
    Just emit ERROR?
    
    Regards,
    
    -- 
    Fujii Masao
    NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    NTT Open Source Software Center
    
    
  19. Re: bug of recovery?

    Simon Riggs <simon@2ndquadrant.com> — 2011-10-03T07:07:10Z

    On Mon, Oct 3, 2011 at 6:23 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
    
    > So I think that the idea should be implemented separately from
    > the patch I've posted.
    
    Agreed. I'll do a final review and commit today.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  20. Re: bug of recovery?

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-10-03T07:21:26Z

    On 29.09.2011 14:31, Fujii Masao wrote:
    > On Tue, Sep 27, 2011 at 8:06 PM, Florian Pflug<fgp@phlo.org>  wrote:
    >> Actually, why don't we use that machinery to implement this? There's currently no rm_safe_restartpoint callback for RM_XLOG_ID, so we'd just need to create one that checks whether invalid_page_tab is empty.
    >
    > Okay, the attached patch prevents the creation of restartpoints by using
    > rm_safe_restartpoint callback if we've not reached a consistent state yet
    > and the invalid-page table is not empty. But the invalid-page table is not
    > tied to the specific resource manager, so using rm_safe_restartpoint for
    > that seems to slightly odd. Is this OK?
    
    I don't think this should use the rm_safe_restartpoint machinery. As you 
    said, it's not tied to any specific resource manager. And I've actually 
    been thinking that we will get rid of rm_safe_restartpoint altogether in 
    the future. The two things that still use it are the b-tree and gin, and 
    I'd like to change both of those to not require any post-recovery 
    cleanup step to finish multi-page operations, similar to what I did with 
    GiST in 9.1.
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com
    
    
  21. Re: bug of recovery?

    Simon Riggs <simon@2ndquadrant.com> — 2011-10-03T07:32:04Z

    On Mon, Oct 3, 2011 at 8:21 AM, Heikki Linnakangas
    <heikki.linnakangas@enterprisedb.com> wrote:
    > On 29.09.2011 14:31, Fujii Masao wrote:
    >>
    >> On Tue, Sep 27, 2011 at 8:06 PM, Florian Pflug<fgp@phlo.org>  wrote:
    >>>
    >>> Actually, why don't we use that machinery to implement this? There's
    >>> currently no rm_safe_restartpoint callback for RM_XLOG_ID, so we'd just need
    >>> to create one that checks whether invalid_page_tab is empty.
    >>
    >> Okay, the attached patch prevents the creation of restartpoints by using
    >> rm_safe_restartpoint callback if we've not reached a consistent state yet
    >> and the invalid-page table is not empty. But the invalid-page table is not
    >> tied to the specific resource manager, so using rm_safe_restartpoint for
    >> that seems to slightly odd. Is this OK?
    >
    > I don't think this should use the rm_safe_restartpoint machinery. As you
    > said, it's not tied to any specific resource manager. And I've actually been
    > thinking that we will get rid of rm_safe_restartpoint altogether in the
    > future. The two things that still use it are the b-tree and gin, and I'd
    > like to change both of those to not require any post-recovery cleanup step
    > to finish multi-page operations, similar to what I did with GiST in 9.1.
    
    I thought that was quite neat doing it that way, but there's no
    specific reason to do it that way I guess. If you're happy to rewrite
    the patch then I guess we're OK.
    
    I certainly would like to get rid of rm_safe_restartpoint in the
    longer term, hopefully sooner.
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  22. Re: bug of recovery?

    Fujii Masao <masao.fujii@gmail.com> — 2011-10-04T06:43:30Z

    On Mon, Oct 3, 2011 at 4:32 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
    >> I don't think this should use the rm_safe_restartpoint machinery. As you
    >> said, it's not tied to any specific resource manager. And I've actually been
    >> thinking that we will get rid of rm_safe_restartpoint altogether in the
    >> future. The two things that still use it are the b-tree and gin, and I'd
    >> like to change both of those to not require any post-recovery cleanup step
    >> to finish multi-page operations, similar to what I did with GiST in 9.1.
    >
    > I thought that was quite neat doing it that way, but there's no
    > specific reason to do it that way I guess. If you're happy to rewrite
    > the patch then I guess we're OK.
    >
    > I certainly would like to get rid of rm_safe_restartpoint in the
    > longer term, hopefully sooner.
    
    Though Heikki might be already working on that,... anyway,
    the attached patch is the version which doesn't use rm_safe_restartpoint
    machinery.
    
    Regards,
    
    -- 
    Fujii Masao
    NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    NTT Open Source Software Center
    
  23. Re: bug of recovery?

    Simon Riggs <simon@2ndquadrant.com> — 2011-11-24T13:59:23Z

    On Tue, Oct 4, 2011 at 7:43 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
    > On Mon, Oct 3, 2011 at 4:32 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
    >>> I don't think this should use the rm_safe_restartpoint machinery. As you
    >>> said, it's not tied to any specific resource manager. And I've actually been
    >>> thinking that we will get rid of rm_safe_restartpoint altogether in the
    >>> future. The two things that still use it are the b-tree and gin, and I'd
    >>> like to change both of those to not require any post-recovery cleanup step
    >>> to finish multi-page operations, similar to what I did with GiST in 9.1.
    >>
    >> I thought that was quite neat doing it that way, but there's no
    >> specific reason to do it that way I guess. If you're happy to rewrite
    >> the patch then I guess we're OK.
    >>
    >> I certainly would like to get rid of rm_safe_restartpoint in the
    >> longer term, hopefully sooner.
    >
    > Though Heikki might be already working on that,... anyway,
    > the attached patch is the version which doesn't use rm_safe_restartpoint
    > machinery.
    
    
    Heikki - I see you are down on the CF app to review this.
    
    I'd been working on it as well, just forgot to let Greg know.
    
    Did you start already? Should I stop?
    
    -- 
     Simon Riggs                   http://www.2ndQuadrant.com/
     PostgreSQL Development, 24x7 Support, Training & Services
    
    
  24. Re: bug of recovery?

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-12-02T08:59:01Z

    On 04.10.2011 09:43, Fujii Masao wrote:
    > On Mon, Oct 3, 2011 at 4:32 PM, Simon Riggs<simon@2ndquadrant.com>  wrote:
    >>> I don't think this should use the rm_safe_restartpoint machinery. As you
    >>> said, it's not tied to any specific resource manager. And I've actually been
    >>> thinking that we will get rid of rm_safe_restartpoint altogether in the
    >>> future. The two things that still use it are the b-tree and gin, and I'd
    >>> like to change both of those to not require any post-recovery cleanup step
    >>> to finish multi-page operations, similar to what I did with GiST in 9.1.
    >>
    >> I thought that was quite neat doing it that way, but there's no
    >> specific reason to do it that way I guess. If you're happy to rewrite
    >> the patch then I guess we're OK.
    >>
    >> I certainly would like to get rid of rm_safe_restartpoint in the
    >> longer term, hopefully sooner.
    >
    > Though Heikki might be already working on that,...
    
    Just haven't gotten around to it. It's a fair amount of work with little 
    user-visible benefit.
    
    > anyway,
    > the attached patch is the version which doesn't use rm_safe_restartpoint
    > machinery.
    
    Thanks, committed.
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com