Thread

  1. Re: Deadlock detector fails to activate on a hot standby replica

    Fujii Masao <masao.fujii@gmail.com> — 2026-05-26T15:01:38Z

    On Mon, May 25, 2026 at 11:20 PM Vitaly Davydov
    <v.davydov@postgrespro.ru> wrote:
    >
    > Dear Fujii Masao,
    >
    > Thank you for the review of the patch. Based on your comments I propose
    > a new version of the patch.
    
    Thanks for updating the patch!
    
    + if (got_standby_delay_timeout)
    + SendRecoveryConflictWithBufferPin(RECOVERY_CONFLICT_BUFFERPIN);
    + else if (got_standby_deadlock_timeout)
    + {
    
    Shouldn't we break out of the loop when either got_standby_delay_timeout or
    got_standby_deadlock_timeout becomes true? Otherwise, the loop continues with
    those flags still set, which could cause SendRecoveryConflictWithBufferPin() to
    be called unnecessarily in the subsequent cycles.
    
    
    + if (BufferGetRefCount(buffer) <= 1)
    
    Should this be "BufferGetRefCount(buffer) == 1" instead? I don't think
    BufferGetRefCount(buffer) should ever return 0 here. If that's correct,
    would it make sense to explicitly detect that case, for example:
    
        -----------------
        uint32 refcount = BufferGetRefCount(buffer);
    
        Assert(refcount > 0);
    
        if (refcount == 0)
            elog(ERROR, "buffer refcount dropped to zero while waiting for
    cleanup lock");
    
        if (refcount == 1)
            break;
        -----------------
    
    > I also added a new tap-test as a part of the patch. I did some changes in the
    > tap test to make it stable. Let me know, please, if it should be in a separate
    > commit.
    
    Do we really need a new TAP test file for this? We already have
    a startup/backend deadlock test in t/031_recovery_conflict.pl. Extending
    the deadlock section there to cover this test case seems simpler and easier to
    follow than introducing a new t/053_* test.
    
    Regards,
    
    -- 
    Fujii Masao