Thread

  1. Experimental patch for terminating VACUUM freeze blockers

    wenhui qiu <qiuwenhuifx@gmail.com> — 2026-05-13T11:56:43Z

    Hi all,
    
    When a table’s age reaches vacuum_failsafe_age, VACUUM enters failsafe mode
    and bypasses nonessential throttling such as autovacuum_vacuum_cost_delay so
    it can make progress as aggressively as possible.
    
    However, if VACUUM’s freeze horizon is blocked by an old transaction, the
    failsafe behavior alone cannot reduce the table age. In that case, VACUUM
    may run at full speed but still be unable to advance relfrozenxid.
    
    I have an experimental patch to explore handling this situation. The patch
    adds a GUC, vacuum_freeze_terminate_blockers_pid, which allows VACUUM to
    terminate regular client backends whose transaction horizon blocks VACUUM
    from advancing its freeze cutoff. The intended targets are
    idle-in-transaction sessions and long-running active transactions that are
    holding an old xmin or assigned XID.
    
    One possible question is why not rely on idle_in_transaction_session_timeout.
    In practice, this parameter is often not configured, and even when it is,
    it only addresses idle-in-transaction sessions. It does not help with
    active long-running transactions that can also hold back VACUUM’s freeze
    horizon.
    
    The patch deliberately does not try to handle other causes of freeze
    horizon retention, such as replication slots, hot standby feedback, or
    prepared transactions.
    
    This is experimental and intended for discussion.
    
  2. Re: Experimental patch for terminating VACUUM freeze blockers

    Nathan Bossart <nathandbossart@gmail.com> — 2026-05-13T14:23:26Z

    On Wed, May 13, 2026 at 07:56:43PM +0800, wenhui qiu wrote:
    > I have an experimental patch to explore handling this situation. The patch
    > adds a GUC, vacuum_freeze_terminate_blockers_pid, which allows VACUUM to
    > terminate regular client backends whose transaction horizon blocks VACUUM
    > from advancing its freeze cutoff. The intended targets are
    > idle-in-transaction sessions and long-running active transactions that are
    > holding an old xmin or assigned XID.
    
    Thanks for sharing.  I certainly agree that this area has room for
    improvement in Postgres.
    
    > The patch deliberately does not try to handle other causes of freeze
    > horizon retention, such as replication slots, hot standby feedback, or
    > prepared transactions.
    
    My experience is a bit dated, but I remember the two main issues being
    replication slots and temporary tables.  We now have
    idle_replication_slot_timeout, and there's a somewhat active thread on an
    XID version of that parameter [0], but I'm not aware of any recent ideas
    about how to deal with stranded temporary tables.  I'd encourage you to
    think about these problems, too.
    
    [0] https://postgr.es/m/CA%2B-JvFsMHckBMzsu5Ov9HCG3AFbMh056hHy1FiXazBRtZ9pFBg%40mail.gmail.com
    
    -- 
    nathan
    
    
    
    
  3. Re: Experimental patch for terminating VACUUM freeze blockers

    Sami Imseih <samimseih@gmail.com> — 2026-05-13T19:55:00Z

    > On Wed, May 13, 2026 at 07:56:43PM +0800, wenhui qiu wrote:
    > > I have an experimental patch to explore handling this situation. The patch
    > > adds a GUC, vacuum_freeze_terminate_blockers_pid, which allows VACUUM to
    > > terminate regular client backends whose transaction horizon blocks VACUUM
    > > from advancing its freeze cutoff. The intended targets are
    > > idle-in-transaction sessions and long-running active transactions that are
    > > holding an old xmin or assigned XID.
    >
    > Thanks for sharing.  I certainly agree that this area has room for
    > improvement in Postgres.
    
    My 2c. Using something like the proposed
    vacuum_freeze_terminate_blockers_pid (GUC name is misleading, since
    it's a bool )
    seems backwards to me. It does not address the root cause, which is
    the long-running
    transaction, etc and attempts to deal with the symptom rather than the problem.
    This also means a poor configuration of this parameter will more
    likely lead to a system
    silently getting into wraparound, as a DBA may relax a bit on monitoring, maybe.
    
    I do think we need better visibility into what is blocking vacuum, which was
    discussed here [1], but ultimately it is up to the DBA to properly monitor
    and mitigate workloads that are impacting their vacuum.
    
    [1] https://www.postgresql.org/message-id/flat/CAOzEurSgy-gDtwFmEbj5%2BR9PL0_G3qYB6nnzJtNStyuf87VSVg%40mail.gmail.com
    
    --
    Sami Imseih
    Amazon Web Services (AWS)
    
    
    
    
  4. Re: Experimental patch for terminating VACUUM freeze blockers

    wenhui qiu <qiuwenhuifx@gmail.com> — 2026-05-14T03:59:47Z

    Hi Sami
    
    > My 2c. Using something like the proposed
    > vacuum_freeze_terminate_blockers_pid (GUC name is misleading, since
    > it's a bool )
    > seems backwards to me. It does not address the root cause, which is
    > the long-running
    > transaction, etc and attempts to deal with the symptom rather than the
    problem.
    > This also means a poor configuration of this parameter will more
    > likely lead to a system
    > silently getting into wraparound, as a DBA may relax a bit on monitoring,
    maybe.
    
    Thanks for your comments. I agree that the real problem is the long-running
    transaction or other blocker, and this patch is not meant to replace
    monitoring or fixing that root cause. The idea was inspired by the
    replay/query conflict mechanism. Normally we do not want to cancel
    long-running queries that block replay either; alternatively, we can enable
    hot_standby_feedback, but that brings other problems. So this is a
    trade-off left to the DBA. I think this  is similar: the right solution is
    still to prevent xid wraparound by managing the blockers, but when the
    system is already approaching wraparound, terminating blockers may be
    preferable to letting the system continue toward a dangerous state. So this
    is intended as a last-resort option, not as the primary solution.
    
    > I do think we need better visibility into what is blocking vacuum, which
    was
    > discussed here [1], but ultimately it is up to the DBA to properly monitor
    > and mitigate workloads that are impacting their vacuum.
    
    > [1]
    https://www.postgresql.org/message-id/flat/CAOzEurSgy-gDtwFmEbj5%2BR9PL0_G3qYB6nnzJtNStyuf87VSVg%40mail.gmail.com
    
    I'm in this thread,I know you're good at adding relevant system views.
    Often, checking system views is more convenient than looking at logs. If
    this request is approved, I’d appreciate your help in implementing the
    corresponding system views.
    .
    
    Thanks
    
    On Thu, May 14, 2026 at 3:55 AM Sami Imseih <samimseih@gmail.com> wrote:
    
    > > On Wed, May 13, 2026 at 07:56:43PM +0800, wenhui qiu wrote:
    > > > I have an experimental patch to explore handling this situation. The
    > patch
    > > > adds a GUC, vacuum_freeze_terminate_blockers_pid, which allows VACUUM
    > to
    > > > terminate regular client backends whose transaction horizon blocks
    > VACUUM
    > > > from advancing its freeze cutoff. The intended targets are
    > > > idle-in-transaction sessions and long-running active transactions that
    > are
    > > > holding an old xmin or assigned XID.
    > >
    > > Thanks for sharing.  I certainly agree that this area has room for
    > > improvement in Postgres.
    >
    > My 2c. Using something like the proposed
    > vacuum_freeze_terminate_blockers_pid (GUC name is misleading, since
    > it's a bool )
    > seems backwards to me. It does not address the root cause, which is
    > the long-running
    > transaction, etc and attempts to deal with the symptom rather than the
    > problem.
    > This also means a poor configuration of this parameter will more
    > likely lead to a system
    > silently getting into wraparound, as a DBA may relax a bit on monitoring,
    > maybe.
    >
    > I do think we need better visibility into what is blocking vacuum, which
    > was
    > discussed here [1], but ultimately it is up to the DBA to properly monitor
    > and mitigate workloads that are impacting their vacuum.
    >
    > [1]
    > https://www.postgresql.org/message-id/flat/CAOzEurSgy-gDtwFmEbj5%2BR9PL0_G3qYB6nnzJtNStyuf87VSVg%40mail.gmail.com
    >
    > --
    > Sami Imseih
    > Amazon Web Services (AWS)
    >