Thread
-
Experimental patch for terminating VACUUM freeze blockers
wenhui qiu <qiuwenhuifx@gmail.com> — 2026-05-13T11:56:43Z
Hi all, When a table’s age reaches vacuum_failsafe_age, VACUUM enters failsafe mode and bypasses nonessential throttling such as autovacuum_vacuum_cost_delay so it can make progress as aggressively as possible. However, if VACUUM’s freeze horizon is blocked by an old transaction, the failsafe behavior alone cannot reduce the table age. In that case, VACUUM may run at full speed but still be unable to advance relfrozenxid. I have an experimental patch to explore handling this situation. The patch adds a GUC, vacuum_freeze_terminate_blockers_pid, which allows VACUUM to terminate regular client backends whose transaction horizon blocks VACUUM from advancing its freeze cutoff. The intended targets are idle-in-transaction sessions and long-running active transactions that are holding an old xmin or assigned XID. One possible question is why not rely on idle_in_transaction_session_timeout. In practice, this parameter is often not configured, and even when it is, it only addresses idle-in-transaction sessions. It does not help with active long-running transactions that can also hold back VACUUM’s freeze horizon. The patch deliberately does not try to handle other causes of freeze horizon retention, such as replication slots, hot standby feedback, or prepared transactions. This is experimental and intended for discussion.
-
Re: Experimental patch for terminating VACUUM freeze blockers
Nathan Bossart <nathandbossart@gmail.com> — 2026-05-13T14:23:26Z
On Wed, May 13, 2026 at 07:56:43PM +0800, wenhui qiu wrote: > I have an experimental patch to explore handling this situation. The patch > adds a GUC, vacuum_freeze_terminate_blockers_pid, which allows VACUUM to > terminate regular client backends whose transaction horizon blocks VACUUM > from advancing its freeze cutoff. The intended targets are > idle-in-transaction sessions and long-running active transactions that are > holding an old xmin or assigned XID. Thanks for sharing. I certainly agree that this area has room for improvement in Postgres. > The patch deliberately does not try to handle other causes of freeze > horizon retention, such as replication slots, hot standby feedback, or > prepared transactions. My experience is a bit dated, but I remember the two main issues being replication slots and temporary tables. We now have idle_replication_slot_timeout, and there's a somewhat active thread on an XID version of that parameter [0], but I'm not aware of any recent ideas about how to deal with stranded temporary tables. I'd encourage you to think about these problems, too. [0] https://postgr.es/m/CA%2B-JvFsMHckBMzsu5Ov9HCG3AFbMh056hHy1FiXazBRtZ9pFBg%40mail.gmail.com -- nathan
-
Re: Experimental patch for terminating VACUUM freeze blockers
Sami Imseih <samimseih@gmail.com> — 2026-05-13T19:55:00Z
> On Wed, May 13, 2026 at 07:56:43PM +0800, wenhui qiu wrote: > > I have an experimental patch to explore handling this situation. The patch > > adds a GUC, vacuum_freeze_terminate_blockers_pid, which allows VACUUM to > > terminate regular client backends whose transaction horizon blocks VACUUM > > from advancing its freeze cutoff. The intended targets are > > idle-in-transaction sessions and long-running active transactions that are > > holding an old xmin or assigned XID. > > Thanks for sharing. I certainly agree that this area has room for > improvement in Postgres. My 2c. Using something like the proposed vacuum_freeze_terminate_blockers_pid (GUC name is misleading, since it's a bool ) seems backwards to me. It does not address the root cause, which is the long-running transaction, etc and attempts to deal with the symptom rather than the problem. This also means a poor configuration of this parameter will more likely lead to a system silently getting into wraparound, as a DBA may relax a bit on monitoring, maybe. I do think we need better visibility into what is blocking vacuum, which was discussed here [1], but ultimately it is up to the DBA to properly monitor and mitigate workloads that are impacting their vacuum. [1] https://www.postgresql.org/message-id/flat/CAOzEurSgy-gDtwFmEbj5%2BR9PL0_G3qYB6nnzJtNStyuf87VSVg%40mail.gmail.com -- Sami Imseih Amazon Web Services (AWS)
-
Re: Experimental patch for terminating VACUUM freeze blockers
wenhui qiu <qiuwenhuifx@gmail.com> — 2026-05-14T03:59:47Z
Hi Sami > My 2c. Using something like the proposed > vacuum_freeze_terminate_blockers_pid (GUC name is misleading, since > it's a bool ) > seems backwards to me. It does not address the root cause, which is > the long-running > transaction, etc and attempts to deal with the symptom rather than the problem. > This also means a poor configuration of this parameter will more > likely lead to a system > silently getting into wraparound, as a DBA may relax a bit on monitoring, maybe. Thanks for your comments. I agree that the real problem is the long-running transaction or other blocker, and this patch is not meant to replace monitoring or fixing that root cause. The idea was inspired by the replay/query conflict mechanism. Normally we do not want to cancel long-running queries that block replay either; alternatively, we can enable hot_standby_feedback, but that brings other problems. So this is a trade-off left to the DBA. I think this is similar: the right solution is still to prevent xid wraparound by managing the blockers, but when the system is already approaching wraparound, terminating blockers may be preferable to letting the system continue toward a dangerous state. So this is intended as a last-resort option, not as the primary solution. > I do think we need better visibility into what is blocking vacuum, which was > discussed here [1], but ultimately it is up to the DBA to properly monitor > and mitigate workloads that are impacting their vacuum. > [1] https://www.postgresql.org/message-id/flat/CAOzEurSgy-gDtwFmEbj5%2BR9PL0_G3qYB6nnzJtNStyuf87VSVg%40mail.gmail.com I'm in this thread,I know you're good at adding relevant system views. Often, checking system views is more convenient than looking at logs. If this request is approved, I’d appreciate your help in implementing the corresponding system views. . Thanks On Thu, May 14, 2026 at 3:55 AM Sami Imseih <samimseih@gmail.com> wrote: > > On Wed, May 13, 2026 at 07:56:43PM +0800, wenhui qiu wrote: > > > I have an experimental patch to explore handling this situation. The > patch > > > adds a GUC, vacuum_freeze_terminate_blockers_pid, which allows VACUUM > to > > > terminate regular client backends whose transaction horizon blocks > VACUUM > > > from advancing its freeze cutoff. The intended targets are > > > idle-in-transaction sessions and long-running active transactions that > are > > > holding an old xmin or assigned XID. > > > > Thanks for sharing. I certainly agree that this area has room for > > improvement in Postgres. > > My 2c. Using something like the proposed > vacuum_freeze_terminate_blockers_pid (GUC name is misleading, since > it's a bool ) > seems backwards to me. It does not address the root cause, which is > the long-running > transaction, etc and attempts to deal with the symptom rather than the > problem. > This also means a poor configuration of this parameter will more > likely lead to a system > silently getting into wraparound, as a DBA may relax a bit on monitoring, > maybe. > > I do think we need better visibility into what is blocking vacuum, which > was > discussed here [1], but ultimately it is up to the DBA to properly monitor > and mitigate workloads that are impacting their vacuum. > > [1] > https://www.postgresql.org/message-id/flat/CAOzEurSgy-gDtwFmEbj5%2BR9PL0_G3qYB6nnzJtNStyuf87VSVg%40mail.gmail.com > > -- > Sami Imseih > Amazon Web Services (AWS) >