Re: another autovacuum scheduling thread
From: David Rowley <dgrowleyml@gmail.com>
To: wenhui qiu <qiuwenhuifx@gmail.com>
Cc: Nathan Bossart <nathandbossart@gmail.com>,
Sami Imseih <samimseih@gmail.com>, Robert Haas <robertmhaas@gmail.com>,
Jeremy Schneider <schneider@ardentperf.com>, pgsql-hackers@postgresql.org
Date: 2025-10-30T03:41:56Z
Lists: pgsql-hackers
Commits
- d7965d65fc5b: Add rudimentary table prioritization to autovacuum. (19, unreleased; landed)
- 06eae9e6218a: Trigger more frequent autovacuums with relallfrozen (18.0; cited)
- c34787f91058: Harden nbtree page deletion. (14.0; cited)
- 3a01f68e35a3: Check for interrupts inside the nbtree page deletion code. (12.0; cited)

On Thu, 30 Oct 2025 at 15:58, wenhui qiu <qiuwenhuifx@gmail.com> wrote:
> In fact, with the introduction of the vacuum_max_eager_freeze_failure_rate feature, if a table’s age still exceeds more than 1.x times the autovacuum_freeze_max_age, it suggests that the vacuum freeze process is not functioning properly. Once the age surpasses vacuum_failsafe_age, wraparound issues are likely to occur soon. Taking the average of vacuum_failsafe_age and autovacuum_freeze_max_age is not a complex approach. Under the default configuration, this average already exceeds four times the autovacuum_freeze_max_age. At that stage, a DBA should have already intervened to investigate and resolve why the table age is not decreasing.

I don't think anyone would like to modify PostgreSQL in any way that increases the chances that a table gets as old as vacuum_failsafe_age. Regardless of the order in which tables are vacuumed, if a table gets as old as that then vacuum is configured to run too slowly, or there are not enough workers configured to cope with the given amount of work.

I think we need to tackle prioritisation and rate limiting as two separate items. Nathan is proposing to improve the prioritisation in this thread and it seems to me that your concerns are with rate limiting. I've suggested an idea that might help with reducing the cost_delay based on the score of the table in this thread. I'd rather not introduce that as a topic for further discussion here (I imagine Nathan agrees).

It's not as if the server is going to consume 1 billion xids in 5 mins. It's at least going to take a day to days or longer for that to happen and if autovacuum has not managed to get on top of the workload in that time, then it's configured to run too slowly and the cost_limit or delay needs to be adjusted.

My concern is that there are countless problems with autovacuum and if you try and lump them all into a single thread to fix them all at once, we'll get nowhere. Autovacuum was added to core in 8.1, 20 years ago and I don't believe we've done anything to change the ratelimiting aside from reducing the default cost_delay since then. It'd be good to fix that at some point, just not here, please.

FWIW, I agree with Nathan about keeping the score calculation non-magical. The score should be simple and easy to document. We can introduce complexity to it as and when it's needed and when the supporting evidence arrives, rather than from people waving their hands.

David
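
Editorial aside, not part of the original message: the two numeric claims in this exchange can be checked with a small sketch. It assumes the stock defaults autovacuum_freeze_max_age = 200 million and vacuum_failsafe_age = 1.6 billion (verify these against your own server's settings). The function names are hypothetical helpers for illustration and are not PostgreSQL APIs.

```python
# Assumed stock defaults; check SHOW autovacuum_freeze_max_age /
# SHOW vacuum_failsafe_age on your own server.
AUTOVACUUM_FREEZE_MAX_AGE = 200_000_000
VACUUM_FAILSAFE_AGE = 1_600_000_000


def midpoint_ratio(freeze_max_age: int, failsafe_age: int) -> float:
    """Average of the two thresholds, as a multiple of freeze_max_age."""
    midpoint = (freeze_max_age + failsafe_age) / 2
    return midpoint / freeze_max_age


def xids_per_second(xids: int, seconds: int) -> float:
    """Sustained transaction ID consumption rate needed to burn `xids`."""
    return xids / seconds


# The quoted claim: the average exceeds four times autovacuum_freeze_max_age.
ratio = midpoint_ratio(AUTOVACUUM_FREEZE_MAX_AGE, VACUUM_FAILSAFE_AGE)

# The reply's point: consuming 1 billion xids takes sustained, heavy load.
rate_five_minutes = xids_per_second(1_000_000_000, 5 * 60)
rate_one_day = xids_per_second(1_000_000_000, 24 * 60 * 60)

print(f"midpoint is {ratio:.1f}x autovacuum_freeze_max_age")
print(f"1 billion xids in 5 minutes needs {rate_five_minutes:,.0f} xids/s")
print(f"1 billion xids in one day needs {rate_one_day:,.0f} xids/s")
```

Under those assumed defaults the midpoint is 900 million, or 4.5 times autovacuum_freeze_max_age, and burning 1 billion xids in a single day would already require roughly 11,574 xids per second sustained around the clock.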