Re: another autovacuum scheduling thread
Nathan Bossart <nathandbossart@gmail.com>
From: Nathan Bossart <nathandbossart@gmail.com>
To: Robert Treat <rob@xzilla.net>
Cc: David Rowley <dgrowleyml@gmail.com>, Sami Imseih <samimseih@gmail.com>, Robert Haas <robertmhaas@gmail.com>, Jeremy Schneider <schneider@ardentperf.com>, pgsql-hackers@postgresql.org
Date: 2025-11-12T20:10:16Z
Lists: pgsql-hackers
Commits
- Add rudimentary table prioritization to autovacuum. d7965d65fc5b, 19 (unreleased), landed
- Trigger more frequent autovacuums with relallfrozen. 06eae9e6218a, 18.0, cited
- Harden nbtree page deletion. c34787f91058, 14.0, cited
- Check for interrupts inside the nbtree page deletion code. 3a01f68e35a3, 12.0, cited
On Tue, Nov 11, 2025 at 06:22:36PM -0500, Robert Treat wrote:
> On Tue, Nov 11, 2025 at 3:27 PM David Rowley <dgrowleyml@gmail.com> wrote:
>> On Wed, 12 Nov 2025 at 09:13, Nathan Bossart <nathandbossart@gmail.com> wrote:
>> > My concern is that this might add already-processed tables back to the
>> > list, so a worker might never be able to clear it. Maybe that's not a real
>> > problem in practice for some reason, but it does feel like a step too far
>> > for stage 1, as you said above.
>>
>> Oh, that's a good point. That's a very valid concern. I guess that
>> could be fixed with a hashtable of vacuumed tables and skipping tables
>> that exist in there, but the problem with that is that the table might
>> genuinely need to be vacuumed again. It's a bit tricky to know when a
>> 2nd vacuum is a legit requirement and when it's not. Figuring that out
>> might me more logic that this code wants to know about.
>
> Yeah, there is a common theoretical pattern that always comes up in
> these discussions where autovacuum gets stuck behind N big tables +
> (AVMW - N) small tables that keep filtering up to the top of the list,
> and I'm not saying that would never be a problem, but assuming the
> algorithm is working correctly, this should be fairly avoidable,
> because the use of xid age essentially works as a "hash of vacuumed
> tables" equivalent for tracking purposes.

I do think re-prioritization is worth considering, but IMHO we should leave
it out of phase 1. I think it's pretty easy to reason about one round of
prioritization being okay. The order is completely arbitrary today, so how
could ordering by vacuum-related criteria make things any worse? In my
view, changing the list contents in fancier ways (e.g., adding
just-processed tables back to the list) is a step further that requires
more discussion and testing.
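
(Illustrative sketch, not taken from the patch or from this thread.) The difference between one round of prioritization and re-adding just-processed tables could be modeled roughly as below. Everything here is an assumption made up for the illustration: the `Table` fields, the `score` ordering, and the `needs_vacuum` callback do not correspond to actual autovacuum code.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    xid_age: int      # hypothetical: age of the table's oldest unfrozen XID
    dead_tuples: int  # hypothetical: estimated number of dead tuples


def score(t):
    # Hypothetical priority: higher XID age first, then more dead tuples.
    return (t.xid_age, t.dead_tuples)


def one_round(tables, vacuum):
    """Sort once, then visit every table exactly once. Nothing is re-added,
    so the loop always finishes after len(tables) steps."""
    processed = []
    for t in sorted(tables, key=score, reverse=True):
        vacuum(t)
        processed.append(t.name)
    return processed


def with_requeue(tables, vacuum, needs_vacuum, max_steps):
    """Variant that puts a table back on the list if it still qualifies.
    max_steps bounds the loop because the list may otherwise never drain."""
    worklist = sorted(tables, key=score, reverse=True)
    processed = []
    while worklist and len(processed) < max_steps:
        t = worklist.pop(0)
        vacuum(t)
        processed.append(t.name)
        if needs_vacuum(t):
            # Re-add and re-sort: a table that keeps qualifying returns
            # to the front and can starve everything behind it.
            worklist.append(t)
            worklist.sort(key=score, reverse=True)
    return processed
```

In this toy model, `one_round` terminates by construction, while `with_requeue` keeps revisiting any table whose `needs_vacuum` stays true and whose score stays highest. That is the "might never be able to clear it" concern quoted above, in miniature.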

To be clear, I am totally for serious consideration of reprioritization,
adjusting cost delay settings, etc., but as David has repeatedly stressed,
we are unlikely to get anything committed if we try to boil the ocean. I'd
love for this thread to spin off into all kinds of other autovacuum-related
threads, but we should be taking baby steps if we want to accomplish
anything here.

--
nathan