Re: another autovacuum scheduling thread
From: Peter Geoghegan <pg@bowt.ie>
To: Andres Freund <andres@anarazel.de>
Cc: Nathan Bossart <nathandbossart@gmail.com>, pgsql-hackers@postgresql.org
Date: 2025-10-09T19:45:32Z
Lists: pgsql-hackers
Commits
- Add rudimentary table prioritization to autovacuum.
  d7965d65fc5b, 19 (unreleased), landed
- Trigger more frequent autovacuums with relallfrozen
  06eae9e6218a, 18.0, cited
- Harden nbtree page deletion.
  c34787f91058, 14.0, cited
- Check for interrupts inside the nbtree page deletion code.
  3a01f68e35a3, 12.0, cited

On Thu, Oct 9, 2025 at 12:15 PM Andres Freund <andres@anarazel.de> wrote:
> > Each worker would consult this table before processing. If the table is
> > there, it would remove it from the shared table and skip processing it.
> > Then the next worker would try processing the table again.
> >
> > I also wonder how hard it would be to gracefully catch the error and let
> > the worker continue with the rest of its list...
>
> The main set of cases I've seen are when workers get hung up permanently in
> corrupt indexes.

How recently was this? I'm aware of problems like that that we
discussed around 2018, but they were greatly mitigated. First by your
commit 3a01f68e, then by my commit c34787f9.

In general, there's no particularly good reason why (at least with
nbtree indexes) VACUUM should ever hang forever. The access pattern is
overwhelmingly simple, sequential access. The only exception is nbtree
page deletion (plus backtracking), where it isn't particularly hard to
just be very careful about self-deadlock.

> There never is actually an error, the autovacuums just get
> terminated as part of whatever independent reason there is to restart.

What do you mean? In general I'd expect nbtree VACUUM of a corrupt
index to either not fail at all (we'll soldier on to the best of our
ability when page deletion encounters an inconsistency), or to get
permanently stuck due to locking the same page twice/self-deadlock
(though as I said, those problems were mitigated, and might even be
almost impossible these days). Every other case involves some kind of
error (e.g., an OOM is just about possible).

I agree with you about using a perfectly deterministic order coming
with real downsides, without any upside. Don't interpret what I've
said as expressing opposition to that idea.

--
Peter Geoghegan