Re: another autovacuum scheduling thread

From: Robert Haas <robertmhaas@gmail.com>

To: Nathan Bossart <nathandbossart@gmail.com>

Cc: David Rowley <dgrowleyml@gmail.com>, Jeremy Schneider <schneider@ardentperf.com>, Sami Imseih <samimseih@gmail.com>, pgsql-hackers@postgresql.org

Date: 2025-10-10T18:42:57Z

Lists: pgsql-hackers

Commits

Add rudimentary table prioritization to autovacuum.
- d7965d65fc5b 19 (unreleased) landed
Trigger more frequent autovacuums with relallfrozen
- 06eae9e6218a 18.0 cited
Harden nbtree page deletion.
- c34787f91058 14.0 cited
Check for interrupts inside the nbtree page deletion code.
- 3a01f68e35a3 12.0 cited

On Fri, Oct 10, 2025 at 1:31 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> Here's a prototype of a "score" approach.  Two notes:
>
> * I've given special priority to anti-wraparound vacuums.  I think this is
> important to avoid focusing too much on bloat when wraparound is imminent.
> In any case, we need a separate wraparound score in case autovacuum is
> disabled.
>
> * I didn't include the analyze threshold in the score because it doesn't
> apply to TOAST tables, and therefore would artificially lower their
> priority.  Perhaps there is another way to deal with this.
>
> This is very much just a prototype of the basic idea.  As-is, I think it'll
> favor processing tables with lots of bloat unless we're in an
> anti-wraparound scenario.  Maybe that's okay.  I'm not sure how scientific
> we want to be about all of this, but I do intend to try some long-running
> tests.

I think this is a reasonable starting point, although I'm surprised
that you chose to combine the sub-scores using + rather than Max.

I think it will take a lot of experimentation to figure out whether
this particular algorithm (or any other) works well in practice. My
intuition (for whatever that is worth to you, which may not be much)
is that what will anger users is cases where we ignore a horrible
problem in order to deal with a routine one. Figuring out how to design the
scoring system to avoid such outcomes is the hard part of this
problem, IMHO. For this particular algorithm, the main hazards that
spring to mind for me are:

- The wraparound score can't be more than about 10, but the bloat
score could be arbitrarily large, especially for tables with few
tuples, so there may be lots of cases in which the wraparound score
has no impact on the behavior.

- The patch attempts to guard against this by disregarding the
non-wraparound portion of the score once the wraparound portion
reaches 1.0, but that results in an abrupt behavior shift at that
point. Suddenly we go from mostly ignoring the wraparound score to
entirely ignoring the bloat score. This might result in the system
abruptly ignoring tables that are bloating extremely rapidly in favor
of trying to catch up in a wraparound situation that is not yet
terribly urgent.
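
To make the two hazards above concrete, here is a minimal sketch in
Python. It is not the patch's code: the Table fields, the sort_key
function, and the example numbers are all hypothetical, and the
assumption that a table sorts ahead of everything else once its
wraparound score reaches 1.0 is only one reading of the behavior
described above.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    wraparound: float  # hypothetical sub-score, assumed to top out near 10
    bloat: float       # hypothetical sub-score, assumed to be unbounded


def sort_key(t: Table) -> tuple[int, float]:
    # Assumed reading of the prototype: below 1.0 the sub-scores are added;
    # at 1.0 and above only the wraparound score counts, and the table is
    # placed ahead of every table that has not reached 1.0.
    if t.wraparound >= 1.0:
        return (1, t.wraparound)
    return (0, t.wraparound + t.bloat)


def prioritize(tables: list[Table]) -> list[str]:
    # Highest priority first.
    return [t.name for t in sorted(tables, key=sort_key, reverse=True)]


# Just below the cutoff: the bloat score dominates the sum, so the
# wraparound score has no effect on the ordering.
below = [Table("aging", 0.99, 0.5), Table("tiny_hot", 0.1, 40.0)]

# Just above the cutoff: the ordering flips, and the rapidly bloating
# table now waits behind a mild wraparound case.
above = [Table("aging", 1.0, 0.5), Table("tiny_hot", 0.1, 40.0)]

print(prioritize(below))  # ['tiny_hot', 'aging']
print(prioritize(above))  # ['aging', 'tiny_hot']
```

A change of 0.01 in one sub-score reverses the order, which is the
abrupt shift described in the second point.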

When I've thought about this problem -- and I can't claim to have
thought about it very hard -- it's seemed to me that we need to (1)
somehow normalize everything to somewhat similar units and (2) make
sure that severe wraparound danger always wins over every other
consideration, but mild wraparound danger can lose to severe bloat.
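
One possible shape for (1) and (2), sketched in Python under stated
assumptions rather than as a concrete proposal: squash the bloat score
into a bounded range, scale the wraparound score so that it exceeds
that range only when the danger is severe, and combine the two with
max. The constant SEVERE_WRAPAROUND, the squashing function, and all
function names are hypothetical choices made for illustration.

```python
# Hypothetical cutoff: a raw wraparound score above this counts as severe.
SEVERE_WRAPAROUND = 5.0


def bloat_urgency(bloat: float) -> float:
    # Saturating map from an unbounded bloat score to at most 1.0.
    return bloat / (1.0 + bloat)


def wraparound_urgency(wraparound: float) -> float:
    # Linear scaling: greater than 1.0 only once the raw score is severe.
    return wraparound / SEVERE_WRAPAROUND


def priority(wraparound: float, bloat: float) -> float:
    # max rather than +, so neither sub-score can hide behind the other,
    # and there is no cutoff at which one of them is suddenly ignored.
    return max(wraparound_urgency(wraparound), bloat_urgency(bloat))


# Mild wraparound danger loses to severe bloat.
mild_wraparound = priority(wraparound=1.0, bloat=0.5)
severe_bloat = priority(wraparound=0.1, bloat=40.0)
print(mild_wraparound < severe_bloat)

# Severe wraparound danger wins even against enormous bloat.
severe_wraparound = priority(wraparound=6.0, bloat=0.0)
huge_bloat = priority(wraparound=0.1, bloat=1e9)
print(severe_wraparound > huge_bloat)
```

Because both sub-scores are continuous and the combination is max,
the priority changes gradually as either one grows. Whether these
particular curves behave well in practice would still need the kind of
experimentation mentioned above.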

-- 
Robert Haas
EDB: http://www.enterprisedb.com