Re: another autovacuum scheduling thread

Jeremy Schneider <schneider@ardentperf.com>

From: Jeremy Schneider <schneider@ardentperf.com>
To: David Rowley <dgrowleyml@gmail.com>
Cc: Sami Imseih <samimseih@gmail.com>, Nathan Bossart <nathandbossart@gmail.com>, pgsql-hackers@postgresql.org
Date: 2025-10-09T00:30:30Z
Lists: pgsql-hackers

Commits

  1. Add rudimentary table prioritization to autovacuum.

  2. Trigger more frequent autovacuums with relallfrozen

  3. Harden nbtree page deletion.

  4. Check for interrupts inside the nbtree page deletion code.

On Wed, 8 Oct 2025 17:27:27 -0700
Jeremy Schneider <schneider@ardentperf.com> wrote:

> On Thu, 9 Oct 2025 12:59:23 +1300
> David Rowley <dgrowleyml@gmail.com> wrote:
> 
> > I believe that methodology for processing work applies much
> > better in scenarios where there's no new work continually arriving
> > and there's no adverse effects from giving a lower priority to
> > certain portions of the work. I don't think you can apply that so
> > easily to autovacuum as there are scenarios where the work can pile
> > up faster than it can be handled.  Also, smaller tables can bloat
> > in terms of growth proportional to the original table size much
> > more quickly than larger tables and that could have huge
> > consequences for queries to small tables which are not indexed
> > sufficiently to handle becoming bloated and large.
> 
> I'm arguing that it works well with autovacuum. Not saying there
> aren't going to be certain workloads that it's suboptimal for. We're
> talking about sorting by (M)XID age. As the clock continues to move
> forward any table that doesn't get processed naturally moves up the
> queue for the next autovac run. I think the concerns are minimal here
> and this would be a good change in general.

Hmm, it doesn't work quite like that if the full queue needs to be
processed before the next iteration, but at steady state these small
tables are going to get processed at the same rate whether they were
at the top or the bottom of the queue, right?

And in non-steady-state conditions, this seems like a better order than
pg_class ordering?
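
To make the two cases concrete, here is a toy model in Python. It is
only a sketch and not the patch: the Table class, the per-pass "budget"
(how many tables a pass gets through before work piles up), and the
aging step of 100 are all assumptions made up for illustration.
xid_age stands in for something like age(relfrozenxid).

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    xid_age: int  # hypothetical stand-in for age(relfrozenxid)


def plan_pass(tables, by_age):
    """Return the order in which one pass would visit the tables."""
    if by_age:
        # Oldest (M)XID age first, as discussed in this thread.
        return sorted(tables, key=lambda t: t.xid_age, reverse=True)
    # Otherwise keep the list order, standing in for pg_class scan order.
    return list(tables)


def run_pass(tables, by_age, budget):
    """Process up to `budget` tables, then let every table age a bit.

    `budget` is an assumption of this model: a budget smaller than the
    number of tables represents the non-steady-state case.
    """
    processed = plan_pass(tables, by_age)[:budget]
    for t in processed:
        t.xid_age = 0  # vacuumed: age resets
    for t in tables:
        t.xid_age += 100  # the clock keeps moving for everyone
    return [t.name for t in processed]


def make_tables():
    return [Table("a", 10), Table("b", 500), Table("c", 300)]


if __name__ == "__main__":
    for by_age in (True, False):
        tables = make_tables()
        label = "age order" if by_age else "catalog order"
        for i in range(3):
            print(label, "pass", i + 1, run_pass(tables, by_age, budget=2))
```

With budget equal to the number of tables, every table is processed
once per pass in either ordering, which is the steady-state point
above. With a smaller budget, the age-sorted queue rotates skipped
tables to the front on the next pass, while the fixed ordering in this
model keeps skipping whichever tables sit at the end.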

-Jeremy