Re: another autovacuum scheduling thread

From: Sami Imseih <samimseih@gmail.com>

To: David Rowley <dgrowleyml@gmail.com>

Cc: Nathan Bossart <nathandbossart@gmail.com>, Robert Haas <robertmhaas@gmail.com>, Jeremy Schneider <schneider@ardentperf.com>, pgsql-hackers@postgresql.org

Date: 2025-11-11T20:25:36Z

Lists: pgsql-hackers

Commits

Add rudimentary table prioritization to autovacuum.
- d7965d65fc5b 19 (unreleased) landed
Trigger more frequent autovacuums with relallfrozen
- 06eae9e6218a 18.0 cited
Harden nbtree page deletion.
- c34787f91058 14.0 cited
Check for interrupts inside the nbtree page deletion code.
- 3a01f68e35a3 12.0 cited

> On Sat, 8 Nov 2025 at 08:23, Sami Imseih <samimseih@gmail.com> wrote:
> > > I'm confused at why we'd have set up our autovacuum trigger points as
> > > they are today because we think those are good times to do a
> > > vacuum/analyze, but then prioritise on something completely different.
> > > Surely if we think 20% dead tuples is worth a vacuum, we must
> > > therefore think that 40% dead tuples are even more worthwhile?!
> >
> > Sure, but thresholds alone don't indicate anything about the how quick
> > the table can be vacuumed, # of indexes, per table a/v settings, etc.
> > The average a/v time is a good proxy to determine this.
> >
> > What I am suggesting here is we think beyond thresholds for
> > prioritization, and to give a chance for more eligible tables to get
> > autovacuumed rather than workers being saturated on some
> > of the slowest-to-vacuum tables.
>
> Can you define "more eligible" here?

What I mean by “more eligible” is that once a worker has its list of tables
that meet the autovacuum thresholds, it’s trying to get through as many
of them as possible within some time window.

If the workers always go after the slowest tables first, they’ll spend most
of that time on just a few heavy ones, and a lot of other eligible tables might
end up waiting much longer to get processed.

Eventually the slow tables will be the bottleneck anyway.
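
As a toy illustration of the point above, here is a minimal sketch with a single worker and made-up per-table vacuum durations. It is not autovacuum's scheduling code; the function names and numbers are hypothetical. It only compares how many tables finish within a window under the two orderings.

```python
from itertools import accumulate

def completion_times(durations):
    """Finish time of each table when one worker vacuums them in the given order."""
    return list(accumulate(durations))

def finished_within(durations, window):
    """Number of tables fully vacuumed by the end of the time window."""
    return sum(1 for t in completion_times(durations) if t <= window)

# Hypothetical vacuum durations in minutes (assumed, not measured).
durations = [120, 90, 5, 3, 2, 1]

slowest_first = sorted(durations, reverse=True)
fastest_first = sorted(durations)

# Tables completed in the first 15 minutes under each ordering.
done_slowest_first = finished_within(slowest_first, 15)   # 0
done_fastest_first = finished_within(fastest_first, 15)   # 4

# Total time to clear the whole list is the same either way.
total_slowest_first = completion_times(slowest_first)[-1]  # 221
total_fastest_first = completion_times(fastest_first)[-1]  # 221
```

In this toy model the ordering changes how long the small tables wait, but not when the whole list is finished, which is the sense in which the slow tables end up being the bottleneck anyway.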

> I think I'm not really grasping this because I don't understand why
> faster-to-vacuum tables should be prioritised over slower-to-vacuum
> tables. Can you explain why you think this is important?

The thing I’m hoping to address is something I’ve seen many times in practice.
Autovacuum workers can get stuck on specific large or slow tables, and when
that happens, users often end up running manual vacuums on those tables
just to keep things moving for the smaller, faster-to-vacuum tables.

Now, I am not so sure any type of autovacuum prioritization could actually
help in these cases. What does help is adding more autovacuum workers.
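
A rough sketch of why the worker count matters here, under simplifying assumptions: a single shared, fixed-order list, with each table handed to whichever worker frees up first. Real autovacuum workers each build their own list, so this is only an approximation; the helper name and durations are hypothetical.

```python
import heapq

def finish_times(durations, workers):
    """Finish time per table when each table goes to the first free worker."""
    free_at = [0] * workers          # time at which each worker becomes idle
    heapq.heapify(free_at)
    finished = []
    for d in durations:
        start = heapq.heappop(free_at)       # earliest available worker
        heapq.heappush(free_at, start + d)
        finished.append(start + d)
    return finished

# Hypothetical durations in minutes, two slow tables at the head of the list.
durations = [120, 90, 5, 3, 2, 1]

two_workers = finish_times(durations, 2)    # [120, 90, 95, 98, 100, 101]
three_workers = finish_times(durations, 3)  # [120, 90, 5, 8, 10, 11]
```

With two workers, both are occupied by the slow tables and the small tables wait about 90 minutes; a third worker clears them in about 11 minutes without changing the order at all.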

> if we have the autovacuum worker refresh the list and scores after
> it's done with a table and autovacuum_naptime has elapsed since the
> list was last refreshed?

That is an interesting idea, but refreshing the list that often may not
be such a good idea; it could be quite expensive on large catalogs.

--
Sami Imseih
Amazon Web Services (AWS)