Re: another autovacuum scheduling thread

From: David Rowley <dgrowleyml@gmail.com>

To: Nathan Bossart <nathandbossart@gmail.com>

Cc: Robert Haas <robertmhaas@gmail.com>, Jeremy Schneider <schneider@ardentperf.com>, Sami Imseih <samimseih@gmail.com>, pgsql-hackers@postgresql.org

Date: 2025-10-22T19:34:49Z

Lists: pgsql-hackers

Commits

Add rudimentary table prioritization to autovacuum.
- d7965d65fc5b 19 (unreleased) landed
Trigger more frequent autovacuums with relallfrozen
- 06eae9e6218a 18.0 cited
Harden nbtree page deletion.
- c34787f91058 14.0 cited
Check for interrupts inside the nbtree page deletion code.
- 3a01f68e35a3 12.0 cited

On Thu, 23 Oct 2025 at 07:58, Nathan Bossart <nathandbossart@gmail.com> wrote:
> > That's a good point.  I wonder if we should try to make the wraparound
> > score independent of the *_freeze_max_age parameters (once the table age
> > surpasses said parameters).  Else, different settings will greatly impact
> > how aggressively tables are prioritized the closer they are to wraparound.
> > Even if autovacuum_freeze_max_age is set to 200M, it's not critically
> > important for autovacuum to pick up tables right away as soon as their age
> > reaches 200M.  But if the parameter is set to 2B, we _do_ want autovacuum
> > to prioritize tables right away once their age reaches 2B.
>
> I'm imagining something a bit like the following:
>
>     select xidage "age(relfrozenxid)",
>     power(1.001, xidage::float8 / (select min_val
>     from pg_settings where name = 'autovacuum_freeze_max_age')::float8)
>     xid_age_score from generate_series(0,2_000_000_000,100_000_000) xidage;
>
>      age(relfrozenxid) |   xid_age_score
>     -------------------+--------------------
>                      0 |                  1
>              100000000 | 2.7169239322355936
>              200000000 |   7.38167565355452
>              300000000 | 20.055451243143093

This does start to put the score > 1 before the table reaches
autovacuum_freeze_max_age. I don't think that's great as the score of
1.0 was meant to represent that the table now requires some autovacuum
work.
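To make that concrete, here is a rough sketch that re-computes the quoted query's score outside the database. The divisor of 100,000 is an assumption: it is what min_val for autovacuum_freeze_max_age would need to be to reproduce the quoted output, and the function name is made up for illustration.

```python
import math

# Assumed value of pg_settings.min_val for autovacuum_freeze_max_age;
# it reproduces the numbers in the quoted output.
ASSUMED_MIN_VAL = 100_000

# Default autovacuum_freeze_max_age, used here only as the comparison point.
FREEZE_MAX_AGE = 200_000_000


def quoted_xid_age_score(xid_age: int) -> float:
    """Score from the quoted query: 1.001 ^ (age / min_val)."""
    return math.pow(1.001, xid_age / ASSUMED_MIN_VAL)


if __name__ == "__main__":
    for age in (0, 50_000_000, 100_000_000, FREEZE_MAX_AGE):
        marker = "below" if age < FREEZE_MAX_AGE else "at"
        print(f"{age:>11} ({marker} freeze_max_age): "
              f"{quoted_xid_age_score(age):.4f}")
```

Any age above zero already scores above 1.0 here, long before the table reaches autovacuum_freeze_max_age.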

The main reason I was trying to keep the score scaling with the
percentage by which the table is over the given threshold was that I
had imagined we could use the score to start reducing the sleep time
taken each time autovacuum_vacuum_cost_limit is reached when the
highest-scoring table stays high for too long. I was considering this
as a fix for the misconfigured autovacuum problem that so many people
have. If we scaled it in a way similar to the query above, the score
would look high even before the table reaches the limit.  This is the
reason I was scaling the score linearly with autovacuum_freeze_max_age
in the version I sent and only scaling exponentially after the
failsafe age. I wanted to talk about the "reducing the cost delay"
feature separately so as not to load up this thread and widen the scope
for varying opinions, but in its most trivial form, the
vacuum_cost_limit() code could be adjusted to only sleep for
autovacuum_vacuum_cost_delay / <the table's score>.
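As a minimal sketch of that most trivial form (not the actual vacuum code), the sleep could be computed as below. The function name is hypothetical, and clamping the score at 1.0 so that low-scoring tables never sleep longer than the configured delay is my own assumption rather than part of the idea as stated.

```python
def scaled_cost_delay_ms(base_delay_ms: float, score: float) -> float:
    """Return the per-sleep delay after dividing by the table's score.

    base_delay_ms stands in for autovacuum_vacuum_cost_delay.
    The max() clamp is an assumption: a score below 1.0 should not
    make the sleep longer than what was configured.
    """
    return base_delay_ms / max(score, 1.0)


if __name__ == "__main__":
    base = 2.0  # the default autovacuum_vacuum_cost_delay, in milliseconds
    for score in (0.5, 1.0, 4.0, 1000.0):
        print(f"score {score:>7}: sleep {scaled_cost_delay_ms(base, score)} ms")
```

With a linear score, a table at four times its threshold would sleep a quarter as long, which is the sort of gradual speed-up that only makes sense if the score stays near 1.0 until the threshold is actually reached.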

I think the one I proposed in [1] does this quite well. The table
remains eligible to be autovacuumed with any score >= 1.0, and there's
still a huge window of time to freeze a table once it's over
autovacuum_freeze_max_age before there are issues. The exponential
scaling once over the failsafe age should ensure that the table is top
of the list when the failsafe code kicks in and removes the cost
limit. If we had the varying sleep time as I mentioned above, the
failsafe code could even be removed, as the
"autovacuum_vacuum_cost_delay / <the table's score>" calculation would
effectively zero the sleep time for any table over the failsafe age.
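The shape being described can be sketched roughly as follows. This is not the formula from [1]; the function name, the base-10 growth factor and the way the exponent is computed are all placeholders chosen only to show "linear up to the failsafe age, exponential beyond it". The two ages are the default autovacuum_freeze_max_age and vacuum_failsafe_age.

```python
import math

FREEZE_MAX_AGE = 200_000_000    # default autovacuum_freeze_max_age
FAILSAFE_AGE = 1_600_000_000    # default vacuum_failsafe_age
GROWTH_BASE = 10.0              # hypothetical; picked only for illustration


def sketch_wraparound_score(xid_age: int) -> float:
    """Linear in age / freeze_max_age, then exponential past failsafe."""
    score = xid_age / FREEZE_MAX_AGE
    if xid_age > FAILSAFE_AGE:
        # Placeholder exponential term; the real proposal may differ.
        excess = (xid_age - FAILSAFE_AGE) / FREEZE_MAX_AGE
        score *= math.pow(GROWTH_BASE, excess)
    return score


if __name__ == "__main__":
    for age in (100_000_000, 200_000_000, 1_600_000_000, 2_000_000_000):
        print(f"{age:>11}: {sketch_wraparound_score(age):.2f}")
```

Under these placeholder numbers the score is exactly 1.0 at autovacuum_freeze_max_age, 8.0 at the failsafe age, and grows quickly after that, so dividing the cost delay by it would shrink the sleep towards zero.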

David

[1] https://postgr.es/m/CAApHDvqrd=SHVUytdRj55OWnLH98Rvtzqam5zq2f4XKRZa7t9Q@mail.gmail.com