Re: another autovacuum scheduling thread
David Rowley <dgrowleyml@gmail.com>
From: David Rowley <dgrowleyml@gmail.com>
To: Nathan Bossart <nathandbossart@gmail.com>
Cc: Sami Imseih <samimseih@gmail.com>, Robert Haas <robertmhaas@gmail.com>, Jeremy Schneider <schneider@ardentperf.com>, pgsql-hackers@postgresql.org
Date: 2025-10-27T22:47:08Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Add rudimentary table prioritization to autovacuum.
- d7965d65fc5b 19 (unreleased) landed
-
Trigger more frequent autovacuums with relallfrozen
- 06eae9e6218a 18.0 cited
-
Harden nbtree page deletion.
- c34787f91058 14.0 cited
-
Check for interrupts inside the nbtree page deletion code.
- 3a01f68e35a3 12.0 cited
The patch is starting to look good. Here's a review of v5:
1. I think the following code at the bottom of
relation_needs_vacanalyze() can be deleted. You've added the check to
ensure *doanalyze never gets set to true for pg_statistic.
/* ANALYZE refuses to work with pg_statistic */
if (relid == StatisticRelationId)
*doanalyze = false;
2. As #1, but in recheck_relation_needs_vacanalyze(), the following I
think can now be removed:
/* ignore ANALYZE for toast tables */
if (classForm->relkind == RELKIND_TOASTVALUE)
*doanalyze = false;
3. Would you be able to include what the idea behind the * 1.05 in the
preceding comment?
On Tue, 28 Oct 2025 at 05:06, Nathan Bossart <nathandbossart@gmail.com> wrote:
> + effective_xid_failsafe_age = Max(vacuum_failsafe_age,
> + autovacuum_freeze_max_age * 1.05);
> + effective_mxid_failsafe_age = Max(vacuum_multixact_failsafe_age,
> + autovacuum_multixact_freeze_max_age * 1.05);
I assume it's to workaround some strange configuration settings, but
don't know for sure, or why 1.05 is a good value.
4. I think it might be neater to format the following as 3 separate "if" tests:
> + if (force_vacuum ||
> + vactuples > vacthresh ||
> + (vac_ins_base_thresh >= 0 && instuples > vacinsthresh))
> + {
> + *dovacuum = true;
> + *score = Max(*score, (double) vactuples / Max(vacthresh, 1));
> + if (vac_ins_base_thresh >= 0)
> + *score = Max(*score, (double) instuples / Max(vacinsthresh, 1));
> + }
> + else
> + *dovacuum = false;
i.e:
if (force_vacuum)
*dovacuum = true;
if (vactuples > vacthresh)
{
*dovacuum = true;
*score = Max(*score, (double) vactuples / Max(vacthresh, 1));
}
if (vac_ins_base_thresh >= 0 && instuples > vacinsthresh)
{
*dovacuum = true;
*score = Max(*score, (double) instuples / Max(vacinsthresh, 1));
}
and also get rid of all the "else *dovacuum = false;" (and *dovacuum =
false) in favour of setting those to false at the top of the function.
It's just getting harder to track that those parameters are getting
set in all cases when they're meant to be.
doing that also gets rid of the duplicative "if (vac_ins_base_thresh
>= 0)" check and also saves doing the score calc when the inputs to it
don't make sense. The current code is relying on Max always picking
the current *score when the threshold isn't met.
David