Thread

  1. Re: Track skipped tables during autovacuum and autoanalyze

    Sami Imseih <samimseih@gmail.com> — 2026-05-15T16:12:36Z

    > > If the table is dropped, there are no stats to update. right?
    >
    > Ops, right. I focused too much on "all warnings should be visible in
    > the statistic, so the sum of warnings and statistics should match",
    > but of course that's not the case.
    >
    > > However, I'm not entirely sure whether this behavior is always guaranteed.
    > > Could anyone clarify this?
    >
    > There's another different corner-case if I move the injection point
    > inside pgstat_report_skipped_vacuum_analyze, after releasing the
    > syscache.
    >
    > If the drop happens at that point, we insert an orphaned record into
    > the statistics, and it will be visible with the internal functions
    > (e.g. pg_stat_get_skipped_autoanalyze_count).
    >
    > It is still invisible in the pg_stat_all_tables view, but now that
    > I've looked more at the code, I think internally it will stay there
    > permanently, even surviving pg_stat_reset?
    
    Yeah, you’re right. I just realized that pgstat_get_entry_ref_locked()
    creates an entry if one does not already exist, which means we can
    leak entries if the table is dropped concurrently. After releasing the
    syscache entry, we still have the relid, but the table may be dropped
    before the stats update runs. In that case, the stats update
    recreates the relation entry, and that entry is then leaked
    permanently.
    
    The tricky part of this feature is that we cannot hold a lock while updating
    the entry, because the whole point of the stat is to track cases where
    we failed to acquire a lock.
    
    So, it seems pgstat_init_function_usage() deals with the same pattern
    and they deal with it by doing a second syscache lookup after creating
    the entry, and if the relation is gone, drop the entry. We can do the same
    thing here. See v8-0001.
    
    I also attached a v8-0002 building on Zsolt's injection point test and his
    earlier suggestion to "move the injection point inside
    pgstat_report_skipped_vacuum_analyze, after releasing the syscache".
    These are isolation tests that rely on injection points. The tests are:
    
    1. Table Drop + rollback. Lock is skipped and table is dropped but rolled
    back. The stats are updated.
    2. Table Drop + commit. Lock is skipped and table is dropped and committed.
    No table entry to record a stat for.
    
    The tests use pg_stat_get_skipped_vacuum_count to verify that no orphaned
    entry is left behind.
    
    --
    Sami