Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

Tom Lane <tgl@sss.pgh.pa.us>

From: Tom Lane <tgl@sss.pgh.pa.us>

To: Robert Haas <robertmhaas@gmail.com>

Cc: Kevin Grittner <Kevin.Grittner@wicourts.gov>, Florian Helmberger <fh@25th-floor.com>, Alvaro Herrera <alvherre@commandprompt.com>, Greg Stark <gsstark@mit.edu>, pgsql-hackers@postgresql.org

Date: 2011-05-28T19:01:35Z

Lists: pgsql-hackers

Attachments

vac-analyze-reltuple-changes.patch (text/x-patch) patch

Robert Haas <robertmhaas@gmail.com> writes:
> Still, maybe we don't have a better option.  If it were me, I'd add an
> additional safety valve: use your formula if the percentage of the
> relation scanned is above some threshold where there's unlikely to be
> too much skew.  But if the percentage scanned is too small, then don't
> use that formula.  Instead, only update relpages/reltuples if the
> relation is now larger; set relpages to the new actual value, and
> scale up reltuples proportionately.

> However, I just work here.  It's possible that I'm worrying about a
> problem that won't materialize in practice.

Attached is a proposed patch to fix these issues.  Experimentation
convinced me that including a fudge factor for VACUUM's results made
things *less* accurate, not more so.  The reason seems to be bound up in
Greg Stark's observation that the unmodified calculation is equivalent
to assuming that the old average tuple density still applies to the
unscanned pages.  In a VACUUM, we know that the unscanned pages are
exactly those that have had no changes since (at least) the last vacuum,
which means that indeed the old density ought to be a good estimate.
Now, this reasoning can break down if the table's tuple density is
nonuniform, but what I found in my testing is that if you vacuum after a
significant change in a table (such as deleting a lot of rows), and you
don't apply the full unfudged correction, you get a badly wrong result.
I think that's a more significant issue than the possibility of drift
over time.

I also found that Greg was right in thinking that it would help if we
tweaked lazy_scan_heap to not always scan the first
SKIP_PAGES_THRESHOLD-1 pages even if they were
all_visible_according_to_vm.  That seemed to skew the results if those
pages weren't representative.  And, for the case of a useless manual
vacuum on a completely clean table, it would cause the reltuples value
to drift when there was no reason to change it at all.

Lastly, this patch removes a bunch of grotty interconnections between
VACUUM and ANALYZE that were meant to prevent ANALYZE from updating the
stats if VACUUM had done a full-table scan in the same command.  With
the new logic it's relatively harmless if ANALYZE does that, and anyway
autovacuum frequently fires the two cases independently anyway, making
all that logic quite useless in the normal case.  (This simplification
accounts for the bulk of the diff, actually.)

Comments?

			regards, tom lane