Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
Tom Lane <tgl@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Robert Haas <robertmhaas@gmail.com>
Cc: Kevin Grittner <Kevin.Grittner@wicourts.gov>, Florian Helmberger <fh@25th-floor.com>, Alvaro Herrera <alvherre@commandprompt.com>, Greg Stark <gsstark@mit.edu>, pgsql-hackers@postgresql.org
Date: 2011-05-28T19:01:35Z
Lists: pgsql-hackers
Attachments
- vac-analyze-reltuple-changes.patch (text/x-patch) patch
Robert Haas <robertmhaas@gmail.com> writes: > Still, maybe we don't have a better option. If it were me, I'd add an > additional safety valve: use your formula if the percentage of the > relation scanned is above some threshold where there's unlikely to be > too much skew. But if the percentage scanned is too small, then don't > use that formula. Instead, only update relpages/reltuples if the > relation is now larger; set relpages to the new actual value, and > scale up reltuples proportionately. > However, I just work here. It's possible that I'm worrying about a > problem that won't materialize in practice. Attached is a proposed patch to fix these issues. Experimentation convinced me that including a fudge factor for VACUUM's results made things *less* accurate, not more so. The reason seems to be bound up in Greg Stark's observation that the unmodified calculation is equivalent to assuming that the old average tuple density still applies to the unscanned pages. In a VACUUM, we know that the unscanned pages are exactly those that have had no changes since (at least) the last vacuum, which means that indeed the old density ought to be a good estimate. Now, this reasoning can break down if the table's tuple density is nonuniform, but what I found in my testing is that if you vacuum after a significant change in a table (such as deleting a lot of rows), and you don't apply the full unfudged correction, you get a badly wrong result. I think that's a more significant issue than the possibility of drift over time. I also found that Greg was right in thinking that it would help if we tweaked lazy_scan_heap to not always scan the first SKIP_PAGES_THRESHOLD-1 pages even if they were all_visible_according_to_vm. That seemed to skew the results if those pages weren't representative. And, for the case of a useless manual vacuum on a completely clean table, it would cause the reltuples value to drift when there was no reason to change it at all. Lastly, this patch removes a bunch of grotty interconnections between VACUUM and ANALYZE that were meant to prevent ANALYZE from updating the stats if VACUUM had done a full-table scan in the same command. With the new logic it's relatively harmless if ANALYZE does that, and anyway autovacuum frequently fires the two cases independently anyway, making all that logic quite useless in the normal case. (This simplification accounts for the bulk of the diff, actually.) Comments? regards, tom lane